With studies showing cybercrime will cost the world over US$5 trillion a year by 2024, cyber criminals continue to breach organisations using a myriad of techniques, with application vulnerabilities accounting for 25% of all exploitable attack vectors. At the same time, most of these attacks can go unnoticed for a while, resulting in a loss of trust in the affected organisation.
In light of this, IBM Research announced the open-sourcing of SysFlow, a new system telemetry format and tool suite for monitoring system behaviour for scalable security, compliance and performance analytics.
According to a blog post on IBM Research, SysFlow encodes the representation of system activities into a compact format that records how applications interact with their environment. It connects process behaviours to network and file access activities, providing a richer context for analysis.
This additional context facilitates deeper visibility into host and container workloads and enables a stream of cloud workload protection use cases, including container runtime integrity protection, threat hunting, and forensics. While telemetry of system event information is not new, current monitors collect data at system call granularity, generating massive amounts of data that limit analytics to simple rule-based approaches.
SysFlow reduces data collection rates by orders of magnitude and lifts events into behaviours, which enables forensic applications and more comprehensive analysis approaches. Furthermore, SysFlow’s open serialization format and libraries enable integrations with open source frameworks (e.g., Spark, scikit-learn) and custom analytic microservices.
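To make the idea of lifting events into behaviours more concrete, here is a minimal Python sketch. It is not SysFlow's actual format or API: the `SyscallEvent` and `Flow` types, their field names, and the sample values are all assumptions made for illustration. It collapses a stream of per-system-call records into one volumetric record per process and network endpoint, which is roughly where the reduction in data volume comes from.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class SyscallEvent:
    """Hypothetical per-system-call record (not the SysFlow schema)."""
    pid: int
    op: str          # "send" or "recv"
    endpoint: str    # remote address, e.g. "10.0.0.5:443"
    nbytes: int


@dataclass
class Flow:
    """Hypothetical volumetric summary of many system-call events."""
    pid: int
    endpoint: str
    sent_ops: int = 0
    recv_ops: int = 0
    sent_bytes: int = 0
    recv_bytes: int = 0


def aggregate(events: Iterable[SyscallEvent]) -> List[Flow]:
    """Collapse events into one Flow per (pid, endpoint) pair."""
    flows: Dict[Tuple[int, str], Flow] = {}
    for ev in events:
        flow = flows.setdefault((ev.pid, ev.endpoint), Flow(ev.pid, ev.endpoint))
        if ev.op == "send":
            flow.sent_ops += 1
            flow.sent_bytes += ev.nbytes
        elif ev.op == "recv":
            flow.recv_ops += 1
            flow.recv_bytes += ev.nbytes
    return list(flows.values())


# Made-up sample data: four system calls from two processes.
events = [
    SyscallEvent(101, "send", "10.0.0.5:443", 512),
    SyscallEvent(101, "recv", "10.0.0.5:443", 2048),
    SyscallEvent(101, "send", "10.0.0.5:443", 256),
    SyscallEvent(202, "recv", "10.0.0.9:5432", 4096),
]
flows = aggregate(events)
for flow in flows:
    print(flow)
```

In this toy input, four system-call records become two flow records. On a busy server, where a single connection may involve thousands of reads and writes, the same aggregation would shrink the data far more.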
According to the blog post, SysFlow can be used to uncover a targeted attack in which a cybercriminal exfiltrates data from a cloud-hosted service. For example, during reconnaissance, the attacker detects a vulnerable Node.js server that is susceptible to a remote code execution attack exploiting a vulnerability in a Node.js module. The attacker exploits the system using a malicious payload, which hijacks the Node.js server and downloads a Python script from a remote server. The script contacts its command-and-control server and then starts scanning the system for sensitive keys, eventually gaining access to a sensitive customer database. The attack completes when data is exfiltrated off-site.
While state-of-the-art monitoring tools would only capture streams of disconnected events, SysFlow can connect the entities involved in each step of the attack on the system. This example showcases the advantages of applying flow analysis to system telemetry. SysFlow provides visibility within host environments by exposing relationships between containers, processes, files, and network endpoints as events (single operations) and flows (volumetric operations).
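The value of connecting entities can be illustrated with a small Python sketch. This is an assumption-laden illustration rather than SysFlow's data model or tooling: the `Process` and `Activity` types, the helper functions, and every path and address below are hypothetical. Starting from one suspicious process, it walks up the process tree and gathers the file and network activity of the whole chain, much as an analyst would when tracing the attack described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Process:
    """Hypothetical process record with a link to its parent."""
    pid: int
    parent_pid: Optional[int]
    exe: str


@dataclass(frozen=True)
class Activity:
    """Hypothetical file or network activity attributed to a process."""
    pid: int
    kind: str    # "file" or "network"
    target: str


def ancestry(pid: int, procs: Dict[int, Process]) -> List[Process]:
    """Return the process chain from the oldest known ancestor down to pid."""
    chain: List[Process] = []
    seen = set()
    current = procs.get(pid)
    while current is not None and current.pid not in seen:
        seen.add(current.pid)  # guards against cycles in malformed data
        chain.append(current)
        if current.parent_pid is None:
            break
        current = procs.get(current.parent_pid)
    return list(reversed(chain))


def context_for(pid: int, procs: Dict[int, Process],
                activities: List[Activity]) -> List[Activity]:
    """Collect all activity performed by pid or any of its ancestors."""
    chain_pids = {p.pid for p in ancestry(pid, procs)}
    return [a for a in activities if a.pid in chain_pids]


# Made-up data loosely mirroring the scenario: a server process spawns a script.
procs = {
    1: Process(1, None, "/usr/bin/node"),
    2: Process(2, 1, "/usr/bin/python"),
    3: Process(3, None, "/usr/sbin/cron"),  # unrelated process
}
activities = [
    Activity(1, "network", "203.0.113.7:80"),    # script download
    Activity(2, "network", "203.0.113.9:443"),   # command-and-control contact
    Activity(2, "file", "/home/app/.ssh/id_rsa"),
    Activity(3, "file", "/var/log/cron.log"),
]

chain = ancestry(2, procs)
related = context_for(2, procs, activities)
print([p.exe for p in chain])
print(related)
```

Because every activity carries the identifier of the process that performed it, the script download, the command-and-control contact, and the key access end up in one result, while the unrelated cron activity is left out.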
As mentioned in the blog post, SysFlow is an ongoing research project, and IBM welcomes feedback and contributions from the community. Unlike other telemetry sources, SysFlow observes and correlates essential system activity, providing security teams with the contextual information they need to identify cyberattacks and close security incidents more quickly, without overwhelming analysts with noise.
As the SysFlow project matures, IBM’s goal is to contribute an open standard and data representation for system telemetry that may be adopted across the industry.