
Mining for Gold: Flow Logs as a Security Resource

by Martin Roesch

The big opportunity you have as a startup is to change the way that people think and solve really hard problems. It’s an exciting challenge and what’s driven me throughout my career. 

As you begin down this path, a useful exercise is to examine why things are the way they are. At Netography, one question we asked was: why aren’t organizations incorporating flow logs, a highly valuable and readily available security resource, into their security stack? From there, we strove to become experts on flow and its applications because we saw an opportunity to use it to replace the aging and increasingly inefficient Deep Packet Inspection (DPI) architectures. Gaining a deep understanding of why this problem persists enabled us to devise an approach that allows customers to harness the value of flow data, often considered security gold.

Operational challenges with flow logs

The reality is that traditional approaches to network security that rely on DPI don’t work very well in cloud environments. And while inspecting cleartext packets remains an invaluable method for threat detection, analyzing traffic once it is encrypted becomes computationally, operationally, and therefore financially expensive. Flow logs are a powerful supplement to existing security measures in modern multi-cloud or hybrid networks and fill ever-widening gaps in our ability to observe activities, threats, and compromise at the network level.

Recognizing the potential value of flow log data, many organizations want to leverage it and end up dumping their VPC flow logs into an S3 bucket or sending cloud flow logs to a data lake in whatever format they arrive. Simply storing flow data without making it actionable creates significant expense and limits the usability of the data.

When SecOps, CloudOps, and NetOps teams try to mine the data, they encounter several problems:

  • Unwieldy amount of flow log data. In most cases, the volume of flow log data exceeds the aggregate volume of all other data sources combined. The possible exception is EDR logs, which are typically managed by proprietary systems. Dumping all that data into a data lake slows performance and increases data storage and query costs.
  • Lack of data standardization. No cloud flow log standards exist, so each cloud provider offers its own version of flow logs, with differences in the type of data provided, the format, and the timeliness of delivery. This creates a huge normalization challenge that requires a deep understanding of the data each cloud provider supports and creativity to make it usable. Switching cloud providers or operating in a multi-cloud environment adds more pain.
  • Queries are sluggish. Bringing in data “as is” puts the onus on the user to build logic into their queries. Depending on how complex that query is and the number of disparate data sources that have to be pulled together for context, it can take hours to get answers. Customers have told us that with certain platforms they used in the past, they would have to run reports nightly, which doesn’t work when you need to know what’s happening in your environment right now.
  • Limited threat hunting capabilities. The threat hunting process requires the capability to go back in time and replay network interaction for a detailed analysis of what has happened. This helps to determine whether incident response and containment measures have been effective or to gather further insight into what has actually happened. If data is discarded or simply stored “en masse” in a storage service, it has limited to no value to the threat hunting process.

We used these insights into the challenges organizations face when trying to incorporate flow logs into their security stack to architect a solution that is capable of handling the scale and complexity of flow log data and extracting critical insights, without compromising performance or incurring prohibitive costs. 

How Netography extracts the value

The Netography Fusion® platform aggregates and normalizes cloud flow logs from all five major cloud providers (Amazon Web Services, Microsoft Azure, Google Cloud, IBM Cloud, and Oracle Cloud) and flow data (NetFlow, sFlow, and IPFIX) from routers, switches, and other physical or virtual devices. 
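
To make the normalization problem concrete, here is a minimal Python sketch that maps two differently shaped records onto one common schema. It is an illustration only, not how Fusion works internally. The AWS parser assumes the default (version 2) space-separated VPC flow log format. The second parser assumes a simplified GCP-style JSON record with a `jsonPayload.connection` object. The `FlowRecord` schema and both function names are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical common schema that every provider format is mapped onto."""
    provider: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    bytes: int
    packets: int


# Field order of the default (version 2) AWS VPC flow log record.
AWS_V2_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def from_aws_v2(line: str) -> FlowRecord:
    """Parse one space-separated AWS record (assumes log-status OK, no '-' fields)."""
    f = dict(zip(AWS_V2_FIELDS, line.split()))
    return FlowRecord(
        provider="aws",
        src_ip=f["srcaddr"], dst_ip=f["dstaddr"],
        src_port=int(f["srcport"]), dst_port=int(f["dstport"]),
        protocol=int(f["protocol"]),
        bytes=int(f["bytes"]), packets=int(f["packets"]),
    )


def from_gcp_json(raw: str) -> FlowRecord:
    """Parse a simplified GCP-style JSON record (field names are assumptions)."""
    payload = json.loads(raw)["jsonPayload"]
    conn = payload["connection"]
    return FlowRecord(
        provider="gcp",
        src_ip=conn["src_ip"], dst_ip=conn["dest_ip"],
        src_port=int(conn["src_port"]), dst_port=int(conn["dest_port"]),
        protocol=int(conn["protocol"]),
        bytes=int(payload["bytes_sent"]), packets=int(payload["packets_sent"]),
    )
```

Once every source lands in the same shape, downstream queries and detections can be written once instead of once per provider.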

We ingest data from the entire multi-cloud or hybrid network so that as flow data arrives, we enrich it with operational context to provide a real-time, contextualized picture of what is happening. Context comes from applications and services in your existing tech stack, such as asset management, CMDB, EDR/EPP, CSPM, and vulnerability management systems. The context can include dozens of attributes about the participants in the activities the flow logs are recording, including things like user information, location, asset types and names, vulnerabilities, and application data.
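
As a rough illustration of what enrichment means in practice, the sketch below attaches asset attributes to both endpoints of a normalized flow record. The `ASSET_CONTEXT` table, its attribute names, and the `enrich` function are hypothetical stand-ins for data that would come from a CMDB or asset management system. This is a simplified sketch, not Fusion's implementation.

```python
from typing import Any, Dict

# Hypothetical context table keyed by IP address, as it might be
# exported from an asset inventory or CMDB.
ASSET_CONTEXT: Dict[str, Dict[str, Any]] = {
    "10.0.1.15": {"asset_name": "web-01", "asset_type": "server", "owner": "platform"},
    "10.0.2.40": {"asset_name": "db-01", "asset_type": "database", "owner": "data"},
}


def enrich(flow: Dict[str, Any], context: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the flow with src_/dst_ prefixed context attributes added."""
    enriched = dict(flow)  # leave the caller's record untouched
    for side in ("src", "dst"):
        attrs = context.get(flow[f"{side}_ip"], {})  # unknown IPs add nothing
        for key, value in attrs.items():
            enriched[f"{side}_{key}"] = value
    return enriched


flow = {"src_ip": "10.0.1.15", "dst_ip": "10.0.2.40", "dst_port": 5432, "bytes": 8192}
enriched_flow = enrich(flow, ASSET_CONTEXT)
```

With the context attached at ingest time, a question like "which servers talked to a database today" becomes a simple filter on `src_asset_type` and `dst_asset_type` rather than a join across several systems at query time.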

We have created hundreds of detection models to alert you to real-time security threats happening in your environment. We also leverage flow data to detect anomalous activity and potential signs of compromise, for example: 

  • Unauthorized access attempts and lateral movement
  • Unusual communication patterns
  • Data harvesting before exfiltration
  • Internal misuse and policy violations
  • Network scanning and enumeration
  • Unusual data transfer rates and protocols
  • Configuration errors and network mismanagement
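
To give a feel for how flow data alone can surface one of these behaviors, here is a deliberately simple Python sketch of a scanning heuristic: it flags any source that touches an unusually large number of distinct destination IP and port pairs. The `find_scanners` function, the dictionary field names, and the threshold of 20 are all illustrative assumptions. Production detection models would be considerably more sophisticated, for example by using time windows and baselines.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Set, Tuple


def find_scanners(flows: Iterable[Dict[str, Any]], min_targets: int = 20) -> Dict[str, int]:
    """Map each suspicious source IP to its count of distinct (dst_ip, dst_port) targets."""
    targets: Dict[str, Set[Tuple[str, int]]] = defaultdict(set)
    for flow in flows:
        targets[flow["src_ip"]].add((flow["dst_ip"], flow["dst_port"]))
    # Keep only sources whose fan-out meets or exceeds the threshold.
    return {src: len(seen) for src, seen in targets.items() if len(seen) >= min_targets}


# One host probing 25 ports on a single target, plus a normal client.
sample_flows = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "dst_port": port}
    for port in range(1, 26)
]
sample_flows += [
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.9", "dst_port": 443},
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.9", "dst_port": 443},
]
suspects = find_scanners(sample_flows)
```

Note that nothing here requires packet payloads: source, destination, and port metadata are enough, which is why this kind of analysis still works on encrypted traffic.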

As a 100% SaaS platform, Fusion can start ingesting flow logs in minutes, and it operates at scale, which means you get access to meaningful detections in seconds rather than hours. We are able to store context-enriched flow data for months to give you powerful look-back capabilities to support threat hunting, investigation, and incident response. You can also selectively send the pre-processed flow data to your data lake and format it for SIEM integration, which addresses the storage costs, query costs, and usability issues of raw data.

By transforming raw flow logs into actionable insights and seamlessly integrating them into your security stack, Netography enhances visibility, optimizes resource use, and reduces costs. This approach ensures that you’re able to mine the security intelligence gold within flow logs to improve detection and response across your multi-cloud or hybrid network.

To learn more about the value of flow logs in security, download our solution guide: Solving The Multi-Cloud Flow Log Problem with Netography Fusion