Log collection and processing companies understand that collecting massive amounts of messages is the easy part. Poring over and sifting through the data to perform investigative routines or some type of security log analysis can bring many log reporting systems to their knees. Often, the primary reason so much processing power is needed to compile the results is the unstructured nature of the message piles.
When I was a kid and my mom asked me to fold 3 loads of laundry that had been piled onto the couch, I often sorted it into categories first so that things like matching ‘white’ socks became an easier task. If you apply this logic to log management, it makes sense to put all messages into a similar log structure. By using something like IPFIX to format and export the messages, the logs can be queried more easily, which ultimately results in much faster processing times for applications such as Splunk and Scrutinizer.
IPFIX has benefits over other message types (syslogs, SNMP traps, event logs, etc.) because it breaks the messages up into well-defined elements. For example, if we break apart a syslog, facility and severity become two different elements. Also inside a syslog is the actual message, which is typically a variable-length text string. In that text string, the order of the contents can differ greatly between vendors, and there is often no delimiter to show how the different portions of the message should be separated. Basically, this portion of the message is the wild west, where vendors export anything they want. This unstructured format is what can lead to slow, error-prone queries. Some Splunk issues are caused by this.
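To make the syslog breakdown concrete, here is a minimal Python sketch. It assumes the classic syslog header in which a leading `<PRI>` value encodes facility times 8 plus severity; the function name `split_syslog` and the sample line are illustrative only and do not come from any particular product.

```python
import re

# A syslog line starts with "<PRI>", where PRI = facility * 8 + severity.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def split_syslog(line):
    """Split a raw syslog line into facility, severity, and free-text message.

    Hypothetical helper for illustration; real collectors handle many more
    header variants than this.
    """
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("no PRI header found")
    pri = int(match.group(1))
    facility, severity = divmod(pri, 8)
    return {
        "facility": facility,
        "severity": severity,
        # Everything after the PRI is the vendor-defined, unstructured part.
        "message": match.group(2),
    }


record = split_syslog(
    "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
)
print(record["facility"], record["severity"])
print(record["message"])
```

Facility and severity fall out cleanly because their encoding is fixed. The `message` field is where the trouble starts: nothing in the format says where the host name, user, or IP address lives.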
IPFIX still allows vendors to export anything they want, but they have to do it in a neat, orderly fashion and specify the format and contents of the data by putting it into elements. Many elements are standardized; in fact, there are hundreds of them. Vendors exporting IPFIX look for standard elements before they start specifying unique ones. This is what we mean by structured data. A structured data query looks for something like all messages that include an IP address of 10.1.1.5. With IPFIX, the address is carried in the same well-defined element no matter which vendor exported it. With syslogs, traps and event logs, the query has to go looking for the IP address inside the text of each message. This leads to slow queries and missed messages. Inaccurate and slow reports are not what you want.
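The difference between the two kinds of query can be sketched in a few lines of Python. The records and log lines below are made up for illustration, and the dictionaries stand in for IPFIX records that have already been decoded. `sourceIPv4Address` is a standard IPFIX element name, while `severity` and `vendor` are hypothetical fields added for the example.

```python
import re

# Hypothetical decoded IPFIX records: the address always sits in the same element.
structured = [
    {"sourceIPv4Address": "10.1.1.5", "severity": 3, "vendor": "A"},
    {"sourceIPv4Address": "10.1.1.50", "severity": 5, "vendor": "B"},
    {"sourceIPv4Address": "10.1.1.5", "severity": 4, "vendor": "B"},
]

# Made-up free-text messages: each vendor places the address differently.
unstructured = [
    "Deny tcp src 10.1.1.5 dst 192.0.2.9",
    "login failed from=10.1.1.50 user=admin",
    "host:10.1.1.5;action=drop",
]


def query_structured(records, address):
    """Exact comparison against one known element."""
    return [r for r in records if r.get("sourceIPv4Address") == address]


def query_naive(lines, address):
    """Substring search over free text; also matches 10.1.1.50."""
    return [line for line in lines if address in line]


def query_regex(lines, address):
    """Text search that rejects matches embedded in a longer address."""
    pattern = re.compile(r"(?<![\d.])" + re.escape(address) + r"(?![\d.])")
    return [line for line in lines if pattern.search(line)]


print(len(query_structured(structured, "10.1.1.5")))
print(len(query_naive(unstructured, "10.1.1.5")))
print(len(query_regex(unstructured, "10.1.1.5")))
```

The structured query is a single equality check per record. The text search either returns a false hit for 10.1.1.50 or needs a pattern scan over every message, and even the guarded pattern here is only a sketch: it would, for instance, miss an address that ends a sentence and is followed by a period.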
If you don’t have the luxury of being able to take advantage of IPFIX, you have a couple of options:
- IPFIXify is a free utility that allows hardware vendors and end consumers to export anything they want via IPFIX. For example, it can be installed on Microsoft servers to export event logs, or it can be run as a service on any 64-bit OS where, driven by a configuration file, it will export all machine logs as IPFIX.
- Flow Replicator is an appliance that acts as a gateway for machine messages. For example, it breaks apart syslogs and event logs and sends them off in a structured format inside IPFIX datagrams to a collector like Scrutinizer for further processing.
Below is a diagram outlining a typical configuration.
Cisco Systems and SonicWALL were among the first vendors to take advantage of IPFIX for the above purpose. Make sure you keep your ears open for this emerging technology.