For years, I've been speaking to people all over the world and teaching them how to use Zeek (Bro) effectively. In that time, I continue to be amazed that so many organizations and professionals are really only using Zeek as yet another log source (YALS). In my experience, the SIEM is the thing that we buy so that we feel better about the fact that we aren't looking at our logs. I don't know about you, but I'm not desperate to add YALS to my SIEM unless it can provide real value.
Certainly, Zeek logs are rich with data. These logs can provide a decent metadata view of activities within the network. However, if all you are doing is pushing the logs to the SIEM, I think you might be using the tool the wrong way.
So what is the right way? Behavioral analysis and correlation. "We do all of that with our SIEM," I hear you say. Good for you. You are helping to feed a SIEM engineer. Being a SIEM engineer is a pretty thankless and tedious job. Your mission is to write parsers and converters to allow the SIEM to digest all of the log sources that you have, and to teach your SIEM how to do interesting things and how to analyze all of that data. It's a difficult job.
Part of what makes this a difficult job is that we fail to do the correlation close to the data! Think about this: How difficult would it be to build a correlation in a SIEM that puts together twenty or thirty disparate events to identify some activity that a user is engaged in? This wouldn't simply be a matter of writing a quick correlation; it would require that we are able to attach meaning to all of those different events and that we are able to draw them together in a meaningful way within the SIEM.
On the other hand, what would happen if we did the correlation right at the system that is generating the events? Certainly, this isn't always possible, which is when a SIEM really should come into play. But if I have a source of network intelligence (like Zeek) that can interpret network events in a meaningful way, and which provides a phenomenal scripting language, how much easier would it be to perform the correlation there and then simply inform the SIEM that something bad is happening?
So, then, I'm suggesting that we should do as much correlation as possible at the data source, when it is easy to interpret. To me, that is the right way to use Zeek.
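To make the idea a little more concrete, here is a minimal sketch of what "correlate at the source, then tell the SIEM once" might look like. It is written in Python purely for illustration; it is not Zeek script. The event names, the five-minute window, and the `SourceCorrelator` class are all hypothetical names made up for this sketch. The point is only that the sensor keeps a little state per host, and the SIEM receives one meaningful alert instead of every raw event.

```python
from collections import defaultdict

# Hypothetical event names, invented for this sketch. A real sensor would
# derive events like these from the protocol activity it already parses.
REQUIRED_STEPS = {"dns_rare_domain", "http_exe_download", "outbound_smb"}
WINDOW_SECONDS = 300.0  # assumed correlation window


class SourceCorrelator:
    """Tracks which steps of a pattern each host has shown recently."""

    def __init__(self, required=REQUIRED_STEPS, window=WINDOW_SECONDS):
        self.required = set(required)
        self.window = window
        # host -> {event_type: timestamp of most recent occurrence}
        self.seen = defaultdict(dict)

    def observe(self, host, event_type, ts):
        """Record one event. Return an alert dict when every required step
        has been seen for this host within the window, otherwise None."""
        if event_type not in self.required:
            return None  # uninteresting events never leave the sensor

        steps = self.seen[host]
        steps[event_type] = ts

        # Forget steps that are too old to count toward the pattern.
        for name in [n for n, t in steps.items() if ts - t > self.window]:
            del steps[name]

        if set(steps) == self.required:
            del self.seen[host]  # reset so the same pattern alerts once
            # This single record is what would be forwarded to the SIEM.
            return {"host": host, "ts": ts, "steps": sorted(self.required)}
        return None


if __name__ == "__main__":
    correlator = SourceCorrelator()
    events = [
        ("10.0.0.5", "dns_rare_domain", 0.0),
        ("10.0.0.5", "http_exe_download", 12.0),
        ("10.0.0.5", "outbound_smb", 40.0),
    ]
    for host, event_type, ts in events:
        alert = correlator.observe(host, event_type, ts)
        if alert is not None:
            print("send to SIEM:", alert)
```

Three events go in and one alert comes out. Doing the same thing after the fact in a SIEM would mean shipping all three raw events and then teaching the SIEM what each of them means.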
If you want to see examples of what I'm talking about here, stay tuned for the next article on Zeek, which will include a simple but handy example that would be nearly impossible to accomplish with a SIEM.
David Hoelzer is the author and maintainer of the SANS SEC503 Advanced Intrusion Detection course, the leading class for advanced network analysis in the industry. With more than 30 years of experience in information technology and security, he is the author of and a contributor to a number of open source defensive tools. In addition to acting as the Chief of Operations for Enclave Forensics, Inc., an incident response, secure coding, and managed services corporation, David is also the Dean of Faculty for the SANS Technology Institute (STI).