Repurposing Logs

Published: 2015-03-24
Last Updated: 2015-03-24 16:11:42 UTC
by Kevin Liston (Version: 1)
3 comment(s)

Keeping an eye on your logs is critical (really, it's number 14 on the SANS Critical Security Controls list: https://www.sans.org/critical-security-controls/control/14 .)  Earlier, Rob VandenBrink shared some techniques for finding nuggets hiding in your logs (https://isc.sans.edu/forums/diary/Syslog+Skeet+Shooting+Targetting+Real+Problems+in+Event+Logs/19449/ .)  Today I'm going to share some tricks for squeezing every last bit out of your logs by repurposing them.  I mean repurposing log files, not this: https://www.pinterest.com/dawnreneedavis/repurposed-logs/ .

Logs are given their original purpose when programmers decide when and how a log entry will be recorded.  Today I want to discuss "unintended value": how to get more out of your logs than the programmers intended, and how to recover value that is easily overlooked.

Let's start with an example.  Suppose you work in a large, siloed environment and you don't have access to the logs from every group.  You're in a security or investigative function, and you have access to the AV logs.  The obvious use of those logs is to record the alerts generated by the endpoints, or to find machines that aren't updating signatures properly or have out-of-date detection engines.  A bit of value that you might be overlooking is the checkin message itself.  I've found it very useful to keep the checkins for a long period of time, which gives you a history of which IP a machine had and which user was logged in each time it checked in.  It doesn't have the resolution and accuracy that you would get from your AD authentication logs or your DHCP logs, but you might not have easy access to those.  This small investment in disk space, or a simple database, can give you quick snapshot views of machine and user mobility.  You can easily see whether a desktop consistently has the same IP, or whether a laptop moves around your campus.  You can get the same feel for your user accounts too, without having to invasively dig through badge access logs.

This is the first technique that I want to share: extract a daily event out of your logs and store it over time.  This creates an additional product that keeping a rolling history of logs can't provide.  
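As a minimal sketch, assuming your AV console can export a daily CSV of checkins with hostname, ip, and user columns (those names, like the table layout, are hypothetical), a few lines of Python are enough to keep that history in a small SQLite database:

#!/usr/bin/env python
# Append today's AV checkin snapshot to a long-term history table.
# Assumes a daily CSV export with hostname, ip, and user columns
# (hypothetical names -- adjust to whatever your AV console produces).
import csv
import sqlite3
import sys
from datetime import date

def store_checkins(csv_path, db_path="av_checkin_history.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS checkins
                    (day TEXT, hostname TEXT, ip TEXT, username TEXT)""")
    today = date.today().isoformat()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            conn.execute("INSERT INTO checkins VALUES (?, ?, ?, ?)",
                         (today, row["hostname"], row["ip"], row["user"]))
    conn.commit()
    conn.close()

if __name__ == "__main__":
    store_checkins(sys.argv[1])

Run it once a day against the export and you have a searchable, multi-year history of machine, IP, and user for the cost of a few megabytes of disk.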

Now consider what hidden and unexpected information might be lurking in your web proxy logs.  Take a look at the W3C standard fields.  If you reduce the displayed fields down to just timestamp, c-ip, r-host, and r-ip, you've got yourself a quick passive-DNS feed.  Granted, it's only looking at web traffic, but a good chunk of your network mischief travels through that channel at least once.

Trick number two: look for unexpectedly-useful combinations of columns in your log entries.
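As an illustration, here's a rough sketch of that passive-DNS reduction.  The field positions below are placeholders -- read the real ones from the #Fields: header of your own proxy logs:

#!/usr/bin/env python
# Reduce a W3C-style proxy log to a passive-DNS style feed:
# timestamp, c-ip, r-host, r-ip.
import sys

# Hypothetical column positions; in practice parse the "#Fields:" header.
COLUMNS = {"timestamp": 0, "c-ip": 2, "r-host": 10, "r-ip": 11}

def passive_dns(log_path):
    with open(log_path) as f:
        for line in f:
            if line.startswith("#"):          # skip W3C header lines
                continue
            fields = line.split()
            try:
                yield tuple(fields[COLUMNS[k]]
                            for k in ("timestamp", "c-ip", "r-host", "r-ip"))
            except IndexError:                # malformed or truncated entry
                continue

if __name__ == "__main__":
    for record in passive_dns(sys.argv[1]):
        print("\t".join(record))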

On to number three: data reduction and indexing.  Logs are big, and logs are noisy.  While I recommend that you keep the raw logs for as long as you can, I understand that isn't always possible and that you have to make tough choices about what you store and for how long.  One way to squeeze more time out of your logs is to reduce the number of columns that you keep in your archives.  Using the web proxy logs as an example, you might not be able to keep every full log entry for 24 months, but keeping just the c-ip, r-host, and r-ip columns can be very helpful when you're looking back through an old, undiscovered compromise, or when you're dealing with an information request like "has any system on your network interacted with one of these IPs?"

Years ago I would have recommended further daily reduction and indexing of these files, but these days you probably have a Splunk instance or an ELK stack (https://digital-forensics.sans.org/summit-archives/dfirprague14/Finding_the_Needle_in_the_Haystack_with_FLK_Christophe_Vandeplas.pdf) and you just dump logs in there and hope that magic happens.  There's still value in examining and repurposing logs in these days of MapReduce: the reduced files that you create from the logs are easy to drop into your Hadoop cluster and build a Hive table out of.
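A sketch of that kind of reduction, reusing the hypothetical column positions from the passive-DNS example and writing one deduplicated, gzip-compressed file per day of proxy logs:

#!/usr/bin/env python
# Reduce a day's proxy log to its unique (c-ip, r-host, r-ip) triples
# and write them to a gzip-compressed archive file.
import gzip
import sys

COLUMNS = {"c-ip": 2, "r-host": 10, "r-ip": 11}   # hypothetical positions

def reduce_day(log_path, out_path):
    triples = set()
    with open(log_path) as f:
        for line in f:
            if line.startswith("#"):
                continue
            fields = line.split()
            try:
                triples.add(tuple(fields[COLUMNS[k]]
                                  for k in ("c-ip", "r-host", "r-ip")))
            except IndexError:
                continue
    with gzip.open(out_path, "wt") as out:
        for triple in sorted(triples):
            out.write("\t".join(triple) + "\n")

if __name__ == "__main__":
    reduce_day(sys.argv[1], sys.argv[2])

The output is tiny compared to the raw logs, and the flat tab-separated format drops straight into whatever query tool you have on hand.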

So, let's tie this all together.  You've received your list of IPs from your intelligence vendor and you're tasked with finding any activity on your network over the past 2 years.  In your web proxy index you see that you had a hit 8 months ago.  Now you've got an IP and a date: what machine had that IP then?  You search through your AV checkin data and get a machine name.  But the AV checkins are daily, not logged by the minute, so you look through the IP history of that machine in the AV logs and hopefully you see it consistently checking in from that IP and not moving around a lot.  If you're not so lucky, well, it's time to open up request tickets and hope you can get at the DHCP logs from back then.
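If you kept the AV checkin history in something like the SQLite table sketched earlier, that "what machine had this IP then" lookup is a quick query (table and column names are the hypothetical ones from above):

#!/usr/bin/env python
# Given an IP and a date from the proxy index, ask the AV checkin
# history which machine (and user) had that IP around that time.
import sqlite3
import sys

def who_had_ip(ip, day, db_path="av_checkin_history.db"):
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        """SELECT day, hostname, username FROM checkins
           WHERE ip = ?
             AND day BETWEEN date(?, '-3 days') AND date(?, '+3 days')
           ORDER BY day""",
        (ip, day, day)).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for day, host, user in who_had_ip(sys.argv[1], sys.argv[2]):
        print(day, host, user)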

One last parting thought: do you have waste or otherwise useless logs?  If you apply one or more of these techniques to them, can you find a way to turn them into something useful?

-KL
