Threat Level: green Handler on Duty: Didier Stevens

SANS ISC: Incident-response without NTP - SANS Internet Storm Center


Incident-response without NTP

 

While we patiently await the arrival of this month's patches from Microsoft (and everyone else who publishes today) I have a little thought experiment for you. We all know that the internet doesn't work too efficiently if DNS isn't working or present. NTP is just as critical for your security infrastructure. Without reliable clock synchronization, piecing together what happened during an incident can become extremely difficult.

Consider a hypothetical services network and DMZ: there's an external firewall, a couple of webservers, and an inner firewall with a database server behind it. Let's also assume that something bad happened to the webservers a couple of months ago and you've been brought in as a consultant to piece together the order of events and figure out what the attacker did. The web administration team, the database team, and the firewall team have all fulfilled your request for logs, and you've got them on your system of choice.

More About NTP

For a complete background on NTP I recommend: http://www.ntp.org/ntpfaq

There are two main types of clock error that we are concerned with in this example:

  • Clock Skew, also called accuracy: how close a clock is to an official time reference.
  • Clock Drift: the change in that accuracy over time.

Common clock hardware is not very accurate: an error of just 0.001% leaves a clock off by nearly one second per day, and we can expect most clocks to drift about one second every two days. The oscillator used in computer clocks is also influenced by changes in local temperature and by the quality of the electricity feeding the system.
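The back-of-the-envelope arithmetic behind those figures is worth making explicit. A minimal sketch (illustrative only):

```python
# How many seconds per day a clock gains or loses for a given
# fractional frequency error, expressed as a percentage.

SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def drift_per_day(frequency_error_pct: float) -> float:
    """Seconds of error accumulated per day for a given % frequency error."""
    return SECONDS_PER_DAY * frequency_error_pct / 100.0

# A 0.001% error works out to ~0.86 seconds per day:
print(drift_per_day(0.001))  # → 0.864
```

Over a two-month-old incident like the one in this scenario, that "nearly a second a day" compounds into nearly a minute of disagreement between unsynchronized hosts.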

Today's Challenge

How do you begin to order the events between the systems?  First I'll solicit general approaches via comments and email; later I'll summarize and provide some example data to illustrate the most popular/promising approaches.

Kevin Liston

292 Posts
ISC Handler
Thanks for addressing this. Incredibly frustrating for an incident responder to be presented with logs from system administrators who think that time synchronization isn't "worth the time" to configure.
Anonymous
We have a local NTP server. It's a commercial appliance that syncs to GPS. Not that expensive; the only hassle was getting the antenna installed. What drives me nuts is trying to get all the devices reporting in the same time zone. These days, if I don't find the data I'm looking for, I just check +/- the number of hours we are different from UTC.
John

88 Posts
We have several stratum 1 NTP servers in two different geographies (using GPS receivers). They're also peered with each other (which helps make them both a bit more stable). Then we have a larger number of stratum 2 NTP servers which are slaved to the two stratum 1 servers and peered with each other. It's important to make sure they allow ntp clients to connect but ignore any updates or time info from anyone but the peers and upstream clock sources.
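A hypothetical ntp.conf sketch of that "serve clients, trust only peers and upstream" arrangement for one of the stratum 2 servers (all hostnames are placeholders, not real infrastructure):

```
# Upstream stratum 1 clock sources (GPS-disciplined)
server ntp1.example.com iburst
server ntp2.example.com iburst

# Peer with the other stratum 2 servers for stability
peer ntp3.example.com
peer ntp4.example.com

# Serve time to clients, but ignore their time info and refuse
# runtime configuration/query abuse. Explicitly configured servers
# and peers may need their own, more permissive restrict lines.
restrict default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
```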

Finally, all of our servers' clocks are monitored via nagios (I just recently uploaded a nagios plugin to exchange.nagios.org for comparing a local clock against multiple NTP servers and reporting the average offset - search for "sgichk" if you want it). For servers whose logs are most critical (email relays, DNS servers, etc) nagios throws an alarm if their clocks are off by a fraction of a second. For less critical systems we don't throw an alarm unless the clock is off by a few seconds and/or off for a lengthier period of time.
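The tiered-threshold logic described there can be sketched in a few lines. This is not the actual sgichk plugin; the function name and thresholds are made up for illustration:

```python
# Average a host's measured clock offsets against several NTP references
# and alarm when the average exceeds a per-tier threshold.

from statistics import mean

def check_clock(offsets_s, warn_threshold_s):
    """offsets_s: offsets (seconds) measured against several NTP servers.
    Returns (avg_offset, alarm); alarm is True if |avg| exceeds threshold."""
    avg = mean(offsets_s)
    return avg, abs(avg) > warn_threshold_s

# Critical tier (mail relays, DNS): alarm on a fraction of a second
print(check_clock([0.21, 0.24, 0.18], warn_threshold_s=0.1))   # alarm fires
# Less critical tier: tolerate a few seconds of offset
print(check_clock([0.21, 0.24, 0.18], warn_threshold_s=3.0))   # no alarm
```

Averaging over several references guards against any single upstream server being the one with the bad clock.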

Lastly, keeping Linux VMs' clocks in sync can be a pain. You may need to boot with special kernel tweaks and/or special options set in ntp.conf (vmware has a whole page about this). And what vmware needs varies from one flavor/version of linux to another. (whether or not one uses VDR seems to play a role too)
Brent

116 Posts
One other thing to keep in mind is differences in timezone. When you have different servers in different geographies, one way to keep their logs in sync is to configure them all to use UTC (or at least all the same timezone). This may have other ramifications, of course (how many web apps let you tell 'em to use dates/times from a timezone different than the system's configured timezone?).

Another possibility is using something like syslog-ng on a centralized server to automagically adjust the times on syslog data coming from a server with a different timezone to the timezone of the centralized server.
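The same normalization can be done after the fact when merging collected logs. A minimal sketch, assuming a hypothetical log format where each timestamp carries an explicit UTC offset:

```python
# Rewrite a timestamp with an explicit UTC offset into plain UTC, so logs
# from hosts in different timezones line up when merged.

from datetime import datetime, timezone

def to_utc(stamp: str) -> str:
    """Parse a '+HHMM'/'-HHMM'-suffixed timestamp and re-emit it in UTC."""
    dt = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")

print(to_utc("2011-02-08 09:30:00 -0500"))  # → 2011-02-08 14:30:00 UTC
```

In practice the hard part is logs that *don't* record their offset; then you need out-of-band knowledge of each host's configured timezone.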

Basically, the ideal is to have all of your logged events using times in the same timezone so you don't have to do the mental arithmetic while looking at logs for different servers/devices/services during a crisis. Whether this is done by making all the systems have the same timezone or tweaking the software writing the logs to write events using the same timezone (or some other mechanism) is up to you.
Brent

116 Posts
Also note the effect of DST/non-DST changes on displayed dates/times. Windows stores most date and time information in UTC (the eventlog, registry key change dates/times, and NTFS timestamps), but all of it is displayed using local time. (Note: FAT/FAT32 filesystems store and show date/time info in local time.)

In reality, what Windows does when DST comes into effect is add one hour to the local timezone offset.

Hence a timestamp that was written at 23:30 local _winter_ time may appear to have taken place the next day if you look at it in summertime.

See for example http://ask-leo.com/why_do_file_timestamps_compare_differently_every_time_change.html
Erik van Straten

122 Posts
I've had some luck stitching together the logs using the ssh bruteforce background radiation.
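The idea generalizes: the same scanning source hits several of your hosts, so matching its connection bursts across two logs yields a per-host clock offset. A toy sketch with hypothetical event lists:

```python
# Given timestamps (epoch seconds) of the SAME observed events as recorded
# by two hosts, estimate the second host's clock offset relative to the
# first. The median resists a stray mismatched pair.

from statistics import median

def estimate_offset(times_a, times_b):
    """Paired event times from host A and host B; returns B's offset vs A."""
    return median(b - a for a, b in zip(times_a, times_b))

# The same three SSH probe packets, as logged by a firewall (A)
# and a webserver (B) with an unsynchronized clock:
fw  = [1297100000, 1297100007, 1297100015]
web = [1297100122, 1297100129, 1297100136]
print(estimate_offset(fw, web))  # → 122 (webserver is ~2 minutes ahead)
```

Once you have an offset per host, you can shift each log onto a common timeline before interleaving events.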
Erik van Straten
39 Posts
First off, I think the phrase "ssh bruteforce background radiation" almost had me on the floor laughing. I suggest you submit it to Randall Munroe and see if it makes it into an xkcd.

Second, 1/1/1972 was a long time ago. I thought Tim Berners-Lee was responsible for developing the first web servers at CERN in the early 90s. Wikipedia places it on Aug 6, 1991. So perhaps the records you are getting are a bit suspect. The epoch values are also a little strange - if those are supposed to be Unix epoch timestamps, those should be for Sun Jan 4 01:03:10 1970 or thereabouts, which is even earlier. Hell, all of this even predates the TCP/IP specifications. If y'all are using time machines, you should be aware that they have a tendency to wreak havoc with NTP!

Third, if you have servers in different timezones, it's possible they are also geographically separated. Since you have a single firewall, there could be routing delay issues that further complicate things. I support machines (luckily not servers, just workstations) that are on the other end of satellite links with 600ms latency. Under heavy congestion, I regularly see ping values in the 3 to 5 second range. That could cause some log skew!
Anonymous
The 1/1/1972 dates are for obfuscation purposes.

I am fond of using SSH brute-force scans to sync; I'll pull those out of the logs and see how that pans out...
Kevin Liston

292 Posts
ISC Handler
For linear time shifts, I know you can use Wireshark's editcap tool to massage the times in pcap files.
For web server logs or Windows Event Logs (or really any delimited text-file with a time column), you could work something out with Microsoft's LogParser. Seriously, LogParser is an incredibly useful tool for all sorts of log manipulation and extraction.
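If neither tool is at hand, the same linear correction is easy to script. A sketch assuming a hypothetical comma-delimited log with the timestamp in the first column:

```python
# Shift every timestamp in a delimited log line by a fixed offset,
# e.g. the per-host offset estimated from correlated events.

from datetime import datetime, timedelta

def shift_log_line(line: str, offset: timedelta,
                   fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Assumes the timestamp is the first comma-delimited column."""
    stamp, rest = line.split(",", 1)
    fixed = datetime.strptime(stamp, fmt) + offset
    return fixed.strftime(fmt) + "," + rest

line = "2011-02-08 10:00:05,GET /index.html,200"
print(shift_log_line(line, timedelta(seconds=-122)))
# → 2011-02-08 09:58:03,GET /index.html,200
```

Note this only handles a constant offset; a drifting clock needs a correction that grows with time, which is where the "linear" adjustments in editcap and LogParser queries come in.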
Jasey

93 Posts
