Monday morning incident handler practice

Published: 2011-07-25
Last Updated: 2011-07-25 03:12:44 UTC
by Chris Mohan (Version: 1)
13 comment(s)

This is a hypothetical scenario to get the old grey matter thinking about how you, the incident handler, would respond. To keep this a piece of light entertainment while sipping coffee, focus on just three phases (containment, eradication and recovery) of the six-step incident handling process. Feel free to apply your own incident response plans to this scenario; I don’t expect anyone to post their answers to the questions. This is simply something to warm up the brain after the weekend, or to help those recovering after the week that was SANSFire.

The Scenario:

A very popular news web site is compromised and the front page is offering up known malware, AB, to anyone who visits it. You first discover this as the AV console frantically starts receiving notifications from client machines visiting the infected site. A quick bit of research reveals that the malware AB exploits a vulnerability found only in Internet Explorer 6 and then attempts to phone home, uploading the compromised machine's IE protected storage details to any one of 30 drop web sites via HTTP. If the malware infects the system, it then attempts to download, via FTP and HTTPS, a .exe file containing more malware designed to hunt over TCP port 445 for machines without patch MS08-067 (KB 958644) and drop a hidden .exe file in %SYSTEMROOT%\System32. The AV companies released a signature file to detect and protect against this three weeks ago.

You're the lone security person for a company of 5,000 employees across 10 sites. The standard operating system is Windows XP with versions of Internet Explorer from 6 to 8. The IT team use Microsoft’s System Center Configuration Manager to manage the Windows systems and deploy software and patches. You are the firewall and AV admin, and the IT support staff are competent but overworked and under-resourced. Two of the ten sites have no IT staff on site.

The Problem:

Over half your company, including all of senior management, visit that site daily to keep themselves informed or read the gossip of the day. From the IT team's best estimates, at least 3000 machines have IE 6, and roughly 300 of those machines probably don't have the right level of AV definitions to protect against the malware, for any number of reasons. The news web site isn’t going to be able to remove the malware distribution for up to 12 hours. Ten server systems you know of don't have the MS08-067 patch, due to an operational issue with support from a third-party vendor, but they require TCP 445 to be available to internal systems.

The AV console currently shows 1200 alerts, and the count is growing by the minute.

The Questions:

  •     What do you do to contain this incident?
  •     How can you identify infected machines?
  •     What do you do with infected machines?
  •     How can you identify any other at-risk machines?
  •     How can you protect the 10 servers without MS08-067?
  •     What information do you communicate to staff, IT and management?

Chris Mohan --- Internet Storm Center Handler on Duty

Comments

What do you do to contain this incident?
1. Block the infected site's traffic at the firewall.
2. Block all outbound FTP and HTTPS traffic (assuming you don't know the 30 drop servers; if you do, block only that traffic).
3. If infected machines are on specific segments, consider isolating the segments.
4. Since AV is detecting the malware and protecting machines, update the AV signatures on the machines that lack them.

How can you identify infected machines?
1. By watching the traffic at the firewall for connection attempts to malicious sites (both the initial news site and the 30 malware download sites).
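Watching the firewall for connections to the malicious sites can be sketched roughly as below. This is only an illustration: the log format (whitespace-separated timestamp, source IP, destination host, destination port) and the hostnames in `BAD_HOSTS` are assumptions made up for the example, not anything given in the scenario.

```python
from collections import defaultdict

# Hypothetical indicator list; real entries would come from research on the malware.
BAD_HOSTS = {"news.example.com", "drop1.example.net"}


def suspect_clients(log_lines, bad_hosts=BAD_HOSTS):
    """Map each source IP to the set of known-bad hosts it tried to reach.

    Assumes each log line looks like:
        <timestamp> <src_ip> <dst_host> <dst_port>
    """
    hits = defaultdict(set)
    for line in log_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip lines that do not match the assumed format
        _timestamp, src_ip, dst_host, _dst_port = fields[:4]
        host = dst_host.lower()
        if host in bad_hosts:
            hits[src_ip].add(host)
    return dict(hits)
```

A client that shows up only against the news site may merely have been exposed; one that also contacts a drop or download host is a much stronger candidate for infection.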

What do you do with infected machines?
1. Analyze the malware, identify all of its characteristics, and then come up with a remediation plan (which may be as extreme as re-imaging the systems).
2. Ask the users of those machines to change their passwords for the sites they may have logged on to. I would also ask them to change their domain passwords if they use that password on any other systems (whose credentials may have been saved in the protected storage area).

How can you identify any other at risk machines?
Through malware analysis, one should be able to determine the threat.

How can you protect the 10 servers without MS08-067?
Update the AV on these systems and make sure AV is scanning every file on disk writes.

What information do you communicate to staff, IT and management?
That it may be a long week and we may need to re-image all infected systems. I would ask IT to get their backup images and Ghost/PXE ready to go. I would also work closely with the Network group on containment and remediation.
For management, I would explain exactly what happened in layman's terms and make them understand that we may need to take down some portions of the network for remediation (including the 10 servers).

Your item 1 should last only until you can route all traffic for that site through a proxy that silently strips the malware from the pages it loads.

What do you do to contain this incident?
Use any available resources that can block the infected news site serving AB. The news site, while used by everyone, does not appear to be business critical. Using firewalls, content filters, and even manual DNS modifications to get a block in place and prevent new infections is containment.

How can you identify infected machine?
Use the central management console of your anti-malware solution to detect infected clients using the latest virus definition files.
Use firewall logging to detect connections to the infected news site and the other 10+ malicious systems.
A solution like NetWitness (I don't work for them, but have used their products a bunch) would give a central view of network activity. Used with its Informer reporting, it would give you proactive monitoring and alerting when new systems access the site. This solution (Wireshark on steroids) would aid a deeper analysis of exactly what the malware is, how it's moving around, etc.

What do you do with infected machines?
VLAN them to a separate area to prevent infection of other corporate machines. This VLAN may be part of a "remediation VLAN" that has access to certain network resources, such as the AV server, Windows Update, SCCM, WSUS, and a patch repository. The goal, once the machines are contained and the infection is no longer spreading from them, is to remediate.

How can you identify any other at risk machines?
Use patch management systems like WSUS (or any other commercial solution) to report on systems not patched against this vulnerability.
If no patch management is available or it is not widely used, perform vulnerability scanning across your network (e.g. Nessus, Rapid7, etc.) to detect vulnerable machines.
Use central anti-malware consoles to detect machines not running the latest virus definition files, or not running AV at all, and address them.
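The cross-referencing described above (patch reports plus AV console data) could be combined along these lines. This is a minimal sketch: the `Host` record and its fields are hypothetical stand-ins for whatever your patch management and AV exports actually provide, and the signature release date is an assumed value based on "three weeks ago" in the scenario.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Host:
    """Hypothetical inventory record merged from patch and AV reports."""
    name: str
    ie_major: int                # major Internet Explorer version
    av_defs: Optional[date]      # date of installed AV definitions, None if no AV
    has_ms08_067: bool           # True if KB 958644 is installed


def risk_reasons(host, signature_released):
    """Return the reasons a host is at risk (an empty list means none found)."""
    reasons = []
    stale_av = host.av_defs is None or host.av_defs < signature_released
    if host.ie_major == 6 and stale_av:
        reasons.append("IE6 without AB signature")
    if not host.has_ms08_067:
        reasons.append("missing MS08-067")
    return reasons


def at_risk(hosts, signature_released):
    """Map host name to its risk reasons, omitting hosts with none."""
    report = {}
    for host in hosts:
        reasons = risk_reasons(host, signature_released)
        if reasons:
            report[host.name] = reasons
    return report
```

Keeping the reasons separate matters for prioritisation: an IE6 machine with stale definitions is exposed to the initial drive-by, while an unpatched machine is exposed to the second-stage payload spreading internally over TCP 445.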

How can you protect the 10 servers without MS08-067?
Firewall these servers. Most networking switches now offer firewall modules, so you can filter traffic on specific ports. Servers in general shouldn't be connecting to the internet at all, or at least not directly. Use proxy servers and content filters, and segment these servers into a separate VLAN if possible (this can be a lot of work depending on what they talk to in the back end). Get AV on them if possible, but you'll need to tune a policy that doesn't kill performance.
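The filtering logic for TCP 445 on those servers might look something like the sketch below, written in Python purely to show the decision rather than as real firewall configuration. The subnets are invented for illustration; in practice the allow list would be whatever internal systems genuinely need SMB access to the unpatched servers.

```python
import ipaddress

# Hypothetical: internal ranges that genuinely need SMB to the unpatched servers.
SMB_ALLOWED_SOURCES = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("10.30.5.17/32"),
]

# Hypothetical: a containment range carved out of an otherwise allowed subnet.
QUARANTINE = ipaddress.ip_network("10.20.99.0/24")


def smb_allowed(src_ip):
    """Decide whether a source may reach TCP 445 on an unpatched server."""
    src = ipaddress.ip_address(src_ip)
    if src in QUARANTINE:
        return False  # contained machines never reach the servers
    return any(src in net for net in SMB_ALLOWED_SOURCES)
```

The deny for the quarantine range is checked first so that it wins even when the contained machines sit inside an otherwise permitted subnet.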

What information do you communicate to staff, IT and management?
The importance of documentation! If this happens to your company, and you are running around trying to identify things or figure out how to identify things, you might need stronger documentation. In a perfect world, if this were to happen, there would be documents showing the configuration of servers, VLANs, what they talk to, etc.

From the standpoint of a security guy talking to non-security-focused IT staff:

- The importance of patching. Patch management solutions, especially WSUS, don't always work. They rely on the client-side Windows Update service running properly, and on firewalls allowing the traffic. Just because a system is configured to use WSUS doesn't mean it will work. Review the patch management logs.
- The importance of AV. Read the logs in the central management console. Get alerts. Treat every alert as if it were serious.
- The importance of proxy servers or content filters. This is a management buy-in thing, though. Companies look at them as a way to block employees from doing fun stuff. Stress how a solution like this can be used to block the bad stuff.
- The importance of getting an IRP/IRT in place. No one wants to be woken up at 1am, or be sitting at work after 5pm, not knowing what to do, how to do it, or who to go to for updates and reporting. Define a plan and the people for when this stuff happens.
- The importance of best security practice. This covers a lot of stuff, but I've used incidents like this example to reinforce things like:
* Normal AD accounts shouldn't be domain admin accounts; separate the two.
* Use good passwords. Good passwords are not the standard 8 characters, 1 upper, 1 lower, 1 number that all our password policies require :) Use passphrases, etc.
* Lock systems when walking away.
* Don't use corporate machines (heck, even home machines) to browse "suspicious" websites.
* Don't torrent/p2p (the bad kind). This is just asking for bad stuff to happen.
* Keep a clean system. Use Secunia to keep tabs on everything you need to update on your system.
* When at all possible, use UAC and never be logged in as a local admin. The "Run as" feature is awesome :) Most malware needs admin permissions to install and spread.

EZ 1 -- role switch to a back-in-time box. Even when understaffed as described (typical), there is no reason to shoot yourself in the head (unless you really want to)... just fail over to the snapshot you have from one nanosecond before "it" happened. If you are reading this you already know how to do this with cheap hardware and little or NO staff. If not, call 415 - 515-9445 and I will tell you how for free. Oh, and if you want to preserve the event forensically, set up your role switch with a forensic (no touch) flag on your IPS.

Honestly, with that kind of security staffing the company doesn't care about security. First on my list would be looking for a new job. And second would be an "I told you so" memo to my boss/CEO.

If you are in the U.S.: After the hubbub dies down, write a letter to the FBI, FTC, your Congressperson, your Senators and anyone else you can think of to complain that the Federal Government is doing almost nothing to stop these people.

> Ten server systems you know of don't have the MS08-067 patch, due to an operational issue with support from a third-party vendor, but they require TCP 445 to be available to internal systems.

Then these servers are offline until the storm clears.

I'll block the feeder sites and the news site, but it won't do any good.

Another vote for a proxy firewall.

Squid is fast and free and supports this in various ways:

wiki.squid-cache.org/SquidFaq/ContentAdaptation

I *love* nduda78's containment VLAN idea and may steal it :)
I'd put workstations in there, where they won't be able to reach the non-MS08-067-patched servers or anything business-critical, and won't be able to leak any data out. Once in there, perhaps direct the user to a secure/patched, catch-all HTTP service that explains that their workstation is compromised, and what they can do about it in the meantime. This notification/user education part is very important; the handler will be too busy during this to assist each user individually or promptly, and the more that users can do to clean this up for themselves or each other, the better.
Infected workstations need to be identified and put into containment *quickly* for this to work, though, so it must be scripted somehow. They can be identified from AV alerts, by hits on the original malware source URL (implying the IE6 exploit succeeded, so they need to upgrade), or hits on the download sites of its additional payload (implying infection); or an alert from some internal IDS or honeypot (suggesting active MS08-067 exploit/scanning). The time and reason why a workstation gets moved to containment should be logged, as well as any additional detections whilst the workstation is already in containment, for later review when cleaning up the workstations.
Much of this can be prepared in advance in case of a future incident like this.
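The bookkeeping side of scripted containment (recording when and why a workstation was moved, plus any later detections) could be prepared in advance along these lines. This is a rough sketch: `move_to_vlan` is a hypothetical callable standing in for whatever actually reassigns the switch port or NAC policy in your environment, and the host names are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ContainmentRecord:
    host: str
    contained_at: datetime
    reason: str
    later_detections: list = field(default_factory=list)


class ContainmentLog:
    """Tracks which hosts were contained, when, and why."""

    def __init__(self, move_to_vlan):
        # move_to_vlan: hypothetical callable that performs the actual move
        self._move_to_vlan = move_to_vlan
        self.records = {}

    def report(self, host, reason, when=None):
        """Contain the host on first detection; log any later detections.

        Returns True if this call moved the host into containment.
        """
        when = when or datetime.now(timezone.utc)
        record = self.records.get(host)
        if record is None:
            self._move_to_vlan(host)
            self.records[host] = ContainmentRecord(host, when, reason)
            return True
        record.later_detections.append((when, reason))
        return False
```

For example, an AV alert for `ws-0042` followed by a hit on a payload download site would move the workstation once and leave both events on record for the later clean-up review.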
@ Steven. Thanks!

I'm pretty sure there are commercial solutions out there (you know, the open-the-box, install, good-to-go type) that can do this kind of stuff: NAC solutions, IPS perhaps. Some networking logic that says "If I detect this kind of traffic, then *kick*, off you go to the VLAN of death." It wouldn't take much to put some remediation solutions on that VLAN: perhaps a connection to the internet, firewalled to allow only Windows Update and the AV vendor's DAT update URLs; a local patch repository; a local Rapid7 or Nessus scanner, perhaps.

In any case, this stuff should be pretty trivial to put together if you have the right people at your disposal. It might take some time, but it would pay off in the end. I don't know about you, but I'd love to respond to such an incident by saying "We contained all systems on VLANxx, separated them from other core internal systems, remediated, and tossed them back" rather than "Umm, we just addressed them as we became aware of the issue".
