
SANS ISC: Monday morning incident handler practice - SANS Internet Storm Center SANS ISC InfoSec Forums


Monday morning incident handler practice

This is a hypothetical scenario to get the old grey matter thinking on how you, the incident handler, would respond. To make this a piece of light entertainment when sipping coffee, just focus on three phases of the six-step incident handling process: containment, eradication and recovery. Feel free to apply your own incident response plans to this scenario; I don’t expect anyone to post their answers to the questions. This is simply something to warm up the brain after the weekend, or to help those recovering after the week that was SANSFire.

The Scenario:

A very popular news web site is compromised and the front page is offering up known malware, AB, to anyone who visits it. You first discover this as the AV console frantically starts receiving notifications from client machines visiting the infected site. A quick bit of research reveals the malware AB exploits a vulnerability in Internet Explorer 6 only and then attempts to phone home, uploading the compromised machine's IE protected storage details to any one of 30 drop web sites via HTTP. If the malware infects the system, it then attempts to download, via FTP and HTTPS, a .exe file containing more malware designed to hunt over port TCP 445 for machines without patch MS08-067 (KB 958644) and drop a hidden .exe file in %SYSTEMROOT%/System32. The AV companies released a signature file to detect and protect against this three weeks ago.

You're the lone security person for a company of 5,000 employees across 10 sites. The standard operating system is Windows XP with versions of Internet Explorer from 6 to 8. The IT team use Microsoft’s System Center Configuration Manager to manage the Windows systems and deploy software and patches. You are the firewall and AV admin, and the IT support staff are competent but overworked and under-resourced. Two of the ten sites have no IT staff on site.

The Problem:

Over half your company, including all of senior management, visits that site daily to keep informed or read the gossip of the day. From the IT team's best estimates, at least 3000 machines have IE 6 and roughly 300 of those probably don't have the right level of AV definitions to protect against it, for any number of reasons. The news web site isn’t going to be able to remove the malware distribution for up to 12 hours. Ten server systems you know of don't have the MS08-067 patch due to an operational issue with support from a 3rd party vendor, but they require TCP 445 to be available to internal systems.

The AV console currently has 1200 alerts and growing by the minute.

The Questions:

  •     What do you do to contain this incident?
  •     How can you identify infected machines?
  •     What do you do with infected machines?
  •     How can you identify any other at risk machines?
  •     How can you protect the 10 servers without MS08-067?
  •     What information do you communicate to staff, IT and management?

Chris Mohan --- Internet Storm Center Handler on Duty

Chris

105 Posts
ISC Handler
What do you do to contain this incident?
1. Block the infected site's traffic at the firewall.
2. Block all FTP and HTTPS outbound traffic. (Assuming you don't know the 30 drop servers; if you do, block only that traffic.)
3. If infected machines are on specific segments, consider isolating the segments.
4. Since current AV signatures detect and protect against this, update the signatures on the machines that lack them.
How can you identify infected machines?
1. by watching the traffic at the firewall for connection attempts to malicious sites. (both the initial news site and the 30 malware download sites)
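
As a rough illustration of that log-watching step, here is a minimal Python sketch. It assumes a hypothetical whitespace-separated log export with the source IP and destination host in fixed columns; the field layout, the `find_suspect_hosts` name and the example domains are all invented for illustration, so adapt it to whatever your firewall actually emits.

```python
from collections import defaultdict

# Hypothetical indicators: replace with the real news site and drop sites.
NEWS_SITE = "news.example.com"
DROP_SITES = {"drop1.example.net", "drop2.example.net"}


def find_suspect_hosts(log_lines):
    """Map each internal source IP to the set of flags raised by its traffic.

    Assumes each line looks like:
        "<time> <src_ip> <dst_host> <dst_port> <action>"
    This layout is an assumption; real firewall logs will differ.
    """
    hits = defaultdict(set)
    for line in log_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip malformed or truncated lines
        _, src_ip, dst_host, _, _ = fields[:5]
        dst_host = dst_host.lower()
        if dst_host == NEWS_SITE:
            hits[src_ip].add("exposed")      # visited the compromised page
        elif dst_host in DROP_SITES:
            hits[src_ip].add("phoned-home")  # contacted a drop site
    return dict(hits)


sample = [
    "09:01:02 10.1.2.3 news.example.com 80 ALLOW",
    "09:01:09 10.1.2.3 drop1.example.net 80 DENY",
    "09:02:00 10.1.4.7 news.example.com 80 ALLOW",
    "09:02:30 10.1.9.9 intranet.example.org 443 ALLOW",
]
suspects = find_suspect_hosts(sample)
for ip in sorted(suspects):
    print(ip, sorted(suspects[ip]))
```

A host flagged only as "exposed" visited the page but may have been protected by AV or a newer browser; one flagged "phoned-home" tried to reach a drop site and should be treated as infected.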

What do you do with infected machines?
1. Analyze the malware and identify all of its characteristics and then come up with a remediation plan. (which may be as extreme as re-imaging the systems)
2. Ask the users of those machines to change their passwords for the sites they may have logged on to. I would also ask them to change their domain passwords if they use that password on any other systems (whose credentials may have been saved in the protected storage area).

How can you identify any other at risk machines?
By malware analysis, one should be able to determine the threat.

How can you protect the 10 servers without MS08-067?
Update the AV on the system and make sure AV is scanning every file at "disk-writes"

What information do you communicate to staff, IT and management?
That it may be a long week and we may need to re-image all infected systems. I would ask IT to get their backup images and Ghost/PXE ready to go. I would also work closely with the Network group on the containment and remediation.
For management, I would tell them exactly what happened in layman's terms and make them understand that we may need to take down some portions of the network for remediation (including the 10 servers).

Anonymous
your 1. should just last until you can route all traffic for that site through a proxy that silently drops the malware from the pages it loads
Anonymous
What do you do to contain this incident?
Use any resources available that allow blocking of this infected AB site. This AB news site, while used by everyone, does not appear to be business critical. Using firewalls, content filters, and even manual DNS modifications to get a block in place and prevent new infections is containment.
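
For the manual DNS route, a small sketch of turning a list of bad domains into sinkhole entries might look like the following. It emits hosts-file style lines (`IP hostname`); the `sinkhole_entries` helper, the sinkhole address and the domain names are assumptions for illustration. At 5,000 seats you would more likely feed the cleaned list into your internal DNS server than edit hosts files.

```python
def sinkhole_entries(domains, sinkhole_ip="0.0.0.0"):
    """Return hosts-file style lines pointing each domain at sinkhole_ip.

    Names are lower-cased, stripped of whitespace and any trailing dot,
    and de-duplicated while keeping their original order.
    """
    seen = set()
    lines = []
    for raw in domains:
        name = raw.strip().lower().rstrip(".")
        if not name or name in seen:
            continue  # ignore blanks and repeats
        seen.add(name)
        lines.append(f"{sinkhole_ip} {name}")
    return lines


# Hypothetical blocklist: the news site plus known drop sites.
blocklist = ["News.Example.com", "drop1.example.net.", "  ", "news.example.com"]
print("\n".join(sinkhole_entries(blocklist)))
```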

How can you identify infected machines?
Use central management of your AntiMalware solution to detect infected clients using the latest virus update files.
Use firewall logging to detect connections to the AB website and other 10+ malicious systems.
(I don't work for them, but have used them a bunch.) A solution like NetWitness would allow for a central view of network traffic. Used with its Informer reporting, it would give you proactive monitoring and alerting when new systems access the site. This solution (Wireshark on steroids) would also aid in a deeper analysis of exactly what the malware is, how it's moving around, etc.

What do you do with infected machines?
VLAN them to a separate area to prevent infection of other corporate machines. This VLAN may be part of a "remediation VLAN" that has access to certain network devices, like the AV server, Windows Update, SCCM, WSUS, and a repository for patches. The goal, once the machines are contained and the infection is no longer spreading from them, is to remediate.

How can you identify any other at risk machines?
Utilize patch management systems like WSUS (or any other commercial solution) for reports on systems not patched from this vulnerability.
If no patch management is available or not widely used, perform vulnerability scanning across your network (e.g. Nessus, Rapid7, etc.) to detect vulnerable machines.
Use central AntiMalware consoles to detect machines not running the latest virus files, or not running AV at all, and address them.
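
Once those reports are exported, combining them is a simple filter. The sketch below assumes a hypothetical merged inventory (machine name, IE major version, AV definition date, MS08-067 status); the `Machine` record, the `risk_reasons` helper and all sample values, including the placeholder signature date, are made up for illustration. It flags the two exposures in the scenario: IE6 with definitions older than the AB signature, and a missing MS08-067 patch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Machine:
    name: str
    ie_major: int        # installed Internet Explorer major version
    av_defs_date: date   # date of the AV definitions on the machine
    has_ms08_067: bool   # True if KB958644 is installed


def risk_reasons(machine, signature_date):
    """Return the reasons a machine is at risk (empty list if none)."""
    reasons = []
    if machine.ie_major == 6 and machine.av_defs_date < signature_date:
        reasons.append("IE6 with AV definitions predating the AB signature")
    if not machine.has_ms08_067:
        reasons.append("missing MS08-067 (KB958644)")
    return reasons


# Placeholder date for when the AV vendor released the AB signature.
signature_date = date(2020, 1, 10)
inventory = [
    Machine("ws-001", 6, date(2020, 1, 2), True),
    Machine("ws-002", 8, date(2020, 1, 2), True),
    Machine("srv-01", 7, date(2020, 1, 20), False),
]
at_risk = {}
for m in inventory:
    reasons = risk_reasons(m, signature_date)
    if reasons:
        at_risk[m.name] = reasons
        print(m.name, "->", "; ".join(reasons))
```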

How can you protect the 10 servers without MS08-067?
Firewall these servers. Most networking switches now offer firewall modules, so you can filter on specific ports. Servers in general shouldn't be connecting to the internet at all, or at least not directly. Use proxy servers, content filters, and segment these servers into a separate VLAN if possible (this can be a lot of work depending on what they talk to in the back end). Get AV on them if possible, but you'll need to tune a policy that doesn't kill performance.
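
To make the firewalling idea concrete, here is a sketch of the decision logic such a rule set would encode for TCP 445: permit only the internal clients that genuinely need the servers, and deny everything else, including anything in a quarantine VLAN. This is a model for reasoning about the rules, not configuration for any particular firewall, and every subnet and name in it is an assumption.

```python
import ipaddress

# Hypothetical addressing, invented for this example.
UNPATCHED_SERVERS = ipaddress.ip_network("10.50.0.0/28")
TRUSTED_CLIENTS = [ipaddress.ip_network("10.20.0.0/24")]
QUARANTINE_VLAN = ipaddress.ip_network("10.99.0.0/24")


def smb_verdict(src, dst, port):
    """Return 'permit', 'deny' or 'not-applicable' for a single flow."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if dst_ip not in UNPATCHED_SERVERS or port != 445:
        return "not-applicable"  # this rule set only covers SMB to the 10 servers
    if src_ip in QUARANTINE_VLAN:
        return "deny"            # contained machines never reach the servers
    if any(src_ip in net for net in TRUSTED_CLIENTS):
        return "permit"          # known clients that need the service
    return "deny"                # default deny for everyone else


print(smb_verdict("10.20.0.5", "10.50.0.3", 445))
print(smb_verdict("10.99.0.8", "10.50.0.3", 445))
```

The default-deny last line is the important design choice: a workstation nobody thought to list cannot scan the unpatched servers.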

What information do you communicate to staff, IT and management?
The importance of documentation! If this happens to your company, and you are running around trying to identify things or figure out how to identify things, you might need stronger documentation. In a perfect world, if this were to happen, there should be some documents that show the configuration of servers, vlans, what they talk to...etc.

From a security standpoint of a security guy to other non-security focused IT staff...

- The importance of patching. Patch management solutions, especially WSUS, don't always work. They rely on the client-side Windows Update service running properly, and on firewalling. Just because a system is configured to use WSUS doesn't mean it will work. Review the patch management logs.
- The importance of AV. Read the logs for central management. Get alerts. Address all alerts as if they were serious.
- The importance of proxy servers or content filters. This is a management buy-in thing though. Companies look at them as a way to block employees from doing fun stuff. Stress how a solution like this can be used to block the bad stuff.
- The importance of getting an IRP/IRT in place. No one wants to be woken up at 1am, or sitting in work after 5pm not knowing what to do, how to do it, or who to see for updates and reporting. Define a plan and people for when this stuff happens.
- The importance of best security practice. This contains a lot of stuff, but I've used incidents like this example to reinforce things like:
* Normal AD accounts shouldn't be domain admin accounts; separate the two.
* Use good passwords. Good passwords are not the standard 8 characters, 1 upper, 1 lower, 1 number that all our password policies are :) Use passphrases, etc.
* Lock systems when walking away.
* Don't use corporate machines (heck, even home machines) to browse "suspicious" websites.
* Don't torrent/p2p (the bad kind). This is just asking for bad stuff to happen.
* Keep a clean system. Use Secunia to keep tabs on everything you need to update on your system.
* When at all possible, use UAC and never be logged in as a local admin. The "Run as" feature is awesome :) Most malware needs admin permissions to install and spread.
Anonymous
EZ 1 -- role switch to a back-in-time box. Even when understaffed as described (typical), there is no reason to shoot yourself in the head (unless you really want to)... just fail over to the snapshot you have from one nanosecond before "it" happened. If you are reading this you already know how to do this with cheap hardware and little or NO staff. If not, call 415 - 515-9445... I will tell you how for free. Oh, if you want to preserve the event forensically, set up your role switch with a forensic (no touch) flag on your IPS.
Anonymous
Honestly, with that kind of security staffing the company doesn't care about security. First on my list would be looking for a new job. And second would be an "I told you so" memo to my boss/CEO.
geekyone

1 Posts
If you are in the U.S.: After the hubbub dies down, write a letter to the FBI, FTC, your Congressperson, your Senators and anyone else you can think of to complain that the Federal Government is doing almost nothing to stop these people.
KBR

63 Posts
> 10 server systems you know of don't have the MS08-067 patch due to operational issue with supporting from a 3rd party vendor but requires TCP 445 to be available to internal systems.

Then these servers are offline until the storm clears.

I'll block the feeder sites and the news site but it won't do any good.
KBR
39 Posts
Another vote for a proxy firewall.

Squid is fast and free and supports this in various ways:

wiki.squid-cache.org/SquidFaq/ContentAdaptation
KBR
5 Posts
I *love* nduda78's containment VLAN idea and may steal it :)
I'd put workstations in there, where they won't be able to reach the non-MS08-067-patched servers or anything business-critical, and won't be able to leak any data out. Once in there, perhaps direct the user to a secure/patched, catch-all HTTP service that explains that their workstation is compromised, and what they can do about it in the meantime. This notification/user education part is very important; the handler will be too busy during this to assist each user individually or promptly, and the more that users can do to clean this up for themselves or each other, the better.
Infected workstations need to be identified and put into containment *quickly* for this to work, though, so it must be scripted somehow. They can be identified from AV alerts, by hits on the original malware source URL (implying the IE6 exploit succeeded, so they need to upgrade), or hits on the download sites of its additional payload (implying infection); or an alert from some internal IDS or honeypot (suggesting active MS08-067 exploit/scanning). The time and reason why a workstation gets moved to containment should be logged, as well as any additional detections whilst the workstation is already in containment, for later review when cleaning up the workstations.
Much of this can be prepared in advance in case of a future incident like this.
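
A minimal sketch of the logging side of that, in Python: each detection is recorded with its time and reason, and the first detection for a host triggers the move to containment. The `ContainmentLog` class and the `move_to_containment` callback are hypothetical; the callback stands in for whatever actually changes the switch port or NAC policy in your environment.

```python
from datetime import datetime, timezone


class ContainmentLog:
    """Record why and when hosts were contained, including later detections."""

    def __init__(self, move_to_containment):
        # Hypothetical callback that performs the actual VLAN change.
        self._move = move_to_containment
        self.events = {}  # host -> list of (timestamp, reason)

    def report(self, host, reason, when=None):
        """Log a detection; contain the host if this is its first one."""
        when = when or datetime.now(timezone.utc)
        newly_contained = host not in self.events
        self.events.setdefault(host, []).append((when, reason))
        if newly_contained:
            self._move(host)
        return newly_contained


moved = []  # stand-in for the real containment action
log = ContainmentLog(move_to_containment=moved.append)
log.report("ws-001", "AV alert")
log.report("ws-001", "hit on payload download site")
log.report("ws-002", "IDS alert: TCP 445 scanning")
print(moved)
```

Detections that arrive while a host is already contained are still appended, which gives you the per-workstation history to review at clean-up time.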
Steven C.

171 Posts
@ Steven. Thanks!

I'm pretty sure there are commercial solutions out there (you know, open the box, install, good to go type) that can do this kind of stuff... NAC solutions, IPS perhaps. Some networking logic that says "If I detect this kind of traffic then, *Kick*, off you go to the VLAN of death." It wouldn't take much to put some remediation solutions on that VLAN... perhaps a connection to the internet, firewalled only for Windows Updates and AV vendor DAT update URLs. A local patch repository. A local Rapid7 or Nessus scanner perhaps.

In any case, this stuff should be pretty trivial to put together if you have the right people at your disposal. It just might take some time, but would pay off in the end. I don't know about you, but I'd love to respond to such an incident by saying "We contained all systems on VLANxx, separated them from other core internal systems, remediated, and tossed them back" rather than "Umm, we just addressed them as we became aware of the issue".
Anonymous
Containment

1) Update Senior Management of the issue and risk associated – if possible block news site while it is still infected
2) Block all 30 sites at the firewall for both internal and external communication
3) Verify that Server systems have AV software updates via AV Console

Eradication

4) Push, via AV console, the latest definition files to all clients
5) Use MSCCM to find all machines without patch MS08-067
6) Using MSCCM to push patch MS08-067 to all machines without it
7) Scan firewall logs for blocked external communication from internal machines (these are still infected)
8) If present and available, work with IDS admin to determine which machines are port scanning (i.e. searching for open port 445)
9) Repeat scanning of logs and IDS until no more communication is found
10) Reimage systems if risk requires it (based on malware analysis and results of above)

Recovery

10) Force full AV scan for all machines via AV console
11) Scan firewall logs for blocked external communication to 30 known sites
12) Reenable news website once it is free from malware
12) Revisit Plan for Windows update patching
13) Revisit Plan on AV update schedule
Anonymous
@ gduquette, good stuff! also,
12) Revisit Plan for Windows update patching
13) Revisit Plan on AV update schedule

To me these sound like something to discuss in the "Lessons learned" phase, or at least an action coming out of that phase....to which I ask everyone to keep this going, What are some lessons learned to address before the P in PICERL kicks in?
Anonymous
Thanks. It appears I was getting ahead of myself. I am also not very experienced, but learning. Here is my latest update (with additions from comments above). Please help steer me in the right direction:

Containment

1) Update Senior Management of the issue and risk associated – update and get approval on plan below.
2) Block news site at the firewall while it is still infected
3) Place vulnerable servers into server quarantine during remediation
4) Block all 30 sites at the firewall for both internal and external communication (all protocols)
5) Verify that Server systems have AV software updates via AV Console
6) Move infected machines into endpoint quarantine VLAN (this may be an iterative process)

Eradication

7) Push, via AV console, the latest definition files to all clients (including those in quarantine)
8) Use MSCCM to find all machines without patch MS08-067
9) Using MSCCM to push patch MS08-067 to all machines without it
10) Scan firewall logs for blocked external communication from internal machines (these are still infected)
11) If present and available, work with IDS admin to determine which machines are port scanning (i.e. searching for open port 445)
12) Repeat scanning of firewall and IDS logs until no more malware communication is found
13) Review quarantined servers for malware/security risks
14) Rebuild servers based on malware analysis and risk analysis
15) Review quarantined end point systems for malware/security risks
16) Reimage end point systems based on malware analysis and risk analysis
17) Provide Update to Senior Management

Recovery

18) Force full AV scan for all machines
19) Scan firewall and IDS logs for malware communication (verification)
20) Move servers back into operation
21) Move infected end point systems back into operational VLAN(s)
22) Unblock blocked news site (once malware is removed)
23) Provide Update to Senior Management

Lessons Learned

24) Revisit Plan for scheduled Windows patching
25) Revisit Plan for scheduled AV updates
26) Revisit security exceptions (i.e. servers without patches, etc for remediation – verify they are still exceptions)
27) Update Security Plan based on lessons learned
28) Provide Update to Senior Management
Anonymous
