Detection of Trojan control channels

Published: 2008-11-16
Last Updated: 2008-11-16 09:22:41 UTC
by Maarten Van Horenbeeck (Version: 1)

Recently I was working with an organization whose network had been deeply compromised by a persistent threat agent: they had very little remaining trust in the network. A full rebuild of the network was not financially feasible for this organization, as it would have meant losing much of the unique intellectual property the organization had to offer, a scenario that was truly not acceptable.

Given that a “nuke from high orbit” was not feasible, we worked on several techniques to identify those hosts which had been compromised. Note that we did not want to identify internal data being trafficked out per se: while Data Loss Prevention solutions have greatly improved over the last few years, there are hundreds of ways to smuggle a binary piece of data out in a difficult-to-detect form. Our goal was to detect behavior indicating an active Trojan on a system.

  • Initially we worked on increasing situational awareness. While in our case this did include costly measures such as implementing intrusion detection systems, situational awareness can also be significantly improved by small configuration changes, such as configuring BIND to log all DNS queries, storing netflows and extending firewalls to log accepted connections;

  • In order to detect variants of existing, known Trojans, we deployed an IDS on the perimeter and installed the virus rules from EmergingThreats. Matt Jonkman’s team regularly publishes updated signatures for known Command and Control channels. If setting up such a system sounds like a bit of work, have a look at BotHunter;

  • We started sniffing all DNS requests from hosts on the internal network, and then applied several heuristics on the resulting DNS data:
    • DNS responses which had a low to very low TTL (time to live) value, which is somewhat unusual;
    • DNS responses which contained a domain that belonged to one of a long list of dynamic DNS providers;
    • DNS queries which were issued more frequently by the client than would be expected given the TTL for that hostname;
    • DNS requests for a hostname outside of the local namespace which were answered with a resource record pointing to an IP address within either RFC 1918 IP space or anywhere inside the public or private IP space of the organization;
    • Consecutive DNS responses for a single unique hostname which contained only a single resource record, but which changed more than twice every 24 hours.
  • Anomaly detection of network traffic can be a very powerful tool in detecting command & control channels. Unfortunately, to be most effective the baselining (defining what is “good” about the network) should take place before the first compromise. However, some forms of anomaly detection still add tremendous value:
    • We wrote a quick set of signatures to ensure that each TCP session on ports 80 and 443 consisted of valid HTTP or SSL traffic, respectively. You can also do this using a tool such as FlowGrep, or by reviewing your proxy logs for failures. This would be a useful exercise in general for all traffic that is not relayed through an application proxy and is not blocked from direct access to internet resources.

    • Persistent connections to HTTP servers on the internet, even outside regular office hours, can be normal: just think of software update mechanisms. However, they should be exceptions, not the rule, so these valid exceptions can be filtered out, making this a potent mechanism to identify compromises. Is the attacker operating from the same time zone as your organization?
    • Persistent requests for the same file on a remote web server, each with a different parameter value, can indicate data smuggling over HTTP.
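
The DNS heuristics listed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `DnsAnswer` record layout, the 300 second TTL threshold, the placeholder dynamic DNS suffixes, the `corp.example` local namespace and the internal network ranges are all made up for the example, and the input is assumed to be DNS responses that have already been parsed from a capture or a query log.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

# All values below are illustrative assumptions; tune them to your network.
LOW_TTL_SECONDS = 300
DYNDNS_SUFFIXES = ("dyndns.example", "ddns.example")  # placeholder provider list
LOCAL_NAMESPACE = "corp.example"                      # hypothetical internal zone
INTERNAL_NETS = [ip_network(n) for n in
                 ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]  # RFC 1918
MAX_CHANGES_PER_DAY = 2
DAY_SECONDS = 86400


@dataclass
class DnsAnswer:
    timestamp: float   # seconds since the epoch when the response was seen
    client: str        # internal host that issued the query
    hostname: str      # name that was queried
    ttl: int           # TTL of the answer, in seconds
    addresses: tuple   # IP addresses in the answer, as strings


def in_namespace(hostname, domain):
    """True if hostname equals domain or is a subdomain of it."""
    hostname = hostname.lower().rstrip(".")
    return hostname == domain or hostname.endswith("." + domain)


def flag_answer(answer):
    """Return the heuristics that a single response trips."""
    flags = []
    if answer.ttl <= LOW_TTL_SECONDS:
        flags.append("low-ttl")
    if any(in_namespace(answer.hostname, s) for s in DYNDNS_SUFFIXES):
        flags.append("dynamic-dns")
    if not in_namespace(answer.hostname, LOCAL_NAMESPACE):
        for addr in answer.addresses:
            if any(ip_address(addr) in net for net in INTERNAL_NETS):
                flags.append("external-name-internal-ip")
                break
    return flags


def flag_history(answers):
    """Return heuristics that need several responses for ONE hostname."""
    flags = set()
    answers = sorted(answers, key=lambda a: a.timestamp)

    # Same client asking again before the previously seen TTL has expired.
    last_seen = {}
    for a in answers:
        prev = last_seen.get(a.client)
        if prev is not None and a.timestamp - prev.timestamp < prev.ttl:
            flags.add("requery-before-ttl")
        last_seen[a.client] = a

    # Single-record answers whose address changes more than twice in 24 hours.
    single = [a for a in answers if len(a.addresses) == 1]
    for i, start in enumerate(single):
        changes = 0
        prev_addr = start.addresses[0]
        for later in single[i + 1:]:
            if later.timestamp - start.timestamp > DAY_SECONDS:
                break
            if later.addresses[0] != prev_addr:
                changes += 1
                prev_addr = later.addresses[0]
        if changes > MAX_CHANGES_PER_DAY:
            flags.add("fast-changing-single-record")
            break
    return flags
```

`flag_answer` would be called on every response as it is seen, while `flag_history` expects the responses already grouped per hostname. Any flag is a reason to look closer, not proof of compromise: content delivery networks, for example, routinely use low TTLs.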

We also took some action on the host-based front. A shortlist was created of antivirus vendors that were successful on so-called “proactive detection tests” (such as the AV-Comparatives one), where month-old signature sets are tested against today’s malware. We licensed the software appropriately and created a live CD that ran each solution sequentially across all local hard drives. This CD was distributed to the offices and run on a large sample of systems over a weekend.

Upon completing the scan, the CD logged into a central FTP server and uploaded all suspicious binaries to it. Each of the samples was afterwards analyzed in depth, and if found malicious, detection logic was created and deployed onto the various network-based detection mechanisms.

On a set of critical systems, we deployed a logon policy which ran Sysinternals’ RootkitRevealer and stored its report on a remote network share. Once these reports were verified and we had some assurance that the file system API was not hooked to hide specific files, we ran a copy of Mandiant’s Red Curtain on the system to identify suspicious binaries. These were once again hooked into the analysis process above.

Regardless of whether you go for a pure-play network or host-based approach, or a combination, the investigative approach should be to identify that which is unusual, validate whether it is a manifestation of a threat, and reapply what is learned to our detection probes, or identify additional monitoring that would add value. The next step is to improve our understanding of the threat agent and how it interfaces with our network. One way to get there is nodal link analysis, an analytical technique which we'll cover in a future diary entry.

If you have other ideas on how to approach this problem, do get in touch!

Maarten Van Horenbeeck

2 comment(s)


They went to this effort after being "deeply compromised by a persistent threat agent". How well did this effort restore their trust?
Hey Dick, thanks for your comment.

You make a very good point.

The real question though is, after the initial installation, how certain are you your network was never compromised? In today's IT environment, incident response moves from incident to incident without ever having the assurance of being in a "clean" state.

You can look at a "computer system" as compromised and requiring a reinstall (a nuke from high orbit). Where do you delineate the difference between an individual system and a network? Where does the cost of rebuilding become too high to bear?

Once a network is compromised, even after a rebuild, the trust factor of executive leadership in IT always decreases, and integrating new critical business processes in IT becomes more difficult.

Thanks again,
