Converting PCAP Web Traffic to Apache Log
PCAP data can be really useful when you must investigate an incident, but when the PCAP files to analyze add up to gigabytes, they quickly become tricky to handle. Often, the first protocol to be analyzed is HTTP because it remains a classic infection or communication vector used by malware. What if you could analyze HTTP connections like an Apache access log? This kind of log can be easily indexed and processed by many tools.
Haka[1] isn’t a new tool (the first version was released in 2013) but it remains below the radar for many people. Haka is defined as "an open source security-oriented language which allows to describe protocols and apply security policies on (live) captured traffic". Based on the Lua[2] programming language, it is extremely powerful for extracting information from network flows, but also for altering them on the fly (playing a man-in-the-middle role).
I had to analyze a lot of HTTP requests from big PCAP files and I decided to automate this boring task. I found an article[3] on the Haka blog that explained how to generate an Apache access log from a PCAP file. Unfortunately, it no longer worked, probably due to the evolution of the language. So, I jumped into the code to fix it (with some Google support of course).
Let’s start a docker container based on Ubuntu and install the latest Haka package:
$ docker run -it --name haka --hostname haka ubuntu
root@haka:~# apt-get update && apt-get upgrade
root@haka:~# apt-get install curl libpcap0.8 # libpcap is required by Haka!
root@haka:~# curl -LO http://github.com/haka-security/haka/releases/download/v0.3.0/haka_0.3.0_amd64.deb
root@haka:~# dpkg -i haka_0.3.0_amd64.deb
root@haka:~# hakapcap -h
Usage: hakapcap [options] <config> <pcapfile>
Options:
    -h,--help: Display this information
    --version: Display version information
    -d,--debug: Display debug output
    -l,--loglevel <level>: Set the log level (debug, info, warning, error or fatal)
    -a,--alert-to <file>: Redirect alerts to given file
    --debug-lua: Activate lua debugging
    --dump-dissector-graph: Dump dissector internals (grammar and state machine) in file <name>.dot
    --no-pass-through, --pass-through: Select pass-through mode (default: true)
    -o <output>: Save result in a pcap file
Ready!
Basically, Haka works with hooks that are called when a condition is matched. In our example, we collect traffic from interesting ports:
http.install_tcp_rule(80)
http.install_tcp_rule(3128)
http.install_tcp_rule(8080)
Then we create a hook that is triggered for every HTTP response detected in the PCAP file:
haka.rule{
    hook = http.events.response,
    eval = function (http, response)
        -- ... your code here ...
    end
}
The hook extracts information from the HTTP response to build an Apache log entry:
<clientip> - - [<date>] "<request> HTTP/<version>" <response> <size> "<referer>" "<useragent>"
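To make the layout concrete, here is a minimal sketch in Python (not Haka's Lua, so that it runs without Haka installed) that assembles one such line from its individual fields. The function name apache_log_entry and its parameters are hypothetical and only mirror the placeholders above; this is not the code of the Haka script.

```python
from datetime import datetime, timezone


def apache_log_entry(client_ip, when, method, uri, version, status,
                     size=None, referer=None, user_agent=None):
    """Format one line following the layout shown above.

    Hypothetical helper: missing values are written as "-", which is
    the usual Apache convention for an empty field.
    """
    # Apache timestamps look like 05/Jun/2018:18:34:21 +0000
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    size_field = "-" if size is None else str(size)
    return '{ip} - - [{ts}] "{m} {u} HTTP/{v}" {st} {sz} "{ref}" "{ua}"'.format(
        ip=client_ip, ts=timestamp, m=method, u=uri, v=version,
        st=status, sz=size_field,
        ref=referer or "-", ua=user_agent or "-",
    )


if __name__ == "__main__":
    when = datetime(2018, 6, 5, 18, 34, 21, tzinfo=timezone.utc)
    print(apache_log_entry("192.168.254.66", when, "GET", "/", "1.1", 200, 0,
                           user_agent="check_http/v1.4.16 (nagios-plugins 1.4.16)"))
```

The timestamp must be timezone-aware, otherwise %z expands to an empty string and the offset is missing from the entry.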
Let’s try it with a PCAP file generated on a network:
$ docker cp test.pcap haka:/tmp
$ docker exec -it haka bash
root@haka:~# hakapcap http-dissector.lua /tmp/test.pcap | grep "GET /"
192.168.254.222 - - [05/Jun/2018:18:34:13 +0000] "GET /connecttest.txt HTTP/1.1" 200 10 "-" "Microsoft NCSI"
192.168.254.215 - - [05/Jun/2018:18:34:14 +0000] "GET /session/...HTTP/1.1" 200 10 "-" "AppleCoreMedia/1.0.0.15E216 (iPad; U; CPU OS 11_3 like Mac OS X; en_us)"
192.168.254.215 - - [05/Jun/2018:18:34:19 +0000] "GET /session/...m3u8 HTTP/1.1" 200 10 "-" "AppleCoreMedia/1.0.0.15E216 (iPad; U; CPU OS 11_3 like Mac OS X; en_us)"
192.168.254.66 - - [05/Jun/2018:18:34:21 +0000] "GET / HTTP/1.1" 200 0 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"
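Once the traffic is in this format, any log tooling can take over. As a minimal sketch, the Python snippet below parses such lines and counts requests per client. The regular expression is my own approximation of the layout shown above, and the names parse_line and requests_per_client are hypothetical.

```python
import re
from collections import Counter

# Approximation of the access log layout produced above (an assumption,
# not an official grammar): client, two ignored fields, date, request,
# status, size, referer and user agent.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def requests_per_client(lines):
    """Count how many log entries each client IP address produced."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[entry["client"]] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Example: hakapcap http-dissector.lua /tmp/test.pcap | python3 this_script.py
    for client, total in requests_per_client(sys.stdin).most_common():
        print(client, total)
```

Lines that do not match the pattern (for example Haka's own log messages) are silently skipped, which is why no grep is needed in front of it.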
For now, the script returns a size of ‘10’. It is hardcoded, like the identity and username fields (which default to "- -"). I’m still looking for a way to get the number of bytes per HTTP transaction. Also, you get only the client IP address and not the destination one. If you have ideas for improvement, let me know!
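One possible direction for the size field, sketched here in Python rather than in Haka's Lua: take the value of the Content-Length response header when it is present, and fall back to "-" otherwise. This is only an idea, not what the script does. It assumes the response headers are available as a name/value mapping, and the function name response_size is hypothetical. Responses without the header (chunked transfers, for example) would still show no size.

```python
def response_size(headers):
    """Return the size field for a log entry from a mapping of response headers.

    Assumption: 'headers' maps header names to string values. The
    Content-Length value is used when it is a plain decimal number;
    otherwise "-" is returned, the Apache convention for "unknown".
    """
    for name, value in headers.items():
        # Header names are case-insensitive in HTTP.
        if name.lower() == "content-length":
            value = value.strip()
            if value.isascii() and value.isdigit():
                return value
            return "-"
    return "-"


if __name__ == "__main__":
    print(response_size({"Content-Type": "text/plain", "Content-Length": "512"}))
    print(response_size({"Transfer-Encoding": "chunked"}))
```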
My script, compatible with Haka 0.3.0, is available on GitHub[4].
[1] http://www.haka-security.org/
[2] https://www.lua.org/
[3] http://www.haka-security.org/blog/2014/03/18/transform-a-pcap-to-an-apache-log-file.html
[4] https://github.com/xme/toolbox/blob/master/haka_http_log.lua
Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant
Comments
http://justniffer.sourceforge.net/
Cheers, Gebhard
Anonymous
Jun 6th 2018
Many ways to achieve the same result, that's why I like open source software!
Anonymous
Jun 6th 2018
Anonymous
Jun 30th 2018
tcpdump may be able to get you the packets isolated if you filter by port and maybe IP address. But it will not reassemble sessions. Newer versions of tcpdump will actually recognize some of the HTTP payloads, but I don't think you can write a filter for them.
Anonymous
Jun 30th 2018
With tcpdump -v -n | grep -o "[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\+\s>\s[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\+" | [uniq|sort -u]
uniq or sort -u, depending on how much and how you want to look at the conversations.
You could even sed away the ports to keep just the IPs for a cursory glance.
This would give you all the conversations in a scriptable way. From there you could write a script with gawk or simple bash to get tcpdump to output the data payloads with the proper BPF filter, and concatenate them. I would first decrypt with the private key before doing this...
I do everything in bash; it helps me as a solo admin with my many projects and all my many, many verbose logs xD
I'm by far not perfect or all-knowing, but I have never found more useful CLI tools than tcpdump and ngrep.
Anonymous
Jun 30th 2018
In other words, respect for all that you do; it's just more fun and enlightening if I sift through the muck myself, as if all I had was an OpenWrt or a Pi shell...
XD
Anonymous
Jun 30th 2018