Moloch: Capturing and Indexing Network Traffic in Realtime


What is moloch?

As its own website says: “Moloch is an open source, large scale IPv4 packet capturing (PCAP), indexing and database system. A simple web interface is provided for PCAP browsing, searching, and exporting. APIs are exposed that allow PCAP data and JSON-formatted session data to be downloaded directly.” It is very useful as a network forensics tool for analyzing captured traffic (Moloch can also index previously captured pcap files, as we will see) after a security incident or when investigating suspicious behaviour, such as an alert from our IDS.

Thanks to indexing pcaps with elasticsearch, Moloch lets us run almost real-time searches across dozens or hundreds of GB of captured network traffic and apply several filters along the way. Its filtering isn’t as complete as Wireshark’s, for example, but it saves us tons of work on filtering and visualization, and it offers some features Wireshark lacks, like filtering by country or AS.

I’m sure I’m not the only one who would have loved to have Moloch at hand when analyzing dozens of GB with tshark and Wireshark, particularly each time you apply a filter to show some kind of data…

Installing moloch

To deploy Moloch in an “all-in-one” setup I created a virtual machine with Ubuntu Server 12.10 64-bit and assigned it about 100 GB of disk, 16 GB of RAM and 4 CPU cores. Moloch is a resource-hungry platform; for more detailed information, see the hardware requirements.

The first step is updating the box, installing Java and cloning the GitHub repository:

Updating system and cloning repo
# apt-get update && apt-get upgrade -y && apt-get install git openjdk-7-jdk openjdk-7-jre -y

# git clone https://github.com/aol/moloch.git

Once the repo is cloned we must install at least one of its components: capture, viewer or elasticsearch. Because we are just going to play around with Moloch to get an overview of its functionality and capabilities, we will take the shortest path and use the provided bash script to set up everything on the same machine. If you prefer to install it manually or are going to build a distributed cluster, check “Building and Installing”:

Installing moloch automatically
~/moloch# ./easybutton-singlehost.sh

Now the wizard will ask us a few questions to configure Moloch (capture, viewer and elasticsearch instance) and everything will be running in a few moments (Moloch is installed by default at “/data/moloch/”). We can then access the web interface at “https://MOLOCH_IP_ADDRESS:8005”:

As can be seen, Moloch has already started indexing all traffic seen on eth0, including every request to the Moloch web interface. If we don’t want this, we have to specify a capture filter in Berkeley Packet Filter (BPF) format in “/data/moloch/etc/config.ini”:

Don’t index ANY traffic related to the moloch box
bpf=not host 192.168.1.39
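
If you need to exclude more than one host or port, the filter string can be generated instead of typed by hand. This is only a minimal sketch: `build_exclusion_filter` is a hypothetical helper (not part of Moloch) that joins plain "not host" and "not port" primitives with "and", and the ports in the usage example are the viewer and elasticsearch ports used in this post.

```python
def build_exclusion_filter(hosts=(), ports=()):
    """Return a BPF expression that excludes the given hosts and ports.

    Hypothetical helper: it only joins standard pcap-filter primitives
    ("not host X", "not port N") with "and".
    """
    terms = ["not host %s" % host for host in hosts]
    terms += ["not port %d" % port for port in ports]
    return " and ".join(terms)


if __name__ == "__main__":
    # Exclude the Moloch box itself (address taken from the example above)
    print("bpf=" + build_exclusion_filter(hosts=["192.168.1.39"]))
    # Exclude only the viewer and elasticsearch ports
    print("bpf=" + build_exclusion_filter(ports=[8005, 9200]))
```

The printed lines can be pasted into config.ini as they are.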

To change the elasticsearch configuration and allow access from IP addresses other than the Moloch host itself (this could pose a security risk; SSH tunneling would be a better approach), go to “/data/moloch/etc/elasticsearch.yml” and edit the network parameters (network.host). To view or change the Moloch configuration, take a look at “/data/moloch/etc/config.ini”:

Changing the bound IP address
# Set the bind address specifically (IPv4 or IPv6):
#
network.bind_host: 0.0.0.0

# Set the address other nodes will use to communicate with this node. If not
# set, it is automatically derived. It must point to an actual IP address.
#
network.publish_host: 0.0.0.0

# Set both 'bind_host' and 'publish_host':
#
network.host: 0.0.0.0

We need to shut down the elasticsearch node and start it again, so here we go:

Restarting elasticsearch
# curl -XPOST 'http://localhost:9200/_shutdown'

# nohup /data/moloch/bin/run_es.sh &

We can also start the viewer and the capturer from the same directory with “/data/moloch/bin/run_viewer.sh” and “/data/moloch/bin/run_capture.sh” respectively.
Now we have access to the elasticsearch-head plugin, which lets us see the elasticsearch cluster health and manage the cluster, at “http://MOLOCH_IP_ADDRESS:9200/_plugin/head/”:
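
The cluster health can also be checked from a script through elasticsearch's `/_cluster/health` endpoint, which is handy right after a restart. The following is a sketch, assuming the node listens on `http://localhost:9200` as in the restart command above. The function names are mine and the sample document contains made-up values.

```python
import json
import urllib.request


def summarize_health(health):
    """Format the main fields of a cluster health document as one line."""
    return "cluster %s is %s (%d nodes, %d unassigned shards)" % (
        health["cluster_name"],
        health["status"],
        health["number_of_nodes"],
        health["unassigned_shards"],
    )


def fetch_health(base_url="http://localhost:9200"):
    """Fetch /_cluster/health from the node at base_url (assumed address)."""
    url = base_url + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Made-up sample shaped like a health response, so no node is needed
    sample = {
        "cluster_name": "demo",
        "status": "yellow",
        "number_of_nodes": 1,
        "unassigned_shards": 5,
    }
    print(summarize_health(sample))
```

Against a live node, `summarize_health(fetch_health())` would report the real state. A single-node setup often stays "yellow" because replica shards have nowhere to be allocated.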

Moloch overview

To get some data indexed by Moloch within a few minutes we are going to run some light random nmap scans, keeping in mind which interface is assigned to the virtual machine. If you want to use a virtual interface and launch the nmap scan from the Moloch box itself, you may need to change the BPF filter to “bpf=not port (9200 or 8005)” (this is far from the correct way, but it is enough for a quick test).

Quick nmap scan to index some HTTP headers
# ./nmap -sS -Pn -n -v -p80 -iR 10000 --script=http-headers

If we take another look at the Moloch web interface we will now see some pretty info:

We can see more info about any session by clicking on the “green plus” icon:

A new dropdown will appear with some interesting options, like downloading the pcap (for example, for a deeper manual analysis with Wireshark), downloading the data in RAW format, and a set of links for filtering.

Let’s click on the “User-Agent” link and then run a search to show only those indexed packets using the NSE user-agent. Now you know who has scanned your network with nmap’s HTTP scripts in just a second ;).

Moloch also has a useful “stats” menu with real-time statistics about the traffic being captured and indexed:

Indexing previously captured traffic

To index traffic previously captured in pcap format we have to use “moloch-capture”, located at “/data/moloch/bin/moloch-capture”:

moloch-capture options
# ./moloch-capture -h
Usage:
  moloch-capture [OPTION...] - capture

Help Options:
  -h, --help         Show help options

Application Options:
  -c, --config       Config file name, default '/data/moloch/etc/config.ini'
  -r, --pcapfile     Offline pcap file
  -R, --pcapdir      Offline pcap directory, all *.pcap files will be processed
  --recursive        When in offline pcap directory mode, recurse sub directories
  -n, --node         Our node name, defaults to hostname.  Multiple nodes can run on same host.
  -t, --tag          Extra tag to add to all packets, can be used multiple times
  -v, --version      Show version number
  -d, --debug        Turn on all debugging
  --copy             When in offline mode copy the pcap files into the pcapDir from the config file
  --dryrun           dry run, noting written to database

I’m going to index a sample of about 7.5 GB from a DNS amplification DDoS attack I had to analyze and help mitigate some months ago. If you just want to quickly download some pcaps to play around with, Netresec has published a good list:

Indexing pcaps from a dir
# ./moloch-capture -R /tmp/ddos_pcaps/ --tag ddos --copy
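
When there are several capture directories to index, the same call can be scripted. This is a rough sketch that uses only the options listed in the help output above (-R, --tag, --recursive, --copy, --dryrun). The function names are hypothetical and the binary path is the default install location mentioned in this post.

```python
import subprocess

# Default install location used throughout this post
MOLOCH_CAPTURE = "/data/moloch/bin/moloch-capture"


def build_capture_command(pcap_dir, tags=(), copy=False,
                          recursive=False, dry_run=False):
    """Build the argument list for an offline moloch-capture run."""
    cmd = [MOLOCH_CAPTURE, "-R", pcap_dir]
    for tag in tags:
        # --tag can be given multiple times
        cmd += ["--tag", tag]
    if recursive:
        cmd.append("--recursive")
    if copy:
        cmd.append("--copy")
    if dry_run:
        cmd.append("--dryrun")
    return cmd


def run_capture(cmd):
    """Run the command and return its exit code (needs a Moloch install)."""
    return subprocess.call(cmd)


if __name__ == "__main__":
    # Only print the command here; call run_capture() on a real Moloch box
    print(" ".join(build_capture_command("/tmp/ddos_pcaps/",
                                         tags=["ddos"], copy=True)))
```

Passing an argument list to `subprocess.call` instead of a shell string avoids quoting problems with directory names that contain spaces.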

After a few minutes I had already indexed several million packets and could view them just by searching for the tag ddos (I have stripped out the map and some info so as not to disclose anything about the customer or the attack):

Let’s say we want to show every DNS datagram originating from port 53 on servers geolocated in Russia:

As can be seen, there were peaks of almost 60,000 packets per second (DNS answers), with an average of approximately 20,000, at regular intervals in this six-minute slot.
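
To make those two figures concrete, here is a small sketch of how a peak and an average rate can be derived from raw packet timestamps. It is not how Moloch builds its graph, only an illustration. The function names are mine and the timestamps in the example are invented.

```python
from collections import Counter


def packets_per_second(timestamps):
    """Count packets in each one-second bucket of the capture."""
    return Counter(int(ts) for ts in timestamps)


def peak_and_average(timestamps):
    """Return (peak pps, average pps over the seconds the capture spans)."""
    buckets = packets_per_second(timestamps)
    if not buckets:
        return 0, 0.0
    # Seconds without any packet still count towards the average
    span = max(buckets) - min(buckets) + 1
    return max(buckets.values()), sum(buckets.values()) / span


if __name__ == "__main__":
    # Invented timestamps (seconds): a burst, a lull, then a smaller burst
    sample = [0.1, 0.2, 0.9, 1.5, 3.0, 3.1]
    peak, average = peak_and_average(sample)
    print("peak: %d pps, average: %.2f pps" % (peak, average))
```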

Moloch gives us the chance to visualize indexed traffic from a graph theory point of view (“Connections” tab), using hosts as nodes and connections (with or without port) as edges:

This is really useful for getting an at-a-glance idea of the event being analyzed. In this case we can easily spot a few targets and thousands of hosts targeting them.
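
The same idea can be sketched in a few lines: treat each connection as a directed edge and rank destinations by how many distinct sources reach them. This is only an illustration of the graph view, not Moloch code. `top_targets` is a hypothetical helper and the addresses come from the documentation ranges.

```python
from collections import defaultdict


def top_targets(connections, limit=3):
    """Rank destination hosts by the number of distinct sources reaching them."""
    sources_by_target = defaultdict(set)
    for src, dst in connections:
        sources_by_target[dst].add(src)
    # Most sources first; ties are broken by address for a stable order
    ranked = sorted(sources_by_target.items(),
                    key=lambda item: (-len(item[1]), item[0]))
    return [(dst, len(srcs)) for dst, srcs in ranked[:limit]]


if __name__ == "__main__":
    # Made-up (source, destination) pairs
    sample = [
        ("198.51.100.1", "192.0.2.10"),
        ("198.51.100.2", "192.0.2.10"),
        ("198.51.100.3", "192.0.2.10"),
        ("198.51.100.1", "192.0.2.20"),
    ]
    for host, count in top_targets(sample):
        print("%s is reached by %d hosts" % (host, count))
```

In an attack like the one above, the few hosts at the top of this ranking would be the victims.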

Moloch API

At the beginning of this post I said that Moloch has an API to query and get information about indexed pcaps and so on in JSON format. At the moment, probably the best way to see which calls exist is to read the viewer code directly.

Here is an example of Python code that queries the Moloch API and shows some statistics:

Using moloch API to show some statistics
#!/usr/bin/env python3

import json
import urllib.request


MOLOCH_URL = 'https://192.168.1.39:8005'
MOLOCH_USER = 'admin'
MOLOCH_PASSWORD = 'admin'
MOLOCH_REALM = 'Moloch'


if __name__ == '__main__':

    # Set up HTTP digest authentication
    auth_handler = urllib.request.HTTPDigestAuthHandler()
    auth_handler.add_password(MOLOCH_REALM, MOLOCH_URL, MOLOCH_USER, MOLOCH_PASSWORD)
    # If the viewer uses a self-signed certificate, an HTTPSHandler with a
    # suitable ssl context also has to be passed to build_opener()
    opener = urllib.request.build_opener(auth_handler)

    response = opener.open('%s/esstats.json' % MOLOCH_URL)
    if response.status == 200:
        # Read the response body and parse it as JSON
        json_data = json.loads(response.read().decode('utf-8'))

        # Extract info about the first node
        node = json_data['aaData'][0]
        node_name = node['name']
        documents_num = node['docs']
        searches_num = node['searches']
        searches_time_total = node['searchesTime']  # milliseconds
        store_size_bytes = node['storeSize']  # bytes

        # Convert units; avoid dividing by zero if no query has been served yet
        store_size_mb = store_size_bytes // (1024 * 1024)
        if searches_num:
            searches_time_average_seconds = searches_time_total / searches_num / 1000.0
        else:
            searches_time_average_seconds = 0.0

        # Show it
        print('[*] Some statistics about elasticsearch at node "%s"' % node_name)
        print('   [+] There are %i indexed documents within %i MB of index'
              % (documents_num, store_size_mb))
        print('   [+] This elasticsearch node has served up %i queries with an average '
              'of %f seconds per query' % (searches_num, searches_time_average_seconds))
        print('[-]')

This simple code will show something similar to this:

Output for moloch_api_example.py
$ python moloch_api_example.py
[*] Some statistics about elasticsearch at node "molocha"
   [+] There are 1624416 indexed documents within 963 MB of index
   [+] This elasticsearch node has served up 1042 queries with an average of 0.012000 seconds per query
[-]

That is all for now. I hope you liked this and find it useful. I think Moloch is a really powerful tool that will become a must-have in network forensics and will save us countless hours when dealing with large amounts of network traffic.

See you soon!
