Bro messaging

Debugging Bro message queues

To enable debugging Bro message queues, insert the following into the local.bro script in /usr/share/bro/sfa/policy directory. This is also useful for troubleshooting script-land memory leakage.

@load misc/profiling
redef Log::enable_local_logging = T;
redef profiling_interval = 10secs;
redef expensive_profiling_multiple = 2;

Bro UNIX server created a prof.log file. As a UNIX server, prof.log are found in each of the /var/spool/bro/bro-process-type directory. Running Bro as standalone puts prof.log in the current working directory.

Example output of prof.log:

    0.000000 ------------------------
    0.000000 Command line: /usr/bin/bro -C -d -B threading -t trace.log -e redef
    profiling_interval=30secs;
    redef expensive_profiling_multiple=1 ; redef Log::enable_local_logging=T; redef
    Reporter::info_to_stderr =
    T; redef Reporter::errors_to_stderr = T; redef Reporter::warnings_to_stderr = T;
    -r ftp.pcap
    sfa/policy/local.bro
    0.000000 ------------------------
    0.000000 Memory: total=30160K total_adj=0K malloced: 23126K
    0.000000 Run-time: user+sys=0.0 user=0.0 sys=0.0 real=0.0
    0.000000 Conns: total=0 current=0/0 ext=0 mem=0K avg=-nan table=1K connvals=0K
    0.000000 Conns: tcp=0/0 udp=0/0 icmp=0/0
    0.000000 TCP-States:        Inact.  Syn.    SA      Part.   Est.    Fin.    Rst.
    0.000000 TCP-States:Inact.
    0.000000 TCP-States:Syn.
    0.000000 TCP-States:SA
    0.000000 TCP-States:Part.
    0.000000 TCP-States:Est.
    0.000000 TCP-States:Fin.
    0.000000 TCP-States:Rst.
    0.000000 Connections expired due to inactivity: 0
    0.000000 Total reassembler data: 0K
    0.000000 RuleMatcher: matchers=2 dfa_states=2 ncomputed=0 mem=2K
    avg_nfa_states=29
    0.000000 Timers: current=19 max=19 mem=1K lag=0.00s
    0.000000 DNS_Mgr: requests=0 succesful=0 failed=0 pending=0 cached_hosts=0
    cached_addrs=0
    0.000000 Triggers: total=0 pending=0
    0.000000
    0.000000
    0.000000
    0.000000 Threads: current=4
    0.000000   /var/spool/bro/bv_intel/bv_intel.txt/Input::READER_ASCII in=1 out=1
    pending=0/1 (#queue r/w:
    in=1/1 out=0/1)
    0.000000
    0.000000
    0.000000
    0.000000
    0.000000 Global_sizes > 100k: 0K
    ProfileTimer = 1
    ScheduleTimer = 2
    TableValTimer = 16
    packet_filter/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1
    out=0/0)
    packet_filter/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1
    out=0/0)
    loaded_scripts/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1
    out=0/0)
    reporter/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1 out=0/0)
    0.000000
    0.000000
    0.000000
    0.000000
    0.000000 Global_sizes total: 1173K
    0.000000 Total number of table entries: 1648/1691
    SSL::cipher_desc = 99K (351/351 entries)
    SSL::root_certs = 208K (147/147 entries)
    FTP::cmd_reply_code = 48K (325/325 entries)
    Weird::actions = 36K (162/162 entries)

Please note the (#queue r/w: in=1/1 out=0/1) by bv_intel.txt/WRITER_ASCII: this means one message was inserted into the input queue and one message was removed from the same input queue, while only one message was written to the output queue, but nobody has retrieved it yet.

The prof.log output continues on (note the timestamp):

    1413471640.952530 ------------------------
    1413471640.952530 Memory: total=30160K total_adj=0K malloced: 23127K
    1413471640.952530 Run-time: user+sys=0.0 user=0.0 sys=0.0 real=0.0
    1413471640.952530 Conns: total=0 current=0/0 ext=0 mem=0K avg=-nan table=1K
    connvals=0K
    1413471640.952530 Conns: tcp=0/0 udp=0/0 icmp=0/0
    1413471640.952530 TCP-States:        Inact.  Syn.    SA      Part.   Est.
    Fin.    Rst.
    1413471640.952530 TCP-States:Inact.
    1413471640.952530 TCP-States:Syn.
    1413471640.952530 TCP-States:SA
    1413471640.952530 TCP-States:Part.
    1413471640.952530 TCP-States:Est.
    1413471640.952530 TCP-States:Est.
    1413471640.952530 TCP-States:Fin.
    1413471640.952530 TCP-States:Rst.
    1413471640.952530 Connections expired due to inactivity: 0
    1413471640.952530 Total reassembler data: 0K
    1413471640.952530 RuleMatcher: matchers=2 dfa_states=2 ncomputed=0 mem=2K
    avg_nfa_states=29
    1413471640.952530 Timers: current=18 max=19 mem=1K lag=1413471639.95s
    1413471640.952530 DNS_Mgr: requests=0 succesful=0 failed=0 pending=0
    cached_hosts=0 cached_addrs=0
    1413471640.952530 Triggers: total=0 pending=0
    1413471640.952530         ScheduleTimer = 2
    1413471640.952530         TableValTimer = 16
    1413471640.952530 Threads: current=4
    1413471640.952530   /var/spool/bro/bv_intel/bv_intel.txt/Input::READER_ASCII
    in=1 out=1 pending=0/0 (#queue
    r/w: in=1/1 out=1/1)
    1413471640.952530
    1413471640.952530
    1413471640.952530
    1413471640.952530 Global_sizes > 100k: 0K
    packet_filter/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1
    out=0/0)
    loaded_scripts/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1
    out=0/0)
    reporter/Log::WRITER_ASCII in=1 out=0 pending=0/0 (#queue r/w: in=1/1 out=0/0)
    1413471640.952530
    1413471640.952530
    1413471640.952530
    1413471640.952530
    1413471640.952530 Global_sizes total: 1173K
    1413471640.952530 Total number of table entries: 1648/1691
    1413471671.003551 ------------------------
    1413471671.003551 Memory: total=30160K total_adj=0K malloced: 23329K
    1413471671.003551 Run-time: user+sys=0.1 user=0.0 sys=0.0 real=0.1
    1413471671.003551 Conns: total=14 current=9/9 ext=0 mem=48488K avg=5387.6
    table=50K connvals=35K
    1413471671.003551 Conns: tcp=7/8 udp=2/6 icmp=0/0
    1413471671.003551 TCP-States:        Inact.  Syn.    SA      Part.   Est.
    Fin.    Rst.
    1413471671.003551 TCP-States:Inact.
    1413471671.003551 TCP-States:Syn.
    1413471671.003551 TCP-States:SA
    SSL::cipher_desc = 99K (351/351 entries)
    SSL::root_certs = 208K (147/147 entries)
    FTP::cmd_reply_code = 48K (325/325 entries)
    Weird::actions = 36K (162/162 entries)
    1413471671.003551 TCP-States:Part.
    1413471671.003551 TCP-States:Est.
    1413471671.003551 TCP-States:Fin.
    1413471671.003551 TCP-States:Rst.
    1413471671.003551 Connections expired due to inactivity: 0
    1413471671.003551 Total reassembler data: 0K
    1413471671.003551 RuleMatcher: matchers=2 dfa_states=40 ncomputed=150 mem=62K
    avg_nfa_states=1
    1413471671.003551 Timers: current=36 max=47 mem=2K lag=0.05s
    1413471671.003551 DNS_Mgr: requests=0 succesful=0 failed=0 pending=0
    cached_hosts=0 cached_addrs=0
    1413471671.003551 Triggers: total=0 pending=0
    1413471671.003551         ConnectionInactivityTimer = 11
    1413471671.003551         FileAnalysisInactivityTimer = 1
    ...

Analyzing prof.log

Intel::data_store size should remain unchanged during leakage/polling periods. (Note: It is proportional to the size of bv_intel.txt file.)

Global_sizes > 100k: 0K
               Communication::connected_peers = 2556K (18/18 entries)
               [FTP::cmd_reply_code](FTP::cmd_reply_code) = 48K (325/325 entries)
               Cluster::worker2manager_events = 155K
               Weird::actions = 37K (162/162 entries)
               Intel::min_data_store = 5386K
               Intel::data_store = 33904K
               SSL::cipher_desc = 101K (351/351 entries)
               Communication::nodes = 2588K (20/20 entries)
               SSL::root_certs = 209K (147/147 entries)
Global_sizes total: 45812K

Also Global_sizes total should remain unchanged during all statistical collection periods after initialization.

READER_ASCII would not leak if it had a perfect one-to-one read/write with input portion of Intel reader queue.

/var/spool/bro/bv_intel/bv_intel.txt/Input::READER_ASCII in=151 out=76527
pending=0/0 (\#queue r/w: in=151/151 out=76527/76527)

Total memory should be closely paid attention to for Bro Core Engine
leakage as it did in this 20-second example:

    Memory: total=32608K total_adj=0K malloced: 32478K
    ...
    Memory: total=123236K total_adj=90628K malloced: 78143K
    ...
    Memory: total=143088K total_adj=110480K malloced: 87989K
    ...
    Memory: total=164320K total_adj=131712K malloced: 108040K
    ...
    Memory: total=177040K total_adj=144432K malloced: 132104K
    ...
    Memory: total=177040K total_adj=144432K malloced: 133631K
    ...
    Memory: total=177040K total_adj=144432K malloced: 137221K
    ...
    Memory: total=177040K total_adj=144432K malloced: 139141K

Checkpointing of memory structures between Bro workers were less than 1K every 20 seconds as denoted by this prof.log output:

    Conns: total=78555 current=1622/1622 ext=0 mem=0K avg=0.0 table=0K connvals=0K
    ...
    Conns: total=78555 current=1622/1622 ext=0 mem=0K avg=0.0 table=1K connvals=0K
    ...
    Conns: total=78555 current=1622/1622 ext=0 mem=0K avg=0.0 table=0K connvals=0K
    ...