Introduction
Network traffic baselining is a cornerstone of proactive threat detection in Security Operations Centers (SOCs). Precise baselining enables detection of subtle deviations indicative of advanced persistent threats (APTs), lateral movement, and data exfiltration. Zeek (formerly Bro) is a high-fidelity network security monitoring framework capable of deep protocol parsing and extensive contextual logging—ideal for creating actionable baselines.
This guide details an advanced deployment of Zeek in a lab or production environment, ingestion and parsing of Zeek logs for baseline extraction, anomaly simulation, and crafting SIEM correlation rules optimized for SOC workflows.
Step 1: Zeek Deployment and Configuration
1.1 Environment Preparation
- Use a dedicated Ubuntu 22.04 LTS server or VM with at least 4 vCPUs and 8 GB RAM for real-time processing.
- The capture interface must be in promiscuous mode and attached to a mirror/SPAN port or network tap so that Zeek sees full packet payloads.
- Confirm that the kernel socket buffer limits (net.core.rmem_max and net.core.wmem_max) are tuned for high-throughput capture:
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
1.2 Installation
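Install the latest stable release from the official Zeek binary package repository, which is hosted on the openSUSE Build Service (the path below targets Ubuntu 22.04):
sudo apt install -y curl gnupg
echo 'deb http://download.opensuse.org/repositories/security:/zeek/xUbuntu_22.04/ /' | sudo tee /etc/apt/sources.list.d/security:zeek.list
curl -fsSL https://download.opensuse.org/repositories/security:zeek/xUbuntu_22.04/Release.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/security_zeek.gpg > /dev/null
sudo apt update
sudo apt install zeek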
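Verify the version and build configuration (the binary packages install under /opt/zeek, so add /opt/zeek/bin to PATH if the command is not found):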
zeek --version
1.3 Interface and Node Configuration
Edit /usr/local/zeek/etc/node.cfg or /opt/zeek/etc/node.cfg depending on install path:
[zeek-1]
type=standalone
host=localhost
interface=eth0
For multi-node cluster setups, define manager and workers accordingly.
1.4 Custom Local Script Configuration
Edit /usr/local/zeek/share/zeek/site/local.zeek to enable or tune scripts, e.g., enabling scan detection:
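# misc/scan is deprecated in recent Zeek releases; confirm it still ships with your version
@load policy/misc/scan
# Unique ports a host must probe on one target within Scan::port_scan_interval before a Scan::Port_Scan notice fires (stock default: 15 ports in 5 minutes)
redef Scan::port_scan_threshold = 50.0;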
1.5 Launch Zeek
sudo zeekctl deploy
Monitor logs in /usr/local/zeek/logs/current/.
Step 2: Network Traffic Capture & Log Collection
Zeek automatically produces high-fidelity logs, including but not limited to:
- conn.log: TCP/UDP/ICMP connection summaries
- dns.log: DNS query and response metadata
- http.log: HTTP requests and responses
- files.log: File transfer metadata
- ssl.log: TLS handshake details
- notice.log: Alerts and warnings generated by Zeek scripts, including the scan notices (Scan::Port_Scan, Scan::Address_Scan) raised by misc/scan, which does not write a separate scan.log
Collect baseline traffic continuously for 24–72 hours of normal network activity so the baseline reflects diurnal usage patterns and protocol diversity.
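To see why a multi-day window matters, the diurnal pattern can be summarized as an average connection count per hour of day. The following is a minimal sketch, assuming you have already extracted the ts column of conn.log as epoch seconds; the function name hourly_profile is illustrative and not part of Zeek.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable


def hourly_profile(timestamps: Iterable[float]) -> Dict[int, float]:
    """Average number of connections per hour of day (UTC), across the days seen."""
    per_day_hour = defaultdict(int)  # (date, hour) -> connection count
    for ts in timestamps:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        per_day_hour[(dt.date(), dt.hour)] += 1

    daily_counts = defaultdict(list)  # hour -> one count per day that had traffic
    for (_, hour), count in per_day_hour.items():
        daily_counts[hour].append(count)

    # Simplification: a day with zero connections in a given hour is not
    # counted as 0, so quiet hours are slightly overestimated.
    return {hour: sum(c) / len(c) for hour, c in sorted(daily_counts.items())}


# Two connections at 00:xx on day 1, one at 01:xx on day 1, one at 00:xx on day 2
print(hourly_profile([0, 60, 3600, 86400]))  # {0: 1.5, 1: 1.0}
```

With only 24 hours of data each hour has a single sample, which is why a 48–72 hour capture gives a more trustworthy profile.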
Step 3: Baseline Extraction and Analysis
3.1 Raw Log Parsing Using CLI Tools
Zeek writes its logs as tab-separated value (TSV) files by default, with metadata header lines prefixed by #. Use zeek-cut, the column extraction tool that ships with Zeek, to select and filter fields.
Example: Extract and count the top destination IPs in conn.log (files under logs/current/ are uncompressed; use zcat only for rotated .log.gz archives):
zeek-cut id.resp_h < logs/current/conn.log | sort | uniq -c | sort -nr | head -20
Example: Summarize the most frequent DNS query types and names:
zeek-cut qtype_name query < logs/current/dns.log | sort | uniq -c | sort -nr | head -20
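When zeek-cut is not available (for example on an analyst workstation), the same extraction can be approximated in a few lines of Python. This is a sketch, assuming the default TSV format with a #fields header line; parse_zeek_tsv and top_values are illustrative helper names, and the sample records are made up.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_zeek_tsv(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per record from a Zeek TSV log, keyed by the #fields header."""
    fields: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Header form: "#fields<TAB>ts<TAB>uid<TAB>..."
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#") and fields:
            yield dict(zip(fields, line.split("\t")))


def top_values(records: Iterable[Dict[str, str]], field: str, n: int = 20) -> List[Tuple[str, int]]:
    """Return the n most common values of a field, skipping Zeek's unset marker '-'."""
    counts = Counter(r[field] for r in records if r.get(field, "-") != "-")
    return counts.most_common(n)


# Tiny in-memory sample standing in for logs/current/conn.log
sample = [
    "#separator \\x09",
    "#fields\tts\tid.orig_h\tid.resp_h\tproto",
    "1700000000.0\t10.0.0.5\t93.184.216.34\ttcp",
    "1700000001.0\t10.0.0.6\t93.184.216.34\ttcp",
    "1700000002.0\t10.0.0.5\t1.1.1.1\tudp",
]
print(top_values(parse_zeek_tsv(sample), "id.resp_h"))
# [('93.184.216.34', 2), ('1.1.1.1', 1)]
```

For a real log, pass an open file object instead of the sample list, e.g. parse_zeek_tsv(open("logs/current/conn.log")).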
3.2 Importing Logs into SIEM/ELK for Visualization
- Use Filebeat or Logstash to ingest Zeek logs into Elasticsearch.
- Define index mappings reflecting Zeek’s TSV fields to allow fielded queries.
- Create Kibana dashboards to visualize:
  - Protocol distribution (conn.log’s proto field)
  - Top internal/external talkers (id.orig_h and id.resp_h)
  - DNS query patterns (query, rcode_name)
  - Scan activity (aggregate counts of Scan:: notices in notice.log)
3.3 Statistical Baseline Metrics
| Metric | Field(s) | Typical Baseline Description |
|---|---|---|
| Top Protocols | conn.log: proto, service | TCP predominant, mostly HTTP and TLS services |
| Common External IPs | conn.log: id.resp_h | Trusted external IP ranges |
| DNS Query Volume | dns.log: query | Frequent legitimate domains |
| Connection Duration | conn.log: duration | Mean duration per protocol |
| Port Scan Frequency | notice.log: note (Scan::Port_Scan, Scan::Address_Scan) | Near zero during normal operation |
| File Transfer Sizes | files.log: fuid, seen_bytes, total_bytes | Typical small to medium file sizes |
Use these metrics to build threshold values for anomaly detection.
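One common way to turn a baseline metric into a threshold is mean plus k standard deviations. The sketch below applies that to connection duration per protocol; the choice of k = 3 and the function name duration_thresholds are assumptions for illustration, and records are expected as dicts keyed by conn.log field names.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Mapping


def duration_thresholds(records: Iterable[Mapping[str, str]], k: float = 3.0) -> Dict[str, float]:
    """Per-protocol alert threshold: mean duration + k population standard deviations."""
    by_proto = defaultdict(list)
    for rec in records:
        # Zeek writes '-' for unset fields, e.g. connections with no measured duration
        if rec.get("duration", "-") != "-":
            by_proto[rec["proto"]].append(float(rec["duration"]))
    return {proto: mean(d) + k * pstdev(d) for proto, d in by_proto.items()}


# Made-up baseline records for illustration
baseline = [
    {"proto": "tcp", "duration": "1.0"},
    {"proto": "tcp", "duration": "3.0"},
    {"proto": "udp", "duration": "0.5"},
    {"proto": "udp", "duration": "-"},
]
print(duration_thresholds(baseline))
```

Connection durations are usually heavily skewed, so percentile-based thresholds may fit real traffic better than a mean-based rule; treat this as a starting point.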
Step 4: Anomaly Simulation and Detection
4.1 Generating Anomalous Traffic
- Port Scanning: Use Nmap to scan entire subnet.
nmap -p- 192.168.1.0/24
- DNS Anomalies: Generate random or high-volume DNS queries.
dig randomsubdomain$(date +%s).example.com @8.8.8.8
- Data Exfiltration: Transfer large files over HTTP or FTP from internal to external IPs.
4.2 Zeek Detection Capabilities
- Zeek’s scan detection raises Scan::Port_Scan and Scan::Address_Scan notices in notice.log, recording the scanning IP, a timestamp, and a message describing how many ports or hosts were probed.
- Elevated NXDOMAIN rates in dns.log can indicate domain generation algorithm (DGA) activity.
- Large transfer sizes and long durations appear in conn.log and files.log, which is useful for exfiltration detection.
Example command to list port and address scan notices (notice.log is TSV by default, so zeek-cut is used rather than jq):
zeek-cut -d ts note src dst msg < logs/current/notice.log | awk -F'\t' '$2 ~ /^Scan::/'
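The NXDOMAIN signal mentioned above can be quantified per client. The following is a minimal sketch, assuming dns.log records are available as dicts with the id.orig_h and rcode_name fields; the min_queries cutoff of 5 is an arbitrary illustrative value.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def nxdomain_ratio(records: Iterable[Mapping[str, str]], min_queries: int = 5) -> Dict[str, float]:
    """Fraction of each client's DNS queries that returned NXDOMAIN."""
    total, failed = Counter(), Counter()
    for rec in records:
        src = rec["id.orig_h"]
        total[src] += 1
        if rec.get("rcode_name") == "NXDOMAIN":
            failed[src] += 1
    # Skip hosts with too few queries for the ratio to be meaningful
    return {src: failed[src] / n for src, n in total.items() if n >= min_queries}


# Made-up records: one noisy client and one quiet client
records = (
    [{"id.orig_h": "10.0.0.5", "rcode_name": "NXDOMAIN"}] * 4
    + [{"id.orig_h": "10.0.0.5", "rcode_name": "NOERROR"}]
    + [{"id.orig_h": "10.0.0.6", "rcode_name": "NOERROR"}] * 2
)
print(nxdomain_ratio(records))  # {'10.0.0.5': 0.8}
```

Comparing each host's ratio against the ratio observed during the baseline window is what separates a DGA-like burst from ordinary typos and stale bookmarks.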
Step 5: Correlation Rules & SOC Integration
5.1 Forward Zeek Logs to SIEM
- Use syslog or Filebeat forwarding for real-time ingestion.
- Normalize fields using SIEM parsing pipelines.
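As an illustration of what a normalization step does, the sketch below renames Zeek's dotted field names and casts numeric strings. The target names (src_ip, dest_ip, and so on) are a hypothetical schema, not one mandated by any particular SIEM; adapt FIELD_MAP to whatever your platform expects.

```python
from typing import Any, Dict, Mapping

# Hypothetical target schema; adjust to your SIEM's field naming
FIELD_MAP = {
    "ts": "timestamp",
    "id.orig_h": "src_ip",
    "id.orig_p": "src_port",
    "id.resp_h": "dest_ip",
    "id.resp_p": "dest_port",
}
INT_FIELDS = {"src_port", "dest_port", "orig_bytes", "resp_bytes"}
FLOAT_FIELDS = {"timestamp", "duration"}


def normalize(record: Mapping[str, str], log_type: str) -> Dict[str, Any]:
    """Rename and type-cast one Zeek TSV record; unset or empty fields are dropped."""
    out: Dict[str, Any] = {"event_type": f"zeek_{log_type}"}
    for key, value in record.items():
        if value in ("-", "(empty)"):  # Zeek's markers for unset and empty fields
            continue
        name = FIELD_MAP.get(key, key)
        if name in INT_FIELDS:
            out[name] = int(value)
        elif name in FLOAT_FIELDS:
            out[name] = float(value)
        else:
            out[name] = value
    return out


raw = {"ts": "1700000000.5", "id.orig_h": "10.0.0.5", "id.orig_p": "51234",
       "duration": "-", "proto": "tcp"}
print(normalize(raw, "conn"))
```

Dropping unset fields rather than storing "-" keeps numeric mappings clean on the SIEM side.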
5.2 Sample Detection Rules (Pseudocode)
rule: Port Scan Detected
when:
    # The port threshold itself is enforced in Zeek via Scan::port_scan_threshold
    event_type == "zeek_notice" and note == "Scan::Port_Scan"
then:
    alert("Potential reconnaissance activity from " + src)

rule: Suspicious DNS NXDOMAIN Spike
when:
    dns.rcode_name == "NXDOMAIN" and query_rate > baseline_threshold
then:
    alert("Possible DGA activity from " + id.orig_h)

rule: Large Data Exfiltration
when:
    conn.orig_bytes > baseline_average * 10 and conn.duration > 300
then:
    alert("Potential data exfiltration from " + id.orig_h)
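To make the exfiltration rule concrete, here is a rough Python equivalent that can be run against parsed conn.log records before committing the logic to a SIEM. It assumes records are dicts keyed by Zeek field names, and it uses orig_bytes (payload bytes sent by the originator) as the "bytes sent" measure; the function name and default values are illustrative.

```python
from typing import Iterable, List, Mapping


def exfil_alerts(
    conns: Iterable[Mapping[str, str]],
    baseline_avg_bytes: float,
    multiplier: float = 10.0,
    min_duration: float = 300.0,
) -> List[str]:
    """Flag connections that send far more than the baseline and last a long time."""
    alerts = []
    for conn in conns:
        sent = conn.get("orig_bytes", "-")
        duration = conn.get("duration", "-")
        if sent == "-" or duration == "-":
            continue  # Zeek leaves these unset for some connections
        if int(sent) > baseline_avg_bytes * multiplier and float(duration) > min_duration:
            alerts.append(f"Potential data exfiltration from {conn['id.orig_h']}")
    return alerts


# Made-up connections: only the first is both large and long-lived
conns = [
    {"id.orig_h": "10.0.0.5", "orig_bytes": "50000000", "duration": "600.0"},
    {"id.orig_h": "10.0.0.6", "orig_bytes": "50000000", "duration": "10.0"},
    {"id.orig_h": "10.0.0.7", "orig_bytes": "-", "duration": "-"},
]
print(exfil_alerts(conns, baseline_avg_bytes=1_000_000))
# ['Potential data exfiltration from 10.0.0.5']
```

Replaying the baseline capture through a function like this is a quick way to estimate how many false positives a given multiplier would produce.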
5.3 Automation and Playbook Integration
- Trigger SOAR workflows based on alerts for automated containment.
- Enrich events with threat intelligence based on IP/domain reputation.
- Use Zeek scripts to raise custom notices in notice.log and feed them into alert pipelines.
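The enrichment step can be pictured as a lookup that attaches reputation data to an alert before it reaches an analyst or a SOAR playbook. The sketch below uses a hard-coded dictionary as a stand-in for a real threat-intelligence feed; the table contents, field names, and severity policy are all hypothetical.

```python
from typing import Any, Dict, Mapping

# Hypothetical local reputation table; in practice this would come from a threat-intel feed
REPUTATION = {"203.0.113.7": "known-c2", "198.51.100.9": "tor-exit"}


def enrich(alert: Mapping[str, Any], reputation: Mapping[str, str] = REPUTATION) -> Dict[str, Any]:
    """Attach reputation hits for the alert's IPs and raise severity when any are found."""
    enriched = dict(alert)
    hits = {
        field: reputation[alert[field]]
        for field in ("src_ip", "dest_ip")
        if alert.get(field) in reputation
    }
    enriched["threat_intel"] = hits
    enriched["severity"] = "high" if hits else alert.get("severity", "medium")
    return enriched


alert = {"rule": "Large Data Exfiltration", "src_ip": "10.0.0.5", "dest_ip": "203.0.113.7"}
print(enrich(alert))
```

A playbook could then branch on the severity field, for example reserving automated containment for alerts that have at least one reputation hit.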
Conclusion
Zeek gives SOC analysts deep visibility into network behavior. Building baselines and detecting anomalies from Zeek’s rich log data, combined with SIEM correlation, supports early detection of stealthy adversaries and helps reduce alert fatigue. This setup provides a scalable, extensible foundation for SOC operations.