
📌 Context
Regex is powerful, but only if you know how to use it inside the tools that matter. This is where analysts stop copying patterns from blogs and start operationalizing regex in the SOC. grep, Splunk, ELK, Suricata, Sigma, and YARA all speak regex — but the syntax and workflows differ. This part shows how regex is applied directly in blue team contexts to make your searches faster, your detections sharper, and your hunts more effective.
🔬 Regex in grep
About grep: grep searches text using regex. It’s fast, brutal, and perfect for triage. Many SOC pipelines still begin at the command line with grep before moving into Splunk or ELK.
Quick reminder on pipes (|): The pipe takes the output of one command and feeds it into the next. Think of it as an assembly line: each stage cleans, sorts, or reshapes the data for the next tool. You’ll see this in the IOC frequency examples below.
# Find lines starting with ERROR
grep -E "^ERROR" app.log
# Hunt .exe downloads in proxy logs (the $ anchor assumes the URL is the last field on each line)
grep -E "\.exe$" proxy.log | less
# Extract and count Event IDs from Windows log exports (-o prints only the matched text, so cut sees just EventID=NNNN)
grep -Eo "EventID=[0-9]{4}" security.evtx.txt | cut -d= -f2 | sort | uniq -c
IOC frequency analysis with pipes:
# Rank IPs by how often they appear in the log
grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}" access.log \
| sort \
| uniq -c \
| sort -nr \
| head
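This chain extracts every IP-shaped string (grep -Eo), sorts the results so duplicates sit next to each other, counts each distinct value (uniq -c), re-sorts by count in descending order (sort -nr), and shows the top ten (head). That’s regex plus shell logic, giving you a quick and dirty IOC leaderboard. Keep in mind that the pattern checks shape only, so it also matches impossible addresses such as 999.999.999.999.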
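When the same leaderboard has to run somewhere grep is not available, or when shape-only matches need to be filtered out, the pipeline can be approximated in Python. This is a minimal sketch, not a drop-in replacement: the log lines are made up for illustration, the function name top_ips is hypothetical, and Python's re module is a different regex flavor from grep -E, although this particular pattern behaves the same in both.

```python
import ipaddress
import re
from collections import Counter

# Same pattern as the grep example: four dot-separated groups of 1-3 digits.
IP_PATTERN = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def top_ips(lines, limit=10):
    """Count IPv4-shaped strings, keeping only those that parse as real addresses."""
    counts = Counter()
    for line in lines:
        for candidate in IP_PATTERN.findall(line):
            try:
                ipaddress.IPv4Address(candidate)
            except ValueError:
                continue  # e.g. 999.1.1.1 matches the regex but is not an IP
            counts[candidate] += 1
    return counts.most_common(limit)


# Hypothetical log lines, made up for illustration.
sample_log = [
    "10.0.0.5 - - GET /index.html 200",
    "10.0.0.5 - - GET /login 302",
    "203.0.113.9 - - GET /admin 403",
    "999.1.1.1 - - GET /bogus 400",
]

print(top_ips(sample_log))  # [('10.0.0.5', 2), ('203.0.113.9', 1)]
```

The Counter plays the role of sort | uniq -c | sort -nr, and the ipaddress check drops the 999.1.1.1 entry that the regex alone would have counted.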
🔬 Regex in Splunk
Splunk exposes regex through two search commands: regex (for filtering events) and rex (for field extraction). Anchoring patterns keeps false positives low.
# Keep only events whose url ends in .exe (everything else is dropped)
index=proxy | regex url="\.exe$"
# Extract user IDs from log lines
index=auth | rex "UserID=(?<userid>\d+)"
Regex in Splunk lets you move past keyword search into targeted extractions. Once fields exist, they can power dashboards, alerts, and pivots.
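Before a rex extraction goes into a saved search, it can help to try the capture group against a few raw events offline. The sketch below does that in Python under some assumptions: the sample events are invented, extract_userid is a hypothetical helper, and the named group has to be rewritten because Python's re module spells it (?P<name>...) rather than the (?<name>...) form used in the rex example.

```python
import re

# Python's re module writes named groups as (?P<name>...); the rest of the
# pattern is unchanged from the rex example.
USERID_PATTERN = re.compile(r"UserID=(?P<userid>\d+)")


def extract_userid(raw_event):
    """Return the captured userid as a string, or None when the event has none."""
    match = USERID_PATTERN.search(raw_event)
    return match.group("userid") if match else None


# Hypothetical auth events, made up for illustration.
events = [
    "2025-09-08T10:00:00Z login ok UserID=4821 src=10.0.0.5",
    "2025-09-08T10:00:02Z login failed UserID=guest src=10.0.0.7",
]

for event in events:
    print(extract_userid(event))  # 4821, then None
```

The second event shows the kind of gap worth finding early: \d+ extracts nothing from a non-numeric ID, so a dashboard built on this field would not count that login.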
🔬 Regex in ELK (Elastic / Logstash)
Regex shows up in Logstash grok patterns and Kibana queries. When grok falls short, raw regex closes the gap.
grok {
match => { "message" => "(?<domain>([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,})" }
}
Here regex pulls domains out of arbitrary log messages, something built-in grok patterns might miss.
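To see what this expression does and does not pull out, it can be run against a couple of sample messages. This is a rough sketch in Python rather than grok itself: the messages are made up, extract_domains is a hypothetical name, the named group is rewritten in Python's (?P<name>...) spelling, and the inner group is made non-capturing for convenience.

```python
import re

# Same expression as the grok example, translated to Python's named-group
# syntax, with the repeated label group made non-capturing.
DOMAIN_PATTERN = re.compile(r"(?P<domain>(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,})")


def extract_domains(message):
    """Return every domain-shaped substring found in a log message."""
    return [m.group("domain") for m in DOMAIN_PATTERN.finditer(message)]


# Hypothetical log messages, made up for illustration.
print(extract_domains("DNS query for login.example.com from 10.0.0.5"))
# ['login.example.com']
print(extract_domains("user opened report.pdf"))
# ['report.pdf']
```

The IP address is skipped because the final label must be letters, but a filename like report.pdf has the same shape as a domain and gets extracted too. That is worth knowing before the field feeds an alert.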
🔬 Regex in Suricata/Snort
The pcre keyword embeds a Perl-compatible regex inside an IDS signature. This adds flexibility when static strings won’t cut it.
alert http any any -> any any (msg:"Suspicious domain"; \
pcre:"/([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}/"; \
sid:100002; rev:1;)
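As written, this rule fires on HTTP traffic in either direction that contains any domain-shaped string, which in practice is nearly every request. Treat it as a template: narrow the pattern, and the traffic direction, to the naming scheme you are hunting before deploying it. A well-scoped regex can catch new variants of that scheme without waiting on signature updates.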
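How much the match scope matters is easy to show offline. The sketch below assumes the pattern is evaluated against the whole raw request, and compares that with looking only at the Host header value. The request text is invented and host_from_request is a hypothetical helper; inside Suricata the equivalent scoping would be done with the engine's HTTP buffer keywords rather than a second regex.

```python
import re

# The domain expression from the rule, with the group made non-capturing.
DOMAIN_PATTERN = re.compile(r"(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}")

# Pulls the value of the Host header out of a raw HTTP request.
HOST_HEADER = re.compile(r"^Host:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE)


def host_from_request(raw_request):
    """Return the Host header value, or None if the request has no Host line."""
    match = HOST_HEADER.search(raw_request)
    return match.group(1) if match else None


# Hypothetical HTTP request, made up for illustration.
request = (
    "GET /download/setup.exe HTTP/1.1\r\n"
    "Host: cdn.example.net\r\n"
    "User-Agent: curl/8.0\r\n"
    "\r\n"
)

# Matching against the whole request also picks up the filename in the URI.
print(DOMAIN_PATTERN.findall(request))  # ['setup.exe', 'cdn.example.net']

# Matching only the Host header leaves just the hostname.
print(host_from_request(request))  # cdn.example.net
```

Scoping the regex to one field removes the accidental hit on setup.exe and gives the engine less data to scan per request.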
🔬 Regex in Sigma & YARA
Sigma (SIEM detection):
title: Suspicious File Extension
logsource:
  category: proxy
detection:
  selection:
    url|re: '\.exe$'
  condition: selection
YARA (file/memory scanning):
rule Suspicious_SHA256 {
strings:
$sha256 = /\b[a-fA-F0-9]{64}\b/
condition:
$sha256
}
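The word boundaries in that pattern are doing real work, which a quick offline check makes visible. This is a sketch using Python's re module on bytes, on the assumption that \b behaves here the way it does in the YARA rule for plain ASCII data. The buffers are invented and find_sha256_strings is a hypothetical helper.

```python
import hashlib
import re

# Bytes pattern mirroring the YARA regex: exactly 64 hex characters with a
# word boundary on each side.
SHA256_PATTERN = re.compile(rb"\b[a-fA-F0-9]{64}\b")


def find_sha256_strings(data):
    """Return every standalone 64-character hex run in a bytes buffer."""
    return [m.decode("ascii") for m in SHA256_PATTERN.findall(data)]


# A real SHA-256 digest (of empty input) embedded in a made-up buffer.
digest = hashlib.sha256(b"").hexdigest()
buffer_with_hash = b"config hash=" + digest.encode("ascii") + b" end"

# An 80-character hex run: longer than a SHA-256, so it should not match.
buffer_with_long_hex = b"padding " + b"ab" * 40 + b" end"

print(find_sha256_strings(buffer_with_hash) == [digest])  # True
print(find_sha256_strings(buffer_with_long_hex))          # []
```

Without the boundaries, any 64-character slice of a longer hex blob would count as a hit, which gets noisy quickly on binaries and memory dumps.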
📎 Attacker Behavior Snapshot
Regex makes attacker behavior visible across multiple layers:
- Beaconing IPs in firewall logs with grep.
- Odd domains in Splunk proxy dashboards.
- Suspicious downloads flagged by Suricata pcre rules.
- Persistence indicators codified into Sigma and YARA.
🛠️ SOC Detection Strategy
- Tier 1: IOC checks with grep — fast, dirty, effective.
- Tier 2: Regex searches in Splunk/ELK for validation and correlation.
- Tier 3: Promotion into Suricata, Sigma, or YARA for codified detection.
🔐 Hardening & Mitigation
- Anchor expressions (^, $) to control scope.
- Test patterns in regex101 or CyberChef before production.
- Benchmark regex in Splunk/ELK — bad expressions can crush performance.
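The testing and benchmarking habits in this list can also be scripted, so a pattern carries its own pass/fail cases with it. The following is a minimal sketch, not a substitute for testing in the target tool: check_pattern and time_pattern are hypothetical helpers, the URLs are made up, and timings from Python's re engine only hint at how a pattern will behave in Splunk or ELK.

```python
import re
import timeit


def check_pattern(pattern, should_match, should_not_match):
    """Return a list of (problem, text) pairs; an empty list means all cases passed."""
    compiled = re.compile(pattern)
    failures = []
    for text in should_match:
        if not compiled.search(text):
            failures.append(("missed", text))
    for text in should_not_match:
        if compiled.search(text):
            failures.append(("false positive", text))
    return failures


def time_pattern(pattern, text, repeats=1000):
    """Return total seconds spent running the pattern against text `repeats` times."""
    compiled = re.compile(pattern)
    return timeit.timeit(lambda: compiled.search(text), number=repeats)


# Hypothetical URLs, made up for illustration.
hits = ["http://example.com/a.exe"]
misses = ["http://example.com/a.exe.txt"]

print(check_pattern(r"\.exe$", hits, misses))
# []
print(check_pattern(r"\.exe", hits, misses))
# [('false positive', 'http://example.com/a.exe.txt')]

# Timing varies by machine, so no expected value is shown.
print(time_pattern(r"\.exe$", "http://example.com/" + "a" * 5000 + ".exe"))
```

The unanchored pattern fails its own test case, which is the same mistake the first bullet warns about, caught before it reaches production.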
📋 Incident Response Snippets
- grep IOC sweep:
grep -E "192\.168\.[0-9]{1,3}\.[0-9]{1,3}" access.log
- Splunk beaconing hunt:
index=proxy | regex url="([0-9]{1,3}\.){3}[0-9]{1,3}"
- YARA domain rule:
rule Regex_Domain {
strings:
$domain = /([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}/
condition:
$domain
}
🧾 Final Thoughts
This is where regex leaves the whiteboard and enters battle. grep pipelines, Splunk field extractions, ELK grok fallbacks, Suricata pcre rules, Sigma/YARA detections — regex ties them all together. If you know how to wield regex across these tools, you’ll see attacker fingerprints everywhere they try to hide. Next up: Part 4 — Pitfalls & Tuning, where we cover the traps that trip analysts and cripple SIEMs.
Published: September 8, 2025