Automated Attack Surface Discovery with OWASP ZAP: CI/CD Integration & Compliance Workflows
Modern web architectures expose endpoints faster than any manual inventory can keep pace with. OWASP ZAP, deployed as an automated discovery engine inside a DevSecOps pipeline, continuously maps that expanding surface and flags regressions the moment they appear. This guide is part of Attack Surface Mapping Techniques, which sits within the broader Threat Modeling Fundamentals & Methodology practice. Where relevant, this page also cross-references defining trust boundaries for scope enforcement and the STRIDE framework for classifying what ZAP uncovers.
Prerequisites
- OWASP ZAP 2.14+ (Docker image ghcr.io/zaproxy/zaproxy:stable or local install)
- A target application accessible from the CI runner (Docker Compose or a dedicated staging slot)
- A CI/CD system with secrets management (GitHub Actions, GitLab CI, or equivalent)
- Python 3.10+ for API scripting; jq for report post-processing
- A dedicated CI service account with least-privilege access to the target app
Expected Outcomes
- ZAP context configured with strict inclusion/exclusion rules and authenticated sessions
- Baseline DAST scan running on every pull request, uploading SARIF to the repository’s Security tab
- SPA routes, GraphQL endpoints, and WebSocket upgrades included in the discovered attack surface
- Compliance evidence JSON mapped to SOC 2, OWASP ASVS, and PCI DSS controls, ready for auditors
- Discovered endpoints diffed against the threat model registry with automated ticket creation on drift
Step 1: Define ZAP Scan Scope and Authentication Contexts
Dynamic scanning without explicit boundaries generates false positives, violates rate limits, and risks data corruption. ZAP contexts enforce strict inclusion/exclusion rules and define authentication lifecycles. Align these boundaries with the trust boundaries already established in your architecture documentation so the scanner never crosses into a zone it does not own.
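ZAP contexts are imported via the REST API or CLI. ZAP's own context files are XML, so treat the JSON below as a tool-neutral scope definition that a wrapper script translates into API calls. It defines a strict scope, a JSON-based login with cookie session handling, and session validation indicators.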
{
  "context": {
    "name": "production-api-scope",
    "description": "Authenticated API surface for CI/CD baseline",
    "inScope": true,
    "urls": [
      "https://api\\.example\\.com/.*",
      "https://app\\.example\\.com/.*"
    ],
    "excludeFromScan": [
      "https://api\\.example\\.com/health",
      "https://api\\.example\\.com/admin/.*",
      "https://cdn\\.thirdparty\\.com/.*"
    ],
    "authentication": {
      "type": "json",
      "method": "POST",
      "loginUrl": "https://api.example.com/v1/auth/login",
      "loginRequestData": "{\"email\":\"{%username%}\",\"password\":\"{%password%}\"}",
      "loggedInIndicator": "\"status\":\"authenticated\"",
      "loggedOutIndicator": "\"status\":\"unauthorized\""
    },
    "users": [
      {
        "name": "test-scanner-user",
        "credentials": {
          "username": "[email protected]",
          "password": "${ZAP_SCANNER_PASSWORD}"
        }
      }
    ],
    "sessionManagement": {
      "type": "cookieBasedSessionManagement",
      "parameters": {
        "cookieName": "session_id"
      }
    }
  }
}
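One way to apply a scope definition like this is to translate it into calls against ZAP's context API. The sketch below only plans those calls and expands ${NAME} secrets from the environment; it does not send anything. The helper names expand_env and plan_context_calls are hypothetical, and the sketch assumes the JSON layout shown above. The newContext, includeInContext, and excludeFromContext actions are ZAP context API endpoints; authentication and user setup are left out.

```python
import re


def expand_env(value: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from env; fail loudly if unset."""
    def repl(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"missing secret: {name}")
        return env[name]
    return re.sub(r"\$\{([A-Z0-9_]+)\}", repl, value)


def plan_context_calls(scope: dict) -> list:
    """Return (api_path, params) tuples that would recreate the scope in ZAP."""
    ctx = scope["context"]
    name = ctx["name"]
    calls = [("/JSON/context/action/newContext/", {"contextName": name})]
    for regex in ctx.get("urls", []):
        calls.append(("/JSON/context/action/includeInContext/",
                      {"contextName": name, "regex": regex}))
    for regex in ctx.get("excludeFromScan", []):
        calls.append(("/JSON/context/action/excludeFromContext/",
                      {"contextName": name, "regex": regex}))
    return calls


if __name__ == "__main__":
    # Illustrative subset of the scope definition; values mirror the JSON above.
    demo = {
        "context": {
            "name": "production-api-scope",
            "urls": [r"https://api\.example\.com/.*"],
            "excludeFromScan": [r"https://api\.example\.com/admin/.*"],
        }
    }
    for path, params in plan_context_calls(demo):
        print(path, params)
```

Keeping the planning step separate from the HTTP calls makes the scope reviewable in a pull request before any scanner touches the target.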
Security boundaries to enforce at this step:
- Never scan /admin, /internal, or third-party CDN paths.
- Use isolated CI service accounts with least-privilege RBAC. This applies the same principle as injection attack prevention: reducing blast radius limits the damage from any tooling misconfiguration.
- Enforce excludeFromScan regexes to block destructive endpoints such as DELETE /v1/users/*. Exclusions match on the URL only, so exclude the whole path rather than a single HTTP method.
- Rotate credentials via CI secret managers; never hardcode credentials in the context file.
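Scope regexes are easy to get subtly wrong, so it can help to test them offline before a scan runs. The sketch below checks URLs against the include and exclude patterns from the context definition. The function name in_scope is hypothetical, and the sketch assumes the scanner applies each regex as a full match against the whole URL.

```python
import re

# Patterns copied from the context definition in Step 1.
INCLUDE = [
    r"https://api\.example\.com/.*",
    r"https://app\.example\.com/.*",
]
EXCLUDE = [
    r"https://api\.example\.com/health",
    r"https://api\.example\.com/admin/.*",
    r"https://cdn\.thirdparty\.com/.*",
]


def in_scope(url, include=INCLUDE, exclude=EXCLUDE):
    """True when url fully matches an include regex and no exclude regex."""
    if any(re.fullmatch(pattern, url) for pattern in exclude):
        return False
    return any(re.fullmatch(pattern, url) for pattern in include)


if __name__ == "__main__":
    for candidate in (
        "https://api.example.com/v1/users",
        "https://api.example.com/admin/users",
        "https://api.example.com/admin",
        "https://cdn.thirdparty.com/lib.js",
    ):
        print(candidate, in_scope(candidate))
```

Under the full-match assumption, the bare https://api.example.com/admin URL stays in scope because the exclusion requires a trailing slash. A pattern such as https://api\.example\.com/admin(/.*)? would cover both forms.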
Step 2: CI/CD Pipeline Integration and Baseline Gating
Baseline scans identify obvious misconfigurations and run fast enough to gate every pull request. The workflow below is non-blocking on PRs but fails hard on critical findings in mainline branches.
name: ZAP Baseline DAST Scan
on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]
# upload-sarif needs write access to code scanning results
permissions:
  contents: read
  security-events: write
jobs:
  zap-baseline:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Start Target App
        run: docker compose -f docker-compose.test.yml up -d --wait
      - name: Run ZAP Baseline Scan
        uses: zaproxy/action-baseline@<version>  # pin to a released tag
        with:
          target: 'http://localhost:8080'
          # -a: include alpha passive rules, -j: also run the AJAX spider,
          # -r: write the HTML report, -d: debug output
          cmd_options: '-a -j -r zap-baseline.html -d'
          allow_issue_writing: false
          fail_action: ${{ github.ref == 'refs/heads/main' }}
          rules_file_name: '.zap/rules.tsv'
          token: ${{ secrets.GITHUB_TOKEN }}
          artifact_name: 'zap-baseline-report'
      - name: Upload SARIF Report
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'zap-baseline.sarif'
Pipeline gating strategy:
- fail_action evaluates to false on PRs so merges are not blocked, but alerts appear in the GitHub Security tab for developer review.
- On main pushes, fail_action resolves to true, blocking promotion on High-risk findings. ZAP's risk scale runs Informational, Low, Medium, High; it has no Critical level.
- Use .zap/rules.tsv to tune how each alert is treated (IGNORE, WARN, or FAIL) per environment. Suppress known false positives by alert ID rather than by risk category to avoid masking real issues.
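The gating policy above can be reasoned about offline with a small sketch. It parses a rules file in the tab-separated layout the baseline scan uses (alert ID, then IGNORE, WARN, or FAIL, then a free-text name) and decides whether a run should block. The functions parse_rules and should_fail are hypothetical, and treating unlisted alerts as WARN is an assumption. This is a model of the policy, not the action's internal logic.

```python
def parse_rules(tsv_text):
    """Map alert ID -> action (IGNORE, WARN, FAIL) from rules.tsv content."""
    rules = {}
    for line in tsv_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) >= 2:
            rules[parts[0]] = parts[1].upper()
    return rules


def should_fail(alert_ids, rules, on_main):
    """Block only on the mainline branch, and only for alerts marked FAIL."""
    if not on_main:
        return False
    # Assumption: alerts missing from the rules file are treated as WARN.
    return any(rules.get(alert_id, "WARN") == "FAIL" for alert_id in alert_ids)


if __name__ == "__main__":
    sample = (
        "# zap rules\n"
        "10011\tIGNORE\t(Cookie Without Secure Flag)\n"
        "40012\tFAIL\t(Cross Site Scripting (Reflected))\n"
    )
    rules = parse_rules(sample)
    print(should_fail(["40012"], rules, on_main=True))
    print(should_fail(["40012"], rules, on_main=False))
```

Suppressing by alert ID, as here, keeps an unrelated High-risk rule from being silenced by accident.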
Step 3: Handle SPAs, GraphQL, and WebSocket Surfaces
Traditional link-following crawlers fail on client-side routing, GraphQL mutations, and WebSocket upgrade requests. ZAP requires explicit configuration to map these surfaces; otherwise entire attack vectors stay invisible to the scanner, a gap that compounds when you apply the STRIDE framework to assess what those missed surfaces expose.
AJAX Spider for SPA Route Discovery
Hash-based (#route) and History API (/route) endpoints require DOM parsing. The Python script below uses the ZAP API to force route discovery and queue discovered routes for active scanning.
#!/usr/bin/env python3
"""
ZAP Standalone Script: SPA Route Discovery & AJAX Spider Trigger
Run via: python spa_crawler.py
Requires: OWASP ZAP running on ZAP_URL with ZAP_API_KEY set
"""
import os
import time

import requests

# Read the API key from the environment; "changeme" is only a local fallback.
ZAP_API_KEY = os.environ.get("ZAP_API_KEY", "changeme")
ZAP_URL = "http://localhost:8080"
TARGET = "https://app.example.com"
TIMEOUT = 30  # seconds per API call


def zap_call(path, **params):
    """Call one ZAP JSON API endpoint and return the decoded response."""
    resp = requests.get(
        f"{ZAP_URL}/JSON/{path}/",
        params=params,
        headers={"X-ZAP-API-Key": ZAP_API_KEY},
        timeout=TIMEOUT,
    )
    resp.raise_for_status()
    return resp.json()


def request_url(item):
    """Return the URL from one AJAX Spider result.

    Assumption: a result is either a plain URL string or a message object
    whose requestHeader starts with a request line like "GET <url> HTTP/1.1".
    """
    if isinstance(item, dict):
        request_line = item.get("requestHeader", "").split("\r\n", 1)[0]
        parts = request_line.split(" ")
        return parts[1] if len(parts) >= 2 else None
    return item


def configure_and_run_ajax_spider():
    # Configure AJAX Spider for SPA wait times and route extraction
    zap_call("ajaxSpider/action/setOptionBrowserId", String="firefox-headless")
    zap_call("ajaxSpider/action/setOptionMaxDuration", Integer="15")  # minutes
    zap_call("ajaxSpider/action/setOptionEventWait", Integer="500")  # milliseconds

    # Trigger crawl; the scan action's scope parameter is named inScope
    started = zap_call("ajaxSpider/action/scan", url=TARGET, inScope="true")
    print(f"Spider started: {started}")

    # Poll until completion (bounded by the max duration set above)
    while zap_call("ajaxSpider/view/status").get("status") != "stopped":
        time.sleep(5)

    # Extract discovered URLs, de-duplicate, and queue each for active scan
    results = zap_call("ajaxSpider/view/results").get("results", [])
    urls = sorted({u for u in map(request_url, results) if u})
    for url in urls:
        zap_call("ascan/action/scan", url=url, recurse="false")
    print(f"Active scan queued for {len(urls)} SPA routes.")


if __name__ == "__main__":
    configure_and_run_ajax_spider()
GraphQL and WebSocket Configuration
| Surface | ZAP Configuration | Key Risk |
|---|---|---|
| GraphQL introspection | Add Content-Type: application/json header; seed {"query":"{__schema{types{name}}}"} as the initial request body | Schema enumeration exposes all types and mutations to an attacker |
| WebSocket | Enable via Options > WebSocket > Enable WebSocket Support; add custom active scan scripts for frame fuzzing | Real-time data exfiltration; message injection |
| Server-Sent Events | Passive scan only; record stream URLs manually and add to context | Persistent data leakage from event channels |
| Service mesh internal traffic | Exclude *.svc.cluster.local via regex; scan only ingress controller URLs | Avoids scope creep into internal service-to-service channels |
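The GraphQL row above amounts to a single seeded POST request. A minimal sketch of building that request with the standard library follows. The endpoint https://api.example.com/graphql and the function name build_introspection_request are assumptions for illustration; the sketch constructs the request without sending it.

```python
import json
import urllib.request

# Introspection query taken from the table above.
INTROSPECTION_QUERY = "{__schema{types{name}}}"


def build_introspection_request(endpoint: str) -> urllib.request.Request:
    """Build (but do not send) the POST request used to seed GraphQL discovery."""
    body = json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the GraphQL path your app exposes.
    req = build_introspection_request("https://api.example.com/graphql")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending a request like this through ZAP's proxy port would be one way to get the endpoint into the site tree so later scan phases can reach it.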
Cross-site scripting vulnerabilities are often first surfaced by ZAP’s reflected-content checks — see XSS mitigation patterns for the corresponding remediation guidance once ZAP alerts on these.
Step 4: Compliance Evidence Generation and Threat Model Synchronization
Compliance Mapping
ZAP alert IDs carry structured metadata. The table below maps high-value alerts to the audit controls you are most likely to be assessed against.
| ZAP Alert ID | Vulnerability | SOC 2 CC6.1 | OWASP ASVS 4.0 | PCI DSS 6.3.2 |
|---|---|---|---|---|
| 10010 | Cookie No HttpOnly Flag | Data Protection | V3.4.2 | Secure Coding |
| 10011 | Cookie Without Secure Flag | Access Control | V3.4.1 | Secure Coding |
| 10020 | X-Frame-Options Header Missing | System Integrity | V14.4.7 | Secure Coding |
| 40012 | Cross-Site Scripting (Reflected) | Input Validation | V5.3.3 | Secure Coding |
| 40014 | Cross-Site Scripting (Persistent) | Input Validation | V5.3.3 | Secure Coding |
| 90033 | Loosely Scoped Cookie | Data Protection | V3.4.5 | Secure Coding |
Automated Evidence Export
# Export JSON report with full alert metadata
curl -s "http://localhost:8080/JSON/core/view/alerts/?apikey=${ZAP_API_KEY}&baseurl=https://api.example.com" \
  > zap-alerts.json
# Filter High and Medium alerts; map to compliance controls.
# .pluginId carries the scan rule ID (10011, 40012, ...); .id is only the
# per-alert instance ID and never matches a rule number.
jq '[.alerts[] | select(.risk == "High" or .risk == "Medium") |
  {
    alert_id: .pluginId,
    risk: .risk,
    control: (
      if .pluginId == "10011" then "SOC2_CC6.1 / ASVS_V3.4.1"
      elif .pluginId == "40012" then "PCI_DSS_6.3.2 / ASVS_V5.3.3"
      else "ISO_27001_A14 / ASVS_V14"
      end
    ),
    evidence: .other,
    remediation: .solution
  }]' zap-alerts.json > compliance-evidence.json
Store compliance-evidence.json in an immutable artifact repository (AWS S3 with Object Lock or Azure Blob WORM storage) to satisfy auditor retention requirements.
Threat Model Drift Detection
Static threat models diverge from the live application as code evolves. The diff pipeline below compares ZAP-discovered endpoints against a Git-tracked threat-model.json and raises a ticket on any unregistered surface. This enforces the continuous validation loop described in threat model documentation patterns.
# Extract ZAP-discovered endpoints after spider completes.
# Reduce each URL to its path (drop scheme, host, and query string) so it is
# comparable with the path entries in the threat model registry.
curl -s "http://localhost:8080/JSON/spider/view/results/?apikey=${ZAP_API_KEY}&scanId=0" \
  | jq -r '.results[]' \
  | sed -E 's#^[a-zA-Z]+://[^/]+##; s#[?].*$##; s#^$#/#' \
  | sort -u > discovered_endpoints.txt
# Extract expected endpoints from the threat model registry
jq -r '.endpoints[].path' threat-model.json | sort -u > expected_endpoints.txt
# Identify net-new endpoints (present in scan but missing from model)
NEW=$(comm -13 expected_endpoints.txt discovered_endpoints.txt)
if [ -n "$NEW" ]; then
  echo "DRIFT DETECTED: unregistered endpoints found:"
  echo "$NEW"
  # Raise a GitHub issue automatically
  gh issue create \
    --title "Attack surface drift: unregistered endpoints detected" \
    --body "$(echo "$NEW" | sed 's/^/- /')" \
    --label "security,p2"
fi
If expected endpoints are absent from scan results, validate authentication context or routing configuration before assuming they were removed.
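A plain line-by-line diff treats /v1/users/41 and /v1/users/42 as two separate unregistered endpoints. The sketch below shows one way to compare discovered URLs against registry entries that use path templates. The {id}-style placeholder syntax and the function names to_path, template_to_regex, and find_drift are assumptions, not part of the registry format described above.

```python
import re
from urllib.parse import urlsplit


def to_path(url):
    """Strip scheme, host, query, and fragment; drop a trailing slash except on root."""
    path = urlsplit(url).path or "/"
    return path.rstrip("/") or "/"


def template_to_regex(template):
    """Turn /v1/users/{id} into a regex matching one path segment per placeholder."""
    parts = re.split(r"\{[^/{}]+\}", template)
    return re.compile("[^/]+".join(re.escape(part) for part in parts))


def find_drift(discovered_urls, registered_templates):
    """Return sorted paths seen by the scanner that no registry template covers."""
    patterns = [template_to_regex(t) for t in registered_templates]
    paths = {to_path(url) for url in discovered_urls}
    return sorted(p for p in paths if not any(rx.fullmatch(p) for rx in patterns))


if __name__ == "__main__":
    discovered = [
        "https://api.example.com/v1/users/42?expand=roles",
        "https://api.example.com/v1/users/",
        "https://api.example.com/v1/export",
    ]
    registered = ["/v1/users", "/v1/users/{id}"]
    print(find_drift(discovered, registered))
```

Normalizing before the comparison keeps drift tickets focused on new routes rather than new identifiers.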
Verification
After running a full scan, confirm the pipeline is working correctly:
# 1. Confirm ZAP discovered the expected number of URLs
curl -s "http://localhost:8080/JSON/spider/view/results/?apikey=${ZAP_API_KEY}&scanId=0" \
| jq '.results | length'
# 2. Confirm no High alerts are present (exit 1 if any found)
HIGH=$(curl -s "http://localhost:8080/JSON/core/view/alerts/?apikey=${ZAP_API_KEY}" \
| jq '[.alerts[] | select(.risk == "High")] | length')
echo "High alerts: $HIGH"
[ "$HIGH" -eq 0 ] || exit 1
# 3. Verify SARIF output contains expected alert categories
jq '.runs[0].results | length' zap-baseline.sarif
# 4. Confirm compliance evidence file was generated and is non-empty
[ -s compliance-evidence.json ] && echo "Evidence file OK" || echo "Evidence file MISSING"
Expected baseline output for a healthy staging environment: zero High alerts, compliance-evidence.json with at least one mapped entry, and the SARIF file uploaded successfully to the GitHub Security tab under the PR.
Troubleshooting
| Failure Mode | Diagnosis | Fix |
|---|---|---|
| ZAP reports 0 discovered URLs | Authentication context misconfigured; scanner is hitting login redirect on every request | Test loggedInIndicator regex against a real auth response using curl; click the “Test” button in ZAP’s Authentication panel before CI runs |
| AJAX Spider finds no SPA routes | eventWait too short for slow JS rendering; Firefox headless not installed on runner | Increase eventWait to 1000 ms; add --shm-size=2g to the Docker run flags; confirm Firefox is available in the CI image |
| High false-positive rate on third-party domains | excludeFromScan regexes not anchored correctly | Prefix each regex with https:// and suffix with /.*; validate using ZAP’s Context editor before committing |
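| SARIF upload fails with “file not found” | No step in the workflow writes zap-baseline.sarif; in the baseline script, -j only adds the AJAX spider and -r writes the HTML report | Generate the SARIF file before the upload step, for example with ZAP’s sarif-json report template, and check that the runner’s working directory matches the SARIF path in the upload step |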
| Compliance evidence JSON is empty | ZAP found no alerts matching the select(.risk == "High" or .risk == "Medium") filter | Lower the filter to include "Low" for an initial run to confirm the pipeline is functional; then investigate why higher-risk alerts are absent |
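The first troubleshooting row suggests testing the session indicators against a real response. A quick offline check might look like the sketch below, which uses the indicator patterns from the Step 1 context. The function name classify_session and the sample bodies are illustrative assumptions.

```python
import re

# Indicator patterns copied from the context definition in Step 1.
LOGGED_IN = r'"status":"authenticated"'
LOGGED_OUT = r'"status":"unauthorized"'


def classify_session(body, logged_in=LOGGED_IN, logged_out=LOGGED_OUT):
    """Classify a response body using the logged-in and logged-out indicators."""
    if re.search(logged_in, body):
        return "logged_in"
    if re.search(logged_out, body):
        return "logged_out"
    return "unknown"


if __name__ == "__main__":
    compact = '{"status":"authenticated","user":"scanner"}'
    pretty = '{"status": "authenticated", "user": "scanner"}'
    print(classify_session(compact))
    print(classify_session(pretty))
```

The second sample is classified as unknown because of the space after the colon. If the API pretty-prints its JSON, an indicator such as "status"\s*:\s*"authenticated" tolerates that whitespace.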
Related
- Attack Surface Mapping Techniques — parent page covering the full discipline of surface discovery
- Mapping Trust Boundaries in Cloud-Native Apps — how to define the scope boundaries ZAP should enforce
- How to Apply STRIDE to Microservices Architecture — classify the threats ZAP surfaces using the STRIDE model
- Threat Model Documentation Patterns — maintain the threat model registry that ZAP diffs against
- SSRF Allowlists and Metadata Service Protection — SSRF vulnerabilities are commonly surfaced by ZAP active scans; see this page for remediation