Security Architecture
GitHub Agentic Workflows implements a defense-in-depth security architecture that protects against untrusted Model Context Protocol (MCP) servers and compromised agents. This document provides an overview of our security model and visual diagrams of the key components.
Security Model
Agentic Workflows (AW) adopts a layered approach that combines substrate-enforced isolation, declarative specification, and staged execution. Each layer enforces distinct security properties under different assumptions and constrains the impact of failures above it.
Threat Model
We consider an adversary that may compromise untrusted user-level components, e.g., containers, and may cause them to behave arbitrarily within the privileges granted to them. The adversary may attempt to:
- Access or corrupt the memory or state of other components
- Communicate over unintended channels
- Abuse legitimate channels to perform unintended actions
- Confuse higher-level control logic by deviating from expected workflows
We assume the adversary does not compromise the underlying hardware or cryptographic primitives. Attacks exploiting side channels and covert channels are also out of scope.
Layer 1: Substrate-Level Trust
AWs run on a GitHub Actions runner virtual machine (VM) and trust Actions’ hardware and kernel-level enforcement mechanisms, including the CPU, MMU, kernel, and container runtime. AW also relies on two privileged containers: (1) a network firewall that is trusted to configure connectivity for other components via iptables, and (2) an MCP Gateway that is trusted to configure and spawn isolated containers, e.g., local MCP servers. Collectively, the substrate level ensures memory isolation between components, CPU and resource isolation, mediation of privileged operations and system calls, and explicit, kernel-enforced communication boundaries. These guarantees hold even if an untrusted user-level component is fully compromised and executes arbitrary code. Trust violations at the substrate level require vulnerabilities in the firewall, MCP Gateway, container runtime, kernel, hypervisor, or hardware. If this layer fails, higher-level security guarantees may not hold.
Layer 2: Configuration-Level Trust
AW trusts declarative configuration artifacts, e.g., Action steps, network-firewall policies, and MCP server configurations, as well as the toolchains that interpret them to correctly instantiate system structure and connectivity. The configuration level constrains which components are loaded, how components are connected, which communication channels are permitted, and what component privileges are assigned. Externally minted authentication tokens, e.g., agent API keys and GitHub access tokens, are a critical configuration input and are treated as imported capabilities that bound components’ external effects; declarative configuration controls their distribution, e.g., which tokens are loaded into which containers. Security violations at this layer arise from misconfigurations, overly permissive specifications, and limitations of the declarative model. This layer defines what components exist and how they communicate, but it does not constrain how components use those channels over time.
Layer 3: Plan-Level Trust
AW additionally relies on plan-level trust to constrain component behavior over time. At this layer, the trusted compiler decomposes a workflow into stages. For each stage, the plan specifies (1) which components are active and their permissions, (2) the data produced by the stage, and (3) how that data may be consumed by subsequent stages. In particular, plan-level trust ensures that important external side effects are explicit and undergo thorough vetting.
A primary instantiation of plan-level trust is the SafeOutputs subsystem. SafeOutputs is a set of trusted components that operate on external state. An agent can interact with read-only MCP servers, e.g., the GitHub MCP server, but externalized writes, such as creating GitHub pull requests, are buffered as artifacts by SafeOutputs rather than applied immediately. When the agent finishes, SafeOutputs’ buffered artifacts are processed by a deterministic sequence of filters and analyses defined by configuration. These checks can include structural constraints, e.g., limiting the number of pull requests, policy enforcement, and automated sanitization to ensure that sensitive information such as authentication tokens is not exported. The filtered and transformed artifacts are passed to a subsequent stage in which they are externalized.
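The filter pass described above can be sketched in a few lines. This is an illustrative assumption, not the actual SafeOutputs implementation: the action shapes, the pull-request limit, and the token pattern are all fabricated for the example.

```python
import json
import re

# Illustrative sketch only: names, limits, and the token pattern are
# assumptions, not the actual SafeOutputs implementation.
MAX_PULL_REQUESTS = 1  # structural constraint: cap buffered PRs per run
TOKEN_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")  # GitHub PAT shape

def filter_buffered_actions(raw: str) -> list:
    """Validate buffered actions before any write is externalized."""
    actions = json.loads(raw)
    prs = [a for a in actions if a.get("type") == "create_pull_request"]
    if len(prs) > MAX_PULL_REQUESTS:
        raise ValueError("structural constraint violated: too many pull requests")
    if TOKEN_PATTERN.search(raw):
        raise ValueError("secret-like token found in buffered output")
    return actions

buffered = json.dumps([
    {"type": "create_issue", "title": "Flaky test in CI"},
    {"type": "create_pull_request", "title": "Fix flaky test"},
])
print(len(filter_buffered_actions(buffered)))  # → 2: both actions pass
```

Because the checks run over an inert artifact after the agent has exited, a compromised agent cannot influence the filter while it executes.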
Security violations at the planning layer arise from incorrect plan construction, incomplete or overly permissive stage definitions, or errors in the enforcement of plan transitions. This layer does not protect against failures of substrate-level isolation or mis-allocation of permissions at credential-minting or configuration time. However, it limits the blast radius of a compromised component to the stage in which it is active and its influence on the artifacts passed to the next stage.
Component Overview
The security architecture operates across multiple layers: compilation-time validation, runtime isolation, permission separation, network controls, and output sanitization. The following diagram illustrates the relationships between these components and the flow of data through the system.
flowchart TB
subgraph Input["📥 Input Layer"]
WF[/"Workflow (.md)"/]
IMPORTS[/"Imports & Includes"/]
EVENT[/"GitHub Event<br/>(Issue, PR, Comment)"/]
end
subgraph Compile["🔒 Compilation-Time Security"]
SCHEMA["Schema Validation"]
EXPR["Expression Safety Check"]
PIN["Action SHA Pinning"]
SCAN["Security Scanners<br/>(actionlint, zizmor, poutine)"]
end
subgraph Runtime["⚙️ Runtime Security"]
PRE["Pre-Activation<br/>Role & Permission Checks"]
ACT["Activation<br/>Content Sanitization"]
AGENT["Agent Execution<br/>Read-Only Permissions"]
REDACT_MAIN["Secret Redaction<br/>Credential Protection"]
end
subgraph Isolation["🛡️ Isolation Layer"]
AWF["Agent Workflow Firewall<br/>Network Egress Control"]
MCP["MCP Server Sandboxing<br/>Container Isolation"]
TOOL["Tool Allowlisting<br/>Explicit Permissions"]
end
subgraph Output["📤 Output Security"]
DETECT["Threat Detection<br/>AI-Powered Analysis"]
SAFE["Safe Outputs<br/>Permission Separation"]
SANITIZE["Output Sanitization<br/>Content Validation"]
end
subgraph Result["✅ Controlled Actions"]
ISSUE["Create Issue"]
PR["Create PR"]
COMMENT["Add Comment"]
end
WF --> SCHEMA
IMPORTS --> SCHEMA
SCHEMA --> EXPR
EXPR --> PIN
PIN --> SCAN
SCAN -->|".lock.yml"| PRE
EVENT --> ACT
PRE --> ACT
ACT --> AGENT
AGENT <--> AWF
AGENT <--> MCP
AGENT <--> TOOL
AGENT --> REDACT_MAIN
REDACT_MAIN --> DETECT
DETECT --> SAFE
SAFE --> SANITIZE
SANITIZE --> ISSUE
SANITIZE --> PR
SANITIZE --> COMMENT
Safe Outputs: Permission Isolation
The SafeOutputs subsystem enforces permission isolation by ensuring that agent execution never has direct write access to external state. The agent job runs with minimal read-only permissions, while write operations are deferred to separate jobs that execute only after the agent completes. This separation ensures that even a fully compromised agent cannot directly modify repository state.
flowchart LR
subgraph AgentJob["Agent Job<br/>🔐 Read-Only Permissions"]
AGENT["AI Agent Execution"]
OUTPUT[/"agent_output.json<br/>(Artifact)"/]
AGENT --> OUTPUT
end
subgraph Detection["Threat Detection Job"]
ANALYZE["Analyze for:<br/>• Secret Leaks<br/>• Malicious Patches"]
end
subgraph SafeJobs["Safe Output Jobs<br/>🔓 Write Permissions (Scoped)"]
direction TB
ISSUE["create_issue<br/>issues: write"]
COMMENT["add_comment<br/>issues: write"]
PR["create_pull_request<br/>contents: write<br/>pull-requests: write"]
LABEL["add_labels<br/>issues: write"]
end
subgraph GitHub["GitHub API"]
API["GitHub REST/GraphQL API"]
end
OUTPUT -->|"Download Artifact"| ANALYZE
ANALYZE -->|"✅ Approved"| SafeJobs
ANALYZE -->|"❌ Blocked"| BLOCKED["Workflow Fails"]
ISSUE --> API
COMMENT --> API
PR --> API
LABEL --> API
Agent Workflow Firewall (AWF)
The Agent Workflow Firewall (AWF) provides network egress control at the substrate level. AWF mediates all outbound network requests from the agent, enforcing a domain allowlist that constrains which external endpoints the agent may contact. This mechanism prevents unauthorized data exfiltration and limits the blast radius of a compromised agent to only those domains explicitly permitted by configuration.
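The allowlist check can be sketched as exact-or-subdomain matching. This is a minimal illustration of the idea, not the actual AWF code; the domains are examples.

```python
# Minimal sketch (not the actual AWF implementation) of egress allowlist
# matching: a request host is permitted only if it equals an allowed
# domain or is a subdomain of one.
def egress_allowed(host: str, allowlist: list) -> bool:
    host = host.lower().rstrip(".")
    return any(
        host == domain.lower() or host.endswith("." + domain.lower())
        for domain in allowlist
    )

allowlist = ["pypi.org", "registry.npmjs.org", "api.example.com"]
print(egress_allowed("pypi.org", allowlist))       # → True: exact match
print(egress_allowed("evil-pypi.org", allowlist))  # → False: suffix tricks fail
```

Matching on a leading `.` boundary matters: a naive substring check would let `evil-pypi.org` impersonate `pypi.org`.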
flowchart TB
subgraph Agent["AI Agent Process"]
COPILOT["Copilot CLI"]
WEB["WebFetch Tool"]
SEARCH["WebSearch Tool"]
end
subgraph Firewall["Agent Workflow Firewall (AWF)"]
WRAP["Process Wrapper"]
ALLOW["Domain Allowlist"]
LOG["Activity Logging"]
WRAP --> ALLOW
ALLOW --> LOG
end
subgraph Network["Network Layer"]
direction TB
ALLOWED_OUT["✅ Allowed Domains"]
BLOCKED_OUT["❌ Blocked Domains"]
end
subgraph Ecosystems["Ecosystem Bundles"]
direction TB
DEFAULTS["defaults<br/>certificates, JSON schema"]
PYTHON["python<br/>PyPI, Conda"]
NODE["node<br/>npm, npmjs.com"]
CUSTOM["Custom Domains<br/>api.example.com"]
end
COPILOT --> WRAP
WEB --> WRAP
SEARCH --> WRAP
ALLOW --> ALLOWED_OUT
ALLOW --> BLOCKED_OUT
DEFAULTS --> ALLOW
PYTHON --> ALLOW
NODE --> ALLOW
CUSTOM --> ALLOW
ALLOWED_OUT --> INTERNET["🌐 Internet"]
BLOCKED_OUT --> DROP["🚫 Dropped"]
Configuration Example:
```yaml
engine: copilot
network:
  firewall: true
  allowed:
    - defaults          # Basic infrastructure
    - python            # PyPI ecosystem
    - node              # npm ecosystem
    - "api.example.com" # Custom domain
```

MCP Gateway and Firewall Integration
When the MCP gateway is enabled, it operates in conjunction with AWF to ensure that MCP traffic remains contained within trusted boundaries. The gateway spawns isolated containers for MCP servers while AWF mediates all network egress, ensuring that agent-to-server communication traverses only approved channels.
flowchart LR
subgraph Host["Host machine"]
GATEWAY["gh-aw-mcpg\nDocker container\nHost port 80 maps to container port 8000"]
GH_MCP["GitHub MCP Server\nspawned via Docker socket"]
GATEWAY -->|"spawns"| GH_MCP
end
subgraph AWFNet["AWF network namespace"]
AGENT["Agent container\nCopilot CLI + MCP client\n172.30.0.20"]
PROXY["Squid proxy\n172.30.0.10"]
end
AGENT -->|"CONNECT host.docker.internal:80"| PROXY
PROXY -->|"allowed domain\n(host.docker.internal)"| GATEWAY
GATEWAY -->|"forwards to"| GH_MCP
Architecture Summary
- AWF establishes an isolated network with a Squid proxy that enforces the workflow `network.allowed` list.
- The agent container can only egress through Squid. To reach the gateway, it uses `host.docker.internal:80` (Docker’s host alias). This hostname must be included in the firewall’s allowed list.
- The `gh-aw-mcpg` container publishes host port 80 mapped to container port 8000. It uses the Docker socket to spawn MCP server containers.
- All MCP traffic remains within the host boundary: AWF restricts egress, and the gateway routes requests to sandboxed MCP servers.
MCP Server Sandboxing
MCP servers execute within isolated containers, enforcing substrate-level separation between the agent and each server instance. Tool filtering at the configuration level restricts which operations each server may expose, limiting the attack surface available to a compromised agent. This isolation ensures that even if an MCP server is compromised, it cannot access the memory or state of other components.
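Configuration-level tool filtering reduces to a set-membership check over the tools a server advertises. The sketch below uses illustrative tool names and is not the gateway's actual code.

```python
# Sketch of configuration-level tool filtering: only tools on an explicit
# allowed list are exposed to the agent; everything else the server
# advertises is dropped before the agent can call it.
def filter_tools(advertised: list, allowed: list) -> list:
    allowed_set = set(allowed)
    return [tool for tool in advertised if tool in allowed_set]

advertised = ["issue_read", "list_commits", "search_code", "delete_repository"]
exposed = filter_tools(advertised, ["issue_read", "list_commits", "search_code"])
print(exposed)  # delete_repository is never visible to the agent
```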
flowchart TB
subgraph Agent["AI Agent"]
ENGINE["AI Engine<br/>(Copilot, Claude, Codex)"]
end
subgraph MCPLayer["MCP Server Layer"]
direction TB
subgraph GitHub["GitHub MCP"]
GH_TOOLS["Enabled Tools:<br/>• issue_read<br/>• list_commits<br/>• search_code"]
GH_BLOCKED["Blocked Tools:<br/>• delete_repository<br/>• update_branch_protection"]
end
subgraph Custom["Custom MCP (Docker)"]
CONTAINER["🐳 Isolated Container"]
NET["Network Allowlist"]
ENV["Env Var Injection"]
end
subgraph HTTP["HTTP MCP"]
ENDPOINT["HTTPS Endpoint"]
HEADERS["Secure Headers"]
end
end
subgraph Toolfilter["Tool Filtering"]
ALLOWED["allowed: [tool1, tool2]"]
DENIED["❌ Unlisted tools blocked"]
end
ENGINE <-->|"stdio/HTTP"| GitHub
ENGINE <-->|"stdio"| CONTAINER
ENGINE <-->|"HTTP"| ENDPOINT
ALLOWED --> GH_TOOLS
ALLOWED --> GH_BLOCKED
CONTAINER --> NET
CONTAINER --> ENV
ENDPOINT --> HEADERS
Isolation Properties:
- Container Isolation: Custom MCP servers run in Docker containers with no shared state
- Network Controls: Per-container domain allowlists enforced via Squid proxy
- Tool Allowlisting: Explicit `allowed:` lists restrict available operations
- Secret Injection: Secrets are passed via environment variables, never in configuration files
Threat Detection Pipeline
The threat detection job is a sub-stage within the SafeOutputs subsystem. After the agent job completes and its outputs are buffered as artifacts, a separate detection job downloads these artifacts and invokes a prompted AI agent to analyze them for suspicious content. This detection agent operates with a security-focused system prompt and examines the agent’s outputs, patches, and execution context. The detection job runs in isolation from the original agent and has no access to write permissions; its sole responsibility is to emit a pass/fail verdict that gates the subsequent safe output jobs.
Detection checks include identification of secret leakage, malicious code patterns, and policy violations. If the detection agent identifies threats, the workflow terminates before any writes are externalized. Workflow authors can customize detection behavior by providing additional detection prompts or integrating external security scanners.
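The gating contract can be sketched as a single predicate over the detection job's result. The verdict structure shown here is an assumption for illustration; the real pipeline expresses this as job dependencies in GitHub Actions.

```python
# Hedged sketch of the blocking-verdict contract: safe output jobs run
# only if the detection job completed and reported no findings.
def safe_outputs_may_run(verdict: dict) -> bool:
    return verdict.get("completed", False) and not verdict.get("findings")

print(safe_outputs_may_run({"completed": True, "findings": []}))               # → True
print(safe_outputs_may_run({"completed": True, "findings": ["secret leak"]}))  # → False
print(safe_outputs_may_run({"completed": False, "findings": []}))              # → False
```

Note the fail-closed default: a detection job that never completes blocks all writes, rather than letting them through.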
flowchart TB
subgraph Input["SafeOutputs Artifacts"]
JSON[/"agent_output.json<br/>(Buffered actions)"/]
PATCH[/"aw.patch<br/>(Git diff from agent)"/]
PROMPT[/"prompt.txt<br/>(Original workflow context)"/]
end
subgraph DetectionJob["Threat Detection Job"]
direction TB
DOWNLOAD["Download artifacts"]
AGENT["Detection Agent<br/>(Security-focused prompt)"]
subgraph Checks["Analysis Targets"]
SECRETS["Secret Leaks<br/>API keys, tokens<br/>Credentials in outputs"]
MALICIOUS["Malicious Patches<br/>Backdoors, vulnerabilities<br/>Suspicious modifications"]
POLICY["Policy Violations<br/>Scope violations<br/>Unauthorized operations"]
end
CUSTOM["Custom Detection Steps"]
end
subgraph Verdict["Verdict"]
SAFE_CHECK{{"Threats<br/>Detected?"}}
end
subgraph Outcome["Outcome"]
PROCEED["✅ Safe output jobs proceed"]
BLOCK["❌ Workflow fails<br/>No writes externalized"]
end
JSON --> DOWNLOAD
PATCH --> DOWNLOAD
PROMPT --> DOWNLOAD
DOWNLOAD --> AGENT
AGENT --> Checks
Checks --> CUSTOM
CUSTOM --> SAFE_CHECK
SAFE_CHECK -->|"No"| PROCEED
SAFE_CHECK -->|"Yes"| BLOCK
Detection Job Properties:
- Isolated Execution: The detection agent runs in a separate job with no write permissions and no access to the original agent’s runtime state
- Prompted Analysis: Detection uses the same AI engine as the workflow, but with a security-focused system prompt that instructs the agent to identify threats
- Artifact-Based: The detection agent only sees the buffered artifacts (outputs, patches, context), not live repository state
- Blocking Verdict: The detection job must complete successfully and emit a “safe” verdict before any safe output jobs execute
Detection Mechanisms:
- AI Detection: Default AI-powered analysis using the workflow engine with a security-focused detection prompt
- Custom Steps: Integration with security scanners (Semgrep, TruffleHog, LlamaGuard) via `threat-detection.steps` configuration
- Custom Prompts: Domain-specific detection instructions for specialized threat models via `threat-detection.prompt` configuration
Configuration Example:
```yaml
threat-detection:
  prompt: |
    Additionally check for:
    - References to internal infrastructure URLs
    - Attempts to modify CI/CD configuration files
    - Changes to security-sensitive files (.github/workflows, package.json scripts)
  steps:
    - name: Run TruffleHog
      run: trufflehog filesystem /tmp/gh-aw --only-verified
    - name: Run Semgrep
      run: semgrep scan /tmp/gh-aw/aw.patch --config=auto
```

Compilation-Time Security
AW enforces security constraints at compilation time through schema validation, expression allowlisting, and action pinning. The trusted compiler validates declarative configuration artifacts before they are deployed, rejecting misconfigurations and overly permissive specifications. This layer constrains what components may be loaded and how they may be connected, but it does not constrain runtime behavior.
flowchart TB
subgraph Source["Source Files"]
MD[/"workflow.md"/]
IMPORTS[/"imports/*.md"/]
end
subgraph Validation["Schema & Expression Validation"]
SCHEMA["JSON Schema Validation<br/>• Valid frontmatter fields<br/>• Correct types & formats"]
EXPR["Expression Safety<br/>• Allowlisted expressions only<br/>• No secrets in expressions"]
end
subgraph Pinning["Action Pinning"]
SHA["SHA Resolution<br/>actions/checkout@sha # v4"]
CACHE[/"actions-lock.json<br/>(Cached SHAs)"/]
end
subgraph Scanners["Security Scanners"]
ACTIONLINT["actionlint<br/>Workflow linting<br/>(includes shellcheck & pyflakes)"]
ZIZMOR["zizmor<br/>Security vulnerabilities<br/>Privilege escalation"]
POUTINE["poutine<br/>Supply chain risks<br/>Third-party actions"]
end
subgraph Strict["Strict Mode Enforcement"]
PERMS["❌ No write permissions"]
NETWORK["✅ Explicit network config"]
WILDCARD["❌ No wildcard domains"]
DEPRECATED["❌ No deprecated fields"]
end
subgraph Output["Compilation Output"]
LOCK[/".lock.yml<br/>(Validated Workflow)"/]
ERROR["❌ Compilation Error"]
end
MD --> SCHEMA
IMPORTS --> SCHEMA
SCHEMA --> EXPR
EXPR --> SHA
SHA <--> CACHE
SHA --> ACTIONLINT
ACTIONLINT --> ZIZMOR
ZIZMOR --> POUTINE
POUTINE --> Strict
Strict -->|"All Checks Pass"| LOCK
Strict -->|"Violation Found"| ERROR
Compilation Commands:
```sh
# Standard compilation
gh aw compile

# Strict mode enforces additional security constraints
# (no write permissions, explicit network configuration)
gh aw compile --strict

# Enable security scanners for additional validation
gh aw compile --strict --actionlint --zizmor --poutine
```

Content Sanitization
User-generated content is sanitized before being passed to the agent. The sanitization pipeline applies a series of transformations to normalize potentially problematic content. This mechanism operates at the activation stage boundary, ensuring that untrusted input is processed before it is passed to the agent.
flowchart LR
subgraph Raw["Raw Event Content"]
TITLE["Issue Title"]
BODY["Issue/PR Body"]
COMMENT["Comment Text"]
end
subgraph Sanitization["Content Sanitization Pipeline"]
direction TB
MENTIONS["@mention Neutralization<br/>@user → `@user`"]
BOTS["Bot Trigger Protection<br/>fixes #123 → `fixes #123`"]
XML["XML/HTML Tag Conversion<br/><script> → (script)"]
URI["URI Filtering<br/>Only HTTPS from trusted domains"]
SPECIAL["Special Character Handling<br/>Unicode normalization"]
LIMIT["Content Limits<br/>0.5MB max, 65k lines"]
CONTROL["Control Character Removal<br/>ANSI escapes stripped"]
end
subgraph Safe["Sanitized Output"]
SAFE_TEXT["needs.activation.outputs.text<br/>✅ Safe for AI consumption"]
end
TITLE --> MENTIONS
BODY --> MENTIONS
COMMENT --> MENTIONS
MENTIONS --> BOTS
BOTS --> XML
XML --> URI
URI --> SPECIAL
SPECIAL --> LIMIT
LIMIT --> CONTROL
CONTROL --> SAFE_TEXT
Sanitization Properties:
| Mechanism | Input | Output | Protection |
|---|---|---|---|
| @mention Neutralization | @user | `@user` | Prevents unintended user notifications |
| Bot Trigger Protection | fixes #123 | `fixes #123` | Prevents automatic issue linking |
| XML/HTML Tag Conversion | <script> | (script) | Prevents injection via XML tags |
| URI Filtering | http://evil.com | (redacted) | Restricts to HTTPS from trusted domains |
| Special Characters | Unicode homoglyphs | Normalized | Prevents visual spoofing attacks |
| Content Limits | Large payloads | Truncated | Enforces 0.5MB max size, 65k lines max |
| Control Characters | ANSI escapes | Stripped | Removes terminal manipulation codes |
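Two of the transforms above can be sketched in a few lines. The regexes below are simplified assumptions for illustration, not the pipeline's actual rules.

```python
import re

# Illustrative re-implementation of two sanitization transforms.
def neutralize_mentions(text: str) -> str:
    # Wrap @mentions in backticks so GitHub does not notify the user.
    return re.sub(r"@([A-Za-z0-9-]+)", r"`@\1`", text)

def convert_tags(text: str) -> str:
    # Replace angle brackets with parentheses to defuse XML/HTML tags.
    return re.sub(r"<([^<>]*)>", r"(\1)", text)

raw = "Thanks @octocat! <script>alert('xss')</script>"
print(convert_tags(neutralize_mentions(raw)))
# → Thanks `@octocat`! (script)alert('xss')(/script)
```

The transforms are lossless enough for the agent to read the content, but the markup can no longer trigger notifications or execute.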
URI Filtering Behavior:
The URI filtering mechanism applies strict validation:
- ✅ Allowed: `https://github.com/...`, `https://api.github.com/...`
- ✅ Allowed: URLs from explicitly trusted domains in configuration
- ❌ Blocked: `http://` URLs (non-HTTPS)
- ❌ Blocked: URLs with suspicious patterns
- ❌ Blocked: Data URLs, `javascript:` URLs
- ❌ Blocked: URLs from untrusted domains → replaced with `(redacted)`
Configuring Additional Domains:
To permit URLs from additional domains in sanitized content, configure the `network:` field in the workflow frontmatter:

```yaml
network:
  allowed:
    - defaults          # Basic infrastructure
    - "api.example.com" # Your custom domain
    - "trusted.com"     # Another trusted domain
```

Domains configured here apply to both network egress control (when the firewall is enabled) and content sanitization. See Network Permissions for the complete list of ecosystem identifiers and configuration options.
XML/HTML Tag Handling:
XML and HTML tags are converted to a safe parentheses format to prevent injection:
- `<script>alert('xss')</script>` → `(script)alert('xss')(/script)`
- `<img src=x onerror=...>` → `(img src=x onerror=...)`
- `<!-- hidden comment -->` → `(!-- hidden comment --)`

Secret Redaction
Before workflow artifacts are uploaded, all files in the /tmp/gh-aw directory are scanned for secret values and redacted. This mechanism prevents accidental credential leakage through logs, outputs, or artifacts. Secret redaction executes unconditionally (with if: always()), ensuring that secrets are protected even if the workflow fails at an earlier stage.
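The masking rule can be sketched as exact string replacement, following the first-three-characters convention this section describes. The sample secret below is fabricated.

```python
# Sketch of the masking rule: exact string replacement (no regex over the
# secret value), keeping the first three characters for debuggability.
def mask_secret(text: str, secret: str) -> str:
    masked = secret[:3] + "*" * (len(secret) - 3)
    return text.replace(secret, masked)

secret = "ghp_exampleSecretValue123"  # fabricated sample value
log = f"Authorization: token {secret}"
print(mask_secret(log, secret))  # first 3 chars survive, the rest is masked
```

Exact matching avoids a subtle failure mode: treating the secret as a regex pattern could let metacharacters in the secret break (or widen) the match.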
flowchart LR
subgraph Sources["Secret Sources"]
YAML["Workflow YAML"]
ENV["Environment Variables"]
MCP_CONF["MCP Server Config"]
end
subgraph Collection["Secret Collection"]
SCAN["Scan for secrets.* patterns"]
EXTRACT["Extract secret names:<br/>SECRET_NAME_1<br/>SECRET_NAME_2"]
end
subgraph Redaction["Secret Redaction Step"]
direction TB
FIND["Find files in /tmp/gh-aw<br/>(.txt, .json, .log, .md, .yml)"]
MATCH["Match exact secret values"]
REPLACE["Replace with masked value:<br/>abc***** (first 3 chars + asterisks)"]
end
subgraph Output["Safe Artifacts"]
LOGS["Redacted Logs"]
JSON_OUT["Sanitized JSON"]
PROMPT["Clean Prompt Files"]
end
YAML --> SCAN
ENV --> SCAN
MCP_CONF --> SCAN
SCAN --> EXTRACT
EXTRACT --> FIND
FIND --> MATCH
MATCH --> REPLACE
REPLACE --> LOGS
REPLACE --> JSON_OUT
REPLACE --> PROMPT
Redaction Properties:
- Automatic Detection: Scans workflow YAML for `secrets.*` patterns and collects all secret references
- Exact String Matching: Uses safe string matching (not regex) to prevent injection attacks
- Partial Visibility: Displays first 3 characters followed by asterisks for debugging without exposing full secrets
- Custom Masking: Supports additional custom secret masking steps via `secret-masking:` configuration
Configuration Example:
```yaml
secret-masking:
  steps:
    - name: Redact custom patterns
      run: |
        find /tmp/gh-aw -type f -exec sed -i 's/password123/REDACTED/g' {} +
```

Job Execution Flow
Workflow execution follows a strict dependency order that enforces security checks at each stage boundary. The plan-level decomposition ensures that each stage has explicit inputs and outputs, and that transitions between stages are mediated by validation steps.
flowchart TB
subgraph PreActivation["Pre-Activation Job"]
ROLE["Role Permission Check"]
DEADLINE["Stop-After Deadline"]
SKIP["Skip-If-Match Check"]
COMMAND["Command Position Validation"]
end
subgraph Activation["Activation Job"]
CONTEXT["Prepare Workflow Context"]
SANITIZE["Sanitize Event Text"]
LOCK_CHECK["Validate Lock File"]
end
subgraph Agent["Agent Job"]
CHECKOUT["Repository Checkout"]
RUNTIME["Runtime Setup<br/>(Node.js, Python)"]
CACHE_RESTORE["Cache Restore"]
MCP_START["Start MCP Containers"]
PROMPT["Generate Prompt"]
EXECUTE["Execute AI Engine"]
REDACT["🔐 Secret Redaction"]
UPLOAD["Upload Output Artifact"]
CACHE_SAVE["Save Cache"]
end
subgraph Detection["Detection Job"]
DOWNLOAD_DETECT["Download Artifact"]
ANALYZE["AI + Custom Analysis"]
VERDICT["Security Verdict"]
end
subgraph SafeOutputs["Safe Output Jobs"]
CREATE_ISSUE["create_issue"]
ADD_COMMENT["add_comment"]
CREATE_PR["create_pull_request"]
end
subgraph Conclusion["Conclusion Job"]
AGGREGATE["Aggregate Results"]
SUMMARY["Generate Summary"]
end
ROLE --> DEADLINE
DEADLINE --> SKIP
SKIP --> COMMAND
COMMAND -->|"✅ Pass"| CONTEXT
COMMAND -->|"❌ Fail"| SKIP_ALL["Skip All Jobs"]
CONTEXT --> SANITIZE
SANITIZE --> LOCK_CHECK
LOCK_CHECK --> CHECKOUT
CHECKOUT --> RUNTIME
RUNTIME --> CACHE_RESTORE
CACHE_RESTORE --> MCP_START
MCP_START --> PROMPT
PROMPT --> EXECUTE
EXECUTE --> REDACT
REDACT --> UPLOAD
UPLOAD --> CACHE_SAVE
CACHE_SAVE --> DOWNLOAD_DETECT
DOWNLOAD_DETECT --> ANALYZE
ANALYZE --> VERDICT
VERDICT -->|"✅ Safe"| CREATE_ISSUE
VERDICT -->|"✅ Safe"| ADD_COMMENT
VERDICT -->|"✅ Safe"| CREATE_PR
VERDICT -->|"❌ Threat"| BLOCK_ALL["Block All Safe Outputs"]
CREATE_ISSUE --> AGGREGATE
ADD_COMMENT --> AGGREGATE
CREATE_PR --> AGGREGATE
AGGREGATE --> SUMMARY
Observability
AW provides comprehensive observability through GitHub Actions runs and artifacts. Workflow artifacts preserve prompts, outputs, patches, and logs for post-hoc analysis. This observability layer supports debugging, security auditing, and cost monitoring without compromising runtime isolation.
flowchart TB
subgraph Workflow["Workflow Execution"]
RUN["GitHub Actions Run"]
JOBS["Job Logs"]
STEPS["Step Outputs"]
end
subgraph Artifacts["Workflow Artifacts"]
AGENT_OUT[/"agent_output.json<br/>AI decisions & actions"/]
PROMPT[/"prompt.txt<br/>Generated prompts"/]
PATCH[/"aw.patch<br/>Code changes"/]
LOGS[/"engine logs<br/>Token usage & timing"/]
FIREWALL[/"firewall logs<br/>Network requests"/]
end
subgraph CLI["CLI Tools"]
AW_LOGS["gh aw logs<br/>Download & analyze runs"]
AW_AUDIT["gh aw audit<br/>Investigate failures"]
AW_STATUS["gh aw status<br/>Workflow health"]
end
subgraph Insights["Observability Insights"]
COST["💰 Cost Tracking<br/>Token usage per run"]
DEBUG["🔍 Debugging<br/>Step-by-step trace"]
SECURITY["🛡️ Security Audit<br/>Network & tool access"]
PERF["⚡ Performance<br/>Duration & bottlenecks"]
end
RUN --> JOBS
JOBS --> STEPS
STEPS --> Artifacts
AGENT_OUT --> AW_LOGS
PROMPT --> AW_LOGS
PATCH --> AW_AUDIT
LOGS --> AW_LOGS
FIREWALL --> AW_AUDIT
AW_LOGS --> COST
AW_LOGS --> PERF
AW_AUDIT --> DEBUG
AW_AUDIT --> SECURITY
AW_STATUS --> DEBUG
Observability Properties:
- Artifact Preservation: All workflow outputs (prompts, patches, logs) are saved as downloadable artifacts
- Cost Monitoring: Token usage and costs across workflow runs are tracked via `gh aw logs`
- Failure Analysis: Failed runs can be investigated with `gh aw audit` to examine prompts, errors, and network activity
- Firewall Logs: All network requests made by the agent are logged for security auditing
- Step Summaries: Rich markdown summaries in GitHub Actions display agent decisions and outputs
CLI Commands for Observability:
```sh
# Download and analyze workflow run logs
gh aw logs

# Investigate a specific workflow run
gh aw audit <run-id>

# Check workflow health and status
gh aw status
```

Security Layers Summary
| Layer | Mechanism | Protection Against |
|---|---|---|
| Substrate | GitHub Actions runner (VM, kernel, hypervisor) | Memory corruption, privilege escalation, host escape |
| Substrate | Docker container runtime | Process isolation bypass, shared state access |
| Substrate | AWF network controls (iptables) | Data exfiltration, unauthorized API calls |
| Substrate | MCP sandboxing (container isolation) | Container escape, unauthorized tool access |
| Configuration | Schema validation, expression allowlist | Invalid configurations, unauthorized expressions |
| Configuration | Action SHA pinning | Supply chain attacks, tag hijacking |
| Configuration | Security scanners (actionlint, zizmor, poutine) | Privilege escalation, misconfigurations, supply chain risks |
| Configuration | Pre-activation checks (role/permission) | Unauthorized users, expired workflows |
| Plan | Content sanitization | @mention abuse, bot triggers |
| Plan | Secret redaction | Credential leakage in logs/artifacts |
| Plan | Threat detection | Malicious patches, secret leaks |
| Plan | Permission separation (SafeOutputs) | Direct write access abuse |
| Plan | Output sanitization | Content injection, XSS |
| Plan | Artifact preservation, CLI tools | Debugging failures, auditing security, cost tracking |
Related Documentation
- Security Best Practices - Comprehensive security guidelines
- Threat Detection Guide - Configuring threat analysis
- Network Permissions - Network access control
- Safe Outputs Reference - Output processing configuration
- AI Engines - Engine-specific security features
- Compilation Process - Build-time security validation
- CLI Commands - Workflow management and observability tools