
Security Architecture

GitHub Agentic Workflows implements a defense-in-depth security architecture that protects against untrusted Model Context Protocol (MCP) servers and compromised agents. This document provides an overview of our security model and visual diagrams of the key components.

Agentic Workflows (AW) adopts a layered approach that combines substrate-enforced isolation, declarative specification, and staged execution. Each layer enforces distinct security properties under different assumptions and constrains the impact of failures in the layers above it.

We consider an adversary that may compromise untrusted user-level components, e.g., containers, and may cause them to behave arbitrarily within the privileges granted to them. The adversary may attempt to:

  • Access or corrupt the memory or state of other components
  • Communicate over unintended channels
  • Abuse legitimate channels to perform unintended actions
  • Confuse higher-level control logic by deviating from expected workflows

We assume the adversary does not compromise the underlying hardware or cryptographic primitives. Attacks exploiting side channels and covert channels are also out of scope.


AWs run on a GitHub Actions runner virtual machine (VM) and trust Actions’ hardware and kernel-level enforcement mechanisms, including the CPU, MMU, kernel, and container runtime. AW also relies on two privileged containers: (1) a network firewall that is trusted to configure connectivity for other components via iptables, and (2) an MCP Gateway that is trusted to configure and spawn isolated containers, e.g., local MCP servers. Collectively, the substrate level ensures memory isolation between components, CPU and resource isolation, mediation of privileged operations and system calls, and explicit, kernel-enforced communication boundaries. These guarantees hold even if an untrusted user-level component is fully compromised and executes arbitrary code. Trust violations at the substrate level require vulnerabilities in the firewall, MCP Gateway, container runtime, kernel, hypervisor, or hardware. If this layer fails, higher-level security guarantees may not hold.


AW trusts declarative configuration artifacts, e.g., Action steps, network-firewall policies, MCP server configurations, and the toolchains that interpret them to correctly instantiate system structure and connectivity. The configuration level constrains which components are loaded, how components are connected, which communication channels are permitted, and what component privileges are assigned. Externally minted authentication tokens, e.g., agent API keys and GitHub access tokens, are a critical configuration input and are treated as imported capabilities that bound components’ external effects; declarative configuration controls their distribution, e.g., which tokens are loaded into which containers. Security violations arise due to misconfigurations, overly permissive specifications, and limitations of the declarative model. This layer defines what components exist and how they communicate, but it does not constrain how components use those channels over time.
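The token-distribution rule can be thought of as a policy check over the declarative configuration. The sketch below is hypothetical Python (the container names, token names, and `POLICY` table are invented for illustration), not part of the AW toolchain:

```python
# Illustrative sketch: validating which imported capabilities (tokens)
# a declarative configuration loads into which containers.
# All names here are hypothetical, not gh-aw's actual schema.

POLICY = {
    # container -> tokens the policy permits it to receive
    "agent": {"GITHUB_READ_TOKEN"},
    "safe-outputs": {"GITHUB_READ_TOKEN", "GITHUB_WRITE_TOKEN"},
}

def validate_token_distribution(config: dict) -> list[str]:
    """Return a list of violations: tokens loaded into containers
    the policy does not permit."""
    violations = []
    for container, tokens in config.items():
        allowed = POLICY.get(container, set())
        for token in tokens:
            if token not in allowed:
                violations.append(f"{container} must not receive {token}")
    return violations

# A misconfiguration that hands the write token to the agent container:
bad = {"agent": ["GITHUB_READ_TOKEN", "GITHUB_WRITE_TOKEN"]}
print(validate_token_distribution(bad))
```

A check of this kind rejects overly permissive specifications before any component runs, but, as noted above, it cannot constrain how a correctly configured component uses its channels over time.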


AW additionally relies on plan-level trust to constrain component behavior over time. At this layer, the trusted compiler decomposes a workflow into stages. For each stage, the plan specifies (1) which components are active and their permissions, (2) the data produced by the stage, and (3) how that data may be consumed by subsequent stages. In particular, plan-level trust ensures that important external side effects are explicit and undergo thorough vetting.

A primary instantiation of plan-level trust is the SafeOutputs subsystem. SafeOutputs is a set of trusted components that operate on external state. An agent can interact with read-only MCP servers, e.g., the GitHub MCP server, but externalized writes, such as creating GitHub pull requests, are buffered as artifacts by SafeOutputs rather than applied immediately. When the agent finishes, the buffered artifacts are processed by a deterministic sequence of filters and analyses defined by configuration. These checks can include structural constraints, e.g., limiting the number of pull requests, policy enforcement, and automated sanitization to ensure that sensitive information such as authentication tokens is not exported. The filtered and transformed artifacts are then passed to a subsequent stage in which they are externalized.
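The buffer-then-filter pattern can be sketched as follows. This is an illustrative Python model of the idea, assuming a simplified action schema; the filter names, fields, and token regex are invented and do not reflect gh-aw's actual implementation:

```python
# Hypothetical sketch of the SafeOutputs pattern: writes are buffered as
# data, then a deterministic filter chain runs before anything is externalized.
import re

def limit_pull_requests(actions, max_prs=1):
    """Structural constraint: cap how many PRs one run may create."""
    prs = [a for a in actions if a["type"] == "create_pull_request"]
    if len(prs) > max_prs:
        raise ValueError(f"too many pull requests: {len(prs)} > {max_prs}")
    return actions

def redact_tokens(actions):
    """Sanitization: scrub GitHub-style token strings from action bodies."""
    token_pattern = re.compile(r"gh[pousr]_[A-Za-z0-9]{20,}")
    for a in actions:
        a["body"] = token_pattern.sub("(redacted)", a.get("body", ""))
    return actions

# Buffered output from the agent stage (illustrative, not the real schema):
buffered = [{"type": "create_pull_request",
             "body": "Fixes bug. Token: ghp_ABCDEFGHIJKLMNOPQRSTUVWX"}]

# Deterministic filter chain, as configured:
for check in (limit_pull_requests, redact_tokens):
    buffered = check(buffered)
print(buffered[0]["body"])
```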

Security violations at the planning layer arise from incorrect plan construction, incomplete or overly permissive stage definitions, or errors in the enforcement of plan transitions. This layer does not protect against failures of substrate-level isolation or mis-allocation of permissions at credential-minting or configuration time. However, it limits the blast radius of a compromised component to the stage in which it is active and its influence on the artifacts passed to the next stage.

The security architecture operates across multiple layers: compilation-time validation, runtime isolation, permission separation, network controls, and output sanitization. The following diagram illustrates the relationships between these components and the flow of data through the system.

flowchart TB
    subgraph Input["📥 Input Layer"]
        WF[/"Workflow (.md)"/]
        IMPORTS[/"Imports & Includes"/]
        EVENT[/"GitHub Event<br/>(Issue, PR, Comment)"/]
    end

    subgraph Compile["🔒 Compilation-Time Security"]
        SCHEMA["Schema Validation"]
        EXPR["Expression Safety Check"]
        PIN["Action SHA Pinning"]
        SCAN["Security Scanners<br/>(actionlint, zizmor, poutine)"]
    end

    subgraph Runtime["⚙️ Runtime Security"]
        PRE["Pre-Activation<br/>Role & Permission Checks"]
        ACT["Activation<br/>Content Sanitization"]
        AGENT["Agent Execution<br/>Read-Only Permissions"]
        REDACT_MAIN["Secret Redaction<br/>Credential Protection"]
    end

    subgraph Isolation["🛡️ Isolation Layer"]
        AWF["Agent Workflow Firewall<br/>Network Egress Control"]
        MCP["MCP Server Sandboxing<br/>Container Isolation"]
        TOOL["Tool Allowlisting<br/>Explicit Permissions"]
    end

    subgraph Output["📤 Output Security"]
        DETECT["Threat Detection<br/>AI-Powered Analysis"]
        SAFE["Safe Outputs<br/>Permission Separation"]
        SANITIZE["Output Sanitization<br/>Content Validation"]
    end

    subgraph Result["✅ Controlled Actions"]
        ISSUE["Create Issue"]
        PR["Create PR"]
        COMMENT["Add Comment"]
    end

    WF --> SCHEMA
    IMPORTS --> SCHEMA
    SCHEMA --> EXPR
    EXPR --> PIN
    PIN --> SCAN
    SCAN -->|".lock.yml"| PRE

    EVENT --> ACT
    PRE --> ACT
    ACT --> AGENT

    AGENT <--> AWF
    AGENT <--> MCP
    AGENT <--> TOOL

    AGENT --> REDACT_MAIN
    REDACT_MAIN --> DETECT
    DETECT --> SAFE
    SAFE --> SANITIZE

    SANITIZE --> ISSUE
    SANITIZE --> PR
    SANITIZE --> COMMENT

The SafeOutputs subsystem enforces permission isolation by ensuring that agent execution never has direct write access to external state. The agent job runs with minimal read-only permissions, while write operations are deferred to separate jobs that execute only after the agent completes. This separation ensures that even a fully compromised agent cannot directly modify repository state.
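The scoped-permission idea can be made concrete with a small sketch. The job-to-scope mapping mirrors the safe output jobs shown in this section; the Python itself is illustrative, since in practice these scopes are declared in the generated GitHub Actions jobs rather than computed at runtime:

```python
# Sketch of permission separation: each safe output job declares only the
# scopes it needs. Treat this mapping as illustrative, not the exact
# gh-aw job definitions.
SCOPES = {
    "create_issue":        {"issues": "write"},
    "add_comment":         {"issues": "write"},
    "create_pull_request": {"contents": "write", "pull-requests": "write"},
    "add_labels":          {"issues": "write"},
}

def permissions_for(actions):
    """Union of scopes required to externalize the buffered actions."""
    required = {}
    for a in actions:
        required.update(SCOPES[a["type"]])
    return required

# The agent job itself needs none of these: it only uploads an artifact.
print(permissions_for([{"type": "create_issue"}, {"type": "add_labels"}]))
```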

flowchart LR
    subgraph AgentJob["Agent Job<br/>🔐 Read-Only Permissions"]
        AGENT["AI Agent Execution"]
        OUTPUT[/"agent_output.json<br/>(Artifact)"/]
        AGENT --> OUTPUT
    end

    subgraph Detection["Threat Detection Job"]
        ANALYZE["Analyze for:<br/>• Secret Leaks<br/>• Malicious Patches"]
    end

    subgraph SafeJobs["Safe Output Jobs<br/>🔓 Write Permissions (Scoped)"]
        direction TB
        ISSUE["create_issue<br/>issues: write"]
        COMMENT["add_comment<br/>issues: write"]
        PR["create_pull_request<br/>contents: write<br/>pull-requests: write"]
        LABEL["add_labels<br/>issues: write"]
    end

    subgraph GitHub["GitHub API"]
        API["GitHub REST/GraphQL API"]
    end

    OUTPUT -->|"Download Artifact"| ANALYZE
    ANALYZE -->|"✅ Approved"| SafeJobs
    ANALYZE -->|"❌ Blocked"| BLOCKED["Workflow Fails"]

    ISSUE --> API
    COMMENT --> API
    PR --> API
    LABEL --> API

The Agent Workflow Firewall (AWF) provides network egress control at the substrate level. AWF mediates all outbound network requests from the agent, enforcing a domain allowlist that constrains which external endpoints the agent may contact. This mechanism prevents unauthorized data exfiltration and limits the blast radius of a compromised agent to only those domains explicitly permitted by configuration.
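Conceptually, the allowlist reduces to a per-request predicate. A minimal Python sketch, assuming an example domain set (real enforcement happens in the firewall container via Squid and iptables, not in application code):

```python
# Minimal sketch of an egress allowlist decision, in the spirit of AWF.
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"api.github.com", "pypi.org", "registry.npmjs.org"}

def egress_allowed(url: str) -> bool:
    host = urlparse(url).hostname or ""
    # Allow exact matches and subdomains of allowlisted domains.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

print(egress_allowed("https://pypi.org/simple/requests/"))  # True
print(egress_allowed("https://attacker.example/exfil"))     # False
```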

flowchart TB
    subgraph Agent["AI Agent Process"]
        COPILOT["Copilot CLI"]
        WEB["WebFetch Tool"]
        SEARCH["WebSearch Tool"]
    end

    subgraph Firewall["Agent Workflow Firewall (AWF)"]
        WRAP["Process Wrapper"]
        ALLOW["Domain Allowlist"]
        LOG["Activity Logging"]

        WRAP --> ALLOW
        ALLOW --> LOG
    end

    subgraph Network["Network Layer"]
        direction TB
        ALLOWED_OUT["✅ Allowed Domains"]
        BLOCKED_OUT["❌ Blocked Domains"]
    end

    subgraph Ecosystems["Ecosystem Bundles"]
        direction TB
        DEFAULTS["defaults<br/>certificates, JSON schema"]
        PYTHON["python<br/>PyPI, Conda"]
        NODE["node<br/>npm, npmjs.com"]
        CUSTOM["Custom Domains<br/>api.example.com"]
    end

    COPILOT --> WRAP
    WEB --> WRAP
    SEARCH --> WRAP

    ALLOW --> ALLOWED_OUT
    ALLOW --> BLOCKED_OUT

    DEFAULTS --> ALLOW
    PYTHON --> ALLOW
    NODE --> ALLOW
    CUSTOM --> ALLOW

    ALLOWED_OUT --> INTERNET["🌐 Internet"]
    BLOCKED_OUT --> DROP["🚫 Dropped"]

Configuration Example:

engine: copilot
network:
  firewall: true
  allowed:
    - defaults          # Basic infrastructure
    - python            # PyPI ecosystem
    - node              # npm ecosystem
    - "api.example.com" # Custom domain
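Each ecosystem identifier in `allowed` expands to a list of concrete domains. A hypothetical sketch of that expansion (the bundle contents shown here are examples, not the exact domains gh-aw ships):

```python
# Illustrative expansion of ecosystem identifiers into domain allowlists.
# Bundle contents are invented examples.
BUNDLES = {
    "defaults": ["crl.verisign.com", "json-schema.org"],
    "python":   ["pypi.org", "files.pythonhosted.org"],
    "node":     ["registry.npmjs.org", "npmjs.com"],
}

def expand_allowlist(entries):
    domains = []
    for entry in entries:
        # Entries that are not bundle names are treated as literal domains.
        domains.extend(BUNDLES.get(entry, [entry]))
    return domains

print(expand_allowlist(["python", "api.example.com"]))
```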

When the MCP gateway is enabled, it operates in conjunction with AWF to ensure that MCP traffic remains contained within trusted boundaries. The gateway spawns isolated containers for MCP servers while AWF mediates all network egress, ensuring that agent-to-server communication traverses only approved channels.

flowchart LR
    subgraph Host["Host machine"]
        GATEWAY["gh-aw-mcpg\nDocker container\nHost port 80 maps to container port 8000"]
        GH_MCP["GitHub MCP Server\nspawned via Docker socket"]
        GATEWAY -->|"spawns"| GH_MCP
    end

    subgraph AWFNet["AWF network namespace"]
        AGENT["Agent container\nCopilot CLI + MCP client\n172.30.0.20"]
        PROXY["Squid proxy\n172.30.0.10"]
    end

    AGENT -->|"CONNECT host.docker.internal:80"| PROXY
    PROXY -->|"allowed domain\n(host.docker.internal)"| GATEWAY
    GATEWAY -->|"forwards to"| GH_MCP

Architecture Summary

  1. AWF establishes an isolated network with a Squid proxy that enforces the workflow network.allowed list.
  2. The agent container can only egress through Squid. To reach the gateway, it uses host.docker.internal:80 (Docker’s host alias). This hostname must be included in the firewall’s allowed list.
  3. The gh-aw-mcpg container publishes host port 80 mapped to container port 8000. It uses the Docker socket to spawn MCP server containers.
  4. All MCP traffic remains within the host boundary: AWF restricts egress, and the gateway routes requests to sandboxed MCP servers.

MCP servers execute within isolated containers, enforcing substrate-level separation between the agent and each server instance. Tool filtering at the configuration level restricts which operations each server may expose, limiting the attack surface available to a compromised agent. This isolation ensures that even if an MCP server is compromised, it cannot access the memory or state of other components.

flowchart TB
    subgraph Agent["AI Agent"]
        ENGINE["AI Engine<br/>(Copilot, Claude, Codex)"]
    end

    subgraph MCPLayer["MCP Server Layer"]
        direction TB

        subgraph GitHub["GitHub MCP"]
            GH_TOOLS["Enabled Tools:<br/>• issue_read<br/>• list_commits<br/>• search_code"]
            GH_BLOCKED["Blocked Tools:<br/>• delete_repository<br/>• update_branch_protection"]
        end

        subgraph Custom["Custom MCP (Docker)"]
            CONTAINER["🐳 Isolated Container"]
            NET["Network Allowlist"]
            ENV["Env Var Injection"]
        end

        subgraph HTTP["HTTP MCP"]
            ENDPOINT["HTTPS Endpoint"]
            HEADERS["Secure Headers"]
        end
    end

    subgraph Toolfilter["Tool Filtering"]
        ALLOWED["allowed: [tool1, tool2]"]
        DENIED["❌ Unlisted tools blocked"]
    end

    ENGINE <-->|"stdio/HTTP"| GitHub
    ENGINE <-->|"stdio"| CONTAINER
    ENGINE <-->|"HTTP"| ENDPOINT

    ALLOWED --> GH_TOOLS
    ALLOWED --> GH_BLOCKED
    CONTAINER --> NET
    CONTAINER --> ENV
    ENDPOINT --> HEADERS

Isolation Properties:

  • Container Isolation: Custom MCP servers run in Docker containers with no shared state
  • Network Controls: Per-container domain allowlists enforced via Squid proxy
  • Tool Allowlisting: Explicit allowed: lists restrict available operations
  • Secret Injection: Secrets are passed via environment variables, never in configuration files
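Tool allowlisting amounts to filtering a server's advertised tool list against the configured `allowed:` entries. A minimal sketch (the tool names are taken from the diagram above; the filtering code is illustrative):

```python
# Sketch of configuration-level tool filtering: only tools named in the
# `allowed:` list are exposed to the agent; unlisted tools are blocked.
def filter_tools(server_tools, allowed):
    return [t for t in server_tools if t in allowed]

github_tools = ["issue_read", "list_commits", "search_code",
                "delete_repository", "update_branch_protection"]
allowed = ["issue_read", "list_commits", "search_code"]

print(filter_tools(github_tools, allowed))
# Destructive tools like delete_repository never reach the agent.
```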

The threat detection job is a sub-stage within the SafeOutputs subsystem. After the agent job completes and its outputs are buffered as artifacts, a separate detection job downloads these artifacts and invokes a prompted AI agent to analyze them for suspicious content. This detection agent operates with a security-focused system prompt and examines the agent’s outputs, patches, and execution context. The detection job runs in isolation from the original agent and has no access to write permissions; its sole responsibility is to emit a pass/fail verdict that gates the subsequent safe output jobs.

Detection checks include identification of secret leakage, malicious code patterns, and policy violations. If the detection agent identifies threats, the workflow terminates before any writes are externalized. Workflow authors can customize detection behavior by providing additional detection prompts or integrating external security scanners.
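The blocking-verdict contract can be sketched as a fail-closed gate. The JSON shape below is hypothetical; the point is that anything other than an explicit safe verdict, including unparseable detection output, blocks the safe output jobs:

```python
# Sketch of the fail-closed verdict gate between detection and safe outputs.
# The verdict schema is invented for illustration.
import json

def gate_safe_outputs(verdict_json: str) -> bool:
    """Return True only for an explicit, threat-free safe verdict."""
    try:
        verdict = json.loads(verdict_json)
    except json.JSONDecodeError:
        return False  # unparseable output blocks, rather than passes
    return verdict.get("verdict") == "safe" and not verdict.get("threats")

print(gate_safe_outputs('{"verdict": "safe", "threats": []}'))            # True
print(gate_safe_outputs('{"verdict": "unsafe", "threats": ["secret"]}'))  # False
print(gate_safe_outputs("not json at all"))                               # False
```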

flowchart TB
    subgraph Input["SafeOutputs Artifacts"]
        JSON[/"agent_output.json<br/>(Buffered actions)"/]
        PATCH[/"aw.patch<br/>(Git diff from agent)"/]
        PROMPT[/"prompt.txt<br/>(Original workflow context)"/]
    end

    subgraph DetectionJob["Threat Detection Job"]
        direction TB
        DOWNLOAD["Download artifacts"]
        AGENT["Detection Agent<br/>(Security-focused prompt)"]
        
        subgraph Checks["Analysis Targets"]
            SECRETS["Secret Leaks<br/>API keys, tokens<br/>Credentials in outputs"]
            MALICIOUS["Malicious Patches<br/>Backdoors, vulnerabilities<br/>Suspicious modifications"]
            POLICY["Policy Violations<br/>Scope violations<br/>Unauthorized operations"]
        end
        
        CUSTOM["Custom Detection Steps"]
    end

    subgraph Verdict["Verdict"]
        SAFE_CHECK{{"Threats<br/>Detected?"}}
    end

    subgraph Outcome["Outcome"]
        PROCEED["✅ Safe output jobs proceed"]
        BLOCK["❌ Workflow fails<br/>No writes externalized"]
    end

    JSON --> DOWNLOAD
    PATCH --> DOWNLOAD
    PROMPT --> DOWNLOAD
    
    DOWNLOAD --> AGENT
    AGENT --> Checks
    Checks --> CUSTOM
    CUSTOM --> SAFE_CHECK

    SAFE_CHECK -->|"No"| PROCEED
    SAFE_CHECK -->|"Yes"| BLOCK

Detection Job Properties:

  • Isolated Execution: The detection agent runs in a separate job with no write permissions and no access to the original agent’s runtime state
  • Prompted Analysis: Detection uses the same AI engine as the workflow, but with a security-focused system prompt that instructs the agent to identify threats
  • Artifact-Based: The detection agent only sees the buffered artifacts (outputs, patches, context), not live repository state
  • Blocking Verdict: The detection job must complete successfully and emit a “safe” verdict before any safe output jobs execute

Detection Mechanisms:

  • AI Detection: Default AI-powered analysis using the workflow engine with a security-focused detection prompt
  • Custom Steps: Integration with security scanners (Semgrep, TruffleHog, LlamaGuard) via threat-detection.steps configuration
  • Custom Prompts: Domain-specific detection instructions for specialized threat models via threat-detection.prompt configuration

Configuration Example:

threat-detection:
  prompt: |
    Additionally check for:
    - References to internal infrastructure URLs
    - Attempts to modify CI/CD configuration files
    - Changes to security-sensitive files (.github/workflows, package.json scripts)
  steps:
    - name: Run TruffleHog
      run: trufflehog filesystem /tmp/gh-aw --only-verified
    - name: Run Semgrep
      run: semgrep scan /tmp/gh-aw/aw.patch --config=auto

AW enforces security constraints at compilation time through schema validation, expression allowlisting, and action pinning. The trusted compiler validates declarative configuration artifacts before they are deployed, rejecting misconfigurations and overly permissive specifications. This layer constrains what components may be loaded and how they may be connected, but it does not constrain runtime behavior.
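Action SHA pinning, for example, rewrites mutable tag references into immutable commit SHAs while keeping the tag as a comment. A hypothetical sketch (the lock-file shape and the SHA are placeholders, not real values):

```python
# Illustrative sketch of action SHA pinning during compilation.
import re

LOCK = {  # hypothetical actions-lock.json contents; the SHA is a placeholder
    "actions/checkout@v4": "0123456789abcdef0123456789abcdef01234567",
}

def pin_uses_line(line: str) -> str:
    """Rewrite `uses: owner/action@tag` to `uses: owner/action@sha # tag`."""
    m = re.match(r"(\s*uses:\s*)(\S+@\S+)", line)
    if not m:
        return line
    ref = m.group(2)
    sha = LOCK.get(ref)
    if sha is None:
        return line  # unknown action: left for the compiler to reject
    action, tag = ref.rsplit("@", 1)
    return f"{m.group(1)}{action}@{sha} # {tag}"

print(pin_uses_line("  uses: actions/checkout@v4"))
```

Pinning by SHA rather than tag defeats tag hijacking: a repointed `v4` tag no longer changes which code the workflow runs.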

flowchart TB
    subgraph Source["Source Files"]
        MD[/"workflow.md"/]
        IMPORTS[/"imports/*.md"/]
    end

    subgraph Validation["Schema & Expression Validation"]
        SCHEMA["JSON Schema Validation<br/>• Valid frontmatter fields<br/>• Correct types & formats"]
        EXPR["Expression Safety<br/>• Allowlisted expressions only<br/>• No secrets in expressions"]
    end

    subgraph Pinning["Action Pinning"]
        SHA["SHA Resolution<br/>actions/checkout@sha # v4"]
        CACHE[/"actions-lock.json<br/>(Cached SHAs)"/]
    end

    subgraph Scanners["Security Scanners"]
        ACTIONLINT["actionlint<br/>Workflow linting<br/>(includes shellcheck & pyflakes)"]
        ZIZMOR["zizmor<br/>Security vulnerabilities<br/>Privilege escalation"]
        POUTINE["poutine<br/>Supply chain risks<br/>Third-party actions"]
    end

    subgraph Strict["Strict Mode Enforcement"]
        PERMS["❌ No write permissions"]
        NETWORK["✅ Explicit network config"]
        WILDCARD["❌ No wildcard domains"]
        DEPRECATED["❌ No deprecated fields"]
    end

    subgraph Output["Compilation Output"]
        LOCK[/".lock.yml<br/>(Validated Workflow)"/]
        ERROR["❌ Compilation Error"]
    end

    MD --> SCHEMA
    IMPORTS --> SCHEMA
    SCHEMA --> EXPR
    EXPR --> SHA
    SHA <--> CACHE

    SHA --> ACTIONLINT
    ACTIONLINT --> ZIZMOR
    ZIZMOR --> POUTINE
    POUTINE --> Strict

    Strict -->|"All Checks Pass"| LOCK
    Strict -->|"Violation Found"| ERROR

Compilation Commands:

# Standard compilation
gh aw compile
# Strict mode enforces additional security constraints (no write permissions, explicit network configuration)
gh aw compile --strict
# Enable security scanners for additional validation
gh aw compile --strict --actionlint --zizmor --poutine

User-generated content is sanitized before being passed to the agent. The sanitization pipeline applies a series of transformations to normalize potentially problematic content. This mechanism operates at the activation stage boundary, ensuring that untrusted input is processed before it is passed to the agent.

flowchart LR
    subgraph Raw["Raw Event Content"]
        TITLE["Issue Title"]
        BODY["Issue/PR Body"]
        COMMENT["Comment Text"]
    end

    subgraph Sanitization["Content Sanitization Pipeline"]
        direction TB
        MENTIONS["@mention Neutralization<br/>@user → `@user`"]
        BOTS["Bot Trigger Protection<br/>fixes #123 → `fixes #123`"]
        XML["XML/HTML Tag Conversion<br/>&lt;script&gt; → (script)"]
        URI["URI Filtering<br/>Only HTTPS from trusted domains"]
        SPECIAL["Special Character Handling<br/>Unicode normalization"]
        LIMIT["Content Limits<br/>0.5MB max, 65k lines"]
        CONTROL["Control Character Removal<br/>ANSI escapes stripped"]
    end

    subgraph Safe["Sanitized Output"]
        SAFE_TEXT["needs.activation.outputs.text<br/>✅ Safe for AI consumption"]
    end

    TITLE --> MENTIONS
    BODY --> MENTIONS
    COMMENT --> MENTIONS

    MENTIONS --> BOTS
    BOTS --> XML
    XML --> URI
    URI --> SPECIAL
    SPECIAL --> LIMIT
    LIMIT --> CONTROL
    CONTROL --> SAFE_TEXT

Sanitization Properties:

| Mechanism | Input | Output | Protection |
| --- | --- | --- | --- |
| @mention Neutralization | @user | `@user` | Prevents unintended user notifications |
| Bot Trigger Protection | fixes #123 | `fixes #123` | Prevents automatic issue linking |
| XML/HTML Tag Conversion | &lt;script&gt; | (script) | Prevents injection via XML tags |
| URI Filtering | http://evil.com | (redacted) | Restricts to HTTPS from trusted domains |
| Special Characters | Unicode homoglyphs | Normalized | Prevents visual spoofing attacks |
| Content Limits | Large payloads | Truncated | Enforces 0.5MB max size, 65k lines max |
| Control Characters | ANSI escapes | Stripped | Removes terminal manipulation codes |

URI Filtering Behavior:

The URI filtering mechanism applies strict validation:

  • Allowed: https://github.com/..., https://api.github.com/...
  • Allowed: URLs from explicitly trusted domains in configuration
  • Blocked: http:// URLs (non-HTTPS)
  • Blocked: URLs with suspicious patterns
  • Blocked: Data URLs, javascript: URLs
  • Blocked: URLs from untrusted domains → replaced with (redacted)
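A simplified model of this filter, assuming a small trusted-domain set (real handling of schemes and "suspicious patterns" is richer than shown):

```python
# Illustrative sketch of URI filtering: keep HTTPS links to trusted domains,
# replace everything else with "(redacted)". Simplified for clarity.
import re
from urllib.parse import urlparse

TRUSTED = {"github.com", "api.github.com"}

def filter_uris(text: str) -> str:
    def check(match):
        url = match.group(0)
        parts = urlparse(url)
        if parts.scheme != "https":
            return "(redacted)"  # blocks http:, data:, javascript:, etc.
        host = parts.hostname or ""
        if host in TRUSTED or any(host.endswith("." + d) for d in TRUSTED):
            return url
        return "(redacted)"  # untrusted domain
    return re.sub(r"[a-z][a-z0-9+.-]*://\S+", check, text)

print(filter_uris("see https://github.com/octo/repo and http://evil.com/x"))
```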

Configuring Additional Domains:

To permit URLs from additional domains in sanitized content, configure the network: field in the workflow frontmatter:

network:
  allowed:
    - defaults          # Basic infrastructure
    - "api.example.com" # Your custom domain
    - "trusted.com"     # Another trusted domain

Domains configured here apply to both network egress control (when firewall is enabled) and content sanitization. See Network Permissions for the complete list of ecosystem identifiers and configuration options.

XML/HTML Tag Handling:

XML and HTML tags are converted to a safe parentheses format to prevent injection:

<script>alert('xss')</script> → (script)alert('xss')(/script)
<img src=x onerror=...> → (img src=x onerror=...)
<!-- hidden comment --> → (!-- hidden comment --)
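For these examples, the rewrite is a single angle-bracket-to-parentheses substitution. A sketch (the real sanitizer may handle additional cases):

```python
# Sketch of the tag-to-parentheses rewrite shown above.
import re

def neutralize_tags(text: str) -> str:
    # Replace <...> with (...), preserving the tag contents verbatim.
    return re.sub(r"<([^<>]*)>", r"(\1)", text)

print(neutralize_tags("<script>alert('xss')</script>"))
print(neutralize_tags("<!-- hidden comment -->"))
```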

Before workflow artifacts are uploaded, all files in the /tmp/gh-aw directory are scanned for secret values and redacted. This mechanism prevents accidental credential leakage through logs, outputs, or artifacts. Secret redaction executes unconditionally (with if: always()), ensuring that secrets are protected even if the workflow fails at an earlier stage.
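The masking rule (exact string replacement, with the first three characters left visible) can be sketched as:

```python
# Sketch of exact-string secret redaction with partial visibility.
# Plain string replacement, not regex, mirroring the "safe string matching"
# property described below. The secret value is a fake example.
def mask(value: str) -> str:
    """Keep the first 3 characters, mask the rest with asterisks."""
    return value[:3] + "*" * max(len(value) - 3, 0)

def redact(content: str, secret_values: list[str]) -> str:
    for secret in secret_values:
        content = content.replace(secret, mask(secret))
    return content

log = "auth header: token ghp_SECRETVALUE123"
print(redact(log, ["ghp_SECRETVALUE123"]))
```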

flowchart LR
    subgraph Sources["Secret Sources"]
        YAML["Workflow YAML"]
        ENV["Environment Variables"]
        MCP_CONF["MCP Server Config"]
    end

    subgraph Collection["Secret Collection"]
        SCAN["Scan for secrets.* patterns"]
        EXTRACT["Extract secret names:<br/>SECRET_NAME_1<br/>SECRET_NAME_2"]
    end

    subgraph Redaction["Secret Redaction Step"]
        direction TB
        FIND["Find files in /tmp/gh-aw<br/>(.txt, .json, .log, .md, .yml)"]
        MATCH["Match exact secret values"]
        REPLACE["Replace with masked value:<br/>abc***** (first 3 chars + asterisks)"]
    end

    subgraph Output["Safe Artifacts"]
        LOGS["Redacted Logs"]
        JSON_OUT["Sanitized JSON"]
        PROMPT["Clean Prompt Files"]
    end

    YAML --> SCAN
    ENV --> SCAN
    MCP_CONF --> SCAN

    SCAN --> EXTRACT
    EXTRACT --> FIND

    FIND --> MATCH
    MATCH --> REPLACE

    REPLACE --> LOGS
    REPLACE --> JSON_OUT
    REPLACE --> PROMPT

Redaction Properties:

  • Automatic Detection: Scans workflow YAML for secrets.* patterns and collects all secret references
  • Exact String Matching: Uses safe string matching (not regex) to prevent injection attacks
  • Partial Visibility: Displays first 3 characters followed by asterisks for debugging without exposing full secrets
  • Custom Masking: Supports additional custom secret masking steps via secret-masking: configuration

Configuration Example:

secret-masking:
  steps:
    - name: Redact custom patterns
      run: |
        find /tmp/gh-aw -type f -exec sed -i 's/password123/REDACTED/g' {} +

Workflow execution follows a strict dependency order that enforces security checks at each stage boundary. The plan-level decomposition ensures that each stage has explicit inputs and outputs, and that transitions between stages are mediated by validation steps.

flowchart TB
    subgraph PreActivation["Pre-Activation Job"]
        ROLE["Role Permission Check"]
        DEADLINE["Stop-After Deadline"]
        SKIP["Skip-If-Match Check"]
        COMMAND["Command Position Validation"]
    end

    subgraph Activation["Activation Job"]
        CONTEXT["Prepare Workflow Context"]
        SANITIZE["Sanitize Event Text"]
        LOCK_CHECK["Validate Lock File"]
    end

    subgraph Agent["Agent Job"]
        CHECKOUT["Repository Checkout"]
        RUNTIME["Runtime Setup<br/>(Node.js, Python)"]
        CACHE_RESTORE["Cache Restore"]
        MCP_START["Start MCP Containers"]
        PROMPT["Generate Prompt"]
        EXECUTE["Execute AI Engine"]
        REDACT["🔐 Secret Redaction"]
        UPLOAD["Upload Output Artifact"]
        CACHE_SAVE["Save Cache"]
    end

    subgraph Detection["Detection Job"]
        DOWNLOAD_DETECT["Download Artifact"]
        ANALYZE["AI + Custom Analysis"]
        VERDICT["Security Verdict"]
    end

    subgraph SafeOutputs["Safe Output Jobs"]
        CREATE_ISSUE["create_issue"]
        ADD_COMMENT["add_comment"]
        CREATE_PR["create_pull_request"]
    end

    subgraph Conclusion["Conclusion Job"]
        AGGREGATE["Aggregate Results"]
        SUMMARY["Generate Summary"]
    end

    ROLE --> DEADLINE
    DEADLINE --> SKIP
    SKIP --> COMMAND
    COMMAND -->|"✅ Pass"| CONTEXT
    COMMAND -->|"❌ Fail"| SKIP_ALL["Skip All Jobs"]

    CONTEXT --> SANITIZE
    SANITIZE --> LOCK_CHECK
    LOCK_CHECK --> CHECKOUT

    CHECKOUT --> RUNTIME
    RUNTIME --> CACHE_RESTORE
    CACHE_RESTORE --> MCP_START
    MCP_START --> PROMPT
    PROMPT --> EXECUTE
    EXECUTE --> REDACT
    REDACT --> UPLOAD
    UPLOAD --> CACHE_SAVE
    CACHE_SAVE --> DOWNLOAD_DETECT

    DOWNLOAD_DETECT --> ANALYZE
    ANALYZE --> VERDICT

    VERDICT -->|"✅ Safe"| CREATE_ISSUE
    VERDICT -->|"✅ Safe"| ADD_COMMENT
    VERDICT -->|"✅ Safe"| CREATE_PR
    VERDICT -->|"❌ Threat"| BLOCK_ALL["Block All Safe Outputs"]

    CREATE_ISSUE --> AGGREGATE
    ADD_COMMENT --> AGGREGATE
    CREATE_PR --> AGGREGATE
    AGGREGATE --> SUMMARY

AW provides comprehensive observability through GitHub Actions runs and artifacts. Workflow artifacts preserve prompts, outputs, patches, and logs for post-hoc analysis. This observability layer supports debugging, security auditing, and cost monitoring without compromising runtime isolation.

flowchart TB
    subgraph Workflow["Workflow Execution"]
        RUN["GitHub Actions Run"]
        JOBS["Job Logs"]
        STEPS["Step Outputs"]
    end

    subgraph Artifacts["Workflow Artifacts"]
        AGENT_OUT[/"agent_output.json<br/>AI decisions & actions"/]
        PROMPT[/"prompt.txt<br/>Generated prompts"/]
        PATCH[/"aw.patch<br/>Code changes"/]
        LOGS[/"engine logs<br/>Token usage & timing"/]
        FIREWALL[/"firewall logs<br/>Network requests"/]
    end

    subgraph CLI["CLI Tools"]
        AW_LOGS["gh aw logs<br/>Download & analyze runs"]
        AW_AUDIT["gh aw audit<br/>Investigate failures"]
        AW_STATUS["gh aw status<br/>Workflow health"]
    end

    subgraph Insights["Observability Insights"]
        COST["💰 Cost Tracking<br/>Token usage per run"]
        DEBUG["🔍 Debugging<br/>Step-by-step trace"]
        SECURITY["🛡️ Security Audit<br/>Network & tool access"]
        PERF["⚡ Performance<br/>Duration & bottlenecks"]
    end

    RUN --> JOBS
    JOBS --> STEPS
    STEPS --> Artifacts

    AGENT_OUT --> AW_LOGS
    PROMPT --> AW_LOGS
    PATCH --> AW_AUDIT
    LOGS --> AW_LOGS
    FIREWALL --> AW_AUDIT

    AW_LOGS --> COST
    AW_LOGS --> PERF
    AW_AUDIT --> DEBUG
    AW_AUDIT --> SECURITY
    AW_STATUS --> DEBUG

Observability Properties:

  • Artifact Preservation: All workflow outputs (prompts, patches, logs) are saved as downloadable artifacts
  • Cost Monitoring: Token usage and costs across workflow runs are tracked via gh aw logs
  • Failure Analysis: Failed runs can be investigated with gh aw audit to examine prompts, errors, and network activity
  • Firewall Logs: All network requests made by the agent are logged for security auditing
  • Step Summaries: Rich markdown summaries in GitHub Actions display agent decisions and outputs

CLI Commands for Observability:

# Download and analyze workflow run logs
gh aw logs
# Investigate a specific workflow run
gh aw audit <run-id>
# Check workflow health and status
gh aw status

Security Layer Summary

| Layer | Mechanism | Protection Against |
| --- | --- | --- |
| Substrate | GitHub Actions runner (VM, kernel, hypervisor) | Memory corruption, privilege escalation, host escape |
| Substrate | Docker container runtime | Process isolation bypass, shared state access |
| Substrate | AWF network controls (iptables) | Data exfiltration, unauthorized API calls |
| Substrate | MCP sandboxing (container isolation) | Container escape, unauthorized tool access |
| Configuration | Schema validation, expression allowlist | Invalid configurations, unauthorized expressions |
| Configuration | Action SHA pinning | Supply chain attacks, tag hijacking |
| Configuration | Security scanners (actionlint, zizmor, poutine) | Privilege escalation, misconfigurations, supply chain risks |
| Configuration | Pre-activation checks (role/permission) | Unauthorized users, expired workflows |
| Plan | Content sanitization | @mention abuse, bot triggers |
| Plan | Secret redaction | Credential leakage in logs/artifacts |
| Plan | Threat detection | Malicious patches, secret leaks |
| Plan | Permission separation (SafeOutputs) | Direct write access abuse |
| Plan | Output sanitization | Content injection, XSS |
| Plan | Artifact preservation, CLI tools | Debugging failures, auditing security, cost tracking |