Meet the Workflows: Fault Investigation
Ah, splendid! Welcome back to Peli’s Agent Factory! Come, let me show you the chamber where vigilant caretakers investigate faults before they escalate!
In our previous post, we explored issue and PR management workflows.
Now let’s shift from collaboration ceremony to fault investigation.
While issue workflows help us handle what comes in, fault investigation workflows act as vigilant caretakers - spotting problems before they escalate and keeping our codebase healthy. These are the agents that investigate failed CI runs, detect schema drift, and catch breaking changes before users do.
Fault Investigation Workflows
These are our diligent caretakers - the agents that spot problems before they become bigger problems:
- CI Doctor - Investigates failed workflows and opens diagnostic issues - 9 merged PRs out of 13 proposed (69% merge rate)
- Schema Consistency Checker - Detects when schemas, code, and docs drift apart - 55 analysis discussions created
- Breaking Change Checker - Watches for changes that might break things for users - creates alert issues
The CI Doctor (also known as “CI Failure Doctor”) was one of our most important workflows. Instead of drowning in CI failure notifications, we now get timely, investigated failures with actual diagnostic insights. The agent doesn’t just tell us something broke - it analyzes logs, identifies patterns, searches for similar past issues, and even suggests fixes before a human has read the failure notification. CI Failure Doctor has contributed 9 merged PRs out of 13 proposed (69% merge rate), including fixes like adding Go module download pre-flight checks and adding retry logic to prevent proxy 403 failures. We learned that agents excel at the tedious investigation work that humans find draining.
The Schema Consistency Checker has created 55 analysis discussions examining schema drift between JSON schemas, Go structs, and documentation - for example, #7020 analyzing conditional logic consistency across the codebase. It caught drift that would have taken us days to notice manually.
Breaking Change Checker is a newer workflow that monitors for backward-incompatible changes and creates alert issues (e.g., #14113 flagging CLI version updates) before they reach production.
These “hygiene” workflows became our first line of defense, catching issues before they reached users.
The CI Doctor has inspired a growing range of similar workflows inside GitHub, where agents proactively run in-depth investigations of site incidents and failures. This is the future of operational excellence: AI agents kicking in immediately to investigate in depth, so the organization can respond faster.
Using These Workflows
You can add these workflows to your own repository and remix them. Get going with our Quick Start, then run one of the following:
CI Doctor:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/ci-doctor.md
Schema Consistency Checker:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/schema-consistency-checker.md
Breaking Change Checker:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/breaking-change-checker.md
Then edit and remix the workflow specifications to meet your needs, regenerate the lock file using gh aw compile, and push to your repository. See our Quick Start for further installation and setup instructions.
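The three commands above differ only in the workflow file name, so the pattern can be sketched as a small dry-run helper. This is a minimal sketch, not part of gh aw: the names GH_AW_REF and workflow_url are hypothetical, and the script only prints the commands rather than running them.

```shell
# Dry-run sketch: print the add-wizard command for each workflow in this post.
# GH_AW_REF and workflow_url are illustrative names, not part of gh aw.
set -eu

GH_AW_REF="v0.45.5"   # release tag that appears in the URLs above

# Build the source URL for a workflow specification in github/gh-aw.
workflow_url() {
  printf 'https://github.com/github/gh-aw/blob/%s/.github/workflows/%s.md\n' "$GH_AW_REF" "$1"
}

for name in ci-doctor schema-consistency-checker breaking-change-checker; do
  # Printed only; copy the line you want and run it yourself.
  echo "gh aw add-wizard $(workflow_url "$name")"
done
```

Keeping the tag in one variable makes it easier to point all three URLs at a different release, assuming the workflow files keep the same paths there.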
You can also create your own workflows.
Learn More
- GitHub Agentic Workflows - The technology behind the workflows
- Quick Start - How to write and compile workflows
Next Up: Metrics & Analytics Workflows
Next up, we look at workflows that help us understand whether the agent collection as a whole is working well. That’s where metrics and analytics workflows come in.
Continue reading: Metrics & Analytics Workflows →
This is part 8 of a 19-part series exploring the workflows in Peli’s Agent Factory.