Meet the Workflows: Testing & Validation
Right this way! Let’s continue our grand tour of Peli’s Agent Factory! Into the verification chamber where nothing escapes scrutiny!
In our previous post, we explored ChatOps workflows - agents that respond to slash commands and GitHub reactions, providing on-demand assistance with full context.
But making code better is only half the battle. We also need to ensure it keeps working. As we refactor, optimize, and evolve our codebase, how do we know we haven’t broken something? How do we catch regressions before users do? That’s where testing and validation workflows come in - the skeptical guardians that continuously verify our systems still function as expected. We learned that AI infrastructure needs constant health checks, because what worked yesterday might silently fail today. These workflows embody trust but verify.
Testing & Validation Workflows
Section titled “Testing & Validation Workflows”These agents keep everything running smoothly through continuous testing:
Code Quality & Test Validation
Section titled “Code Quality & Test Validation”- Daily Testify Uber Super Expert - Analyzes test files daily and suggests testify-based improvements - 19 issues created, 13 led to merged PRs (100% causal chain merge rate)
- Daily Test Improver - Identifies coverage gaps and implements new tests incrementally
- Daily Compiler Quality Check - Analyzes compiler code to ensure it meets quality standards
User Experience & Integration Testing
Section titled “User Experience & Integration Testing”- Daily Multi-Device Docs Tester - Tests documentation across devices with Playwright - 2 merged PRs out of 2 proposed (100% merge rate)
- CLI Consistency Checker - Inspects the CLI for inconsistencies, typos, and documentation gaps - 80 merged PRs out of 102 proposed (78% merge rate)
CI/CD Optimization
Section titled “CI/CD Optimization”- CI Coach - Analyzes CI pipelines and suggests optimizations - 9 merged PRs out of 9 proposed (100% merge rate)
- Workflow Health Manager - Meta-orchestrator monitoring health of all agentic workflows - 40 issues created, 5 direct PRs + 14 causal chain PRs merged
The Daily Testify Expert has created 19 issues analyzing test quality, and 13 of those issues led to merged PRs by downstream agents - a perfect 100% causal chain merge rate. For example, issue #13701 led to #13722 modernizing console render tests with testify assertions. The Daily Test Improver works alongside it to identify coverage gaps and implement new tests.
The Multi-Device Docs Tester uses Playwright to test our documentation on different screen sizes - it has created 2 PRs (both merged), including adding —network host to Playwright Docker containers. It found mobile rendering issues we never would have caught manually. The CLI Consistency Checker has contributed 80 merged PRs out of 102 proposed (78% merge rate), maintaining consistency in CLI interface and documentation. Recent examples include removing undocumented CLI commands and fixing upgrade command documentation.
CI Optimization Coach has contributed 9 merged PRs out of 9 proposed (100% merge rate), optimizing CI pipelines for speed and efficiency with perfect execution. Examples include removing unnecessary test dependencies and fixing duplicate test execution.
The Workflow Health Manager has created 40 issues monitoring the health of all other workflows, with 25 of those issues leading to 34 PRs (14 merged) by downstream agents - plus 5 direct PRs merged. For example, issue #14105 about a missing runtime file led to #14127 fixing the workflow configuration.
These workflows embody the principle: trust but verify. Just because it worked yesterday doesn’t mean it works today.
Using These Workflows
Section titled “Using These Workflows”You can add these workflows to your own repository and remix them. Get going with our Quick Start, then run one of the following:
Daily Testify Uber Super Expert:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-testify-uber-super-expert.mdDaily Test Improver:
gh aw add-wizard githubnext/agentics/daily-test-improverDaily Compiler Quality Check:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-compiler-quality.mdDaily Multi-Device Docs Tester:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-multi-device-docs-tester.mdCLI Consistency Checker:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/cli-consistency-checker.mdCI Coach:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/ci-coach.mdWorkflow Health Manager:
gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/workflow-health-manager.mdThen edit and remix the workflow specifications to meet your needs, regenerate the lock file using gh aw compile, and push to your repository. See our Quick Start for further installation and setup instructions.
You can also create your own workflows.
Learn More
Section titled “Learn More”- GitHub Agentic Workflows - The technology behind the workflows
- Quick Start - How to write and compile workflows
Next Up: Monitoring the Monitors
Section titled “Next Up: Monitoring the Monitors”But what about the infrastructure itself? Who watches the watchers? Time to go meta.
Continue reading: Tool & Infrastructure Workflows →
This is part 14 of a 19-part series exploring the workflows in Peli’s Agent Factory.