Meet the Workflows: Testing & Validation

Jan 13, 2026

Right this way! Let’s continue our grand tour of Peli’s Agent Factory! Into the verification chamber where nothing escapes scrutiny!

In our previous post, we explored ChatOps workflows - agents that respond to slash commands and GitHub reactions, providing on-demand assistance with full context.

But making code better is only half the battle. We also need to ensure it keeps working. As we refactor, optimize, and evolve our codebase, how do we know we haven’t broken something? How do we catch regressions before users do? That’s where testing and validation workflows come in - the skeptical guardians that continuously verify our systems still function as expected. We learned that AI infrastructure needs constant health checks, because what worked yesterday might silently fail today. These workflows embody trust but verify.

Testing & Validation Workflows

These agents keep everything running smoothly through continuous testing:

Code Quality & Test Validation

Daily Testify Uber Super Expert - Analyzes test files daily and suggests testify-based improvements - 19 issues created, 13 led to merged PRs (100% causal chain merge rate)
Daily Test Improver - Identifies coverage gaps and implements new tests incrementally
Daily Compiler Quality Check - Analyzes compiler code to ensure it meets quality standards

User Experience & Integration Testing

Daily Multi-Device Docs Tester - Tests documentation across devices with Playwright - 2 merged PRs out of 2 proposed (100% merge rate)
CLI Consistency Checker - Inspects the CLI for inconsistencies, typos, and documentation gaps - 80 merged PRs out of 102 proposed (78% merge rate)

CI/CD Optimization

CI Coach - Analyzes CI pipelines and suggests optimizations - 9 merged PRs out of 9 proposed (100% merge rate)
Workflow Health Manager - Meta-orchestrator monitoring health of all agentic workflows - 40 issues created, 5 direct PRs + 14 causal chain PRs merged

The Daily Testify Expert has created 19 issues analyzing test quality, and 13 of those issues led to merged PRs by downstream agents - a perfect 100% causal chain merge rate. For example, issue #13701 led to #13722 modernizing console render tests with testify assertions. The Daily Test Improver works alongside it to identify coverage gaps and implement new tests.

The Multi-Device Docs Tester uses Playwright to test our documentation on different screen sizes - it has created 2 PRs (both merged), including adding —network host to Playwright Docker containers. It found mobile rendering issues we never would have caught manually. The CLI Consistency Checker has contributed 80 merged PRs out of 102 proposed (78% merge rate), maintaining consistency in CLI interface and documentation. Recent examples include removing undocumented CLI commands and fixing upgrade command documentation.

CI Optimization Coach has contributed 9 merged PRs out of 9 proposed (100% merge rate), optimizing CI pipelines for speed and efficiency with perfect execution. Examples include removing unnecessary test dependencies and fixing duplicate test execution.

The Workflow Health Manager has created 40 issues monitoring the health of all other workflows, with 25 of those issues leading to 34 PRs (14 merged) by downstream agents - plus 5 direct PRs merged. For example, issue #14105 about a missing runtime file led to #14127 fixing the workflow configuration.

These workflows embody the principle: trust but verify. Just because it worked yesterday doesn’t mean it works today.

Using These Workflows

You can add these workflows to your own repository and remix them. Get going with our Quick Start, then run one of the following:

Daily Testify Uber Super Expert:

gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-testify-uber-super-expert.md

Daily Test Improver:

gh aw add-wizard githubnext/agentics/daily-test-improver

Daily Compiler Quality Check:

gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-compiler-quality.md

Daily Multi-Device Docs Tester:

gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-multi-device-docs-tester.md

CLI Consistency Checker:

gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/cli-consistency-checker.md

CI Coach:

gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/ci-coach.md

Workflow Health Manager:

gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/workflow-health-manager.md

Then edit and remix the workflow specifications to meet your needs, regenerate the lock file using gh aw compile, and push to your repository. See our Quick Start for further installation and setup instructions.

You can also create your own workflows.

Learn More

GitHub Agentic Workflows - The technology behind the workflows
Quick Start - How to write and compile workflows

Next Up: Monitoring the Monitors

But what about the infrastructure itself? Who watches the watchers? Time to go meta.

Continue reading: Tool & Infrastructure Workflows →

This is part 14 of a 19-part series exploring the workflows in Peli’s Agent Factory.