AI-Assisted Development
The TABS website is built almost entirely with AI coding agents. This is not a marketing claim — it is a literal description of how code gets written. A human defines the goal, an AI agent researches the codebase, writes the implementation, and a different AI reviews the result. The human approves, requests changes, or redirects. This page documents exactly how that works, which models we use, and what we have learned.
The Models We Use
We do not rely on a single AI model. Different models have different strengths, and we choose based on the task at hand:
Claude (Anthropic)
Our primary development agent. Claude excels at deep code analysis, multi-file refactoring, and following complex project conventions. It reads our instruction files and maintains context across long sessions involving dozens of file edits.
Best for: Large features, complex refactoring, documentation, accessibility fixes
GitHub Copilot
Serves two roles: inline code completion during editing and automated pull request review. Every PR gets a Copilot code review that catches issues ranging from unused variables to accessibility problems. We typically require three clean review rounds before merging.
Best for: Code review, inline completions, quick edits
GPT-4o / o1 (OpenAI)
Used for brainstorming, content drafting, and reasoning about complex architectural decisions. The o1 reasoning model is particularly useful for thinking through multi-step problems where the solution path is not obvious.
Best for: Content drafting, architecture planning, complex reasoning
Gemini (Google)
Another agent option within our IDE. Gemini offers strong performance on code generation tasks and provides an alternative perspective when we want to compare approaches across models.
Best for: Alternative perspectives, code generation, comparative analysis
How We Actually Build
Our development workflow is a structured loop between human intent and AI execution. Here is the actual process, not an idealized version of it:
Human Opens an Issue
Every change starts as a GitHub Issue. The human describes the goal — fix a bug, add a page, refactor a component. Parent issues can have sub-tasks for larger initiatives.
AI Agent Implements
The human tells an AI agent (usually Claude) what to build. The agent reads the codebase, understands the patterns, creates a feature branch, writes the code, runs tests, and pushes the branch — all within the IDE.
PR Opens, CI Runs
The agent opens a pull request referencing the issue. CI automatically runs: formatting checks, linting, Jest unit tests (including accessibility), static build, four Playwright E2E shards, and CodeQL security scanning.
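A workflow wiring some of these checks together might look like the following. This is a hypothetical sketch, not the repository's actual configuration: the job layout and npm script names are assumptions, and the CodeQL and Lighthouse steps are omitted.

```yaml
# Hypothetical CI sketch (illustrative job and script names).
name: ci
on: [pull_request]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx prettier --check .   # formatting gate
      - run: npm run lint             # ESLint gate
      - run: npm test                 # Jest unit tests, including jest-axe
      - run: npm run build            # static export must succeed
  e2e:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]           # four parallel Playwright shards
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx playwright test --shard=${{ matrix.shard }}/4
```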
AI Reviews AI
GitHub Copilot automatically reviews the PR. It checks for unused variables, misleading comments, type safety issues, accessibility problems, and consistency with the rest of the codebase. We request multiple review rounds.
Fix, Re-review, Merge
Review comments get addressed — the same or a different AI agent reads the feedback, makes fixes, pushes again, and requests another review. Once reviews are clean and CI is green, the human approves the merge.
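The fix/re-review loop above can be sketched in code. Everything here is an illustrative stand-in: the function names are hypothetical, and the "review" and "fix" steps are simple simulations of what Copilot and the agent actually do.

```typescript
// Conceptual sketch of the fix/re-review loop. All names are hypothetical.

type ReviewResult = { clean: boolean; comments: string[] };

// Stand-in for an automated review: reports every issue still open.
function reviewPR(openIssues: string[]): ReviewResult {
  return { clean: openIssues.length === 0, comments: [...openIssues] };
}

// Stand-in for an agent addressing review comments and pushing a fix.
function addressComments(openIssues: string[], comments: string[]): string[] {
  return openIssues.filter((issue) => !comments.includes(issue));
}

// Fix, re-review, repeat until the review is clean; returns the round count.
function runReviewLoop(initialIssues: string[], maxRounds = 10): number {
  let open = [...initialIssues];
  for (let round = 1; round <= maxRounds; round++) {
    const review = reviewPR(open);
    if (review.clean) return round; // clean review + green CI: human merges
    open = addressComments(open, review.comments);
  }
  throw new Error("review loop did not converge");
}
```

In practice convergence is not guaranteed, which is why the human stays in the loop as the final approver.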
Instruction Files: Teaching AI Your Codebase
AI agents are only as good as the context they receive. We maintain several instruction files that teach agents about our project conventions, architecture, and workflows:
| File | Purpose | Used By |
|---|---|---|
| AGENTS.md | General instructions for all AI coding agents — architecture, naming conventions, testing requirements, API integrations | All agents |
| CLAUDE.md | Claude-specific instructions — strengths to leverage, IDE capabilities, MCP integration details, terminal access patterns | Claude |
| GEMINI.md | Gemini-specific instructions — optimized for Google's agent capabilities and context window | Gemini |
| .github/copilot-instructions.md | GitHub Copilot instructions — automatically loaded by Copilot Chat and code review, covers full project context | GitHub Copilot |
These files are "meta-documentation" — documentation that teaches AI how to read documentation. They are version-controlled alongside the code and updated whenever the project's conventions change. This creates a virtuous cycle: better instruction files produce better AI output, which reveals what the instruction files are missing.
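For a sense of what these files contain, an instruction file of this kind might read like the excerpt below. Every rule and path shown is a made-up example, not the project's real AGENTS.md.

```markdown
<!-- Hypothetical excerpt; illustrative rules and paths only. -->
## Conventions
- All pages are static exports; never add server-only code.
- New components get a co-located Jest test.
- Every interactive element needs an accessible name (jest-axe fails otherwise).

## Workflow
- Branch from `main` and reference the GitHub Issue in the PR description.
- Run formatting, lint, and unit tests locally before pushing.
```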
MCP: Giving AI Agents Real Tools
The Model Context Protocol (MCP) extends AI agents beyond code generation by giving them access to external APIs directly within the IDE:
GitHub MCP
Allows agents to create issues, open pull requests, request reviews, search code, and check CI status — all without leaving the conversation. The agent can manage the full issue-to-merge lifecycle.
Qualtrics MCP
Connects agents to the Qualtrics survey platform via OAuth. Agents can list surveys, inspect definitions, and manage survey operations — useful for the annual survey rollover process.
MCP means an agent can go from "create an issue for this bug" to "fix it, open a PR, and request a review" in a single conversation. This is how most of our development actually happens.
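Under the hood, MCP is a JSON-RPC 2.0 protocol: the IDE sends the server a `tools/call` request naming a tool and its arguments. The sketch below shows that message shape; the `create_issue` tool name and its arguments are illustrative stand-ins, so consult the GitHub MCP server's own schema for the real tool names.

```typescript
// Shape of an MCP tool-call request (JSON-RPC 2.0, method "tools/call").
interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): McpToolCall {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// An agent asking a (hypothetical) GitHub MCP server to open an issue:
const request = buildToolCall(1, "create_issue", {
  owner: "example-org",
  repo: "example-repo",
  title: "Fix broken breadcrumb on /about",
});
```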
Quality Guardrails
AI-generated code is not trusted by default. Every change passes through multiple automated quality gates before it reaches production:
- Prettier formatting — enforced in CI; code that does not match the project style is rejected
- ESLint — catches code quality issues; new errors fail the build
- Jest unit tests — 124+ tests covering components and utilities
- jest-axe accessibility — automated WCAG compliance checks that catch missing ARIA labels, incorrect roles, and color contrast issues
- Static build — the entire site must build successfully as a static export
- Playwright E2E — four parallel shards testing real browser interactions across the full site
- CodeQL security scanning — GitHub's code analysis catches security vulnerabilities
- Copilot code review — automated review catching logical errors, unused code, misleading comments, and consistency issues
- Lighthouse CI — performance, accessibility, and SEO scoring on every merge to main
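The idea behind these layered gates can be sketched as a pipeline where a change must pass every check in order. The gate implementations below are toy stand-ins (the real checks are the CI jobs listed above), but the structure is the point: one failing gate blocks the merge.

```typescript
// Conceptual sketch of layered quality gates; checks are toy stand-ins.
type Gate = { name: string; check: (change: string) => boolean };

const gates: Gate[] = [
  { name: "prettier", check: (c) => !c.includes("\t") },  // style stand-in: rejects tabs
  { name: "eslint", check: (c) => !c.includes("var ") },  // lint stand-in: rejects `var`
  { name: "jest", check: (c) => c.length > 0 },           // test stand-in: rejects empty changes
];

// Returns the first failing gate's name, or null if the change passes all gates.
function runGates(change: string, pipeline: Gate[]): string | null {
  for (const gate of pipeline) {
    if (!gate.check(change)) return gate.name;
  }
  return null;
}
```

Because each gate is independent, adding a new check (as with Lighthouse CI) means appending a gate rather than reworking the pipeline.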
These guardrails mean that AI can write code aggressively — the safety net catches problems before they reach users.
What We Have Learned
Where AI Excels
- Boilerplate and scaffolding — creating new pages, components, and test files following established patterns
- Multi-file refactoring — renaming, restructuring, and moving code across many files consistently
- Writing tests — generating unit and E2E tests with good coverage of edge cases
- Documentation — writing clear, structured technical documentation and inline comments
- Accessibility fixes — identifying and fixing ARIA issues, semantic HTML problems, and keyboard navigation gaps
- Code review response — reading review comments and implementing precise fixes
Where AI Struggles
- Complex merge conflicts — when multiple branches have diverged significantly, AI agents sometimes revert unrelated changes or create inconsistent resolutions
- Nuanced design decisions — visual layout choices, color palette decisions, and UX trade-offs still need human judgment
- Cross-session context — agents lose context between conversations; the instruction files partially solve this, but complex multi-day work requires human continuity
- External API quirks — when an API behaves unexpectedly (like Qualtrics rejecting a valid-looking request), agents can get stuck in loops trying the same approach
- Knowing when to stop — agents sometimes over-engineer solutions or make unnecessary "improvements" that the human did not request
The Impact
This approach lets a small team maintain a website of 149+ pages with comprehensive testing, daily automated reports, and multiple API integrations. Tasks that would take hours of manual coding — creating six new pages with consistent styling, breadcrumbs, metadata, sitemap entries, and navigation updates — can be completed in a single conversation.
The key insight is not that AI writes perfect code. It does not. The key insight is that with good guardrails — CI, automated review, accessibility checks — the imperfect code gets caught and fixed before it matters. The velocity gain comes from the AI handling the mechanical work while the human focuses on direction and quality judgment.
This page was, naturally, written by an AI agent and reviewed by GitHub Copilot. The human approved it.
