AI-Assisted Development

The TABS website is built almost entirely with AI coding agents. This is not a marketing claim — it is a literal description of how code gets written. A human defines the goal, an AI agent researches the codebase, writes the implementation, and a different AI reviews the result. The human approves, requests changes, or redirects. This page documents exactly how that works, which models we use, and what we have learned.

The Models We Use

We do not rely on a single AI model. Different models have different strengths, and we choose based on the task at hand:

Claude (Anthropic)

Our primary development agent. Claude excels at deep code analysis, multi-file refactoring, and following complex project conventions. It reads our instruction files and maintains context across long sessions involving dozens of file edits.

Best for: Large features, complex refactoring, documentation, accessibility fixes

GitHub Copilot

Serves dual roles: inline code completion during editing, and automated pull request reviewer. Every PR gets a Copilot code review that catches issues ranging from unused variables to accessibility problems. We typically require 3 clean reviews before merging.

Best for: Code review, inline completions, quick edits

GPT-4o / o1 (OpenAI)

Used for brainstorming, content drafting, and reasoning about complex architectural decisions. The o1 reasoning model is particularly useful for thinking through multi-step problems where the solution path is not obvious.

Best for: Content drafting, architecture planning, complex reasoning

Gemini (Google)

Another agent option within our IDE. Gemini offers strong performance on code generation tasks and provides an alternative perspective when we want to compare approaches across models.

Best for: Alternative perspectives, code generation, comparative analysis

How We Actually Build

Our development workflow is a structured loop between human intent and AI execution. Here is the actual process, not idealized:

1. Human Opens an Issue

Every change starts as a GitHub Issue. The human describes the goal — fix a bug, add a page, refactor a component. Parent issues can have sub-tasks for larger initiatives.

2. AI Agent Implements

The human tells an AI agent (usually Claude) what to build. The agent reads the codebase, understands the patterns, creates a feature branch, writes the code, runs tests, and pushes the branch — all within the IDE.

3. PR Opens, CI Runs

The agent opens a pull request referencing the issue. CI automatically runs: formatting checks, linting, Jest unit tests (including accessibility), static build, four Playwright E2E shards, and CodeQL security scanning.

4. AI Reviews AI

GitHub Copilot automatically reviews the PR. It checks for unused variables, misleading comments, type safety issues, accessibility problems, and consistency with the rest of the codebase. We request multiple review rounds.

5. Fix, Re-review, Merge

Review comments get addressed — the same or a different AI agent reads the feedback, makes fixes, pushes again, and requests another review. Once reviews are clean and CI is green, the human approves the merge.
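In code terms, the loop above behaves like a gate: nothing merges until CI is green, the required review rounds come back clean, and the human signs off. A minimal sketch of that condition (the type names are illustrative, not our actual tooling):

```typescript
// Illustrative model of the issue-to-merge gate; the types here are
// assumptions for the sketch, not a real integration.
type ReviewRound = { clean: boolean };

interface PullRequest {
  ciGreen: boolean;
  reviews: ReviewRound[];
  humanApproved: boolean;
}

// "We typically require 3 clean reviews" (see step 4 above).
const REQUIRED_CLEAN_REVIEWS = 3;

function readyToMerge(pr: PullRequest): boolean {
  const cleanRounds = pr.reviews.filter((r) => r.clean).length;
  return pr.ciGreen && cleanRounds >= REQUIRED_CLEAN_REVIEWS && pr.humanApproved;
}
```

A PR with failing CI or too few clean rounds simply stays open; the agent keeps addressing feedback until every condition holds.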

Instruction Files: Teaching AI Your Codebase

AI agents are only as good as the context they receive. We maintain several instruction files that teach agents about our project conventions, architecture, and workflows:

  • AGENTS.md (used by all agents): general instructions for every AI coding agent, covering architecture, naming conventions, testing requirements, and API integrations
  • CLAUDE.md (used by Claude): Claude-specific instructions, covering strengths to leverage, IDE capabilities, MCP integration details, and terminal access patterns
  • GEMINI.md (used by Gemini): Gemini-specific instructions, optimized for Google's agent capabilities and context window
  • .github/copilot-instructions.md (used by GitHub Copilot): Copilot instructions, automatically loaded by Copilot Chat and code review, covering full project context

These files are "meta-documentation" — documentation that teaches AI how to read documentation. They are version-controlled alongside the code and updated whenever the project's conventions change. This creates a virtuous cycle: better instruction files produce better AI output, which reveals what the instruction files are missing.

MCP: Giving AI Agents Real Tools

The Model Context Protocol (MCP) extends AI agents beyond code generation by giving them access to external APIs directly within the IDE:

GitHub MCP

Allows agents to create issues, open pull requests, request reviews, search code, and check CI status — all without leaving the conversation. The agent can manage the full issue-to-merge lifecycle.

Qualtrics MCP

Connects agents to the Qualtrics survey platform via OAuth. Agents can list surveys, inspect definitions, and manage survey operations — useful for the annual survey rollover process.

MCP means an agent can go from "create an issue for this bug" to "fix it, open a PR, and request a review" in a single conversation. This is how most of our development actually happens.
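Under the hood, MCP tool invocations are JSON-RPC 2.0 messages. A sketch of what a "create an issue" call might look like on the wire (the tool name and arguments below are hypothetical; each MCP server defines its own tools):

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0, "tools/call" method).
// The tool name and arguments are hypothetical examples, not the exact
// schema of the GitHub MCP server.
interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

const createIssue: McpToolCall = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "create_issue", // hypothetical tool name
    arguments: {
      owner: "example-org",
      repo: "example-repo",
      title: "Fix broken breadcrumb on survey page",
    },
  },
};
```

The agent never constructs these messages by hand; the IDE's MCP client serializes tool calls like this on its behalf, which is what lets one conversation span issue creation, code, and review requests.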

Quality Guardrails

AI-generated code is not trusted by default. Every change passes through multiple automated quality gates before it reaches production:

  • Prettier formatting — enforced in CI; code that does not match the project style is rejected
  • ESLint — catches code quality issues; new errors fail the build
  • Jest unit tests — 124+ tests covering components and utilities
  • jest-axe accessibility — automated WCAG compliance checks that catch missing ARIA labels, incorrect roles, and color contrast issues
  • Static build — the entire site must build successfully as a static export
  • Playwright E2E — four parallel shards testing real browser interactions across the full site
  • CodeQL security scanning — GitHub's code analysis catches security vulnerabilities
  • Copilot code review — automated review catching logical errors, unused code, misleading comments, and consistency issues
  • Lighthouse CI — performance, accessibility, and SEO scoring on every merge to main
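Conceptually, these gates form a pipeline where any single failure blocks the merge. A small sketch of that "first failure wins" behavior (gate names mirror the list above; the pass/fail wiring is illustrative):

```typescript
// Illustrative guardrail pipeline: every gate must pass before merge.
// The gate names mirror our CI checks; the data here is made up.
type Gate = { name: string; passed: boolean };

// Returns the name of the first failing gate, or null if all gates pass.
function firstFailure(gates: Gate[]): string | null {
  const failed = gates.find((g) => !g.passed);
  return failed ? failed.name : null;
}

const gates: Gate[] = [
  { name: "prettier", passed: true },
  { name: "eslint", passed: true },
  { name: "jest", passed: true },
  { name: "jest-axe", passed: false }, // e.g. a missing ARIA label
  { name: "playwright", passed: true },
];
```

Here `firstFailure(gates)` reports the jest-axe gate, and the PR stays blocked until the accessibility issue is fixed and CI reruns.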

These guardrails mean that AI can write code aggressively — the safety net catches problems before they reach users.

What We Have Learned

Where AI Excels

  • Boilerplate and scaffolding — creating new pages, components, and test files following established patterns
  • Multi-file refactoring — renaming, restructuring, and moving code across many files consistently
  • Writing tests — generating unit and E2E tests with good coverage of edge cases
  • Documentation — writing clear, structured technical documentation and inline comments
  • Accessibility fixes — identifying and fixing ARIA issues, semantic HTML problems, and keyboard navigation gaps
  • Code review response — reading review comments and implementing precise fixes

Where AI Struggles

  • Complex merge conflicts — when multiple branches have diverged significantly, AI agents sometimes revert unrelated changes or create inconsistent resolutions
  • Nuanced design decisions — visual layout choices, color palette decisions, and UX trade-offs still need human judgment
  • Cross-session context — agents lose context between conversations; the instruction files partially solve this, but complex multi-day work requires human continuity
  • External API quirks — when an API behaves unexpectedly (like Qualtrics rejecting a valid-looking request), agents can get stuck in loops trying the same approach
  • Knowing when to stop — agents sometimes over-engineer solutions or make unnecessary "improvements" that the human did not request

The Impact

This approach allows a small team to maintain a 149+ page website with comprehensive testing, daily automated reports, and multiple API integrations. Tasks that would take hours of manual coding — creating six new pages with consistent styling, breadcrumbs, metadata, sitemap entries, and navigation updates — can be completed in a single conversation.

The key insight is not that AI writes perfect code. It does not. The key insight is that with good guardrails — CI, automated review, accessibility checks — the imperfect code gets caught and fixed before it matters. The velocity gain comes from the AI handling the mechanical work while the human focuses on direction and quality judgment.

This page was, naturally, written by an AI agent and reviewed by GitHub Copilot. The human approved it.