Overview
The Operations Agent handles the third phase of AI-DLC. It takes constructed features to production, verifies they work correctly, and sets up monitoring.
Invocation
- Claude Code
- Cursor
- GitHub Copilot
Commands
| Command | Purpose |
|---|---|
| build | Build the project |
| deploy | Deploy to environment |
| verify | Verify deployment |
| monitor | Set up monitoring |
build
Builds the project for deployment:
Example Session
deploy
Deploys to a target environment:
Deployment Strategies
The agent supports common strategies based on your infrastructure:
| Strategy | Description |
|---|---|
| Rolling | Replace instances gradually |
| Blue-Green | Switch between two environments |
| Canary | Route percentage of traffic to new version |
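As a rough illustration of how the rolling and canary strategies in the table differ, the sketch below splits instances into batches for a rolling replacement and assigns requests to a canary by hashing a request ID. The function names, the hashing scheme, and the parameters are assumptions made for this example; they are not part of the Operations Agent.

```python
import hashlib
from typing import List, Sequence


def rolling_batches(instances: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split instances into consecutive batches to be replaced one batch at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(instances[i:i + batch_size])
            for i in range(0, len(instances), batch_size)]


def routes_to_canary(request_id: str, canary_percent: int) -> bool:
    """Return True if this request should go to the new version.

    Hashing the ID keeps the decision stable across repeated calls with the same ID.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < canary_percent
```

With `canary_percent=10`, roughly one request ID in ten lands on the new version, and raising the percentage only moves additional IDs over; IDs already on the canary stay there.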
verify
Runs verification after deployment:
Example Output
monitor
Sets up or checks monitoring:
- Logging: Structured logging configuration
- Metrics: Key performance indicators
- Alerts: Alert rules and thresholds
- Dashboards: Visualization of system health
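To make the alerts component more concrete, here is a minimal sketch of a threshold-based alert rule evaluated against a metrics snapshot. The `AlertRule` class, its fields, and the metric names are hypothetical and chosen for illustration; the format the agent actually produces will depend on your monitoring stack.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List

# Comparisons supported by this sketch.
_COMPARATORS = {">": operator.gt, "<": operator.lt}


@dataclass(frozen=True)
class AlertRule:
    name: str
    metric: str
    comparison: str  # ">" or "<"
    threshold: float

    def fires(self, metrics: Dict[str, float]) -> bool:
        """Return True when the metric is present and crosses the threshold."""
        if self.comparison not in _COMPARATORS:
            raise ValueError(f"unsupported comparison: {self.comparison!r}")
        value = metrics.get(self.metric)
        if value is None:
            return False  # missing data is not treated as a breach in this sketch
        return _COMPARATORS[self.comparison](value, self.threshold)


def firing_alerts(rules: List[AlertRule], metrics: Dict[str, float]) -> List[str]:
    """Names of all rules that fire for the given snapshot, in rule order."""
    return [rule.name for rule in rules if rule.fires(metrics)]
```

A rule such as `AlertRule("HighErrorRate", "error_rate", ">", 0.05)` would fire for a snapshot of `{"error_rate": 0.08}` and stay quiet for `{"error_rate": 0.01}`.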
Key Metrics
The agent suggests monitoring:
| Category | Metrics |
|---|---|
| Availability | Uptime, error rate, success rate |
| Performance | Latency (p50, p95, p99), throughput |
| Resources | CPU, memory, disk, connections |
| Business | Active users, transactions, conversions |
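For readers unfamiliar with the latency percentiles listed under Performance, the following sketch shows one common way to compute them (the nearest-rank method) alongside a simple error rate. The helper names are illustrative assumptions, and a real monitoring system may use a different percentile definition or an approximate one.

```python
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) using integers
    return ordered[rank - 1]


def error_rate(failed: int, total: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return failed / total if total else 0.0
```

For 100 latency samples of 1 ms through 100 ms, `percentile(samples, 50)`, `percentile(samples, 95)`, and `percentile(samples, 99)` return 50, 95, and 99 respectively.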
Human Checkpoints
The Operations Agent has 4 human checkpoints aligned with environment progression:
| Gate | Location | Purpose |
|---|---|---|
| Gate 1 | After build | Approve build artifacts before deployment |
| Gate 2 | Before staging deploy | Confirm ready for staging environment |
| Gate 3 | Before production deploy | Critical approval for production |
| Gate 4 | After monitoring setup | Confirm operations complete |
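The gate sequence can be modeled as an ordered list in which work pauses at the first gate that has not been approved. The sketch below assumes the gates are strictly sequential, which the table implies but does not state; the function names and the set-of-approvals representation are hypothetical.

```python
from typing import Optional, Set

# Gate names and locations copied from the table above, in required order.
GATES = [
    ("Gate 1", "After build"),
    ("Gate 2", "Before staging deploy"),
    ("Gate 3", "Before production deploy"),
    ("Gate 4", "After monitoring setup"),
]


def next_pending_gate(approved: Set[str]) -> Optional[str]:
    """Return the first gate not yet approved, or None when all are approved."""
    for name, _location in GATES:
        if name not in approved:
            return name
    return None


def may_deploy_to_production(approved: Set[str]) -> bool:
    """Assumption: a production deploy needs Gates 1 through 3 approved."""
    return {"Gate 1", "Gate 2", "Gate 3"} <= approved
```

Under this model, approving Gate 3 without Gate 2 does not unblock production: `next_pending_gate` still reports Gate 2 as outstanding.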
Artifacts
Operations artifacts are stored in:
Runbooks
The agent generates runbooks for common operations:
Deployment Runbook
Step-by-step deployment procedure:
- Pre-deployment checklist
- Deployment commands
- Verification steps
- Rollback procedure
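The four parts of the deployment runbook can be read as a small control flow: check, deploy, verify, and roll back if verification fails. The sketch below expresses that flow with caller-supplied functions; `run_deployment` and its parameters are assumptions for illustration, and a generated runbook would contain your project's real commands instead.

```python
from typing import Callable, List


def run_deployment(
    preflight_checks: List[Callable[[], bool]],
    deploy: Callable[[], None],
    verify: Callable[[], bool],
    rollback: Callable[[], None],
) -> str:
    """Run the four runbook steps in order and report the outcome."""
    if not all(check() for check in preflight_checks):
        return "aborted"       # checklist failed, nothing was deployed
    deploy()
    if verify():
        return "deployed"
    rollback()                 # verification failed, restore the previous version
    return "rolled_back"
```

Keeping rollback as an explicit step in the same flow means a failed verification never leaves the environment on an unverified version.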
Incident Response
What to do when things go wrong:
- Detection and triage
- Escalation matrix
- Communication template
- Post-mortem process
Scaling Runbook
How to handle increased load:
- Signs of scaling needs
- Horizontal vs vertical
- Scaling commands
- Verification
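As one possible way to turn "signs of scaling needs" into a horizontal scaling decision, the sketch below applies a proportional rule: scale the replica count by the ratio of the observed metric to its target. This is the same basic formula the Kubernetes Horizontal Pod Autoscaler documents, used here only as an example; the function name, the `max_replicas` cap, and the choice of metric are assumptions, and vertical scaling is not covered.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, max_replicas: int = 20) -> int:
    """Proportional horizontal scaling: ceil(current * observed / target), clamped."""
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    wanted = math.ceil(current_replicas * current_metric / target_metric)
    return max(1, min(wanted, max_replicas))  # never below 1, never above the cap
```

For example, 4 replicas running at 90% CPU against a 60% target gives `desired_replicas(4, 90, 60) == 6`, while the same replicas at 30% CPU would shrink to 2.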
Best Practices
Always Verify
Never skip verification after deployment. Automated checks catch issues humans miss.
Stage Environments
Deploy to staging before production. Test the deployment process itself.
Monitor Proactively
Set up alerts before you need them. Don’t wait for production issues.
Document Runbooks
Keep runbooks updated. They’re essential during incidents.
