Advanced · 22 min

Learn how to work with AI agents that can autonomously perform multi-step tasks—with appropriate oversight and guardrails.

Agentic Workflows

Agentic AI represents a shift from simple request-response to autonomous task completion. AI agents can plan, execute, and iterate on complex tasks with minimal human intervention—but require appropriate oversight.

What Makes AI "Agentic"

Traditional AI assistance:

Terminal
Human: "Write a function to validate emails"
AI: [Returns function]
Human: "Now add tests"
AI: [Returns tests]
Human: "Run the tests"
AI: "I can't run commands"

Agentic AI:

Terminal
Human: "Create an email validation function with tests"
AI:
  1. Writes function
  2. Creates test file
  3. Runs tests
  4. Sees failures
  5. Fixes code
  6. Runs tests again
  7. Reports success

Agentic Capabilities

Modern AI agents can:

| Capability   | Example                               |
|--------------|---------------------------------------|
| Read files   | Examine existing code for patterns    |
| Write files  | Create new code and configurations    |
| Run commands | Execute builds, tests, linters        |
| Search code  | Find relevant files and patterns      |
| Browse web   | Research documentation and solutions  |
| Iterate      | Fix issues based on feedback          |
| Plan         | Break tasks into steps                |
| Use tools    | Query databases via MCP, call APIs    |

Types of AI Agents

Synchronous Agents (Interactive)

Run in your terminal or editor and complete tasks while you wait:

  • Claude Code: Terminal agent with extended thinking
  • OpenAI Codex: Multi-tool terminal agent
  • Cline: VS Code integrated agent
  • Aider: Terminal pair-programming agent

Asynchronous Agents (Background)

Work independently on complex tasks:

  • GitHub Copilot Coding Agent: Assign issues, get PRs
  • Devin: Autonomous software engineer (hours-long tasks)
  • Google Jules: Async coding agent

When to use async agents:

  • Tasks that would take hours
  • Work that doesn't require immediate review
  • Feature implementation while you focus elsewhere
  • Overnight or background processing

When to Use Agentic Workflows

Good Candidates

  • Multi-file refactoring
  • Adding features with tests
  • Fixing bugs across codebase
  • Upgrading dependencies
  • Code migrations
  • Documentation generation

Poor Candidates

  • Simple one-file changes (overhead not worth it)
  • Security-critical code (needs careful review)
  • Architecture decisions (needs human judgment)
  • Production deployments (too risky)

The Agentic Workflow Loop

Terminal
┌──────────────────────────────────────────┐
│              HUMAN OVERSIGHT             │
│  ┌────────────────────────────────────┐  │
│  │          AI AGENT LOOP             │  │
│  │                                    │  │
│  │  1. Understand Task                │  │
│  │          ↓                         │  │
│  │  2. Plan Approach                  │  │
│  │          ↓                         │  │
│  │  3. Execute Step                   │  │
│  │          ↓                         │  │
│  │  4. Verify Result                  │  │
│  │          ↓                         │  │
│  │  5. Success? → Done                │  │
│  │     Failure? → Adjust & Retry      │  │
│  │         ↑____________↓             │  │
│  │                                    │  │
│  └────────────────────────────────────┘  │
│                                          │
│  Review at checkpoints                   │
│  Approve critical changes                │
│  Intervene if needed                     │
└──────────────────────────────────────────┘
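
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent is implemented: the `Step` class, the `run_agent_loop` function, and the `approve` callback (standing in for the human reviewer) are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    execute: Callable[[], None]   # performs the change
    verify: Callable[[], bool]    # returns True when the result is acceptable
    checkpoint: bool = False      # pause for human review after this step


def run_agent_loop(
    steps: List[Step],
    approve: Callable[[str], bool],
    max_retries: int = 3,
) -> List[str]:
    """Execute steps in order, retrying failed ones and pausing at checkpoints."""
    log: List[str] = []
    for step in steps:
        for attempt in range(1, max_retries + 1):
            step.execute()
            if step.verify():
                log.append(f"{step.name}: ok (attempt {attempt})")
                break
            log.append(f"{step.name}: failed (attempt {attempt})")
        else:
            # Retries exhausted: hand control back to the human.
            log.append(f"{step.name}: stopped, needs human input")
            return log
        if step.checkpoint and not approve(step.name):
            log.append(f"{step.name}: rejected at checkpoint")
            return log
    log.append("done")
    return log


# Usage: a step that fails verification once, then passes on retry.
state = {"runs": 0}


def flaky_execute() -> None:
    state["runs"] += 1


steps = [
    Step("write function", lambda: None, lambda: True),
    Step("run tests", flaky_execute, lambda: state["runs"] >= 2, checkpoint=True),
]
result = run_agent_loop(steps, approve=lambda name: True)
```

The inner retry loop is the "Adjust & Retry" arrow, and the `approve` callback is the outer oversight box: the agent cannot get past a checkpoint step without it returning True.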

Setting Up Agentic Tasks

Clear Task Definition

Terminal
Task: Add user profile editing functionality

## Scope
- Frontend: Profile edit form
- Backend: PATCH /api/users/:id endpoint
- Validation: Name, email, avatar URL
- Tests: Unit tests for validation, integration test for API

## Constraints
- Must use existing form components
- Must follow REST conventions
- Must not change existing user routes

## Success Criteria
- All new tests pass
- Existing tests still pass
- Endpoint documented in OpenAPI spec

## Checkpoints (notify me)
- After creating API endpoint
- After creating form component
- After all tests pass

Defining Guardrails

Terminal
Permissions (allowed):
- Read any file
- Write to src/ and tests/ directories
- Run npm test and npm run lint

Permissions (not allowed):
- Modify package.json
- Run npm install
- Delete files without confirmation
- Commit changes

Stop and ask if:
- Unsure about architectural decisions
- Need to modify more than 10 files
- Tests fail more than 3 times
- Encounter security-sensitive code
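
Guardrails written in a prompt are only requests. If your tooling lets you hook into the agent's file writes and commands, the same policy can also be enforced in code. The sketch below is one possible shape for such a check; the constants and function names are hypothetical and simply mirror the example policy above.

```python
from pathlib import PurePosixPath

# Hypothetical policy values taken from the example guardrails above.
WRITABLE_DIRS = ("src", "tests")
ALLOWED_COMMANDS = {"npm test", "npm run lint"}
PROTECTED_FILES = {"package.json"}
MAX_FILES_CHANGED = 10
MAX_TEST_FAILURES = 3


def can_write(path: str) -> bool:
    """Allow writes only inside writable directories, never to protected files."""
    p = PurePosixPath(path)
    if p.is_absolute() or ".." in p.parts:
        return False  # reject paths that could escape the project
    if p.name in PROTECTED_FILES:
        return False
    return len(p.parts) > 1 and p.parts[0] in WRITABLE_DIRS


def can_run(command: str) -> bool:
    """Allow only exact, pre-approved commands (whitespace normalized)."""
    return " ".join(command.split()) in ALLOWED_COMMANDS


def must_stop(files_changed: int, test_failures: int) -> bool:
    """Return True when a stop-and-ask threshold is crossed."""
    return files_changed > MAX_FILES_CHANGED or test_failures > MAX_TEST_FAILURES
```

Matching commands exactly, rather than by prefix, is a deliberate choice here: a prefix check on "npm test" would also accept "npm test && npm install".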

Verification Strategies

Continuous Verification

Terminal
After each code change:
1. Run linter
2. Run type checker
3. Run affected tests

If any fail:
1. Read error message
2. Identify cause
3. Fix issue
4. Re-verify
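
As a rough sketch, the verification sequence above can be scripted so that the first failing check and its output are returned for the agent to read. The `run_checks` helper is a hypothetical name, and the demo commands are placeholders that only need a Python interpreter; in a real project you would substitute your own linter, type checker, and test commands.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple

Check = Tuple[str, Sequence[str]]


def run_checks(checks: List[Check]) -> Optional[Tuple[str, str]]:
    """Run checks in order; return (name, output) of the first failure, else None."""
    for name, command in checks:
        proc = subprocess.run(command, capture_output=True, text=True)
        if proc.returncode != 0:
            # Return the error text so the cause can be identified and fixed.
            return name, (proc.stdout + proc.stderr).strip()
    return None


# Placeholder checks that stand in for a real linter, type checker, and tests.
demo_checks: List[Check] = [
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
    ("types", [sys.executable, "-c", "import sys; sys.exit('type error in app.py')"]),
    ("tests", [sys.executable, "-c", "print('tests ok')"]),
]

failure = run_checks(demo_checks)
if failure is not None:
    print(f"{failure[0]} failed: {failure[1]}")
```

Stopping at the first failure keeps the feedback small: the agent fixes one cause, then re-runs the whole sequence.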

Test-Driven Agentic Work

Terminal
Task: Add password strength validation

Approach:
1. First, write tests for password validation
2. Run tests (should fail - TDD red phase)
3. Implement validation function
4. Run tests (should pass - green phase)
5. Refactor if needed
6. Run tests again to verify

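This way every change is checked against tests before the agent moves on. Passing tests show that the code handles the tested cases, not that it is correct in general, so review the tests themselves as well.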
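
A small illustration of the test-first flow for the password task, using Python's standard `unittest` module. The strength rules (12 or more characters, with lowercase, uppercase, and a digit) are assumptions made for this example, not a recommended policy, and `is_strong_password` is a hypothetical function name.

```python
import unittest


def is_strong_password(password: str) -> bool:
    """Assumed rule set: 12+ characters with lowercase, uppercase, and a digit."""
    return (
        len(password) >= 12
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
    )


class PasswordStrengthTests(unittest.TestCase):
    # In a TDD flow these tests are written first and fail (red phase)
    # until is_strong_password is implemented (green phase).
    def test_rejects_short_password(self):
        self.assertFalse(is_strong_password("Ab1"))

    def test_rejects_missing_digit(self):
        self.assertFalse(is_strong_password("NoDigitsInHere"))

    def test_rejects_missing_uppercase(self):
        self.assertFalse(is_strong_password("alllowercase123"))

    def test_accepts_mixed_password(self):
        self.assertTrue(is_strong_password("CorrectHorse42"))


def run_tests() -> bool:
    """Run the suite programmatically and report whether everything passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PasswordStrengthTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Calling `run_tests()` after each change gives the agent a single pass or fail signal to act on.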

Iteration and Recovery

Handling Failures

When agents encounter issues:

Terminal
If build fails:
  1. Read error output
  2. Identify affected file/line
  3. Understand the error
  4. Apply fix
  5. Rebuild

If stuck (3+ attempts):
  1. Stop and report to human
  2. Explain what was tried
  3. Present options for resolution
  4. Wait for guidance
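
One way to make the "stuck" rule concrete is to record each attempt and, once the limit is reached, produce a report instead of trying again. The sketch below assumes a `build` callable that returns None on success or an error string on failure; all class and function names are hypothetical, as are the two resolution options.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Attempt:
    fix_tried: str
    error: str


@dataclass
class Escalation:
    task: str
    attempts: List[Attempt] = field(default_factory=list)
    options: List[str] = field(default_factory=list)

    def report(self) -> str:
        """Summarize what was tried and which options remain."""
        lines = [f"Stuck on: {self.task}", "Tried:"]
        lines += [
            f"  {i}. {a.fix_tried} -> {a.error}"
            for i, a in enumerate(self.attempts, 1)
        ]
        lines.append("Options:")
        lines += [f"  - {o}" for o in self.options]
        return "\n".join(lines)


def build_with_retries(
    task: str,
    fixes: List[str],
    build: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[Escalation]:
    """Apply candidate fixes in turn; return None on success, else an Escalation."""
    escalation = Escalation(task)
    for fix in fixes[:max_attempts]:
        error = build(fix)
        if error is None:
            return None  # build succeeded, nothing to escalate
        escalation.attempts.append(Attempt(fix, error))
    escalation.options = [
        "Revert to the last passing state",
        "Ask for guidance on the failing module",
    ]
    return escalation


# Usage: every fix fails, so the agent stops and reports after three attempts.
stuck = build_with_retries(
    "fix build",
    ["add missing import", "update type annotation", "rename symbol"],
    build=lambda fix: "build failed",
)
if stuck is not None:
    print(stuck.report())
```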

Progress Tracking

Terminal
## Task Progress

### Completed
- [x] Created user schema types
- [x] Added validation functions
- [x] Created API endpoint (verified with test)

### In Progress
- [ ] Creating frontend form (80% done, styling remaining)

### Remaining
- [ ] Integration tests
- [ ] OpenAPI documentation

### Issues Encountered
- Issue: Form validation not matching API validation
- Resolution: Extracted shared validation schema

Supervision Levels

Level 1: Full Supervision

Terminal
Before each step:
- Show plan
- Wait for approval
- Execute
- Report result

Best for: Learning the tool, high-risk tasks

Level 2: Checkpoint Supervision

Terminal
Work autonomously through defined phases.
Stop at checkpoints for review.

Best for: Medium complexity, moderate risk

Level 3: Outcome Supervision

Terminal
Work autonomously to completion.
Present final result for review.

Best for: Well-defined tasks, low risk, trusted tools
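
The three levels differ only in when the agent pauses for a human. A compact way to express that, with hypothetical names and under the assumption that the final result is always reviewed:

```python
from enum import Enum


class Supervision(Enum):
    FULL = 1        # approve every step
    CHECKPOINT = 2  # approve only at defined checkpoints
    OUTCOME = 3     # review only the final result


def needs_approval(level: Supervision, is_checkpoint: bool, is_final: bool) -> bool:
    """Decide whether the agent should pause for a human before continuing."""
    if level is Supervision.FULL:
        return True
    if level is Supervision.CHECKPOINT:
        return is_checkpoint or is_final
    return is_final
```

Treating the level as a single setting makes it easy to start a new tool at `FULL` and relax it as trust builds.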

Agentic Prompt Patterns

The Task Definition Pattern

Terminal
# Task: [Clear, specific task name]

## Context
[Relevant background information]

## Requirements
- [Requirement 1]
- [Requirement 2]

## Acceptance Criteria
- [ ] [How to know it's done]
- [ ] [Tests that must pass]

## Approach
[Suggested approach, or "Plan and propose approach"]

## Constraints
- [What not to do]
- [Limits on scope]

## Verification
[How to verify success]
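
If you write many task definitions, it can help to fill the template from structured data so that no section is forgotten. The following is a sketch under that assumption; `TaskDefinition` and `to_prompt` are hypothetical names, and the default values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskDefinition:
    name: str
    context: str
    requirements: List[str]
    acceptance_criteria: List[str]
    constraints: List[str] = field(default_factory=list)
    approach: str = "Plan and propose approach"
    verification: str = "Run the full test suite"

    def to_prompt(self) -> str:
        """Render the fields in the same section order as the template."""

        def bullets(items: List[str], prefix: str = "- ") -> str:
            return "\n".join(f"{prefix}{item}" for item in items)

        sections = [
            f"# Task: {self.name}",
            f"## Context\n{self.context}",
            f"## Requirements\n{bullets(self.requirements)}",
            f"## Acceptance Criteria\n{bullets(self.acceptance_criteria, '- [ ] ')}",
            f"## Approach\n{self.approach}",
            f"## Constraints\n{bullets(self.constraints)}",
            f"## Verification\n{self.verification}",
        ]
        return "\n\n".join(sections)


# Usage with illustrative values.
task = TaskDefinition(
    name="Add a copy to clipboard button to code blocks",
    context="Code blocks are rendered by an existing shared component.",
    requirements=["Button appears on every code block", "Copies the block text"],
    acceptance_criteria=["New tests pass", "Existing tests still pass"],
    constraints=["Do not add dependencies"],
)
prompt = task.to_prompt()
```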

The Multi-Step Pattern

Terminal
Complete this task in steps, verifying each step before continuing:

Step 1: [First step]
Verify: [How to verify]

Step 2: [Second step]
Verify: [How to verify]

Step 3: [Third step]
Verify: [How to verify]

If any step fails verification, stop and report the issue.

The Research-Then-Implement Pattern

Terminal
Phase 1: Research
- Read existing code in [directories]
- Understand current patterns
- Identify integration points
- Report findings before proceeding

Phase 2: Plan
- Propose implementation approach
- List files to create/modify
- Identify potential risks
- Wait for approval

Phase 3: Implement
- Follow approved plan
- Verify each change works
- Run tests continuously
- Report completion

Safety Considerations

What Agents Should Never Do

  • Deploy to production
  • Delete important files without confirmation
  • Modify authentication/authorization without review
  • Access or transmit sensitive data
  • Make irreversible changes
  • Install arbitrary packages

Sandboxing

Terminal
Run in isolated environment:
- Separate branch (not main)
- Test database (not production)
- Feature flags (disabled by default)
- No access to secrets
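
The "no access to secrets" item can be partly approximated by stripping secret-looking variables from the environment before running agent commands. This sketch is an assumption-heavy illustration, not real isolation: the marker list and function names are hypothetical, and a container or virtual machine is still needed to restrict file system and network access.

```python
import os
import subprocess
from typing import Dict, List, Mapping

# Hypothetical markers; extend them to match how your secrets are named.
SECRET_MARKERS = ("SECRET", "TOKEN", "PASSWORD", "API_KEY", "CREDENTIAL")


def scrubbed_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy env, dropping variables whose names look like secrets."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SECRET_MARKERS)
    }


def run_sandboxed(command: List[str], cwd: str = ".") -> subprocess.CompletedProcess:
    """Run a command with a scrubbed environment and a time limit."""
    return subprocess.run(
        command,
        cwd=cwd,
        env=scrubbed_env(os.environ),
        capture_output=True,
        text=True,
        timeout=60,
    )
```

Name-based filtering misses secrets stored under unusual names, so an allowlist of variables to keep is the stricter alternative.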

Human Review Points

Always require human review for:

  • Security-related code changes
  • Database schema changes
  • API contract changes
  • Dependency additions
  • Configuration changes
  • Any code going to production

Measuring Agentic Effectiveness

Metrics to Track

Terminal
Task Completion:
- Tasks completed autonomously: X%
- Required intervention: Y%
- Failed completely: Z%

Quality:
- First-attempt success rate
- Number of iterations needed
- Tests passing on completion

Efficiency:
- Time to completion vs manual
- Human review time required

Practice Exercise

Set up an agentic task with proper guardrails:

Task: Add a "copy to clipboard" button to code blocks

  1. Define the task with acceptance criteria
  2. Set permissions and constraints
  3. Specify checkpoints for review
  4. Define verification strategy
  5. Execute with agent
  6. Review results and iterate

Summary

  • Agentic AI can autonomously complete multi-step tasks
  • Define clear tasks with acceptance criteria
  • Set appropriate guardrails and permissions
  • Use verification at each step
  • Maintain human oversight at checkpoints
  • Review all generated code before merging

Next Steps

Let's wrap up with context management across workflows—how to maintain continuity in long-running AI sessions.
