Advanced · 22 min

Learn how to work with AI agents that can autonomously perform multi-step tasks—with appropriate oversight and guardrails.

Agentic Workflows

Agentic AI shifts the interaction from single request-response exchanges to autonomous task completion. Agents can plan, execute, and iterate on complex tasks with minimal human intervention, but they still need human oversight at defined points.

What Makes AI "Agentic"

Traditional AI assistance:

Terminal

Human: "Write a function to validate emails"
AI: [Returns function]
Human: "Now add tests"
AI: [Returns tests]
Human: "Run the tests"
AI: "I can't run commands"

Agentic AI:

Terminal

Human: "Create an email validation function with tests"
AI:
  1. Writes function
  2. Creates test file
  3. Runs tests
  4. Sees failures
  5. Fixes code
  6. Runs tests again
  7. Reports success

Agentic Capabilities

Modern AI agents can:

Capability	Example
Read files	Examine existing code for patterns
Write files	Create new code and configurations
Run commands	Execute builds, tests, linters
Search code	Find relevant files and patterns
Browse web	Research documentation and solutions
Iterate	Fix issues based on feedback
Plan	Break tasks into steps
Use tools	Query databases via MCP, call APIs

Types of AI Agents

Synchronous Agents (Interactive)

Run in your terminal or editor and complete tasks while you wait:

Claude Code: Terminal agent with extended thinking
OpenAI Codex: Multi-tool terminal agent
Cline: VS Code integrated agent
Aider: Terminal-based pair-programming agent

Asynchronous Agents (Background)

Work independently on complex tasks:

GitHub Copilot Coding Agent: Assign issues, get PRs
Devin: Autonomous software engineer (hours-long tasks)
Google Jules: Async coding agent

When to use async agents:

Tasks that would take hours
Work that doesn't require immediate review
Feature implementation while you focus elsewhere
Overnight or background processing

When to Use Agentic Workflows

Good Candidates

Multi-file refactoring
Adding features with tests
Fixing bugs across codebase
Upgrading dependencies
Code migrations
Documentation generation

Poor Candidates

Simple one-file changes (overhead not worth it)
Security-critical code (needs careful review)
Architecture decisions (needs human judgment)
Production deployments (too risky)

The Agentic Workflow Loop

Terminal

┌──────────────────────────────────────────┐
│              HUMAN OVERSIGHT             │
│  ┌────────────────────────────────────┐  │
│  │          AI AGENT LOOP             │  │
│  │                                    │  │
│  │  1. Understand Task                │  │
│  │         ↓                          │  │
│  │  2. Plan Approach                  │  │
│  │         ↓                          │  │
│  │  3. Execute Step                   │  │
│  │         ↓                          │  │
│  │  4. Verify Result                  │  │
│  │         ↓                          │  │
│  │  5. Success? → Done                │  │
│  │     Failure? → Adjust & Retry      │  │
│  │         ↑____________↓             │  │
│  │                                    │  │
│  └────────────────────────────────────┘  │
│                                          │
│  Review at checkpoints                   │
│  Approve critical changes                │
│  Intervene if needed                     │
└──────────────────────────────────────────┘
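
The loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent is implemented: `run_agent_loop`, `LoopResult`, and the toy `fake_execute` function are hypothetical names, and the limit of three attempts is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    succeeded: bool
    attempts: int
    history: list[str]


def run_agent_loop(
    execute: Callable[[int], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,  # assumed limit; tune per task
) -> LoopResult:
    """Execute a step, verify it, and retry until success or the attempt limit."""
    history: list[str] = []
    for attempt in range(1, max_attempts + 1):
        output = execute(attempt)   # step 3: execute
        history.append(output)
        if verify(output):          # step 4: verify
            return LoopResult(True, attempt, history)
        # step 5: verification failed, so the next iteration retries
    return LoopResult(False, max_attempts, history)


# Toy stand-in: the "agent" only produces passing output on its second try.
def fake_execute(attempt: int) -> str:
    return "tests passed" if attempt >= 2 else "tests failed"


result = run_agent_loop(fake_execute, verify=lambda out: out == "tests passed")
print(result.succeeded, result.attempts)  # True 2
```

When the loop returns with `succeeded` set to `False`, the human oversight layer takes over instead of letting the agent retry indefinitely.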

Setting Up Agentic Tasks

Clear Task Definition

Terminal

Task: Add user profile editing functionality

## Scope
- Frontend: Profile edit form
- Backend: PATCH /api/users/:id endpoint
- Validation: Name, email, avatar URL
- Tests: Unit tests for validation, integration test for API

## Constraints
- Must use existing form components
- Must follow REST conventions
- Must not change existing user routes

## Success Criteria
- All new tests pass
- Existing tests still pass
- Endpoint documented in OpenAPI spec

## Checkpoints (notify me)
- After creating API endpoint
- After creating form component
- After all tests pass

Defining Guardrails

Terminal

Permissions (allowed):
- Read any file
- Write to src/ and tests/ directories
- Run npm test and npm run lint

Restrictions (not allowed):
- Modify package.json
- Run npm install
- Delete files without confirmation
- Commit changes

Stop and ask if:
- Unsure about architectural decisions
- Need to modify more than 10 files
- Tests fail more than 3 times
- Encounter security-sensitive code
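
Guardrails written as prose depend on the agent honoring them. Where the tooling allows it, the same rules can also be enforced in code. The sketch below is one hypothetical way to do that; `can_write`, `can_run`, and `should_stop` are illustrative names, not part of any agent's API, and only the two numeric stop conditions are modeled.

```python
from pathlib import PurePosixPath

# Values taken from the guardrails above; the function names are hypothetical.
WRITABLE_DIRS = ("src", "tests")
ALLOWED_COMMANDS = {"npm test", "npm run lint"}
MAX_FILES_CHANGED = 10
MAX_TEST_FAILURES = 3


def can_write(path: str) -> bool:
    """Allow writes only inside src/ or tests/, rejecting absolute paths and '..'."""
    parts = PurePosixPath(path).parts
    if not parts or path.startswith("/") or ".." in parts:
        return False
    return parts[0] in WRITABLE_DIRS and len(parts) > 1


def can_run(command: str) -> bool:
    """Allow only exact matches against the command allowlist."""
    return " ".join(command.split()) in ALLOWED_COMMANDS


def should_stop(files_changed: int, test_failures: int) -> bool:
    """Stop and ask once either numeric limit is exceeded."""
    return files_changed > MAX_FILES_CHANGED or test_failures > MAX_TEST_FAILURES


print(can_write("src/api/users.ts"))      # True
print(can_write("src/../package.json"))   # False
print(can_run("npm install"))             # False
print(should_stop(files_changed=11, test_failures=0))  # True
```

An allowlist is used instead of a blocklist so that anything not explicitly permitted is refused by default.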

Verification Strategies

Continuous Verification

Terminal

After each code change:
1. Run linter
2. Run type checker
3. Run affected tests

If any fail:
1. Read error message
2. Identify cause
3. Fix issue
4. Re-verify
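
The verification sequence above could be scripted roughly as follows. This is a sketch under assumptions: `run_checks` is a hypothetical helper, and the three commands are placeholders that invoke the Python interpreter so the example runs anywhere. In a real project they would be your linter, type checker, and test runner.

```python
import subprocess
import sys


def run_checks(checks: list[tuple[str, list[str]]]) -> tuple[bool, str]:
    """Run checks in order; stop at the first failure and return its output."""
    for name, command in checks:
        proc = subprocess.run(command, capture_output=True, text=True)
        if proc.returncode != 0:
            return False, f"{name} failed:\n{proc.stdout}{proc.stderr}"
    return True, "all checks passed"


# Placeholder commands (assumptions): the last one simulates a failing test run.
CHECKS = [
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
    ("typecheck", [sys.executable, "-c", "print('types ok')"]),
    ("tests", [sys.executable, "-c", "import sys; sys.exit('1 test failed')"]),
]

ok, report = run_checks(CHECKS)
print(ok)
print(report)
```

Returning the captured output along with the pass/fail flag gives the agent the error message it needs for the "read error message, identify cause" steps.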

Test-Driven Agentic Work

Terminal

Task: Add password strength validation

Approach:
1. First, write tests for password validation
2. Run tests (should fail - TDD red phase)
3. Implement validation function
4. Run tests (should pass - green phase)
5. Refactor if needed
6. Run tests again to verify

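This ties every change to a passing test before the agent moves on. The result is only as trustworthy as the tests themselves, so review the tests as carefully as the implementation.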
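
As an illustration of the end state of that cycle, here is a small validator together with the tests that would have been written first. The strength rules (12 characters, mixed case, a digit) and the name `password_problems` are assumptions made for the example, not requirements from this lesson.

```python
import re


def password_problems(password: str) -> list[str]:
    """Return the reasons a password is too weak; an empty list means it passes."""
    problems = []
    if len(password) < 12:  # assumed minimum length
        problems.append("shorter than 12 characters")
    if not re.search(r"[a-z]", password):
        problems.append("no lowercase letter")
    if not re.search(r"[A-Z]", password):
        problems.append("no uppercase letter")
    if not re.search(r"\d", password):
        problems.append("no digit")
    return problems


# In a TDD flow these tests exist before the function does (red phase)
# and pass once the implementation above is in place (green phase).
def test_rejects_short_password():
    assert "shorter than 12 characters" in password_problems("Ab1")


def test_rejects_missing_digit():
    assert password_problems("NoDigitsInHere") == ["no digit"]


def test_accepts_strong_password():
    assert password_problems("Correct-Horse-42") == []


for test in (test_rejects_short_password, test_rejects_missing_digit, test_accepts_strong_password):
    test()
print("all tests passed")
```

Returning a list of reasons instead of a boolean lets the tests pin down which rule failed, which makes agent-written fixes easier to check.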

Iteration and Recovery

Handling Failures

When agents encounter issues:

Terminal

If build fails:
  1. Read error output
  2. Identify affected file/line
  3. Understand the error
  4. Apply fix
  5. Rebuild

If stuck (3+ attempts):
  1. Stop and report to human
  2. Explain what was tried
  3. Present options for resolution
  4. Wait for guidance
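
The "stuck" branch can be made concrete with a small record of what was tried. The sketch below is hypothetical: `FixSession` and its methods are illustrative names, and the sample problem and options are invented.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 3  # threshold from the text above


@dataclass
class FixSession:
    problem: str
    attempts: list[str] = field(default_factory=list)

    def record_attempt(self, description: str) -> None:
        self.attempts.append(description)

    def is_stuck(self) -> bool:
        return len(self.attempts) >= MAX_ATTEMPTS

    def escalation_report(self, options: list[str]) -> str:
        """Summarize what was tried and the options offered to the human."""
        lines = [f"Stuck on: {self.problem}", "Tried:"]
        lines += [f"  {i}. {attempt}" for i, attempt in enumerate(self.attempts, 1)]
        lines.append("Options:")
        lines += [f"  - {option}" for option in options]
        lines.append("Waiting for guidance.")
        return "\n".join(lines)


# Invented example data.
session = FixSession("build fails: missing type export")
for fix in ("re-export type from index", "update import path", "regenerate declarations"):
    session.record_attempt(fix)

if session.is_stuck():
    print(session.escalation_report(["Pin the previous version", "Ask the package owner"]))
```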

Progress Tracking

Terminal

## Task Progress

### Completed
- [x] Created user schema types
- [x] Added validation functions
- [x] Created API endpoint (verified with test)

### In Progress
- [ ] Creating frontend form (80% done, styling remaining)

### Remaining
- [ ] Integration tests
- [ ] OpenAPI documentation

### Issues Encountered
- Issue: Form validation not matching API validation
- Resolution: Extracted shared validation schema
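
If the agent keeps its task list as data, a report in the shape above can be generated instead of hand-written. A minimal sketch, assuming a hypothetical `Item` record with three status values:

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    status: str  # "completed", "in_progress", or "remaining"


def render_progress(items: list[Item]) -> str:
    """Group items by status and render them as a checklist."""
    sections = [
        ("Completed", "completed", "x"),
        ("In Progress", "in_progress", " "),
        ("Remaining", "remaining", " "),
    ]
    lines = ["## Task Progress"]
    for title, status, mark in sections:
        lines += ["", f"### {title}"]
        lines += [f"- [{mark}] {item.text}" for item in items if item.status == status]
    return "\n".join(lines)


ITEMS = [
    Item("Created user schema types", "completed"),
    Item("Added validation functions", "completed"),
    Item("Creating frontend form", "in_progress"),
    Item("Integration tests", "remaining"),
]
print(render_progress(ITEMS))
```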

Supervision Levels

Level 1: Full Supervision

Terminal

Before each step:
- Show plan
- Wait for approval
- Execute
- Report result

Best for: Learning the tool, high-risk tasks

Level 2: Checkpoint Supervision

Terminal

Work autonomously through defined phases.
Stop at checkpoints for review.

Best for: Medium complexity, moderate risk

Level 3: Outcome Supervision

Terminal

Work autonomously to completion.
Present final result for review.

Best for: Well-defined tasks, low risk, trusted tools
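
The three levels differ only in when the agent pauses for a human. One hypothetical way to encode that decision (the `Supervision` enum and `needs_approval` function are illustrative names):

```python
from enum import Enum


class Supervision(Enum):
    FULL = 1        # approve every step
    CHECKPOINT = 2  # approve at defined checkpoints
    OUTCOME = 3     # review only the final result


def needs_approval(level: Supervision, is_checkpoint: bool, is_final: bool) -> bool:
    """Decide whether the agent should pause for a human at this point."""
    if level is Supervision.FULL:
        return True
    if level is Supervision.CHECKPOINT:
        return is_checkpoint or is_final
    return is_final


print(needs_approval(Supervision.FULL, is_checkpoint=False, is_final=False))        # True
print(needs_approval(Supervision.CHECKPOINT, is_checkpoint=False, is_final=False))  # False
print(needs_approval(Supervision.OUTCOME, is_checkpoint=True, is_final=False))      # False
```

Every level still ends with a human review of the final result; lower supervision removes intermediate pauses, not the final check.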

Agentic Prompt Patterns

The Task Definition Pattern

Terminal

# Task: [Clear, specific task name]

## Context
[Relevant background information]

## Requirements
- [Requirement 1]
- [Requirement 2]

## Acceptance Criteria
- [ ] [How to know it's done]
- [ ] [Tests that must pass]

## Approach
[Suggested approach, or "Plan and propose approach"]

## Constraints
- [What not to do]
- [Limits on scope]

## Verification
[How to verify success]
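
Teams that reuse this template often may prefer to fill it from structured data. The sketch below shows one possible approach; `TaskDefinition` and `render` are hypothetical names, and the context and verification strings in the usage example are invented.

```python
from dataclasses import dataclass


@dataclass
class TaskDefinition:
    name: str
    context: str
    requirements: list[str]
    acceptance_criteria: list[str]
    constraints: list[str]
    verification: str
    approach: str = "Plan and propose approach"

    def render(self) -> str:
        """Render the task using the section layout of the template above."""
        def bullets(items: list[str], prefix: str = "- ") -> str:
            return "\n".join(f"{prefix}{item}" for item in items)

        sections = [
            f"# Task: {self.name}",
            f"## Context\n{self.context}",
            f"## Requirements\n{bullets(self.requirements)}",
            f"## Acceptance Criteria\n{bullets(self.acceptance_criteria, '- [ ] ')}",
            f"## Approach\n{self.approach}",
            f"## Constraints\n{bullets(self.constraints)}",
            f"## Verification\n{self.verification}",
        ]
        return "\n\n".join(sections)


task = TaskDefinition(
    name="Add user profile editing functionality",
    context="Users can view their profile but cannot edit it.",  # invented
    requirements=["Profile edit form", "PATCH /api/users/:id endpoint"],
    acceptance_criteria=["All new tests pass", "Existing tests still pass"],
    constraints=["Must not change existing user routes"],
    verification="Run the full test suite.",  # invented
)
prompt = task.render()
print(prompt)
```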

The Multi-Step Pattern

Terminal

Complete this task in steps, verifying each step before continuing:

Step 1: [First step]
Verify: [How to verify]

Step 2: [Second step]
Verify: [How to verify]

Step 3: [Third step]
Verify: [How to verify]

If any step fails verification, stop and report the issue.
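
The same stop-on-failure rule can be expressed as a small runner. This is a sketch with hypothetical names (`Step`, `run_steps`) and a toy `state` dictionary standing in for a real workspace; the second step deliberately does nothing so that its verification fails.

```python
from typing import Callable, NamedTuple, Optional


class Step(NamedTuple):
    name: str
    run: Callable[[], None]
    verify: Callable[[], bool]


def run_steps(steps: list[Step]) -> tuple[list[str], Optional[str]]:
    """Run steps in order. Return (completed step names, failed step name or None)."""
    completed: list[str] = []
    for step in steps:
        step.run()
        if not step.verify():
            return completed, step.name  # stop and report; later steps never run
        completed.append(step.name)
    return completed, None


# Toy state standing in for a real workspace.
state = {"schema": False, "endpoint": False, "docs": False}

steps = [
    Step("create schema", lambda: state.update(schema=True), lambda: state["schema"]),
    Step("add endpoint", lambda: None, lambda: state["endpoint"]),  # no-op, so verify fails
    Step("write docs", lambda: state.update(docs=True), lambda: state["docs"]),
]

completed, failed = run_steps(steps)
print(completed, failed)  # ['create schema'] add endpoint
```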

The Research-Then-Implement Pattern

Terminal

Phase 1: Research
- Read existing code in [directories]
- Understand current patterns
- Identify integration points
- Report findings before proceeding

Phase 2: Plan
- Propose implementation approach
- List files to create/modify
- Identify potential risks
- Wait for approval

Phase 3: Implement
- Follow approved plan
- Verify each change works
- Run tests continuously
- Report completion

Safety Considerations

What Agents Should Never Do

Deploy to production
Delete important files without confirmation
Modify authentication/authorization without review
Access or transmit sensitive data
Make irreversible changes
Install arbitrary packages

Sandboxing

Terminal

Run in an isolated environment:
- Separate branch (not main)
- Test database (not production)
- Feature flags (disabled by default)
- No access to secrets
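
For the last point, one partial measure is to strip credential-like variables from the environment before launching any command on the agent's behalf. The sketch below is a heuristic only: `scrubbed_env` and the marker list are assumptions, name matching will miss secrets stored under unusual names, and it does not replace real isolation such as a container or a separate account.

```python
import os
import subprocess
import sys

# Assumed naming conventions for credential variables; extend for your setup.
SECRET_MARKERS = ("SECRET", "TOKEN", "PASSWORD", "API_KEY")


def scrubbed_env(env: dict[str, str]) -> dict[str, str]:
    """Drop variables whose names suggest they hold credentials."""
    return {
        key: value
        for key, value in env.items()
        if not any(marker in key.upper() for marker in SECRET_MARKERS)
    }


# Demo: add a fake key, scrub it, and confirm the child process cannot see it.
env = scrubbed_env({**os.environ, "DEMO_API_KEY": "not-a-real-key"})
proc = subprocess.run(
    [sys.executable, "-c", "import os; print('DEMO_API_KEY' in os.environ)"],
    env=env,
    capture_output=True,
    text=True,
)
print(proc.stdout.strip())  # False
```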

Human Review Points

Always require human review for:

Security-related code changes
Database schema changes
API contract changes
Dependency additions
Configuration changes
Any code going to production
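
Some of these categories can be flagged automatically from the list of changed files. The following sketch assumes a hypothetical repository layout; the patterns in `REVIEW_RULES` and the function name `review_reasons` are illustrative and would need adjusting to your project. Path matching catches only the obvious cases, so it supplements human judgment and does not replace it.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns; adjust to the layout of your own repository.
REVIEW_RULES = {
    "security": ["*auth*", "*security*"],
    "database schema": ["migrations/*", "*.sql"],
    "API contract": ["openapi.*", "*/openapi.*"],
    "dependencies": ["package.json", "*/package.json"],
    "configuration": ["config/*", "*.env"],
}


def review_reasons(changed_files: list[str]) -> dict[str, list[str]]:
    """Map each changed file to the review categories it triggers, if any."""
    flagged = {}
    for path in changed_files:
        labels = [
            label
            for label, patterns in REVIEW_RULES.items()
            if any(fnmatchcase(path, pattern) for pattern in patterns)
        ]
        if labels:
            flagged[path] = labels
    return flagged


changed = ["src/auth/session.ts", "src/components/Button.tsx", "package.json"]
for path, labels in review_reasons(changed).items():
    print(f"{path}: needs review ({', '.join(labels)})")
```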

Measuring Agentic Effectiveness

Metrics to Track

Terminal

Task Completion:
- Tasks completed autonomously: X%
- Required intervention: Y%
- Failed completely: Z%

Quality:
- First-attempt success rate
- Number of iterations needed
- Tests passing on completion

Efficiency:
- Time to completion vs manual
- Human review time required
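
The task completion percentages are straightforward to compute from a log of outcomes. A minimal sketch, assuming each finished task is recorded as one of three hypothetical labels; the sample log is made up.

```python
from collections import Counter

OUTCOMES = ("autonomous", "intervention", "failed")


def completion_rates(results: list[str]) -> dict[str, float]:
    """Return the percentage of tasks per outcome, rounded to one decimal place."""
    if not results:
        return {outcome: 0.0 for outcome in OUTCOMES}
    counts = Counter(results)
    return {outcome: round(100 * counts[outcome] / len(results), 1) for outcome in OUTCOMES}


# Made-up sample log of ten tasks.
log = ["autonomous"] * 7 + ["intervention"] * 2 + ["failed"]
print(completion_rates(log))
# {'autonomous': 70.0, 'intervention': 20.0, 'failed': 10.0}
```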

Practice Exercise

Set up an agentic task with proper guardrails:

Task: Add a "copy to clipboard" button to code blocks

Define the task with acceptance criteria
Set permissions and constraints
Specify checkpoints for review
Define verification strategy
Execute with agent
Review results and iterate

Summary

Agentic AI can autonomously complete multi-step tasks
Define clear tasks with acceptance criteria
Set appropriate guardrails and permissions
Use verification at each step
Maintain human oversight at checkpoints
Review all generated code before merging

Next Steps

Let's wrap up with context management across workflows—how to maintain continuity in long-running AI sessions.
