03 - Agentic Workflows: Decomposing Problems for AI
There is a precise moment when AI stops being an assistant and becomes a collaborator: when you stop asking it to "write this function" and start asking it to "solve this problem". That mental shift - from issuing a single instruction to delegating a complex task - is at the heart of agentic workflows. It is also the point where most developers get stuck, because building an effective agentic workflow is not about writing longer prompts: it is about architecture.
In 2025, 92% of American developers use AI tools in their daily work (Stack Overflow Developer Survey 2025), but only a fraction of them truly leverage the potential of agentic systems. The problem is not access to technology: it is the lack of clear architectural patterns for decomposing complex problems into tasks that an AI agent can solve reliably, verifiably and repeatably.
In this article, we will build together a deep understanding of agentic workflows: from fundamental patterns (Sequential, Parallel, Hierarchical, Iterative) to practical implementation with Claude Code, including context management, quality metrics and the anti-patterns that turn a promising workflow into an unreliable system. All with real case studies and working code.
What You Will Learn
- The four fundamental decomposition patterns: Sequential, Parallel, Hierarchical, Iterative
- Agentic workflow architecture: Planner, Executor, Reviewer, Memory
- How to structure CLAUDE.md to guide agents on real projects
- The Plan-Execute-Review loop and self-healing workflows
- Tool use and context management for long agentic sessions
- Metrics to evaluate agentic workflow quality
- The most dangerous anti-patterns and how to avoid them
- Case study: Angular codebase refactoring with an agentic workflow
What Is an Agentic Workflow
An agentic workflow is a structured sequence of operations where one or more AI agents plan, execute, and verify actions to achieve a complex objective. Unlike a single LLM call, an agentic workflow has memory across steps, can use tools (file system, terminal, web, APIs), can delegate subtasks to specialized agents, and can adapt to errors without human intervention.
The key distinction from "naive" vibe coding lies in deliberate structure. Asking Claude to "refactor this codebase" produces mediocre results. Building a workflow that: (1) analyzes the codebase, (2) identifies components to refactor in priority order, (3) refactors one component at a time with regression tests, (4) verifies each step before proceeding, produces professional results. The difference is decomposition.
Operational Definition
An agentic workflow is reliable when: every step is independently verifiable, failure of one step does not corrupt the entire workflow, the final output is deterministic with respect to the input, and a human operator can inspect and correct the workflow at any checkpoint. This definition comes from the Anthropic "Building effective agents" framework (2024) and remains the compass for evaluating any implementation.
Problem Decomposition: The Core of Agentic Workflows
The TDAG research (Task Decomposition and Agent Generation, 2025) demonstrated that decomposition quality is the most predictive factor of multi-agent system success: wrong decomposition propagates errors exponentially through subsequent steps, while correct decomposition isolates failures and enables recovery.
There are four fundamental decomposition patterns, each suited to different types of problems:
Pattern 1: Sequential (Chain)
The simplest pattern: each task depends on the output of the previous one. Suited to linear workflows where order is semantically significant (e.g., analysis -> design -> implementation -> test).
Sequential Pattern:
[Input]
|
v
[Task A] --> output_A
|
v
[Task B] --> output_B
|
v
[Task C] --> [Final Output]
Characteristics:
- Each task receives previous output as context
- Task B failure blocks Task C
- Simple debugging: error localized to failed task
- Latency: sum of all times (A + B + C)
Typical use case: Code generation pipeline
1. Requirements analysis
2. Architectural design
3. Implementation
4. Test generation
5. Documentation
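The chain above can be sketched in a few lines of TypeScript. This is a minimal illustration, not a real agent framework: each `Step` stands in for an agent call, and the pipeline stubs are placeholders.

```typescript
// Minimal sketch of the Sequential (Chain) pattern.
// Each step receives the previous step's output as input;
// a failure stops the chain at the failed step.
type Step<I, O> = { name: string; run: (input: I) => O };

function runChain(input: string, steps: Step<string, string>[]): string {
  let current = input;
  for (const step of steps) {
    try {
      current = step.run(current); // output of A becomes input of B
    } catch (err) {
      // Failure is localized: we know exactly which step broke.
      throw new Error(`Step "${step.name}" failed: ${err}`);
    }
  }
  return current;
}

// Illustrative pipeline: analysis -> design -> implementation (stubbed).
const pipeline: Step<string, string>[] = [
  { name: "analyze", run: (req) => `requirements(${req})` },
  { name: "design", run: (a) => `design(${a})` },
  { name: "implement", run: (d) => `code(${d})` },
];

const result = runChain("login feature", pipeline);
// result === "code(design(requirements(login feature)))"
```

Note how the try/catch makes the "simple debugging" property concrete: the error message names the exact step that failed.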
Pattern 2: Parallel (Scatter-Gather)
Multiple independent tasks are executed simultaneously by separate agents, with an aggregator collecting and synthesizing results. Dramatically reduces latency but requires subtasks to be truly independent (no shared mutable state).
Parallel Pattern (Scatter-Gather):
[Input]
|
[Orchestrator/Splitter]
/ | \
v v v
[Task A] [Task B] [Task C]
out_A out_B out_C
\ | /
v v v
[Aggregator/Merger]
|
[Final Output]
Characteristics:
- Tasks A, B, C run in parallel (via sub-agents)
- Latency: max(A, B, C) instead of A + B + C
- Partial failure: manageable if aggregator is robust
- Risk: race conditions on shared resources
Typical use case: Multi-dimensional review
Agent 1: Security review
Agent 2: Performance analysis
Agent 3: Code style check
Agent 4: Test coverage analysis
Aggregator: Synthesize findings
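The scatter-gather flow maps naturally onto `Promise.allSettled`. The sketch below uses stubbed review tasks (the agent names and findings are illustrative, not real sub-agent calls); the key design point is that the aggregator tolerates partial failure.

```typescript
// Minimal sketch of the Parallel (Scatter-Gather) pattern.
// Independent review tasks run concurrently; an aggregator merges results.
type Finding = { agent: string; issues: string[] };

async function scatterGather(
  tasks: Array<() => Promise<Finding>>
): Promise<string> {
  // Scatter: allSettled tolerates partial failure, which keeps the
  // aggregator robust (a key requirement of this pattern).
  const settled = await Promise.allSettled(tasks.map((t) => t()));
  const ok = settled.filter(
    (s): s is PromiseFulfilledResult<Finding> => s.status === "fulfilled"
  );
  // Gather: synthesize only the findings that succeeded.
  return ok
    .map((s) => `${s.value.agent}: ${s.value.issues.length} issue(s)`)
    .join("; ");
}

// Stubbed review agents standing in for real sub-agent calls.
const reviewTasks = [
  async () => ({ agent: "security", issues: ["token in localStorage"] }),
  async () => ({ agent: "performance", issues: [] as string[] }),
  async () => ({ agent: "style", issues: ["inconsistent naming"] }),
];
```

Latency here is max(A, B, C) rather than the sum, exactly as in the diagram above.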
Pattern 3: Hierarchical (Supervisor-Worker)
A supervisor agent decomposes the problem into subtasks and delegates to specialized workers. Workers can in turn have sub-workers. It is the most powerful pattern for large-scale problems but also the most complex to debug. LangGraph documented this pattern as the most adopted in 2025 for enterprise systems.
Hierarchical Pattern:
[Planner Agent]
"Refactor auth module"
/ | \
v v v
[Backend Agent] [Frontend] [Test Agent]
"Refactor "Update "Update
AuthService" LoginCmp" test suite"
| | |
[sub-tasks] [sub-tasks] [sub-tasks]
| | |
done done done
\ | /
v v v
[Planner: Merge & Verify]
|
[Final Output]
Typical levels in real systems:
L0: Problem Planner (decomposes global goal)
L1: Domain Agents (backend, frontend, infra)
L2: Task Workers (individual files, functions)
L3: Tool Calls (bash, file system, tests)
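The supervisor-worker relationship can be sketched as follows. The decomposition is hardcoded for illustration (a real L0 planner would derive it from the goal), and the workers are stubs:

```typescript
// Minimal sketch of the Hierarchical (Supervisor-Worker) pattern.
// A supervisor decomposes the goal, delegates to named workers,
// then merges and verifies the results.
type Worker = (subtask: string) => string;

const workers: Record<string, Worker> = {
  backend: (t) => `backend done: ${t}`,
  frontend: (t) => `frontend done: ${t}`,
  test: (t) => `tests green: ${t}`,
};

function supervise(goal: string): string {
  // L0 Planner: static decomposition for the sketch.
  const plan: Array<{ worker: keyof typeof workers; subtask: string }> = [
    { worker: "backend", subtask: "refactor AuthService" },
    { worker: "frontend", subtask: "update LoginComponent" },
    { worker: "test", subtask: "update test suite" },
  ];
  // L1/L2: delegate each subtask to its specialized worker.
  const results = plan.map((p) => workers[p.worker](p.subtask));
  // Merge & Verify: the supervisor checks every worker reported completion.
  if (results.some((r) => !r.includes("done") && !r.includes("green"))) {
    throw new Error(`Goal "${goal}" incomplete`);
  }
  return results.join(" | ");
}
```

The verify step in the supervisor is what distinguishes this pattern from blind delegation: no worker output reaches the final merge unchecked.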
Pattern 4: Iterative (ReAct / Reflexion)
The agent operates in a loop: executes an action, observes the result, reflects on the current state, and decides the next step. This is the pattern of the ReAct framework (Reasoning + Acting) and its extension Reflexion (which adds an explicit critique). Suited to exploratory problems where the solution path is not known in advance.
Iterative Pattern (ReAct + Reflexion):
[Goal]
|
v
[Think] <-----------+
| |
v |
[Act / Tool Use] | (if max_iter not
| | reached and goal
v | not satisfied)
[Observe] |
| |
v |
[Critique/Reflect]--+
|
(if goal satisfied)
|
v
[Output]
Key elements:
- Scratchpad: memory of previous steps
- Stop condition: goal reached OR max_iterations
- Critique: explicit evaluation of partial output
- Tool repertoire: set of tools available to the agent
Use case: Debugging a failing test
1. Read error message
2. Analyze related code
3. Formulate hypothesis
4. Apply fix
5. Run test
6. If still failing: return to step 2
7. If passing: write explanation
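The loop structure itself fits in a few lines. In this sketch the "environment" is a stub that succeeds on the third attempt, standing in for a real act-and-observe cycle (running a test, reading its output):

```typescript
// Minimal sketch of the Iterative (ReAct/Reflexion) loop:
// think -> act -> observe -> critique, with a scratchpad and a stop condition.
type Observation = { goalSatisfied: boolean; detail: string };

function reactLoop(
  act: (step: number) => Observation,
  maxIter = 5
): { iterations: number; scratchpad: string[]; success: boolean } {
  const scratchpad: string[] = []; // memory of previous steps
  for (let i = 1; i <= maxIter; i++) {
    const obs = act(i); // Act + Observe
    scratchpad.push(`iter ${i}: ${obs.detail}`); // record for reflection
    if (obs.goalSatisfied) {
      return { iterations: i, scratchpad, success: true };
    }
    // A Critique/Reflect phase would analyze the scratchpad here
    // before deciding the next action.
  }
  return { iterations: maxIter, scratchpad, success: false };
}

// Stub environment: the failing test passes on the third attempt.
const outcome = reactLoop((step) =>
  step < 3
    ? { goalSatisfied: false, detail: "test still failing" }
    : { goalSatisfied: true, detail: "test passing" }
);
// outcome.success === true, outcome.iterations === 3
```

The two stop conditions from the diagram (goal satisfied OR max_iterations reached) are both explicit in the loop.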
Agentic Workflow Architecture
Regardless of the chosen pattern, a mature agentic workflow has four fundamental components. Understanding them is the prerequisite for building robust systems in production.
1. Planner
The Planner receives the high-level goal and transforms it into a structured plan: a sequence (or DAG) of subtasks with dependencies, agent assignments, success criteria for each step, and resource estimates (token budget, required tools). A good Planner produces verifiable plans: every step has a well-defined expected output.
2. Executor
The Executor takes individual tasks from the Planner and executes them, using available tools:
file system, bash, web search, APIs. Each specialized Executor (backend agent, test agent,
doc agent) has access only to the tools necessary for its domain, following the principle of
least privilege. Claude Code implements this through the permissions system
and custom sub-agents with configurable allowedTools.
3. Reviewer
The Reviewer verifies that each Executor's output satisfies the success criteria defined by the Planner. It is not a simple "looks good": a quality Reviewer runs automated tests, static analysis, regression checks. The Reviewer can approve (workflow proceeds), request modifications (Executor retries), or escalate to a human (mandatory checkpoint).
4. Memory
Memory manages context across workflow steps. It has two levels:
- Short-term (in-context): the content of the current context window, including outputs from previous steps. Limited by available tokens.
- Long-term (external): state files (e.g., claude-progress.txt), databases, git history. Enables resuming interrupted workflows across different sessions.
The claude-progress.txt Pattern
Anthropic recommends using a claude-progress.txt file in the project root for
inter-session memory. The initializer agent writes workflow state at each checkpoint; the
subsequent agent reads this file to understand where work stands and what to do next. Combined
with git log, it provides full context without saturating the context window.
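The pattern above can be sketched as a small checkpoint log. The JSON-lines format and helper names here are illustrative assumptions, not a Claude Code API: what matters is that each session appends its state and the next session reads the last entry.

```typescript
// Minimal sketch of the claude-progress.txt pattern: a checkpoint
// writer and reader for inter-session memory.
import { writeFileSync, readFileSync, mkdtempSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

type Checkpoint = { step: string; filesModified: string[]; testsPass: boolean };

function writeProgress(path: string, cp: Checkpoint): void {
  // Append-style log: one JSON line per checkpoint, so a later
  // session can replay the workflow history in order.
  writeFileSync(path, JSON.stringify(cp) + "\n", { flag: "a" });
}

function lastCheckpoint(path: string): Checkpoint {
  const lines = readFileSync(path, "utf8").trim().split("\n");
  return JSON.parse(lines[lines.length - 1]);
}

// Simulate two sessions sharing state through the file.
const dir = mkdtempSync(join(tmpdir(), "agent-"));
const progress = join(dir, "claude-progress.txt");
writeProgress(progress, { step: "analysis", filesModified: [], testsPass: true });
writeProgress(progress, {
  step: "refactor AuthService",
  filesModified: ["auth.service.ts"],
  testsPass: true,
});
const resume = lastCheckpoint(progress);
// resume.step === "refactor AuthService"
```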
Practical Implementation with Claude Code
Claude Code offers three main levers for building agentic workflows: the
CLAUDE.md file for guiding agent behavior, the Task tool
for delegating to sub-agents, and custom agents (defined in
.claude/agents/) for specialization. Let us see how to combine them.
CLAUDE.md Structure for Agentic Workflows
The CLAUDE.md file is your project's "constitution" for AI agents. A CLAUDE.md designed for agentic workflows includes not only project information, but the workflow structure itself: which agents exist, how they coordinate, what the success criteria are for each phase.
# CLAUDE.md - Agentic Workflow for Angular Project
## Project
Portfolio Angular with SSR, Angular 21, TypeScript strict.
Stack: Angular, Firebase, SCSS.
## Available Agentic Workflows
### Workflow: Feature Development
Use this workflow to implement new features:
**Phase 1 - Planning** (mandatory):
- Read all files related to the feature
- Create `docs/plans/[feature-name].md` with:
- Components to create/modify
- TypeScript interfaces needed
- Tests to write (TDD: write tests first)
- Dependencies and risks
- DO NOT implement until plan is approved
**Phase 2 - TDD Implementation**:
- Write unit tests BEFORE implementation
- Implement the minimum needed to pass tests
- Then refactor
- Verify `npm test` passes without errors
**Phase 3 - Review**:
- Run `npm run lint`
- Verify SSR build compiles: `npm run build`
- Check no regressions
### Workflow: Refactoring
For refactoring existing components:
1. Create branch: `git checkout -b refactor/[name]`
2. Analyze component dependencies with grep/Glob
3. Refactor ONE component at a time
4. Verify tests after each component
5. Atomic commit for each component
### Workflow: Debug
For bug fixing:
1. Reproduce the bug with a failing test
2. Identify root cause (do NOT fix symptoms)
3. Apply minimal fix
4. Verify that the test now passes
5. Check for regressions
## Specialized Agents
Available in `.claude/agents/`:
- `architect.md`: For architectural decisions
- `security-reviewer.md`: Before every commit
- `code-reviewer.md`: After every implementation
- `tdd-guide.md`: For TDD workflow
## Global Success Criteria
- TypeScript strict: zero errors `tsc --noEmit`
- Test coverage: minimum 80%
- SSR Build: `npm run build` must complete without errors
- No implicit `any`
- No state mutation (immutability pattern)
## Error Handling
If a command fails:
1. Read the complete error
2. DO NOT proceed to the next step
3. Identify and resolve before continuing
4. If unable after 2 attempts: STOP and ask for clarification
Prompt Engineering for Task Decomposition
Decomposition quality depends directly on the quality of the initial prompt. Here is a tested template for guiding an agent through complex task decomposition:
# Prompt Template: Task Decomposition
## Context
You are a Planning Agent. Your task is to decompose the following goal
into concrete, verifiable subtasks assignable to specialized agents.
## Goal
[GOAL DESCRIPTION]
## Constraints
- Each subtask must have: ID, description, expected input, expected output,
responsible agent, success criteria (automatically verifiable)
- Subtasks must be ordered by dependencies (DAG)
- No subtask should take more than [X] minutes / [Y] tokens
- Define mandatory checkpoints where a human must approve
## Expected Output
Produce a structured plan in this JSON format:
{
"goal": "goal description",
"estimated_complexity": "low|medium|high",
"subtasks": [
{
"id": "T001",
"description": "Analyze AuthService component structure",
"agent": "analyzer",
"inputs": ["src/app/services/auth.service.ts"],
"outputs": ["docs/analysis/auth-service.md"],
"success_criteria": ["file created", "contains sections: deps, interfaces, methods"],
"depends_on": [],
"estimated_tokens": 8000
},
{
"id": "T002",
"description": "Write tests for AuthService",
"agent": "tdd-agent",
"inputs": ["docs/analysis/auth-service.md", "src/app/services/auth.service.ts"],
"outputs": ["src/app/services/auth.service.spec.ts"],
"success_criteria": ["npm test -- --testPathPattern=auth.service passes"],
"depends_on": ["T001"],
"estimated_tokens": 12000
}
],
"checkpoints": ["after T001: plan review", "after T003: implementation review"],
"rollback_strategy": "git stash before every destructive change"
}
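Before handing a plan like this to Executors, it is worth validating it mechanically. The sketch below (an assumption, not part of any framework) checks the two structural invariants the template demands: every `depends_on` references a known subtask, and the dependencies form a DAG.

```typescript
// Sketch of a validator for the plan format above, using Kahn's
// algorithm to detect cycles and produce a safe execution order.
type Subtask = { id: string; depends_on: string[] };

function validatePlan(subtasks: Subtask[]): string[] {
  const ids = new Set(subtasks.map((t) => t.id));
  for (const t of subtasks) {
    for (const dep of t.depends_on) {
      if (!ids.has(dep)) throw new Error(`${t.id} depends on unknown ${dep}`);
    }
  }
  // Kahn's algorithm: repeatedly schedule tasks whose deps are all done.
  const done = new Set<string>();
  const order: string[] = [];
  let remaining = [...subtasks];
  while (remaining.length > 0) {
    const ready = remaining.filter((t) => t.depends_on.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("cycle detected in plan");
    for (const t of ready) {
      done.add(t.id);
      order.push(t.id);
    }
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return order; // a valid execution order for the Executor
}

const order = validatePlan([
  { id: "T001", depends_on: [] },
  { id: "T002", depends_on: ["T001"] },
]);
// order: ["T001", "T002"]
```

Catching a malformed plan before execution is far cheaper than discovering mid-workflow that task T005 is waiting on a task that can never run.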
Using the Task Tool for Sub-Agents
Claude Code exposes the Task tool to delegate work to sub-agents. Each sub-agent operates in an isolated context with its own context window, which allows managing workflows that exceed the limits of a single session. Anthropic research (2026 Agentic Coding Trends Report) indicates the most effective pattern uses Opus for orchestration and Sonnet for workers, reducing costs by 40-60% while maintaining quality.
# Prompt for orchestrating sub-agents with Task tool
I need to refactor the authentication module of this Angular app.
I have analyzed the codebase and identified these independent parallel tasks:
**Task 1 - Security Review** (use Task tool):
Prompt: "Read src/app/services/auth.service.ts and all files that import it.
Analyze security vulnerabilities: token storage, session management,
CSRF protection. Produce a markdown report in docs/security/auth-review.md
with priorities: CRITICAL, HIGH, MEDIUM, LOW."
Tools: Read, Grep, Write
**Task 2 - Test Coverage Analysis** (use Task tool):
Prompt: "Analyze src/app/services/auth.service.spec.ts vs auth.service.ts.
Identify uncovered functions. Produce list in docs/analysis/test-gaps.md"
Tools: Read, Glob, Grep, Write
**Task 3 - Dependency Graph** (use Task tool):
Prompt: "Map all AuthService dependencies using grep and glob.
Create docs/analysis/auth-deps.md with ASCII dependency graph."
Tools: Read, Glob, Grep, Write
Run the 3 Tasks in parallel. When all complete, read the 3 reports
and produce docs/plans/auth-refactoring.md with the consolidated
refactoring plan, ordered by priority.
Advanced Workflow Patterns
Plan-Execute-Review Loop
The PER (Plan-Execute-Review) loop is the most robust pattern for complex workflows. Each iteration produces a verifiable artifact before proceeding to the next. The key is that the Review step is not optional: it is the mechanism that prevents error propagation.
Plan-Execute-Review Loop:
ITERATION 1:
Plan: "Analyze AuthService and create refactoring plan"
Execute: Agent reads code, writes docs/plans/auth.md
Review: Verify docs/plans/auth.md exists with required sections
-> PASS: proceed to iteration 2
-> FAIL: retry Execute (max 2 times), then escalate
ITERATION 2:
Plan: "Write tests for AuthService (based on plan)"
Execute: Agent writes auth.service.spec.ts
Review: `npm test -- --testPathPattern=auth` must pass
-> PASS: proceed to iteration 3
-> FAIL: agent debug + retry
ITERATION 3:
Plan: "Refactor AuthService (TDD: tests must stay green)"
Execute: Agent modifies auth.service.ts
Review: `npm test` + `tsc --noEmit` + `npm run lint`
-> PASS: proceed to iteration 4
-> FAIL: `git checkout src/app/services/auth.service.ts` + retry
ITERATION 4:
Plan: "Update components using AuthService"
Execute: Agent updates LoginComponent, ProfileComponent, etc.
Review: `npm run build` (SSR build completes)
-> PASS: workflow completed
-> FAIL: rollback + root cause analysis
Loop Metrics:
- Retry rate per iteration (ideal: <20%)
- Tokens consumed per iteration
- Time per iteration
- Overall success rate
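A single PER iteration with the retry policy described above can be sketched as follows; `execute` and `review` are stubs standing in for a real agent step and its automated check:

```typescript
// Minimal sketch of one Plan-Execute-Review iteration:
// retry Execute at most twice, then escalate to a human.
type Review = "PASS" | "FAIL";

function perIteration(
  execute: () => void,
  review: () => Review,
  maxRetries = 2
): "PASS" | "ESCALATE" {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    execute();
    if (review() === "PASS") return "PASS"; // artifact verified, proceed
  }
  // Review never passed: stop the workflow and hand off to a human.
  return "ESCALATE";
}

// Stub: execution succeeds only on the second attempt.
let attempts = 0;
const status = perIteration(
  () => { attempts++; },
  () => (attempts >= 2 ? "PASS" : "FAIL")
);
// status === "PASS", attempts === 2
```

The structural point is that Review gates progress: there is no code path from Execute to the next iteration that bypasses it.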
Multi-Agent Code Review Pipeline
A multi-agent code review pipeline is one of the most mature use cases for agentic workflows in 2025. Each specialized agent brings a different perspective, and the aggregation produces a more complete review than any single agent could deliver.
Multi-Agent Code Review Pipeline:
Input: Pull Request with codebase changes
PHASE 1 - Parallel Review (4 agents in parallel):
Agent A: Security Reviewer
- Looks for: SQL injection, XSS, CSRF, exposed secrets
- Tools: Read, Grep (patterns: hardcoded secrets, eval, innerHTML)
- Output: security-report.md (CRITICAL/HIGH/MEDIUM/LOW)
Agent B: Performance Analyst
- Looks for: memory leaks, N+1 queries, bundle size impact
- Tools: Read, Glob, Bash (npm run analyze)
- Output: performance-report.md
Agent C: Type Safety Checker
- Looks for: implicit any, unsafe type assertions, missing null checks
- Tools: Read, Bash (tsc --noEmit --strict)
- Output: types-report.md
Agent D: Test Coverage
- Verifies: new functions have tests, edge cases covered
- Tools: Read, Bash (npm test -- --coverage)
- Output: coverage-report.md
PHASE 2 - Synthesis (1 agent):
Input: the 4 parallel reports
Task: Synthesize into PR-review.md with:
- Issues grouped by severity
- Blockers (no merge until resolved)
- Suggestions (optional but recommended)
- Positive findings (what is done well)
PHASE 3 - Human Checkpoint:
Developer reads PR-review.md and decides:
- Merge as is (zero blockers)
- Fix blockers and re-run pipeline
- Request clarification on specific issues
Self-Healing Workflows: Retry and Fallback
Production workflows fail. The question is not "if" but "when" and "how to recover." A self-healing workflow implements retry strategies with exponential backoff, automatic rollback to the last valid checkpoints, and fallback to alternative strategies when the primary path fails repeatedly.
Self-Healing Workflow Pattern:
STRATEGY 1: Retry with Backoff
Attempts: 1, 2, 3 (max)
Wait: 0s, 30s, 120s
Retry condition: transient error (timeout, rate limit)
NO retry condition: logic error (file not found, syntax error)
STRATEGY 2: Checkpoint + Rollback
Before every destructive change:
$ git stash push -m "checkpoint-[step-id]-[timestamp]"
If step fails after max retries:
$ git stash pop # rollback to previous state
-> Notify human with error context
STRATEGY 3: Alternative Path
Primary step: Refactor with TypeScript strict
If fails 3 times:
Fallback 1: Refactor with non-strict TypeScript + TODO comments
If still failing:
Fallback 2: Document the problem instead of solving
Escalate: create docs/issues/[step-id]-blocked.md
STRATEGY 4: Failure Isolation
10-step workflow:
Steps 1-5: completed successfully
Step 6: fails
-> DO NOT undo steps 1-5
-> Save "partially completed" state
-> Resume from step 6 after manual fix
IMPLEMENTATION in Claude Code:
"If the command fails, DO NOT proceed to the next step.
Before any file modifications, run:
git stash push -m 'pre-[description]'
If after 2 attempts the task does not work, STOP.
Create docs/blocked/[task-id].md with:
- Command that failed
- Complete error output
- Hypotheses about the cause
- What was completed before the block"
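STRATEGY 1 can be made concrete in a few lines. In this sketch the backoff delays are computed rather than slept, so the policy is easy to inspect and test; the `TransientError` class and the retry factor are illustrative choices:

```typescript
// Sketch of retry-with-backoff: transient errors are retried with
// growing delays; logic errors fail fast and are never retried.
class TransientError extends Error {}

function retrySchedule(maxAttempts = 3, baseMs = 30_000, factor = 4): number[] {
  // Waits before attempts 1..N: 0s, 30s, 120s with the defaults above.
  return Array.from({ length: maxAttempts }, (_, i) =>
    i === 0 ? 0 : baseMs * factor ** (i - 1)
  );
}

function runWithRetry<T>(op: (attempt: number) => T, maxAttempts = 3): T {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return op(attempt);
    } catch (err) {
      // Logic errors (file not found, syntax error): fail immediately.
      if (!(err instanceof TransientError)) throw err;
      // Out of retries: surface the error so the workflow can escalate.
      if (attempt === maxAttempts) throw err;
    }
  }
  throw new Error("unreachable");
}

// Stub: a rate-limit error that clears on the second attempt.
const value = runWithRetry((attempt) => {
  if (attempt < 2) throw new TransientError("rate limit");
  return "ok";
});
// value === "ok"
```

Distinguishing transient from logic errors is the heart of the strategy: retrying a syntax error three times only burns tokens.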
Tool Use and Context Management
Context management is probably the most subtle technical challenge in agentic workflows. A poorly managed context window produces degraded results, omissions, and hallucinations. A well-managed context window enables agentic sessions lasting hours across codebases of thousands of files.
Token Budget and Prioritization
Claude Code operates with a 200,000-token context window, but consuming all of it in a single session is an anti-pattern. Anthropic research suggests operating in the 60-80% range of the maximum context window to maintain consistent quality. Beyond 80%, response quality degrades significantly.
Context Management Strategies:
1. PROGRESSIVE SUMMARIZATION
After each completed step:
"Update claude-progress.txt with a concise summary of this step:
- What was done
- Modified files (list)
- Test status (pass/fail)
- Expected next step
MAX 200 words. Then use /compact to compress the conversation."
2. SELECTIVE LOADING
Do NOT read all project files at the start.
Read only files relevant to the current task:
- Use Glob to identify files by pattern
- Use Grep to find specific dependencies
- Read files only when necessary (lazy loading)
3. EXTERNAL STATE
Persistent state files (survive across sessions):
- claude-progress.txt: current workflow state
- docs/plans/[feature].md: approved plan
- docs/analysis/[component].md: completed analyses
Session start: "Read claude-progress.txt and active docs/plans/.
Tell me where we are in the workflow and what to do now."
4. EFFICIENT TOOL CHAINING
INSTEAD OF: Read 20 files + analyze everything
DO THIS:
1. Grep for specific pattern (find relevant files)
2. Glob for directory structure
3. Read ONLY identified relevant files
4. Process
Typical savings: 60-70% tokens
5. CHECKPOINT COMPACTION
Every 5-10 complex steps, use /compact in Claude Code.
Then reload context from claude-progress.txt.
Keep the session fresh for remaining tasks.
Anti-Pattern: Tool Call Storm
A common mistake is asking the agent to read files one by one with explicit loops.
This generates hundreds of tool calls and quickly saturates the context. Instead, use
patterns like Glob + Grep to identify relevant files, then read only those.
Claude Code can execute multiple tool calls in parallel when they are independent: this
feature reduces latency by 40-60% compared to sequential calls.
Metrics and Evaluation of Agentic Workflows
"It works" is not a metric. To evaluate and improve an agentic workflow, you need quantitative metrics across four dimensions: reliability, output quality, efficiency, and security.
Metrics Framework for Agentic Workflows:
RELIABILITY
- Task Completion Rate (TCR): % tasks completed without human intervention
Target: >85% for production workflows
- Retry Rate: % steps requiring more than 1 attempt
Target: <20% per step (>20% indicates poorly specified task)
- Escalation Rate: % steps escalated to human
Target: depends on domain risk (5-30%)
OUTPUT QUALITY
- Test Pass Rate: % tests passing after workflow
Target: 100% (zero regressions acceptable)
- Build Success Rate: % SSR builds completing without errors
Target: 100%
- Code Review Score: score from code-reviewer agent (1-10)
Target: >7 before merge
EFFICIENCY
- Tokens per Task: tokens consumed / task completed
Baseline: measure in first 10 executions
- End-to-End Latency: total workflow time
Optimize with parallelism when bottleneck identified
- Cost per Feature: total API cost / feature implemented
Target: business-defined, typical $0.10 - $2.00
SECURITY
- Destructive Operations Count: ops that modify/delete data
Flag every operation with rm, DROP, DELETE, overwrite
- Unauthorized Tool Use: tool calls not permitted by CLAUDE.md
Target: 0 (monitored via hooks)
- Secret Exposure: secrets in generated code
Automated check with grep patterns
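Computing the reliability metrics from a run log is straightforward. The log shape below is an illustrative assumption; real entries would come from your orchestrator or from claude-progress.txt:

```typescript
// Sketch of computing the reliability metrics above from a workflow run log.
type StepLog = { attempts: number; completed: boolean; escalated: boolean };

function reliabilityMetrics(log: StepLog[]) {
  const n = log.length;
  return {
    // Task Completion Rate: % steps completed without human intervention
    tcr: log.filter((s) => s.completed && !s.escalated).length / n,
    // Retry Rate: % steps that needed more than one attempt
    retryRate: log.filter((s) => s.attempts > 1).length / n,
    // Escalation Rate: % steps handed to a human
    escalationRate: log.filter((s) => s.escalated).length / n,
  };
}

const m = reliabilityMetrics([
  { attempts: 1, completed: true, escalated: false },
  { attempts: 2, completed: true, escalated: false },
  { attempts: 3, completed: false, escalated: true },
  { attempts: 1, completed: true, escalated: false },
]);
// m.tcr === 0.75, m.retryRate === 0.5, m.escalationRate === 0.25
```

Measured over a few dozen runs, these three numbers tell you immediately whether a workflow is production-ready or whether a specific step needs a tighter specification.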
Anti-Patterns to Avoid
The 2025 literature on agentic systems has identified recurring failure patterns. Knowing them in advance is the most efficient way to build robust workflows.
1. Over-Decomposition
Breaking a task into too many subtasks creates coordination overhead that exceeds the benefit. If a task takes less than 5 minutes and 3,000 tokens, it probably does not make sense to delegate it to a separate sub-agent. Multi-agent research shows that systems with more than 5-7 agents active simultaneously tend to have exponentially higher error rates (the so-called "17x error trap" documented by Towards Data Science, 2025).
Over-Decomposition: Example
Wrong: Creating 15 sub-agents to refactor 15 functions in a single 200-line
file. The coordination overhead (context setup for each agent, result merging, conflict
management) exceeds the time a single agent would have taken.
Correct: A single agent reads the file, identifies the 15 functions,
refactors them sequentially with intermediate tests. Sub-agents only for truly independent
files/modules of significant size.
2. Under-Specification
Vaguely described tasks produce unpredictable outputs. "Improve the code" is not a task: it is a hope. Every task must specify: what to do, on which files, what constraints to respect, how to verify success. Under-specification is the number 1 cause of workflows that appear to work but produce poor-quality output.
3. Context Pollution
Loading too much irrelevant context into the context window (unnecessary files, previous conversations, verbose documentation) degrades response quality. The "lost in the middle" phenomenon - documented in LLM research in 2024 - shows that LLMs pay less attention to information in the middle of the context window. Keep the context clean, focused, and structured.
4. Missing Rollback Strategy
The 2025 Replit incident - where an agent deleted a production database - is the most cited example of a workflow without a rollback strategy. Every destructive operation (delete, overwrite, DROP) must have an undo mechanism: git stash, backup, reversible transactions. "The agent knew what it was doing" is not a disaster recovery strategy.
5. No Human-in-the-Loop
Fully automated workflows without human checkpoints are appropriate only for low-risk, well-understood tasks. Anthropic research (2026 Agentic Coding Trends Report) shows that developers delegate completely (0% supervision) only 20% of tasks: the remaining 80% requires at least one review checkpoint. Design workflows with explicit Human-in-the-Loop for architectural decisions, deployments, and data modifications.
6. Agent Without Inter-Session Memory
Every new Claude Code session starts from scratch. A complex workflow (5+ hours of work)
that does not write external state is destined to lose progress. Always use
claude-progress.txt, plan files in docs/ and frequent commits
as persistence mechanisms.
Case Study: Angular Codebase Refactoring with Agentic Workflow
Let us see how these principles apply to a real case: refactoring the blog module of an Angular portfolio from a legacy architecture (large components, logic in templates, no tests) to a modern architecture (small components, separate services, 80%+ coverage).
Context
- Codebase: Angular 21, SSR, ~3,000 lines in the blog module
- Problem: 0% test coverage, 500+ line components, no separation of concerns
- Goal: complete refactoring without functional regressions
- Constraint: active production, zero downtime tolerated
Designed Workflow
WORKFLOW: Blog Module Refactoring
PHASE 0 - Setup (5 min, human):
$ git checkout -b refactor/blog-module
Create claude-progress.txt with initial state
Create docs/plans/blog-refactoring.md (empty, agent will fill)
PHASE 1 - Analysis (agent, ~30 min):
Task: "Analyze the complete blog module.
Read all files in src/app/articles/ and src/app/services/blog.service.ts.
Create docs/analysis/blog-module.md with:
- Complete list of components and their sizes (lines)
- Dependencies between components (who imports whom)
- Identified duplicate logic
- Separation of concerns violations
- Refactoring priority (impact x effort)"
Review: Human reads docs/analysis/blog-module.md and approves plan
PHASE 2 - Test Foundation (agent, ~60 min):
Task: "Write E2E tests for critical blog features
BEFORE any refactoring. Use Playwright.
Features to cover:
- Navigate to article list
- Open specific article
- Series navigation (prev/next)
- IT/EN language switch
Tests must pass on the CURRENT version of the code."
Review: `npm run test:e2e` must have 100% green E2E tests
PHASE 3 - Service Extraction (agent, ~90 min, iterative):
Task: "Extract logic from BlogComponent into separate services.
ONE service at a time, in this order:
1. BlogFilterService (article filtering and search)
2. BlogSeriesService (series navigation)
3. BlogSEOService (meta tags for articles)
For each service:
a) Create .service.ts file with extracted logic
b) Create .service.spec.ts with unit tests
c) Update BlogComponent to use the service
d) Verify: npm test passes, npm run build passes
e) Commit: git commit -am 'refactor: extract [ServiceName]'"
Review after each service: E2E tests must stay green
PHASE 4 - Component Split (agent, ~120 min, iterative):
Task: "Split BlogComponent (current 480 lines) into:
- BlogListComponent (article list)
- BlogCardComponent (single article card)
- BlogFilterComponent (filters and search)
- BlogPaginationComponent (pagination)
One component at a time. Same pattern as PHASE 3:
implement, test, verify build, atomic commit."
Review: Human verifies UI visually + E2E tests
PHASE 5 - Coverage Check (agent, ~30 min):
Task: "Verify coverage is >80% for all new files.
For each file below threshold, add missing tests.
Produce report in docs/analysis/coverage-report.md"
Review: Coverage report + npm test
PHASE 6 - Final Review (human):
- Reads complete PR diff
- Verifies docs/analysis/coverage-report.md
- Merges to master if all ok
TYPICAL RESULTS FOR THIS WORKFLOW:
Coverage: 0% -> 82%
Component size: 480 lines -> 95 lines average
Build time: -15% (smaller components, more effective lazy loading)
Human time: ~45 min (phase 0 + review + phase 6)
Agent time: ~5-6 hours
Estimated API cost: $3-8 (depends on model)
Lessons Learned from the Case Study
Three insights from running this type of workflow on real codebases:
- E2E tests before refactoring are non-negotiable. Without a functional safety net, every refactoring step is a leap in the dark. The time invested in PHASE 2 (E2E tests) pays off 10x in preventing undetected regressions.
- Atomic commits are workflow memory. One commit per refactored service/component makes rollback surgical: if a refactoring introduces a problem, you `git revert` that single commit without losing previous work.
- Human time drops to 10-15% of the total. With a well-designed workflow, the developer spends most of their time reviewing outputs and approving checkpoints, not writing code. This is mature vibe coding: not delegating everything, but delegating strategically while maintaining oversight on key decisions.
Key Data Point
21% of YC Winter 2025 startups have a codebase with over 91% AI-generated code. But the most mature of these startups do not use "raw" vibe coding: they use structured agentic workflows with human checkpoints, automated tests, and rollback strategies. The difference between "generated code" and "quality software generated by AI" is precisely the structure of the workflow.
The Future of Agentic Workflows
Research in 2025-2026 indicates three directions of evolution for agentic workflows in the context of software development:
- Dynamic Task Decomposition: Frameworks like TDAG (2025) shift decomposition from static (defined by the developer) to dynamic (the agent decides the workflow structure based on the problem). Early results are promising but still require human supervision for production codebases.
- Persistent Agent Memory: The integration of vector databases with agent frameworks (LangGraph + pgvector, CrewAI + Chroma) allows agents to remember patterns across different projects, accumulating "experience" on specific codebases.
- Agent Skills as Standard: Anthropic introduced Agent Skills in 2025: bundles of instructions, scripts, and resources that agents can load dynamically. The idea is that skills like "Angular Expert", "Security Auditor", or "Performance Optimizer" become reusable modules across teams and organizations.
Conclusions
Agentic workflows are not magic: they are architecture. The difference between an agentic system that works and one that fails almost always comes down to decomposition quality, clarity of success criteria, and robustness of error recovery strategies.
The four patterns (Sequential, Parallel, Hierarchical, Iterative) are not mutually exclusive: the most effective workflows combine them. A hierarchical Planner can delegate parallel tasks, each of which uses an iterative loop to reach the required quality. Structure always serves the specific problem.
The most important practical step you can take today: take a complex task that you typically solve in a single AI session and try to decompose it into 3-5 verifiable subtasks. Define the success criteria for each before starting. Add a human checkpoint after each critical phase. Then measure: does the structured workflow produce better output? Almost certainly yes. And that is the starting point for becoming an architect of agentic systems.
Series: Vibe Coding and Agentic Development
- 01 - Vibe Coding: The Paradigm That Changed 2025
- 02 - Claude Code: Agentic Development from the Terminal
- 03 - Agentic Workflows: Decomposing Problems for AI (this article)
- 04 - Multi-Agent Coding: LangGraph, CrewAI and AutoGen
- 05 - Testing AI-Generated Code
- 06 - Prompt Engineering for IDEs and Code Generation
- 07 - Security in Vibe Coding: Risks and Mitigations
- 08 - The Future of Agentic Development in 2026
Related Reading
- Claude and the Model Context Protocol (MCP) - How MCP extends agent capabilities with external tools
- Web Security: APIs and Vulnerabilities - Security risks specific to agentic workflows