Phase 3

The New Factory

Systems built for automation from day one.

Agentic AI - Current State

Companies with AI agents in production

G2, 2025

Anticipate >100% ROI

PagerDuty, 2025

up to 0%

Productivity gains (KYC/AML, banks)

McKinsey, 2025

Have mature agent governance

Deloitte, 2026

Agentic / Agent vs Autonomous AI

Agentic / Agent

AI that chooses its own tools, plans steps, and orchestrates workflows. You give it a goal; it figures out the how. A human starts it.

Example: Claude Code reading your codebase, deciding which files to edit, running tests - but you told it what to do.

Autonomous AI

AI that acts on triggers or schedules without human initiation. It runs in the background, watching for events.

Example: An agent that polls Azure DevOps every 30s, claims tickets tagged “CLAUDE,” executes work, and closes them.
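
A minimal sketch of that polling pattern. Everything here is hypothetical: InMemoryBoard stands in for the ticket system (no real Azure DevOps API is called), and do_work is a placeholder for whatever the agent actually does. The point is the shape of the loop: fetch, claim, execute, close, sleep.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Ticket:
    id: int
    tags: set
    state: str = "New"


@dataclass
class InMemoryBoard:
    """Hypothetical stand-in for a ticket system such as Azure DevOps."""
    tickets: list = field(default_factory=list)

    def fetch_open(self, tag):
        # Only unclaimed tickets carrying the trigger tag are eligible.
        return [t for t in self.tickets if tag in t.tags and t.state == "New"]

    def claim(self, ticket):
        ticket.state = "Active"

    def close(self, ticket):
        ticket.state = "Closed"


def do_work(ticket):
    """Placeholder for the agent's real work on a ticket (assumption)."""


def poll_once(board, tag="CLAUDE"):
    """Run one polling pass and return how many tickets were handled."""
    handled = 0
    for ticket in board.fetch_open(tag):
        board.claim(ticket)   # mark it so no other worker picks it up
        do_work(ticket)
        board.close(ticket)
        handled += 1
    return handled


def run_forever(board, interval_seconds=30.0):
    """No human starts each pass; the schedule does."""
    while True:
        poll_once(board)
        time.sleep(interval_seconds)
```

Note that nothing in the loop is "intelligent". The autonomy comes from the trigger and the claim/close process around the agent.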

The magic isn't the agent - it's the process it's embedded in.

Tool Selection Is Still Just Text

The surprising truth: when an agent “decides” to call a tool, it's not executing code - it's generating text that describes what it wants to do. The system interprets that text and runs the tool.
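
A toy illustration of that split, not any vendor's actual wire format. The JSON shape, the TOOLS registry, and get_weather are all assumptions made up for this sketch. The "model output" is just a string; the host program parses it and decides whether to run anything.

```python
import json


def get_weather(city):
    # Canned answer; a real tool would call an external service.
    return f"Sunny in {city}"


# The host, not the model, owns the list of callable tools.
TOOLS = {"get_weather": get_weather}


def run_tool_call(model_output):
    """Parse the model's text and dispatch to a registered tool."""
    request = json.loads(model_output)
    name = request["tool"]
    if name not in TOOLS:
        # Text asking for an unregistered tool is simply refused.
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**request["arguments"])


# What the model "decided" is only this string:
model_output = '{"tool": "get_weather", "arguments": {"city": "Oslo"}}'
print(run_tool_call(model_output))
```

Because the host does the executing, permission controls live in the host: a tool that is not in the registry cannot be run, whatever text the model produces.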

What “Autonomous” Looks Like

CoWork Scheduled Tasks

• Set cadence: hourly, daily, weekly, weekdays, on demand
• Each task = its own session with all files + plugins
• Desktop app only, paid plans

Dispatch

• Send tasks from phone → Claude executes on desktop
• Scan QR code → connected in 2 taps
• Note: tasks are routed through Anthropic’s cloud, so processing is not fully local

Claude CLI (Claude Code)

• Terminal-based agentic coding tool (GA May 2025) - human-driven, not autonomous
• Natural language → code, file ops, git, shell, web search
• Autonomous only via /loop (recurring tasks) or scheduled triggers

These tools don't just answer questions - they DO things. Browse the web. Open files. Write code. Send emails. All autonomously, with permission controls.

Demo

Parallel research with Claude Code

Running multiple research tasks in parallel from the CLI - real-time autonomous execution.

Demo: Automated Lead Research Pipeline

Demo

Fully autonomous sales pipeline

End-to-end: Google Form intake → CSV → CoWork scheduled job picks up new entries → AI researches each company → saves research → second job generates personalized outreach emails. No manual work until the final review.

Result: lead intake → personalized email draft. The human only reviews and sends.
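
A sketch of the "picks up new entries" step, under stated assumptions: the CSV has company and email columns (hypothetical names), email is used as the unique key, and research is a caller-supplied function standing in for the AI research call. This is not how CoWork does it internally; it only shows the bookkeeping that keeps a scheduled job from processing the same lead twice.

```python
import csv
import io


def new_leads(csv_text, processed):
    """Return rows whose email has not been handled by an earlier run."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["email"] not in processed]


def process_batch(csv_text, processed, research):
    """One scheduled run: research each new lead and queue a draft."""
    drafts = []
    for lead in new_leads(csv_text, processed):
        notes = research(lead["company"])   # stand-in for the AI step
        drafts.append({"to": lead["email"], "notes": notes})
        processed.add(lead["email"])        # remember it for the next run
    return drafts
```

In a real job, processed would be persisted between runs (a file or a status column in the sheet), and the drafts would go to a human for review rather than being sent.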

Demo

Google AI Studio App with Database

Link the CSV from the lead pipeline into Google AI Studio. AI builds a full app to browse leads, view research, and edit outreach drafts. Built entirely for free - no code.

Don't Ask AI to DO Finance - Ask It to BUILD the Tool

AI makes mistakes. Math doesn't. The solution: use AI to build a deterministic app that executes reliably every time.

Instead of asking Claude to calculate your KPIs, ask it to build an app that pulls data from Azure DevOps, calculates the KPIs with exact formulas, and shows a dashboard. The app is reviewable, testable, and repeatable.

Wrong approach

“Calculate our sprint velocity and defect rate from this data.”

AI may hallucinate numbers, round differently, or miss edge cases.

Right approach

“Build me an app that connects to Azure DevOps, fetches work items for the last 6 sprints, and calculates velocity and defect rate using [these exact formulas].”

Deterministic. Reviewable. Repeatable.
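
A sketch of what the deterministic core of such an app might look like. The formulas are assumptions chosen for illustration, since the slide leaves them as [these exact formulas]: velocity is the mean of completed story points per sprint, and defect rate is completed bugs divided by all completed items. WorkItem is a hypothetical record type; fetching from Azure DevOps is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    sprint: str
    kind: str      # "story" or "bug"
    points: int
    done: bool


def velocity(items):
    """Mean completed story points per sprint (assumed formula)."""
    per_sprint = {}
    for item in items:
        # Register every sprint, so a sprint with nothing done counts as 0.
        per_sprint.setdefault(item.sprint, 0)
        if item.done and item.kind == "story":
            per_sprint[item.sprint] += item.points
    if not per_sprint:
        raise ValueError("no sprints in input")
    return sum(per_sprint.values()) / len(per_sprint)


def defect_rate(items):
    """Completed bugs as a fraction of all completed items (assumed formula)."""
    done = [item for item in items if item.done]
    if not done:
        return 0.0
    return sum(1 for item in done if item.kind == "bug") / len(done)
```

The same input always yields the same numbers, and each edge case (an empty sprint, no completed items) is an explicit line a reviewer can read and a test can pin down.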

Demo

Azure DevOps KPI automation

Using Claude Code / AI Studio to build an app that pulls Azure DevOps data and generates KPI dashboards - deterministic code, not AI guesswork.

Demo

Connecting Claude to your tools

MCP servers, connectors, and app integrations - giving Claude access to email, databases, and APIs.

Demo

Building a production agent

Using the Claude Agent SDK to build an autonomous workflow with tool access and human-in-the-loop controls.

The Human as Operator

Model	Human Role	AI Role	Analogy
Human-in-the-loop	Does the work, AI assists	Copilot/assistant	Worker with power tools
Human-on-the-loop	Monitors, intervenes on exception	Autonomous executor	Factory floor manager
Human-out-of-the-loop	Sets policy, reviews outcomes	Fully autonomous	Business owner reviewing reports

The Klarna Rollercoaster

Early 2024

AI Deployed

2.3M conversations/month. Equivalent to 700 full-time agents. Satisfaction up 47%.

Mid 2024

Full Speed Ahead

Headcount 5,000 → 3,000. $10M annual savings. Response time: 15 min → under 2 min.

2025

Complex Cases Fail

Quietly rebuilding human customer service team. Full AI replacement failed for sensitive cases.

2026

Hybrid Model

Human-on-the-loop stabilized. AI handles volume, humans handle complexity.

2.3M conversations/month, the work of 700 full-time agents automated - then they reversed course.

The process guards quality, not the individual. Review gates, validation steps, automated checks.
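
One way to picture a review gate, as a sketch only. The two checks and the status strings are hypothetical. Automated checks run on every AI draft, and anything that fails is routed to a person instead of going out.

```python
from typing import Callable, Optional

# A check returns a problem description, or None if the draft passes.
Check = Callable[[str], Optional[str]]


def not_empty(draft):
    return "draft is empty" if not draft.strip() else None


def no_placeholders(draft):
    # Catches template fields the model forgot to fill in.
    return "unfilled placeholder" if "{" in draft or "[" in draft else None


CHECKS = [not_empty, no_placeholders]


def gate(draft, checks):
    """Run every check; escalate to a human if any of them fails."""
    problems = [p for check in checks if (p := check(draft)) is not None]
    status = "needs_human_review" if problems else "auto_approved"
    return status, problems
```

The gate does not care who or what wrote the draft. That is the sense in which the process, rather than the individual, guards quality.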

Real performance gains come from process redesign, NOT AI “assisting” the human.

Every factory has a dark side. Let's talk about what happens when you build without guardrails.

Enter the shadow factory →