June 9, 2025

Wrapping Claude Code with o3: Building Advanced Agents That Actually Work in Your Codebase

We've open-sourced a new approach to AI-assisted development that wraps Claude Code with o3 to create agents that actually understand your codebase. Instead of struggling with generic responses, you get AI that knows your patterns, dependencies, and conventions—with full control over prompts and behavior. The system uses o3 for strategic planning and Claude Code for implementation, creating a powerful workflow that adapts to how you actually work.

Chris Wood

Founder of qckfx

A few weeks ago, I found myself in a familiar frustration loop. Claude Code would work brilliantly on simple examples, but the moment I pointed it at my actual codebase—a TypeScript project with custom tooling, specific patterns, and years of accumulated complexity—it would stumble. The responses were generic, it missed important context, and I spent more time explaining my setup than actually coding.

That's when I had an idea: what if I could wrap Claude Code with o3 to create a "translator" layer? o3 could understand my vague requests, structure them into precise prompts for Claude Code, and then verify the results. The outcome? A tool that makes Claude Code work like it was built specifically for my codebase.

The Problem: Generic AI Meets Real Codebases

Let me show you what I mean with a real example. Here's the kind of vague prompt I naturally want to give:

# What I want to say:
qckfx "this build error is confusing, can you fix it?"

# What Claude Code actually needs:
claude "Analyze the TypeScript build error in src/core/Agent.ts related to 
missing type exports. The error indicates import issues with the ModelProvider 
type. Please examine the file structure, identify the circular dependency, 
and implement proper type re-exports while maintaining backward compatibility."

The difference is stark. One is natural human communication; the other is precise AI instruction. But manually translating every request gets exhausting.

Enter Advanced Agent: o3 + Claude Code

I built advanced-agent to solve this translation problem. Here's how it works under the hood:

{
  "defaultModel": "o3",
  "systemPrompt": "You are a STRATEGIC PLANNER and REVIEWER, not a direct implementer. Your core role is PLAN → DELEGATE → VALIDATE → REPORT. You should delegate 100% of code editing work to the claude tool.",
  "tools": [
    "bash", "glob", "grep", "ls", "file_read", "file_edit", "file_write", 
    "think", "batch", "claude"
  ]
}

The key insight is in that system prompt. o3 doesn't write code—it plans, delegates to Claude Code via the claude tool, then validates the results. Here's what that looks like in practice:

// From src/tools/ClaudeTool.ts - the bridge between o3 and Claude Code
export const createClaudeTool = (): Tool<ClaudeToolResult> => {
  return createTool({
    id: 'claude',
    name: 'ClaudeTool',
    description: 
      `claude code is an agentic coding CLI tool. You are passing in the query to instruct the agent on what to do. ` +
      `The agent is fully autonomous and capable of writing and running code, running linters, tests, etc.`,
    
    async execute(args, context) {
      const { query } = args as { query: string };
      
      // Create a temporary file with the o3-crafted prompt
      const tmpPath = `.claude_prompt_${Date.now()}.txt`;
      await context.executionAdapter.writeFile(context.executionId, tmpPath, query);
      
      // Execute Claude Code with the structured prompt
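      // -p runs Claude Code non-interactively (print mode); --dangerously-skip-permissions lets it act without approval prompts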
      const cmd = `claude -p --dangerously-skip-permissions < "${tmpPath}"`;
      const { stdout, stderr, exitCode } = await context.executionAdapter.executeCommand(
        context.executionId, cmd, undefined, false, 20 * 60 * 1000 // 20 minute timeout
      );
      
      return { ok: true, data: { stdout, stderr, command: cmd } };
    }
  });
};
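
When o3 decides to delegate, the arguments it passes to this tool boil down to a single query string. Based on the execute() signature above, a delegation call looks something like this (an illustrative shape, not captured output):

// Illustrative tool call from o3 to the claude tool. The args shape is inferred
// from the destructuring in execute() above; the query mirrors the earlier example prompt.
const toolCall = {
  tool: 'claude',
  args: {
    query:
      'Analyze the TypeScript build error in src/core/Agent.ts related to missing ' +
      'type exports. Examine the file structure, identify the circular dependency, ' +
      'and implement proper type re-exports while maintaining backward compatibility.',
  },
};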

Real-World Usage: From Build Errors to Working Code

Let me show you how this plays out with actual build errors. Yesterday, I was working on the CLI validation system and hit a confusing TypeScript error:

# I just piped the error directly to advanced-agent:
npm run build 2>&1 | qckfx -a advanced-agent "fix this build error"

Here's what happened behind the scenes:

  1. o3 analyzed the error: TypeScript was complaining about circular dependencies in the tool registry
  2. o3 crafted a specific prompt for Claude Code with full context about my project structure
  3. Claude Code implemented the fix using the detailed prompt
  4. o3 validated the result by running the build again
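
In code, that PLAN → DELEGATE → VALIDATE loop has roughly this shape (a simplified sketch for illustration, not the actual agent implementation; the injected functions stand in for the o3 call, the claude tool, and the build):

// Simplified sketch of the loop the advanced-agent runs. The injected functions
// are stand-ins for o3 reasoning, the claude tool, and `npm run build`.
type BuildResult = { exitCode: number; output: string };

interface LoopDeps {
  planFix(errorOutput: string): Promise<string>;      // PLAN: o3 turns raw errors into a precise prompt
  delegateToClaude(prompt: string): Promise<string>;  // DELEGATE: claude tool implements the change
  runBuild(): Promise<BuildResult>;                    // VALIDATE: e.g. rerun `npm run build`
}

async function fixBuildError(errorOutput: string, deps: LoopDeps, maxAttempts = 3): Promise<string> {
  let lastSummary = '';
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const prompt = await deps.planFix(errorOutput);
    lastSummary = await deps.delegateToClaude(prompt);
    const check = await deps.runBuild();
    if (check.exitCode === 0) return lastSummary;  // REPORT: build is green, hand back the summary
    errorOutput = check.output;                    // otherwise retry with the fresh error output
  }
  return lastSummary;
}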

The beautiful part? I never had to explain my project structure or debugging approach. The advanced-agent's system prompt already contains that knowledge:

// Excerpt from the actual system prompt
"When delegating to claude tool:
• Instruct it to check existing libraries (package.json, imports, neighboring files)
• Follow existing patterns, naming conventions, and code style
• Never assume libraries are available - verify first
• Maintain security best practices"

The Economics: Power vs Cost Trade-offs

Let me be honest about the costs: o3 is expensive. I'm currently seeing $50-$100/day OpenAI bills using this approach heavily. Interestingly, this matches what I was spending on Claude Code with Sonnet 3.7 before Anthropic rolled out their Pro/Max plan support. However, the productivity gains have been worth it for me—the quality and speed of results far outweigh the cost when I'm shipping features quickly.

That said, we're actively working on bringing costs down:

  • OpenAI caching: Should help with repeated context, though it may not be working optimally yet
  • Browser sub-agent: Already building better context for o3 to reduce token usage
  • Prompt optimization: Continuously refining to be more efficient

Here's how the costs break down:

  • o3 usage: The expensive part—high-quality reasoning and planning
  • Claude Code: Just $20/month with the Anthropic Pro plan (I haven't hit limits yet)
  • Total value: Higher quality results with fewer manual iterations
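
To put that in monthly terms: at $50-$100/day of o3 usage over roughly 20 working days, the o3 side lands somewhere around $1,000-$2,000/month, against a flat $20/month for Claude Code on the Pro plan.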

For many developers, especially those billing by the hour, the time savings easily justify the cost. But I'm committed to making this more accessible as the tooling matures.

Customization: Your Codebase, Your Rules

The real magic happens when you customize the system prompt for your specific codebase. I use the agent-editor (another agent in my toolkit) to continuously refine the advanced-agent's behavior:

# When something goes wrong, I can ask for immediate feedback:
qckfx -a agent-editor -c "The advanced-agent just made a change I didn't expect. 
Can you read what happened and update its prompt to handle this better?"

The -c flag lets me continue the conversation but swap agents, so agent-editor can see the full context of what advanced-agent did and adjust accordingly. This creates a feedback loop where my agents get better over time.

Here's how the agent-editor approaches prompt refinement:

// From .qckfx/agent-editor.json - the system prompt for agent-editor
{
  "systemPrompt": "You are PROMPT-EDITOR, an expert system prompt architect specializing in crafting, refining, and maintaining agent system prompts within this repository.\n\n## CORE MISSION\nDesign and optimize system prompts that are clear, effective, and aligned with each agent's specific role and operational requirements.\n\n## OPERATIONAL MANDATES\n\n### AUTHORING & IMPROVEMENT\n• Craft new system prompts with precision and clarity\n• Enhance existing prompts for better performance and comprehension\n• Ensure prompts accurately reflect agent capabilities and constraints\n• Apply prompt engineering best practices (specificity, structure, examples)\n• Balance comprehensiveness with conciseness",
  "tools": ["bash", "glob", "grep", "ls", "file_read", "file_edit", "file_write", "think", "batch"]
}

Real Examples from My Daily Workflow

Here are some actual commands I run regularly:

# Morning standup prep
qckfx -a advanced-agent "review recent commits and identify any potential issues"

# Debugging a test failure
npm test 2>&1 | qckfx -a advanced-agent "these tests are failing, fix them"

# Code review assistance  
qckfx -a advanced-agent "review the changes in src/core/ and suggest improvements"

# Documentation updates
qckfx -a documentation-writer "update the README to reflect the new CLI commands"

# Clean commit management
qckfx -a commit "commit everything, break into logical commits, pay attention to the order of commits to ensure it's logical"

That last one is particularly important. The commit agent takes all my staged and unstaged changes and breaks them into logical, well-described commits. This matters because other agents like advanced-agent often need to look back at commit history to understand what's happening in the codebase. Poor commit messages mean missing context.

The Architecture: How It All Fits Together

The CLI implements a clean tool-based architecture where each agent can use other agents as tools:

// From src/cli/augmentSubAgentTools.ts - how sub-agents are loaded
export function augmentAgentConfigWithSubAgents(
  agentConfig: AgentConfig,
  subAgentNames: string[],
  cwd = process.cwd(),
): void {
  const subAgentEntries = subAgentNames.map(rawName => {
    const name = rawName.endsWith('.json') ? path.basename(rawName, '.json') : rawName;
    const configFile = resolveAgentConfigPath(rawName, cwd);
    return { name, configFile };
  });
  
  // Add to tools array
  agentConfig.tools = [...(agentConfig.tools || []), ...subAgentEntries];
}
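
Here's roughly what that does to an agent config (a hypothetical call; the import path, the any cast, and the resolved path in the final comment are assumptions, not output from the real CLI):

import { augmentAgentConfigWithSubAgents } from './augmentSubAgentTools';

// Hypothetical example: add the browser sub-agent to the advanced-agent config.
const agentConfig = {
  defaultModel: 'o3',
  systemPrompt: '...',
  tools: ['bash', 'glob', 'grep', 'claude'],
};

augmentAgentConfigWithSubAgents(agentConfig as any, ['browser']);

// agentConfig.tools now also contains an entry like:
// { name: 'browser', configFile: '/path/to/.qckfx/browser.json' }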

This architecture means you can compose agents however you want. Need the browser sub-agent for research? Add --with-subagent browser. Want to bring in your custom documentation agent? Easy.

Getting Started: Your Own Advanced Agent

Want to try this setup? It's completely free and open source. Here's how to get started:

# Install the CLI
npm i -g @qckfx/agent

# Set up your API keys
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key" 

# Get starter configurations
qckfx init

The most important first step is customizing the advanced-agent for your codebase:

# Use agent-editor to analyze your project and update the prompt
qckfx -a agent-editor --with-subagent browser \
  "Read through my codebase and update the advanced-agent system prompt 
   to match my dependencies and coding patterns"

If you're not using TypeScript, make sure to mention that—the default configuration is tuned for my TypeScript projects.

The Future: Continuous Improvement

What excites me most about this approach is how it gets better over time. Every interaction teaches the system more about my codebase and preferences. The agent-editor can continuously refine the prompts based on actual usage patterns.

And because it's all open source, you can inspect exactly how it works, modify it for your needs, or contribute improvements back to the community.

Try It Yourself

The code is available at github.com/qckfx/agent-sdk. If you're tired of wrestling with generic AI tools that don't understand your codebase, give it a shot. The setup takes about 5 minutes, and you'll have an AI assistant that actually knows how your project works.

The best part? You're not locked into anyone else's vision of how AI should work. You own the prompts, you control the behavior, and you can always peek under the hood to see exactly what's happening.

That's the kind of AI tooling I want to use—and build.
