June 10, 2025

The Simplest Way to Improve Your AI Agent: Just Ask It

Most developers spend hours tweaking AI agent prompts and tools based on guesswork about what might improve performance. But advanced AI models have remarkable self-awareness about their own limitations - they know exactly what's frustrating them and often have specific ideas for improvement. This post explores a surprisingly effective approach: having direct conversations with your agents about their experience, and building systems that let them contribute to their own development through structured feedback loops.

Chris Wood

Founder of qckfx

Building AI agents that actually work well is hard. We spend countless hours tweaking system prompts, adjusting tool interfaces, and running tests, often making educated guesses about what might improve performance. But what if I told you there's a simpler approach that most developers never try?

Just ask your agent what's wrong.

A Real-World Discovery

I recently built an open source coding agent SDK and was testing it by having the agent build some features. The results were... okay. Not terrible, but not as good as I expected. The agent would make multiple unnecessary tool calls, sometimes struggle to complete tasks efficiently, and occasionally miss the mark on what I was asking for.

Instead of diving into debugging mode or making assumptions about what needed fixing, I tried something different. In the same chat session, right after the agent finished its work, I asked: "Why didn't you do exactly what I wanted?"

The response was eye-opening.

The agent (I was using OpenAI's o3 model) immediately identified two specific issues:

  1. Unclear completion criteria: The system prompt didn't clearly define when work should be considered finished
  2. Tool limitations: The file edit tool I'd provided made it difficult to delete code effectively

But here's where it gets interesting - I didn't stop there.

Going Deeper: Let the Agent Design Its Own Tools

I followed up with another question: "How would you like to improve the tool if you could?"

The agent's suggestions were incredibly specific and practical:

  • Instead of pure search/replace functionality, add a regex option for fuzzy matching
  • Include the ability to specify start and end line numbers to limit search scope
  • Add a delete-only option that doesn't require providing replacement text
  • Support a "delete-all matches" flag for bulk operations

These weren't abstract suggestions - they were concrete improvements based on the agent's actual experience using the tools.
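To make this concrete, here's a minimal sketch of what an edit tool with those options might look like. The function name, its parameters, and the exact semantics are my own illustration, not the actual tool interface in the SDK:

```python
import re
from pathlib import Path

def edit_file(path: str, search: str, replacement: str | None = None,
              use_regex: bool = False, start_line: int | None = None,
              end_line: int | None = None, replace_all: bool = False) -> int:
    """Hypothetical edit tool covering the agent's suggestions.

    replacement=None deletes matches outright instead of requiring empty text;
    use_regex enables fuzzy matching; start_line/end_line (1-based, inclusive)
    limit the search scope; replace_all applies the edit to every match.
    Returns the number of edits made.
    """
    lines = Path(path).read_text().splitlines(keepends=True)
    lo = (start_line - 1) if start_line else 0
    hi = end_line if end_line else len(lines)

    scope = "".join(lines[lo:hi])
    pattern = search if use_regex else re.escape(search)
    new_text = replacement if replacement is not None else ""  # delete-only mode
    edited, n = re.subn(pattern, new_text, scope, count=0 if replace_all else 1)

    Path(path).write_text("".join(lines[:lo]) + edited + "".join(lines[hi:]))
    return n
```

Even a rough version like this covers the delete-only and scoped-search cases the agent was struggling with.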

The Results Were Immediate

I implemented several of these suggestions (actually, I had the agent implement them itself) and updated the system prompt to be clearer about completion criteria. The improvement was dramatic:

  • Fewer unnecessary tool calls
  • More efficient task completion
  • Better alignment with intended outcomes
  • Overall more confident and decisive behavior

Building a Self-Improvement Loop

This experience was so valuable that I'm now building a formal feedback mechanism into my SDK. I'm packaging this as an MCP (Model Context Protocol) server that agents can use to file improvement suggestions as GitHub issues. When an agent encounters friction or has an idea, it can report it through the MCP server, which either creates a new issue or upvotes an existing one - helping prioritize improvements based on how often agents run into the same problem.
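As a rough sketch of how such a server might look, here's a minimal version built on the Python MCP SDK and the GitHub REST API. The tool name, the target repo, the token handling, and the title-based de-duplication are illustrative assumptions, not the actual implementation:

```python
import os
import requests
from mcp.server.fastmcp import FastMCP

REPO = "qckfx/agent-sdk"  # hypothetical target repo
API = f"https://api.github.com/repos/{REPO}/issues"
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
           "Accept": "application/vnd.github+json"}

mcp = FastMCP("agent-feedback")

@mcp.tool()
def file_suggestion(title: str, details: str) -> str:
    """File an improvement suggestion, or upvote a matching open issue."""
    # If an open issue with the same title already exists, upvote it instead.
    for issue in requests.get(API, headers=HEADERS, params={"state": "open"}).json():
        if issue["title"].strip().lower() == title.strip().lower():
            requests.post(f"{API}/{issue['number']}/reactions",
                          headers=HEADERS, json={"content": "+1"})
            return f"Upvoted existing issue #{issue['number']}"
    # Otherwise open a new issue with the agent's suggestion.
    created = requests.post(API, headers=HEADERS,
                            json={"title": title, "body": details,
                                  "labels": ["agent-feedback"]}).json()
    return f"Created issue #{created['number']}"

if __name__ == "__main__":
    mcp.run()
```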

Why This Works (And Why We Don't Do It)

AI agents, especially advanced models, have remarkable self-awareness about their own limitations. They experience the friction in your tools, notice gaps in their instructions, and understand when they're working inefficiently. But we rarely think to tap into this knowledge directly.

Why don't we ask more often? A few reasons:

  • We think of agents as tools, not collaborators: We're used to debugging code, not having conversations with it
  • We assume we know what's wrong: Our developer intuition kicks in before we gather actual data
  • We forget they can meta-reason: We don't realize agents can reflect on their own performance

Practical Steps to Try This Yourself

Here's how you can start getting better feedback from your agents:

1. Ask Performance Questions

After any task, try asking:

  • "What made this task difficult?"
  • "What would have made this easier?"
  • "What information were you missing?"

2. Get Tool Feedback

If your agent uses custom tools:

  • "Which tools worked well? Which were frustrating?"
  • "What additional functionality would help?"
  • "How would you redesign this tool?"

3. Prompt Improvement

For system prompt optimization:

  • "What parts of your instructions were unclear?"
  • "What additional context would be helpful?"
  • "When did you feel uncertain about what to do?"

4. Make It Systematic

Consider building feedback collection into your agent workflow:

  • Add a post-task reflection step
  • Log improvement suggestions automatically
  • Track which changes actually improve performance
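Here's a minimal sketch of what that post-task reflection step could look like, assuming an OpenAI-style chat session; the reflection prompt, model choice, and JSONL log file are my own illustration rather than part of any particular SDK:

```python
import json
import time
from openai import OpenAI

client = OpenAI()

REFLECTION_PROMPT = (
    "The task is complete. What made it difficult, which tools were "
    "frustrating, and what one change would most improve your performance?"
)

def reflect_and_log(messages: list[dict], log_path: str = "agent_feedback.jsonl") -> str:
    """Append a reflection question to the finished session and log the reply."""
    response = client.chat.completions.create(
        model="o3",
        messages=messages + [{"role": "user", "content": REFLECTION_PROMPT}],
    )
    feedback = response.choices[0].message.content
    # Log each suggestion with a timestamp so you can track which changes
    # actually improved performance over time.
    with open(log_path, "a") as f:
        f.write(json.dumps({"ts": time.time(), "feedback": feedback}) + "\n")
    return feedback
```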

The Bigger Picture

This approach represents a shift from traditional software debugging to something more collaborative. Instead of trying to reverse-engineer what's wrong from external behavior, we're having a direct conversation with the system about its experience.

It's not just about fixing bugs - it's about co-designing better tools and processes with an AI partner that actually uses them.

Your Turn

Have you tried asking your agents for direct feedback? I'm curious to hear from others who might want to experiment with this kind of self-improving loop. The potential feels huge - imagine agents that continuously refine their own capabilities through structured self-reflection.

What would you ask your agent if you knew it could give you honest, actionable feedback about its own performance?


If you're interested in experimenting with agent self-improvement, I'd love to connect. The GitHub issues approach is just the beginning - there's so much more we could explore in making our AI tools genuinely collaborative partners in their own development.
