How I use AI to automate boring coding tasks
I just finished something I would normally not even start: implementing 27 Symfony String functions in Flow PHP.
I let Claude Code handle it while I focused on other important tasks (house cleaning, to be more specific). The result? 68 files changed, over 3,000 lines of code added, comprehensive test coverage, and zero regressions. All done systematically.
The Problem That Made Me Try Something Different
It all started with issue #1316 in the Flow PHP repository. Flow PHP needed integration with Symfony's string manipulation library. Sounds simple, right?
Well, yes and no. Here's what I was looking at:
- 27 different string functions to implement
- Each one needed its own ScalarFunction class and a dedicated method in the ScalarFunctionChain integration
- Unit tests for edge cases, focused on the integration itself rather than on Symfony String behavior
- Integration tests to make sure everything works in real scenarios
- Static analysis and coding standards compliance
Doing this manually would have been soul-crushing. Copy-paste-modify-test, repeat 27 times. What a nightmare...
So I Tried Something Else
Instead of going through this manually, I decided to let Claude Code Agent handle it.
The trick wasn't just telling Claude "implement these 27 functions." That would have been a disaster.
Instead, I broke it down into 29 separate, laser-focused tasks in a .claude/tasks/gh-1316 folder.
Each task file was like a mini specification:
- What exactly needs to be built
- Step-by-step implementation plan
- Clear success criteria
- Quality checks that must pass
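For illustration, one of these task files could look roughly like this (a reconstructed example, not the actual file from the repository):

# Task: Implement Reverse string function

## Context
Wrap Symfony String's reverse() in a new Reverse ScalarFunction and expose it on ScalarFunctionChain.

## Implementation Steps
1. Create the Reverse ScalarFunction class (null-safe, returns ?string).
2. Add a matching method to ScalarFunctionChain so it works on column references.
3. Add unit tests covering the integration (null input, empty string, multibyte input).
4. Add an integration test that uses the function in a real DataFrame.

## Definition of Done
- [ ] Unit and integration tests pass
- [ ] PHPStan level 9 is clean
- [ ] Coding standards checks pass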
How It Actually Worked Out
First, Figure Out What's Actually Needed
I started by having Claude do the analysis:
Task 01: Integration analysis - mapped existing functions
Task 02: Gap analysis - identified what needed implementation
Since Flow PHP already had some string functions, we didn't need to implement everything from scratch.
Then, Assembly Line Programming
This is where things got interesting. I created a template that Claude would follow for every single function:
- Write the ScalarFunction class - core logic
- Add it to ScalarFunctionChain - so it works with Flow's column reference fluent API
- Unit tests for edge cases - but focused on the integration itself
- Integration tests - does it actually work in a real DataFrame?
- Make PHPStan happy - because we need our code to be strict and precise
Here's what one of these functions looks like:

final class Reverse extends ScalarFunctionChain
{
    public function __construct(
        private readonly ScalarFunction|string $value,
    ) {
    }

    public function eval(Row $row) : ?string
    {
        $value = (new Parameter($this->value))->asString($row);

        if ($value === null) {
            return null;
        }

        return s($value)->reverse()->toString();
    }
}

Null handling, type checking, then just wrapping Symfony's s() function. Simple and consistent across all 27 functions.
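The second step of the template, wiring the function into ScalarFunctionChain, is essentially a one-line method. A sketch of what it can look like (the exact method name and signature in PR #1782 may differ):

// Inside ScalarFunctionChain - exposes reverse() on the fluent API
public function reverse() : self
{
    // $this is the current chain (for example ref('name')), passed in as the value to reverse
    return new Reverse($this);
}

Returning a new Reverse that wraps $this keeps the chain going, which is what makes composing functions on a column reference possible.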
Finally, Polish and Cleanup
The last couple of tasks were about making everything consistent:
- Renaming functions that didn't follow our conventions
- Cleaning up tests (following the "don't test what you don't own" rule)
- Making sure everything still passes all quality checks
The Final Result
In PR #1782, here's what we ended up with:
- 27 new string functions - complete Symfony String coverage
- 68 files touched - 3,322 lines added, only 16 deleted
- Comprehensive tests - both unit and integration coverage
- PHPStan level 9 clean - no shortcuts taken
- Zero breaking changes - all existing tests still pass
The functions cover pretty much everything you'd want to do with strings:
- Basic manipulation: reverse(), truncate(), repeat()
- Building strings: append(), prepend(), ensureStart(), ensureEnd()
- Checking things: isEmpty(), length(), width(), equalsTo()
- Finding stuff: indexOfLast(), containsAny(), match(), matchAll()
- Cleaning up: collapseWhitespace(), wordwrap(), normalize()
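To give a feel for how these are used, here's a rough sketch of a small DataFrame pipeline calling a couple of them, based on Flow PHP's DSL (the exact string-function method names may differ slightly from the final PR):

use function Flow\ETL\DSL\{data_frame, from_array, ref, to_output};

data_frame()
    ->read(from_array([
        ['name' => 'Flow PHP'],
    ]))
    // new Symfony String-backed functions chained on a column reference
    ->withEntry('reversed', ref('name')->reverse())
    ->withEntry('repeated', ref('name')->repeat(2))
    ->write(to_output())
    ->run();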
What tools did I use
Alright, let's talk about the tools that actually made this possible. Claude Code Agent doesn't work in isolation; it uses MCP (Model Context Protocol) servers to connect to and use other tools.
Here's what I had set up:
IDE Integration
claude mcp add jetbrains -s user -- npx -y @jetbrains/mcp-proxy
This connects Claude directly to my IDE. Super handy because Claude can use IDE features for searching existing code, understanding the project structure, and modifying files.
GitHub Integration
claude mcp add --transport http -s user github https://api.githubcopilot.com/mcp/ -H "Authorization: Bearer xxx"
This server connects Claude directly to GitHub APIs, allowing it to read the repository, pull requests, and issues.
Pro tip: Even though this can create PRs and push code, I follow Murphy's law here and don't let AI push code directly or make any changes to remote services.
Web Content (Images) Fetching
claude mcp add fetch -s user -- npx -y @kazuph/mcp-fetch
This server allows Claude to fetch and analyze web content, images in particular.
Sequential Thinking
claude mcp add sequential-thinking -s user -- npx -y @modelcontextprotocol/server-sequential-thinking
This one is crucial for handling tasks that require multiple steps or sequential thinking. It allows Claude to break down complex tasks into smaller, manageable parts and execute them in order. It also helps the agent stay on the rails when implementing the tasks one after another.
The combination of these MCP servers created a development environment where Claude could work across multiple tools and platforms seamlessly. This integration was crucial for maintaining the systematic approach throughout all 29 tasks. It also allowed me to reduce the number of tokens used since the agent was delegating some tasks to the servers rather than doing everything itself.
Some Safety Rules
Working with AI on your codebase can go sideways fast if you're not careful. Here's what kept me sane:
Version Control
Rule #1: commit after every single task. I'm serious about this. After Claude finished implementing each function, I'd review it, apply patches if needed, run static analysis, and then commit it immediately with a clear message.
Usually It's Faster to Start Over
If the agent generates code that's working but not following the pattern you defined, don't try to fix it manually. Just toss it and ask for a new version with better instructions.
Why this works:
- Speed - Claude can regenerate in 30 seconds what might take you 30 minutes to debug
- No weird edge cases - fresh code doesn't inherit previous mistakes
- Git has your back - worst case, you revert to the last good commit
- Better instructions = better results - each attempt helps you clarify what you actually want
Start Easy With Permissions
MCP servers can do a lot, but I always give them read-only permissions. Whenever Claude wants to use an MCP server feature, it asks for permission first. Just say no to anything that could break something outside of your local development environment.
The problem with LLMs is that they are, by nature, not deterministic. Understanding this helps avoid situations where AI might try to do something unexpected. It's as simple as that: if an action comes with a risk of breaking something, don't let AI do it. Just follow Murphy's law: "If something can go wrong, it will go wrong."
Why This Actually Worked
One Function, One Task
The trick was to keep each task stupidly simple. One function per task, nothing more. No context switching, no distractions.
Same Checklist Every Time
I made Claude follow the exact same checklist for every single function:
- Unit tests pass (including all the weird edge cases)
- Integration tests pass (does it work in practice?)
- PHPStan level 9 is happy
- Code formatting is perfect
- Nothing else breaks
AI Is Good at Boring, Repetitive Work
This wasn't about creative problem-solving. It was about following the same pattern 27 times without screwing up. AI is decent at this kind of work—it doesn't get bored like I would after the 5th function.
What I Learned From This Experiment
What Worked (With Heavy Supervision)
- Detailed specs are crucial—AI needs extremely clear instructions or it goes off the rails
- Review everything—the code looked fine but had subtle issues I caught during review
- Keep tasks simple—anything complex and AI starts making questionable decisions
What I'd Do Differently
- Spend even more time on upfront architectural planning
- Build in more review checkpoints (I caught a few issues too late)
Want to Try This Yourself?
If you want to experiment with this approach, be prepared for a lot of upfront work defining patterns and reviewing output. This isn't a "set it and forget it" solution—it's more like having a junior developer who needs detailed instructions and constant code review.
I would much prefer to have a junior developer who can learn and grow rather than an AI that needs constant supervision. But since I'm the only "full time" developer on this project, it's better than nothing.
The Basic Setup
These four will give you a solid foundation to start generating code:
claude mcp add jetbrains -s user -- npx -y @jetbrains/mcp-proxy
claude mcp add --transport http -s user github https://api.githubcopilot.com/mcp/ -H "Authorization: Bearer YOUR_TOKEN"
claude mcp add fetch -s user -- npx -y @kazuph/mcp-fetch
claude mcp add sequential-thinking -s user -- npx -y @modelcontextprotocol/server-sequential-thinking
The Task Template
Here's the task template I use. There are probably better ones out there, but this one works for me. It's easy to copy-paste and modify for each new function.
# Task Template
## Context
[What exactly needs to be built and why]
## Implementation Steps
1. [Be stupidly specific here]
2. [Like, embarrassingly detailed]
3. [Trust me, more detail = better results]
## Definition of Done
- [ ] Code works as expected
- [ ] Tests pass (unit + integration)
- [ ] Static analysis is clean
- [ ] Nothing else broke
- [ ] Update docs if needed
The secret is to be ridiculously specific in your tasks. Don't make Claude guess what you want—spell it out like you're talking to someone who's never seen your code before.
But how do you make Claude Code actually work on the given task? For that, I created something called a Slash Command. Here's the /work-on-task command I use:
# Your task
You are going to work on a task described in #$ARGUMENTS path.
All tasks are in `.claude/tasks/{task-slug}` folder.
Inside the folder, tasks are saved as `{task-number}-{task-name}.md`.
Once you are done, summarize what you did in a new file (the same name as the task with suffix `-summary`).
# Task Summary Rules
- The Summary MUST be short and to the point.
- When everything is done and the Definition of Done is met, just write "Task completed." in the summary—don't repeat the task description.
- Always add to the summary how many tokens were used to complete the task.
- When the definition of done is not met, write what is missing so any other agent/human can take it from there.
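To kick off a task, I just run the command with the path to a task file, for example (the file name below is illustrative):

/work-on-task .claude/tasks/gh-1316/05-implement-reverse.md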
What's interesting about this command is that it lets me keep track of what Claude is working on and also shows approximately how many tokens were used to complete each task.
When a given task consumes too many tokens, it might be a good idea to look for a dedicated MCP server.
The Summary
I don't think AI has changed the way I approach problems; I've always tried to automate the boring parts of my work. It just happens that AI can handle repetitive, boring tasks pretty well if you supervise it properly.
AI can execute well-defined, repetitive tasks, but if you don't make those tasks small and precise, it's likely to cut corners or misunderstand your intent. After all, it's all based on probabilities, not true understanding.
If anything, this reinforced how important architectural thinking and code review skills are.
This blog post was written with help from Claude Code.
I came up with the idea, wrote the initial draft, and edited the final version. The AI Agent was used for rephrasing and polishing the text, fixing grammar, and ensuring the content was clear and concise. It was a collaborative effort, which I hope will let me blog more often and make sharing my thoughts easier and faster.