How I use AI to automate boring coding tasks
I just finished something I would normally not even start: implementing 27 Symfony String functions in Flow PHP.
I let Claude Code handle it while I focused on other important tasks (house cleaning, to be more specific). The result? 68 files changed, over 3,000 lines of code added, comprehensive test coverage, and zero regressions. All done systematically.
The Problem That Made Me Try Something Different
It all started with issue #1316 in the Flow PHP repository. Flow PHP needed integration with Symfony's string manipulation library. Sounds simple, right?
Well, yes and no. Here's what I was looking at:
- 27 different string functions to implement
- Each one needed its own ScalarFunction class and a dedicated method in the ScalarFunctionChain integration
- Unit tests for edge cases, focused on the integration itself rather than on Symfony String behavior
- Integration tests to make sure everything works in real scenarios
- Static analysis and coding standards compliance
Doing this manually would have been soul-crushing. Copy-paste-modify-test, repeat 27 times. What a nightmare...
So I Tried Something Else
Instead of going through this manually, I decided to let Claude Code Agent handle it.
The trick wasn't just telling Claude "implement these 27 functions." That would have been a disaster.
Instead, I broke it down into 29 separate, laser-focused tasks in a .claude/tasks/gh-1316 folder.
Each task file was like a mini specification:
- What exactly needs to be built
- Step-by-step implementation plan
- Clear success criteria
- Quality checks that must pass
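For illustration, one of these task files could look roughly like this (a reconstructed example, not the actual file from the repository):

# Task: Implement Reverse string function

## Context
Wrap Symfony String's reverse() in a new Reverse ScalarFunction and expose it on ScalarFunctionChain.

## Implementation Steps
1. Create the Reverse ScalarFunction class (null-safe, returns ?string).
2. Add a matching method to ScalarFunctionChain so it works on column references.
3. Add unit tests covering the integration (null input, empty string, multibyte input).
4. Add an integration test that uses the function in a real DataFrame.

## Definition of Done
- [ ] Unit and integration tests pass
- [ ] PHPStan level 9 is clean
- [ ] Coding standards checks pass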
How It Actually Worked Out
First, Figure Out What's Actually Needed
I started by having Claude do the analysis:
Task 01: Integration analysis - mapped existing functions
Task 02: Gap analysis - identified what needed implementation
Since Flow PHP already had some string functions, we didn't need to implement everything from scratch.
Then, Assembly Line Programming
This is where things got interesting. I created a template that Claude would follow for every single function:
- Write the ScalarFunction class - core logic
- Add it to ScalarFunctionChain - so it works with Flow's column reference fluent API
- Unit tests for edge cases - but focused on the integration itself
- Integration tests - does it actually work in a real DataFrame?
- Make PHPStan happy - because we need our code to be strict and precise
Here's what one of these functions looks like:

final class Reverse extends ScalarFunctionChain
{
    public function __construct(
        private readonly ScalarFunction|string $value,
    ) {
    }

    public function eval(Row $row) : ?string
    {
        $value = (new Parameter($this->value))->asString($row);

        if ($value === null) {
            return null;
        }

        return s($value)->reverse()->toString();
    }
}

Null handling, type checking, then just wrapping Symfony's s() function. Simple and consistent across all 27 functions.
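The second step of the template, wiring the function into ScalarFunctionChain, is essentially a one-line method. A sketch of what it can look like (the exact method name and signature in PR #1782 may differ):

// Inside ScalarFunctionChain - exposes reverse() on the fluent API
public function reverse() : self
{
    // $this is the current chain (for example ref('name')), passed in as the value to reverse
    return new Reverse($this);
}

Returning a new Reverse that wraps $this keeps the chain going, which is what makes composing functions on a column reference possible.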
Finally, Polish and Cleanup
The last couple of tasks were about making everything consistent:
- Renaming functions that didn't follow our conventions
- Cleaning up tests (following the "don't test what you don't own" rule)
- Making sure everything still passes all quality checks
The Final Result
In PR #1782, here's what we ended up with:
- 27 new string functions - complete Symfony String coverage
- 68 files touched - 3,322 lines added, only 16 deleted
- Comprehensive tests - both unit and integration coverage
- PHPStan level 9 clean - no shortcuts taken
- Zero breaking changes - all existing tests still pass
The functions cover pretty much everything you'd want to do with strings:
- Basic manipulation: reverse(), truncate(), repeat()
- Building strings: append(), prepend(), ensureStart(), ensureEnd()
- Checking things: isEmpty(), length(), width(), equalsTo()
- Finding stuff: indexOfLast(), containsAny(), match(), matchAll()
- Cleaning up: collapseWhitespace(), wordwrap(), normalize()
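To give a feel for how these are used, here's a rough sketch of a small DataFrame pipeline calling a couple of them, based on Flow PHP's DSL (the exact string-function method names may differ slightly from the final PR):

use function Flow\ETL\DSL\{data_frame, from_array, ref, to_output};

data_frame()
    ->read(from_array([
        ['name' => 'Flow PHP'],
    ]))
    // new Symfony String-backed functions chained on a column reference
    ->withEntry('reversed', ref('name')->reverse())
    ->withEntry('repeated', ref('name')->repeat(2))
    ->write(to_output())
    ->run();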
What tools did I use
Alright, let's talk about the tools that actually made this possible. Claude Code Agent doesn't work in isolation; it uses MCP (Model Context Protocol) servers to connect to and use other tools.
Here's what I had set up:
IDE Integration
claude mcp add jetbrains -s user -- npx -y @jetbrains/mcp-proxy
This connects Claude directly to my IDE. Super handy because Claude can use IDE features for searching existing code, understanding the project structure, and modifying files.
GitHub Integration
claude mcp add --transport http -s user github https://api.githubcopilot.com/mcp/ -H "Authorization: Bearer xxx"
This server connects Claude directly to GitHub APIs, allowing it to read the repository, pull requests, and issues.
Pro tip: Even though this can create PRs and push code, I follow Murphy's law here and don't let AI push code directly or make any changes to remote services.
Web Content (Images) Fetching
claude mcp add fetch -s user -- npx -y @kazuph/mcp-fetch
This server allows Claude to fetch and analyze web content, images in particular.
Sequential Thinking
claude mcp add sequential-thinking -s user -- npx -y @modelcontextprotocol/server-sequential-thinking
This one is crucial for handling tasks that require multiple steps or sequential thinking. It allows Claude to break down complex tasks into smaller, manageable parts and execute them in order. It also helps the agent stay on the rails when implementing the tasks one after another.
The combination of these MCP servers created a development environment where Claude could work across multiple tools and platforms seamlessly. This integration was crucial for maintaining the systematic approach throughout all 29 tasks. It also allowed me to reduce the number of tokens used since the agent was delegating some tasks to the servers rather than doing everything itself.
Some Safety Rules
Working with AI on your codebase can go sideways fast if you're not careful. Here's what kept me sane:
Version Control
Rule #1: commit after every single task. I'm serious about this. After Claude finished implementing each function, I'd review it, apply patches if needed, run static analysis, and then commit it immediately with a clear message.
Usually It's Faster to Start Over
If the agent generates code that's working but not following the pattern you defined, don't try to fix it manually. Just toss it and ask for a new version with better instructions.
Why this works:
- Speed - Claude can regenerate in 30 seconds what might take you 30 minutes to debug
- No weird edge cases - fresh code doesn't inherit previous mistakes
- Git has your back - worst case, you revert to the last good commit
- Better instructions = better results - each attempt helps you clarify what you actually want
Start Easy With Permissions
MCP servers can do a lot, but I always give them read-only permissions. Whenever Claude wants to use an MCP server feature, it asks for permission first. Just say no to anything that could break something outside of your local development environment.
The problem with LLMs is that they are, by nature, not deterministic. Understanding this helps avoid situations where AI might try to do something unexpected. It's as simple as that: if an action comes with a risk of breaking something, don't let AI do it. Just follow Murphy's law: "If something can go wrong, it will go wrong."
Why This Actually Worked
One Function, One Task
The trick was to keep each task stupidly simple. One function per task, nothing more. No context switching, no distractions.
Same Checklist Every Time
I made Claude follow the exact same checklist for every single function:
- Unit tests pass (including all the weird edge cases)
- Integration tests pass (does it work in practice?)
- PHPStan level 9 is happy
- Code formatting is perfect
- Nothing else breaks
AI Is Good at Boring, Repetitive Work
This wasn't about creative problem-solving. It was about following the same pattern 27 times without screwing up. AI is decent at this kind of work—it doesn't get bored like I would after the 5th function.
What I Learned From This Experiment
What Worked (With Heavy Supervision)
- Detailed specs are crucial—AI needs extremely clear instructions or it goes off the rails
- Review everything—the code looked fine but had subtle issues I caught during review
- Keep tasks simple—anything complex and AI starts making questionable decisions
What I'd Do Differently
- Spend even more time on upfront architectural planning
- Build in more review checkpoints (I caught a few issues too late)
Want to Try This Yourself?
If you want to experiment with this approach, be prepared for a lot of upfront work defining patterns and reviewing output. This isn't a "set it and forget it" solution—it's more like having a junior developer who needs detailed instructions and constant code review.
I would much prefer to have a junior developer who can learn and grow rather than an AI that needs constant supervision. But since I'm the only "full time" developer on this project, it's better than nothing.
The Basic Setup
These four will give you a solid foundation to start generating code:
claude mcp add jetbrains -s user -- npx -y @jetbrains/mcp-proxy
claude mcp add --transport http -s user github https://api.githubcopilot.com/mcp/ -H "Authorization: Bearer YOUR_TOKEN"
claude mcp add fetch -s user -- npx -y @kazuph/mcp-fetch
claude mcp add sequential-thinking -s user -- npx -y @modelcontextprotocol/server-sequential-thinking
The Task Template
Here's the task template I use. There are probably better ones out there, but this one works for me. It's easy to copy-paste and modify for each new function.
# Task Template
## Context
[What exactly needs to be built and why]
## Implementation Steps
1. [Be stupidly specific here]
2. [Like, embarrassingly detailed]
3. [Trust me, more detail = better results]
## Definition of Done
- [ ] Code works as expected
- [ ] Tests pass (unit + integration)
- [ ] Static analysis is clean
- [ ] Nothing else broke
- [ ] Update docs if needed
The secret is to be ridiculously specific in your tasks. Don't make Claude guess what you want—spell it out like you're talking to someone who's never seen your code before.
But how do you make Claude Code actually work on the given task? For that, I created something called a Slash Command. Here's the /work-on-task command I use:
# Your task
You are going to work on a task described in #$ARGUMENTS path.
All tasks are in `.claude/tasks/{task-slug}` folder.
Inside the folder, tasks are saved as `{task-number}-{task-name}.md`.
Once you are done, summarize what you did in a new file (the same name as the task with suffix `-summary`).
# Task Summary Rules
- The Summary MUST be short and to the point.
- When everything is done and the Definition of Done is met, just write "Task completed." in the summary—don't repeat the task description.
- Always add to the summary how many tokens were used to complete the task.
- When the definition of done is not met, write what is missing so any other agent/human can take it from there.
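To kick off a task, I just run the command with the path to a task file, for example (the file name below is illustrative):

/work-on-task .claude/tasks/gh-1316/05-implement-reverse.md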
What's interesting about this command is that it lets me keep track of what Claude is working on and also shows approximately how many tokens were used to complete each task.
When a given task consumes too many tokens, it might be a good idea to look for a dedicated MCP server.
The Summary
I don't think AI has changed the way I approach problems; I've always tried to automate the boring parts of my work. It just happens that AI can handle repetitive, boring tasks pretty well if you supervise it properly.
AI can execute well-defined, repetitive tasks, but if you don't make those tasks small and precise, it's likely to cut corners or misunderstand your intent. After all, it's all based on probabilities, not true understanding.
If anything, this reinforced how important architectural thinking and code review skills are.
This blog post was written with help from Claude Code.
I came up with the idea, wrote the initial draft, and edited the final version. The AI Agent was used for rephrasing and polishing the text, fixing grammar, and ensuring the content was clear and concise. It was a collaborative effort, which I hope will let me blog more often and make sharing my thoughts easier and faster.