Coding with AI
Training myself to use a coding agent
Disclaimer: this was done in January 2026. AI is progressing fast and so are development methods, so this article may become obsolete sooner rather than later.
I had used Claude, ChatGPT and even Mistral before, but only in chat form, asking them to improve solitary functions or for technical details on certain libraries, but I had never actually coded anything alongside a coding agent. To follow the course of evolution and not be left behind like a caveman, I needed to learn how to code with AI.
I decided to use Claude Code for that, because I believe it is the most capable coding agent out there (even though I have no proof, just popular sentiment). It was quite easy to install, and running it felt just as easy.
The Premise
A few years ago I made a simple game using Python and Pygame, as an exercise. The game was a copy of an old CD-ROM game that we couldn't run anymore. It consists of a grid of stacked tiles; clicking a group of 2 or more removes them from the board and the remaining tiles fall into place. I figured it would be a good exercise to make it available as a web game, since I already had the game mechanics defined in the Python code. That left the AI a pretty thin margin for error.
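To give an idea of the mechanic, here is a minimal sketch of the two rules described above (group removal and falling tiles). This is hypothetical illustration code, not the original Pygame implementation; the column-first `board[col][row]` layout matches how the port ended up being structured.

```python
# Minimal sketch of the tile mechanic (illustration only, not the
# original Pygame code). The board is a list of columns (board[col][row]),
# each cell holding a color string or None when empty.

def find_group(board, col, row):
    """Flood-fill: collect all orthogonally connected tiles of the same color."""
    color = board[col][row]
    if color is None:
        return set()
    group, stack = set(), [(col, row)]
    while stack:
        c, r = stack.pop()
        if (c, r) in group:
            continue
        if 0 <= c < len(board) and 0 <= r < len(board[0]) and board[c][r] == color:
            group.add((c, r))
            stack.extend([(c + 1, r), (c - 1, r), (c, r + 1), (c, r - 1)])
    return group

def remove_group(board, col, row):
    """Remove a clicked group of 2+ tiles, then let the remaining tiles fall."""
    group = find_group(board, col, row)
    if len(group) < 2:
        return 0  # single tiles cannot be removed
    for c, r in group:
        board[c][r] = None
    # gravity: in each column, surviving tiles sink to the bottom
    # (higher row index = lower on screen)
    for c, column in enumerate(board):
        tiles = [t for t in column if t is not None]
        board[c] = [None] * (len(column) - len(tiles)) + tiles
    return len(group)
```

The flood fill and the per-column compaction are the whole game: everything else (rendering, sessions, scores) is plumbing around these two functions.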
Key Technologies Used
- Claude Code with Sonnet 4.5
- Backend in Python, reusing the logic of the old game
- Web server with FastAPI and uvicorn
- Frontend in HTML/JS, with PixiJS for the graphics
- Highscores saved on server with SQLite
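As an example of the last item, the highscore storage can be done entirely with Python's standard library. The table and column names below are assumptions for illustration, not the project's actual schema:

```python
# Hypothetical sketch of SQLite highscore storage (schema names are
# assumptions, not the project's actual ones). Uses only the stdlib.
import sqlite3

def init_db(path=":memory:"):
    """Open (or create) the database and ensure the highscores table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS highscores ("
        "  id INTEGER PRIMARY KEY AUTOINCREMENT,"
        "  player TEXT NOT NULL,"
        "  score INTEGER NOT NULL)"
    )
    return conn

def add_score(conn, player, score):
    conn.execute(
        "INSERT INTO highscores (player, score) VALUES (?, ?)", (player, score)
    )
    conn.commit()

def top_scores(conn, limit=10):
    """Return the best scores, highest first."""
    rows = conn.execute(
        "SELECT player, score FROM highscores ORDER BY score DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```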
The workflow
After reading a few articles on how to use coding agents, I realised that managing the context is a critical part of the workflow.
The big picture and key features seem to be the bare minimum to give the agent. That's a good starting point, but I felt that if I gave it
too much wiggle room and freedom on certain aspects, I would lose too much time correcting course and burn unnecessary credits,
so it was better to know what I wanted to do from the get-go.
The initial setup
The goal was not to play around and experiment by trying out features here and there on my game. I had a clear idea of what I wanted and a good base to start from.
So I had to restrain the AI's creativity as much as possible by providing as many details as I could on the architecture and specific implementations.
I started by writing a SPECS.md file describing the big picture.
Then with the help of Claude (chat version on the web, free tier), I added development phases to this SPECS.md file.
And those phases are what kept the AI on the rails: they described development milestones in detail,
which technologies to use, and what the expected outcome was.
For example, phase 1 was to have a basic working client/server structure, just serving a simple page showing an empty grid with FastAPI;
phase 2 was to handle multiple connections to this page; and from phase 3 onward, to implement the core logic.
It was written (in bold letters) that the agent should not continue to a new phase until the current phase's tests passed and I allowed it to proceed.
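A phase entry in SPECS.md might look something like this (hypothetical wording, since the actual file is not reproduced in this article):

```markdown
## Phase 1 — Basic Client/Server Structure

- Serve a single page with FastAPI showing an empty 22×9 grid.
- Expected outcome: opening http://localhost:8000 renders the grid.
- Tests: the page loads without console errors; the grid has the right dimensions.

**Do NOT proceed to Phase 2 until all Phase 1 tests pass and I explicitly approve.**
```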
About the tests: at first I asked the agent to run the tests itself, but at times they were failing and it concluded they had passed, and at other times it simply didn't test some cases at all. I think I am missing a proper way to instruct the agent on how to test, or maybe I should use another AI for that, but for now, and for an application as simple as this, I preferred not to lose time on it and ran the tests myself.
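One approach that might tighten this (a suggestion I have not fully verified, not something from the project) is to make the pass/fail signal mechanical rather than leaving it to the agent's judgment, for example with an instruction like:

```
Run the Phase N tests with `pytest -q` and paste the complete, unedited
output. A phase passes ONLY if pytest exits with code 0. If any test
fails or is skipped, report the phase as failing; never summarize.
```

Tying "passed" to an exit code leaves less room for the agent to declare success on its own interpretation of the output.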
Keeping the context between sessions
It became quite clear after my first coding session that, while the agent was still fast at reading the code and producing more, I would need to split the work and interrupt the session to resume it another day.
In practice that meant I would need to ask the agent to read SPECS.md again when I resumed working on the project, and also to tell it which PHASE had been completed.
What also became apparent was that, as much as I tried to specify everything beforehand in great detail, I still had to deviate a bit on some tasks. Those deviations had to be recorded somehow,
or I would have to make the agent work double time, reading all the current code and all the specs again to figure out what had changed between them. Not a solution. After some searching on the net,
I figured out that keeping a PROGRESS.md file to track what had been done and the deviations was a good start. The good thing is that
I could ask the agent to write this file itself at the end of the session, because it already had everything in context and knew exactly that information without having to think too much.
I realised after some time that the file it produced was not optimal and repeated a lot of what was already in SPECS.md.
So I opted for a JSON file instead. This file was better structured and easier for the agent to read and modify.
Here is a sneak peek into what this file looked like at the end of the project (only 2 phases shown here):
{
"project": "MagneTile Web Implementation",
"last_updated": "2026-01-27",
"current_phase": 7,
"status": "All Phases Complete (1-7)",
"next_phase": "none",
"phases_completed": {
"phase_1": {
"name": "Basic Client/Server Structure",
"status": "complete",
"deviations_from_specs": [
"Added /health endpoint (monitoring addition)",
"Used modern FastAPI lifespan context instead of deprecated on_event"
],
"key_additions": [
"Controlled randomization algorithm ported from Pygame mt_board.py",
"25% clustering probability for natural tile grouping",
"Column-first board indexing board[col][row] to match Pygame",
"Exact Pygame color matching (0xED7D77, 0x77ED83, etc.)",
"Background color 0xBFCBDB matching original"
]
},
"phase_2": {
"name": "Handle Multiple Connections",
"status": "complete",
"deviations_from_specs": [
"Implemented localStorage session persistence (NOT in specs)",
"Added 10-minute reconnection grace period (NOT in specs)",
"Cleanup runs every 60 seconds instead of hourly (more responsive)"
],
"key_additions": [
"Session persistence across page refresh",
"Reconnection feedback ('Reconnected!' message)",
"Grace period countdown in session API",
"SESSION_TIMEOUT_HOURS and RECONNECTION_GRACE_PERIOD_MINUTES config constants"
],
"session_management": {
"grace_period_minutes": 10,
"full_timeout_hours": 24,
"cleanup_interval_seconds": 60,
"security": "IP address validation prevents session hijacking"
}
},
[...]
},
"phases_remaining": {},
"major_additions_not_in_specs": {
"session_persistence": {
"storage": "localStorage",
"grace_period": "10 minutes",
"reconnection_feedback": "UI message on reconnect"
}
},
"technical_improvements": {
"fastapi_patterns": [
"Modern lifespan context manager (not deprecated on_event)",
"Proper async/await throughout",
"Type hints and documentation"
],
"pygame_port_accuracy": [
"Exact color matching",
"Controlled randomization with 25% clustering",
"Column-first indexing board[col][row]",
"Visual depth matching original"
]
},
"current_capabilities": {
"backend": [
"WebSocket communication",
"Multi-session management with isolation",
"Session persistence and reconnection",
"Tile group detection (flood-fill)",
[...]
],
"frontend": [
"Pixi.js rendering with images",
"Click-to-remove interaction",
"Smooth animated board updates (fade, fall, slide)",
[...]
]
},
"testing_status": {
"phase_1": "All criteria met",
"phase_2": "All criteria met + session persistence verified",
[...]
},
"file_structure": {
"backend": [
"main.py - FastAPI app, WebSocket, session management, highscore endpoints",
"game_logic.py - GameState class, board logic, scoring, restart",
"database.py - SQLite highscore management (Phase 7)",
[...]
],
"frontend": [
"index.html - Main page with UI overlays and game over modal",
[...]
]
},
"config_constants": {
"board_dimensions": "22 cols × 9 rows",
"tile_colors": 5,
"tile_size": "40×64px",
[...]
},
"project_completion": {
"all_phases_complete": true,
"total_phases": 7,
"completion_date": "2026-01-27",
"specifications_followed": true,
"deviations_documented": true,
"testing_complete": true,
"production_ready": true
}
}
This is more information than I anticipated (the real file is 400 lines). But that is a good thing. Less is more? Not here: we need as many details as possible. And after reading this file again, I even think we could add more.
With that in place, the whole workflow was complete: at the start of a session, I tell the agent to read those two files (SPECS.md and PROGRESS.json) and that's it,
the agent is ready to go! I don't even need to tell it to read the source code; it already has a solid context. When implementing the next step, the agent already knows which file to modify
(it has the path in context from either SPECS.md or PROGRESS.json) and reads precisely the files it needs to start working.
This limits what it loads into context and keeps usage costs (reading + thinking) to a minimum.
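The session-start instruction itself can be a one-liner (hypothetical wording, not the exact prompt I used):

```
Read SPECS.md and PROGRESS.json. Resume at the phase after "current_phase".
Do not read any source file until a task requires it, and do not start a
new phase without my explicit approval.
```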
How fast was that? And how expensive?
Working with the agent was not as fast as I thought it would be, but that's because I had high expectations. In the end it saved me a lot of time: I didn't have to search around and figure things out by myself. I had never used PixiJS before, for example, so instead of searching how to use the library to do this or that, the agent quickly knew what it was capable of and gave me functions tailored to my needs.
Depending on the phase, I could spend between 30 minutes and 2 hours on it. This range can be explained by how well I split the project; I would say creating phases that take at most 1 hour each would be perfect. Being able to stop at any phase was a plus and allowed me to work across multiple days without worry.
Reloading the context every time was not costly, maybe $0.60 on average. Not very significant compared to the total cost.
In total I spent around 6h30 on it, most of it with the agent running and me checking its output.
The cost was below $15 (USD), and it probably saved me a full week of work. So I would argue it was a 5x time gain.
Conclusion
Spending time defining the SPECS.md file is mandatory for a project to go smoothly (and to not cost too much);
even if it takes a day to do so, this is a critical step.
PROGRESS.json is also key: it is another guard rail that restrains the agent further, avoids stacking up deviations, and reduces cost.
I would say it is also a mandatory file to have and maintain.