Redefining Software Development: Human Architects and AI Builders

In the last couple of years, my approach to programming has undergone a significant transformation. It all started when I began integrating ChatGPT into my coding routine. What began as a simple experiment quickly evolved into a powerful workflow that has reshaped how I tackle software development.

Introduction

Key Concepts:

  • Transformation of programming approach through AI integration

  • Emergence of the human as code architect, AI as builder

  • Potential implications for the future of software development

Through this process, I've come to see myself as an architect of code, with AI serving as my builder. I lay out the blueprint of what needs to be done, while the AI rapidly generates the initial structure. This partnership has not only increased my productivity but has also changed how I think about problem-solving in programming.

As I've refined this method, I've noticed patterns emerging that could have far-reaching implications for software development as a whole. This article explores my journey, the lessons learned, and the potential future of AI-assisted coding.

The Architect and the Builder: A New Paradigm

Key Concepts:

  • Human architect breaks down complex concepts for AI builder

  • AI rapidly generates initial code structure

  • Iterative refinement process between human and AI

ChatGPT has become an integral part of my daily programming routine. I've discovered that I naturally take on the role of an architect, envisioning how everything fits together, while AI acts as a skilled builder, generating close-to-working code. When the AI produces code, I can either make quick fixes if I spot simple issues or paste it back into the chat with a new prompt, providing just enough context for the AI to know how to improve it. This process became so ingrained in my programming that I began to notice a pattern emerging.

In essence, this is our relationship with AI: the human as the architect, the AI as the builder of smaller components. The human's role is to break down complex concepts into simpler steps that AI can process and execute. Imagine you want to construct a website from scratch. Typically, when working with ChatGPT or Claude, you'd first design the blueprint in your mind. You'd envision a home page with specific components, a contact page with particular elements, and so on. Then, assuming we're focusing solely on programming and setting aside hosting and stack decisions, you'd prompt an AI to generate a general home page with the components you've conceptualized.

The AI would swiftly create a template of the page you described, far faster than any human could. However, it wouldn't be well-designed, and considerable tweaking would be necessary - adding new components here, removing others there. Nevertheless, you'd have a starting point with generally good practices already implemented in the code. What's next? You'd return to the AI and say, "Let's focus on the header section." If you encounter a type error in the header image code snippet, you'd copy the problematic code and its surrounding context, paste it into the AI prompt, along with the error message from your editor. More often than not, the AI would know exactly how to fix it. You'd then copy the solution back into your code, and voila - it works without errors.
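
To make that loop concrete, here's the flavor of fix that typically comes back. This snippet is purely illustrative, not code from a real project: the Header and HeaderImage components, the config object, and the type error itself are all invented for the example.

```tsx
import React from "react";

// Hypothetical header image component that requires a string logo URL.
interface HeaderImageProps {
  logoUrl: string;
  altText: string;
}

function HeaderImage({ logoUrl, altText }: HeaderImageProps) {
  return <img src={logoUrl} alt={altText} />;
}

// The original call site passed config.logoUrl, typed string | undefined,
// producing: "Type 'string | undefined' is not assignable to type 'string'."
// The suggested fix: fall back to a placeholder so the prop is always a string.
export function Header({ config }: { config: { logoUrl?: string } }) {
  return (
    <HeaderImage
      logoUrl={config.logoUrl ?? "/placeholder-logo.png"}
      altText="Site logo"
    />
  );
}
```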

Running the project, you'd see a functional home page, albeit poorly designed and missing some key components you originally envisioned. AI models excel at templating code, as evident in GitHub Copilot's generation of unit tests or its autocompletion features. They produce excellent templates that give you a head start. This is a crucial point: although both you and the AI can generate code, the AI does so much faster, giving you a jumpstart on writing code you would have written anyway. It's then up to the human architect to refine the builder's work and make it function as intended.

Returning to our website example, now that you have a poorly designed but working site, you can focus on areas needing improvement. You might notice the absence of a background video in the middle section that you had originally planned. You'd highlight the code for that section, paste it into the AI prompt, and instruct it to create a background video using a file from S3. The AI returns code that does precisely this, which you then paste back into your editor, pointing it at a video stored on S3. The site is looking better, but you're struggling with the text alignment at the bottom of the section. Once again, you'd copy the relevant code, paste it into the AI prompt, and receive a solution.
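
The code that comes back for that step tends to look something like the sketch below. The bucket URL, file name, and component name are placeholders I've made up for illustration; the only real assumptions are an MP4 sitting in S3 and a React component wrapping the section.

```tsx
import React from "react";

// Hypothetical middle section with a looping background video served from S3.
// The bucket URL and object key below are placeholders.
const VIDEO_SRC =
  "https://my-example-bucket.s3.amazonaws.com/videos/hero-background.mp4";

export function BackgroundVideoSection({ children }: { children: React.ReactNode }) {
  return (
    <section style={{ position: "relative", overflow: "hidden" }}>
      <video
        autoPlay
        muted
        loop
        playsInline
        style={{
          position: "absolute",
          inset: 0,
          width: "100%",
          height: "100%",
          objectFit: "cover",
        }}
      >
        <source src={VIDEO_SRC} type="video/mp4" />
      </video>
      {/* Foreground content sits above the video */}
      <div style={{ position: "relative" }}>{children}</div>
    </section>
  );
}
```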

As you can see, a pattern emerges. I, the human architect, understand the overarching problem and can instruct the AI builder, providing appropriate context for it to fix issues most of the time. The main advantage of an AI is that, with the right prompt and context, it can write the correct solution faster than humans. But that's the key - humans must be the architects, the visionaries, to make this collaboration truly effective.

Challenges and Evolution: From Simple Tasks to Complex Problems

Key Concepts:

  • Exploration of AI as autonomous architects

  • Challenges with embeddings and vector databases

  • Appreciation of the complexity in daily programming tasks

The Quest for Autonomous AI Agents

This realization led me to ponder: could AI models become the architects of their own tasks? This is where Autonomous AI Agents enter the picture, having sparked much discussion recently. The challenge, however, is that while these tools can create a general website for you, the end result often falls short of expectations, riddled with issues that require human intervention to fix. Ironically, that intervention can take just as much time as building the whole thing yourself, if not longer. Often, the output of these Autonomous AI Agents is so off-target that you have to scrap it and start from scratch.

Exploring Embeddings and Vector Databases

This line of thinking led me on a journey over the past few months. It began when I delved deeper into embeddings and vector databases, creating FeatureTranscribeAI in the process. As I learned more about these technologies and their applications, I realized that by converting an entire codebase into embeddings, one could use a simple algorithm to find relevant code from a feature request or bug ticket. In theory, this meant quickly locating the pertinent code, feeding it into an AI along with the feature request or bug description, and instructing the AI to fix it. This mirrors the architect-builder pattern I described earlier.
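
Under the hood, that retrieval step is essentially nearest-neighbor search over chunk embeddings. Here's a minimal sketch of the idea, not FeatureTranscribeAI's actual implementation: the embed function is a stand-in for whatever embedding API you use, and the code chunks are assumed to be pre-split and pre-embedded.

```typescript
// Minimal sketch of embedding-based code retrieval.
// `embed` is a placeholder for any embedding API (OpenAI, a local model, etc.).
declare function embed(text: string): Promise<number[]>;

interface CodeChunk {
  filePath: string;
  source: string;
  embedding: number[]; // precomputed from `source`
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank code chunks by similarity to a feature request or bug description,
// then return the top K to include in the AI prompt.
async function findRelevantCode(
  ticket: string,
  chunks: CodeChunk[],
  topK = 5
): Promise<CodeChunk[]> {
  const ticketEmbedding = await embed(ticket);
  return [...chunks]
    .sort(
      (a, b) =>
        cosineSimilarity(b.embedding, ticketEmbedding) -
        cosineSimilarity(a.embedding, ticketEmbedding)
    )
    .slice(0, topK);
}
```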

However, my findings revealed a flaw in this approach. While the AI could indeed find relevant code, it wasn't always the right relevant code to feed back into the system. The AI's solutions often sounded correct, using the right terminology and code snippets, but were frequently broken and half-baked.

The Complexity of Daily Programming Tasks

This experience made me appreciate the complexity of the problems we solve daily in programming. When we're in the thick of it, coding every day, we often forget about all the little things we do to understand the code we're writing and its context.

Consider a seemingly simple feature request: "Add a required form field to capture a user's zipcode." Sounds straightforward, right? But let's break it down. Assuming it's a React TypeScript project with styled-components for form styling, we need to ensure the following (a sketch of the backend pieces, items 4 and 5, follows the list):

  1. The new field uses Pascal case to match the existing code format

  2. A reusable styled component is used for consistency with other form fields

  3. The appropriate class name for global styling is applied (as it's a required field)

  4. An API endpoint is written in TypeScript Node.js with a Prisma ORM

  5. Zod is used for request validation

  6. Documentation is updated and uploaded to readme.io for developer reference
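
To make items 4 and 5 concrete, here's a rough sketch of what the backend piece might look like, assuming an Express-style route and a Prisma user model with a zipcode column. Every name here is hypothetical, not taken from a real project.

```typescript
import express from "express";
import { z } from "zod";
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();
const app = express();
app.use(express.json());

// Zod schema for the request body: zipcode is required, 5-digit US format assumed.
const updateZipcodeSchema = z.object({
  userId: z.string(),
  zipcode: z.string().regex(/^\d{5}$/, "Zipcode must be 5 digits"),
});

// Hypothetical endpoint persisting the new required field via Prisma.
app.post("/api/users/zipcode", async (req, res) => {
  const parsed = updateZipcodeSchema.safeParse(req.body);
  if (!parsed.success) {
    return res.status(400).json({ errors: parsed.error.flatten() });
  }
  const { userId, zipcode } = parsed.data;
  const user = await prisma.user.update({
    where: { id: userId },
    data: { zipcode },
  });
  return res.json(user);
});
```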

As human architects, we might handle this task with relative ease, but an AI builder would likely stumble in several areas. However, by employing the iterative architect-builder pattern I described earlier, you could potentially complete this task faster than if you were working alone.

Rethinking Code Comprehension

Key Concepts:

  • Understanding how humans comprehend code

  • Importance of identifying focus areas and context

  • Creation of extract.ai for automated context collection

Realizing this after developing FeatureTranscribeAI, I took a step back to reassess what was going wrong in collecting relevant coding context. I realized I was approaching it from the wrong angle. The key lay in understanding how humans comprehend code - something we rarely consciously consider.

When I encounter a feature request or bug, how do I make sense of the code? First, I identify the area of focus - where the code needs to be implemented. For the backend of the feature described above, I'd locate the file handling the routes for that form, then compare each route's name to the one the frontend form calls until I find a match. Once I've found the appropriate code snippet, I'd examine where the request body is processed and look at the related types and code called from within the request and the surrounding code. If I can include this code and (especially) the code residing in referenced functions, classes, and variables from other files, I have most, if not all, of the context needed for a low to medium complexity task.

This realization led me to create extract.ai, a tool that allows me to select code in a file (the entry point) and automatically copy all referenced code as well. This includes most of the context I need for tasks of varying complexity, often providing more comprehensive context than I could manually collect.
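
extract.ai follows actual symbol references, but the core idea can be approximated with something much simpler: start from an entry file and pull in the files it imports. The sketch below uses the TypeScript compiler API to do just that; it only follows relative .ts imports and is a simplification, not the tool's real implementation.

```typescript
import * as fs from "fs";
import * as path from "path";
import * as ts from "typescript";

// Recursively collect an entry file plus every relative import it references,
// concatenating their contents into a single context string for the prompt.
function collectContext(entryFile: string, seen = new Set<string>()): string {
  const resolved = path.resolve(entryFile);
  if (seen.has(resolved) || !fs.existsSync(resolved)) return "";
  seen.add(resolved);

  const source = fs.readFileSync(resolved, "utf8");
  const sourceFile = ts.createSourceFile(resolved, source, ts.ScriptTarget.Latest, true);

  let context = `// --- ${resolved} ---\n${source}\n`;
  ts.forEachChild(sourceFile, (node) => {
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      const specifier = node.moduleSpecifier.text;
      if (specifier.startsWith(".")) {
        // Assume imported files end in .ts for this sketch.
        const importedPath = path.resolve(path.dirname(resolved), specifier + ".ts");
        context += collectContext(importedPath, seen);
      }
    }
  });
  return context;
}
```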

While this approach significantly improved my workflow, it got me thinking further. Providing the right context to AI models with the appropriate prompt can be incredibly powerful, but the output is only as good as the input.

Many of you have likely experimented with ChatGPT to create simple webpages. As mentioned earlier, it excels at creating individual components or pages. However, things become much more complicated with larger codebases as the AI loses context. By including the right cross-file context as I've described, we open up a world of possibilities.

I began to wonder if we could take this a step further and automate the context retrieval and prompting process. This proved challenging, as even identifying the entry point where a bug needs fixing or a feature needs adding is complex. There are numerous variables to consider, such as whether changes are needed in one spot or multiple locations, whether a database migration is necessary, and so on. While challenging, I don't believe it's impossible - it just needs to be broken down into simpler steps.

So, what's the simplest yet most valuable task that programmers do that might be automated through this process? What's mind-numbingly repetitive but still crucial? Unit tests emerged as a prime candidate. The beauty of writing unit tests is their predictability. Even the test file name follows a consistent pattern, typically matching the name of the file being tested with an additional ".test." or ".spec." in the name, and the test usually lives in the same directory.
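
That predictability is easy to encode. Here's a tiny sketch of the convention described above; the ".test" suffix and same-directory assumption are defaults, not a universal rule.

```typescript
import * as path from "path";

// Derive the conventional test file path for a given source file,
// e.g. src/utils/formatDate.ts -> src/utils/formatDate.test.ts
function testFilePathFor(sourceFile: string, suffix: ".test" | ".spec" = ".test"): string {
  const ext = path.extname(sourceFile); // ".ts", ".tsx", ".js", ...
  const base = sourceFile.slice(0, -ext.length);
  return `${base}${suffix}${ext}`;
}
```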

Mimicking Human Problem-Solving in AI

Key Concepts:

  • Breaking down complex tasks into manageable parts

  • Creating a multi-step AI process to replicate human problem-solving

  • Impressive results in unit test generation

I put all of this together and applied it to unit test generation, with astounding results. While not perfect, the outcome demonstrated that automating code writing with AI is indeed possible. The AI wrote test code that closely matched the style of existing tests in the file. It successfully reused functions and even imported them correctly. Most of the time, the tests worked right out of the box! When they didn't, I could quickly make corrections. The main issues were that the generated tests weren't always useful or focused on the intended code, and there were occasional problems with test-specific elements like middleware usage or framework selection.

When tackling complex coding tasks, human programmers typically follow a mental process of breaking down the problem into smaller, manageable parts. This process involves:

  1. Understanding the big picture

  2. Identifying key components

  3. Determining the order of operations

  4. Recognizing dependencies between parts

  5. Adapting to new information as it arises

I wondered if we could teach AI to mimic this approach. The experiment involved creating a multi-step AI process that attempts to replicate these human problem-solving skills:

  1. The AI starts with a high-level understanding of the task.

  2. It then generates prompts to gather more specific information about each component.

  3. The AI uses this information to break the task down further, creating a chain of increasingly focused queries.

  4. As it progresses, the AI adapts its approach based on the new context it gathers.

This process continues, with each step breaking down the task into simpler components, much like a human programmer would. The goal was to see how much of this high-level planning - typically the domain of human programmers - we could effectively delegate to AI.
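
A stripped-down version of that loop might look like the sketch below. Everything in it is illustrative: callModel stands in for any chat-completion API, gatherContext for a context collector like the import-following sketch earlier, and the prompts are simplified placeholders rather than the ones I actually use.

```typescript
// Hypothetical stand-in for any LLM chat-completion call.
declare function callModel(prompt: string): Promise<string>;

// Hypothetical context collector (e.g. the import-following sketch earlier).
declare function gatherContext(hint: string): Promise<string>;

// Multi-step decomposition: plan at a high level, then issue increasingly
// focused queries for each sub-step, feeding back what has been learned.
async function solveTask(task: string): Promise<string[]> {
  // 1. High-level understanding of the task, expressed as numbered sub-steps.
  const plan = await callModel(
    `Break this programming task into a short numbered list of sub-steps:\n${task}`
  );
  const steps = plan
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^\d+\./.test(line));

  const results: string[] = [];
  let learned = ""; // context accumulated across sub-steps

  for (const step of steps) {
    // 2-3. Gather more specific information for this component.
    const context = await gatherContext(step);
    // 4. Adapt the next query based on everything learned so far.
    const answer = await callModel(
      `Task: ${task}\nCurrent sub-step: ${step}\n` +
        `Relevant code:\n${context}\nLearned so far:\n${learned}\n` +
        `Produce the code change for this sub-step.`
    );
    learned += `\n${step}\n${answer}`;
    results.push(answer);
  }
  return results;
}
```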

The results were impressive, particularly in unit test generation. This approach significantly outperformed template-based tools like GitHub Copilot and Codium, coming much closer to producing ready-to-use code.

This experiment showed that while AI excels at creating boilerplate code, its true potential emerges when it can mimic human problem-solving strategies. By combining AI's speed and pattern recognition with a more human-like approach to breaking down problems, we can achieve a level of efficiency and accuracy that bridges the gap between human and machine capabilities in programming.

This journey of discovery and refinement led to the creation of celp. I've been using it personally for a while now, and I'm consistently pleased with the results. While it works well on my projects, I'm curious to see how it performs for others, so we can iteratively improve and support a wider range of projects. I initially embarked on this process because I saw the potential in the AI-human collaboration in coding, and I recognized an opportunity to automate a significant portion of my workflow while gaining more confidence in pushing code, knowing it's backed by far more tests than I would have written manually.

I'm releasing celp because I believe it can help others as it has helped me. I also want to push this conversation further, potentially tackling more complex tasks with similar approaches. After going through all of this, I now know it's possible, and I'd love to hear your thoughts on where we can take this technology next.

The Birth of Celp: Automating Unit Tests and Beyond

Through this journey of challenges and discoveries, I've come to appreciate the intricate dance between human architects and AI builders in the coding process. Celp represents a significant step forward in this relationship, automating a crucial yet often tedious aspect of programming while maintaining the human touch where it matters most. As we continue to explore and refine these tools, I'm excited to see how they can further enhance our coding practices and push the boundaries of what's possible in software development.