If you’ve spent any time in the trenches of software engineering, you know the beast I’m talking about: the legacy system. It’s that monolithic application guarded by cryptic comments (or worse, no comments at all), and held together by sheer historical inertia.
The usual drill? Days, weeks, sometimes months of archaeology. You’re digging through old documentation that’s probably more fiction than fact. Drawing up architectural diagrams is more art than science at this point. And it’s painstaking work.
But here’s a practical approach that’s changed how I tackle these titans: leveraging AI.
My Playbook: AI for Legacy Dissection
Think of an LLM as a highly skilled analyst who still needs direction. It won’t tell you why a particular design choice was made 20 years ago, but it can quickly show you what exists and how it’s connected. Here are the steps:
1. Feeding it the Code
This is the foundational step, and it runs straight into data privacy. Many legacy systems hold incredibly sensitive business logic or customer data.
- Cloud LLMs (Gemini, ChatGPT, etc.): They’re powerful, yes, but for proprietary legacy code, you must know your company’s policies. I always check contracts: What happens to my data? Is it used for training? For how long is it retained? For truly sensitive stuff, I’m very cautious about copy-pasting. Sometimes, I’ll sanitize code snippets by removing sensitive variable names or data.
- On-premise/Self-hosted LLMs: This is the gold standard for high-security environments. If your organization has the resources, deploying open-source LLMs on your own servers is the way to go. It keeps everything in-house, giving you complete control. It’s more setup work, but the peace of mind is worth it.
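To make the sanitizing idea concrete, here is a minimal Python sketch. The identifier names and placeholder mapping are hypothetical, purely illustrative; in practice you would curate the list per codebase before pasting anything into a cloud LLM.

```python
import re

# Hypothetical mapping of sensitive identifiers to neutral placeholders.
SENSITIVE_NAMES = {
    "customer_ssn": "field_1",
    "account_secret": "field_2",
    "internal_rate_table": "table_1",
}

def sanitize(snippet: str) -> str:
    """Replace sensitive identifiers with neutral placeholders
    before pasting a snippet into a cloud-hosted LLM."""
    for name, placeholder in SENSITIVE_NAMES.items():
        # \b prevents mangling longer identifiers that merely
        # contain the sensitive name as a substring.
        snippet = re.sub(rf"\b{re.escape(name)}\b", placeholder, snippet)
    return snippet

code = "rate = lookup(internal_rate_table, customer_ssn)"
print(sanitize(code))  # rate = lookup(table_1, field_1)
```

Keep the mapping around so you can translate the LLM’s answer back to the real names afterwards.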
Once privacy is sorted, there’s the context window challenge. Legacy codebases are often huge. You can’t just dump a gigabyte of code into an LLM.
- Chunking is Key: We break the codebase down. I usually start with what I think are the main entry points, the `src` directory, or perhaps a folder that seems to house critical business logic. I feed it file by file, or logical groups of files.
- Prioritization: I focus on areas that seem most critical or relevant to my immediate task (e.g., “I need to fix this bug in the invoicing module, so let’s start there”).
- Prompting Like a Detective: Instead of a vague “Explain this,” we can go specific:
- “Analyze this COBOL program. What are its main sections and what data files does it interact with?”
- “Given this Java class, what are its responsibilities, and which other classes does it depend on? Describe its main methods.”
- “Trace the data flow for an ‘order placement’ operation starting from this entry point. Show me the sequence of calls between components.”
- “Identify the main architectural patterns or anti-patterns present in this legacy system’s structure.”
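The chunking step above can be sketched in a few lines of Python. This is a rough illustration, not a definitive implementation: the character budget is an arbitrary stand-in for the real context-window limit, and the file suffixes are just examples.

```python
from pathlib import Path

# Illustrative budget per chunk; the real limit depends on the model.
MAX_CHARS = 12_000

def chunk_sources(root: str, suffixes=(".java", ".cbl")) -> list[list[str]]:
    """Group source files into batches that each fit the budget,
    so a batch can be sent to the LLM as one prompt."""
    chunks, current, size = [], [], 0
    for path in sorted(Path(root).rglob("*")):
        if path.suffix not in suffixes:
            continue
        file_len = path.stat().st_size
        # Flush the current batch once adding this file would overflow it.
        if current and size + file_len > MAX_CHARS:
            chunks.append(current)
            current, size = [], 0
        current.append(str(path))
        size += file_len
    if current:
        chunks.append(current)
    return chunks
```

Each returned batch is a list of file paths you can read, concatenate, and send with one of the prompts above.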
2. Visualizing the Beast: Diagrams from AI
After the LLM has chewed on the code, we ask it to produce diagram code. This is where we get *the* “map.” Humans are visual creatures, and a diagram is worth a thousand lines of undocumented code.
I’ve experimented with a few tools, all of which LLMs are surprisingly good at generating syntax for:
- PlantUML / Mermaid.JS: Great for quick sequence diagrams or component diagrams. If I just need a fast visual of a particular flow, these are my go-tos.
- Graphviz (DOT): For more complex dependency graphs where relationships are key, Graphviz is excellent.
- Structurizr (C4 Model): This is my personal favorite, especially for legacy systems. The C4 model (Context, Container, Component, Code) helps me zoom in and out. I can get a high-level system view (Context), then drill down to what runs where (Container), then what modules are inside (Component). The LLM can generate the Structurizr DSL, giving me a structured, manageable view. It’s a game-changer for understanding how all the pieces fit together.
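To give a feel for what the LLM hands back, here is a small Structurizr DSL sketch at the Container level. The system and its parts are entirely hypothetical, just the shape of output you might ask for:

```
workspace "Legacy Billing" {
    model {
        user = person "Back-office Clerk"
        billing = softwareSystem "Billing System" {
            batch = container "Nightly Batch" "COBOL programs" "Mainframe"
            api = container "Invoice API" "Java servlet layer" "Tomcat"
            db = container "Billing DB" "Customer and invoice records" "DB2"
        }
        user -> api "Creates invoices"
        api -> db "Reads/writes"
        batch -> db "Reconciles nightly"
    }
    views {
        container billing {
            include *
            autoLayout lr
        }
    }
}
```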
It’s rarely perfect on the first try, so we need to iterate: “That relationship is wrong, fix it” or “Break this component down further.” Think of it more as a dialogue.
3. Interrogating the Oracle: Deeper Insights
Now that we have a mental model, we use the LLM to dig deeper, almost like having a highly detailed and always-available expert to bounce ideas off:
- “Given this error message from the logs, which module is most likely responsible, and what are its dependencies?”
- “If I refactor this `calculateInterest` routine, what are the potential ripple effects across the `Financials` module?”
- “Can you explain the purpose of this obscure data structure (`OLD_REC_001`) based on its usage patterns in the code?”
- “What would be a good strategy for writing automated tests for this `BatchProcessor` given its complexity and lack of existing tests?”
- “Suggest a strategy for migrating this `CustomerDAO` to a more modern ORM, outlining the challenges and potential steps.”
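Answers about dependencies and ripple effects are worth cross-checking against something mechanically extracted from the code. Here is a sketch, using Python sources only because that keeps the example self-contained; for Java or COBOL you would swap in the appropriate parser. It walks a tree with `ast` and emits a Graphviz DOT dependency graph.

```python
import ast
from pathlib import Path

def import_graph(root: str) -> str:
    """Extract module-level imports from Python sources under `root`
    and emit a Graphviz DOT graph of module dependencies."""
    edges = set()
    for path in sorted(Path(root).rglob("*.py")):
        module = path.stem
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    edges.add((module, alias.name.split(".")[0]))
            elif isinstance(node, ast.ImportFrom) and node.module:
                edges.add((module, node.module.split(".")[0]))
    lines = [f'  "{a}" -> "{b}";' for a, b in sorted(edges)]
    return "digraph deps {\n" + "\n".join(lines) + "\n}"
```

If the LLM claims a module has no dependents and this graph says otherwise, trust the graph and ask the LLM to reconcile the difference.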
This back-and-forth saves me hours of manual exploration. It accelerates my understanding, whether I’m trying to fix a gnarly bug or plan a modernization roadmap.
My Stack: Gemini + Structurizr – A Powerful Combo
I’ve found Google Gemini to be particularly effective for code understanding. It seems to have a strong internal representation of code logic, which translates into more accurate and insightful analyses. Paired with Structurizr for the visualization side, it covers the full loop: Gemini digests the code, Structurizr maps it.
Looking Ahead: Our Augmented Future
We’re still early on this journey. Imagine IDEs with built-in AI assistants that generate architectural diagrams on the fly, or CI/CD pipelines that automatically update documentation based on code changes. It’s a fascinating future for us legacy engineers.
But let’s be clear: AI is a tool. My experience has taught me that the human engineer, with their domain knowledge, critical thinking, and intuition, is still essential. AI helps me see the forest and the trees, but I’m still the one deciding which trees to cut down, which to nurture, and how to replant the forest for the next generation.
So, if you’re wrestling with a legacy beast, I encourage you to experiment with these AI tools. They won’t solve all your problems, but they can certainly provide a powerful compass to navigate the uncharted territories of old code. You might be surprised at how much clearer that “monster” becomes.