How Google is accelerating code migrations with AI

Google cuts code migration time in half by automating tasks with AI.

This post was originally published in Engineering Enablement, DX’s newsletter dedicated to sharing research and perspectives on developer productivity. Subscribe to get notified when we publish new issues.

I read Migrating Code At Scale With LLMs At Google, a new paper describing how Google used AI to automate a large, tedious migration initiative: converting 32-bit integers to 64-bit across their monolithic codebase. This type of migration had previously taken two years to complete manually. With their new system, Google cut that time in half while having AI generate 70% of the code changes.

This paper shows the potential of using AI to automate a substantial portion of code migration tasks.

My summary of the paper

Migrations are a necessary part of software maintenance, but they can be time-intensive, costly, error-prone, and unrewarding for developers, which makes them a strong candidate for AI automation. Google’s system addresses this by identifying code that needs changing, using an LLM to generate updates, validating the changes through several checkpoints, and routing successful modifications for human review. Today the system runs nightly, continually chipping away at the migration task until complete. In this paper, the authors describe how the system works, its results, and its benefits and challenges.
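The nightly loop described above can be sketched at a high level. This is a minimal illustration, not the paper's implementation: the `MigrationState` class and the callables passed to `nightly_run` are hypothetical stand-ins for Google's internal components.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationState:
    pending: set = field(default_factory=set)    # locations still needing changes
    submitted: set = field(default_factory=set)  # routed to human review
    manual: set = field(default_factory=set)     # failed checks; a developer takes over


def nightly_run(state, find_locations, generate_change, validate, send_for_review):
    """One pass of the migration loop: find, generate, validate, route."""
    # Pick up newly discovered locations, skipping ones already handled.
    state.pending |= find_locations() - state.submitted - state.manual
    for loc in sorted(state.pending):
        change = generate_change(loc)  # LLM-produced modification, or None
        if change is not None and validate(change):
            send_for_review(change)    # humans still review before submission
            state.submitted.add(loc)
        else:
            state.manual.add(loc)      # fall back to manual handling
    state.pending -= state.submitted | state.manual
```

Run nightly, the loop converges: each pass shrinks `pending` until every location has either been submitted or flagged for a developer.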

How the system works

Google’s automated migration workflow consists of three main components:

Finding where to make changes.

The system uses Kythe, Google’s internal code indexing system, to trace both direct and indirect references to ID fields. The system maps dependencies up to five levels deep, casting a wide net to avoid missing anything, even if that means over-including.

To deal with noise and false positives, they:

  • Use automated classifiers to flag irrelevant or already-migrated code
  • Run regression tests to catch missing or incorrect changes
  • Rely on the LLM to decide what actually needs to be edited

Importantly, instead of giving the LLM just a few lines of code, they feed it the entire file so it can understand the full context and make more accurate changes.

Categorizing references.

Each potential code location is sorted into one of four buckets:

  • Not-migrated locations are confirmed as needing changes, identified with 100% confidence through automated checks (like finding test code with small integers).
  • Irrelevant locations are those that definitely don’t need changes, such as class definitions or code already using values outside the 32-bit range.
  • Relevant locations are those that directly reference the ID and likely need investigation but don’t fall into either of the previous categories.
  • Leftover contains everything not automatically sorted into the other categories. Developers manually review these locations and decide whether they need migration or not. During the next system run, these manually reviewed locations are moved to the appropriate category based on the developer’s decision.

Making and validating the changes.

Google uses an internal version of Gemini to generate diffs. A prompt explains the migration and provides suggested lines, but the LLM is free to modify any part of the file.

Each proposed change is validated through a stepwise process:

  • Did the LLM return a valid result?
  • Did it change more than just whitespace?
  • Can the new code be parsed?
  • Was the change actually needed?
  • Does it build and pass tests?

Only changes that pass all checks are submitted for review. Failed changes are marked for manual handling.

Results and impact

The researchers evaluated their system through a comprehensive case study of 39 distinct migrations over twelve months. The results are impressive:

  • 595 code changes were submitted, containing 93,574 edits
  • 74% of code changes were generated by the LLM (either entirely or with human adjustment)
  • 69% of all edits were made by the LLM

Additionally, developers reported high satisfaction with the system and estimated a 50% reduction in time spent on migrations compared to the manual approach.

Benefits and challenges

The researchers found several advantages of their approach:

  • End-to-end automation: The system handled the entire process from identifying references to submitting validated changes.
  • LLM flexibility: The model adapted to different code styles, languages, and patterns with just a natural language prompt.
  • Validation pipeline: Developers only reviewed high-quality changes that had already passed builds and tests.

However, they also encountered challenges:

  • LLM limitations: Context window constraints, hallucinations, and variable performance across programming languages sometimes required manual intervention.
  • Pre-existing issues: Build failures and test dependencies occasionally hindered the automated process.
  • Production roll-out complexities: Large, distributed migrations still required careful management during production deployment.

Final thoughts

This paper highlights the considerable potential of using AI to assist with large-scale code migration tasks. It may be useful for teams exploring ways to improve developer productivity with AI, as well as for mature organizations looking for faster ways to update and maintain their codebases.

By automating much of the work, Google’s system cut down manual effort, saved developers time, and gave them a clearer sense of progress, making the entire migration process less tedious and more manageable.

Published
May 14, 2025