A New Git Diff Algo

TL;DR

Myers diff represents changes mainly as add/delete between two repo snapshots, which can inflate visual noise for moves, renames, and refactors.

Briefing

The diff view familiar from GitHub is getting a rethink: a newer diff strategy called “commit cruncher” is designed to cut the amount of code reviewers must visually parse by recognizing more kinds of changes than the classic Myers diff. Myers diff, named for Eugene Myers’ canonical algorithm, treats line differences largely as add/delete operations between two repo snapshots, which can make refactors look like large, noisy edits even when most of the work is moving code around or making trivial updates.

The research behind commit cruncher argues that expanding the diff “vocabulary” (adding operations like move, update, find/replace, and copy/paste) can produce a more compact, reviewer-friendly representation of a pull request. In practical examples, whitespace-only changes can be treated as low-signal updates so reviewers focus on meaningful edits. Refactors that extract code into a new function can be shown as a smaller set of “what changed” events rather than thousands of lines appearing as rewritten. The approach also aims to preserve context for multi-step edits—such as when a rename happens and later commits modify the renamed file—by tracing how each changed line evolves across the commit sequence.
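As a rough illustration of what a "move" operation adds to the vocabulary, the sketch below post-processes an ordinary add/delete diff and relabels lines that were deleted in one place and added unchanged in another. It is not commit cruncher's implementation: `summarize_changes` is a hypothetical helper, and Python's `difflib.SequenceMatcher` stands in for the underlying line diff.

```python
import difflib
from collections import Counter

def summarize_changes(old, new):
    """Count moved, deleted, and added lines between two snapshots.

    A line counts as "moved" when the same text is both deleted and added.
    This is a deliberate simplification for illustration only.
    """
    deleted, added = [], []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.extend(old[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    # Multiset intersection: text that appears on both sides of the diff.
    moved = sum((Counter(deleted) & Counter(added)).values())
    return {"moved": moved, "deleted": len(deleted) - moved, "added": len(added) - moved}

old = ["import os", "def load():", "    return os.getcwd()", "def save():", "    pass"]
# Same functions, but save() now comes before load().
new = ["import os", "def save():", "    pass", "def load():", "    return os.getcwd()"]

summary = summarize_changes(old, new)
print(summary)
```

An add/delete view would highlight two deleted and two added lines here; with the move label, the same change is reported as two moved lines and nothing rewritten.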

The core workflow shift is important: Myers diff compares only the repo state before and after a commit, while commit cruncher takes a more computationally intensive route by tracking each changed line through the commits where it appears, building a “commit group” view. The payoff claimed in the research is twofold: fewer highlighted lines to review and better reviewer context. One benefit described is that hovering over a line can surface the commit messages that explain why that line ended up in its final form. Another is that when a line is moved and then modified, the diff can show the original location and evolution rather than forcing reviewers to reconstruct the story from two endpoints.

Empirical results are based on 12,638 pull requests processed in the second half of May 2024, spanning popular open-source projects (including React, VS Code, Chromium, and TensorFlow) and SaaS repositories. Using GitHub’s API compare endpoint as a baseline, the study reports “28% fewer lines to review” on average when commit cruncher’s diff highlighting is used instead of Myers-style highlighting. A separate experiment, in which 48 developers reviewed pairs of pull requests on GitHub and on the commit cruncher platform, found no meaningful difference in question accuracy (differences under 5%), while review duration decreased, as expected.

Still, the evidence comes with caveats. The reported “lines to review” metric is a proxy for effort, not a direct measure of comprehension difficulty or bug-finding outcomes. Review time can drop even if the underlying cognitive load shifts elsewhere, and real-world code review includes confounders like reviewer familiarity, language expertise, and the nature of the changes. Even so, the broader takeaway is clear: diff algorithms shape how humans interpret change, and moving beyond Myers’ add/delete framing could reduce the visual noise that slows reviews—especially for refactors, moves, and incremental edits.

Cornell Notes

Myers diff—the default line-diff approach behind GitHub-style comparisons—mostly reduces changes to add/delete between two repo snapshots. Commit cruncher aims to improve that by using a richer set of diff operations (including move, update, find/replace, and copy/paste) and by tracing how changed lines evolve across the commit sequence. The claimed result is less reviewer-visible “diff noise”: a study of 12,638 pull requests reported about 28% fewer highlighted lines to review on average. In a separate user study with 48 developers, question accuracy stayed essentially the same (differences under 5%), while review duration decreased. The practical implication is that diff representation can materially affect review throughput, though “fewer lines” is still an indirect measure of bug-finding quality.

Why does Myers diff often make refactors look worse than they are?

Myers diff constructs a visual diff using only two repo states: before the commit and after the commit. That endpoint-only view tends to classify many non-trivial edits as combinations of deletions and additions, even when the underlying change is a move, rename, or a sequence of incremental edits. The result is a noisy red/green picture that forces reviewers to infer what actually happened.
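The endpoint-only effect is easy to reproduce with any two-snapshot line diff. The sketch below uses Python's `difflib.SequenceMatcher`, which is not Myers' algorithm but shares the same add/delete view of two snapshots; `add_delete_counts` is a hypothetical helper written for this example.

```python
import difflib

before = [
    "def helper():",
    "    return 1",
    "",
    "def main():",
    "    return helper()",
]
# Exactly the same lines, but helper() has been moved below main().
after = [
    "def main():",
    "    return helper()",
    "",
    "def helper():",
    "    return 1",
]

def add_delete_counts(old, new):
    """Count the lines a two-snapshot diff marks as deleted and as added."""
    deleted = added = 0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += i2 - i1
        if tag in ("insert", "replace"):
            added += j2 - j1
    return deleted, added

deleted, added = add_delete_counts(before, after)
print(f"{deleted} lines shown as deleted, {added} lines shown as added")
```

No line of code was actually rewritten, yet the diff still reports deletions and additions, which is the red/green noise the reviewer then has to decode as a move.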

What is the key difference in how commit cruncher builds a diff?

Commit cruncher uses a more computationally intensive approach: it traces each changed line through the commits where it appears, grouping related edits across the commit sequence. Instead of treating the diff only as a transformation from the initial state A to the final state C, it preserves lineage through the intermediate state B, so that a rename (commit B) followed by edits to the renamed file (commit C) can be understood together.
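A minimal sketch of line tracing, assuming each commit is available as a full list of lines. The helpers `map_lines` and `trace_line` are hypothetical, the pairing of edited lines by position is a simplification, and the example uses a variable rename inside one file rather than a file rename. It is meant only to show the idea of following a line through intermediate states instead of comparing endpoints.

```python
import difflib

def map_lines(old, new):
    """Map old line index -> new line index for lines that are unchanged
    or edited in place (replace blocks of equal size, paired by position)."""
    mapping = {}
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal" or (tag == "replace" and i2 - i1 == j2 - j1):
            for offset in range(i2 - i1):
                mapping[i1 + offset] = j1 + offset
    return mapping

def trace_line(snapshots, start_index):
    """Follow one line of the first snapshot through every later snapshot.

    Returns the line's text at each step and stops early if the line
    cannot be mapped forward (for example, because it was deleted).
    """
    history = [snapshots[0][start_index]]
    index = start_index
    for old, new in zip(snapshots, snapshots[1:]):
        index = map_lines(old, new).get(index)
        if index is None:
            break
        history.append(new[index])
    return history

state_a = ["total = 0", "for x in items:", "    total += x"]
state_b = ["total = 0", "for item in items:", "    total += item"]        # commit B: rename
state_c = ["total = 0", "for item in items:", "    total += item.price"]  # commit C: later edit

history = trace_line([state_a, state_b, state_c], 2)
for step in history:
    print(step)
```

The resulting history lists the line as it looked in A, B, and C, which is the kind of per-line story a tool could attach commit messages to.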

How do richer diff operations reduce reviewer workload?

By recognizing more change types than add/delete, the diff can compress low-signal edits and highlight higher-signal differences. The research describes examples where whitespace-only changes are treated as trivial updates, so reviewers can focus on substantive edits. It also describes cases where extraction/refactoring is shown as a smaller set of meaningful changes rather than thousands of lines appearing rewritten.
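One way to picture the whitespace case: compare changed blocks after collapsing whitespace, and label blocks that become identical as low-signal. This is a sketch under that assumption, not the tool's actual rule; `normalize` and `classify_changes` are hypothetical names, and a real tool would need language awareness, since indentation is meaningful in some languages.

```python
import difflib

def normalize(line):
    """Collapse runs of whitespace and strip the ends of the line."""
    return " ".join(line.split())

def classify_changes(old, new):
    """Label each changed block as 'whitespace' or 'substantive'.

    Returns (label, start, end) tuples using line indices of the old snapshot.
    """
    labels = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old_block = [normalize(line) for line in old[i1:i2]]
        new_block = [normalize(line) for line in new[j1:j2]]
        label = "whitespace" if old_block == new_block else "substantive"
        labels.append((label, i1, i2))
    return labels

old = ["function add(a, b) {", "  return a + b;", "}", "const limit = 10;"]
new = ["function add(a, b) {", "    return a + b;", "}", "const limit = 20;"]

labels = classify_changes(old, new)
print(labels)
```

A viewer could then dim or fold the blocks labeled `whitespace` and keep full highlighting for the `substantive` ones.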

What evidence is used to claim a reduction in review effort?

The study processes 12,638 pull requests from May 2024, using GitHub’s compare endpoint to capture the number of added/deleted lines GitHub would show, then comparing that to commit cruncher’s highlighted change-line counts. It reports about 28% fewer lines to review on average (with median differences reported as roughly 27%–31% depending on change magnitude). The metric is based on highlighted lines, not direct bug outcomes.
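The headline figure is a relative reduction in highlighted lines. The arithmetic can be sketched as below; the per-PR numbers are invented for illustration, and averaging per-PR percentages is an assumption, since the summary does not say how the study aggregates.

```python
from statistics import mean, median

def percent_fewer(baseline_lines, highlighted_lines):
    """Relative reduction in lines to review, as a percentage of the baseline."""
    return 100 * (baseline_lines - highlighted_lines) / baseline_lines

# Hypothetical (baseline added+deleted lines, commit-cruncher highlighted lines) per PR.
pull_requests = [(200, 144), (50, 40), (1000, 650)]

reductions = [percent_fewer(base, shown) for base, shown in pull_requests]
print("per-PR reductions:", reductions)
print("mean:", round(mean(reductions), 1), "median:", median(reductions))
```

Whatever the aggregation, the quantity counts highlighted lines only, so it says nothing by itself about how hard the remaining lines are to understand.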

Did reviewers actually understand the code better, or just review faster?

In the 48-developer experiment, question accuracy differences between GitHub-style diffs and commit cruncher diffs were under 5%, described as statistically insignificant. Review duration decreased, and the study presents plots comparing accuracy vs duration. That suggests throughput improved without a clear accuracy gain, though the proxy measures still leave open whether bug-finding quality changed.

Review Questions

How does endpoint-only diffing (before vs after) tend to misrepresent moves and refactors compared with line-tracing across commits?
What does a “28% fewer lines to review” metric measure, and what important real-world outcomes might it fail to capture?
Why might review duration decrease without a measurable improvement in question accuracy?

Key Points

1. Myers diff represents changes mainly as add/delete between two repo snapshots, which can inflate visual noise for moves, renames, and refactors.
2. Commit cruncher aims to reduce that noise by using additional diff operations such as move, update, find/replace, and copy/paste.
3. Instead of only comparing pre- and post-commit states, commit cruncher traces changed lines through the commits where they appear to build a more contextual diff view.
4. The research reports about 28% fewer highlighted lines to review across 12,638 pull requests, using GitHub compare output as a baseline.
5. A user study with 48 developers found question accuracy differences under 5% while review duration decreased, implying faster review without clear accuracy gains.
6. “Fewer lines to review” is a proxy for effort; it does not directly prove fewer bugs or better production outcomes.
7. Diff representation can change reviewer attention and interpretation, so algorithmic choices can affect developer workflow even when the underlying code change is the same.

Highlights

Commit cruncher’s core shift is tracing changed lines through the commit sequence, not just comparing repo state before vs after.

The study’s headline metric—about 28% fewer highlighted lines to review—targets visual noise reduction rather than directly measuring bug rates.

Examples emphasize treating whitespace-only changes as low-signal and representing refactors as smaller, more meaningful change sets.

In the developer study, accuracy stayed essentially flat (under 5% difference) while review time dropped, suggesting throughput gains without obvious comprehension gains.

Topics

Diff Algorithms
Git Pull Requests
Code Review
Myers Diff
Commit Cruncher

Mentioned

Eugene Myers
AI
IDE
CIS
PR
CTO
CEO
SaaS
API
CDF
V8
TCP
VS Code
OBS