Table of Contents
In this article, I will share my philosophy of making large pull requests palatable for reviewers. It’s a perspective I’ve developed from the hundreds of pull requests I created, reviewed, and shipped in the last decade.
Table stakes #
Pull requests take the throne of engineering collaboration tooling. PRs are the de-facto approach to managing, tracking, and reviewing proposed changes to a codebase. Popular repositories, like Bitcoin or Rails (at the time of writing this), have tens of thousands of pull requests on Github. They’re genuinely the bedrock of collaboration in open source projects.
Engineering teams, from large organizations to small garage startups, all love pull requests. An engineer that aspires to work in a team must learn how to create tight PRs. It’s table stakes.
The broader community agrees on the value that patch proposals bring. But the baptism didn’t pan out, as vendors tend to come up with their names for things. The industry standard name seems to be the one GitHub, Gitea, and Bitbucket use – a pull request. GitLab calls it a merge request. And Google – a changelist (CL).
I will stick to pull request (PR).
The value of reviews #
Pull requests are the collaboration bedrock as they are the best outlet for reviewing code. Reviews provide tremendous value. First, they allow quality assessment:
- Keep the code quality bar high (enough)
- Catch issues before they get to production
- Give confidence with the proposed changes
- Check for architecture adherence
Second, they’re a knowledge-sharing medium:
- Master the codebase by reviewing PRs
- Pick up communication and written skills while composing PRs
- Learn to give and receive meaningful feedback
Third, PRs provide a checkpoint to assure coding standards:
- Assure compliance through automated formatters and linters
- Security posture by involving security professionals in the review
Code reviews have other benefits that are well documented on the internet and in print. Engineers must educate themselves about the value of reviews and how to review code well, as it’s essential to their success as engineers. If you’re unfamiliar with the benefits of code review, I absolutely recommend spending time on that.
More lines, more problems #
No two pull requests are the same. Some are a breeze to review. Others, not so much. Reviewing chunkier pull requests often packed with value is the area I want us to explore. The popular advice is two-fold when it comes to dealing with large pull requests:
- Keep pull requests small
- Split more extensive changes into multiple minor pull requests.
Good advice, I am sure you’ll agree. But as with any advice, it doesn’t suit every situation. Sometimes splitting a PR is not possible, or just prohibitively cumbersome. In such cases, the only remaining option is to make our chunky pull request simpler to review.
How can we make PRs easier to digest and review? Storytelling. Use the PR description and commits as storytelling vehicles.
“Big bang” pull requests are not ideal. Instead, each pull request should have a gradual flow of proposed changes. Like the credits roll after the motion picture, a commit always comes after each step. Always. They are cheap to create. They are trivial to amend. And squashing them is easy when they’re not needed.
Storytelling via commits #
Commits are a storytelling vehicle. They are checkpoints in time that follow our steps in the codebase and tell a story of the changes in the PR in chronological order. Three aspects are crucial to maximize the checkpoints' effectiveness:
- Maintain a good commits order
- Pick a good diff for each commit
- Write a descriptive commit message
Tracing the path #
The reviewer must see the author’s footsteps in the path to follow their steps. Thus I don’t place traps for them. I don’t take detours. I avoid drive-by fixes. I refrain from dirty diffs or massive commits. I order the commits chronologically to give the PR a flow. As a reader, imagine reading a story that begins with an outro, continues with the intro, and ends with the story’s culmination — what a mess. So, I think of my pull request as a book: each commit is a chapter, and the reviewer is the reader.
Diff assembly #
When organizing the diffs, I aim to assemble logical chapters. Therefore each commit must contain diffs that are logically related. Adding unrelated diffs in a commit is like having erratic book chapters. You can be sure the reader will put down the book after just a few pages if the storyline is all over the place.
Coherent commits require high hygiene. The easiest way I’ve found to have quality commits is through atomic commits. For example, when I add a method, I make sure I commit only its unit test with it. Once invoked the new method in the codebase, I commit the invocation with its test separately. In the PR with two commits, the reviewer will see a small story: I add the new method in the first commit; I invoke it in the next. Equally important, I cover both chapters of the story with tests.
While it requires effort, strong commit hygiene makes tracing the author’s steps simple. And it makes for an uncomplicated review.
The message #
The commit message summarizes the change and why the author made it. It gives the reviewer the context and reasoning behind the change. Also, a good commit message shows the collaborative qualities of an engineer. Engineers that write good commit messages consider the reader of the message, whether that’s someone else or their future selves.
Commit messages affect not only the review qualities of a pull request. They also affect the overall quality of the repository. A repository with a lousy commit log will diminish the excellent code it contains. Likewise, if the history is hard to navigate, it reduces the quality of the project.
The commit message quality also relates to the maintainability of the project. In addition to the reviewers, commits are essential for maintainers too. When maintaining a project, maintainers must traverse the Git history to discover the context of the code they’re changing. In such cases, it’s apparent that the success of the maintainer has a causal relationship to the quality of the message.
I recommend reading “How to Write a Git Commit Message” if you’d like to improve your commit skills.
Storytelling via description #
The PR description is another place I see poorly used by others. The key word is ‘context’. When I open an extensive PR, I briefly explain the change that I am proposing. It must explain the reason for the modifications. It’s not only what I changed or how I did it (it’s evident in the diff) but also why I did it. If we’d use the book analogy from before, the PR description should contain the prelude to the story the commits will tell. I like to mention a few sentences about the current state of the code, its shortcomings, and how the new implementation will solve/improve its flaws.
Use a template #
A PR template is there to give the PR structure, emphasize critical aspects of the PR, and guide the author through the process. Therefore, I appreciate it when a team takes the time to compose their pull request template. It shows that the team values reasonable and standardized PRs, a signal for its culture.
But composing a custom template from scratch wastes time, especially in a world with many pre-built templates. So I usually tell folks not to come up with a template but instead start with one from Github and adapt it. It’s much easier to start with a good foundation and customize the last 10% as needed.
Temporary changes #
When dealing with an emergency, resorting to non-idiomatic or obtrusive changes is acceptable. Understandably, any reviewer will reject an incident mitigation PR without the proper context. In such cases, a brief and precise description is critical. What kind of emergency? What are the effects? Is it an acute emergency, such as an incident? The author must enable the reviewer to review and approve efficiently.
There are other cases when PRs with short-lived changes are warranted. Like as a temporary workaround or experimental code. Temporary code works fine in the short term, but its PRs should explain how the author will make the temporary code better, cleaner, clearer, or more idiomatic in the long run. Such explanations to the description make us think about paying the temporary debt the PR will incur. And it shows the reviewer that the author is aware of the shortcomings and has a plan to address them.
Test and rollout/rollback plans #
Extensive pull requests have a larger blast radius, so they must undergo rigorous testing. Test plans must accompany such PRs, preferably linked in the description of the PR. The plan must consist of the scenarios tested by the author, reproduction steps, and the scenario outcome. By thinking about test scenarios, we don’t just follow a process. We think about edge cases or absurd consequences. Deep thinking unlocks a profound understanding of what could go wrong and how.
The same goes for rollout plans. To de-risk a PR release, it should be accompanied by a rollout plan. Similar to the test plan, the rollout plan can be a linked document or described in the pull request description. Feature flags, dark deploys, experiments, you name it. But make sure you link it.
To accompany the examples above, it’s a good practice to add proof that the proposed change works as intended. Screenshots, short recordings, or links to the staging configuration with the applied change. There are many options. Some companies have short-lived deploy previews of pull requests – like Netlify’s deploy previews. If your company has the capability, use it and show the reviewer how the PR works in action — all without impacting any customers.
Example PRs #
Before I wrap up, I want to leave you with a few examples of non-trivial pull requests, with good commit messages and descriptions.
- Implement BIP 340-342 validation (Schnorr/taproot/tapscript) in bitcoin/bitcoin
- Add delegated type to Active Record in rails/rails
- [9.x] Query builder interface in laravel/framework
- [PATCH v9] can, tty: can327 CAN/ldisc driver for ELM327 based OBD-II adapters in linux
- Stabilize generic associated types in rust-lang/rust
- Add encryption to Active Record in rails/rails