An AI PR Review Prompt Template for Clean Diffs
A copy-paste ai pr review prompt template that turns a full diff into a prioritised verdict — architecture, defects, security, tests — instead of a throwaway snippet.
The difference between a PR review that catches the regression and one that waves it through usually isn't the model. It's whether the prompt has a workflow or just a wish. "Review this pull request" is a wish. The model reads three files, writes a paragraph, and moves on.
A real ai pr review prompt treats the diff like a senior engineer does: several deliberate passes, each looking for a different class of problem, then one report that ranks everything by how badly it'll hurt in production. That's a template, not a one-liner. And it's reusable, because the only thing that changes between PRs is the {{diff}} you paste in.
Most of the copy-paste prompts floating around stop at the wish. They give you a single instruction and no output shape, so the review depth swings wildly run to run. This template fixes the shape.
Why a single-pass review misses the important stuff
When you ask for everything at once, the model satisfices. It finds the easy, surface-level issues, declares victory, and never gets to the architectural problem hiding three files down.
The fix is to make the prompt run distinct passes:
- Architecture pass. Does this change fit the existing design, or bolt on a parallel way of doing the same thing?
- Defect pass. Edge cases, null handling, off-by-ones, race conditions.
- Security pass. Auth, input validation, injection surfaces touched by the diff.
- Test pass. What changed that isn't covered, and what tests are missing.
Each pass primes the model to look for one thing. Stacked, they cover ground a single "review this" never reaches. Then the template merges the findings into one ranked list, because four separate reports nobody reads is its own failure mode.
Depth in a PR review comes from separate passes, not from asking harder. A model told to "look for security issues" finds more of them than the same model told to "review the code," even on the identical diff. Naming the pass is the prompt.
The PR review prompt template
Here's the anatomy. The variables are what you reuse; the workflow and output contract are written once.
Variables → {{diff}}, {{pr_description}}, {{repo_conventions}}
Role → Senior reviewer producing a merge-ready verdict.
Passes → 1 architecture 2 defects 3 security 4 test coverage
Per finding → FILE:LINE | SEVERITY | PASS | risk | suggested fix
Scope check → compare {{pr_description}} to the diff; flag drift
Verdict → blocking count, then APPROVE or REQUEST CHANGES
Two details earn their keep. First, the {{pr_description}} variable lets the prompt catch scope creep: code that changed but was never mentioned in the PR. That's where sneaky regressions live. Second, the verdict line is a single machine-readable result, so the same prompt drops cleanly into a CI gate later.
Where models drift, and how the template holds them
A template only works if it survives a real diff. Two model behaviors matter here.
Claude holds a multi-pass structure across a long diff if each pass is a numbered heading. GPT-4o tends to merge the passes back together after the first one unless you restate "complete all four passes before writing the verdict" near the end of the prompt. Both models will, left alone, stop reviewing once they've found something quotable. The explicit instruction to finish every pass is what keeps the back half of the diff from getting a free ride.
| Behavior | Claude | GPT-4o |
|---|---|---|
| Runs all four passes | Yes, with numbered headings | Collapses to one pass unless reminded near the end |
| Honors APPROVE/REQUEST verdict line | Consistent | Consistent when contract is last |
| Catches scope drift | Strong with {{pr_description}} present | Needs the comparison spelled out as a step |
A PR review prompt that doesn't compare the diff against the PR description is doing half the job. The most dangerous changes aren't the ones that look wrong; they're the ones nobody mentioned. Make scope drift a first-class output, not an afterthought.
Prompt-craft patterns for clean diffs
Pattern 1: rank, don't list
A flat list of twelve comments buries the one that matters. Sort the output by severity so the blocking issue is line one. Reviewers triage top-down; respect that.
Pattern 2: make "nothing to flag" a valid pass result
Each pass should be allowed to return empty. Otherwise the model invents a nit per pass to look diligent, and you train your team to ignore the output. An honest empty pass builds trust faster than padded findings.
Pattern 3: keep the diff first, the contract last
Same rule as any long-context prompt. The {{diff}} goes early, the output contract on the final lines. Models weight recent tokens, so the format you want is the format you state last.
Variables you'll set
| Variable | Required | What it is |
|---|---|---|
{{diff}} | Yes | The full unified diff for the PR |
{{pr_description}} | No | What the PR claims to do, for scope-drift checks |
{{repo_conventions}} | No | Standards the review must enforce |
{{severity_threshold}} | No | The level that triggers REQUEST CHANGES |
Getting started
- Decide your four passes. Architecture, defects, security, tests is a strong default.
- Write the per-finding row and the verdict line. Lock them.
- Paste a real PR diff into
{{diff}}and its description into{{pr_description}}. - Read the output top-down. Is the worst issue first? Did it catch anything the PR didn't mention?
- If a pass returned filler, add "an empty pass is valid; do not invent findings."
- Save the template so every PR runs the same workflow. The Pull Request Review Workflow Pack ships this as a tested pack: a diff-to-prioritised-report flow across architecture, defect risk, security, and test coverage, with concrete test additions you can paste back.
If you'd rather start from the review standard itself, the Code Review Policy System Prompt sets the dimensions and severity tiers that this PR template enforces.
The Pull Request Review Workflow Pack does this end-to-end. A single {{diff}} variable feeds a multi-pass review that ends in a prioritised report and an approve-or-block verdict, plus ready-to-implement test additions for the gaps it finds. It's part of The Complete AI Prompts Bundle, a one-time lifetime license to the whole catalog and every pack added later, worth it once you're running reviews on more than the odd PR.
The template here is the engine; the standard it runs is a separate choice. For the dimension-and-severity layer that feeds it, see the ai code review prompt that actually finds bugs. And if you're still deciding whether a packaged review pack beats rolling your own, how to choose a reusable AI prompt pack walks the trade-offs.
Browse the developer prompt packs →Common questions
What should an AI PR review prompt include?
Why does a one-off PR review prompt give shallow results?
Can I feed a GitHub PR diff straight into the prompt?
Get the prompt packs this guide is built on
Ready-to-paste prompts with documented variables and worked examples for ChatGPT, Claude, and Gemini. One-time payment, own it forever.
More prompt guides

A Production Readiness Review Prompt That Grades a Service
A service ships, and two weeks later it pages someone at 3 a.m. because nobody asked whether it had alerting before launch. The production readiness review checklist exists to catch that. Most teams k…

Write an AI Code Review Prompt That Actually Finds Bugs
A developer pastes a 400-line diff into ChatGPT, types "review this," and gets back three friendly paragraphs ending in "overall this looks solid." The off-by-one in the pagination loop is still there…

The AI Prompt to Review a Pull Request (With a Findings Contract)
A pull request review prompt that you retype from scratch every time isn't a workflow. It's a habit you'll skip the moment you're busy. The reusable version, with a real AI security code review prompt…