
How AI Understands Pull Request Context

A technical look at how AI models analyze PR data (descriptions, commits, diffs, and labels) to generate release notes that capture intent, not just changes.

By the Engammo Team

When you merge a pull request, there is a lot of context scattered across different places. The PR title gives a quick summary. The description (if someone wrote one) explains the motivation. The commit messages track the evolution of the work. The diff shows exactly what changed. Labels and linked issues add categorization and business context. The challenge for any automated release note system is figuring out which of these signals matter and how to combine them into something a human would actually want to read.

This is not a search problem. It is a comprehension problem. The AI needs to understand not just what the code change does, but why it was made and who cares about it. A customer cares that "CSV exports now include the created_at timestamp." They do not care that "added created_at column to ExportSerializer and updated the corresponding factory."

The signal hierarchy

Not all PR data is equally useful. Through building and iterating on release note generation, we have found a fairly reliable hierarchy of signal quality:

PR descriptions are the highest-value source. When an engineer takes the time to write a description, they are usually explaining the "why": the user-facing motivation or the bug that was reported. This is exactly the information that makes a good release note. The problem is that PR descriptions are optional, and a surprising number of them are either empty or filled with template boilerplate that adds no real information.

Commit messages come next. They tend to be more technical than PR descriptions, but they capture the incremental thinking of the developer. A series of commits like "add retry logic to webhook handler," "add exponential backoff," "add max retry limit" tells a story: the developer was making webhook delivery more reliable. A good AI can synthesize that sequence into "Improved webhook reliability with retry logic and exponential backoff."

Labels and linked issues provide categorization. A PR labeled "bug" with a linked GitHub issue that says "Export fails when date range exceeds 90 days" gives the AI both the category (bug fix) and the user-facing description of the problem. This is gold for release note generation.

The diff is the source of truth but the hardest to interpret. It tells you exactly what changed at the code level, but translating code changes into user-facing descriptions requires understanding the codebase architecture. A change to a file called invoice_pdf_generator.rb is probably about invoice PDF generation. A change to utils.py could be about anything.
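Taken together, the hierarchy above can be sketched as a priority-ordered collection step: gather whatever signals exist, best first, so downstream summarization can lean on the strongest one available. This is an illustrative sketch, not a real API; the field names (`description`, `commits`, `labels`, `linked_issues`, `diff_files`) are assumptions about how the PR data might be shaped.

```python
def gather_signals(pr: dict) -> list[tuple[str, str]]:
    """Collect available PR signals in descending order of usefulness."""
    signals = []
    if pr.get("description", "").strip():
        signals.append(("description", pr["description"]))  # highest value: the "why"
    for msg in pr.get("commits", []):
        signals.append(("commit", msg))                     # incremental developer thinking
    for label in pr.get("labels", []):
        signals.append(("label", label))                    # categorization (bug, feature)
    for issue in pr.get("linked_issues", []):
        signals.append(("issue", issue))                    # user-facing problem statements
    for path in pr.get("diff_files", []):
        signals.append(("diff", path))                      # source of truth, hardest to read
    return signals
```

A summarizer consuming this list can stop as soon as it has enough high-priority material, falling back to file paths only when nothing better exists.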

Handling the noise

The biggest challenge is not finding good signals. It is filtering out bad ones. PR templates are a major source of noise. Many teams use templates with sections like "Description," "Testing," "Screenshots," and "Checklist." If the developer did not fill in the description section, you get the template header with no content. If the AI naively includes this, the release note says something useless like "Description: N/A. Testing: Added unit tests."

We spent a lot of time building heuristics to detect template-only content. If a PR body contains only section headers with no meaningful content under them, we ignore it and fall back to commit messages. If it contains checkbox lists (the typical "I have read the contributing guidelines" checklist), we strip those out. The goal is to extract the parts where a human actually wrote something, and discard the structural noise.

Another source of noise is merge commits and automated commits. Dependabot PRs, for instance, have very predictable descriptions ("Bumps lodash from 4.17.20 to 4.17.21") that can be summarized much more concisely. CI-generated commits, version bumps, and auto-formatted code changes all need special handling to avoid cluttering the release notes with things nobody cares about.
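A minimal sketch of that special handling: classify a change as bot-generated, a version bump, or human work, and summarize the first two tersely. The author names and title pattern below are assumptions; real repositories will need their own lists.

```python
import re

# Common automation accounts (illustrative, extend per repository)
BOT_AUTHORS = {"dependabot[bot]", "renovate[bot]", "github-actions[bot]"}
# Matches titles like "Bumps lodash from 4.17.20 to 4.17.21"
BUMP_RE = re.compile(r"^Bumps? \S+ from [\d.]+ to [\d.]+", re.IGNORECASE)

def classify_automation(author: str, title: str) -> str:
    """Label a PR so automated changes can be summarized or batched, not narrated."""
    if author in BOT_AUTHORS:
        return "bot"            # e.g. batch all Dependabot PRs into one line
    if BUMP_RE.match(title):
        return "version-bump"   # condense to "updated dependencies"
    return "human"              # full summarization pipeline applies
```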

From code context to user context

The hardest part of release note generation is the translation from developer context to user context. Consider a PR that changes a database query to add an index. The developer thinks of this as a performance optimization. The user thinks of it as "the dashboard loads faster." The release note should say something closer to the user's version.

AI models handle this translation by drawing on training data that includes millions of examples of how developers describe changes versus how changelogs present them. The model learns that "add database index on users.email" maps to "improved performance for user lookup operations," not because it understands databases, but because it has seen enough examples of that pattern.

This is also where the PR title matters. Developers who write good PR titles ("Fix: Dashboard takes 10s to load with large datasets") give the AI a strong starting signal. Developers who write "fix bug" as their PR title are making the AI's job much harder. One of the indirect benefits of automated release notes is that teams start writing better PR titles and descriptions once they see their words being turned into customer-facing notes.

The summarization challenge

A single PR might touch 40 files and include 12 commits. The release note for that PR should be 1-2 sentences. That is an extreme compression ratio, and getting it right requires the AI to make judgment calls about what matters most.

The approach that works best is hierarchical summarization. First, group the commits by their intent (bug fix, feature addition, refactoring). Then, identify the primary intent of the PR, usually the one described in the title or the first paragraph of the description. Finally, generate a note that leads with the primary intent and optionally mentions secondary changes if they are significant.

For example, a PR titled "Add team billing dashboard" that also includes commits for fixing a CSS bug and updating a dependency would generate a note focused on the billing dashboard. The CSS fix and dependency update are not worth mentioning in the release note because they are incidental to the main purpose of the PR.
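The three steps can be sketched with keyword heuristics standing in for the model's intent classification. The keyword table is purely illustrative; in practice an LLM makes these judgment calls, but the shape of the pipeline (classify commits, group by intent, lead with the title's intent) is the same.

```python
# Illustrative keyword-to-intent table; a real system uses a model, not a lookup.
INTENT_KEYWORDS = {
    "fix": "bug fix",
    "bug": "bug fix",
    "add": "feature",
    "feat": "feature",
    "refactor": "refactoring",
    "bump": "dependency",
    "upgrade": "dependency",
}

def classify(message: str) -> str:
    """Guess the intent of a commit message from its first word."""
    first_word = message.lower().split()[0].rstrip(":")
    return INTENT_KEYWORDS.get(first_word, "other")

def summarize(title: str, commits: list[str]) -> dict:
    """Group commits by intent and identify the PR's primary intent from its title."""
    groups: dict[str, list[str]] = {}
    for msg in commits:
        groups.setdefault(classify(msg), []).append(msg)
    return {"primary": classify(title), "groups": groups}
```

Run on the billing-dashboard example, the primary intent comes out as "feature", and the CSS fix and dependency bump land in secondary groups that the note generator can choose to omit.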

Quality feedback loops

No automated system gets it right every time. The most important thing is having a feedback mechanism. When a generated note is wrong or misleading, someone should be able to edit it. Over time, those edits become training signals, not for the global model, but for understanding which types of changes need more careful handling in your specific codebase.
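One lightweight way to capture that signal is to store each human edit alongside the generated note and track how often each category of change gets rewritten. The categories with the highest edit rates are the ones that need more careful handling. This is a hypothetical sketch; the record fields are assumptions.

```python
from dataclasses import dataclass

@dataclass
class NoteEdit:
    """A generated release note paired with its human-edited version."""
    pr_number: int
    category: str   # e.g. "bug fix", "feature"
    generated: str
    edited: str

def edit_rate_by_category(edits: list[NoteEdit], totals: dict[str, int]) -> dict[str, float]:
    """Fraction of generated notes that humans rewrote, per category."""
    counts: dict[str, int] = {}
    for e in edits:
        if e.generated != e.edited:  # only count notes that actually changed
            counts[e.category] = counts.get(e.category, 0) + 1
    return {cat: counts.get(cat, 0) / n for cat, n in totals.items() if n}
```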

The teams that get the most out of automated release notes are the ones that treat the system like any other tool: set it up, watch the output for a week, adjust the configuration, and iterate. The first batch of generated notes will not be perfect. But they will be consistent, complete, and produced without anyone having to remember to write them. That baseline is worth more than the occasional perfectly crafted manual entry that sits next to six empty ones.

If you are curious about trying this with your own PRs, see how Engammo processes pull request data to generate release notes automatically.

See automated release notes in action

Connect your GitHub repositories and start generating AI-powered release notes in under 2 minutes.