Module 7 · Lesson 1

Why Review Knowledge Decays, and How to Stop It

Every decision made in a code review disappears unless someone captures it. Most teams don't.
What institutional knowledge vanishes every time a reviewer leaves, and what would it cost to keep it?

In 2012, the Knight Capital Group lost $440 million in 45 minutes when a legacy code path (one that senior engineers remembered deactivating years earlier) was silently reactivated during a deployment. The institutional context for that code, including why it existed and why it was disabled, existed only in the memories of people who had since left the firm. No knowledge base. No annotated decision record. No review artifact linking the original rationale to the production system.

Knight Capital needed emergency rescue financing within days, and by the end of the year it had agreed to be acquired by a rival. The cost of not writing things down was measured in corporate survival.

The Decay Problem

Code review generates enormous quantities of decisions: why a particular pattern was rejected, why an exception was granted, why a security trade-off was accepted. Almost none of it is recorded in retrievable form. It lives in pull request comment threads that are never searched, in Slack messages that scroll away, and in the heads of engineers who move on.

Research by Rigby & Bird (2013, "Convergent Contemporary Software Peer Review Practices," presented at ESEC/FSE) found that the median code review comment thread in large open-source projects was consulted again fewer than three times after the PR closed. Teams default to re-litigating the same decisions because retrieval is harder than re-discussion.

The problem compounds with team turnover. A 2022 survey by DX (formerly DevEx) found that engineers at companies with fewer than 200 developers spent an average of 4.1 hours per week recreating context that existed somewhere in the organization but could not be efficiently found. For a 30-engineer team, that is more than 120 hours weekly, the equivalent of three full-time engineers, lost to knowledge decay.
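The arithmetic behind that estimate is short enough to check directly. The sketch below uses the survey's 4.1 hours and the 30-engineer team from the text; the 40-hour working week is an assumption added here to convert hours into engineer-equivalents.

```python
# Figures quoted in the lesson text.
HOURS_LOST_PER_ENGINEER = 4.1
TEAM_SIZE = 30

# Assumption (not from the survey): one full-time week is 40 hours.
FULL_TIME_WEEK_HOURS = 40


def weekly_context_loss(team_size: int, hours_each: float) -> float:
    """Total hours per week a team spends recreating existing context."""
    return team_size * hours_each


lost_hours = weekly_context_loss(TEAM_SIZE, HOURS_LOST_PER_ENGINEER)
engineer_equivalents = lost_hours / FULL_TIME_WEEK_HOURS

print(f"{lost_hours:.0f} hours per week")
print(f"about {engineer_equivalents:.1f} full-time engineers")
```

Changing `TEAM_SIZE` or `FULL_TIME_WEEK_HOURS` shows how quickly the loss scales with headcount.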

Real Cost Benchmark

Google's internal engineering effectiveness research (Forsgren et al., "DORA State of DevOps Report 2023") identifies "documentation quality" as one of five predictors of elite software delivery performance. Teams in the top quartile for documentation quality deploy 208× more frequently than the lowest quartile. The gap is not tooling; it is captured knowledge.

Three Categories of Review Knowledge

Before building a knowledge base, you need to distinguish what is worth capturing. Review knowledge falls into three categories with different shelf lives and retrieval patterns:

Architectural Decisions
Long-lived rationale
Why a pattern was chosen, why an alternative was rejected, what constraints existed at the time. Shelf life: years. Format: Architecture Decision Records (ADRs).
Domain Invariants
Business logic boundaries
Rules the code must never violate, and why. Often discovered through review, rarely documented. Shelf life: medium-term, changes with product. Format: annotated rule catalogs.
Review Precedents
Decisions made under ambiguity
When two reviewers disagreed, how it was resolved and why. What exception was granted and under what conditions. Shelf life: short to medium. Format: decision logs linked to PRs.

The Architecture Decision Record Standard

Michael Nygard introduced the Architecture Decision Record (ADR) format in 2011 in a blog post titled "Documenting Architecture Decisions." The format is deliberately minimal: each ADR captures the context, the decision, the status (proposed, accepted, deprecated, superseded), and the consequences. Nothing more.
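To make the minimal format concrete, here is a small sketch of an ADR as a data structure that renders itself to Markdown. The four statuses and the context, decision, and consequences fields come from the description above; the `number` and `title` fields, the class names, and the example content are assumptions added for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROPOSED = "Proposed"
    ACCEPTED = "Accepted"
    DEPRECATED = "Deprecated"
    SUPERSEDED = "Superseded"


@dataclass(frozen=True)
class ADR:
    number: int   # assumption: sequential numbering
    title: str    # assumption: short phrase naming the decision
    status: Status
    context: str
    decision: str
    consequences: str

    def to_markdown(self) -> str:
        """Render the record as a small Markdown document."""
        sections = [
            ("Status", self.status.value),
            ("Context", self.context),
            ("Decision", self.decision),
            ("Consequences", self.consequences),
        ]
        lines = [f"# {self.number}. {self.title}", ""]
        for heading, body in sections:
            lines += [f"## {heading}", "", body, ""]
        return "\n".join(lines)


# Hypothetical example content, for illustration only.
example = ADR(
    number=1,
    title="Use optimistic locking for checkout rows",
    status=Status.ACCEPTED,
    context="The checkout table sees occasional write contention.",
    decision="Use optimistic locking with a bounded number of retries.",
    consequences="Callers must handle a retry-exhausted error.",
)
print(example.to_markdown())
```

Keeping the record to these few fields is the point: anything longer tends not to get written at review time.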

ADRs became widely adopted after Thoughtworks placed them in the "Adopt" tier of their Technology Radar in 2016. GitHub's engineering blog documented their adoption of ADRs in 2020, specifically citing the need to preserve reviewer rationale across team growth from dozens to hundreds of engineers.

The critical insight is that ADRs are written at review time, not retroactively. When a significant design decision surfaces during a code review, the reviewer who raises it owns writing the ADR. The PR does not merge until the ADR is committed alongside the code it describes.

Adoption Signal

The MADR (Markdown Architectural Decision Records) format, maintained at adr.github.io, has been starred over 4,000 times and is used as the standard ADR template by teams at Netflix, Zalando, and the UK Government Digital Service. All three organizations cite reviewer alignment, not documentation completeness, as the primary driver of adoption.

Key Terms

Knowledge Decay: The progressive loss of institutional context as team members leave, tools change, and undocumented decisions fade from memory.
ADR: Architecture Decision Record, a short document capturing a significant design decision, its context, and its consequences, stored alongside the codebase.
Review Precedent: A documented record of how a specific type of ambiguous review situation was resolved, enabling consistent future decisions.
Decision Log: A chronological record of review decisions, cross-referenced to pull requests, used to trace the evolution of a codebase's standards over time.

Lesson 1 Quiz

Why Review Knowledge Decays, and How to Stop It
1. The Knight Capital Group's 2012 loss was primarily caused by which knowledge failure?
✓ Correct. The rationale for disabling the SMARS code path existed only in the memories of departed engineers. No knowledge base preserved it, and it was reactivated without understanding its consequences.
Not quite. The core failure was that the institutional memory of why the code was disabled had never been recorded; it left with the engineers who made the original decision.
2. According to Rigby & Bird (2013), what was the typical retrieval rate for review comment threads after a PR closed?
✓ Correct. The median comment thread was consulted fewer than three times post-close, which is why teams default to re-litigating decisions instead of retrieving existing ones.
The research found the opposite of high reuse. Median threads were consulted fewer than three times, not because the knowledge lacked value, but because retrieval was harder than re-discussion.
3. Which category of review knowledge has the longest shelf life and is best captured as an ADR?
✓ Correct. Architectural decisions are long-lived (years), making them the best fit for ADRs. Domain invariants change with the product, and review precedents have short-to-medium shelf lives.
Architectural decisions have the longest shelf life (years), and the ADR format was specifically designed to preserve the rationale for why design choices were made or rejected at that moment in time.
4. Who owns writing an ADR when a significant design decision surfaces during code review, according to the practice described in the lesson?
✓ Correct. The reviewer who surfaces the significant decision owns writing the ADR, and it must be committed alongside the code before the PR merges, not after.
Ownership belongs to the reviewer who raises the issue, not a manager or documentation specialist. The key constraint is that the ADR is written before merge, not after; retroactive documentation is far less accurate.

Lab 1: Diagnosing Knowledge Decay Risk

Practice identifying what review knowledge your team is currently losing and what format would capture it.

Scenario

Your team has been code-reviewing a codebase for 18 months. You have 400 closed PRs, a Slack workspace with 3 years of history, and four engineers who joined in the last six months and frequently ask questions like "why does this work this way?" You have no formal ADR process.

Work with the AI to audit your team's current knowledge retention situation and design a lightweight capture system.

Start by describing one specific type of decision your team makes repeatedly in code review. The AI will help you categorize its decay risk and suggest the right capture format for it.
Knowledge Decay Audit
Lab 1
Welcome to the Knowledge Decay Audit lab. I'm here to help you diagnose what your team is currently losing during code review and design a practical capture system. To start: describe one specific type of decision that comes up repeatedly in your team's reviews. For example, "we often debate whether to use X pattern or Y pattern," or "we keep revisiting security exception approvals." What's one recurring decision type your team faces?
Module 7 · Lesson 2

Structuring a Living Decision Log

A knowledge base that nobody maintains is worse than no knowledge base: it creates false confidence.
How do you build a system that teams actually use instead of letting it become another abandoned wiki?

In 2018, Spotify's engineering blog documented what they called the "tribe knowledge problem." As the company scaled from 300 to 4,000 engineers, teams developed dozens of internal wikis, Confluence spaces, and Notion databases โ€” none of which were maintained. Engineers reported that searching internal documentation was less reliable than asking on Slack, because written content was frequently outdated or contradicted by newer decisions.

Spotify's solution was structural, not technological. They introduced what they called "golden path" documentation: a small set of authoritative, actively maintained decision records that were reviewed quarterly. The key change was accountability: every active decision record had a named owner responsible for keeping it current. Stale records were automatically archived, not just marked as outdated.

Why Wikis Die

The default failure mode for engineering knowledge bases is not malice; it is the accumulation of small omissions. A page is created for a decision. Six months later the decision changes, but the PR author doesn't know the page exists. The page stays accurate-looking but wrong. Another engineer reads it, makes decisions based on it, and the error propagates.

Research by O'Reilly Media (2021, "What Engineers Know About Documentation") found that 68% of engineers reported finding documentation that actively misled them in the prior six months. Only 14% of engineers had a defined process for marking documentation as outdated when they discovered an error.

The problem is not a lack of writing; most engineering teams produce enormous quantities of text. The problem is ownership. Documents without owners decay. The moment a document becomes everyone's responsibility, it becomes no one's responsibility.

The Living Decision Log Structure

A decision log differs from a wiki in one critical way: every entry is tied to a specific PR, time, and decision-maker rather than written as general guidance. The structure preserves the original context and makes staleness visible rather than hiding it.

  1. Entry Trigger: Any review comment that uses the words "we should always," "we never," "this is our pattern," or requests an architectural exception automatically triggers a log entry. The reviewer owns creating it before the PR merges.
  2. Mandatory Fields: Date, PR link, decision owner, decision text (one sentence), rationale (two to five sentences), and a review date, typically six months forward. No exceptions to the review date field.
  3. Status Lifecycle: Entries move through Active → Under Review → Superseded → Archived. "Superseded" entries are never deleted; they stay as historical record with a link to the entry that replaced them.
  4. Automated Review Nudges: The review date field is parsed by a CI/CD step or calendar automation that surfaces expiring entries in the team's weekly engineering sync. No manual tracking required.
  5. Single Source of Truth: The log lives in the repository alongside the code, not in a separate tool. Proximity to code is the single strongest predictor of long-term maintenance.
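The first, second, and fourth points above lend themselves to small automated checks. The following is a minimal sketch, not a prescribed implementation: the field labels, the ISO date format, the seven-day look-ahead window, and the example entry are all assumptions, and phrase matching cannot detect a request for an architectural exception, which still needs a human to spot.

```python
from datetime import date, timedelta

# Phrases quoted in the entry-trigger rule above.
TRIGGER_PHRASES = ("we should always", "we never", "this is our pattern")

# Assumed field labels; the lesson names the fields but not their spelling.
MANDATORY_FIELDS = ("Date", "PR", "Owner", "Decision", "Rationale", "Review date")


def triggers_log_entry(comment: str) -> bool:
    """True if a review comment contains one of the trigger phrases."""
    text = comment.lower()
    return any(phrase in text for phrase in TRIGGER_PHRASES)


def missing_fields(entry: dict[str, str]) -> list[str]:
    """Return the mandatory fields that are absent or blank."""
    return [name for name in MANDATORY_FIELDS if not entry.get(name, "").strip()]


def due_for_review(entries: list[dict[str, str]], today: date,
                   window_days: int = 7) -> list[dict[str, str]]:
    """Active entries whose review date is on or before today + window."""
    cutoff = today + timedelta(days=window_days)
    return [
        e for e in entries
        if e.get("Status") == "Active"
        and date.fromisoformat(e["Review date"]) <= cutoff
    ]


# Hypothetical entry, for illustration only.
entry = {
    "Date": "2024-01-15",
    "PR": "https://example.com/pr/placeholder",
    "Owner": "tech-lead",
    "Decision": "Use optimistic locking with a retry limit of three.",
    "Rationale": "Contention is forecast to be low most of the time.",
    "Review date": "2024-07-15",
    "Status": "Active",
}
print(missing_fields(entry))
print(due_for_review([entry], today=date(2024, 7, 10)))
```

A CI step could fail the build when `missing_fields` returns anything, and a weekly job could post the output of `due_for_review` to the engineering sync agenda.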
Real Implementation

The UK Government Digital Service (GDS) published their decision log template on GitHub in 2016 as part of their "How to document software architecture" guide. The template is a directory of numbered Markdown files (0001.md, 0002.md) stored at /docs/decisions/ in every service repository. GDS teams report that the file-in-repo approach, as opposed to Confluence or Notion, is what made the practice durable across seven years of team turnover.

Cross-Referencing PRs to Decisions

A decision log entry without a PR link is almost useless. The PR provides the full context: the code that triggered the decision, the alternative approaches considered in comments, the timeline. The decision log entry is the abstract; the PR is the full paper.

Effective cross-referencing works both directions. The decision log entry links to the PR. The PR description includes a standard line: "Decision log: see docs/decisions/0047.md." Many teams enforce this with a PR template field. GitHub's CODEOWNERS feature can require review of docs/decisions/ by a designated documentation steward whenever that directory is modified.
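One way to enforce the two-way link is a small check that a CI job runs against the PR description and the decision file. This is a sketch under assumptions: the regular expression follows the standard line quoted above and the four-digit file naming, while the function names and the example strings are hypothetical.

```python
import re

# Matches the standard PR-description line, for example:
# "Decision log: see docs/decisions/0047.md."
DECISION_LINE = re.compile(
    r"^Decision log: see (docs/decisions/\d{4}\.md)\.?\s*$",
    re.MULTILINE,
)


def linked_decisions(pr_description: str) -> list[str]:
    """Return every decision-log path referenced in a PR description."""
    return DECISION_LINE.findall(pr_description)


def back_link_present(entry_text: str, pr_url: str) -> bool:
    """True if the decision entry mentions the PR it came from."""
    return pr_url in entry_text


# Hypothetical PR description, for illustration only.
description = (
    "Switch checkout to optimistic locking.\n"
    "\n"
    "Decision log: see docs/decisions/0047.md.\n"
)
print(linked_decisions(description))
```

Running both checks keeps the link from rotting in either direction: the PR points at the entry, and the entry points back at the PR.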

Netflix's engineering culture documentation (published 2023 on the Netflix Tech Blog) describes their use of "decision contexts": structured PR descriptions that require authors to link any decision that touches existing ADRs. The engineering platform team reports that this single requirement reduced repeated architectural debates by approximately 40% in the services that adopted it.

Ownership Model

Etsy's engineering documentation (2019 Engineering Effectiveness Survey, published internally and partially at Velocity Conference 2019) found that the most durable documentation practice was assigning a named "documentation steward" per service, not per document. One person was responsible for all decision records within a service boundary and reviewed them monthly. Etsy found this reduced stale documentation incidents by 61% compared to shared-ownership models.

Key Terms

Decision Log: A repository-resident chronological record of significant review decisions, each tied to a specific PR, owner, and mandatory review date.
Documentation Steward: A named engineer responsible for maintaining decision records within a service boundary by reviewing, updating, and archiving entries on a defined schedule.
Status Lifecycle: The progression of a decision record through Active, Under Review, Superseded, and Archived states, preserving historical context while making current status unambiguous.
Golden Path Documentation: Spotify's term for a small set of authoritative, actively maintained records that represent current team standards, as opposed to comprehensive but unmaintained documentation.

Lesson 2 Quiz

Structuring a Living Decision Log
5. What was Spotify's primary solution to the "tribe knowledge problem" at scale?
✓ Correct. Spotify's "golden path" approach focused on accountability (named owners, quarterly reviews, and automatic archiving) rather than switching tools or restricting authorship.
Spotify's solution was structural: named document owners, quarterly review cycles, and automatic archiving. The tool didn't matter; the ownership and staleness management did.
6. According to O'Reilly's 2021 research, what percentage of engineers had a defined process for marking documentation as outdated when they found an error?
✓ Correct. Only 14% had defined processes for marking documentation as outdated, which explains why 68% of engineers had been misled by documentation in the prior six months.
The number was strikingly low: only 14%. The majority of engineers encounter misleading documentation but have no formal mechanism to flag or correct it, so the errors persist.
7. What is the single strongest predictor of long-term maintenance for a decision log, according to the GDS experience?
✓ Correct. GDS found that file-in-repo storage, as opposed to external tools, was the key factor in their decision log surviving seven years of team turnover. Proximity to code drives maintenance.
GDS specifically attributed the durability of their practice to storing decisions as files in the repository. External tools added friction that caused the practice to erode during team transitions.
8. Which field in a living decision log is described as mandatory and non-negotiable, even for simple entries?
✓ Correct. The review date field is described as having "no exceptions." It enables automated nudges for re-evaluation and makes the staleness risk visible rather than hidden.
The review date is the mandatory non-negotiable field. It's what enables automated staleness detection and prevents accurate-looking but outdated entries from persisting undetected.

Lab 2: Drafting a Decision Log Entry

Practice writing a properly structured decision log entry for a real or realistic review situation.

Scenario

During a code review last week, your team had a significant debate about whether to use optimistic locking or pessimistic locking for a high-contention database table in your e-commerce checkout service. After 45 minutes of discussion, the tech lead made a call: optimistic locking, with a retry limit of three attempts, because the product team forecasted low contention 95% of the time.

No one wrote it down. You've been asked to create the decision log entry retroactively.

Draft a complete decision log entry for this scenario. The AI will critique your structure, check your mandatory fields, and help you improve the rationale until it meets the standard described in the lesson.
Decision Log Drafting
Lab 2
Welcome to the Decision Log Drafting lab. Your task is to write a properly structured decision log entry for the locking strategy decision described in the scenario above. Try writing a first draft in the chat, and include the mandatory fields: date, PR link (use a placeholder), decision owner, a one-sentence decision statement, a rationale of two to five sentences, and a review date six months out. Once you share your draft, I'll critique its structure and help you strengthen it.
Module 7 · Lesson 3

Tagging, Taxonomy, and Retrieval

A knowledge base you can't search is a slightly better version of no knowledge base at all.
What tagging and taxonomy systems make review decisions findable in under thirty seconds, even by engineers who weren't there?

In 2020, the Backstage project, originally built by Spotify and open-sourced in March of that year, addressed retrieval failure directly in its design rationale. The team documented that Spotify's internal tools had accumulated over 2,000 service-level decisions that were effectively unsearchable because every team used different tag vocabularies. Searching for "auth" returned different results than "authentication" or "oauth", and there was no canonical taxonomy.

Backstage's "TechDocs" system imposed a controlled vocabulary of top-level categories that all service documentation had to use. Within 18 months of adoption, Spotify's engineering surveys showed that engineers could find relevant decisions in under two minutes, down from an average of 23 minutes using the old Confluence structure. The taxonomy change, not the tooling change, drove the improvement.

Why Free-Form Tags Fail

Most teams that implement a decision log start by adding "tags" as a free-form text field. Within three months, the tags degrade. One engineer writes "security." Another writes "auth." A third writes "authentication-decisions." A fourth writes "sec-review." They all mean the same thing, but a search for any one tag misses the others.

This is not a hypothetical. Atlassian's internal study of Confluence usage patterns (2019, partially published in their "State of Teams" report) found that the average Confluence space developed a 4:1 tag synonym ratio within six months of creation, meaning four different tags existed for every distinct concept. The result was that engineers abandoned tag-based search in favor of full-text search, which in turn returned too many results to parse.

The fix is a controlled vocabulary: a predefined, finite list of approved tags, with clear definitions, that every decision log entry must use. New tags require a proposal and team approval; they are not added unilaterally.

A Practical Taxonomy for Code Review Decisions

Effective taxonomies for code review knowledge bases typically use two axes: a domain axis (what part of the system the decision affects) and a decision-type axis (what kind of decision it is). Combining both axes in every entry enables precise retrieval. A third tag records how widely the decision binds.

Domain Tags (examples)
System boundary labels
security · data-model · api-contract · performance · dependency · testing · infrastructure · observability · auth · payments
Decision-Type Tags
What kind of call was made
pattern-choice · exception-granted · standard-established · deprecated · rejected-alternative · escalated · deferred
Scope Tags
How binding is this decision
team-local · service-wide · org-wide · external-constraint

A well-tagged entry might read: tags: security · exception-granted · service-wide. A new engineer wondering "has anyone ever granted a security exception for this service?" can retrieve every relevant entry without knowing what keywords were used when the original decision was written.
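A controlled vocabulary is easy to enforce mechanically. The sketch below uses the example tags listed above; requiring at least one tag from each of the three groups is an assumption about how a team might apply them, and the function names and sample log are hypothetical.

```python
DOMAIN_TAGS = frozenset({
    "security", "data-model", "api-contract", "performance", "dependency",
    "testing", "infrastructure", "observability", "auth", "payments",
})
DECISION_TYPE_TAGS = frozenset({
    "pattern-choice", "exception-granted", "standard-established",
    "deprecated", "rejected-alternative", "escalated", "deferred",
})
SCOPE_TAGS = frozenset({
    "team-local", "service-wide", "org-wide", "external-constraint",
})

AXES = {
    "domain": DOMAIN_TAGS,
    "decision-type": DECISION_TYPE_TAGS,
    "scope": SCOPE_TAGS,
}


def validate_tags(tags: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the tags are valid."""
    approved = DOMAIN_TAGS | DECISION_TYPE_TAGS | SCOPE_TAGS
    problems = [f"unapproved tag: {t}" for t in tags if t not in approved]
    for axis, vocabulary in AXES.items():
        if not any(t in vocabulary for t in tags):
            problems.append(f"missing {axis} tag")
    return problems


def find_entries(entries: dict[str, list[str]], *wanted: str) -> list[str]:
    """Names of entries that carry every requested tag."""
    return sorted(
        name for name, tags in entries.items() if set(wanted) <= set(tags)
    )


# Hypothetical log contents, for illustration only.
log = {
    "0047.md": ["security", "exception-granted", "service-wide"],
    "0048.md": ["performance", "pattern-choice", "team-local"],
}
print(validate_tags(["sec-review", "exception-granted", "service-wide"]))
print(find_entries(log, "security", "exception-granted"))
```

Because a free-form tag such as `sec-review` is rejected at write time, the search in `find_entries` never has to guess at synonyms.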

Building a Searchable Index

For teams storing decision logs as Markdown files in a repository, a simple index file (DECISIONS_INDEX.md) at the root of the decisions directory enables fast retrieval. The index is a table with columns for entry number, date, primary domain tag, decision-type tag, one-sentence summary, and PR link. It is updated automatically as part of the CI check that validates new decision log entries.
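Generating the index can be a few lines of scripting. This sketch assumes each entry has already been parsed into a dictionary with the six columns named above; the dictionary keys, the column headings, and the sample rows are hypothetical, and a summary containing a `|` character would need escaping before use.

```python
def build_index(entries: list[dict[str, str]]) -> str:
    """Render DECISIONS_INDEX.md content as a Markdown table."""
    columns = ["number", "date", "domain", "type", "summary", "pr"]
    header = "| No. | Date | Domain | Type | Summary | PR |"
    divider = "| --- | --- | --- | --- | --- | --- |"
    rows = [
        "| " + " | ".join(entry[col] for col in columns) + " |"
        for entry in sorted(entries, key=lambda e: e["number"])
    ]
    return "\n".join([header, divider, *rows]) + "\n"


# Hypothetical parsed entries, for illustration only.
entries = [
    {"number": "0002", "date": "2024-02-01", "domain": "security",
     "type": "exception-granted", "summary": "Second example entry.",
     "pr": "PR link placeholder"},
    {"number": "0001", "date": "2024-01-15", "domain": "data-model",
     "type": "pattern-choice", "summary": "First example entry.",
     "pr": "PR link placeholder"},
]
index = build_index(entries)
print(index)
```

Regenerating the file on every run, rather than editing it by hand, keeps the index from drifting out of step with the entries.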

Teams using GitHub can leverage GitHub's built-in code search across repository files, which searches Markdown content. The critical practice is consistent field naming: using "Status:" as a field label in every entry means a search for "Status: Active security" returns all active security decisions. Inconsistent field naming makes even full-text search unreliable.

Amazon's internal engineering wiki practices (described in "Working Backwards," Bryar & Carr, 2021) rely on a similar principle: structured templates with fixed field labels ensure that even free-text content is findable. Amazon's "Correction of Error" (CoE) documents, which serve a similar function to decision logs in operational contexts, use identical field labels across thousands of documents specifically to enable cross-team search.

Search Latency Benchmark

The DORA 2023 report identifies documentation findability (defined as locating relevant guidance in under 30 seconds) as a key differentiator for elite engineering teams. Teams that cannot meet this threshold for review decisions are classified as "documentation bottlenecked," a condition correlated with 2.4× higher review cycle time and 1.8× higher defect escape rate.

Connecting Tags to Onboarding

A well-tagged knowledge base becomes the most valuable onboarding artifact a team can produce. Instead of a static "new engineer guide," new team members can search for service-wide · pattern-choice and retrieve every foundational decision the team has made about how this service works. The reasoning is included. The alternatives considered are noted. The review date tells them how current each decision is.

Stripe's engineering blog (2021, "How we think about onboarding engineers") describes a "decision trail" onboarding approach where new engineers are given a curated list of 10 to 15 decision log entries to read in their first week, not documentation pages. Engineers who onboarded using the decision trail approach reported being productive 31% faster than those who used traditional wiki-based onboarding in Stripe's internal survey.

Key Terms

Controlled Vocabulary: A predefined, finite list of approved tags with clear definitions, used consistently across all decision log entries to prevent synonym fragmentation.
Tag Synonym Ratio: The ratio of distinct tag labels to distinct concepts in a knowledge base. A ratio above 2:1 typically indicates retrieval failure risk.
Domain Tag: A tag identifying which system boundary or technical area a decision affects (e.g., security, data-model, api-contract).
Decision-Type Tag: A tag identifying what kind of review decision was made (e.g., pattern-choice, exception-granted, standard-established).

Lesson 3 Quiz

Tagging, Taxonomy, and Retrieval
9. What was the root cause of Spotify's 2,000-entry documentation retrieval failure before Backstage's TechDocs system was introduced?
✓ Correct. Searching for "auth" missed "authentication" and "oauth," all meaning the same thing. The taxonomy fragmentation, not the tool, made 2,000 decisions effectively unsearchable.
The failure was taxonomic, not technical. Teams used different vocabularies ("auth," "authentication," "oauth") for the same concept, making tag-based search unreliable across team boundaries.
10. What tag synonym ratio did Atlassian's Confluence study find developing within six months in most spaces?
✓ Correct. Atlassian found a 4:1 ratio (four tag variants per concept) within six months in the average Confluence space. This drove engineers to abandon tag search entirely in favor of less precise full-text search.
The ratio was 4:1: four different tags for every distinct concept. This is why free-form tagging degrades quickly without a controlled vocabulary to constrain it from the start.
11. In the two-axis taxonomy described in the lesson, which combination of tags would correctly identify "we approved using HTTP basic auth as a temporary measure for this internal service"?
✓ Correct. Security/auth is the domain. An exception was granted (not a standard established). The scope is service-wide since it applies to one internal service. This combination enables precise retrieval.
The correct combination is security/auth (domain) · exception-granted (type) · service-wide (scope). "Standard-established" would mean this is the new normal, but approving basic auth temporarily is an exception, not a standard.
12. What did Stripe's internal survey find about engineers onboarded using a "decision trail" approach versus traditional wiki-based onboarding?
✓ Correct. Stripe found a 31% faster time-to-productivity for decision trail onboarding, because reading why decisions were made (with alternatives considered) accelerates understanding of a codebase far more than static documentation pages.
Stripe's survey showed decision trail engineers became productive 31% faster. Reading actual decisions, with rationale and alternatives, builds accurate mental models of a codebase more efficiently than general documentation.

Lab 3: Designing a Tag Taxonomy

Build a controlled vocabulary for your team's code review knowledge base that prevents synonym fragmentation.

Scenario

You're the designated documentation steward for a six-service fintech platform handling payments, user accounts, fraud detection, notifications, reporting, and an internal admin API. Your team has 12 engineers and reviews roughly 60 PRs per week.

You need to design a tag taxonomy (domain tags, decision-type tags, and scope tags) that will work for the next two years without needing constant expansion.

Start by proposing your domain tag list (the system boundaries your team owns). The AI will stress-test your taxonomy against edge cases from your six service areas and help you refine it into a controlled vocabulary your team can actually enforce.
Taxonomy Design Workshop
Lab 3
Welcome to the Taxonomy Design lab. You're building a controlled vocabulary for a fintech platform with six service areas: payments, user accounts, fraud detection, notifications, reporting, and an internal admin API. Let's start with your domain tag list. These should map to the system boundaries your team owns, not to specific technologies. Try to keep it to 8 to 12 tags maximum. What are your proposed domain tags? I'll test them against edge cases to see where the taxonomy breaks down.
Module 7 · Lesson 4

Scaling Knowledge Bases Across Teams and Time

What works for one team of eight engineers breaks at four teams of eight. The architecture of a knowledge base must scale with the organization.
How do you prevent a knowledge base from becoming a bureaucratic burden that slows down the reviews it's meant to support?

In 2022, Zalando, the European e-commerce platform, published a retrospective on their engineering decision documentation practice in their engineering blog. Between 2016 and 2019, they had built a centralized ADR repository containing over 800 architecture decisions across 300+ engineering teams. By 2021, the system was effectively unused. The overhead of routing decisions through a central review process had created a bottleneck: engineers waited an average of 11 days for an ADR to be approved before merging code. Teams stopped creating ADRs.

Zalando's 2022 solution was federated ownership: each domain (roughly 10 to 20 teams) maintained its own ADR repository with its own review process. Cross-domain decisions required only the affected domain stewards to approve, not a central board. Average approval time dropped to 1.4 days. ADR creation rate increased 340% within six months of the change.

The Centralization Trap

The instinct at scale is to centralize: one knowledge base, one taxonomy, one review process. This instinct is wrong. Centralization creates approval queues that interrupt the flow of review work. It creates taxonomy committees that move slower than engineering teams. It creates a single point of staleness: when the central knowledge base falls behind, all teams are equally lost.

The research supports federation. A 2021 study by Forsgren, Humble, and Kim (published in the "Accelerate" follow-up research) found that teams with locally owned documentation outperformed centrally documented teams on deployment frequency (1.9× higher), change failure rate (41% lower), and mean time to recovery (2.1× faster). The pattern mirrors microservices architecture: local ownership, agreed-upon interfaces, not shared mutable state.

The Interface: Cross-Team Decision Referencing

Federated knowledge bases require agreed interfaces for cross-team referencing. When Team A's service depends on Team B's decision (for example, Team B decides to deprecate an API contract), Team A's decision log must be able to reference Team B's entry. The entry format must be stable enough to bookmark across repository boundaries.

The canonical solution is a stable URL scheme for decision entries: a permalink format that includes the organization, the domain, and the entry number. GitHub Pages, used by the UK GDS and several public-sector engineering teams, provides this by default when decisions are stored as numbered Markdown files in a repository with Pages enabled. A decision at org/domain/decisions/0042 is permanently referenceable from any team's record.
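A permalink scheme like this is simple to build and to parse back. The sketch below follows the org/domain/decisions/number shape described above; the host name, the four-digit zero padding, the allowed characters in path segments, and the function names are assumptions for illustration.

```python
import re

# Hypothetical host; the lesson only specifies the path shape.
BASE_URL = "https://decisions.example.com"

PERMALINK_PATH = re.compile(
    r"^(?P<org>[a-z0-9-]+)/(?P<domain>[a-z0-9-]+)/decisions/(?P<number>\d{4})$"
)


def permalink(org: str, domain: str, number: int) -> str:
    """Build the stable URL for a decision entry."""
    return f"{BASE_URL}/{org}/{domain}/decisions/{number:04d}"


def parse_permalink(url: str) -> tuple[str, str, int]:
    """Split a permalink back into (org, domain, entry number)."""
    prefix = BASE_URL + "/"
    if not url.startswith(prefix):
        raise ValueError(f"not a decision permalink: {url}")
    match = PERMALINK_PATH.match(url[len(prefix):])
    if match is None:
        raise ValueError(f"not a decision permalink: {url}")
    return match["org"], match["domain"], int(match["number"])


url = permalink("org", "domain", 42)
print(url)
print(parse_permalink(url))
```

Because the entry number is the only part that identifies a record within a domain, entries can be renamed or retitled without breaking references from other teams.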

Sunset Policies and Archive Governance

A knowledge base that only grows is a knowledge base that becomes impossible to navigate. Effective scaling requires explicit sunset policies: rules for when decision records move from Active to Archived, and what happens to dependent decisions when a foundational one is superseded.

1. Automatic Review at 12 Months: Any entry that has not been reviewed within 12 months of its creation date is automatically moved to "Under Review" status. The decision owner receives a notification and has 30 days to confirm, update, or archive the entry.
2. Cascade Flagging: When an entry is marked Superseded, an automated check queries all entries that reference it and flags them for review. Dependent decisions don't automatically become invalid, but they must be explicitly confirmed as still correct given the change.
3. Archive, Don't Delete: All archived entries remain permanently accessible via their original URL, marked with an archive banner. The historical record has value, particularly for post-incident analysis and regulatory audit trails.
4. Annual Portfolio Review: Once yearly, the domain steward reviews all Active entries as a portfolio rather than as individual items, looking for contradictions, redundancies, and gaps. The goal is coherence, not completeness.
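
The first two rules are mechanical enough to automate. The sketch below is one possible shape for that automation, under stated assumptions: the `Entry` fields, the function names, and the choice to approximate "12 months" as 365 days are all hypothetical, and the review clock is assumed to restart from the most recent review when one exists.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional, Set, Tuple

REVIEW_INTERVAL = timedelta(days=365)  # "12 months", approximated as 365 days
RESPONSE_WINDOW = timedelta(days=30)   # time the owner has to confirm, update, or archive


@dataclass
class Entry:
    """One decision record (hypothetical field names)."""
    entry_id: str
    status: str  # "Active", "Under Review", "Superseded", or "Archived"
    created: date
    last_reviewed: Optional[date] = None
    references: Set[str] = field(default_factory=set)
    flagged_for_review: bool = False


def apply_annual_review(entries: List[Entry], today: date) -> List[Tuple[str, date]]:
    """Move overdue Active entries to "Under Review".

    Returns (entry_id, respond_by) pairs that a notifier could send to owners.
    """
    notices = []
    for entry in entries:
        baseline = entry.last_reviewed or entry.created
        if entry.status == "Active" and today - baseline >= REVIEW_INTERVAL:
            entry.status = "Under Review"
            notices.append((entry.entry_id, today + RESPONSE_WINDOW))
    return notices


def supersede(entries: List[Entry], superseded_id: str) -> List[str]:
    """Mark one entry Superseded and flag every entry that references it.

    Dependents are flagged for human confirmation; their status is not changed.
    """
    by_id = {entry.entry_id: entry for entry in entries}
    by_id[superseded_id].status = "Superseded"
    flagged = []
    for entry in entries:
        if superseded_id in entry.references:
            entry.flagged_for_review = True
            flagged.append(entry.entry_id)
    return sorted(flagged)
```

Note that `supersede` only sets a flag. Leaving the dependent entry's status untouched reflects the rule that a dependent decision stays valid until a person says otherwise.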
Regulatory Context

Financial services organizations operating under SOX, PCI-DSS, or FCA regulations have a non-optional reason to maintain decision logs: audit requirements. The Financial Conduct Authority's SYSC 13 rules require documented evidence of material technology risk decisions for regulated firms. Several UK fintech firms (Monzo and Starling Bank, per their engineering blog posts) use their ADR repositories as part of their regulatory evidence trail, explicitly citing the archive policy as a compliance feature.

Integrating the Knowledge Base into Review Workflow

A knowledge base that requires engineers to leave their review workflow to consult is a knowledge base that will not be consulted. The final scaling challenge is integration: surfacing relevant decisions at the moment a review comment is written, not as a separate lookup step.

GitHub's code review interface supports this via PR templates with embedded decision log links. A template field reading "Related decisions (check docs/decisions/ if touching auth, payments, or data-model)" surfaces the knowledge base at exactly the moment an engineer is forming a review opinion. The ask is not "go find documentation"; it is "confirm you've checked the relevant decisions."
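
A template field can also be checked automatically, for example in a CI step. The following is a rough sketch of such a check, not a GitHub feature: the path prefixes, the `Related decisions:` field format, and the accepted "none apply" answer are assumptions made for illustration.

```python
import re
from typing import Iterable

# Hypothetical path prefixes standing in for "auth, payments, or data-model".
SENSITIVE_PREFIXES = ("auth/", "payments/", "data-model/")

# Matches a reference such as docs/decisions/0042 (the numbering format is assumed).
DECISION_LINK = re.compile(r"docs/decisions/\d{4}")

# The filled-in template field, e.g. "Related decisions: docs/decisions/0042".
FIELD = re.compile(
    r"^Related decisions:[ \t]*(?P<value>.*)$", re.IGNORECASE | re.MULTILINE
)


def needs_decision_check(changed_files: Iterable[str]) -> bool:
    """True when the PR touches any path covered by the template prompt."""
    return any(path.startswith(SENSITIVE_PREFIXES) for path in changed_files)


def decision_field_ok(changed_files: Iterable[str], pr_body: str) -> bool:
    """Pass if no sensitive path changed, or if the field was filled in.

    A filled-in field either links a decision entry or states "none apply",
    which records that the author looked and found nothing relevant.
    """
    if not needs_decision_check(changed_files):
        return True
    match = FIELD.search(pr_body)
    if match is None:
        return False
    value = match["value"].strip()
    return bool(DECISION_LINK.search(value)) or value.lower() == "none apply"
```

Accepting an explicit "none apply" keeps the check aligned with the stated goal: the author confirms having looked, rather than being forced to cite a decision that may not exist.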

Linear, the project management tool used by many engineering teams, introduced "document links" in their issue view in 2022 specifically to solve this integration problem. Linear's engineering team documented (in their changelog) that teams which linked decision records to Linear issues saw 28% fewer repeated discussions on the same architectural topics in the following quarter.

The final integration point is onboarding. Stripe's "decision trail" approach (discussed in Lesson 3) scales naturally: new engineers receive a curated reading list of 10–15 domain-specific decisions, filtered by the services they will initially work on. The knowledge base becomes the primary onboarding document, not an appendix to it.

Key Terms

Federated Ownership: A knowledge base governance model where each domain or team cluster owns and maintains its own decision records, with agreed interfaces for cross-domain referencing, as opposed to a single centralized repository.
Cascade Flagging: An automated process that identifies decision records dependent on a superseded entry and flags them for explicit review, preventing silent invalidation of dependent decisions.
Sunset Policy: Explicit rules governing when and how decision records move from Active to Archived, preventing knowledge base bloat while preserving the historical record.
Permalink Scheme: A stable, predictable URL format for decision entries that enables cross-team and cross-repository referencing without link rot.

Lesson 4 Quiz

Scaling Knowledge Bases Across Teams and Time
13. What caused Zalando's centralized ADR system, which grew to 800+ entries across 300+ teams, to become effectively unused by 2021?
✓ Correct. The 11-day average approval wait made ADR creation a blocker for merging code. Engineers chose the path of least resistance: skip the ADR. The bureaucratic overhead destroyed the practice.
The failure was process-driven: an 11-day average approval time made ADR creation actively harmful to engineering velocity. When documentation blocks shipping, engineers stop documenting.
14. According to the Forsgren, Humble, and Kim research cited in the lesson, which metric showed the largest improvement for teams with locally owned documentation versus centrally documented teams?
✓ Correct. MTTR showed the largest relative improvement at 2.1×. Locally owned documentation means engineers can find relevant context faster during incidents, which reduces recovery time directly.
All three metrics improved, but MTTR showed the largest relative gain at 2.1×. Local documentation means faster incident context retrieval, which directly compresses recovery time.
15. Which of the following correctly describes the "cascade flagging" process in a knowledge base sunset policy?
✓ Correct. Cascade flagging is triggered by a Superseded status change, not by age alone. The goal is to ensure dependent decisions are explicitly confirmed as still valid, preventing silent invalidation.
Cascade flagging is triggered when an entry is marked Superseded. It then identifies all entries that reference the superseded decision and flags them for explicit review, because a foundational change may invalidate dependent decisions.
16. What outcome did Linear's engineering team document after teams linked decision records to Linear issues?
✓ Correct. Linking decision records to issues surfaced relevant knowledge at the moment of need, reducing the rate at which teams re-debated decisions they had already documented and resolved.
Linear documented a 28% reduction in repeated discussions on the same architectural topics. Surfacing decisions at the point of work, rather than requiring a separate lookup, is what drove the change.

Lab 4: Designing a Scaling Strategy

Plan how your knowledge base survives team growth, leadership changes, and organizational restructuring.

Scenario

Your startup has grown from 8 engineers to 45 in 18 months. You have three product domains (Core Platform, Data & Analytics, and Customer Experience) with 3–5 engineering teams each. You have a working decision log in a single repository (180 entries), but teams are starting to complain that cross-team decisions take too long to approve and that the taxonomy is getting noisy.

You're being asked to present a scaling recommendation to the engineering leadership team next week. It must address governance, taxonomy, cross-team referencing, and sunset policy.

Work through your scaling strategy with the AI. Start by identifying the single biggest risk to your current system as you scale. The AI will help you build a federated architecture that addresses it, ending with a recommendation you could actually present to leadership.
Scaling Strategy Workshop
Lab 4
Welcome to the Scaling Strategy lab. You have 45 engineers across three domains with a single growing decision log that's starting to show strain. Before designing a federated architecture, let's diagnose accurately. What do you see as the single biggest risk to your current system as you scale? Think about governance, taxonomy drift, cross-team coordination, or maintenance burden: which failure mode concerns you most, and why? Once we agree on the primary risk, we can build a scaling strategy that addresses it directly.

Module 7 Test

Building a Review Knowledge Base · 15 questions · Pass at 80%
1. The Knight Capital Group's 2012 loss is primarily cited in this module as an example of which failure category?
✓ Correct. The lesson uses Knight Capital to illustrate how institutional memory (why a code path was disabled) can vanish when it exists only in the minds of engineers who have since left.
Knight Capital exemplifies knowledge decay: the rationale for disabling a critical code path lived in departed engineers' memories, not in any retrievable document. No review artifact, no ADR, no decision log.
2. What did the DX 2022 survey find about time spent recreating inaccessible context at smaller companies?
✓ Correct. 4.1 hours weekly per engineer, equivalent to more than three full-time engineers' labor on a 30-person team, was wasted recreating context that existed somewhere in the organization.
The DX survey found 4.1 hours per engineer per week lost to recreating context that existed but could not be efficiently found. At 30 engineers, that exceeds 120 hours weekly, or more than three full-time equivalents.
3. What does the DORA 2023 report identify as a key predictor of elite software delivery performance related to this module's topic?
✓ Correct. DORA 2023 identifies documentation quality as one of five predictors of elite delivery performance, with top quartile teams deploying 208× more frequently than the lowest quartile.
DORA 2023 cites documentation quality, not tooling or process metrics, as one of five elite team predictors. Top quartile documentation quality correlates with 208× higher deployment frequency.
4. What is the correct status sequence for a decision record that was originally accepted and is now replaced by a newer decision?
✓ Correct. Superseded entries are never deleted. They move to Archived with a link to the replacement entry, preserving the historical record for post-incident analysis and audits.
The lifecycle is Active → Superseded → Archived, with the superseded entry linking to its replacement. Deletion is never correct: the historical chain has value for incident analysis and regulatory auditing.
5. According to the lesson, what specific language in a review comment should trigger creating a decision log entry?
✓ Correct. These phrases signal a normative claim (an assertion about how the team should always or never behave), which is precisely the knowledge that decays fastest if not captured.
The trigger phrases are normative claims: "we should always," "we never," "this is our pattern," or exception requests. These are the review moments that establish or modify standards, which is exactly what a knowledge base exists to preserve.
6. What did the UK Government Digital Service find was the key factor in their decision log practice surviving seven years of team turnover?
✓ Correct. GDS attributed the practice's durability specifically to file-in-repo storage. External tools added friction that caused the practice to erode during leadership transitions.
GDS found that file-in-repo storage (numbered Markdown files at /docs/decisions/) was the single factor that kept the practice alive through seven years of turnover. Proximity to code drives maintenance.
7. How does the module describe the relationship between a decision log entry and its associated PR?
✓ Correct. The decision log entry abstracts the key decision, while the PR retains the full deliberation: alternatives considered, timeline, code context. Both are needed; neither replaces the other.
The module uses the abstract/paper analogy: the entry summarizes the decision in retrievable form, while the PR preserves the full deliberation. Losing either half means losing part of the record.
8. What tag synonym ratio does the lesson identify as indicating retrieval failure risk?
✓ Correct. A tag synonym ratio above 2:1 (more than two tags per distinct concept) indicates that retrieval is likely to fail because searches for any single tag miss synonymous entries.
The lesson cites 2:1 as the retrieval failure threshold. Atlassian's Confluence research found teams typically reaching 4:1 within six months without a controlled vocabulary, double the danger threshold.
9. Which of the following is an example of a correctly structured two-axis tag for "we decided to use Redis for session caching instead of Memcached across all services"?
✓ Correct. Infrastructure is the domain (caching technology). Pattern-choice is the type (choosing Redis over Memcached). Org-wide is the scope (applies across all services). All three axes are correct.
The correct combination is infrastructure (caching is infrastructure), pattern-choice (selecting Redis over Memcached), org-wide (all services). "Rejected-alternative" would describe the Memcached option in a separate entry, not the primary decision.
10. What specific finding did Netflix report about cross-domain architectural debates after introducing "decision contexts" in PR descriptions?
✓ Correct. Netflix's engineering platform team reported approximately 40% fewer repeated architectural debates in services that required PR descriptions to link relevant ADRs, because the decision was surfaced before the debate began.
Netflix reported ~40% fewer repeated architectural debates in services using decision-context PR descriptions. The 31% figure is from Stripe's onboarding research; the 340% ADR creation increase is from Zalando's federation change.
11. What is the primary governance lesson from Zalando's 2022 federated knowledge base retrospective?
✓ Correct. Zalando's 11-day central approval queue caused engineers to stop creating ADRs entirely. Their federated solution cut approval time to 1.4 days and saw a 340% increase in ADR creation rate.
Zalando's key lesson is that bureaucratic overhead kills documentation practices. The 11-day approval wait caused complete abandonment. Federation restored the practice by removing the bottleneck, not by changing the format or tool.
12. What does the module recommend happen to a decision record that references a superseded entry?
✓ Correct. Cascade flagging does not automatically invalidate dependent entries; it requires explicit human confirmation. Dependent decisions may still be valid despite the superseded reference; the review ensures accuracy.
Cascade flagging triggers a required review, not automatic invalidation. A dependent entry might remain perfectly valid even after its reference is superseded, but it must be explicitly confirmed, not silently assumed.
13. Which regulatory frameworks are cited in the module as giving financial services firms a compliance reason to maintain decision logs?
✓ Correct. SOX, PCI-DSS, and the FCA's SYSC 13 rules all require documented evidence of material technology risk decisions, making ADR repositories a regulatory compliance artifact for regulated firms.
The module cites SOX, PCI-DSS, and FCA SYSC 13 as the relevant frameworks, all of which require documented evidence of material technology risk decisions. Monzo and Starling Bank are cited as firms using ADR repositories for regulatory compliance.
14. According to the DORA 2023 research cited in Lesson 3, what is the documentation findability threshold that distinguishes elite engineering teams?
✓ Correct. DORA 2023 defines documentation findability as locating relevant guidance in under 30 seconds. Teams failing this threshold show 2.4× higher review cycle time and 1.8× higher defect escape rate.
The DORA threshold is 30 seconds, not minutes. Teams that cannot meet this mark for review decisions show 2.4× longer review cycles and 1.8× higher defect escape rates, classified as "documentation bottlenecked."
15. Michael Nygard introduced the ADR format in 2011. Which organization's Technology Radar first placed ADRs in the "Adopt" tier, driving widespread engineering adoption?
✓ Correct. Thoughtworks placed ADRs in the "Adopt" tier of their Technology Radar in 2016, which drove widespread adoption across the industry. GitHub documented their own ADR adoption in 2020.
Thoughtworks placed ADRs in their Technology Radar "Adopt" tier in 2016, five years after Nygard's original proposal. That endorsement drove mainstream adoption. GitHub's blog post in 2020 was a downstream effect of the Thoughtworks recommendation.