In early 2023, the Reuters Institute for the Study of Journalism published findings from its ongoing Digital News Report project showing that newsrooms experimenting with large language model assistants for background research consistently ran into a shared problem: the models confidently described events up to their training cutoffs but could not retrieve anything more recent and, critically, did not always flag the gap. Reporters working on deadline stories about the Silicon Valley Bank collapse in March 2023 found that AI assistants described SVB as a healthy mid-size lender because their training data predated the bank run. The lesson that spread quickly through professional circles was simple but consequential: AI research tools operate from a frozen snapshot of the world, not a live feed.
Every large language model is trained on a corpus of text assembled up to a specific date. After that date, the model has no awareness of new events unless that information is explicitly injected through retrieval tools, fine-tuning updates, or user-provided context. This boundary is called the training cutoff or knowledge cutoff.
The practical consequences for writers are significant. A model whose cutoff falls in early 2023 cannot describe the outcome of a court case decided in late 2023, cannot cite a scientific study published after that date, and cannot reflect policy changes, leadership transitions, or market shifts that occurred afterward. Asking it to do so risks receiving a confident but outdated — or entirely fabricated — answer.
Importantly, the gap between cutoff and current date grows over time. A model released with a 2023 cutoff that is still in active use in 2025 carries a potential two-year blind spot on any current-events query.
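A training cutoff is not the same as a release date. Models are often deployed months after their data cutoff, so the gap between "what the model knows" and "today" is always larger than the time elapsed since release. Always ask a model directly about its knowledge cutoff before using it for time-sensitive research, and treat the answer as a starting point: a model's report of its own cutoff can be imprecise, so check it against the vendor's published documentation where available.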
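The relationship between cutoff, release date, and today is simple date arithmetic, and a short sketch can make it concrete. The dates below describe a hypothetical model and are assumptions chosen for illustration, as are the function names.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (day of month is ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def knowledge_gap(cutoff: date, release: date, today: date) -> dict:
    """Split the blind spot into the lag before release and the time in use."""
    return {
        "lag_before_release": months_between(cutoff, release),
        "time_in_use": months_between(release, today),
        "total_blind_spot": months_between(cutoff, today),
    }


# Hypothetical model: data cutoff January 2023, released September 2023,
# still being used in March 2025.
gap = knowledge_gap(date(2023, 1, 1), date(2023, 9, 1), date(2025, 3, 1))
print(gap)
```

In this hypothetical case the model has been in use for 18 months, but its blind spot is 26 months, because the 8 months between cutoff and release also count.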
When asked about topics where its training data is thin, contradictory, or absent, an AI model may generate plausible-sounding but factually wrong responses — a behavior researchers call hallucination. For writers, this is most dangerous in three specific research situations: citation retrieval, biographical facts about less-prominent individuals, and technical statistics.
A documented example: in May 2023, a New York federal court case (Mata v. Avianca) came to public attention when a lawyer filed a brief containing citations to court decisions that did not exist — all generated by ChatGPT. The citations had realistic case names, docket numbers, and jurisdictions. When opposing counsel checked, none of the cases could be found. Judge P. Kevin Castel fined the attorneys $5,000 and issued a formal sanction order. The episode made front-page news in legal and journalism circles and became a standard cautionary example for any professional using AI in document-intensive work.
The takeaway for writers: never use an AI-generated citation, statistic, or quotation without independent verification from a primary source. The model's confidence level is not a reliable signal of accuracy.
Understanding limitations does not mean abandoning the tools — it means deploying them correctly. AI assistants offer genuine research leverage in specific tasks where the risks of hallucination are lower and the speed gains are substantial.
Background synthesis: For topics well-documented before the training cutoff, AI can compress hours of reading into a useful orientation summary. A journalist covering a company's antitrust history, or a novelist researching Victorian-era textile manufacturing, can use AI to get oriented quickly before diving into primary sources.
Question generation: AI is excellent at helping writers identify what they do not yet know. Prompting a model to "list ten questions I should be able to answer before writing about X" often surfaces angles the writer had not considered.
Source pathway identification: Rather than asking AI to be the source, ask it to identify what types of sources would contain the answer — government databases, academic journals, professional associations, regulatory filings. This turns the model into a research librarian rather than an encyclopedia.
Think of AI as a first-pass orienter, not a final-pass verifier. Use it to understand the shape of a topic, identify the right questions, and find where authoritative information lives. Then go to those authoritative sources directly for anything that will appear in your published work.
In this lab you will interrogate the AI assistant about its own knowledge limitations and practice the kinds of boundary-testing queries that professional researchers use before trusting AI output. Ask about training cutoffs, probe a recent event to see how the model handles it, and ask the assistant how you should approach verifying its claims.
When the Associated Press formalized its AI usage guidelines in the summer of 2023, the document — portions of which were published and widely discussed — included specific guidance on what the news organization called "precision prompting." AP editors had observed that vague, open-ended queries to AI tools produced sweeping but unverifiable summaries, while narrowly scoped queries with explicit constraints returned outputs that were far easier to check and far more likely to be accurate. The guidance emphasized treating AI outputs as leads, not facts, and required that any AI-assisted background material be traced to at least two independent primary sources before being used in a story. The policy became a reference point for other newsrooms developing their own AI workflows throughout 2023 and 2024.
A well-designed research prompt has four components working together: a scope constraint that limits the domain; a time boundary that acknowledges the cutoff and focuses appropriately; a format directive that shapes output for usability; and an uncertainty request that explicitly asks the model to flag what it does not know.
Compare these two prompts: "Tell me about climate policy." vs. "Summarize the major international climate agreements that existed before 2023, identify any aspects you are uncertain about, and list the government or intergovernmental sources where I could verify each point." The first invites a confident sweep of everything. The second constrains the domain, time-bounds the query, requests source pathways, and activates the model's uncertainty flagging.
The second prompt type is slower to write but dramatically more useful for writers who intend to publish. The AP's internal findings matched what researchers at Stanford's Human-Centered AI Institute reported in 2023: specificity in prompts correlates strongly with output reliability across multiple AI platforms.
1. Scope: "Focusing only on [specific domain or entity]…"
2. Time boundary: "…based on information available before [year], or noting where recency matters…"
3. Format directive: "…give me a structured summary with [bullet points / numbered claims / a table]…"
4. Uncertainty request: "…and explicitly flag any claims where your confidence is low or where I should verify independently."
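For writers who keep reusable prompt templates, the four components can be assembled programmatically. The following is a minimal sketch, not a prescribed tool: the function name `build_research_prompt` and its parameters are hypothetical, and the wording simply joins the four template fragments listed above.

```python
def build_research_prompt(scope: str, before_year: int, output_format: str) -> str:
    """Join scope, time boundary, format directive, and uncertainty request."""
    if not scope.strip():
        raise ValueError("scope must name a specific domain or entity")
    if not output_format.strip():
        raise ValueError("output_format must name a structure, e.g. 'numbered claims'")
    parts = [
        # 1. Scope constraint
        f"Focusing only on {scope.strip()},",
        # 2. Time boundary
        f"based on information available before {before_year}, or noting where recency matters,",
        # 3. Format directive
        f"give me a structured summary with {output_format.strip()},",
        # 4. Uncertainty request
        "and explicitly flag any claims where your confidence is low "
        "or where I should verify independently.",
    ]
    return " ".join(parts)


prompt = build_research_prompt("international climate agreements", 2023, "numbered claims")
print(prompt)
```

Rejecting an empty scope or format is a deliberate choice in this sketch: a prompt missing one of the four components falls back toward the vague, open-ended query the template is meant to replace.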
Effective AI research is iterative. The first query establishes orientation; subsequent queries drill into specific claims, challenge assumptions, and surface what was glossed over. This technique — sometimes called layered questioning — mirrors how experienced journalists approach source interviews: the first question opens the subject, and each follow-up tightens focus based on what was just said.
In practice: after an initial summary, a strong follow-up might be "You said X in your previous response. What is the basis for that claim, and how confident are you in it?" This forces the model to expose its reasoning. When the model produces hedged language ("it is generally believed," "reportedly," "some sources suggest"), that is a signal to pursue primary verification rather than building further queries on that foundation.
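Spotting hedged language can be partly mechanized when reviewing long AI responses. The sketch below scans a response for the three hedge phrases quoted above and returns the sentences that contain them. The phrase list is only a starting point, the sentence splitting is deliberately naive, and the sample text is invented placeholder content rather than real AI output.

```python
import re

# The three hedges quoted in the text; extend this list for your own use.
HEDGE_PHRASES = ["it is generally believed", "reportedly", "some sources suggest"]


def flag_hedged_sentences(text: str) -> list[str]:
    """Return the sentences that contain at least one hedge phrase."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        if any(phrase in lowered for phrase in HEDGE_PHRASES):
            flagged.append(sentence)
    return flagged


# Invented placeholder response, used only to exercise the function.
sample = (
    "The committee met in spring. "
    "It is generally believed that the proposal failed. "
    "Reportedly, two members abstained."
)
flagged = flag_hedged_sentences(sample)
for sentence in flagged:
    print("VERIFY:", sentence)
```

A match is a prompt to go to primary sources, not a verdict. Unhedged sentences can be wrong too, so this check supplements verification rather than replacing it.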
A documented example of layered questioning going wrong: in 2023, The Guardian reported on a case where a researcher asking an AI to elaborate on an initial claim inadvertently anchored subsequent queries to the first hallucinated response. Each follow-up accepted the false premise and built more plausible-sounding detail around it. This "hallucination cascade" is a known failure mode. The main defense is to make each individual query specific and self-contained rather than building uncritically on previous AI outputs.
One of the most reliable research prompting strategies is to ask AI for the category of sources that would contain authoritative information, rather than asking AI to name specific sources or quote specific texts. This avoids most of the risk of fabricated citations, because the model is never asked to produce a title, author, or quotation, while still leveraging its genuine strength: knowledge of which institutions, databases, and publication venues cover which domains.
For example: "What types of government databases or regulatory filings would contain data on pharmaceutical pricing practices in the United States?" is far safer than "Give me three studies that document pharmaceutical pricing practices." The first prompt produces a map; the second invites fabrication.
Washington Post technology reporter Nitasha Tiku documented this approach in a 2023 article about journalists adapting to AI research tools, noting that the reporters who found AI most useful were those who used it to understand the research landscape rather than to extract specific facts.
Ask AI: "Where should I look?" not "What is the answer?" Use AI to build your research map, then navigate that map using primary sources. This separates the orienting function — where AI is strong — from the verification function — where AI is unreliable.
Practice building precision research prompts using all four components: scope constraint, time boundary, format directive, and uncertainty request. Try a topic relevant to your writing. Then experiment with asking for source types rather than source names, and use layered follow-up questions to probe what the AI's initial response glossed over.
In late 2023, The Guardian's editorial technology team published an internal briefing — later referenced in media industry coverage by Press Gazette — describing a tiered verification protocol the newsroom had implemented for AI-assisted research. The framework divided AI outputs into three categories: background context (could inform but not appear in copy), checkable claims (specific facts requiring one primary-source confirmation), and high-risk claims (statistics, quotes, biographical details requiring two independent confirmations from original sources). The protocol reduced AI-related errors reaching subeditors by a documented margin, because it forced researchers to categorize risk before routing claims to verification. The system's key insight was that not all AI outputs carry the same risk level — and treating them identically was producing inefficiencies in both directions.
The Guardian's tiered approach maps cleanly onto a framework any writer can implement. The central move is risk categorization: before verifying anything, you assess how much damage an error in that specific claim would do, and how easily such an error could be caught.
Tier 1 — Background orientation: General context, historical framing, conceptual explanations. These shape your understanding but do not appear verbatim in published work. AI can supply these relatively safely, since errors here are caught before they reach the page.
Tier 2 — Specific checkable claims: Dates, named individuals in roles, organizational descriptions, event sequences. These require one primary-source confirmation — a contemporaneous news report, an official document, a named official statement. AI can surface the claim; you confirm it before using it.
Tier 3 — High-risk claims: Statistics and percentages, direct quotations attributed to named individuals, medical or legal claims, financial figures, biographical facts about living individuals. These require two independent primary-source confirmations from original sources, not secondary summaries. AI should not be the proximate source of any Tier 3 claim in published work.
A primary source is not the same as a reputable publication reporting on a primary source. For Tier 3 claims, trace back to the original: the government database, the peer-reviewed paper, the official transcript, the company filing. Secondary reporting is a pathway to the original — it is not the verification itself.
Verification is most efficient when it is built into the research process rather than added at the end. Practically, this means tagging AI-sourced claims during note-taking rather than trying to reconstruct which facts came from AI versus primary sources after the fact.
A simple markup system: during AI-assisted research, mark any claim you intend to use with a colored flag or a shorthand notation such as [T2] for Tier 2 and [T3] for Tier 3. At the end of a research session, you have a clear action list: Tier 2 items each need one primary-source link; Tier 3 items each need two. Nothing moves to your draft until it has the required confirmations.
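If your notes live in a plain text file, the [T2] and [T3] tags can be collected into an action list automatically. This is a rough sketch under stated assumptions: one claim per line, the tag appearing before the claim text, and a hypothetical function name `build_action_list`. The sample notes are placeholders, not real claims.

```python
import re

# Matches "[T2] claim text" or "[T3] claim text" anywhere in a line.
TAG_PATTERN = re.compile(r"\[T([23])\]\s*(.+)")

# Confirmations required per tier, as described in the text.
SOURCES_NEEDED = {2: 1, 3: 2}


def build_action_list(notes: str) -> list[tuple[int, str, int]]:
    """Return (tier, claim, sources_needed) for each tagged line, highest tier first."""
    actions = []
    for line in notes.splitlines():
        match = TAG_PATTERN.search(line)
        if match:
            tier = int(match.group(1))
            claim = match.group(2).strip()
            actions.append((tier, claim, SOURCES_NEEDED[tier]))
    # sorted() is stable, so claims keep their note order within each tier.
    return sorted(actions, key=lambda action: -action[0])


# Placeholder notes; untagged lines are treated as background and skipped.
notes = """\
General orientation on the industry, no tag needed.
[T2] Company X appointed a new chief executive in 2021
[T3] Revenue fell by N percent year over year
[T2] The regulator opened an inquiry after the merger
"""
actions = build_action_list(notes)
for tier, claim, needed in actions:
    print(f"Tier {tier}: {claim} (needs {needed} primary source(s))")
```

Sorting Tier 3 items to the top is a design choice of this sketch: they take the longest to confirm, so it helps to see them first.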
Science writer Ed Yong, who wrote on science communication practices in The Atlantic in 2020 and has regularly discussed AI research tools in subsequent interviews, has described a similar approach: never let AI-sourced claims "blend in" with verified claims in notes. Keeping the two visibly separate on the page shows at a glance which claims still need verification before drafting.
Several publicly accessible tools directly support the verification workflow. The key is knowing which tool addresses which type of claim.
Factual claims about statistics: Government statistical agencies (ONS in the UK, BLS and the Census Bureau in the US, Eurostat in the European Union) publish primary data directly. JSTOR, PubMed, and Google Scholar link to peer-reviewed originals. Always retrieve the actual paper or report rather than trusting a summary.
Corporate and organizational facts: SEC EDGAR (US), Companies House (UK), and equivalent national registries publish official filings. Leadership titles, financial data, and corporate structure confirmed against these filings count as primary-source verified.
Quotation verification: The Internet Archive's Wayback Machine captures page content at specific dates, allowing verification that a public statement was actually made and that it says what the AI claims it says. This is especially useful for statements from websites that have since been edited or removed.
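Checking whether a capture exists near a given date can be scripted. The sketch below builds a query for the Wayback Machine's public availability endpoint and reads the closest snapshot out of a response. The endpoint URL and the response shape reflect the Internet Archive's published API description as understood here, so confirm them against the current documentation before relying on them. The function names are hypothetical, and the sample response is hand-written for illustration, not a real capture.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; verify against the Internet Archive's current API docs.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str, yyyymmdd: str) -> str:
    """Build the lookup URL for the snapshot closest to the given date."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url, 'timestamp': yyyymmdd})}"


def closest_snapshot(response_text: str) -> Optional[dict]:
    """Return the closest available snapshot's URL and timestamp, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return {"url": closest["url"], "timestamp": closest["timestamp"]}
    return None


query = availability_query("example.com/statement", "20230315")
print(query)

# Hand-written sample in the assumed response shape (not a real capture).
sample_response = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20230314120000/http://example.com/statement",
            "timestamp": "20230314120000",
            "status": "200",
        }
    }
})
snapshot = closest_snapshot(sample_response)
print(snapshot)
```

The sketch deliberately makes no network request. Fetch the query URL with whatever HTTP client you already use, then open the returned snapshot and read the statement yourself: the existence of a capture confirms that the page was archived, not that it says what the AI claimed.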
Scientific claims: For AI-generated summaries of scientific research, always retrieve the original abstract at minimum, and note the actual sample size, methodology, and year. AI summaries of scientific papers frequently overstate findings, drop confidence intervals, and misattribute results.
Tag → Tier → Trace. Tag every AI-sourced claim in your notes. Assign each a tier based on risk. Trace each to the required number of original primary sources before it moves to your draft. This three-step habit, applied consistently, is the main defense against AI-assisted factual errors in published work.
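The Trace step amounts to a gate: a claim moves to the draft only once it has enough distinct original sources for its tier. The following is an illustrative sketch, with a hypothetical `Claim` class and placeholder claims. It tracks only Tier 2 and Tier 3, on the assumption that Tier 1 material informs your understanding but is not carried into the draft as a claim.

```python
from dataclasses import dataclass, field

# Confirmations required before a claim may enter the draft.
REQUIRED = {2: 1, 3: 2}


@dataclass
class Claim:
    text: str
    tier: int
    sources: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.tier not in REQUIRED:
            raise ValueError("only Tier 2 and Tier 3 claims are tracked for the draft")

    def ready_for_draft(self) -> bool:
        # Count distinct sources so one document listed twice is not two confirmations.
        return len(set(self.sources)) >= REQUIRED[self.tier]


def outstanding(claims: list[Claim]) -> list[Claim]:
    """Claims that still lack the confirmations their tier requires."""
    return [claim for claim in claims if not claim.ready_for_draft()]


# Placeholder claims and source labels, for illustration only.
claims = [
    Claim("Agency Y opened an inquiry in 2022", 2, ["agency press release"]),
    Claim("Prices rose N percent", 3, ["statistical agency table"]),
    Claim("Quote attributed to the chief executive", 3,
          ["official transcript", "official transcript"]),
]
pending = outstanding(claims)
for claim in pending:
    print("NOT READY:", claim.text)
```

Distinct labels are only a proxy for independence. Two articles that both rely on the same press release would pass this check, so the judgment about whether sources are truly independent stays with the writer.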
Present a set of AI research outputs to the assistant and practice categorizing each as Tier 1 (background), Tier 2 (specific checkable claim), or Tier 3 (high-risk claim requiring two confirmations). Then ask the assistant to help you build a verification action list — identifying which primary-source types would confirm each Tier 2 and Tier 3 claim.
ProPublica's investigative team began formally evaluating AI-assisted research workflows in 2023, with findings described in a Nieman Lab article by reporter Craig Silverman in early 2024. The key observation from ProPublica's process was that AI research tools changed the entry point of investigation but not its core method. Reporters used AI to get oriented faster — to understand the regulatory landscape of an industry, to identify which government agencies held relevant documents, to surface terminology that would improve database search strings. But the pivotal moment of every story — the human source interview, the document read, the expert conversation — remained unchanged. The reporters who used AI most effectively treated it as a way to arrive at those pivotal moments better prepared, not as a substitute for them.
One risk specific to AI-assisted research is what might be called framing capture: the tendency for AI's summary of a topic to shape not just what a writer knows but how they conceptualize the story. If an AI frames a story about a corporate scandal primarily through a financial lens, a writer who absorbs that framing uncritically may produce analysis that misses the human and cultural dimensions. The AI's framing was determined by what was most prominent in its training data — which reflects what was most published, not necessarily what is most significant.
The countermeasure is deliberate perspective-seeking after the AI research phase. Having absorbed the AI's orientation, explicitly ask: what angle is underrepresented here? Who would disagree with this framing? What human story does this financial summary leave out? These questions — best explored through direct human sources — are the corrective that keeps the writer's judgment in control of the story rather than the model's training data distribution.
In 2023, Columbia Journalism Review published analysis showing that AI-assisted news stories about corporate earnings tended to reproduce the framing priorities of financial wire services — not because reporters chose that framing consciously, but because it dominated the AI's training data. Writers who began with AI orientation before human reporting were more likely to reproduce these framing defaults than those who consulted human sources first.
The most powerful integration of AI into longform research is not replacement of human sources but preparation for them. An AI session that produces a solid orientation of a regulatory domain, a corporate history, or a scientific debate can dramatically improve the quality of subsequent expert interviews. The writer arrives knowing the vocabulary, understanding the fault lines in the debate, and able to ask second-order questions rather than first-order definitions.
This is the approach that science journalists at publications including STAT News and Wired described adopting in 2023 coverage of AI tools in journalism. Rather than asking an expert "can you explain what mRNA vaccines do?" — a question that could now be answered by AI — they could ask "the published literature suggests X, but I've seen criticism along the lines of Y — where do you come down, and why?" The AI handled the encyclopedic background; the human source provided the judgment, nuance, and new information that no model could supply.
As AI research tools become normalized, publication standards for disclosing AI use in research — distinct from AI use in drafting — are still developing. However, several principles are already clear from guidance issued in 2023 and 2024 by the Society of Professional Journalists, the Authors Guild, and academic style guides including APA and MLA.
First: AI is not a citable source for factual claims in published work. If a fact came from AI, the citation must trace to the primary source that confirms it; the AI session itself cannot be the reference. This is because AI outputs are not stable, reproducible, or independently verifiable in the way that published texts are. The formats that APA and MLA provide for citing generative AI serve to document that a tool was used or to quote its output, not to establish a fact.
Second: if AI was used in a research process that informed a published piece, disclosure is increasingly expected, particularly in journalism and academic writing. The form of disclosure varies — some publications use an end-note, others a process note — but transparency about AI's role is the emerging norm.
Third: AI-generated quotes or paraphrases attributed to real named individuals are ethically unacceptable in journalism, nonfiction, and academic writing, regardless of how plausible they appear. Any quotation must trace to a documented, verifiable source.
AI research changes the speed of orientation, not the standard of verification. Use AI to arrive at your primary sources better prepared. Use primary sources — human, documentary, institutional — to verify, deepen, and complicate everything AI gave you. The story is what happens when those two layers meet.
Choose a topic you are currently writing about — journalism, nonfiction, fiction research, or academic writing. Use the AI assistant to get oriented, then practice: (1) identifying the AI's framing choices and what perspective is underrepresented, (2) generating second-order questions for human sources, and (3) building a disclosure note for how you used AI in your research process.