AI Citation Hallucinations: A Hidden Threat to Scientific Publishing

Science relies on evidence that can be verified. Citations allow readers, reviewers, and researchers to trace claims back to their original sources and assess whether the evidence supports the conclusions being presented. They also connect new studies to earlier discoveries, methods, and theories, creating a documented chain of scientific knowledge. Citations are widely used to measure research impact and influence as well, which makes them a fundamental part of academic and scientific publishing.

But what happens when a citation leads nowhere? A researcher may encounter a reference that appears legitimate, complete with a convincing title, authors, and journal details, only to discover that the cited study does not exist.

A recent analysis of 111 million citations across 2.5 million research papers found a rise in these non-existent references following the widespread adoption of large language models (LLMs), highlighting growing concerns about citation hallucinations in scientific publishing.

What Are AI Citation Hallucinations and Why Do They Matter?

AI citation hallucinations occur when a large language model generates a reference that appears legitimate but cannot be verified in academic databases. These references often contain realistic titles, plausible author names, and convincing publication details.

The use of AI tools in academic writing has expanded rapidly over the last five years. Researchers use large language models to improve language, summarize literature, organize drafts, and assist with writing. While these tools can save time, they can also generate false information that appears accurate. In scientific writing, this sometimes results in fabricated references.

To measure the scale of the problem, researchers examined citations from arXiv, bioRxiv, SSRN, and PubMed Central. Together, these repositories cover fields such as computer science, mathematics, medicine, biology, environmental science, and the social sciences. The study analyzed approximately 111 million citations from papers published between 2020 and 2025. Researchers compared references against major bibliographic databases including OpenAlex and Semantic Scholar. Additional verification involved title matching, reference cleaning, and searches through Google Scholar. References that could not be verified after these checks were classified as unmatched citations.
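
The reference cleaning and title matching steps described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function names, the 0.9 similarity threshold, and the in-memory `toy_index` (a stand-in for a database such as OpenAlex or Semantic Scholar) are all assumptions made for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    letters_only = re.sub(r"[^a-z0-9]+", " ", ascii_only.lower())
    return letters_only.strip()


def is_matched(reference_title: str, index_titles: list[str], threshold: float = 0.9) -> bool:
    """Return True if the cleaned title is close enough to any indexed title."""
    cleaned = normalize_title(reference_title)
    return any(
        SequenceMatcher(None, cleaned, normalize_title(candidate)).ratio() >= threshold
        for candidate in index_titles
    )


# Hypothetical stand-in for a bibliographic database (assumption, not a real API).
toy_index = [
    "Attention Is All You Need",
    "Deep Residual Learning for Image Recognition",
]

# A reference that differs only in case and punctuation should still match.
print(is_matched("Attention is all you need.", toy_index))

# A made-up title, used here only to show the unmatched case.
print(is_matched("Quantum Gravity Effects in Citation Networks", toy_index))
```

A fuzzy threshold tolerates small formatting differences, which matters because a reference that fails an exact string comparison is not necessarily fabricated.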

The researchers did not assume that every unmatched citation was fabricated. Some references may be difficult to verify because of indexing limitations, formatting errors, or incomplete database coverage. To account for this, citation patterns after widespread LLM adoption were compared with historical trends. Growth above normal error levels was treated as the likely signal of AI-generated hallucinations.
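
The idea of treating growth above normal error levels as the signal can be sketched as a simple subtraction. The article does not describe the study's statistical model, so the mean-of-past-periods baseline used here is an assumption, and the numbers are illustrative only, not figures from the study.

```python
from statistics import mean


def excess_unmatched_rate(baseline_rates: list[float], observed_rate: float) -> float:
    """Estimate the unmatched rate above the historical baseline, floored at zero."""
    baseline = mean(baseline_rates)
    return max(0.0, observed_rate - baseline)


# Illustrative numbers only (assumptions, not data from the study).
pre_llm_rates = [0.010, 0.012, 0.011]  # unmatched share in earlier periods
post_llm_rate = 0.015                  # unmatched share in a later period

excess = excess_unmatched_rate(pre_llm_rates, post_llm_rate)
print(f"Estimated excess unmatched rate: {excess:.4f}")
```

Subtracting a baseline means that ordinary indexing gaps and formatting errors, which existed before LLMs, are not counted as hallucinations.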

How Many Hallucinated Citations Are Appearing in Research Papers?

The analysis found a clear change after late 2022. Before that period, unmatched citation rates remained relatively stable. Following the release and adoption of AI writing tools, those rates began increasing across all repositories examined. By August 2025, estimated hallucination rates reached approximately 0.39 percent in arXiv, 0.21 percent in bioRxiv, 1.91 percent in SSRN, and 0.27 percent in PubMed Central.

These percentages may appear small, but they represent a large number of citations across millions of papers. The researchers estimated that more than 146,000 hallucinated citations entered these repositories during 2025 alone.

One of the study’s most important findings involved distribution. Fabricated references were not concentrated in a small group of poor-quality papers. Instead, most affected papers contained only a few problematic citations. This pattern makes detection difficult. A manuscript with one fabricated reference and dozens of legitimate citations can appear entirely credible. Reviewers rarely verify every source in a reference list, particularly in papers with extensive bibliographies. As a result, hallucinated citations can pass through editorial and review processes without being identified.

The study found citation hallucinations across all major disciplines examined. Higher rates were observed in computer science and the social sciences, fields that also showed stronger indicators of AI-assisted writing. The researchers also inferred the degree of AI-assisted writing from paper abstracts and found a statistically significant association between inferred LLM use and citation hallucination rates. Papers showing stronger signs of AI assistance were more likely to contain unverifiable references. While this association does not prove causation, it is consistent with the hypothesis that increased use of large language models is contributing to the rise in fabricated citations.

Can Peer Review and Moderation Stop Citation Hallucinations?

The study examined whether existing quality-control systems are successfully identifying fabricated references before publication. Scientific publishing includes several safeguards, including repository moderation, editorial screening, and peer review. These processes are designed to detect problems before research enters the scholarly record. The findings suggest that some hallucinated citations are detected, but many remain in published papers.

Researchers analyzed nearly 31,000 manuscripts rejected by arXiv moderators and found that rejected submissions contained higher hallucination rates than accepted papers. This indicates that moderation systems identify some problematic manuscripts. However, the study estimated that nearly 79 percent of non-existent citations still entered the repository. The researchers also tracked bioRxiv preprints that later appeared in PubMed Central. More than 85 percent of hallucinated citations present in the preprints remained in the final published articles.

These results suggest that peer review alone is not sufficient to eliminate fabricated references. Reviewers typically focus on research methods, results, and conclusions. Verifying every citation in a manuscript is often impractical.

The implications extend beyond individual papers. Scientific literature is increasingly used by search engines, citation databases, knowledge graphs, and AI training datasets. Once fabricated references enter these systems, they can become part of the information environment used by future researchers and automated tools.

The researchers describe a possible feedback cycle. AI systems generate fabricated citations. Some of those citations enter the scientific record. Future AI models trained on that literature may encounter and reproduce the same references.

The study also highlights a broader issue. Verifying that a citation exists is relatively straightforward because bibliographic databases provide a clear reference point. Determining whether a real citation actually supports the claim being made is much harder. A citation can be genuine and still be used inaccurately.

The study has known limitations, including gaps in database coverage and discipline-specific citation practices. For that reason, the reported estimates are described as conservative rather than definitive. Even with those limitations, the study provides quantitative evidence that AI-related citation hallucinations are entering scientific literature across multiple disciplines and that existing review systems detect only a portion of these errors.

FAQs on AI Citation Hallucinations

Q: What is an AI citation hallucination in scientific research?

A: An AI citation hallucination occurs when a large language model generates a reference that appears legitimate but cannot be verified in academic databases. These citations often include realistic titles, author names, and publication details, making them difficult to identify without verification.

Q: How common are hallucinated citations in research papers published with AI assistance?

A: A large-scale study analyzing 111 million citations found that hallucinated citations have increased since the widespread adoption of large language models. The researchers estimated that more than 146,000 hallucinated references entered major scientific repositories during 2025, although the true number may be higher or lower due to database limitations.

Q: Why do AI tools create fake academic references?

A: Large language models generate text by predicting likely patterns rather than verifying facts against citation databases. When asked to provide references, an AI system may create a citation that looks plausible even when the underlying source does not exist. This behavior is commonly known as hallucination.

Q: Which academic fields have the highest rates of AI-generated citation hallucinations?

A: The study found citation hallucinations across all major disciplines examined. However, social sciences and computer science showed higher rates than several other fields. Researchers also observed a relationship between stronger indicators of AI-assisted writing and higher citation hallucination rates in these areas.

Q: Can peer review detect fake citations created by AI?

A: Peer review can identify some fabricated references, but it does not catch all of them. The study found that many hallucinated citations survived moderation, editorial screening, and journal publication processes. Reviewers often focus on research methods and conclusions rather than verifying every citation individually.

Q: How can researchers check whether a citation generated by AI is real?

A: Researchers can verify citations by searching academic databases such as Google Scholar, Semantic Scholar, OpenAlex, PubMed, or publisher websites. Checking the title, author names, publication year, and journal details can help confirm whether a cited source actually exists before a manuscript is submitted.

Q: Are early-career researchers more likely to include hallucinated citations?

A: The study found that authors associated with hallucinated citations generally had fewer prior publications than comparison groups. Smaller research teams and solo authors also showed higher rates of citation hallucinations. However, the findings show an association and do not prove that experience alone causes these errors.

Q: Could AI-generated fake citations affect future research and AI models?

A: Yes. Once fabricated references enter scientific repositories, citation databases, or other research systems, they may become part of datasets used by future researchers and AI models. This creates a risk that non-existent references could be repeated and spread across multiple information sources.

Q: What is the difference between a fake citation and a misused citation?

A: A fake citation refers to a source that cannot be verified because it does not exist. A misused citation refers to a real publication that is cited incorrectly or used to support a claim it does not actually support. Both issues can affect research accuracy, but fake citations are generally easier to identify.

Q: What can journals and researchers do to prevent AI citation hallucinations?

A: Researchers can manually verify all references generated with AI assistance before submission. Journals and repositories can also implement automated citation-checking systems that compare references against trusted bibliographic databases. Combining author verification with automated screening can reduce the number of fabricated citations entering the scientific record.
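
As a rough illustration of what automated screening could look like, the sketch below flags references in a manuscript that have no counterpart in a set of trusted records. Everything in it is hypothetical: the `Reference` type, the `flag_unverified` function, and the sample records are invented for the example, and a real system would query a bibliographic database rather than an in-memory list.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    title: str
    year: int


def key(ref: Reference) -> tuple[str, int]:
    """Build a lookup key from a punctuation-insensitive title and the year."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", ref.title.lower()).strip()
    return (cleaned, ref.year)


def flag_unverified(references: list[Reference], trusted: list[Reference]) -> list[Reference]:
    """Return the references whose key is absent from the trusted records."""
    trusted_keys = {key(record) for record in trusted}
    return [ref for ref in references if key(ref) not in trusted_keys]


# Hypothetical records standing in for a bibliographic database export.
trusted_records = [Reference("A Survey of Example Methods", 2021)]
manuscript_refs = [
    Reference("A survey of example methods.", 2021),
    Reference("An Entirely Invented Follow-Up Study", 2024),
]

# Flagged entries are candidates for manual checking, not proven fabrications.
for ref in flag_unverified(manuscript_refs, trusted_records):
    print(f"Needs manual check: {ref.title} ({ref.year})")
```

Flagged references are handed back to the author for manual verification, which matches the combination of automated screening and author checks described above.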
