Scientists are raising alarms over a surge of low-quality AI-generated research papers, warning that the practice could undermine trust in scientific publishing.
Top AI conferences, including NeurIPS and ICLR, have reported a sharp rise in papers and peer reviews written largely with automated tools. Studies indicate that up to 22% of recent computer science papers show signs of significant AI involvement.
The problem is not style but accuracy. Automated writing has introduced errors into papers where precision is critical, and reviewers have submitted AI-generated evaluations of their own, complicating peer review and overwhelming conference systems.
ICLR has responded by tightening its rules. Papers that fail to disclose extensive AI use may be rejected, and reviewers who submit low-quality AI-assisted reviews face penalties. The move aims to keep scientific claims credible and verifiable.
The volume of submissions continues to grow. NeurIPS received 21,575 papers in 2025, up from 17,491 in 2024, and some authors submitted more than 100 papers in a single year, far beyond typical output. Detection tools struggle to keep up, and no reliable standard exists for identifying AI-written content.
Researchers note that while AI can improve writing clarity, particularly for non-native English speakers, unchecked use can introduce errors such as fabricated references and incorrect figures. Experts also warn that feeding models large amounts of synthetic, AI-generated data can degrade their performance.
Industry leaders like OpenAI emphasize that AI tools can accelerate research but cannot replace human verification. Kevin Weil, OpenAI’s head of science, said, “It can be a massive accelerator, but you have to check it. It doesn’t absolve you from rigour.”
The issue highlights the tension between rapid technological adoption and the need for scientific reliability. Scientists argue that stronger disclosure and review standards are essential to maintain trust as AI becomes a standard tool in research.