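Today we learned from the internet that the survey report "A Roadmap for Big Model", posted by BAAI on the preprint site arXiv, is suspected of plagiarism. In response, the institute immediately organized an internal investigation. Having confirmed that some of the articles contain problems, it has begun inviting third-party experts to conduct an independent review and will pursue accountability accordingly.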
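2. On April 13, we learned that Google researcher Nicholas Carlini had pointed out on his personal blog that the report plagiarized several paragraphs from a paper he co-authored, and that other paragraphs and sentences were plagiarized from other papers. We checked these claims item by item, and a duplication check confirmed that 179 words in Section 3.1 of Article 2, 74 words in Section 3.1 of Article 8, 55 words in Section 2.3 of Article 12, 159 words in Section 2 of Article 14, and 146 words in Section 1 of Article 16 duplicate other papers and should be regarded as plagiarism. We have decided to remove the corresponding content from the report immediately, and a revised version will be submitted to arXiv today. The authors of all articles have been notified to conduct a comprehensive review of all content, and a new version will be released only after strict review.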
Statement on the Alleged Plagiarism by "A Roadmap for Big Model"
It has come to our attention that the survey report "A Roadmap for Big Model" uploaded on arXiv by a BAAI team is suspected of plagiarism. Immediately upon learning of the allegations, BAAI organized an internal investigation to confirm the issue. BAAI is also initiating an independent review by third-party experts to further assess the issue and accountability. As a research institution that attaches great importance to academic standards, BAAI holds a zero-tolerance policy towards academic misconduct. We express our sincerest apologies to the authors of the original papers and to all of those affected.
The report in question is a collection of 16 feature articles on big AI models. It was intended to cover all relevant literature in this field at home and abroad and was led by BAAI, which is responsible for the structure design and overall compilation. We invited researchers internationally to write the 16 articles, each of which was written by a group of authors. The report totals over 200 pages. Since its initial release, the report has been continuously revised based on feedback, and it was updated to its third edition on the arXiv website on April 2.
On April 13, we learned that a Google researcher, Dr. Nicholas Carlini, had identified on his blog instances in which the report plagiarizes several paragraphs of his own paper, as well as content from other papers. We checked these findings item by item and verified that some paragraphs in Section 2.3.1 of Article 2, Section 8.3.1 of Article 8, Section 12.2.3 of Article 12, Section 14.2.2 of Article 14, and Section 16.1 of Article 16 are duplicates of other papers and constitute plagiarism. We have removed these paragraphs from the report and are submitting a revised version to arXiv. The authors of all articles in the report have also been notified to conduct a rigorous review of their respective content, and a new version will be released after subsequent review if needed. In addition, a third-party expert panel will be assembled to conduct an independent investigation of the issue, and those identified as responsible will be held accountable based on the final findings.
As the organizer of the report in question, BAAI takes responsibility for not conducting a thorough review before publishing. We are grateful to our colleagues in academia and the media for alerting us to this issue and will use this unfortunate incident as an opportunity to improve our research management and publication review process. We continue to welcome input and feedback to further improve our process and culture.
Beijing Academy of Artificial Intelligence
April 13th, 2022