A study found that about 35 percent of websites newly published since ChatGPT's release in 2022 may have been generated by artificial intelligence (AI) or written with AI assistance.
Japan's ITmedia reported on May 11 that researchers at Imperial College London, the U.S. non-profit Internet Archive and Stanford University published a paper titled "The Impact of AI-Generated Text on the Internet".
The researchers analysed websites published from August 2022 to May 2025. The study used the Internet Archive's web preservation service, the Wayback Machine. To avoid concentrating on specific domains, the team randomly sampled about 10,000 URLs each month, extracted page text and classified it using an AI text detector.
After comparing candidate detectors beforehand, the researchers selected Pangram v3. They explained that the tool showed consistently high accuracy across long and short texts, across different models such as GPT, Claude and Gemini, and across multiple languages. They then sorted the text into three categories: fully AI-generated, human-written with AI assistance, and fully human-written.
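The sampling and classification steps described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: the function names, the label strings and the `classify` callable are all hypothetical, and `classify` merely stands in for a detector such as Pangram v3 without representing its actual interface. The input is assumed to be page text already extracted and grouped by month.

```python
import random
from collections import Counter

# Hypothetical label names for the three categories used in the study.
LABELS = ("ai_generated", "ai_assisted", "human")


def sample_monthly(pages_by_month, k, seed=0):
    """Draw up to k page texts uniformly at random from each month.

    Sampling uniformly within each month avoids over-weighting
    domains that happen to publish many pages.
    """
    rng = random.Random(seed)
    return {
        month: rng.sample(pages, min(k, len(pages)))
        for month, pages in sorted(pages_by_month.items())
    }


def label_shares(sample, classify):
    """Return, per month, the share of pages assigned to each label.

    `classify` is an assumed callable mapping page text to one of LABELS.
    """
    shares = {}
    for month, pages in sample.items():
        counts = Counter(classify(text) for text in pages)
        total = sum(counts.values())
        shares[month] = {
            label: (counts[label] / total if total else 0.0)
            for label in LABELS
        }
    return shares


def toy_classify(text):
    """Placeholder rule for demonstration only; not a real detector."""
    if "[ai]" in text:
        return "ai_generated"
    if "[assist]" in text:
        return "ai_assisted"
    return "human"
```

In the study the per-month sample size was about 10,000 URLs; calling `label_shares(sample_monthly(pages_by_month, 10_000), classify)` with a real detector in place of `toy_classify` would give a monthly time series of the three shares.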
The study did not focus only on the share of AI-written text. The researchers also examined the gap between actual changes in internet text and how users perceive those changes. In a survey, many respondents voiced concerns such as "AI has increased misinformation" and "unique personal writing styles are disappearing and similar writing is increasing."
However, the large-scale analysis of web text found no evidence that factual accuracy had noticeably worsened across the internet as a whole. The researchers also found the homogenization of writing style to be less clear-cut than people perceive it to be.
Instead, two changes appeared relatively clearly. One was a reduction in semantic diversity. Content similarity within the group of AI-generated websites was 33 percent higher than within the group of human-written websites. The researchers explained that this result is consistent with AI's tendency to avoid extreme views and provide average, inoffensive answers. It suggests that the range of perspectives and original ideas online could narrow.
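One way to make a figure like "33 percent higher" concrete is to compute the average pairwise similarity within each group of texts and then take the relative difference. The sketch below is only an illustration under stated assumptions: the article does not say how the study measured similarity, so plain bag-of-words cosine similarity is used here as a stand-in, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_pairwise_similarity(texts):
    """Average cosine similarity over all unordered pairs of texts."""
    vectors = [bag_of_words(t) for t in texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two texts")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def relative_increase(value, baseline):
    """Percent by which value exceeds baseline."""
    return (value - baseline) / baseline * 100.0
```

Under this reading, `relative_increase(mean_pairwise_similarity(ai_texts), mean_pairwise_similarity(human_texts))` returning about 33 would correspond to the reported gap, where `ai_texts` and `human_texts` are the two groups of page texts. A higher within-group similarity means the texts resemble one another more, which is the sense in which diversity shrinks.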
The other change was a spread of excessive positivity, called a "positivity shift." Positivity scores on sites involving AI generation were 107 percent higher than on human-written sites, or roughly twice as high. The researchers viewed this as reflecting AI's tendency to choose overly bright, inoffensive expressions to avoid putting users off.
This led to the assessment that the changes now seen on the internet point in a different direction from an explosive increase in misinformation. The researchers pointed out that the core of the crisis lies not in the spread of blatant lies or rumours but in an increase in AI-typical sentences that are "not bold and overly bright." They said online text is gradually shifting toward an artificial, excessively wholesome tone.
This trend shows that generative AI is beginning to influence how expression works across the web, beyond serving as a simple writing aid. Going forward, the task is to examine not only the volume of information created by AI but also which tone and perspective become entrenched as the standard on the internet.