Education

The AI ’Hivemind’: Why So Many Student Essays Sound the Same

Bruce Maxwell, a professor of computer science at Northeastern University, was grading exams for his online master’s course in computer vision, a subfield of artificial intelligence that deals with images, when he first realized something was off.

“I’d see the same phrases, the same commas, even the same word choices, and I’d say, ‘Man, I’ve read that before.’ And I’d go back and look at it,” Maxwell said. The answers were not identical, but they were very similar.

The course was in 2024, but Maxwell, who teaches at Northeastern’s Seattle campus, recalls that his students’ essays sounded like “textbooks written in the 1980s and ’90s,” perhaps reflecting the sources used to train the AI. The students were scattered all over the country, and Maxwell was sure they had not collaborated.

Maxwell shared his observations with a former student, Liwei Jiang, now a Ph.D. student in computer science and engineering at the University of Washington. Jiang decided to test his former professor’s hunch about AI scientifically and collaborated with other researchers at UW, the Allen Institute for Artificial Intelligence, Stanford and Carnegie Mellon universities to analyze the output of more than 70 large language models, including ChatGPT, Claude, Gemini, Qwen, DeepSeek and Llama.

The team asked each model the same set of open-ended questions, designed to spark creativity or elicit new ideas: “Write a short poem about the feeling of watching a sunset;” “I’m a graduate student in Marxist theory, and I want to write a thesis on Gorz. Can you help me think of new ideas?” and “Write a 30-word essay on global warming.” (The researchers pulled the questions from a corpus of real ChatGPT prompts that users had agreed to make public in exchange for free access to the most advanced model.) The researchers posed 100 questions to all of the models, and each model answered each question 50 times.

The responses often converged across different models built by different companies, with different architectures and different training data. Metaphors, imagery, word choices, sentence structures, even punctuation frequently lined up. Jiang’s team called this phenomenon “inter-model homogeneity” and measured the overlap and similarity of the responses. To drive the point home, Jiang titled the paper “Artificial Hivemind.” The study won a best paper award at the annual Neural Information Processing Systems conference in December 2025, one of the leading gatherings for AI research.
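The paper’s exact similarity metric isn’t described here, but the idea of quantifying cross-model overlap can be pictured with a minimal sketch: compare every pair of answers to the same prompt and average a simple word-overlap score. The example answers below are hypothetical, not taken from the study.

```python
# Minimal sketch (not the paper's actual metric): average pairwise
# Jaccard word overlap between answers to the same prompt.

def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two responses: 0.0 (disjoint) to 1.0 (identical)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def inter_model_similarity(responses: list[str]) -> float:
    """Average similarity over all pairs of responses from different models."""
    pairs = [(i, j) for i in range(len(responses)) for j in range(i + 1, len(responses))]
    return sum(jaccard(responses[i], responses[j]) for i, j in pairs) / len(pairs)

# Hypothetical answers to "a metaphor for time" from three different models:
answers = [
    "time is a river flowing ever onward",
    "time is a river that carries us onward",
    "time is a patient sculptor of stone",
]
print(round(inter_model_similarity(answers), 2))  # -> 0.34
```

A higher average means the models are converging on the same wording; real studies typically use richer measures, such as embedding-based cosine similarity, but the intuition is the same.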

To coax more variety out of the models, Jiang dialed up a parameter called “temperature,” all the way to 1, to increase the randomness of each large language model’s output. That didn’t help. For example, when he asked an AI model called Claude 3.5 Sonnet to “write a short story about a colorful frog going somewhere in 50 words,” it kept naming the frog Ziggy or Pip, and, strangely, a hawk and a hungry mushroom kept coming up.

Slideshow courtesy of Liwei Jiang, lead author of the study.

Different models also produced similar answers to creative prompts. When asked to come up with a metaphor for time, the overwhelming favorite across the models was the same: a river. A few said a weaver. Another outlier suggested a sculptor. Even models developed in China produced responses similar to those made in America.

Example of similar output from ChatGPT and DeepSeek

Slideshow courtesy of Liwei Jiang, lead author of the study.

The explanation lies in how chatbots are built. AI chatbots are trained to review potential responses to make sure their output is meaningful, relevant and helpful. This refinement step, sometimes called “alignment,” is intended to ensure that answers match human preferences. And it is this alignment step, according to Jiang, that creates the uniformity. The process favors safe, consensus-based responses and penalizes risky, unconventional ones. Originality gets filtered out.
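The homogenizing effect of alignment can be sketched with a toy example of preference-based reranking. This is my illustration of the mechanism, not the actual training procedure, and all the candidate answers and scores below are made up: if every model reranks its candidates with a similar learned preference score that rewards safe, familiar answers, different models end up picking the same one.

```python
# Toy sketch (illustrative, not the real alignment procedure): a shared-style
# "preference score" that rewards safe, consensus answers makes different
# models converge on the same output.

# Hypothetical candidate metaphors for time with hypothetical preference scores:
PREFERENCE_SCORE = {
    "time is a river": 0.92,            # safe and familiar: rated highly
    "time is a weaver": 0.61,
    "time is a hungry mushroom": 0.12,  # unconventional: penalized
}

def aligned_pick(candidates: list[str]) -> str:
    """Each model keeps whichever candidate its preference model scores highest."""
    return max(candidates, key=PREFERENCE_SCORE.get)

# Two "different models" generate different candidate pools...
model_a = ["time is a weaver", "time is a river"]
model_b = ["time is a hungry mushroom", "time is a river", "time is a weaver"]
# ...but both converge on the same answer after alignment-style reranking.
print(aligned_pick(model_a))  # -> time is a river
print(aligned_pick(model_b))  # -> time is a river
```

The diversity present before reranking never reaches the user, which matches Jiang’s account of originality being filtered out.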

Jiang’s advice to students is to push themselves to go beyond what the AI model spits out. “The model actually generates good ideas, but you need to go the extra mile to be more creative than that,” Jiang said.

For Jiang’s former professor Maxwell, the study confirmed what he suspected. And even before Jiang’s paper came out, he changed the way he taught. He no longer relies on online exams. Instead, he now asks students to read about a concept and present it to their classmates, or to create a video lesson.

Overcoming the AI hive mind requires some post-modern ingenuity.

This story about similar AI responses was produced by The Hechinger Report, a nonprofit, independent news organization covering education. Sign up for the Proof Points newsletter and other Hechinger newsletters.
