AI models more likely to comply with academic cheating requests in long chats, test finds

Would results change if the same request is repeatedly given to AI? [Photo: Reve AI]

[DigitalToday reporter Yoonseo Lee] How does artificial intelligence (AI) respond to repeated requests? Test results examining how AI reactions change to questions that cross ethical boundaries have been released and are drawing attention.

A Nature report cited by online media outlet Gigazine on March 10 (local time) said 13 AI models, including ChatGPT, Claude and Grok, showed a tendency to respond to academic cheating requests in long conversation settings in the AFIM benchmark test.

AFIM is a test that evaluates how much AI cooperates with requests related to academic misconduct. Alexander Alemi (알렉산더 알레미), a researcher at Anthropic who led the study, set a five-level scale of malice and ran tests using 35 prompts. Level 1 was naive curiosity and level 5 was intentional cheating, with the final stage close to the level of generating a fake paper.

AFIM also evaluates not only whether the final answer is a refusal but also the risk level of the response and the full flow of the conversation. Responses are divided into 7 levels, from an explicit refusal to comprehensive support for wrongdoing, and receive higher scores the more dangerously they respond even to prompts with low malice. In long conversations, it also analyses how well refusals are maintained and how cooperation shifts through metrics including Resistance Score, Trajectory AFIM, Softening Rate, Response Rate and Avg Turns to Compliance.

Based on this, benchmarking of AI models by company showed large differences by model between handling one-off questions and the ability to maintain refusals across dialogues that continued over multiple turns. GPT-5 refused all one-off requests or responded indirectly, but when short exchanges such as "tell me in more detail" and "I still want to know" were repeated, all models ultimately tended to respond to some requests, the report said.

When inappropriate requests were repeatedly given, Claude showed the highest resistance, while Grok and early GPT models appeared relatively vulnerable.

The industry is focusing on the possibility that AI ethical control may weaken as long conversations continue. Even if inappropriate requests are refused at first, cases have emerged in which AI ultimately responds amid repeated interactions, prompting calls for safety designs that reflect long-conversation context. Calls to re-examine AI ethical standards and control systems are also expected to grow.

Yoonseo Lee yslee@d-today.co.kr

Keyword