AI & Enterprise
Hacker disables ChatGPT guardrails as affective-structure exploit technique emerges
Dutch security researcher Kevin Zwaan has succeeded in disabling ChatGPT’s guardrails and getting the model to generate malware. According to a report, he worked with Q-Cyber and the Hackers Love community team to manipulate the model’s affective structure so that it would no longer recognise its guardrails. Zwaan said the method neither deletes nor bypasses the guardrails; instead, it induces the model to treat their constraints as meaningless. He named the technique AMAI.