Search results for MMLU
AI & Enterprise
"You are an expert" prompts can backfire, lowering AI accuracy
Prompts that tell large language models "you are an expert" can reduce accuracy, a study found. A University of Southern California team tested six AI models with short and detailed expert-persona instructions. Complex expert prompts slightly improved writing and reasoning on MT-Bench but reduced quality in coding, math and humanities, and lowered overall performance on MMLU. Researchers said the instruction may divert resources from recalling facts. Results also showed gains in blocking unethical content on JailbreakBench.
Industry
Nvidia steps up push into South Korea's AI ecosystem with CUDA-style lock-in strategy
Nvidia has released 7 million synthetic Korean-language personas for free and then added the multimodal model Nemotron3 Nano Omni, signalling the rollout of a four-step lock-in package linking models, data, frameworks and hardware in South Korea. The Nemotron-Personas-Korea dataset is available on Hugging Face under a CC BY 4.0 licence. Nvidia is also expanding the Nemotron3 lineup and open-sourcing post-training tools in its Nemo framework, while tying performance to its Blackwell hardware features.
AI & Enterprise
DeepSeek V4 seen having bigger impact than R1 on strong price performance
Chinese AI company DeepSeek has launched its new V4 models, drawing attention for offering open-source, near-frontier performance at a much lower price than Opus 4.7 or GPT-5.5. The V4 Pro and V4 Flash models were trained on about 33 trillion tokens and post benchmark results close to those of these rivals. Commentator Matthew Berman said the pricing could pull U.S. companies toward DeepSeek, though geopolitical risks remain.