A web service called IN THE WEIGHTS has been released that compares how strongly various generative artificial intelligence (AI) models have learned the name of a specific person.
An online media outlet, Gigazine, reported on June 19 (local time) that IN THE WEIGHTS shows, in the form of a score, how heavily a specific person’s name is reflected inside an AI model.
Large language models (LLMs) process massive amounts of data during training and adjust their weights to reflect the patterns in that data. If a specific person’s name is strongly reflected inside a model, that person likely appeared relatively often, or in prominent contexts, in the training data. It also suggests the AI is more likely to be able to describe or mention the person without a separate web search.
IN THE WEIGHTS compares multiple generative AI models by inputting the same person’s name. Supported models include GPT-5.5, GPT-5.4 Mini, Opus 4.8, Haiku 4.5, Grok 4.20, Gemini 3.1 Lite, Kimi K2 0905, DeepSeek V4, Llama 3.3 70B, Llama 3.2 1B, GLM 4.7 Flash, Mistral 3.2 24B and Qwen3 8B.
When a user enters a person’s name, the service asks each model who the person is. It then collects up to 10 candidate results, brief explanations and confidence levels, and calculates a “STRENGTH SCORE” from the combined information.
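The report does not disclose how the score is actually calculated, so the following is only a rough sketch of what such an aggregation might look like. Everything in it is an assumption made for illustration: the `Candidate` type, the 0.0 to 1.0 confidence scale, the per-model cap of 10 candidates, the 0 to 1000 output range, and the rule of averaging each model's best confidence. It is not the service's real formula.

```python
from dataclasses import dataclass
from statistics import mean

# The article mentions "up to 10" candidates; applying the cap per model is an assumption.
MAX_CANDIDATES = 10


@dataclass
class Candidate:
    """One hypothetical answer from a model to "who is this person?"."""
    description: str   # brief explanation given by the model
    confidence: float  # assumed to be reported on a 0.0-1.0 scale


def strength_score(responses: dict[str, list[Candidate]]) -> int:
    """Toy aggregate: mean of each model's best confidence, scaled to 0-1000.

    This scoring rule is invented for illustration only.
    """
    if not responses:
        return 0
    best_per_model = []
    for candidates in responses.values():
        kept = candidates[:MAX_CANDIDATES]
        # A model that returns no candidates contributes 0.0.
        best = max((c.confidence for c in kept), default=0.0)
        # Clamp to [0.0, 1.0] in case a model reports an out-of-range value.
        best_per_model.append(min(max(best, 0.0), 1.0))
    return round(1000 * mean(best_per_model))


# Hypothetical responses from three placeholder models.
responses = {
    "model_a": [Candidate("Composer of the Classical era", 0.75)],
    "model_b": [Candidate("Composer", 0.5), Candidate("Fictional character", 0.25)],
    "model_c": [],
}
print(strength_score(responses))
```

Under this toy rule, a name only approaches the top of the scale when every model returns a high-confidence candidate, which is one plausible reason a widely known figure could score near the maximum.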
The initial screen listed the highest-scoring names among the people searched that day. Wolfgang Amadeus Mozart, William Shakespeare, Taylor Swift, Steven Spielberg and Elizabeth II ranked near the top, each with a score of 996. Gigazine said the figure appeared to be effectively the theoretical upper limit.
Actual searches also showed differences in name recognition across models. Apple Chief Executive Tim Cook scored 986 points, which the service displayed as the top 1 percent. Elon Musk, CEO of Tesla and SpaceX, scored 992, higher than Cook.
Japanese politician Sanae Takaichi scored 792 points. The service displayed this as the top 3 percent, but some models incorrectly associated the name with other people. For example, Mistral 3.2 24B suggested an animation character as a candidate, although it was confirmed that no character with that name actually appears in the work.
The service also flags the possibility of such errors. A “possible hallucination marker” is attached at the bottom of the results, and items with uncertain facts, such as low-confidence responses suggested by Llama 3.2 1B, are categorised separately.
Historical figures also received high scores. Oda Nobunaga, a figure from Japan’s Sengoku period, scored 982 points. This suggests that not only modern business figures but also historical figures may be strongly represented across many AI models.
IN THE WEIGHTS is less a simple person-search service than a tool for comparing how stably different AI models recognise a specific name. Because it shows not only scores but also incorrect candidates and low-confidence responses, it allows users to check differences in recognition and the possibility of hallucinations at the same time.
The industry sees such a service as potentially useful for examining differences in training bias and reliability across AI models for public figures, historical figures and celebrities.