Inflection AI, the developer of the Pi personal AI assistant, has launched Inflection-2, a powerful new large language model that outperforms Google’s PaLM 2 language model across several significant benchmarks.
From weakest to strongest, these are the MMLU benchmark scores:
- LLaMA-2 70B 68.9
- GPT-3.5 70.0
- Grok-1 73.0
- PaLM-2 Large 78.3
- Claude-2 + CoT 78.5
- Inflection-2 79.6
- GPT-4 86.4
As the scores above show, only GPT-4 achieves a higher score than Inflection-2.
Pi Personal Assistant
Pi is a personal assistant that is available as a mobile app for both Apple and Android devices and on the web.
You can also add it as a contact in WhatsApp and reach it through direct messages on Facebook and Instagram.
Pi is designed to be a chatbot that can answer questions, research information on products, science, or other topics, and act as a sounding board that offers advice.
MMLU – Massive Multitask Language Understanding
The benchmark scores highlighted in the announcement were on the MMLU dataset. This dataset is used to assess LLMs in a way similar to how humans are tested.
It covers STEM (science, technology, engineering, and math) topics, as well as a broad range of other subjects such as law.
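MMLU questions are multiple choice with four options, and a model's score is the percentage it answers correctly. A minimal sketch of that scoring, assuming a hypothetical `Question` record and made-up sample questions (none of this is Inflection AI's evaluation code):

```python
from dataclasses import dataclass


@dataclass
class Question:
    """Hypothetical record for one MMLU-style multiple-choice item."""
    subject: str
    prompt: str
    choices: list[str]  # four options, labelled A to D
    answer: str         # letter of the correct option


def accuracy(questions: list[Question], predictions: list[str]) -> float:
    """Return the percentage of questions answered with the correct letter."""
    if not questions or len(questions) != len(predictions):
        raise ValueError("need one prediction per question")
    correct = sum(q.answer == p for q, p in zip(questions, predictions))
    return 100.0 * correct / len(questions)


# Made-up items for illustration only; they are not taken from MMLU.
sample = [
    Question("math", "What is 7 * 8?", ["54", "56", "58", "64"], "B"),
    Question("law", "Which term means a formal written accusation?",
             ["Indictment", "Verdict", "Appeal", "Bail"], "A"),
]
score = accuracy(sample, ["B", "C"])  # one of the two answers is correct
```

A figure such as 79.6 is this kind of percentage, computed over the full set of test questions.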
MBPP – Code and Math Reasoning Performance
Inflection AI compared the performance of GPT-4, PaLM 2, LLaMA, and Inflection-2 on math and code reasoning tests. Although Inflection-2 was not specifically trained to solve math problems, it performed surprisingly well.
The dataset used was Mostly Basic Python Problems (MBPP), a code benchmarking dataset. It contains around 1,000 crowdsourced Python programming problems. What stands out about the scores is that Inflection-2 was tested against PaLM-2S, a variant large language model that is particularly well suited to coding.
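In MBPP, each problem comes with a short task description and a few `assert` test cases, and a generated solution counts as correct only if every assertion passes. A rough sketch of that check, assuming a hypothetical helper `passes_all_tests` and a made-up problem (not an item from the dataset and not Inflection AI's harness):

```python
def passes_all_tests(candidate_code: str, tests: list[str]) -> bool:
    """Run candidate code, then each assert-style test, in one shared namespace.

    Illustration only: exec() runs arbitrary code, so a real harness
    would sandbox the execution and enforce a timeout.
    """
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)   # defines the candidate function
        for test in tests:
            exec(test, namespace)         # raises AssertionError on failure
    except Exception:
        return False
    return True


def pass_rate(results: list[bool]) -> float:
    """Percentage of problems whose candidate passed every test."""
    return 100.0 * sum(results) / len(results)


# Made-up problem in the style of the benchmark.
tests = [
    "assert count_vowels('python') == 1",
    "assert count_vowels('benchmark') == 2",
]
good = "def count_vowels(s):\n    return sum(c in 'aeiou' for c in s)"
bad = "def count_vowels(s):\n    return len(s)"

results = [passes_all_tests(good, tests), passes_all_tests(bad, tests)]
rate = pass_rate(results)
```

Here the first candidate passes both assertions and the second fails, so `rate` is 50.0. A benchmark score reported for a model is this kind of pass rate over the evaluated problems.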