LLMs are solving the MCAT, the bar exam, the SAT, etc. like they’re nothing. At this point their performance on these tests is superhuman. However, they’ll often trip on super simple common-sense questions, and they struggle with creative thinking.
Is this literally proof that standardized tests are not a good measure of intelligence?
Actually, you can give chatbots a real IQ test, and their scores fall into roughly the same spread as their rankings on other measures, with the leading model scoring at 100:
https://www.maximumtruth.org/p/ais-ranked-by-iq-ai-passes-100-iq