But it’s not cheap.
“The model is definitely better at solving the AP math test than I am, and I was a math minor in college,” OpenAI’s chief research officer, Bob McGrew, tells me. He says OpenAI also tested o1 against a qualifying exam for the International Mathematics Olympiad, and while GPT-4o correctly solved only 13 percent of problems, o1 scored 83 percent.
That's still unreliable enough that I wouldn't trust it to actually do anything. If it scoured its database for a trigonometry textbook and cited a solution to a problem that was as correct as any web calculator, cool. That'd be as useful as Google was in 2010. 83% is the kind of score I get on advanced mathematics tests when I have no idea what I'm doing but half-remember the basic steps to get an answer.
I'm curious how it will do on the private benchmark that AI Explained made. I think it was called SimpleBench?
No, it doesn't have reasoning abilities. It just replicates you trying to coax it into giving you something decent, hides the process from you, and then charges you for it.