- cross-posted to:
- technology
Haven't read the article, but I'm guessing this new model enables them to do something computers could do 20 years ago, only far, far less efficiently.
To me it seems like they added a preprocessor that can choose to tokenize individual letters or hand the request off to other systems, as opposed to just using the LLM. That makes it far better at logical reasoning and math from natural-language input. The achievement isn't counting the R's in strawberry, it's working around the fundamental issues of LLMs that make that task difficult.
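To make that guess concrete, here's a toy sketch in Python. To be clear, this is not how o1 works internally (nobody outside OpenAI has seen that); the `route` function, the regex, and the `["str", "aw", "berry"]` token split are all made up for illustration. It just shows why letter counting is awkward when the model sees subword chunks, and what "send it to another system" could look like.

```python
import re

# Illustrative assumption: a subword tokenizer might split the word like this.
# The model receives opaque token IDs for these chunks, not individual letters.
HYPOTHETICAL_TOKENS = ["str", "aw", "berry"]


def count_letter(word: str, letter: str) -> int:
    """Deterministic, character-level count (case-insensitive)."""
    return word.lower().count(letter.lower())


def route(prompt: str) -> str:
    """Toy preprocessor: answer letter-counting questions with plain code,
    and defer everything else to the language model (not implemented here)."""
    match = re.search(r"how many (\w)'?s? (?:are )?in (\w+)", prompt, re.IGNORECASE)
    if match:
        letter, word = match.group(1), match.group(2)
        return str(count_letter(word, letter))
    return "defer to LLM"


if __name__ == "__main__":
    # The chunks still spell the word, but no single chunk exposes all the R's.
    print("".join(HYPOTHETICAL_TOKENS))
    print(route("How many R's are in strawberry?"))
    print(route("Write me a JavaCC grammar"))
```

The hard part in a real system would be the routing decision itself, which a regex obviously doesn't solve; the sketch only shows that once the question reaches character-level code, the counting is trivial.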
I had to write some JavaCC parser code, and JavaCC is sort of terrible to write, so I jammed in what I wanted and o1 gave me a real and reasonable solution, as opposed to 4o, which was just bad. To go a bit Ted Kaczynski, we are already seeing a huge lack of critical thinking in market decisions. I've had bad ideas thrust upon me that were clearly ChatGPT-inspired; at least before ChatGPT the bad ideas had a bit of novelty to them. Now they are all the same homogeneous badness. Most market bets are just throwing things at the wall and seeing what sticks, and sometimes people are accidentally right, but now everyone is making the same bets and nothing sticks. The short-term productivity gains are far overshadowed by this larger market madness. At least the homogeneous decisions will now be a bit better, and hopefully a bit better reasoned, but we're still stuck at a level of bad we can never get away from.
Another W for accelerationists? Or an L because the newer models might reproduce status quo antetreatprintus?
https://cdn.arstechnica.net/wp-content/uploads/2024/09/strawberry_demo_HKeSJB5tjdRgdoGU.mp4?_=1
What happens in the "thinking"?
I really need to know this.
Since you don't see any text generation happening, it could be reviewing its own output to mimic knowledge of future tokens from a given point.
Haven't tried it much, but it does appear to be much better than 4o. 4o was kinda mid, super dumb.
It's $20/mo right now for o1 and 4o is also $20/mo sans preview