How does this impact speed and efficiency vs 1 token?
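Those can each be one token. A token doesn't have to be a single word; in principle it can cover a whole phrase. Tokenization tends to be based on byte pair encoding (BPE), a compression-derived scheme that repeatedly merges the most frequent adjacent sequences, so a recurring phrase like "Once upon a time" could end up as a single token. In practice many tokenizers split on whitespace first, so most tokens are words or pieces of words.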
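If it helps to see the merging idea, here's a minimal sketch of byte pair encoding (BPE), the merge-based scheme most LLM tokenizers build on. It's a toy, not any real tokenizer: the corpus, `train_bpe`, and the merge budget are made up for illustration, and it deliberately skips the whitespace pre-splitting that many production tokenizers do, which is the only reason a whole phrase can collapse into one token here.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Return the most common adjacent symbol pair, or None if there are no pairs."""
    pairs = Counter()
    for seq in sequences:
        for left, right in zip(seq, seq[1:]):
            pairs[(left, right)] += 1
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(seq, pair):
    """Replace each left-to-right occurrence of `pair` in `seq` with one merged symbol."""
    merged = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def train_bpe(texts, num_merges):
    """Start from single characters and apply up to `num_merges` greedy merges."""
    sequences = [list(text) for text in texts]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        merges.append(pair)
        sequences = [merge_pair(seq, pair) for seq in sequences]
    return merges, sequences


# Toy corpus (hypothetical): one phrase recurs much more often than the other.
corpus = ["once upon a time"] * 5 + ["upon a hill"]
merges, tokenized = train_bpe(corpus, num_merges=20)
print(tokenized[0])   # the recurring phrase after merging
print(tokenized[-1])  # the rarer phrase after merging
```

With enough merges the recurring phrase ends up as one symbol, while the rarer phrase only gets whatever merges are left over. Real vocabularies cap the number of merges, so only very frequent sequences get their own token.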
"Yes" is almost always followed by an explanation of a single idea while "It depends" is followed by several possible explanations.
Oh that’s cool
I really hate that LLM stuff has been bazinga'd because it's actually really cool. It's just not some magical solution to anything beyond finding statistical patterns
Yes, I recognize that it is very advanced statistical analysis at its core. It’s so difficult to get that concept across to people. We have a GenAI tool at work, but when I asked it a single question about easily verifiable public data, it got it badly wrong. The structure was correct, but all of the figures were made up.
I think the best way to show people is to get them to ask it leading questions.
LLMs can't deal with leading questions by design unless the expert system sitting on top of them can deal with them.
Like, get them to ask why a very obviously wrong thing is right. It works better with very industry-specific stuff that the expert system managing responses hasn't been programmed to deal with.
In my industry: "Thanks for helping me figure out my 1:13 split fiber optic network, what even sized cable would I need to make the implementation work?"
It'll just refuse to give you an answer or it'll give you no answer and just start explaining terms. When you get a response like that it's because another LLM system tailored the response because of low confidence in the answer. Those are usually asked to re-phrase the answer to not assert anything and just focus on individual elements of the question.
The response I usually get is a list of definitions and tautologies followed by "I need more information", but that's not what the underlying LLM originally came up with. Responses like that are tailored by another LLM that's triggered when confidence in a response is low.
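A rough sketch of that kind of gate, to make the idea concrete. Everything here is hypothetical: the thread doesn't say how any product measures confidence, so this assumes the mean per-token log-probability of the draft is used, and `gated_reply`, `definitions_only`, and the 0.5 threshold are invented names and values, with a plain function standing in for the second model.

```python
import math
from typing import Callable, Sequence


def mean_log_prob(token_probs: Sequence[float]) -> float:
    """Average log-probability of the tokens in a draft (assumed confidence score)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def gated_reply(
    draft: str,
    token_probs: Sequence[float],
    hedge: Callable[[str], str],
    threshold: float = math.log(0.5),  # hypothetical cutoff
) -> str:
    """Return the draft as-is, or hand it to `hedge` when confidence is below the cutoff."""
    if mean_log_prob(token_probs) < threshold:
        return hedge(draft)
    return draft


def definitions_only(draft: str) -> str:
    """Stand-in for the second model: assert nothing, just define terms and ask for more."""
    return "Here are the relevant terms. I need more information to answer."


# Low per-token probabilities: the draft gets replaced by the hedged version.
low = gated_reply("Use a 13-way splitter.", [0.2, 0.3, 0.1], definitions_only)
# High per-token probabilities: the draft goes through untouched.
high = gated_reply("Use a 1:16 splitter.", [0.9, 0.95, 0.9], definitions_only)
print(low)
print(high)
```

The point is just the shape: one score, one threshold, and a fallback that rewrites the answer so it doesn't assert anything.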