This is just a restatement of the second example argument I gave, trying to assert something about the internals of a model (it doesn't understand) based on the fact that it was optimized to predict the next token
It's not "optimised" to do that, that's all it does. Like what specifically do you mean by internals? the weights of particular nodes?
You seem to be implying there's something deeper, some sort of persistent state or something but it is stateless after training. It's just a series of nodes and weights, they cannot encode more than patterns derived from training data.
Not the weights, the activations, these depend on the input and change every time you evaluate the model. They are not fed back into the next iteration, as is done in an RNN, so information doesn't persist for very long, but it is very much persisted and chewed upon by the various layers as it propagates through the network.
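To make the weights/activations distinction concrete, here is a minimal sketch in plain Python. The tiny two-layer network, the names `W1`, `W2` and `forward`, and the numbers are all made up for illustration and look nothing like a real transformer. The only point is that the weights stay fixed between calls while the activations are recomputed from each input.

```python
# Toy feed-forward network. W1, W2 and forward() are hypothetical names
# chosen for this illustration, not part of any real LLM.
W1 = [[0.5, -1.0], [1.0, 0.25]]  # fixed after "training"
W2 = [1.0, -0.5]                 # fixed after "training"


def relu(v):
    return max(0.0, v)


def forward(x):
    """Return (hidden activations, output) for one input vector."""
    # Hidden activations are computed fresh from the input on every call.
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    # The output layer consumes those activations; nothing is kept afterwards.
    output = sum(w * h for w, h in zip(W2, hidden))
    return hidden, output


if __name__ == "__main__":
    # Same weights, different inputs, different activations.
    print(forward([1.0, 0.0]))
    print(forward([0.0, 1.0]))
```

Calling `forward` twice with different inputs gives different hidden activations even though `W1` and `W2` never change, which is the sense in which the "internals" are more than the weights.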
I am not trying to claim that the current crop of LLMs understand in the sense that a human does, I agree they do not, but nothing you have said actually justifies that conclusion or places any constraints on the abilities of future LLMs. If you ask a human to read a joke and then immediately shoot them in the head before it's been integrated into their long term memory they may or may not have understood the joke.
I really don't think your analogy is a great one there. We can't compare brains to computers usefully because they're super distinct. You're sneaking in this assumption that there is more complexity to the models by implying there's something larger present being terminated early but there isn't.
This seems as absurd to me as asking whether a clock has a concept of time. Being very good at doing time related stuff, vastly superior to a human, is not evidence in favour of having any sort of knowledge of time. I think that the interface of these models may be encouraging you to attribute more to them than there could possibly be.
The analogy is only there to point out the flaw in your thinking: the lack of persistence applies to both humans (if we shoot them quickly) and LLMs, so your argument applies in both cases. And I can do the very same trick to the clock analogy. You want to say that a clock is designed to keep time and that's all it does, therefore it can't understand time. But I say, look, the clock was designed to keep time, yes, but that is far from all it does. It also transforms electrical energy into mechanical energy and uses it to swing some arms around at constant speed, and we can't see the inside of the clock, so who knows what is going on in there? Probably nothing that understands the concept of time, but we'd have to look inside and see. LLMs were designed to predict the next token, they do actually do so, but clearly they can do more than that, for example they can solve high school level math problems they have never seen before and they can classify emails as being spam or not. Yes, these are side effects of their ability to predict token sequences, just as human reasoning is a side effect of their ability to have lots of children. The essence of a task is not necessarily the essence of the tool designed specifically for that task.
If you believe LLMs are not complex enough to have understanding and you say that head on, I won't argue with you, but if you're claiming that their architecture doesn't allow it even in theory, then we have a very fundamental disagreement.
Huh? a human brain is a complex as fuck persistent feedback system. When a nervous impulse starts propagating through the body/brain whether or not that one specifically has time to be integrated into consciousness has no bearing on the existence of a mind that would be capable of doing so. It's not analogous at all.
LLMs were designed to predict the next token, they do actually do so, but clearly they can do more than that, for example they can solve high school level math problems they have never seen before
No see this is where we're disagreeing. They can output strings which map to solutions of the problem quite often. Because they have internalised patterns, they will output strings that don't map to solutions other times, and there is no logic to the successes and failures that indicates any sort of logical engagement with the maths problem. It's not like you can say "oh this model understands division but has trouble with exponentiation" because it is not doing maths. It is doing string manipulation which sometimes looks like maths.
human reasoning is a side effect of their ability to have lots of children.
This is reductive to the point of absurdity. you may as well say human reasoning is a side effect of quark bonding in rapidly cooling highly localised regions of space time. you won't actually gain any insight by paving over all the complexity.
LLMs do absolutely nothing like an animal mind does, humans aren't internalising massive corpuses of written text before they learn to write. Babies learn conversation turn taking long before anything resembling speech for example. There's no constant back and forth between like the phonological loop and speech centers as you listen to what you just said and make the next sound.
The operating principle is entirely alien and highly rigid and simplistic. It is fascinating that it can be used to produce stuff that often looks like what a conscious mind would do but that is not evidence that it's doing the same task. There is no reason to suspect there is anything capable of supporting understanding in an LLM, they lack anything like the parts we expect to be present for that.
Huh? a human brain is a complex as fuck persistent feedback system
Every time-limited feedback system is entirely equivalent to a feed-forward system, similar to how you can unroll a for loop.
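Here is a minimal sketch of the unrolling point, in plain Python. The update rule `step` and the function names are invented for the example; any deterministic update would do. A loop that feeds its state back into itself for a fixed number of iterations computes the same thing as a straight-line chain of the same step applied that many times.

```python
def step(state):
    # Hypothetical update rule, picked only so the numbers are easy to follow.
    return 0.5 * state + 1.0


def feedback(x, steps):
    """Feedback form: the output of each step is fed back in as the next input."""
    state = x
    for _ in range(steps):
        state = step(state)
    return state


def unrolled_three(x):
    """Feed-forward form: the same three steps written out as a fixed chain."""
    return step(step(step(x)))


if __name__ == "__main__":
    print(feedback(4.0, 3) == unrolled_three(4.0))
```

The unrolled version has no loop and no persistent state, only three "layers" in a row, yet it matches the feedback version for that time limit.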
No see this is where we're disagreeing.... It is doing string manipulation which sometimes looks like maths.
String manipulation and computation are equivalent. Do you think that not just LLMs but computers themselves cannot in principle do what a brain does?
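As a small illustration of that equivalence, here is a sketch that adds two non-negative decimal numbers given purely as strings, working digit by digit the way you would on paper. The function name `add_decimal_strings` is my own, and the arithmetic on character positions is a shortcut standing in for a fixed lookup table of digit pairs.

```python
DIGITS = "0123456789"


def add_decimal_strings(a, b):
    """Add two non-negative decimal numerals given as strings; return a string."""
    # Left-pad with "0" so both strings have the same number of digits.
    width = max(len(a), len(b))
    a, b = a.rjust(width, "0"), b.rjust(width, "0")

    out = []
    carry = 0
    # Walk from the last character to the first, carrying as needed.
    for ca, cb in zip(reversed(a), reversed(b)):
        total = DIGITS.index(ca) + DIGITS.index(cb) + carry
        out.append(DIGITS[total % 10])
        carry = total // 10
    if carry:
        out.append(DIGITS[carry])
    return "".join(reversed(out))


if __name__ == "__main__":
    print(add_decimal_strings("999", "1"))
```

Nothing here ever converts the whole input to a number; the procedure only reads and writes characters, and the result is still ordinary addition.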
..you may as well say human reasoning is a side effect of quark bonding...
No because that has nothing to do with the issue at hand. Humans and LLMs and rocks all have this in common. What humans and LLMs do have in common is that they are a result of an optimization process and do things that weren't specifically optimized for as side effects. LLMs probably don't understand anything but certainly it would help them to predict the next token if they did understand, describing them as only token predictors doesn't help us with the question of whether they have understanding.
...but that is not evidence that it's doing the same task...
Again, I am not trying to argue that LLMs are like people or that they are intelligent or that they understand, I am not trying to give evidence of this. I'm trying to show that this reasoning (LLMs merely predict a distribution of next tokens -> LLMs don't understand anything and therefore can't do certain things) is completely invalid
The architecture of LLM neurons is incredibly simplified and Bayesian in nature. Each one can interact with other neurons and maintain an activation weight and some other parameters, but it's not a physical object.
Biological neurons are independent organisms capable of self organization, migration, communication, and interaction either directly with the world or abstractly through nerve senses.
The general concept of LLM architecture (something that has been around for decades now, I think all the way back to the 50s) is a reduced and simplified facsimile of that biological function.
I think because we interact with LLMs through an interface that's been basically exclusively limited to other human interactions forever, it can be easy to forget that they aren't the system they're emulating. They're no more a sentient machine than a dialysis machine is a kidney.
The very first chatbots had a similar effect on users, even though those were more expert machines and didn't use large natural language training sets. And in the end I believe that replicating the biological function of kidneys and livers and lungs is a much more important step in human history than replicating the function of the mind. Especially because any simulation of the mind trained on a natural language dataset is not something that can ever help us.
It will at best begin to placate us, we will have a mirror held up to ourselves because the training of the model isn't done for the sake of creating intelligence, but for making something that resembles intelligence enough to make us happy. The training is done entirely on our terms.
And again, LLMs and more broadly statistical models do have tons of uses, such as discovering hidden patterns in data that would take forever for a human to find by hand. They can also be used in planning to simplify economic forecasting and detect possible shortages and future labor allocation needs (this was done by hand for GOSPLAN, and TANS proposes using these models for cybernetic planning systems).
But it's still just a machine, it's still just a programming language. A language where the syntax is a giant matrix of floating point numbers and relationship rules, but a programming language nonetheless.
Idk if we can ever see eye to eye here.. if we were to somehow make major advances in scanning and computer hardware to the point where we could simulate everything that biologists currently consider relevant to neuron behavior and we used that to simulate a real person's entire brain and body would you say that A) it wouldn't work at all, the simulation would fail to capture anything about human behavior, B) it would partly work, the brain would do some brain like stuff but would fail to capture our full intelligence, C) it would capture human behaviors we can measure such as the ability to converse but it wouldn't be conscious, or D) something else?
Personally I'm a hard core materialist and also believe the weak version of the Church-Turing thesis. I'm quite strongly wedded to this opinion, so the idea that being made of one thing vs another, or being informational vs material, says anything about the nature of a mind is quite foreign to me. I'm aware that this isn't shared by everyone, but I do believe it's the most common perspective inside the hard sciences, though not universal: Roger Penrose is a brilliant physicist who doesn't see it this way.
I understand your perspective, and I don't necessarily disagree or think that there's anything innately spiritual or unique about biological intelligence. I do also agree that you could hypothetically scan every aspect of a brain or build a system that exactly mimics the behavior of neurons and probably pretty accurately recreate human intelligence.
I really think our only disconnect is that I don't think the current LLM model is anything close to complex or developed enough to be considered that.
That's a perfectly reasonable position. The question of how complex a human brain is compared with the largest NNs is hard to answer, but I think we can agree it's a big gap. I happen to think we'll get to AGI before we get to human brain complexity, parameter wise, but we'll probably also need at least a couple of architectural paradigms on top of transformers to compose one. Regardless, we don't need to achieve AGI or even approach it for these things to become a lot more dangerous, and we have seen nothing but accelerating capability gains for more than a decade. I'm very strongly of the opinion that this trend will continue for at least another decade; there are just so many promising but unexplored avenues for progress. The lowest of the low hanging fruit has been, while lacking in nutrients, so delicious that we haven't bothered to do much climbing.
I don't know if it's relevant as I haven't read it yet but I was recommended this book: https://www.hup.harvard.edu/books/9780674032927 in a conversation the other day that was related to the pitfalls of comparing humans and computers.
It might be interesting? apparently it made some significant waves when published.