Is there anything good to read that compares Dreyfus' critique of AI and the new technological developments in "AI"?

Do contemporary researchers even bother to answer his criticisms anymore? Is anyone writing philosophically informed critiques of LLMs as "AI"? Do "AI" researchers even bother trying to respond to the history of philosophy and consciousness?

Edit: Has anyone read Negarestani's Intelligence and Spirit?

  • invalidusernamelol [he/him] · 11 months ago

    The basic structure of LLMs and neural networks is definitely controversial. Chomsky hates them because they run completely counter to his theories of language.

    I think overall they do a good job of approximating the high level process of thought, but it's kinda like approximating pi as 4. Sure you'll get pretty close a lot of the time, but you can't really do much with that beyond approximation.

    The superstructure of neural networks is basically:

    Training Data (historical knowledge) -> Activation Layer (perception) -> Abstraction Layers (thought process, there can be lots of these) -> Output Layer (action)
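    The pipeline above can be sketched in a few lines of code. This is a toy illustration with made-up weights and layer sizes, not any real trained network:

```python
# Toy forward pass: input ("perception") -> hidden "abstraction" layer -> output ("action").
# All weights here are invented for illustration.
import math

def sigmoid(x):
    # Squashes any real number into (0, 1) -- a common activation function.
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # Each output neuron is a weighted sum of all inputs, passed through sigmoid.
    return [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in weights]

x = [0.5, -1.0]                        # input layer: raw "perceived" values
hidden = layer(x, [[0.8, -0.2],        # abstraction layer (there can be many of these)
                   [0.4, 0.9]])
output = layer(hidden, [[1.0, -1.0]])  # output layer: a single "action" value
```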

    "Training" a network basically involves splitting the data in half and using half of it as input to fit the abstraction layers to a result (e.g. a cat picture is correctly identified) then using the other half for more training without it knowing the answer first.

    As you tweak those neuron activation weights and connections, you're meant to be simulating how neurons fire within the human brain in an incredibly simplified way.

    Training works through backpropagation, which lets errors measured at the output tweak the values of neurons in the lower-level layers up the chain, meaning the network can essentially rewire itself depending on its inputs. (Backpropagation isn't unique to LLMs; the distinctly modern addition is the transformer architecture's attention mechanism.)
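    Backpropagation on the smallest possible "network" (one weight, one input, both made up here) looks like this: compute the error at the output, push its gradient back to the weight, nudge, repeat:

```python
# One-weight "network": prediction = w * x, squared-error loss.
# The gradient is the error signal flowing backward to adjust w.
def forward(w, x):
    return w * x

def loss(w, x, target):
    return (forward(w, x) - target) ** 2

w, x, target, lr = 0.0, 1.5, 3.0, 0.1
for _ in range(100):
    # dL/dw = 2 * (w*x - target) * x  -- the backpropagated error signal
    grad = 2 * (forward(w, x) - target) * x
    w -= lr * grad                     # nudge the weight downhill
```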

    All of this has been implemented physically in the past and roughly works, but the training part could take weeks. Simulating it in code has reduced that time to hours.

    That being said, the actual philosophical questions about thought haven't been approached much in this field; instead, researchers are attempting to digitize the physical processes of neuron activity.

    • Parsani [love/loves, comrade/them] (hexagon) · 11 months ago

      > Activation Layer (perception)

      I'll have to read more on how neural nets work, but I don't quite understand how this is analogous to perception. Perceptual experience seems to be a very important part of consciousness (at least from the philosophy I have read), but I don't see much about it in stuff about AI; instead, there is a lot of this:

      > attempting to digitize the physical processes of neuron activity

      • invalidusernamelol [he/him] · 11 months ago (edited)

        The perception layer is just the input layer. It's not really any different than the other layers "physically", but that's the first point of contact for the network.

        Like with LLM chatbot implementations, that layer is just the input text after being tokenized, which means split into subword pieces and mapped to integer IDs (it's not really compression). With text-to-image networks, the prompt that goes in is likewise tokenized text.
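        A toy picture of tokenization (real tokenizers use learned subword vocabularies; this whitespace version, with a vocabulary built on the fly, only shows the shape of the idea):

```python
# Toy tokenizer: split text into pieces and map each piece to an integer ID.
# Real LLM tokenizers use learned subword vocabularies, not whitespace.
vocab = {}

def tokenize(text):
    ids = []
    for piece in text.lower().split():
        if piece not in vocab:
            vocab[piece] = len(vocab)   # assign the next free ID
        ids.append(vocab[piece])
    return ids

ids = tokenize("the cat sat on the mat")
```

Note that the two occurrences of "the" get the same ID; identical pieces always map to identical tokens.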

        With the first machine learning systems, like the Mark I Perceptron, they literally used photocells pointed at images. All that really matters on that layer is that it's able to drive some sort of input state or gradient that is then propagated through the other layers of neurons.

        The concept of a weight in machine learning is literally just one adjustable number, a resistance in the original hardware. In the earliest physical implementations, the weights were potentiometers that you fiddled with until the input data, after passing through the network, returned a desired result (circle or square). With modern implementations, the resolution is vastly higher: billions of weights are tuned by feeding known inputs and nudging the dials until the desired output is returned. That tuning is done by gradient descent rather than literal brute force, over weeks or months on large clusters of computers.
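        The potentiometer picture can be sketched directly (the threshold, input, and dial range here are all invented): one neuron is one adjustable number, and "training" is turning the dial until the output comes out right:

```python
# Toy "potentiometer" neuron: fires (1) when the weighted input passes
# a threshold. Training = sweeping the dial until the output is correct.
def neuron(w, x):
    return 1 if w * x > 1.0 else 0

x, target = 0.4, 1          # we want the neuron to fire for this input
best_w = None
for dial in range(0, 101):  # brute-force sweep of the dial: w = 0.0 .. 10.0
    w = dial / 10.0
    if neuron(w, x) == target:
        best_w = w          # first dial setting that gives the right answer
        break
```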

        In the end, the whole idea is to create a set of activation values that correctly route a high (or partially high) signal to an output neuron or neuron cluster.

        This whole thing is based on the idea that words form vectors in n-dimensional space, as opposed to phonetic interpretations of language. That does match how language is structured, but an LLM doesn't learn language through sounds and context; it learns by compiling an n-dimensional matrix of all words and their semantic relationships.
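        With made-up 3-dimensional vectors (real models learn hundreds or thousands of dimensions), the "words as vectors" idea looks like this: semantically close words point in similar directions, which you can measure with cosine similarity:

```python
# Toy word vectors (invented numbers, 3 dimensions instead of hundreds).
# Cosine similarity measures how closely two vectors point the same way.
import math

vectors = {
    "cat":   [0.9, 0.8, 0.1],
    "dog":   [0.8, 0.9, 0.2],
    "piano": [0.1, 0.2, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

cat_dog = cosine(vectors["cat"], vectors["dog"])      # similar direction
cat_piano = cosine(vectors["cat"], vectors["piano"])  # different direction
```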

        But basically, the way Machine Learning people treat perception is just as the physical gradients that activate senses. Like the hairs in your ear detecting pressure frequencies, or your rods and cones detecting light frequencies, or your skin detecting heat and pressure gradients.

        There's no real view of the "whole" in terms of perception, just the wiring of different measurement values into a dynamically rewired system that can be adjusted to reroute electrical potentials to specific areas that can then be given meaning.