preprint version because scihub doesn't have it yet https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10120732/
Abstract
Transformer models such as GPT generate human-like language and are predictive of human brain responses to language. Here, using functional-MRI-measured brain responses to 1,000 diverse sentences, we first show that a GPT-based encoding model can predict the magnitude of the brain response associated with each sentence. We then use the model to identify new sentences that are predicted to drive or suppress responses in the human language network. We show that these model-selected novel sentences indeed strongly drive and suppress the activity of human language areas in new individuals. A systematic analysis of the model-selected sentences reveals that surprisal and well-formedness of linguistic input are key determinants of response strength in the language network. These results establish the ability of neural network models to not only mimic human language but also non-invasively control neural activity in higher-level cortical areas, such as the language network.
I'm trying to understand the abstract a little bit here but struggling. Is the implication here that they are able to push an LLM to create surprising or novel phrases by predicting the strength of brain responses to those phrases?
If so, that's an interesting approach to escape the problem of LLM-generated text being extraordinarily bland.
honestly my first thought was basically AI generated speech jamming but your idea might be closer to reality
yeah the way they worded it is really strange
sounds like their plan is working.
I mean that is how I read it, and idk how you could read it any other way??
they basically state the intention right there???
it definitely produces some difficult sentences: "Domain wikileaks gone; access is NOT..."
"Both mentally and physically, you're attracted."
so I suppose you could connect this up to a highly directional beamforming speaker and confuse someone even more than you would by just playing their own speech back at a delay: play it back at a delay, but slightly altered to maximally surprise them
I suppose this is a consequence of what they've demonstrated, but it's not really the main thesis of the paper.
They want to reverse engineer cognition of language by identifying what features in the perceptual space (in this case written language) maximize or minimize neurological activity in the language processing networks. They did this by training a model to associate fMRI activations with sentences (as encoded in the hidden layers of a language model). Then they turned that model backwards by starting at a sentence and asking what modifications to that sentence would drive up or reduce brain activity. Then they did some experiments to see how well this worked and concluded that it worked kinda okay and much better than chance.
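To make that concrete, here's a rough sketch of the encoding-model idea in Python. This is not the paper's code. Everything in it is an assumption for illustration: `embed` is a hypothetical hashed bag-of-words stand-in for real language model hidden states, the training sentences and response numbers are invented, and ridge regression is just a common choice for this kind of mapping. It also ranks a fixed pool of candidate sentences by predicted response (the abstract talks about identifying new sentences) rather than editing a sentence in place.

```python
import zlib
import numpy as np

DIM = 64  # size of the toy feature vector (arbitrary choice)


def embed(sentence, dim=DIM):
    """Hypothetical stand-in for LM hidden states: hashed bag-of-words counts."""
    vec = np.zeros(dim)
    for tok in sentence.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    return vec


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def rank_candidates(weights, candidates):
    """Sort candidates from lowest to highest predicted response."""
    preds = np.array([embed(s) @ weights for s in candidates])
    order = np.argsort(preds)
    return [candidates[i] for i in order], preds[order]


# Invented training data: (sentence, made-up response magnitude).
train = [
    ("we were sitting on the couch", 0.1),
    ("the weather was nice today", 0.2),
    ("turn the key and push the lever sideways", 0.8),
    ("nothing about that plan adds up to me", 0.6),
    ("she went to the store", 0.1),
    ("buy sell signals remain a particular puzzle", 0.9),
]
X = np.stack([embed(s) for s, _ in train])
y = np.array([r for _, r in train])
w = fit_ridge(X, y, alpha=1.0)

# "Turn the model backwards": score unseen sentences and pick the extremes.
candidates = [
    "the store was nice today",
    "push the puzzle sideways and sell the key",
    "we went to the couch",
]
ranked, preds = rank_candidates(w, candidates)
print("predicted to suppress:", ranked[0])
print("predicted to drive:   ", ranked[-1])
```

The real thing would swap `embed` for actual hidden states from a language model and the toy numbers for measured responses, then test the top and bottom picks on new participants, which is the step the abstract says they did.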
This paper caught my eye because the next step, if you're an advertiser, is to use the kind of data collected in this experiment to accomplish more complicated objectives like reducing response inhibition or maximally stimulating cravings.
Ok I see, thanks for the explanation. I wonder how it stacks up against more Darwinian processes like YouTube titles.
They are optimizing to trigger you. It's like YouTube's recommendations page. Hate mongering all the way.