So I never got the chance to play with LLMs because of the phone number requirement. But recently DuckDuckGo added a chat feature that lets you talk to these models, so I have been trying them out.
I know these two models aren't state of the art, but it's tragic how they have devoured terabytes of text only to never understand the meanings of the words they use.
I tried talking to them about simple political issues. Once I asked why Elon Musk complains about woke while being a multibillionaire. I also asked why it is commonly said that the Israel-Palestine conflict is complicated. Both times it gave very NYT-esque, status quo-friendly answers, which I think is very dangerous if the user is a naive person looking to delve into these topics. But as I questioned the premise and the assumptions of the answers, it immediately started walking back and singing a completely different tune. Often I didn't even have to explain the problems I had; just asking it to explain its assumptions was enough.
I got it from saying "Musk is critical of censorship and the lack of free speech" to "Musk is a billionaire in a highly unequal society and his societal criticisms are to be taken with a ton of salt". For the Palestine one, it started off with a list of reasons behind the complexity. Then I got it to strike them off one by one, eventually concluding that the conflict is one of extreme power imbalance and that Palestinians are clear victims of settler colonialism according to international consensus. My arguments weren't even that strong and it just caved in almost immediately.
I'm still trying to find use cases for LLMs. Specifically, I would be really happy if I could find a use for a small model like TinyLlama. I find that text summarization is promising, but I wouldn't use it for a text I haven't read before because an LLM is a liar sometimes.
I can explain more later if need be, but here are some quick-ish thoughts (I have spent a lot of time around LLMs and discussion of them in the past year or so).
They are best for "hallucination" on purpose. That is, fiction/fantasy/creative stuff. Novels, RP, etc. There is a push in some major corporations to "finetune" them to be as accurate as possible and market them for that use, but this is a dead end for a number of reasons and you should never ever trust what an LLM says on anything without verifying it outside of the LLM (i.e. you shouldn't take what it says at face value).
LLMs operate on the probability of continuing what is in "context" by picking the next token. This means a model could have the correct info on something and, even with a 95% chance of picking it, still hit that 5% and go off the rails. LLMs can't go back and edit phrasing or plan out a sentence either, so if one picks a token that makes a mess of things, it just has to keep going. Similar to an improv partner in real life. No backtracking and "this isn't a backstory we agreed on", you just have to keep moving.
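To put rough numbers on that 95%/5% point, here is a minimal Python sketch. It is a toy, not how any real model is implemented: the two-token distribution, the names `sample_token` and `chance_of_any_slip`, and the idea that every token has the same independent 95% chance of being "right" are all assumptions made up for illustration.

```python
import random

def sample_token(probs, rng):
    """Pick one token at random, weighted by its probability.

    Once picked, the token is final: nothing here goes back to revise it.
    """
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def chance_of_any_slip(p_correct, n_tokens):
    """Chance that at least one of n_tokens picks is the 'wrong' one.

    Assumes each pick is independent with the same p_correct, which is
    a simplification (real next-token probabilities change every step).
    """
    return 1.0 - p_correct ** n_tokens

# Hypothetical toy distribution: the "right" answer has a 95% chance.
next_token_probs = {"Paris": 0.95, "Lyon": 0.05}

rng = random.Random(0)  # fixed seed so the run is repeatable
picks = [sample_token(next_token_probs, rng) for _ in range(1000)]
slip_rate = picks.count("Lyon") / len(picks)

print(f"observed slip rate over 1000 picks: {slip_rate:.3f}")
for n in (1, 10, 20, 50):
    print(f"{n:>3} tokens: chance of at least one slip = {chance_of_any_slip(0.95, n):.2f}")
```

The takeaway under these assumptions is that a small per-token chance of going wrong adds up quickly over a long reply, and since there is no backtracking, one bad pick stays in the text that everything after it builds on.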
Because LLMs continue based on what is in "context" (their short-term memory of the conversation, kind of), they tend to double down on what has already been said. So if you get one saying blue is actually red once, it may keep saying that. If you argue with it and it argues back, it'll probably keep arguing. If you agree with it and it agrees back, it'll probably keep agreeing. It's very much a feedback loop that way.
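The feedback loop can be sketched with a toy simulation in Python. This is not a real LLM: the rule that the chance of the next "stance" is proportional to how often that stance already appears in the context is an assumption chosen only to illustrate the doubling-down effect, and `agree_probability` and `run_chat` are hypothetical names.

```python
import random

def agree_probability(context):
    """Chance the next turn 'agrees', based only on what is already in context.

    Toy rule (an assumption, not how a real model scores tokens): each stance
    is weighted by how often it already appears, plus one for smoothing.
    """
    agree = context.count("agree")
    disagree = context.count("disagree")
    return (agree + 1) / (agree + disagree + 2)

def run_chat(seed_context, turns, rng):
    """Extend the context one turn at a time; every new turn feeds back in."""
    context = list(seed_context)
    for _ in range(turns):
        p = agree_probability(context)
        context.append("agree" if rng.random() < p else "disagree")
    return context

rng = random.Random(1)  # fixed seed so the run is repeatable
friendly = run_chat(["agree"] * 3, turns=20, rng=rng)
hostile = run_chat(["disagree"] * 3, turns=20, rng=rng)

print("started agreeing:   ", friendly.count("agree"), "of", len(friendly), "turns agree")
print("started disagreeing:", hostile.count("agree"), "of", len(hostile), "turns agree")
```

In this toy, whichever stance gets into the context first becomes more likely on every later turn, which is the same shape as the "if it argues back, it'll probably keep arguing" behavior described above.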