This is legit.

  • The actual conversation: https://archive.is/sjG2B
  • The user created a Reddit thread about it: https://old.reddit.com/r/artificial/comments/1gq4acr/gemini_told_my_brother_to_die_threatening/

This bubble can't pop soon enough.

  • amemorablename@lemmygrad.ml
    ·
    3 hours ago

    I'm not saying this to excuse Google (I generally avoid the big corp AI models), but I've used LLMs for... what is it, almost 2 years now? And the degradation here seems barely even existent; it just goes from 0 to 100. I only skimmed, so maybe I missed something important. It's very weird. Typically there's some kind of path leading up to output like this, and corp models such as this one are usually tuned heavily to stay on a sanitized, assistant-like track.

    The theory that weird tokens in the input caused it to go wonky does seem plausible. LLMs use tokenizers (things that break text up into words or segments of words), so input that tokenizes strangely, relative to what the model was trained on, could maybe knock it off the rails.
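
    To make the tokenizer part concrete, here's a minimal sketch using the openly available GPT-2 tokenizer from Hugging Face as a stand-in (Gemini's own tokenizer is different, but the splitting behavior is the same in spirit); the example strings are just made up for illustration:

    ```python
    # Tokenizer illustration; GPT-2's tokenizer stands in for Gemini's here.
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")

    # Ordinary text splits into common, frequently-seen tokens
    print(tok.tokenize("I went to the store"))
    # roughly: ['I', 'Ġwent', 'Ġto', 'Ġthe', 'Ġstore']  (Ġ marks a leading space)

    # Mangled or unusual text fragments into many rarer sub-word pieces
    # that the model saw far less often during training
    print(tok.tokenize("pl3ase ânswér the qùestion!!1"))
    ```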

    Anyway, I tend to be opposed to the sanitized assistant format they're most known for, because it presents AI as a fact machine (which it cannot reliably be), it tries to pave over the creativity of responses with sanitized tuning (which gives a false sense of security about "safe" output - as examples like this show, it cannot block everything weird in every scenario), and it gets people thinking that AI = chat assistant. The basics of an LLM, without all the bells and whistles, are more like: you type "I went to" and the AI continues it as "the store to buy some bread, where I saw". How an LLM is likely to continue a given text depends somewhat on how it's tuned, what's in the training data, etc., but that's ultimately all it's doing: predicting the token that should come next, with sampling methods adding an element of randomness (and sometimes other fancy math) so that it doesn't write deterministically.

    It doesn't know that there's an independent human user on one side and a machine on the other. It's tuned to predict tokens as though the format were a chat between two names, and some stuff is done behind the scenes to stop its output before it starts writing the user's side too; remove those mechanisms from a model like this and it will happily write out a whole simulated back-and-forth on its own.
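
    A minimal sketch of that predict-and-sample loop, using GPT-2 via the Hugging Face transformers library as a small, openly available stand-in (this is obviously not Gemini's actual code, and the prompt and temperature are just picked for illustration):

    ```python
    # Bare-bones next-token prediction with sampling, the core of what an LLM does.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    ids = tok("I went to", return_tensors="pt").input_ids

    for _ in range(12):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]                # a score for every token in the vocabulary
        probs = torch.softmax(logits / 0.8, dim=-1)          # temperature (0.8 here) reshapes the distribution
        next_id = torch.multinomial(probs, num_samples=1)    # sample instead of always taking the top token
        ids = torch.cat([ids, next_id.unsqueeze(0)], dim=1)  # append and repeat

    print(tok.decode(ids[0]))  # the prompt plus a sampled continuation; different every run
    ```

    The chat products wrap this same loop in a template along the lines of "User: ... / Assistant: ..." and cut generation off at a special stop token, which is the behind-the-scenes part that keeps the model from writing the user's next message as well.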

    But because the chat format presents it as AI and user, no matter how many times the corps shove in phrases like "As an AI language model", it's going to feel like you're talking with an entity. I don't think that's so much a problem for chat formats where you go in knowing it's fantasy, like roleplay setups. But this corp stuff badly wants to encroach on the space inhabited by internet search and customer service, and it just can't do that reliably. It's a square peg in a round hole, or a round peg in a square hole, however that goes.

    • loathsome dongeater@lemmygrad.ml
      hexagon
      ·
      3 hours ago

      It eventually all ties into the contradiction between what the technology is vs. what big tech and venture capital want you to think it is, as you alluded to. I think LLMs in an ideal scenario could be at worst a fun toy and at best a good stepping stone, but big tech has decided to get incredibly weird with it. So now you get bombastic claims about what LLMs will be able to do five years from now alongside disclaimers that it currently makes shit up, so please double-check the responses.

      The reason I posted this is that it's good to try and hold demoncorps like Google accountable, even though it won't likely make a dent. At worst it's just good fun, except for the Gemini user in question.

      • amemorablename@lemmygrad.ml
        ·
        2 hours ago

        The reason I posted this is that it’s good to try and hold demoncorps like Google accountable even though it won’t likely make a dent.

        Agreed. I have no love for google or how they and others like them are going about this. Personally, it's a subject I hang around a lot, so I tend to use what opportunities I have to drop some basics about it, in case there are people around who think it's more... magical than it is, for lack of a better word.

        So now you get bombastic claims about what LLMs will be able to do five years from now alongside disclaimers that it currently makes shit up so please double check the responses.

        Lol yeah, that stuff is... something. AGI (Artificial General Intelligence) seems to be the go-to buzzword to fuel the hype machine, but as far as I can tell, the logistics of actually achieving it are far beyond what an LLM is, at least with the current transformer architecture. One of the things I've picked up along the way is just how important the data that goes into training an LLM is. It's something that makes intuitive sense when you think about it, but it can get lost in the black-box "AI so clever" hype: the model can't know something it has never been presented with. To put it one way, if you trained an LLM on a story with binary good and a story with binary evil, it's not necessarily going to extrapolate from that how to write a mundane story about shades of gray. It might instead combine the two flavors, creating a blend of the extremes. I can't claim with confidence it's exactly this straightforward in practice, but that's the general idea I'm trying to get at.