• TheOubliette@lemmy.ml
    ·
    3 hours ago

    "AI" is a parlor trick. Very impressive at first, then you realize there isn't much to it that is actually meaningful. It regurgitates language patterns, patterns in images, etc. It can make a great Markov chain. But if you want to create an "AI" that just mines research papers, it will be unable to do useful things like synthesize information or describe the state of a research field. It is incapable of critical or analytical approaches. It will only be able to answer simple questions with dubious accuracy and to summarize texts (also with dubious accuracy).

    Let's say you want to understand research on sugar and obesity using only a corpus of peer-reviewed articles. You want to ask something like, "What is the relationship between sugar and obesity?" What will LLMs do when you ask this question? Well, they will just attempt to draw associations and construct reasonable-sounding sentences based on their set of research articles. They might even just take an actual sentence from an article and reframe it a little, just like a high schooler trying to get away with plagiarism. But they won't be able to actually explain the underlying mechanisms, and they will fall flat on their face when trying to discern nonsense funded by food lobbies from critical research. LLMs do not think or criticize. If they do produce an answer that suggests controversy, it will be because they either recognized diversity in the papers or, more likely, their corpus contains review articles that criticize articles funded by the food industry. But they will be unable to actually criticize the poor work or provide a summary of the relationship between sugar and obesity based on any actual understanding that questions, for example, whether this is even a valid question to ask in the first place (bodies are not simple!). They can only copy and mimic.

  • lattrommi@lemmy.ml
    ·
    4 hours ago

    I think I read this post wrong.

    I was thinking the sentence "We could be saving the world!" meant 'we' as in humans only.

    No need to be training AI. No need to do anything with AI at all. Humans simply start saving the world. Our Research Papers can train on Reddit. We cannot be training, we are saving the world. Let the Research Papers run a train on Reddit AI. Humanity Saves World.

    No cynical replies please.

    • UlyssesT [he/him]
      ·
      7 hours ago

      It's marketing hype, even in the name. It isn't "AI" as decades of the actual AI field would define it, but credulous nerds really want their cyberpunkerino fantasies to come true so they buy into the hype label.

      • queermunist she/her@lemmy.ml
        ·
        7 hours ago

        Yeah, these are pattern reproduction engines. They can predict the most likely next thing in a sequence, whether that's words or pixels or numbers or whatever. There's nothing intelligent about it and this bubble is destined to pop.

        • UlyssesT [he/him]
          ·
          6 hours ago

          That "Frightful Hobgoblin" computer toucher would insist otherwise, claiming that a sufficient number of Game Boys bolted together equals or even exceeds human sapience, but I think that user is currently too busy being a bigoted sex pest.

  • callouscomic@lemm.ee
    ·
    5 hours ago

    Most research papers are likely as valid as an average reddit point.

    Getting published is a circlejerk: papers are rarely properly tested, and hardly anyone actually reads them.

  • NuXCOM_90Percent@lemmy.zip
    ·
    edited
    6 hours ago

    Part of it is the same "human speech" aspects that have plagued NLP work over the past few years. Nobody (except the poor postdoctoral bastard who is running the paper farm for their boss) actually speaks in the same way that scholarly articles are written because... that should be obvious.

    This combines with the decades of work by right wing fascists to vilify intellectuals and academia. If you have ever seen (or written) a comment that boils down to "This youtuber sounds smug" or "They are presenting their opinion as fact" then you see why people prefer "natural human speech" over actual authoritatively researched and tested statements.

    And... while not all pay-to-publish journals are trash, I feel confident saying that most are. And filtering those out can be shockingly hard, by design.

    But the big one? Most of the owners of the various journals are REALLY fucking litigious and will go scorched earth on anyone who is using their work (because Elsevier et al own your work) to train a model.

  • slacktoid@lemmy.ml
    ·
    edited
    2 hours ago

    AuroraGPT. They are trying to do it.

    It's because the number of people who can read, understand, and then create the necessary dataset to train and test the LLM is very, very, very small for research papers, whereas the data for pop culture is easier to source.

  • milicent_bystandr@lemm.ee
    ·
    6 hours ago

    I saw an article about one trained on research papers. (Built by Meta, maybe?) It also spewed out garbage: it would make up answers that mimicked the style of the papers but had its own fabricated content! Something about the largest nuclear reactor made of cheese in the world...

      • thepreciousboar@lemm.ee
        ·
        4 hours ago

        Because "AI" as we colloquially know it today means language models: they train on and can produce language, and that's what they are designed for. Yes, they can produce images and also videos, but they don't have any form of real knowledge or understanding; they only predict the next word or the next pixel based on their prompt and their vast examples of words and images. You can only talk to them because that's what they are for.

        Feeding it research papers will make it spit out research-sounding words, which will probably contain some correct information, but at best an LLM trained on that would be useful for searching through existing research; it would not be able to produce new research.

  • Destide@feddit.uk
    ·
    7 hours ago

    Redditors are always right, peer reviewed papers always wrong. Pretty obvious really. :D