Obviously Google was basically unusable already over the past five years, but this is... this is something else. Look up any subject and the first three pages are made up entirely of procedurally generated fake blogs and websites. And there's nothing there, just scraped text from articles mashed together with varying amounts of hallucination on top. They're not even selling a product, just harvesting ad revenue.

What's maybe even worse is the steady degradation of image search too. Joe Everyman generates some slop, and the image's metadata tags include a historical artist's name. Someone on Pinterest pins it while trawling the web. Now there are fifty gooey generated sludge images when you look up the historical painter.

And it's not just artists. Historical figures and animals too. Look up 'baby peacock' if you want a clear example.

It's a pretty funny bit, Google. I hope it keeps going.

  • Awoo [she/her]
    ·
    1 year ago

    Part of the problem is their shift away from using backlinking signals.

    Google was massively better when it used backlinks as a signal, even if that sometimes meant blackhat content showed up in results, since blackhats could farm backlinks to trick the algo. The overall quality of relevant content it surfaced was enormously better.

    • rubpoll [she/her]
      ·
      1 year ago

      Could you explain what backlinking signals are and why they matter?

      • Awoo [she/her]
        ·
        edit-2
        1 year ago

        In the past the entire web was categorised via backlinks. A backlink is something like me linking to this wiki article about Losurdo, who wrote about Joseph Stalin. The text of that link, "wiki article about Losurdo who wrote about Joseph Stalin", carries a lot of information that can be used as a signal for search results.

        In the past, the internet was mapped by bots. Search engine crawlers went out into the web and visited sites at random. They read the pages, then read all the links on those pages and followed them, building a vast database of links to pages along with the text of each of those links.

        This information, from thousands of entries all pointing at the same page, would then be used to categorise what the topic of that page likely was, alongside the crawled content of the page itself.
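
        Roughly, the idea is an index of anchor texts keyed by the page being linked to. Here's a toy sketch of that (not how any real crawler works; the HTML handling and scale are wildly simplified, and the example URLs and pages are made up):

        ```python
        # Toy sketch of a backlink index: map each target URL to the anchor
        # texts of the links pointing at it.
        from collections import defaultdict
        from html.parser import HTMLParser

        class AnchorCollector(HTMLParser):
            """Collect (href, anchor text) pairs from one page's HTML."""
            def __init__(self):
                super().__init__()
                self._href = None
                self._text = []
                self.links = []  # list of (href, anchor_text)

            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    self._href = dict(attrs).get("href")
                    self._text = []

            def handle_data(self, data):
                if self._href is not None:
                    self._text.append(data)

            def handle_endtag(self, tag):
                if tag == "a" and self._href:
                    self.links.append((self._href, " ".join(self._text).strip()))
                    self._href = None

        def build_backlink_index(pages):
            """pages: {url: html}. Returns {target_url: [anchor texts]}."""
            index = defaultdict(list)
            for url, html in pages.items():
                parser = AnchorCollector()
                parser.feed(html)
                for href, text in parser.links:
                    index[href].append(text)
            return index

        # The aggregated anchor texts hint at a page's topic without
        # trusting the page's own words.
        pages = {
            "blog.example/post": '<p>Read this <a href="wiki.example/losurdo">wiki article about Losurdo</a></p>',
            "forum.example/thread": '<a href="wiki.example/losurdo">book about Stalin by Losurdo</a>',
        }
        print(dict(build_backlink_index(pages)))
        # {'wiki.example/losurdo': ['wiki article about Losurdo', 'book about Stalin by Losurdo']}
        ```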

        All of that information was then used to assign trust values to pages, and those trust values determined the candidate search results. In the past this was primarily weighted by backlinks, whereas today it's primarily driven by the content itself.
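
        The trust-value part is basically the PageRank idea: a page inherits trust from the pages that link to it. A toy version of that weighting, just to show the gist (not Google's actual algorithm, and the page names are invented):

        ```python
        # Toy PageRank-style trust scoring over a link graph. graph maps each
        # page to the pages it links to; every link target must also be a key.
        def rank(graph, damping=0.85, iterations=50):
            n = len(graph)
            score = {page: 1.0 / n for page in graph}
            for _ in range(iterations):
                new = {page: (1 - damping) / n for page in graph}
                for page, outlinks in graph.items():
                    if not outlinks:
                        continue  # dangling page: its trust isn't passed on
                    share = damping * score[page] / len(outlinks)
                    for target in outlinks:
                        new[target] += share  # pages inherit trust from linkers
                score = new
            return score

        # Invented example: several independent pages link to one established page.
        links = {
            "established-wiki-page": [],
            "news-site": ["established-wiki-page", "seo-spam-blog"],
            "hobby-blog": ["established-wiki-page"],
            "forum-thread": ["established-wiki-page", "news-site"],
            "seo-spam-blog": [],
        }
        scores = rank(links)
        print(max(scores, key=scores.get))  # "established-wiki-page"
        ```

        The heavily linked page ends up with the highest trust score, which is why old, well-established pages used to sit at the top of results for years.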

        The backlink method meant you could sometimes trick the algo by faking thousands of backlinks. The content method means you can trick the algo with the bullshit way pages are written today.

        In my opinion the older methodology gave significantly higher quality results. But it meant some search pages were effectively static, because certain pages were so well established as high-value (linked millions of times) that they would always be in the top 1-10 results. Google doesn't like this; they want search pages to lean towards NEW content, because that's where the ad game is. It doesn't matter to them that a 10-year-old page might have the best and most valuable answer to a person's search, they want to serve ads. They'd rather serve mid content and make cash from it. And that's why search engines suck ass today: they're so heavily weighted towards new and regularly updated content.

          • zifnab25 [he/him, any]
            ·
            1 year ago

            Shit costs money.

            I do wonder if Chinese search engines are outpacing their American peers by being more publicly oriented, or if they're just blindly cribbing American techniques as Best Practices out of habit and getting similarly degraded results.

            Or if this really is just a Cold War of bullshit, and the prior method would ultimately be contaminated by spammers in the same way new shit is.

            • robot_dog_with_gun [they/them]
              ·
              1 year ago

              people were gaming PageRank back in the day; anyone who ever hired an SEO person should probably be shot.

              • zifnab25 [he/him, any]
                ·
                1 year ago

                Hate the game, etc, etc. You can't really blame folks for wanting their content to be at the front of the Google queue.

                I'd argue the real root problem is discrete list rankings as a means of presenting information. That kind of search result implies a certain empirical authority for higher-ranked sources.

                A significant step forward for search would be to present data not as a ranked list with the Top Item treated as definitive, but as a graph with the Center Item being the result that most closely matches your query. Then you could move in 2D space to navigate results along multiple axes of relationship and zoom in/out to reveal broader or more granular characteristics of the results.

                So, perhaps a search for "horse" gives you the dictionary definition. That result is then surrounded by quadrants organized as "horse: biological", "horse: fictional", "horse: historical", and "horse: metaphorical". Moving in a given direction gives you more refined data on that topic (so horse: biological might give you the Wikipedia article on horse breeds and a veterinary website on horse health). You can zoom in for a more granular look at horses broken up by breed, or zoom out to get categories of animal within the Equidae family.
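
                Something like this toy layout, where the categories, results, and coordinates are all just made up for the example:

                ```python
                import math

                # Toy sketch of the "results in 2D space" idea: the best match
                # sits at the centre, each category gets a direction, and more
                # refined results sit further out along that direction.
                def layout(best_match, categories):
                    placed = [(0.0, 0.0, "best match", best_match)]
                    for i, (category, results) in enumerate(categories.items()):
                        angle = 2 * math.pi * i / len(categories)
                        for depth, result in enumerate(results, start=1):
                            x = depth * math.cos(angle)
                            y = depth * math.sin(angle)
                            placed.append((x, y, category, result))
                    return placed

                points = layout(
                    "horse - dictionary definition",
                    {
                        "horse: biological": ["Wikipedia: horse breeds", "veterinary site: horse health"],
                        "horse: fictional": ["Black Beauty", "BoJack Horseman"],
                        "horse: historical": ["horses in warfare"],
                        "horse: metaphorical": ["'dark horse' idiom"],
                    },
                )
                for x, y, category, result in points:
                    print(f"({x:+.1f}, {y:+.1f})  {category}: {result}")
                # "Zooming out" would mean showing only the points nearest the
                # centre; zooming in would reveal the more granular ones further out.
                ```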

                This kind of navigation would inevitably also get gamed. But it would de-emphasize the value of the initial results and turn it into a starting point for a search rather than the definitive result.

                • robot_dog_with_gun [they/them]
                  ·
                  1 year ago

                  completely changing web design would've been cool before smartphones; you can still see the legacy of 800x600 everywhere, but if they did it now it would just be shit.

                  your idea is a little interesting, but people would just click the top left, and that kind of movie UI goes wrong real fast in actual use.

                  • zifnab25 [he/him, any]
                    ·
                    1 year ago

                    Maybe. But if I had an abundance of free time and/or some infinite cash spigot, I'd give it a shot regardless.

                    If nothing else, I think the novelty of a spatial search over a linear search would get people's attention and give the platform more engagement than the Bing approach of being just like Google but pushier.

      • chickentendrils [any, comrade/them]
        ·
        edit-2
        1 year ago

        Reputable sites link to other sites; they form small-world graphs with bridges to other sites. Backlinks were primarily useful in early WWW search because sites had a narrow focus. Eventually news sites became the hubs, Wikipedia (which anyone can edit) got mixed in, things started getting centralized in big forum hosts, and news sites degraded and got bought up and operated by grifters. There are fewer independent sites, blogs, etc., so most links now just go between "news" and "social media", with a few surviving special-interest sites still operating but mostly turned into places that link to news, social media, and Wikipedia. You can get a similar effect between participants in social networks, but there's so much linking to mock or argue that it confuses things.

        Backlinks are still useful; they just naturally got less useful as human activity on the Web evolved. If you run a local spider/index like YaCy (and I assume SearX), you can look at the network graphs showing connections between sites. That's usually a good indicator of the trustworthiness of a page you don't recognize.
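
        The rough idea behind that check is just graph distance: how many link hops separate an unknown page from sites you already trust. A minimal sketch, with placeholder site names and a hand-written graph standing in for a real local index (this isn't how YaCy does it internally):

        ```python
        # Sketch: does this unknown site sit near reputable ones in the link graph?
        from collections import deque

        link_graph = {
            "wikipedia.org": ["archive.org", "special-interest-site.example"],
            "archive.org": ["wikipedia.org"],
            "special-interest-site.example": ["wikipedia.org", "unknown-blog.example"],
            "unknown-blog.example": ["special-interest-site.example"],
            "spam-farm.example": ["spam-farm.example"],
        }

        def hops_from_trusted(graph, trusted, target):
            """Shortest number of link hops from any trusted site to target,
            or None if it isn't reachable at all."""
            seen = set(trusted)
            queue = deque((site, 0) for site in trusted)
            while queue:
                site, dist = queue.popleft()
                if site == target:
                    return dist
                for nxt in graph.get(site, []):
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))
            return None

        trusted = {"wikipedia.org", "archive.org"}
        print(hops_from_trusted(link_graph, trusted, "unknown-blog.example"))  # 2
        print(hops_from_trusted(link_graph, trusted, "spam-farm.example"))     # None
        ```

        A page a couple of hops from sites you trust is probably fine; one that nothing reputable links into at all is the kind of thing the graph view makes obvious at a glance.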