MBFC BTFO: https://mediabiasdetector.seas.upenn.edu/
The smart way to keep people passive and obedient is to strictly limit the spectrum of acceptable opinion, but allow very lively debate within that spectrum....
Assuming this isn't just evidence that the methodology sucks or the sample is crap because they picked a single right-wing crank site to serve as the functional outgroup, it seems to be pointing to a distinct lack of liveliness. The debates are all over lurid speculation about the diets and religious practices of immigrant communities.
The discourse gap between the two groups has narrowed so much that it wouldn't surprise me if that's how it looks on a chart. They're not arguing about whether they should or shouldn't do things anymore, merely how.
I don't think that's the problem. The problem is that an AI can't tell truth from falsehood, or notice when things are being omitted, overly emphasized, etc. The only thing it can actually evaluate is tone, and the factual, objective affect that all news reporting tends to use is gonna read as unbiased. It'd only register as biased if they started throwing out insults, used lots of positive or negative adjectives, or showed other kinds of semantically evident bias. You'd basically need an AGI to actually evaluate how biased an article is. Not to mention that attempting to quantify that bias assumes that there even is a ground truth to compare against, which might be true for the natural sciences but is almost always false for social reality.
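To make the tone point concrete, here's a toy sketch of a purely lexical "bias" scorer. This is not how the linked detector works (I don't know its internals); the word list and the `tone_score` function are made up for illustration. It just shows that a scorer which only sees loaded vocabulary gives a flat, factual-sounding sentence a clean score no matter what the sentence claims or leaves out.

```python
import re

# Hypothetical lexicon of loaded words, invented for this example.
# A real tool would use something far larger and more nuanced.
LOADED_WORDS = {"disgraceful", "heroic", "vile", "brilliant",
                "thugs", "radical", "corrupt"}


def tone_score(text: str) -> float:
    """Return the fraction of tokens found in the loaded-word lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    loaded = sum(1 for token in tokens if token in LOADED_WORDS)
    return loaded / len(tokens)


# Neutral affect: scores zero whether or not the claim is true or complete.
flat = "Officials said the operation targeted militants in the area."
# Same event, loaded vocabulary: this is the only kind of bias the scorer sees.
heated = "Corrupt officials praised the vile operation against radical thugs."

print(tone_score(flat))
print(tone_score(heated))
```

The scorer has no access to what actually happened, so omission, selective emphasis, and fabrication all pass through untouched as long as the wording stays calm.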
You'd basically need an AGI to actually evaluate how biased an article is.
Too many bazingas, including a few on Hexbear, believe that a sufficiently large dataset (and sufficient burned forests and dried up lakes) will make the treat printers AGI.
Oh yeah, if you want something reflecting objective reality, sure absolutely. You need context out the wazoo. But, if you're just measuring a spread of bias from Democrat to Republican among the hegemonic media sources that are already only reporting within that context you can probably be pretty accurate for which way they're leaning. Especially since within that spread "reporting" is largely gonna be providing support for talking points from one party or the other.
This is the exact kind of thing we can expect to get touted as an ideal use of AI. They don't care that AI can't possibly know when information is being omitted, when contextually less relevant facts are being emphasized, or even whether a claim is fabricated (as long as a "credible" source is given, i.e. an NGO or think tank). This is less than useless.
all major LLMs in the west are programmed to have bias. i think if you actually let it train on all data and didn't censor it it would call out propaganda
I don't think it would help much. Let's look at CNN's coverage of the pager attack to see what I mean.
So, starting off with the headline, the article is already biased in a way that an LLM couldn't detect: the headline claims the attack was targeting Hezbollah, which is already contradicted by the facts shown immediately below and by contextual information from the real world. As humans, we can think about what it means for pagers to be rigged with explosives and detonated simultaneously at scale. We can understand that there's no way to do that while "targeting" any specific group, because you can't control where rigged consumer goods go. But an LLM would never know that, because an LLM doesn't reason about the world and doesn't actually know what the words it reads and writes represent. The best an LLM could do is notice a mismatch between the headline saying the attack targeted Hezbollah and the body showing that most injured people were civilians. But how could an LLM ever know that, even trained on better data? The article has this statement,
NNA reported that “hacked” pager devices exploded in the towns of Ali Al-Nahri and Riyaq in the central Beqaa valley, resulting in a significant number of injuries. The locations are Hezbollah strongholds.
which an AI would just have to know to mistrust in order to identify the problem at all. As far as the AI knows, when the article says those towns are Hezbollah strongholds, that could mean those towns are... Hezbollah strongholds, in the literal military sense, rather than just places where they have a strong presence. How would an LLM know any better?
A similar argument can be made about the information the article gives about the context more broadly. It mentions the tit-for-tat battles since Oct 7 but makes no mention of Israel's history of aggression against Lebanon. Could an LLM identify that omission as a source of bias? It'd need to be very sure of its own understanding of history to do so, and if you've ever tried to have an LLM fact check something for you, you already know they're not designed to hold a consistent grip on reality, so we can reasonably say that LLMs are not the right tools to identify omission as a source of bias. Much less fabrication, because to identify that something is a lie you'd need a way to find a contradicting source and verify that source, which an LLM can never do.
Ah, the two fundamental biases as described by Marx, Democrat and Republican.
We trained the machine on our data and it says everything's normal with it!
Health: the most fascist category of all
Judging by the "mental health" concern trolling done in typical public chatrooms, absolutely true.
It's really convenient when you train on that same media class as examples of unbiased reporting