https://twitter.com/Seamus_Malek/status/1727448891057144090
if AI ever gets good enough to do this accurately and in real time, we'd be looking at an actual babelfish
i would finally, at long last, have to hand it to them
Sentence structure means that it kind of can't happen in real time as such, because you would need to wait until potentially the end of the sentence to get words that appear early in the sentence in an accurate and natural-ish translation. If "20 seconds later" counts as real time (barring run-on sentences, which are much more common in speech than in writing), then I guess so.
you would need to wait until potentially the end of the sentence to get words that appear early in the sentence in an accurate and natural-ish translation
I think most people are okay with a reasonable delay if the live interpretation is accurate.
In Star Trek they just edit out all the pauses while everyone waits for their Universal Translator to finish telling them what was just said.
yeah good point, I speak enough German that I should already know how much of an issue grammar would be lol
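To make the word-order point concrete, here is a minimal Python sketch of why a translator has to buffer. It assumes the speech arrives as (timestamp, word) pairs and that sentence ends are marked by punctuation; the function name `buffer_sentences` and the timestamps are made up for illustration. German "Ich habe den Hund gesehen" ("I have seen the dog") puts the participle last, so the translation of the second word cannot be finished until the fifth one arrives.

```python
SENTENCE_END = (".", "?", "!")

def buffer_sentences(timed_words):
    """Hold words back until a sentence-final token arrives.

    Yields (first_word_time, last_word_time, words) per sentence.
    """
    buffer = []
    for spoken_at, word in timed_words:
        buffer.append((spoken_at, word))
        if word.endswith(SENTENCE_END):
            yield buffer[0][0], spoken_at, [w for _, w in buffer]
            buffer = []
    if buffer:  # flush trailing words that never got final punctuation
        yield buffer[0][0], buffer[-1][0], [w for _, w in buffer]

# Hypothetical timings: one word every 0.4 seconds.
speech = [(0.0, "Ich"), (0.4, "habe"), (0.8, "den"),
          (1.2, "Hund"), (1.6, "gesehen.")]

for start, end, words in buffer_sentences(speech):
    print(f"waited {end - start:.1f}s before translating: {' '.join(words)}")
```

The minimum delay for the first word is the length of the whole sentence, which is why run-on sentences are the worst case.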
Yandex Browser already does this, but to Russian only. It has like a 10-15 second delay for live streams (at least on Youtube) but it works as well as the auto-generated transcription.
Here’s the funny part: their American accent totally made it believable.
It’s very clear that even with the AI generated voice, they are not native Mandarin speakers. They sound like your typical foreigners who learned Chinese for a number of years lol. I don’t know if it’s the dataset they’re trained on or just how the algorithm works, but it’s very interesting.
Makes me think about what it would be like if Chinese ever becomes an international language, in the way English has and Latin did before it. It makes me giggle to think about Mandarin with a backwoods Tennessee drawl.
The best comparison for me is Montreal french. Deadass sounds like your uncle from up north getting a little parlez vous on.
Sure, they pronounce their Rs, but I don't see anything much silly with Montreal french...
Even with the phonemes of any two given language varieties that are considered to be “the same sound”, there are going to be differences in what the average pronunciation is, so I assume that’s a lot of what’s going on here. The other thing is that English and Chinese have a lot of phonemes that barely or don’t at all overlap in possible pronunciations, so the algorithm is picking the closest match.
Felix in the replies: "I’m crying at how beautiful this is. I support AI now. all I have ever wanted is for the show to be credibly portrayed as a Chinese podcast"
Whenever you see things like this, or just how many pages Stalin read in a day, I'm just blown away. I'm such a lazy motherfucker, goddamn
don't be too hard on yourself - a) he probably didn't read all of these and it is some intern's job to make this list b) reading for fun and reading for information are two different skills. You can buy a book on philosophy, skim 50% of it and deep dive into a single chapter.
Various people who knew Andropov well, including Vladimir Medvedev, Aleksandr Chuchyalin, Vladimir Kryuchkov[92] and Roy Medvedev, remembered him for his politeness, calmness, unselfishness, patience, intelligence and exceptionally sharp memory.[93] According to Chuchyalin, while working at the Kremlin, Andropov would read about 600 pages a day and remember everything he read.[94] Andropov read English literature and could communicate in Finnish, English and German.[95]
damn I'm not even sure the leader of my country has ever read a book
On the one hand it's very lib to be reading Piketty; but on the other Ninety-Three is Hugo's best and most revolutionary novel and doesn't get a lot of attention because of it. So I guess it's a wash.
Piketty is a one-trick pony. I went to one of his conferences at my university. All he advocates for is a global progressive tax system. He never explains how it is going to be implemented (he himself acknowledges that all governments would need to form a UN-like IRS, which he doesn't believe is ever going to be possible), or addresses the fact that implementing his global tax system would just nuke most of the economy of countries that are just tax havens for the rich.
The Netflix documentary of the book is enough to explain its contents. It's funny to see him talk about capitalism without ONCE mentioning Marx, Weber, Ricardo, etc. Or why the USA did the New Deal (because they never mention the Bolshevik Revolution at all). The only mention of the USSR is that "wow it collapsed because it's a failure and a police state, oh wow suddenly wealth inequality skyrocketed for no reason"
Oh yeah, China is basically doing State Capitalism not so different from the USA post-WW1 (never going into the details ofc), but they are bad because there is wealth inequality
I can speak enough Chinese to know the AI is trying to give them all Beijing accents and that's funny as hell
Native Mandarin speaker here. They all sound like your typical Westerners who have lived in China for a number of years. It’s more interesting that the AI was able to give them that realistic Western accent than a proper regional Chinese accent.
Accurate with regards to which one?
Even without watching the clip, any native speaker will be able to immediately tell that they’re not native Mandarin speakers, as the sounds do not correspond to any regional accent in China or even to Taiwanese and other Chinese diaspora accents (Malaysian, Singaporean, for example).
As others have said, there is some subtle finesse in the AI voice generation that is very interesting - like Will’s “yi dian er” (一点儿), but they still sound like foreigners who try to imitate Chinese speakers.
In other words, if the Chapo boys move to China and live there for 5-10 years, they’d probably sound like that. I am more impressed by how “realistically” the AI voice imitates Westerners speaking Mandarin as a foreign language than I would be by it just transposing a perfect native accent onto the input English sentences. I don’t know how the algorithm works but it’s very interesting.
Will sounded like he has lived in China for 5+ years, Matt’s accent is slightly worse, so probably 2-3 years. Felix sounded like he just arrived last week lol (just extrapolating from my own experience with foreigners, as we know everyone learns at a different rate).
I meant accurate not regarding their dialects but regarding the translation itself. As in was it accurately translating what they said? Sorry, I should have been more specific.
Getting the "er" in there is quite northeastern, my grandfather speaks with 儿化. Adding a soft "er" sound at the end of syllables, basically.
But they also don't have 100% correct tonal pronunciation, something like 80-90% correct.
This is actually extremely impressive.
The "er" is not pronounced enough to be authentic; it is exactly the mistake that Dongbei people would pick up on in people who are from the south, for example
What's their Beijing accent like? More casual or formal, so to speak?
New Beijing accent is more formal. Old Beijing accent is more piraty.
Old beijing accent is more piraty.
I suppose the old one compares to the new one like how Irish and Scottish English sound to Londoners....
Speaking of which, what's funny with THEIR Beijing accent, as far as I'm concerned?
it's just funny to me that the AI voice has any regional bias at all, but I suppose it would have to. They sound like Americans trying to talk with very textbook Beijing/northeast area Chinese.
it's just funny to me that the AI voice has any regional bias at all,
Nothing too interesting to me... I mean come on, it'll be like if a person from Guangdong or Beijing spoke in standard Californian or New York English...
Will: "会有一点儿奇怪吗?"
Me: oh my god they added the 儿 to "一点", Skynet is real
Will actually sounds like a mango that picked up some beijing accents because he worked at a dongbei style restaurant for 1 year.
Matt sounds like someone from Guangxi
You have not experienced Chapo Trap House until you have heard it in the original Mandarin.
Excited to hear Hell of Presidents in the original Mandarin, as it was intended.
the series has been so good and it makes me miss Matt Christman even more :(
He is risen again, he just needs time for recuperation and raising a baby is all
wild. any other mandarin speakers get a bizarre almost synaesthetic sensation from this?
Imagining a bit about Bo Xilai being a Myanma heroin harvester
Me Play Joke but it's about a drunk Chinese guy trying to get with an American woman who turns out to be a CIA operative.
Where is this dubbing tool? I need to hear them talk about the Noid in Gaelic.
There is an ElevenLabs logo in the bottom right of the vid, so could be that?
Intently listening to Zongchang War Room rant about dipshit mid-level Oppo execs and their obsession with golf while I study for the Gaokao
how good are the translations? does anyone know Chinese and can compare the original to the translated version? I'd be interested to hear how accurate the translations are. passingly accurate transcription services are a huge boon.
If we get to the point where this kind of stuff can process accurately, in real time, on a mobile device, we will have destroyed the need for real time speech translators (sorry, translators), and, more importantly, most of the language barriers between the international working class, no? Cautiously optimistic.
I already wrote this below, but Yandex Browser already does this! It only translates to Russian, and with live streams (on Youtube for example) you get a ~15 second delay.
It’s basically a real-time transcription -> translation -> voice generation pipeline, so the accuracy is only as good as the transcript it manages to extract.
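A rough Python sketch of that three-stage shape, to show why a bad transcript caps the quality of everything after it. Nothing here is Yandex's or ElevenLabs' actual code: `dub`, `DubResult`, and the toy stand-in stages are all hypothetical, and a real system would plug speech recognition, machine translation, and voice synthesis models into the three slots.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DubResult:
    transcript: str
    translation: str
    audio: bytes

def dub(audio_in: bytes,
        transcribe: Callable[[bytes], str],
        translate: Callable[[str], str],
        synthesize: Callable[[str], bytes]) -> DubResult:
    """Run the stages in order; each one only sees the previous stage's output."""
    transcript = transcribe(audio_in)
    translation = translate(transcript)
    return DubResult(transcript, translation, synthesize(translation))

# Toy stand-ins (assumptions for illustration, not real models).
TOY_DICT = {"hello": "你好", "world": "世界"}

def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through untranslated.
    return " ".join(TOY_DICT.get(word, word) for word in text.split())

def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")

def good_ear(audio: bytes) -> str:
    return audio.decode("utf-8")

def bad_ear(audio: bytes) -> str:
    # Simulates a mishearing: "world" comes out as "word".
    return audio.decode("utf-8").replace("world", "word")

clean = dub(b"hello world", good_ear, toy_translate, toy_synthesize)
noisy = dub(b"hello world", bad_ear, toy_translate, toy_synthesize)
print(clean.translation)
print(noisy.translation)
```

The translation stage never hears the audio, so it has no way to notice or repair the mishearing; the error just rides along into the generated voice.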
I am not that scared for my job in the next 10 years. As long as people don't trust Bazinga Translate from the Torment Nexus Company to translate for the doctor who goes through the operation details with them, I am safe. Remember many countries want a stamp from a sworn translator for legal documents. Not to mention how shit even AI translation still is for Arabic (if it can recognize the letters to begin with, lmao). I can pre-translate a text and then go through it again, sometimes it even saves me time.
Yes I think this technology will be great for casually listening to shit, but won't get in the way of serious translation that matters.
Not looking forward to monolinguals getting even more insufferable with better translation software.
Eh I dunno speech to text algorithms are still kinda garbage. They work well on simple sentences but once you throw in colloquialisms and abbreviations it just craps out.