- cross-posted to:
- opensource@lemmy.ml
Very cool tool. I tried out the medium-size model on a Russian video, and the English subtitles that it generated were much more accurate than YouTube's autotranslated captions.
Is there a good tutorial on how to download and use this? Sounds awesome
Have you used Python packages before? It's distributed as one of those (instructions are in the project README under Setup). It looks like it's command-line only right now (see the Command-line usage section of the README), unless you're writing a Python program that includes it.
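For the "Python program that includes it" route, here's a rough sketch, assuming the package exposes `load_model` and `transcribe` the way the project README shows. The helper names `translate_to_english` and `format_segment` are made up for illustration, and the file name is a placeholder.

```python
def format_segment(segment):
    """Turn one result segment into a '[MM:SS] text' line."""
    start = int(segment["start"])  # segment start time in seconds
    minutes, seconds = divmod(start, 60)
    return f"[{minutes:02d}:{seconds:02d}] {segment['text'].strip()}"


def translate_to_english(audio_path, model_name="medium"):
    """Transcribe an audio file and translate it to English (hypothetical helper)."""
    # Imported here so the rest of the file loads even if whisper isn't installed yet.
    import whisper

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(audio_path, task="translate")
    return [format_segment(seg) for seg in result["segments"]]
```

Calling `translate_to_english("my-audio-file.mp3")` should give back a list of timestamped English lines, but it still needs ffmpeg installed, same as the command-line tool.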
If you haven't used the command line before, what OS do you use? Might be able to walk you through it
I've used the command line some, but it's been a while. I'm using Windows 10 rn.
Gotcha gotcha, I haven't used Windows since the mid 00s so I won't be the most helpful, but it looks like you'll need to do the following:
1. If you aren't using a package manager, install Chocolatey (or maybe Scoop? I'm not familiar with that one - maybe some Windows comrades can chime in on which would be better for you)
2. Install Python 3 and pip if you don't have them installed
3. Run the commands in the Setup part of that doc:
pip install git+https://github.com/openai/whisper.git
choco install ffmpeg # assuming you are using Chocolatey and not Scoop
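If you want to double-check that everything from the steps above ended up on your PATH, something like this might help. It's only a sketch: `missing_tools` is a made-up helper, and the exact executable names (for example `python` vs `py` on Windows) are an assumption that may differ on your machine.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the names from `names` that can't be found on PATH."""
    return [name for name in names if which(name) is None]


def report(names=("python", "pip", "ffmpeg", "whisper")):
    """Build a one-line summary of which tools still need installing."""
    missing = missing_tools(names)
    if missing:
        return "Not found on PATH: " + ", ".join(missing)
    return "All tools found."
```

Running `print(report())` in a Python prompt lists anything that still needs installing (or a new terminal window, since PATH changes often don't show up until you reopen it).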
Assuming everything installs properly, you can use the examples from the Command-line usage section as a starting point. I'm running
whisper my-audio-file.mp3 --language Korean --task translate
to translate an audio file from Korean to English.
Thank you, I'll try this later!
No problem, good luck! :stalin-heart: