觉醒年代 (Age of Awakening) is rated 9.3 on Douban, on par with Minning Town, yet it is almost unheard of in the West. At the same time, its production quality is, at least from what I've seen, the best the country can offer. I think the reasons they won't bother with English subtitles are (1) that it's produced by CCTV and therefore non-commercial, and (2) the nature of the series itself.

That being said, I don't see why I shouldn't give translating this show a shot, so I translated the trailer and here it is.

I've uploaded the subtitles to a remote Git repository, so feel free to contribute whenever you can:

https://forge.chapril.org/agentofchange/age-of-awakening.git

  • dengdidnothingwrong [none/use name]
    3 years ago

    tesseract is already in PATH; I'm referring to OpenCV (or opencv-python). The full traceback is:

    $ python3 ocr.py 
    Traceback (most recent call last):
     File "ocr.py", line 1, in <module>
       from videocr import save_subtitles_to_file
     File "/home/user/.local/lib/python3.8/site-packages/videocr/__init__.py", line 2, in <module>
       from .api import get_subtitles, save_subtitles_to_file
     File "/home/user/.local/lib/python3.8/site-packages/videocr/api.py", line 2, in <module>
       from .video import Video
     File "/home/user/.local/lib/python3.8/site-packages/videocr/video.py", line 6, in <module>
       import cv2
     File "/home/user/.local/lib/python3.8/site-packages/cv2/__init__.py", line 5, in <module>
       from .cv2 import *
    ImportError: libstdc++.so.6: cannot open shared object file: No such file or directory
    

    I'm running Guix System, so the problem is probably related to the distribution I use and not the module.
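
One way to check whether the failure comes from the system rather than from videocr is to ask the dynamic loader for the library directly, without importing cv2 at all. This is only a minimal sketch: the helper name `can_load_shared_library` is made up for illustration, and it assumes the only thing going wrong is the loader not finding `libstdc++.so.6`, as the traceback suggests.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the ImportError in the traceback.
    lib = "libstdc++.so.6"
    status = "found" if can_load_shared_library(lib) else "NOT found"
    print(f"{lib}: {status}")
```

If this reports the library as not found outside of any Python package, the module itself is probably fine and the library search path on the system is what needs fixing.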

      • dengdidnothingwrong [none/use name]
        3 years ago

        Unfortunately no. I think the problem is how my distribution packaged OpenCV; there's something wrong with the compilation in CI.

          • dengdidnothingwrong [none/use name]
            3 years ago

            Tesseract is not the problem; OpenCV is. Anyway, I've managed to run videocr on Ubuntu on a 4-core NUC (not my 12-core machine). It's been 2 hours and it still hasn't finished processing the 5-minute trailer.

              • dengdidnothingwrong [none/use name]
                3 years ago

                I have it working too. The 5-minute video takes at most 20 minutes on a 12-core machine. The problem now is finding the right parameters to extract the Chinese subtitles properly. Two subtitle lines are sometimes merged together, some are not properly timed, and some are ignored completely. Fortunately, when it does correctly extract a subtitle, the text is fine, albeit with spaces between the characters.
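
The stray spaces could probably be cleaned up after extraction instead of being fixed in the OCR step. A rough sketch, assuming the spaces only need removing between Chinese characters and should stay untouched around Latin text; the function name `collapse_cjk_spaces` and the chosen Unicode ranges are my own assumptions, not anything videocr provides:

```python
import re

# CJK punctuation, CJK Unified Ideographs, and fullwidth forms (assumed ranges).
_CJK = r"\u3000-\u303f\u4e00-\u9fff\uff00-\uffef"

# Whitespace that has a CJK character on both sides.
_SPACE_BETWEEN_CJK = re.compile(rf"(?<=[{_CJK}])[ \t]+(?=[{_CJK}])")


def collapse_cjk_spaces(text: str) -> str:
    """Remove spaces and tabs that sit between two CJK characters."""
    return _SPACE_BETWEEN_CJK.sub("", text)


if __name__ == "__main__":
    print(collapse_cjk_spaces("觉 醒 年 代"))
    print(collapse_cjk_spaces("Age of Awakening"))
```

Because only spaces and tabs are matched, line breaks inside a subtitle entry are left alone, so this could be run over each text line of the .srt file without disturbing the timing lines.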

                  • dengdidnothingwrong [none/use name]
                    3 years ago

                    I'm trying out ffmpeg filters to isolate #FFFFFF and make everything else black. I've tried colorhold, but it changes everything else to gray, which is not useful.
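
To make the goal concrete, here is the thresholding I'm after written out in plain Python on a tiny fake frame. This is not an ffmpeg filter, just an illustration of the intended result: pixels at or near #FFFFFF stay white and everything else becomes black. The function name `keep_white` and the threshold of 250 are assumptions for the example, since anti-aliased subtitle edges are rarely exactly 255.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255
Frame = List[List[Pixel]]     # rows of pixels

WHITE: Pixel = (255, 255, 255)
BLACK: Pixel = (0, 0, 0)


def keep_white(frame: Frame, threshold: int = 250) -> Frame:
    """Keep pixels whose R, G and B are all >= threshold; blacken the rest."""
    return [
        [WHITE if min(pixel) >= threshold else BLACK for pixel in row]
        for row in frame
    ]


if __name__ == "__main__":
    frame = [
        [(255, 255, 255), (128, 128, 128)],
        [(250, 251, 252), (255, 0, 0)],
    ]
    for row in keep_white(frame):
        print(row)
```

Requiring all three channels to pass the threshold is what keeps bright but coloured pixels (like pure red) from being treated as subtitle text.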