• blobjim [he/him]
    3 years ago

    Well, even JPEG files are intelligible to humans. You just need an image viewer, ideally one that can also show the metadata. Binary file formats have specifications that lay out their exact structure, often in a pretty understandable way. Text basically just gets replaced with integer enums and counts and so on. Having the right tools is what matters. Most web content would be nearly impossible to make sense of if web browsers didn't come with developer tools that show everything nicely formatted, with an element picker that shows each element's bounds. Java is a good example of a binary system with tooling that makes it intelligible. It's a big, complicated VM with thousands of live objects, but if you use a debugger like the one in IntelliJ or Eclipse, or a monitoring tool like VisualVM, you can inspect basically the entire execution of the program, even though the class file format is binary (with a lot of names stored as strings). You can pause execution and view the stack trace, dump all the objects in the heap, look at their fields, follow the references to the objects they point to, etc.
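To make the "integer enums and counts" point concrete, here is a minimal Python sketch that decodes the first 10 bytes of a Java class file: a magic number, two version integers, and a count. The function name `read_class_header` and the returned dictionary keys are just made up for illustration, and the `major - 44` shortcut for the Java release is only meant for Java 5 and later.

```python
import struct


def read_class_header(data: bytes) -> dict:
    """Decode the fixed 10-byte start of a Java class file.

    Layout (all big-endian): u4 magic, u2 minor_version,
    u2 major_version, u2 constant_pool_count.
    """
    if len(data) < 10:
        raise ValueError("need at least 10 bytes")
    magic, minor, major, cp_count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file (bad magic number)")
    return {
        "minor_version": minor,
        "major_version": major,
        # Assumption: this mapping is only meaningful for major >= 49 (Java 5+).
        "java_release": major - 44,
        "constant_pool_count": cp_count,
    }
```

Something like `read_class_header(open("Foo.class", "rb").read())` (with `Foo.class` standing in for any compiled class) would give you the version the file was compiled for. A real tool like `javap` keeps going from there and decodes the constant pool, fields, and methods the same way.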

    The systems themselves are always going to be more complex than is immediately comprehensible. It's commonly pointed out that implementing a web browser from scratch today is nearly impossible without an immense amount of funding, people, and determination. All these text formats are extremely overcomplicated, because it's really easy to just add another string value to something as a "feature" and call it good, even though that feature takes 1000 lines of code to implement.

    Also, the electronics in every computer are already using binary communication protocols, and those protocols are just about as important to understand. But a lot of things use standard protocols like PCIe or I2C, which anyone can read the specs for (sometimes for a fee).

    Here's the spec for the WAVE file format (which is admittedly much simpler than other formats, since it usually just holds uncompressed PCM audio): http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html

    You basically just read it top-to-bottom and implement it as you go.
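As a rough idea of what "implement it as you go" looks like, here is a small Python sketch that walks the chunks of a WAVE file and pulls out the basic `fmt ` fields. It assumes a plain RIFF/WAVE file with a standard 16-byte PCM-style `fmt ` chunk; the function name `parse_wave` and the dictionary keys are my own, not from the spec, and it ignores extensions like `WAVE_FORMAT_EXTENSIBLE`.

```python
import struct


def parse_wave(data: bytes) -> dict:
    """Walk the RIFF chunks of a WAVE file and collect basic format info."""
    # RIFF header: 4-byte id, little-endian u32 size, 4-byte form type.
    riff, _riff_size, form = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    info = {}
    pos = 12
    while pos + 8 <= len(data):
        # Every chunk starts with a 4-byte id and a little-endian u32 size.
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            tag, channels, rate, byte_rate, block_align, bits = struct.unpack_from(
                "<HHIIHH", data, body
            )
            info.update(
                format_tag=tag,          # 1 means uncompressed PCM
                channels=channels,
                sample_rate=rate,
                byte_rate=byte_rate,
                block_align=block_align,
                bits_per_sample=bits,
            )
        elif chunk_id == b"data":
            info["data_offset"] = body   # where the samples start
            info["data_bytes"] = size
        # Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1)
    return info
```

Unknown chunks (like `LIST`) just get skipped using their size field, which is a big part of why the format is so easy to read incrementally.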

    edit: I just remembered another amazing tool. Wireshark! Maybe you've used it already. Probably the most awesome tool I've ever used. You can inspect internet/Ethernet packets, and it parses them using its understanding of tons of different protocols. You can inspect exact information about all sorts of things, and it shows you the exact bytes those fields correspond to.
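For a feel of what that field-to-bytes mapping involves, here is a toy Python sketch that decodes an Ethernet II header plus a few IPv4 fields and records which byte range each value came from. This is nothing like Wireshark's actual dissector code; the function name `dissect_frame` is made up, the field labels are only loosely modeled on Wireshark's display filter names, and it assumes an untagged Ethernet II frame (no VLAN tag).

```python
import struct


def dissect_frame(frame: bytes) -> list:
    """Return (label, start, end, value) tuples for an Ethernet II / IPv4 frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")

    fields = []

    def mac(raw: bytes) -> str:
        return ":".join(f"{b:02x}" for b in raw)

    def ipv4(raw: bytes) -> str:
        return ".".join(str(b) for b in raw)

    fields.append(("eth.dst", 0, 6, mac(frame[0:6])))
    fields.append(("eth.src", 6, 12, mac(frame[6:12])))
    (ethertype,) = struct.unpack_from("!H", frame, 12)  # network byte order
    fields.append(("eth.type", 12, 14, f"0x{ethertype:04x}"))

    # 0x0800 is the EtherType for IPv4; the minimal IPv4 header is 20 bytes.
    if ethertype == 0x0800 and len(frame) >= 14 + 20:
        ip = 14
        fields.append(("ip.version", ip, ip + 1, frame[ip] >> 4))
        # Low nibble is the header length in 32-bit words.
        fields.append(("ip.hdr_len", ip, ip + 1, (frame[ip] & 0x0F) * 4))
        fields.append(("ip.ttl", ip + 8, ip + 9, frame[ip + 8]))
        fields.append(("ip.proto", ip + 9, ip + 10, frame[ip + 9]))
        fields.append(("ip.src", ip + 12, ip + 16, ipv4(frame[ip + 12:ip + 16])))
        fields.append(("ip.dst", ip + 16, ip + 20, ipv4(frame[ip + 16:ip + 20])))
    return fields
```

Each tuple carries the start and end offsets, so a UI could highlight exactly those bytes in a hex dump when you click a field, which is roughly the experience Wireshark gives you across hundreds of protocols.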