• Obscerno@lemm.ee
    ·
    9 months ago

    Man, Unicode is one of those things that is both brilliant and absolutely absurd. There is so much complexity to language, and making one system to rule them all ends up involving so many compromises. Unicode has metadata for each character, plus algorithms for normalization, capitalization, and sorting. With human language being as varied as it is, these algorithms can have really wacky results. Another good article on it is https://eev.ee/blog/2015/09/12/dark-corners-of-unicode/
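
    To give a taste of the wackiness, here's a quick sketch using only Python's standard library. The sample strings and variable names are my own picks for illustration, and the last part shows naive code-point sorting rather than the full Unicode Collation Algorithm, which the standard library does not implement:

```python
import unicodedata

# Per-character metadata: every code point has a name and a general category.
print(unicodedata.name("ß"))      # LATIN SMALL LETTER SHARP S
print(unicodedata.category("ß"))  # Ll (lowercase letter)

# Normalization: "é" can be one code point, or "e" plus a combining accent.
composed = "\u00e9"     # single precomposed code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(len(composed), len(decomposed))                        # 1 2

# Capitalization: uppercasing can change the length of a string.
upper_eszett = "ß".upper()
print(upper_eszett)          # SS
print(upper_eszett.lower())  # ss, so the round trip does not restore "ß"

# Sorting: plain code-point order puts "Z" before "a" and "é" after "z".
words = sorted(["zebra", "apple", "Zoo", "éclair"])
print(words)  # ['Zoo', 'apple', 'zebra', 'éclair']
```

    If you want sorting that matches what people actually expect, you'd need locale-aware collation (for example via a library like PyICU), not the default `sorted`.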

    And if you want to RENDER text, oh boy. Look at this: https://faultlore.com/blah/text-hates-you/