• TraschcanOfIdeology [they/them, comrade/them]
    ·
    3 years ago

    I'm pretty sure this is a somewhat less extreme case of "garbage in, garbage out". When a large share of the training data uses female pronouns in a given context, the algorithm will default to those unless there is another clue that it should use male ones. Besides, Google Translate sucks big time. DeepL is the jam.

    Of course, this means we should have humans who have basic human decency checking the training data sets for bias and oppressive speech.
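A minimal sketch of that "default to whatever the training data mostly used" behaviour, assuming a toy frequency-based chooser. The corpus, the counts, and the `pick_pronoun` helper are all made up for illustration; this is not how Google Translate or DeepL actually work.

```python
from collections import Counter

# Hypothetical toy corpus of (occupation, pronoun) pairs; the counts are invented.
TRAINING_PAIRS = [
    ("nurse", "she"), ("nurse", "she"), ("nurse", "he"),
    ("engineer", "he"), ("engineer", "he"), ("engineer", "she"),
]


def pick_pronoun(occupation, clue=None):
    """Use an explicit clue if one exists, else the pronoun seen most often in training."""
    if clue is not None:
        # Another clue in the sentence overrides the learned default.
        return clue
    counts = Counter(p for occ, p in TRAINING_PAIRS if occ == occupation)
    if not counts:
        # Nothing learned for this word, so fall back to a neutral pronoun.
        return "they"
    return counts.most_common(1)[0][0]
```

With no clue, `pick_pronoun("nurse")` falls back to the skew in the data; passing `clue="he"` overrides it.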

    • sgtlion [any]
      ·
      3 years ago

      True, but this is kind of the point. Models are trained on existing data, and so only serve to amplify and conserve the biases hidden within that data.
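To make "amplify" concrete, here is a rough sketch assuming a hypothetical model that always emits the majority pronoun from its training data. The 70/30 split and the `output_share` helper are invented for illustration only.

```python
from collections import Counter


def output_share(training_pronouns, target="she"):
    """Share of outputs equal to `target` if the model always emits the majority pronoun."""
    majority = Counter(training_pronouns).most_common(1)[0][0]
    # One output per training example, every one of them the majority pronoun.
    outputs = [majority for _ in training_pronouns]
    return outputs.count(target) / len(outputs)


# Invented numbers: 70% "she", 30% "he" in the training data.
training = ["she"] * 7 + ["he"] * 3
input_share = training.count("she") / len(training)
```

Under that assumption a 70/30 skew in the input becomes 100/0 in the output, so the bias is not just preserved but made stronger.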

      • MendingBenjamin [they/them]
        ·
        3 years ago

        Fwiw there are also AI projects run by marginalized people dedicated to identifying those biases and flagging them, which will help with future training processes. It’s not an inherently conservative technology. It just lacks those cultural filters, as with everything, because of the groups that tend to have access to it first.

          • MendingBenjamin [they/them]
            ·
            3 years ago

            And in the case of the project I mentioned above, the tech is used to find those patterns in order to avoid them. Entrenched oppressive structures aren’t only about conserving patterns. They’re about preserving specific patterns and dampening others. For example, if this tech had been available during the AIDS epidemic, there would have been plenty of data revealing just how prevalent gay culture was in certain areas, whereas the dominant ideology insisted that gay people were rare and unnatural. That data and the resulting analysis in the hands of police officers would have had very different outcomes from the same data/analysis in the hands of gay activists.

      • TraschcanOfIdeology [they/them, comrade/them]
        ·
        3 years ago

        Oh yes, definitely, that's why I added the second paragraph. I just wanted to mention the most likely reason why this happened.

        The fact that many STEMheads and techbros don't bother with social justice makes it all the harder to check for biases though, because they rarely concern themselves with the ethical ramifications of the tech they develop, or just think that there's no way it could be used to reinforce oppressive systems.

  • gvngndz [none/use name,comrade/them]
    ·
    3 years ago

    It's really interesting how much Turkish and Hungarian have in common. We Turks also have only one third-person singular pronoun for all genders, and it's "O". So practically the same thing.

    So for example:

    He = O

    She = O

    They = Onlar (literally the plural of "O")

    • huf [he/him]
      ·
      3 years ago

      apparently the two words are unrelated

      • gvngndz [none/use name,comrade/them]
        ·
        3 years ago

        That's even more interesting then. I'm always fascinated by how different cultures can arrive at very similar ideas independently.

        • huf [he/him]
          ·
          3 years ago

          it's a pretty popular idea, not having noun classes:

          https://wals.info/feature/30A#2/26.7/149.1

          https://wals.info/feature/44A#2/18.0/149.1

          • gvngndz [none/use name,comrade/them]
            ·
            3 years ago

            I was talking more about how similarly they are pronounced, not about having non-gendered pronouns. I know that is a fairly common thing.

            • huf [he/him]
              ·
              3 years ago

              oh, right. yeah, you get that kind of coincidence a lot i think

    • Pseudoplatanus22 [he/him]
      ·
      3 years ago

      I guess it's the Ottoman influence, perhaps? Much of Hungary was occupied by the Ottomans for well over a century. Otherwise, Hungarian and Turkish are completely unrelated languages.

      • infuziSporg [e/em/eir]
        ·
        3 years ago

        Pronouns tend to be some of the most durable parts of a language, and the Ottomans didn't have much contact with Hungary until the 1400s. I would say it's more likely that similar patterns of speech were traded between Uralic- and Turkic-speaking peoples in north-central Eurasia.

      • Krem [he/him, they/them]
        ·
        3 years ago

        I heard somewhere that 他 was originally a genderless pronoun and that 她 was invented much later as a variant after contact with European languages

        apparently in Taiwan they even have a gendered "you": 妳, but i've never seen it used by mainlanders.

  • Quimby [any, any]
    ·
    3 years ago

    I've seen this sort of thing before, and tested it. Google randomly alternates between pronouns when translating if one of the languages leaves ambiguity in that regard. You could feed this same thing in and get the opposite in terms of the pronouns used in the translation.

      • dismal [they/them, undecided]
        ·
        3 years ago

        oh yeah, i really don't think it's even a possibility that it's random, given the way those genders got reassigned after translating back to english

        this is really interesting honestly, it's a subject i've never even really seen discussed or mentioned anywhere

      • Quimby [any, any]
        ·
        3 years ago

        SHIT. I just saw the same thing. I got a different result previously with a different language where this same thing was brought up.

        It sounds like maybe it's supported for some languages but not yet others? https://support.google.com/translate/answer/9179237?hl=en

        more context here, I think: https://www.theverge.com/2018/12/6/18129203/google-translate-gender-specific-translations-languages