• ta00000 [none/use name]
    ·
    1 month ago

    Since LLMs essentially decide on one character at a time, I wonder if they would have better accuracy if asked to give the sum backwards. That's how we teach kids to add: right to left, carry the 1.

    • hexaflexagonbear [he/him]
      ·
      1 month ago

      I think this is essentially what they did. The point of the paper is that they designed an architecture that makes the LLM more aware of each digit's position within a number. It helped with addition, multiplication, and even sorting.

    • HexLlama [it/its, she/her]
      ·
      1 month ago

      It's technically true that it decides one token at a time, but it also takes previous tokens into account.

      • ta00000 [none/use name]
        ·
        edit-2
        1 month ago

        That's why it's easier. If you're going left to right, you have to figure out not only the sum of the first digit position but also whether there's a 1 to carry into it. Going right to left, you only have to focus on one single-digit addition at a time, and you already know whether there's a carry by looking at the previous addition.
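
For anyone who wants to see the right-to-left idea spelled out, here is a minimal Python sketch. It is plain schoolbook addition, not the method from the paper discussed above, and the function name `add_right_to_left` is made up for illustration. Each step only needs the two current digits and the carry from the previous step, and the digits come out least-significant first, which is the "backwards" order suggested at the top of the thread.

```python
def add_right_to_left(a: str, b: str) -> str:
    """Add two non-negative decimal strings.

    Returns the digits of the sum in the order they are produced,
    i.e. least-significant digit first (the "backwards" answer).
    """
    # Pad the shorter number with leading zeros so the digits line up.
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)

    carry = 0
    reversed_digits = []
    # Walk both numbers from the rightmost digit to the leftmost.
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        total = int(digit_a) + int(digit_b) + carry
        reversed_digits.append(str(total % 10))  # digit for this position
        carry = total // 10                      # known before the next step

    # A leftover carry becomes the most-significant digit.
    if carry:
        reversed_digits.append(str(carry))
    return "".join(reversed_digits)


reversed_sum = add_right_to_left("478", "256")
print(reversed_sum)        # digits in the order they were generated
print(reversed_sum[::-1])  # flipped back into normal reading order
```

Going left to right instead, the first output digit would depend on carries from every position to its right, so it could not be written down until the whole sum had effectively been worked out.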