hexaflexagonbear [he/him] · 6 months ago

    I think this is essentially what they did. The point of the paper is that they designed an architecture that makes the LLM more aware of each digit's position within a number. It helped with addition, multiplication, and even sorting.
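To make the "position within a number" idea concrete, here's a rough sketch of how you could tag each digit with its index inside its own number. This is only my illustration, not the paper's actual code: the function name `digit_positions` and the convention of 0 for non-digit characters are assumptions on my part.

```python
def digit_positions(text: str) -> list[int]:
    """For each character, return its 1-based index within the current
    run of digits, or 0 if the character is not a digit.

    Hypothetical helper for illustration; not taken from the paper.
    """
    positions = []
    run = 0  # length of the digit run we are currently inside
    for ch in text:
        # Extend the run on a digit, reset it on anything else.
        run = run + 1 if ch.isdigit() else 0
        positions.append(run)
    return positions


if __name__ == "__main__":
    # Each digit gets an index that restarts at every new number.
    print(digit_positions("12+345"))
```

In a model, ids like these would presumably be used to look up an extra learned embedding that gets added to the token embedding, so digits in the same place across different numbers share a signal. Note that the index here counts from the left; it only lines up with place value (ones, tens, ...) if the numbers are written least-significant digit first.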