I got an imprecise large language model right here for them
JUST WRITE GOOD CODE HOLY SHIT
Only briefly skimmed, but don't you need a nonlinearity for these things to work (e.g., rectifier, sigmoid...)? Otherwise it's just linear algebra, and more layers can't help: a product of matrices is just another matrix, so only the input and output dimensions matter. I don't think you can really get nonlinearity with one bit.
Not my field, so I'm sure I'm missing something. If anyone wants to ELI5 though...
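For what it's worth, the "more layers can't help" part is easy to check numerically. This is only a toy sketch with made-up matrices (`W1`, `W2`, `x`, and `relu` are illustrative names, not anything from the article), and it says nothing about how a 1-bit model actually gets its nonlinearity:

```python
import numpy as np

# Hypothetical toy weights and input, chosen by hand for illustration.
W1 = np.array([[1.0, -1.0],
               [2.0,  0.0],
               [0.0,  1.0]])       # first layer: 2 inputs -> 3 hidden units
W2 = np.array([[1.0, 0.0, -1.0],
               [0.0, 1.0,  1.0]])  # second layer: 3 hidden units -> 2 outputs
x = np.array([1.0, 2.0])


def relu(v):
    """Rectifier: zero out negative entries."""
    return np.maximum(v, 0.0)


# Two purely linear layers applied one after the other...
stacked_linear = W2 @ (W1 @ x)
# ...equal a single layer whose weight matrix is the product W2 @ W1.
collapsed = (W2 @ W1) @ x

# With a rectifier between the layers, the single-matrix shortcut no longer holds.
with_relu = W2 @ relu(W1 @ x)

print("stacked linear:", stacked_linear)
print("collapsed:     ", collapsed)
print("with ReLU:     ", with_relu)
```

The first two printed vectors match and the third does not, which is the whole argument: without something nonlinear between them, two layers are one layer.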
Interesting read, especially the idea of specialized hardware for 1-bit LLMs.