Eh, running vision algorithms on video data starts to put some more pressure on the memory
I meant you train stuff wherever, and for real-time you use an FPGA anyway, which China can make. Memory is nice to have, but persistent 3D (in time) tracking is still something that eludes networks, because they lose context. And having a nutso 1000-layer-deep transformer to take time into account is much more expensive than just running a CNN on 1-5 frames and then doing stuff with the classified images.
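Rough sketch of what I mean by "CNN on 1-5 frames, then do stuff with the classified images". Everything here is made up for illustration: `classify_frame` is a hypothetical stand-in for the per-frame CNN (just a brightness threshold), and the "stuff" is a majority vote over the last few labels. The point is only that the buffer stays a few frames long however long the video runs.

```python
from collections import Counter, deque

def classify_frame(frame):
    # Hypothetical stand-in for a per-frame CNN: a "frame" is just a list of
    # pixel values and the "label" comes from a brightness threshold.
    return "bright" if sum(frame) / len(frame) > 0.5 else "dark"

def smoothed_labels(frames, window=5):
    # Keep only the last `window` per-frame labels, so memory use is bounded
    # by the window size, not by the length of the video.
    recent = deque(maxlen=window)
    out = []
    for frame in frames:
        recent.append(classify_frame(frame))
        # Majority vote over the short window stands in for
        # "doing stuff with classified images".
        out.append(Counter(recent).most_common(1)[0][0])
    return out

frames = [[0.9], [0.9], [0.1], [0.9], [0.9]]
print(smoothed_labels(frames, window=3))
```

With `window=1` you get the raw per-frame labels back, so the single dark frame shows through. With a slightly longer window it gets voted out.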