I watched Nvidia's Computex 2024 keynote and it made my blood run cold

git [he/him, comrade/them] · 4 months ago

KnilAdlez [none/use name] · edit-2 4 months ago

I'm going to slap down a hot take that is somewhat tangential to the article. I think deep learning models of many kinds are actually good. I use ML for speech to text and text to speech. I use Home Assistant, with its AI-powered tools. Hell, I've even generated images to help me visualize descriptions of things (it was not useful tbh). All of these uses help me, both with my disability and with just making my day easier, and it all ran on LOCAL HARDWARE. The fact is that almost any of the use cases for ML can be made small enough to run locally, but there are almost no good local ML accelerators. Something like 90% of sentences sent to Google Home were requests to set a timer. You don't need a data center for that.
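As a rough illustration of the timer point: once speech has been transcribed to text on the device, a request like that can be handled with a few lines of ordinary code. This is only a sketch under stated assumptions. The function name `parse_timer_request` and the phrase pattern are made up for the example, and it is not how Google Home or Home Assistant actually handle intents.

```python
import re

# Hypothetical helper for illustration only: the names and the phrase
# pattern below are assumptions, not any real assistant's API.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}

# Matches phrases such as "set a timer for 5 minutes" or "set timer for 1 hour".
_TIMER_RE = re.compile(
    r"\bset (?:a |an )?timer for (\d+) (second|minute|hour)s?\b",
    re.IGNORECASE,
)


def parse_timer_request(text):
    """Return the timer length in seconds, or None if text is not a timer request."""
    match = _TIMER_RE.search(text)
    if match is None:
        return None
    amount = int(match.group(1))
    unit = match.group(2).lower()
    return amount * _UNIT_SECONDS[unit]
```

Anything the pattern does not recognize returns `None`, so only those requests would need to be passed on to a heavier model.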

This is the true intent of these ridiculously power-hungry data center cards: to build out a sort of rentier class of hardware owners (Nvidia the largest among them) that others have to rent from to stay cutting edge. This has been the dream since software as a service started getting thrown around 10 years ago. But just like streaming services, SaaS, and indeed rent, once the demand has been fully monopolized or carved up by trusts in whatever sectors it ends up in, the price will skyrocket. ML could have been a real revolution in improving people's lives, but instead the hardware makers saw only a way to make money with no work of their own, and through the theft of work from others.

PorkrollPosadist [he/him, they/them] · edit-2 4 months ago

All of these uses help me, both with my disability and with just making my day easier, and it all ran on LOCAL HARDWARE. The fact is that almost any of the use cases for ML can be made small enough to run locally, but there are almost no good local ML accelerators. Something like 90% of sentences sent to Google Home were requests to set a timer. You don't need a data center for that.

I run LibreTranslate on matapacos.dog for inline post translation (and at home to avoid Google knowing every scrap of foreign language text I read) and it is a similar story. It runs locally (doesn't even require a GPU) and makes no remote requests. Language models developed for specific purposes can accomplish great things for accessibility with much lower performance requirements than the gimmicky shit Silicon Valley tries to pawn off as "artificial general intelligence."

KnilAdlez [none/use name] · 4 months ago

Exactly! PCs today are powerful enough to run them in decent time without acceleration too; it would just be more efficient to have it, ultimately saving time and energy. I would be interested in seeing how much processing power is wasted calculating what are effectively edge cases in a model's real workload. What percentage of GPT-4 queries could not be answered accurately by GPT-3 or a local LLaMA model? I'm willing to bet it's less than 10%. Terawatt-hours and hundreds of gallons of water to run a model that, for 90% of users, could be run locally.
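One way to put a number on that question is a local-first fallback: answer with the small model when it seems confident, send the query out only when it doesn't, and count how often each path is taken. The sketch below is purely illustrative. The model interfaces, the confidence score, and the 0.8 threshold are all assumptions, and getting a trustworthy confidence value out of a real language model is a hard problem of its own.

```python
# Hypothetical interfaces for illustration only:
#   local_model(prompt)  -> (answer, confidence between 0 and 1)
#   remote_model(prompt) -> answer
# Neither corresponds to a real library API.


def answer_with_fallback(prompt, local_model, remote_model, threshold=0.8):
    """Try the local model first; call the remote model only when confidence is low.

    Returns (answer, source), where source is "local" or "remote".
    """
    answer, confidence = local_model(prompt)
    if confidence >= threshold:
        return answer, "local"
    return remote_model(prompt), "remote"


def local_share(prompts, local_model, remote_model, threshold=0.8):
    """Fraction of prompts that were answered without leaving the machine."""
    if not prompts:
        return 0.0
    local_hits = sum(
        1
        for prompt in prompts
        if answer_with_fallback(prompt, local_model, remote_model, threshold)[1] == "local"
    )
    return local_hits / len(prompts)
```

Running `local_share` over a log of real queries would give a rough estimate of how much of the workload actually needs the data center.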

gaycomputeruser [she/her] · 4 months ago

How do you feel about stuff like Google Coral?

KnilAdlez [none/use name] · edit-2 4 months ago

I remember hearing about Coral when it first came out, and it is exactly what I want to see more of: low-cost, efficient AI accelerators for home use. I haven't personally used it, but I have used Nvidia's Jetson Nano, which is similar but more power hungry. Coral does have some issues though, namely that you are locked into TensorFlow Lite (a similar issue with the Jetson series, but less bad since you can translate models into TensorRT). The Edge TPU also hasn't had an update since 2018 as far as I can tell, and surely there have been breakthroughs since then that would make it more efficient, able to handle larger and newer models, or at least cheaper. I could be conspiratorial and suggest that anything more powerful would cut into Google Cloud service money, but it's moot.

In short, I like it and I want to see more of it. In a capitalist view, I want more competition in this space.

Edit: I know that multiple edge TPUs can be used together, but larger solutions for municipal or business use would be nice as well.

gaycomputeruser [she/her] · 4 months ago

I didn't realize the hardware locks you to a software suite, that's generally a dealbreaker unless you already use that software for everything. The conspiratorial comment seems quite likely to me. It really seemed like they were putting a lot of weight behind the project and then just dropped it.

KnilAdlez [none/use name] · 4 months ago

Yeah, it leaves a lot to be desired, especially because Torch is much better than TensorFlow to work with (imo). But still, it's a step in the right direction. I would love to see more like it (but better) in the future. A peek at Mouser shows that there are a few options in the $100-$200 range, but at that price I'll save myself the headache of trying to figure out what frameworks they support and get a Jetson. Lots to be desired in this sector.