cross-posted from: https://lemmy.ml/post/24102825

DeepSeek V3 is a big deal for a number of reasons.

At only $5.5 million to train, it cost a fraction of what models from OpenAI, Google, or Anthropic do, which often run into the hundreds of millions.

It breaks the whole AI-as-a-service business model that OpenAI and Google have been pursuing, and makes state-of-the-art language models accessible to smaller companies, research institutions, and even individuals.

The code is publicly available, allowing anyone to use, study, modify, and build upon it. Companies can integrate it into their products without paying for usage, making it financially attractive. The open-source nature fosters collaboration and rapid innovation.

The model goes head-to-head with, and often outperforms, models like GPT-4o and Claude-3.5-Sonnet on various benchmarks. It excels in areas that are traditionally challenging for AI, like advanced mathematics and code generation. Its 128K token context window means it can process and understand very long documents. Meanwhile, it generates text at 60 tokens per second, twice as fast as GPT-4o.

The Mixture-of-Experts (MoE) approach used by the model is key to its performance. While the model has a massive 671 billion parameters, it only activates 37 billion of them for any given token, making it incredibly efficient. Compared to Meta's Llama 3.1 (405 billion parameters, all used at once), DeepSeek V3 uses over 10 times fewer parameters per token yet performs better.
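To make the "over 10 times" figure concrete, here's a rough back-of-the-envelope check using only the parameter counts quoted above. It assumes that active parameters per token are a fair proxy for per-token compute, which is a simplification (it ignores memory footprint, routing overhead, and hardware details). The function and constant names are just for illustration.

```python
# Parameter counts quoted in the post, in billions.
DEEPSEEK_V3_TOTAL_B = 671    # all experts combined
DEEPSEEK_V3_ACTIVE_B = 37    # parameters actually used per token
LLAMA_3_1_DENSE_B = 405      # dense model: every parameter used per token


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of a MoE model's parameters that are active for one token."""
    return active_b / total_b


def active_param_ratio(dense_b: float, moe_active_b: float) -> float:
    """How many times more parameters the dense model touches per token."""
    return dense_b / moe_active_b


fraction = active_fraction(DEEPSEEK_V3_ACTIVE_B, DEEPSEEK_V3_TOTAL_B)
ratio = active_param_ratio(LLAMA_3_1_DENSE_B, DEEPSEEK_V3_ACTIVE_B)

print(f"DeepSeek V3 active share per token: {fraction:.1%}")      # 5.5%
print(f"Llama 3.1 vs DeepSeek V3 active parameters: {ratio:.1f}x")  # 10.9x
```

So the model touches roughly 5.5% of its weights per token, and the dense 405B model touches about 10.9 times as many. Note that all 671 billion parameters still have to be stored somewhere, so the saving is in compute per token, not in memory.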

DeepSeek V3 can be seen as a significant technological achievement by China in the face of US attempts to limit its AI progress. China once again demonstrates that resourcefulness can overcome limitations.

  • AtmosphericRiversCuomo [none/use name]
    ·
    2 days ago

    These types of posts go over like a lead balloon here because people don't want to accept the material reality of what's happening with this tech, but it's undoubtedly a stroke of luck for all of us that ghoulish companies like OpenAI don't have any special sauce here. Open source models have consistently been able to keep up with, or at least get really close to, the frontier model performance of companies that spend billions only to see their efforts replicated by these absolute Chads from China.

    • ☆ Yσɠƚԋσʂ ☆@lemmy.ml
      hexagon
      ·
      2 days ago

      The amount of hate this tech gets is phenomenal, and most of it is completely misdirected. The problems that people ascribe to it aren’t inherent in the technology, but are simply symptoms of underlying social problems in a capitalist society.

      For example, people complain that it takes jobs away, but the whole idea that we have to work for the sake of work is idiotic to begin with. Technology that frees people from work should create more free time for them to enjoy. The reason that’s not happening is that capitalism is not a rational economic system.

      Another common argument is that it’s very resource intensive and wastes energy. This is true, but there’s no reason to believe this won’t be optimized. In fact, we’ve already seen a lot of optimizations happen in just a few years, so that models which used to require a data centre can now run on a laptop.

      However, more fundamentally, wasting energy is once again an aspect of the capitalist system itself. Before AI we saw stuff like crypto, NFTs, and so on. Much of the technology that’s developed under capitalism ends up being frivolous or even actively harmful. So, it’s not generative AI that’s the problem, but the social system that guides allocation of labour and resources.

      In particular, artists are still clinging to an artisan model focusing on individual exceptionalism and intellectual property rights. These reactions, rooted in petty-bourgeois ideology, ultimately serve to reinforce inequality and empower corporations rather than protect artists.

      The core contradiction here is between the increasingly socialized nature of artistic production in a globalized, digital world and the continued emphasis on private ownership. It's a symptom of capitalist development that leads to the proletarianization of artists as they are displaced by industrial competition.

      The real solution lies in worker solidarity, unionization, and ultimately, the socialization of property. The enemy is not AI itself but the capitalist market that shapes its deployment, a system that already produces formulaic, profit-driven art. Focusing on the underlying class struggle is how we get a future where technology serves the collective good rather than further entrenching existing power structures. Incidentally, this is a brilliant write-up on the subject: https://redsails.org/artisanal-intelligence/