CPU-posting on main

MTI = MIPS Technologies (company that made MIPS (Microprocessor without Interlocked Pipeline Stages) processors, they make RISC-V processors now lmao)

At the time when the MIPS R10000, known as the "T5" while in development, was being designed, MTI had made a name for themselves as designers of high-performance microprocessors built along the lines of the then-new philosophy of reduced instruction set computing (RISC). In fact, their R2000 design was the first commercially available RISC microprocessor. By the time the T5 was being designed, though, they were no longer alone in the RISC microprocessor market. Several companies, including IBM and Motorola (joined together in the AIM alliance, which produced PowerPC), DEC (who designed the Alpha line of RISC microprocessors after MTI owned them in the 80s, when MTI's radically simpler chips were performing better than VAXen), and Sun Microsystems (who were making the SPARC line of microprocessors), were now marketing RISC microprocessors. And not just marketing them, but beating MTI in the market MTI had created. After trying and failing to develop their own complete computer systems alongside their chips, MTI were having financial difficulties until Silicon Graphics acquired them to secure availability of MIPS microprocessors for their famous ("it's a Unix system, I know this!") MIPS-based workstations and servers. Although their new (in 1993) R4000 and R4400 designs performed well compared to their contemporaries, they were quickly being made obsolete by the new offerings from MTI's competitors, and MTI were left with a problem:

The MIPS R4000 and the R4400, which is essentially an R4000 with bigger on-die caches, were more or less just an architectural evolution from the R2000. The R4000 got its performance in much the same way as the R2000 did, by following the classic RISC design mantra: "let's make it simpler", and thus be able to run it faster. In particular, what this means for the R4000, and what is a key difference from its predecessors and its contemporaries, is a technique called superpipelining. In an instruction pipeline, the maximum speed at which your processor can issue instructions is set by the pipeline stage that takes the longest to complete. Superpipelining is one way of addressing this problem: you can subdivide each pipeline stage into 2 simpler pipeline stages that individually complete faster, and thus be able to clock your chip faster. However, this has its limits. Eventually, it becomes impossible to further "deepen" the pipeline like this, or to clock the processor faster in general, without running into other problems. This is why MTI's competitors opted for the alternative superscalar approach: you can duplicate functional units of your processor and have multiple instructions "in flight" at the same time, which usually also involves multiple pipelines. At the time, MTI thought superpipelining would result in more consistently higher performance (not to mention save die space), but they were quickly proven wrong when their competitors' superscalar chips (often with other architectural tricks too) were outperforming the R4000, in spite of MTI's fabrication partners constantly improving their process and releasing chips that ran at higher and higher speeds.
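
To put rough numbers on the "slowest stage sets the clock" idea, here is a minimal toy model in Python. Everything in it is an assumption for illustration: the stage delays, the latch overhead, and the function names are made up and are not measurements of the R4000 or any other real chip. It only shows why splitting stages raises the clock with diminishing returns, and how a wider (superscalar) machine raises peak throughput without touching the clock.

```python
# Toy model only: the delays below are invented, not taken from any real MIPS part.

def max_clock_mhz(stage_delays_ns, latch_overhead_ns=0.0):
    """The clock period is bounded by the slowest stage plus a fixed per-stage overhead."""
    period_ns = max(stage_delays_ns) + latch_overhead_ns
    return 1000.0 / period_ns

def superpipeline(stage_delays_ns):
    """Split every stage into two stages that each take half as long."""
    return [delay / 2 for delay in stage_delays_ns for _ in range(2)]

def peak_mips(clock_mhz, issue_width=1):
    """Ideal peak throughput (millions of instructions per second), ignoring stalls."""
    return clock_mhz * issue_width

base = [10.0, 8.0, 10.0, 6.0, 8.0]  # hypothetical 5-stage pipeline, delays in ns
deep = superpipeline(base)          # 10 stages, each half as long

overhead = 1.0  # hypothetical latch/clock-skew cost paid by every stage, in ns
base_clock = max_clock_mhz(base, overhead)
deep_clock = max_clock_mhz(deep, overhead)

print(f"base pipeline:      {base_clock:.1f} MHz")
print(f"superpipelined:     {deep_clock:.1f} MHz")
print(f"2-wide superscalar: {peak_mips(base_clock, issue_width=2):.1f} peak MIPS at the base clock")
```

Because the fixed overhead does not shrink when a stage is split, doubling the pipeline depth gives less than double the clock in this model, which is one simple way to see the limit described above.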

Enter the MIPS R8000 (die not pictured here) in 1994, a weird and expensive 6-chip 4-way superscalar design meant for the high-end microprocessor market while the next-generation T5 (which would become the MIPS R10000, as mentioned earlier) was under development. It didn't sell well because of its high price and the fact that its integer performance, important for general-purpose computing applications, was lacking compared to the 200-MHz R4400 that was being sold by then. It did, however, have impressive floating-point performance, which landed many R8000-based systems in the TOP500 supercomputer list for a time. But this design could never be the high-performance, general-purpose processor MTI needed to compete with their competitors' offerings...

Introduced in 1996, the MIPS R10000 (die IS pictured here) was a significant departure from the architecture of the R4000 (which more or less was directly derived from the first research done at Stanford University where MIPS was initially created over a decade earlier). Dropping the superpipeline approach, the R10000 is a 4-way superscalar processor even capable of executing instructions out of order! Another big change is that it has a branch predictor and speculatively executes instructions after a branch as opposed to the R4000, which used the classic MIPS "branch delay slot" technique to schedule one more instruction in the pipeline after a branch and then stall lol (they should have added even more delay slots, caring about binary compatibility is liberalism). It's hard to find benchmarks for something this old but this design performed at least several times faster than an R4400 at about the same clock speed!
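
For anyone wondering what a branch predictor actually does, here is a minimal sketch of the textbook 2-bit saturating counter scheme in Python. This is an illustration of the general technique, not a claim about the exact R10000 circuit; the class name, the table size, and the example branch address are all assumptions picked for the demo.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low bits of the branch address."""

    def __init__(self, entries=512):  # table size is an arbitrary choice for this sketch
        self.entries = entries
        # Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
        self.counters = [1] * entries

    def _index(self, pc):
        # MIPS instructions are 4 bytes, so drop the two low address bits.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, pc, outcomes):
    """Fraction of outcomes predicted correctly, training the predictor as it goes."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)


# A loop-closing branch: taken 9 times, then not taken once, and the loop runs 10 times.
loop_outcomes = ([True] * 9 + [False]) * 10
loop_accuracy = accuracy(TwoBitPredictor(), 0x400100, loop_outcomes)
print(f"loop branch accuracy: {loop_accuracy:.2f}")
```

The two-bit counter means a single loop exit does not flip the prediction, so the branch is only mispredicted once per pass through the loop after warm-up. A speculative processor keeps fetching down the predicted path and throws the work away on a misprediction.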

If you like my CPU posting and want me to post more in the future let me know

Also, ask me any questions if you want and I'll try to answer

  • Yurt_Owl
    ·
    5 months ago

    Very good post! I'd love to see more CPU history posting or just more tech posting in general.

    • PaX [comrade/them, they/them]
      hexagon
      ·
      5 months ago

      Thank you for the kind words

      Maybe I can write something about a SPARC next. I don't know much about them and it's a good reason to learn lol

  • Bugger@mander.xyz
    ·
    5 months ago

    Great post, I could personally spend hours looking at a nice die shot like that and I appreciated the background history. Thanks!

  • M68040 [they/them]
    ·
    edit-2
    5 months ago

    I had an SGI O2 and an SGI Octane for a while. R10 and R12-based, respectively. Sadly, UNIX workstations aren't very exciting to actually use; a lot of what made IRIX and Solaris unique at the time is stuff modern Linux environments do out of the box. They ended up getting sold off to make room and let me branch out into complex kitbuild designs.

    The O2's A/V capture abilities are very much geared towards production purposes - attempting to capture off my Nintendo 64 resulted in the A/V board being unable to sync to the signal.

    • PaX [comrade/them, they/them]
      hexagon
      ·
      5 months ago

      Ooh those machines are so cool, at least to me lol. Thanks for sharing that pic. Of course, yeah, they don't really do anything special compared to Linux on a regular old PC but I appreciate them architecturally and like writing new software for them. I just recently got my first MIPS machine, an SGI Indigo 2 (R4400-based), and I'm still meaning to port Plan 9 to it sometime if I can find the energy and focus to lol.

      One of those SGI MIPS machines is so strange and different compared to a PC, in a good way. Sorry to hear that you couldn't use your O2 to capture your N64 video signal though

      Btw, how is your Radio-86RK build coming along?

      • M68040 [they/them]
        ·
        5 months ago

        Not bad, I’ve decided to tackle a newer revision of the board Sergei released while the first machine was in assembly. Wanted to go through his storefront.

        Been getting capacitors together for it; using some of those nice box KEMET ones for added fanciness. More effort than a trip to Mouser by some distance, but I’ve already been importing chips from abroad for this so I’m already going some extra distance I theoretically don’t need to.

    • PaX [comrade/them, they/them]
      hexagon
      ·
      5 months ago

      It is kinda similar! You could view all the different functional units of a microprocessor as pieces of a factory that consume an input and produce an output. All these signals have to be moved around, from place to place, in a manner that does kinda resemble how a really messy Factorio factory might look lol.

  • CoolYori [she/her]
    ·
    5 months ago

    Thank you for writing this post. I am in networking and you will still find MIPS processors doing the heavy lifting. The main firewalls I program have Cavium branded MIPS CPUs and they could not do their main jobs without em. Some of the older Cisco gear that ran the internet was also MIPS driven.

  • ashinadash [she/her, comrade/them]
    ·
    1 month ago

    O yea I rember this post :)

    However, this has its limits. Eventually, it becomes impossible to further "deepen" the pipeline like this or clock the processor faster in general without other problems.

    Netburst engineers in shambles

    the R10000 is a 4-way superscalar processor even capable of executing instructions out of order! Another big change is that it has a branch predictor and speculatively executes instructions after a branch

    Shit, did the Pentiums of the time (MMX I think) even have branch prediction? Kinda slaps.

    • ashinadash [she/her, comrade/them]
      ·
      1 month ago

      Ok I was reading Natopedia about the P5 and

      Superscalar architecture – The Pentium has two datapaths (pipelines) that allow it to complete two instructions per clock cycle in many cases. The main pipe (U) can handle any instruction, while the other (V) can handle the most common simple instructions. Some[who?] reduced instruction set computer (RISC) proponents had argued that the "complicated" x86 instruction set would probably never be implemented by a tightly pipelined microarchitecture, much less by a dual-pipeline design. The 486 and the Pentium demonstrated that this was indeed possible and feasible.

      This has the same vibe as all the anticommunist screeds on natopedia, this is x86 propaganda I think.

      • PaX [comrade/them, they/them]
        hexagon
        ·
        1 month ago

        That is strangely pointed lol

        Which is weird cuz x86 processors from the P6 (Pentium Pro) onward, including current PC processors, are usually implemented as some kind of RISC-like microarchitecture that still pretends to be a CISC processor from like 50 years ago lol, it's not even a half-measure, it's like a tenth-measure instead of just building better computer systems without the baggage of the past

    • PaX [comrade/them, they/them]
      hexagon
      ·
      1 month ago

      Netburst engineers in shambles

      Ikr, they were fucking cooking with their 200-stage long pipeline or whatever they were doing

      Wait I looked this up and it's not even true, no idea how this 200 number got into my head lol

      Shit, did the Pentiums of the time (MMX I think) even have branch prediction? Kinda slaps.

      Oh yes, MTI was actually pretty late with branch prediction in their flagship processor, pretty much everyone was already doing it

      The original idea they had, implemented in the R2000, was to instead "expose" the pipeline so that compilers could schedule a useful instruction while the processor was figuring out which way a branch would go, instead of stalling or having complicated speculative execution circuitry (the infamous "branch delay slot", people often hate on this but I like it). Unfortunately the R2000, the original MIPS processor, only had one slot and they couldn't add more without breaking binary compatibility (liberalism) between different MIPS processors. Also, sadly, it's not always possible for compilers to find useful work that can be done regardless of which branch is taken
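
      A minimal sketch of what a delay slot means in practice, as a toy interpreter in Python. The instruction names only loosely imitate MIPS mnemonics, the `run` function and tuple format are hypothetical, and real MIPS has more rules than this (a branch inside a delay slot, for example, is not handled sensibly here). The only point is that the instruction right after a branch always executes, so a compiler can either waste that slot on a nop or move useful work into it.

```python
def run(program, regs=None, max_steps=1000):
    """Execute a toy program in which every branch has exactly one delay slot."""
    regs = dict(regs or {})
    pc = 0
    pending_target = None  # where to jump once the delay slot instruction has run
    executed = []          # trace of executed instruction indices
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]
        executed.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sits in a delay slot: it runs, then control transfers.
            next_pc = pending_target
            pending_target = None
        if op == "addi":
            dest, src, imm = args
            regs[dest] = regs.get(src, 0) + imm
        elif op == "bnez":
            src, target = args
            if regs.get(src, 0) != 0:
                pending_target = target
        elif op != "nop":
            raise ValueError(f"unknown op {op!r}")
        pc = next_pc
    return regs, executed


# The compiler found nothing to move, so the delay slot is a wasted nop.
wasted_slot = [
    ("addi", "total", "total", 5),
    ("addi", "n", "n", -1),
    ("bnez", "n", 0),
    ("nop",),                       # delay slot
]

# Same loop, with the add moved into the delay slot.
filled_slot = [
    ("addi", "n", "n", -1),
    ("bnez", "n", 0),
    ("addi", "total", "total", 5),  # delay slot, runs whether or not the branch is taken
]

wasted_regs, wasted_trace = run(wasted_slot, {"n": 3, "total": 0})
filled_regs, filled_trace = run(filled_slot, {"n": 3, "total": 0})
print(wasted_regs["total"], len(wasted_trace))
print(filled_regs["total"], len(filled_trace))
```

      Both versions compute the same total, but the filled version executes fewer instructions. The add can only move into the slot because it is needed on every iteration regardless of the branch outcome, which is exactly the kind of work a compiler can't always find.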

      • ashinadash [she/her, comrade/them]
        ·
        1 month ago

        I think Prescott was 31 stages and Tejas/Jayhawk was gonna be 50, which is still ABSURD squidward-nervous

        To be fair I can see why they'd want that compatibility between MIPS processors (liberalism) given x86 stuff is all (somewhat?) intercompatible. That branch delay slot actually does sound cooler than dedicating die space and engineering time to branch predicting stuff... Alas,