PC apologists keep trying to perpetuate myths about what is and isn't RISC. There is a myth running around that the Pentiums (and other x86 variants) are RISC chips. Ars Technica seems to be especially fond of twisting the truth, using lots of technobabble and jargon (mixed in with selective omissions) to obfuscate the truth and make people believe this myth. I will try to address the facts more openly than they (or others) do, in order to allow people to make up their minds from an educated position.
What is RISC
I recommend that you read my article, What is RISC, as it is necessary background for this discussion, and I hope it will give you a deep understanding of the jargon and what it all means. Sadly, even that long article doesn't cover all the issues -- but at least it is a good starting point.
A quick summary is that RISC is a design philosophy where you reduce the COMPLEXITY of the instruction set, which reduces the amount of space, time, cost, power, heat, and other resources it takes to implement the instruction-set part of a processor. Then you use those savings in other areas to make a better design (overall). It is about simplifying the ISA (Instruction Set Architecture) to enable compilers to take better advantage of things, and to use that saved space (and the time saved designing and implementing the chip) for better things.
There are many techniques used to save space (or complexity) or to increase performance. The techniques are not the ends -- just ways to achieve the ends. But many of the techniques allow for differentiation in design (and philosophy) and help achieve goals. The sections below look at the major techniques, and at how Intel's chips measure up on each.
When Intel implemented the P6 core (PentiumPro), they needed to get some of the features of RISC (superscalar execution, pipelining, cache, etc.) without actually redesigning the instruction set (and doing things right). So they just bolted it on -- in engineering terms, this is called a hack. They expand all the ugly x86 instructions into one (or a few) RISC-like instructions (called ROps). Then the processor basically runs those ROps in a RISC-like back-end sub-core -- think of it as a built-in x86 emulator.
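The decode stage can be sketched roughly like this in Python (the tuple instruction format, mnemonics, and the three-way expansion are invented for illustration -- a real decoder works on binary x86 encodings, and Intel's actual ROp breakdowns are undocumented):

```python
# Toy sketch of CISC-to-micro-op expansion (names invented for illustration;
# a real decoder works on binary x86 encodings, not tuples).

def expand_to_rops(instr):
    """Expand one CISC-style instruction into RISC-like micro-ops (ROps)."""
    op, dst, src = instr
    rops = []
    if dst.startswith("mem"):
        # A memory-destination ALU op factors into load / compute / store.
        rops.append(("load", "t0", dst))       # read the memory operand
        rops.append((op, "t0", "t0", src))     # do the arithmetic in a temp
        rops.append(("store", dst, "t0"))      # write the result back
    else:
        rops.append((op, dst, dst, src))       # register-only ops map 1:1
    return rops

# 'add [bx], ax' becomes three ROps; 'add ax, bx' stays one.
print(expand_to_rops(("add", "mem[bx]", "ax")))
print(expand_to_rops(("add", "ax", "bx")))
```

The point of the sketch is the shape of the hack: the front end does real work (and burns real transistors) just to turn every x86 instruction into the fixed-format operations a RISC chip would have started with.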
This is a CISC-RISC hybrid that certainly isn't like real RISC, and it sees only a few of the rewards of a new (better) design. Intel originally called this hack, er, architecture "CRISC" -- which stood for the oxymoronic "Complex Reduced Instruction Set Computer". That name and concept was so moronic that even Intel couldn't market it -- so they stopped calling it that. The problem with Intel's hybrid chip is that it sees few of the benefits of RISC.
Remember, the design philosophy of "CRISC" didn't change -- and RISC is a design philosophy. Intel didn't reduce the instruction set complexity -- they just bolted more on. Think of sticking a V8 motor, bigger tires, and a camper-shell on an old Volkswagen Beetle and calling it an SUV (Sport Utility Vehicle) -- that's Intel's way of approaching processor design. The bolt-on RISC core (and CISC emulator) makes their chips bigger and hotter, and makes them use more power than RISC; they aren't easier to design (and are actually harder). The x86 processors have more bugs since they are much more complex (which has been proven with the Pentium floating-point bug, the P6 overflow bug, all the support-chip bugs, and so on). Intel is hiding the symptoms of not being RISC by improving their microcode to be more RISC-like and by throwing 10 times more designers at the problem -- but this can't make up for the lousy (antiquated) architecture. They can get nearly the same performance, but they require more transistors, design time, heat, power, and cost to do it. They haven't made the engineering tradeoffs -- they just bought their way out. They didn't engineer anything out, they didn't design (which is the RISC philosophy) -- they just hacked more crap on and called it "good enough". All these concepts are diametrically opposed to the RISC design philosophy.
It isn't just the philosophy that isn't RISC -- let's look at the implementation details and techniques that have defined RISC:
Decrease instruction set complexity
With the Pentiums (or Athlon) there are none of the RISC advantages of modernizing the instruction set. Programmers still have problems with a register-starved architecture, and an ugly, outdated instruction set that can't (easily) do many things programmers want. They still have that butt-ugly stack-based floating point unit, and so on. The compilers (and programmers) can't access the ROps directly (and get around the CISC emulator/core). The implementation of that RISC-like core is hidden (opaque) from programmers, and changes from implementation to implementation. Because of all this, it is harder to properly schedule things (with pseudo-RISC) and more difficult to optimize well -- and when you do, you are dependent on that particular implementation of the chip. Which brings up another point -- you need more specialization in your compiler optimization to get the same results, and there can be more variation from processor to processor (Pentium, PentiumPro, AMDs, Cyrix, and so on). So one of the negatives of RISC (requiring smarter compilers) is magnified by this pseudo-RISC implementation -- it takes an even smarter compiler (or more hardware) than RISC to achieve the same results. And the other negative (that code gets optimized for one particular flavor of chip) is magnified as well. This is one of the reasons why Pentiums do better in special-case specifications and benchmarks than they do in most real-world tests.
One of the key concepts for simplifying the instruction set was a LOAD/STORE architecture -- breaking down the MOV instruction into its factored parts. Intel didn't change the ISA, so they still use MOVs -- with all the foibles and issues. In the back end, they can break a MOV down and sort of expand it into many RISC instructions -- but this requires a complex front end, which increases size, heat, complexity, design time, and so on. One of the most defining parts of RISC -- LOAD/STORE -- is not there.
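Why does the factoring matter? With explicit loads and stores, the compiler schedules memory traffic itself. A back-of-envelope sketch (a made-up accounting example, counting memory operations for a loop that increments a counter N times under the two code-generation styles):

```python
# Made-up accounting sketch: memory traffic for a loop that adds 1 to a
# counter N times, under two code-generation styles.

def cisc_memory_ops(n):
    # Memory-operand style ('add [counter], 1' each iteration):
    # every iteration does one load and one store.
    return 2 * n

def risc_memory_ops(n):
    # Load/store style: the compiler loads the counter into a register once,
    # adds N times register-to-register, and stores once at the end.
    return 2

print(cisc_memory_ops(100), risc_memory_ops(100))   # 200 vs 2
```

(A good x86 compiler can keep the counter in a register too, of course -- the point is that a load/store ISA forces this factoring out into the open, where the compiler controls it, instead of hiding it behind the front-end expansion.)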
Fixed size instructions (alignment)
Another concept of RISC is to have fixed-size instructions. This keeps the prefetch logic and cache logic much simpler: everything you fetch is a predictable size. There is some gray area here, as RISC evolved to allow misaligned data (but not instructions) -- but that didn't take much logic at all (it just shifts alignment as data comes into the processor). The Pentiums load all these mixed sizes, then try to break them down and pad them out to simulate fixed-size instructions for the RISC-like core. This means a whole new front-end unit to bring in data and keep unpacking it to try to align it, or sometimes expand one instruction into many. This doesn't simplify instruction decoding like RISC did -- it makes it more complex.
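The cost shows up in the simple act of finding where instructions begin. A toy sketch (the instruction lengths are invented; real x86 instructions run from 1 to 15 bytes):

```python
# Toy decoder sketch: why fixed-size instructions simplify fetch.

def nth_instruction_offset_variable(lengths, n):
    """With variable-length instructions you must decode (at least the
    lengths of) every earlier instruction to find where the nth starts."""
    offset = 0
    for i in range(n):
        offset += lengths[i]   # serial dependency: each step needs the last
    return offset

def nth_instruction_offset_fixed(n, size=4):
    """With fixed 4-byte instructions the offset is pure arithmetic --
    every decoder slot can compute its start address independently."""
    return n * size

print(nth_instruction_offset_variable([1, 3, 6, 2], 3))  # 10
print(nth_instruction_offset_fixed(3))                   # 12
```

The variable-length case is a serial chain of length calculations; the fixed case is a multiply. That difference is exactly the extra front-end hardware the Pentiums carry.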
Prefetching and Cache
Pentiums can prefetch, and they do have caches. But because Intel didn't simplify the ISA (Instruction Set Architecture) first, they have less space for cache (and prefetch logic) -- or they have to use more space and make bigger, hotter chips to get the same work done. So they did borrow the technique from RISC, and they see many of the results (performance) -- just not "as good" results.
More Registers
Intel didn't change the ISA, so they can't have more registers. You are stuck with the same old 8 registers. They do use a technique called register renaming to mimic more. Basically, this gives the chip several sets of registers, and it keeps flipping between the sets -- but this isn't nearly as versatile and powerful as having named registers and letting the compiler and programmers use them. Often the "sets" can conflict with each other, or require this context swap too often -- and the compiler and programmers have no control over it. So this technique is better than nothing, but not better than having more registers. Register renaming is a technique that RISC chips can use as well -- so there is no gain over RISC here, just a hack to minimize a flaw of the Pentium. Intel knows of this flaw, and is trying to fix it in their next processor (Merced/Itanic).
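Here is a minimal renaming sketch in Python (the tuple instruction format and register names are invented; a real renamer also manages a free list, retirement, and condition flags). It shows the one thing renaming does buy: two instructions that both write the same architectural register stop fighting over it:

```python
# Minimal register-renaming sketch over an invented mini-ISA of
# (op, dst, src1, src2) tuples.

def rename(instrs):
    mapping = {}          # architectural register -> current physical register
    next_phys = 0
    out = []
    for op, dst, *srcs in instrs:
        srcs = [mapping.get(s, s) for s in srcs]   # read current mappings
        phys = f"p{next_phys}"                     # allocate a fresh physical reg
        next_phys += 1
        mapping[dst] = phys                        # dst gets a new name, which
        out.append((op, phys, *srcs))              # kills WAW/WAR hazards
    return out

# Both instructions write 'ax', but after renaming they use different
# physical registers, so they can execute in either order.
code = [("add", "ax", "bx", "cx"), ("add", "ax", "si", "di")]
print(rename(code))
```

Note what the compiler sees: nothing. The renaming is invisible, so the programmer still has only 8 names to allocate variables into -- which is the author's complaint.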
3+ Operand Instructions
The Pentiums did nothing to fix the problem of constantly overwriting your source registers (2-operand instructions) -- they couldn't, since they didn't change the instruction set -- so they see none of the advantages of 3-operand instructions. Intel knows of this flaw, and is trying to fix it in their next processor (Merced/Itanic).
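The cost is easy to see in generated code. A toy codegen sketch (generic mnemonics, not real x86 or PowerPC syntax) for computing d = a + b while keeping a alive afterwards:

```python
# Toy codegen sketch: 'd = a + b' where 'a' must survive the operation.

def two_operand(a, b, d):
    # 2-operand form: 'add dst, src' destroys dst, so the source must be
    # copied into the destination register first to preserve it.
    return [("mov", d, a), ("add", d, b)]

def three_operand(a, b, d):
    # 3-operand form: the destination is independent of both sources.
    return [("add", d, a, b)]

print(len(two_operand("a", "b", "d")), "instructions")    # 2
print(len(three_operand("a", "b", "d")), "instructions")  # 1
```

One extra move per such operation sounds small, but compilers generate these constantly -- it inflates code size and burns issue slots.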
Superscalar
Even when you look at the RISC-like core, you realize it isn't really a good RISC. They didn't have room for a lot of superscalar (parallel) execution units, so the secondary units are usually stripped-down units that only handle some subset of the functionality. And of course, to do that they still had to add fancy hardware to help with the scheduling (to use the specialized units well) -- and even then, it requires very fancy scheduling in the compiler as well. Doing all this, they can minimize the flaws -- but usually that only works in best-case scenarios.
This is another reason why Intel's best-case numbers (like SPEC) offer misleading results compared to their average- or worst-case scenarios. Now don't get me wrong, most RISC chips have some special or limited units as well -- it is just that the Pentiums seem to be more limited and more special-cased -- making things more complex.
Pipelines, Out-of-Order Execution and Branch Prediction
These areas are where Intel did very well.
The pipes are deep but expensive -- they had to go very deep just to improve anything. The deeper you go in a pipe, the simpler each stage is, but the more complexity you have to create overall. Also the deeper you go, the faster in MHz your processor runs, but the more penalties you have when you stall and the less actually gets done in each cycle. So Intel has to run the processor faster (MHz), just to keep up with slower RISC chips. Overall -- they put in a very powerful pipeline, just to make up for the inefficiencies of the instruction set.
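You can see the tradeoff in a back-of-envelope model (every number here is invented for illustration -- real stage delays, branch frequencies, and flush penalties vary widely):

```python
# Back-of-envelope pipeline model (all parameters invented for illustration):
# deeper pipes shorten the cycle but lengthen the misprediction penalty.

def instructions_per_second(depth, base_stage_ns=10.0,
                            branch_freq=0.2, mispredict_rate=0.1):
    cycle_ns = base_stage_ns / depth    # deeper pipe -> shorter stage -> higher MHz
    penalty_cycles = depth              # a flush costs roughly the pipe depth
    cpi = 1.0 + branch_freq * mispredict_rate * penalty_cycles
    return 1e9 / (cycle_ns * cpi)       # instructions per second

shallow = instructions_per_second(depth=5)
deep = instructions_per_second(depth=20)
print(f"depth 5: {shallow:.2e}/s   depth 20: {deep:.2e}/s")
```

In this model the clock went up 4x with the deeper pipe, but throughput goes up by less than 4x -- the misprediction penalty grew with the depth, so less gets done per cycle. That is exactly the tradeoff described above: more MHz, more complexity, lower efficiency per clock.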
Intel didn't stop with deep pipes. The out-of-order execution hardware (reorder buffers) is usually tied to pipe depth -- so Intel had to keep this part in-order with the pipelines. So Intel put a lot of work and complexity in here.
Branch prediction becomes even more of an issue the deeper you go in pipes. Since a stall is a much bigger penalty with deeper pipes (more stages), they really had to work hard to avoid those stalls. Branches are common causes of stalls -- so they spent a lot of time in this area as well. They implemented some of the most complex branch prediction logic around -- and saw some performance rewards for doing it.
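For reference, even the simplest standard predictor -- a 2-bit saturating counter (a textbook scheme; the Pentiums' actual predictors are considerably more elaborate) -- shows why the hardware pays off:

```python
# Classic 2-bit saturating-counter branch predictor (textbook scheme).

class TwoBitPredictor:
    def __init__(self):
        self.state = 2          # 0-1 predict not-taken, 2-3 predict taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

# A loop branch: taken nine times, then not-taken once at loop exit.
p = TwoBitPredictor()
history = [True] * 9 + [False]
hits = 0
for taken in history:
    if p.predict() == taken:
        hits += 1
    p.update(taken)
print(hits, "of", len(history))   # 9 of 10: only the loop exit mispredicts
```

Because the counter needs two wrong guesses before it flips direction, it stays locked onto a loop branch and misses only at the exit -- one flush per loop instead of one per surprise.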
Deep pipes, complex branch prediction, and large out-of-order buffers aren't bad -- you are just trading off design complexity for some speed (and less being done at a given speed). They are tied together in design (if you want to balance them and perform well) -- and Intel did a good job. But Intel added lots of design complexity (which includes heat and size and cost) because they needed to. Even though these techniques were first used in RISC, they aren't RISC-only concepts -- many non-RISC chips use them too.
Remember, RISC is about tradeoffs and philosophy -- and going too far towards complexity in your design seems "not very RISC-like" when it comes to philosophy. So Intel did well (in performance) -- but we can argue whether the design was balanced or "RISC-like". I personally think they leaned way too heavily on "more transistors" and complexity to solve their problems. The P6 took two or three iterations to get working right, and another one (or two) to really shine.
The real tradeoffs of RISC were supposed to net many things: faster time to market, lower design cost, less complexity, fewer transistors, less size, less heat, less power, cheaper manufacturing, and so on. Intel realized none of them. The Pentiums took them a long time to design (and get to market), and the design costs were astronomical -- I've seen estimates that start at 10 times the design costs of PowerPCs at the same performance. Intel didn't reduce the complexity -- they only increased it: more transistors, heat, size, power, cost to manufacture, and so on. Intel has done an amazing job of hacking onto an old dying architecture and dragging it along kicking and screaming. And there is no doubt that the designers claiming "CISC is dead" were premature -- you can evolve CISC, if you have enough money to burn. But the point is that the Pentium isn't really RISC. You don't program in RISC, you get none of the efficiency benefits of RISC, and they didn't take any of the old garbage out -- it is like trying to cure owning too much junk by buying more boxes to hide it in; that doesn't cure the problem, it just hides the symptom.
Splitting the processor into two parts (the CISC preprocessor and the RISC core) helped Intel achieve some goals (like speed) -- but they only achieved speed by spending lots of space (transistors, heat, and die area) and money (and time) to get roughly the same performance as a PowerPC that is half their size (or less). The Athlon and the PIII demonstrate that you can make fast CISC chips (that borrow RISC concepts and back ends), but only if you have a power budget that is an order of magnitude larger than the competitive RISC machines'. On a desktop the power and heat budget doesn't matter much to most people (though it probably should matter more) -- in a laptop or other low-end device it is a VERY big deal.
The CISC-RISC hybrids (Pentium and Athlon) are all-out designs that really should be compared to the Alpha 21264s and Power4s (where performance is designed in without regard for other costs) -- and the CISC designs fall short compared to those RISC designs. Compare the Pentiums meant for portable designs to the PowerPCs, and they fall flat on their faces; they aren't in the same league.
Adding complexity (to get the same performance) isn't the design goal of RISC, and it certainly isn't the same design philosophy. In fact, the old CISC half of the Pentiums is just wasted space (power, heat, cost, complexity, time) which prevents the RISC part from being a really good design. If Intel threw away the CISC preprocessor and went to real RISC, they could achieve more performance in less area, with less power consumption, less cost to manufacture, and less heat produced. That chip could be called RISC. Intel knows this, and that is what they are trying to do (though still with an x86 emulator) with their Merced (Itanium) chip. So Intel itself knows the truth, and they are trying to use the latest RISC concepts (and then some, which are evolving into what we should start calling post-RISC) -- but it is still Intel's attempt to catch up with (or leap ahead of) the rest of the industry. Eventually Intel might even drop the tumorous growth (the x86 emulator) off the Itanium and make it a more efficient design -- but we'll have to wait and see.
RISC is about factoring the instructions to save space and complexity and use that savings elsewhere. CISC was about keeping lots of older instructions and just adding more and more in. Breaking down CISC instructions into something smaller is nothing new -- that has been done since the 70's and was called microcode. This was being done before RISC and was part of the inspiration to create RISC in the first place. So now Intel has added a more sophisticated microcode engine that is almost a RISC-engine -- that isn't a revolution, and that isn't the same as RISC. In fact the whole point of going to RISC was to eliminate microcode and certainly not to create more of it.
RISC is just a design philosophy -- it doesn't cure world hunger, and the name/philosophy itself doesn't guarantee a good implementation. Why some people are insistent that their processors are RISC, when clearly they are not, is beyond me. So the next time someone tells you the Athlon, Celeron, PentiumPro, PentiumII, or PentiumIII is RISC, you can chuckle knowingly, pat them on the head, and send them on their way -- they obviously just don't understand RISC, or what they are talking about. Or worse: they do understand, and they are trying to misinform or confuse people because of their own insecurities, or because the truth is something they just don't want to face.