Technical Response to the State of Mac Processors
Response #2

By:David K. Every
©Copyright 1999

I keep getting lots of questions about articles over at MacObserver / Webintosh.

This article will respond to the first of the articles.

Technical Analysis Of The Mac Processor Technology, Part II

Tuan Truong has some valid points in the articles -- but the truth gets obfuscated by the opinion side of the article. There is not much balance -- not because of what Tuan says, but because of what he does not say. So I will break apart the articles and try to explain things more clearly.

There are two problems to look at. They are both interrelated. One is the microprocessor company Apple buys its CPU and CPU subsystems from, and the other is the computer architecture Apple follows. My strategy is for Apple to team up with and/or invest in a microprocessor company whose main, if not only, source of revenue is from desktop microprocessors, so that Apple would have more influence on it than it would as just a buyer. Intimately related to this is the computer architecture Apple uses. It's essentially the same as x86 computer systems: a microprocessor communicating with its subsystems on a shared memory and I/O bus. Apple, along with IBM and Motorola, cannot compete with the x86 market's economies of scale, so it simply cannot mimic the x86 computer architecture and expect to be cheaper, especially when it doesn't have a CPU and CPU subsystem that are prominently faster than what's available in the x86 market. A new architecture and strategy needs to be considered.

How do I say this delicately?

How about, "Wrong, Wrong, Wrong".

IBM and Motorola do not have the same economies of scale as the x86 makers because they have their own dynamic, and possibly a better one. Motorola moves more processors than Intel does -- they are just far simpler embedded processors. IBM doesn't sell as many as Intel (though they have a large embedded market as well) -- but IBM may make more money on their processors, since IBM's market is much more high end. (Since IBM is the biggest consumer of IBM processors, it isn't always clear how much is being made.) Both those dollar and volume amounts help subsidize the R&D for the PowerPC processors. Each company and market is a tad specialized, which gets them to "tailor" or customize in slightly different directions -- but for a general processor, you are really only paying for the "difference" in R&D between the slightly customized flavors and the more generic ones.

Plus, let's look at this thing logically. PowerPCs use the same fabrication techniques that Motorola and IBM must keep sharp to compete in their own areas. Having dual sources (IBM and Motorola) keeps them both competitive and drives innovation. In fact, Macs and PCs are more similar than they are different. They use the same RAM, many of the same support chips, the same PCI controllers (for cards), many of the same cards, disk drives, DVD drives, graphics cards, keyboards, mice, printers, and so on. I would say probably 90% of the economies of scale that help x86 PCs are free to PowerPC-based machines. So what we are really discussing is the remaining 10% (at MOST): the processor and memory controller. (For current designs the memory controller is also an I/O controller.)

So Motorola and IBM need to push the processor design anyway! The difference is that in the Motorola arena they are pushing a little harder on low power (embedded controllers). Of course, by listening to Apple and putting in the AltiVec unit, they ended up with a far better processor, one that is now allowing Motorola to go into new markets. Despite some silliness (and lack of vision) on Motorola management's part, there have to be a few people with enough brain cells to figure out that this cooperative development helps them. IBM is in about the same boat. But remember, there is tons of overlap (or at least there is the potential for it). Let's say that Motorola figures out a way to improve the processor for speed and heat (power) -- the result is that Apple's processors get that gain too.

For a very slight investment, let's say that Apple and Motorola cooperate to use Motorola's size and power advantage to make multi-core processors (really two or four processors on one die). The reality is that the x86 crowd can't follow in this direction as easily -- since their die is bigger (and the yield cost of the added size would drive costs up far more than the benefits are worth). So it is a slight investment on Motorola's side that is likely to break their embedded controllers into whole new markets -- and in fact could allow them to build fault-tolerant machines as well (with some software voting). So almost everything done for Apple helps Motorola. How long until Motorola figures this out? The same goes for IBM. Only one company's management has to have a clue.
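The "software voting" idea can be sketched very simply. This is a purely hypothetical illustration (not anything Apple or Motorola actually built): several redundant cores run the same computation, and a majority vote masks a single faulty result -- the basis of classic triple modular redundancy.

```python
from collections import Counter

def majority_vote(results):
    """Return the value that most of the redundant cores agree on.

    With three or more cores computing the same answer, a single
    faulty core is simply outvoted (triple modular redundancy).
    """
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority -- too many cores disagree")
    return value

# Three redundant "cores" compute the same value; one glitches.
print(majority_vote([42, 42, 41]))  # -> 42
```

With two cores you can only detect a disagreement; with three you can outvote it, which is why fault-tolerant designs usually start at triple redundancy.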

Now, the only thing that is left is the memory and I/O controller (support chips). Apple currently designs, and has in the past designed, its own support chips. Apple is basically trying to drive the rest of the Mac down onto a single chip. The cost in time on these chips is fairly fixed -- but the cost in money (compared to the machine) is not very high at all. Apple gets to specialize for its market, and since it is spreading one design (or a few very related designs) over all Macs made, this cost is low. In fact, the market for Apple's support chip (say UniNorth) is likely to be far larger than the market for most of the support chipsets for the x86 processor. (There are many companies that make support chips for x86 -- so their market is very fragmented.)

If Apple starts to try to migrate to a new processor or a seriously different architecture, that means that they fragment their own market and drive UP the relative costs, since more of the costs have to be incurred twice. And these are just some of the flaws in the idea of economies of scale and changing -- and we are not even getting into the fact that there aren't any dramatically superior architectures out there right now that have even a fraction of the market size.

The architecture I would consider is like the following: a central computing module attached to a switched fabric based I/O logic board. The central computing module would have a CPU with an embedded memory controller and an embedded graphics accelerator (or a vector unit that can emulate it), RAM, and Boot ROM. With the memory controller and graphics accelerator on the chip, it puts the bandwidth to and from CPU, graphics and memory at the performance limit.

The CPU design can be any high performance CPU: PowerPC, MIPS, and even the PSX2's Emotion Engine. Something like this would require Apple to have a major influence on the microprocessor company; hence, the need for investment. The switched fabric I/O is a serial link, packet switched I/O bus architecture as opposed to the memory mapped shared PCI bus.

Its performance is analogous to the difference between Fibre Channel or FireWire and SCSI. It can be found in various workstations and mainframes today. It will allow very easy hot swap, better streaming performance, and easy expansion. PCI will not be abandoned since it can be placed on one of the switched fabric links. Intel is slated to use a version of it for servers and high-end workstations in a year or so.

Engineering is always a balancing act. If Apple goes to such an architecture too soon, that means that they have to design everything themselves -- and they can't share components with PCs. So this likely means a marginal increase in speed and a dramatic increase in costs. That isn't bad on the high end, but it would kill volume and the low end. Then you have to see if you can shrink this down (in power and size) and get it into portable and low-cost solutions. I just don't think it is wise for Apple to try to customize in this way.

Memory is becoming a limiting factor to processor performance. But the fact is that PowerPCs are smaller and produce less heat -- and they have larger caches, which help alleviate some of the processor bottleneck. It would be far smarter to put large L2 caches on board and increase performance that way (and reduce memory bottlenecks). But that is only one way. There are a dozen ways to solve many of these problems in far more conservative ways than what Tuan advocates. So I believe Tuan is advocating what he thinks is "cool" and cutting edge, which it is, but without thinking of the consequences or thinking about waiting for the right time (marketing-wise) to implement said changes.
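The cache argument can be made concrete with the standard average-memory-access-time formula. The latency and miss-rate numbers below are purely illustrative, not measurements of any real PowerPC:

```python
def amat_ns(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, mem_latency):
    """Average memory access time (ns) for a two-level cache hierarchy."""
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * mem_latency)

# Illustrative numbers only: 2 ns L1, 10 ns L2, 100 ns main memory.
no_l2   = 2 + 0.05 * 100                    # 5% of accesses hit slow RAM
with_l2 = amat_ns(2, 0.05, 10, 0.25, 100)   # L2 catches 75% of L1 misses
print(no_l2, with_l2)  # 7.0 vs 3.75 -- the L2 hides most memory latency
```

Even a modest L2 cuts the average access time nearly in half in this toy example, which is the conservative win being argued for here.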

For example, let's look at AGP. AGP is basically Intel copying (conceptually) Apple's long-used PDS slots (with a slightly different format). It has taken years, but it is starting to go mainstream, and more than just the very high end are using AGP. Intel is about to come out with a faster AGP. If Apple had implemented AGP from the start, they would have been impacted by a lack of supply for support chips and by potential bugs in implementations and drivers. Plus, the reality is that in the real world, for 95% of what people do, there is only a nominal performance increase from using last-generation AGP (over using FAST PCI, or PCI @ 66 MHz). So it is a case of sounding cool -- but not being nearly as cool as the hype. When the bugs get worked out, when there is a bigger advantage to moving to AGP (like the x4 AGP), and when there are enough suppliers and cards, I expect that Apple will make the move. But it is all a case of measuring timing versus rewards -- not just who can be the fastest in theory.
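The AGP-versus-fast-PCI point can be checked with back-of-the-envelope peak bandwidths. These are nominal figures (clock times bus width times transfers per clock, with 66 MHz rounded down from 66.67), and real-world throughput is always lower:

```python
def peak_mb_per_s(clock_mhz, width_bits, transfers_per_clock=1):
    """Nominal peak bandwidth: clock x bus width x transfers per clock."""
    return clock_mhz * (width_bits // 8) * transfers_per_clock

for name, bw in [
    ("PCI @ 33 MHz, 32-bit",      peak_mb_per_s(33, 32)),
    ("Fast PCI @ 66 MHz, 32-bit", peak_mb_per_s(66, 32)),
    ("AGP 1x (66 MHz)",           peak_mb_per_s(66, 32, 1)),
    ("AGP 2x",                    peak_mb_per_s(66, 32, 2)),
    ("AGP 4x",                    peak_mb_per_s(66, 32, 4)),
]:
    print(f"{name:26s} ~{bw:4d} MB/s")
```

On paper, AGP 1x is no faster than 66 MHz PCI; only the 2x and 4x modes pull clearly ahead -- which is exactly the argument for waiting until the faster modes (and their suppliers) are ready.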

To visualize this better, look at the Apple Yosemite (the Blue Professional PowerMac G3) main logic board diagram and draw a line between the primary PCI bus and the secondary PCI bus. Everything to the left of the secondary PCI bus is on this multichip central computing module. The advantage of this central computing module is that performance for CPU, graphics, memory, and some I/O will increase automatically with clock rate increases. If a more versatile graphics option is preferred, a graphics bus can be built directly into the CPU's bus, similar to the current backside L2 cache in the PPC 750, allowing for a very high bandwidth bus with very small latencies.

So what would this do? It would mean that the graphics chips or cards that Apple uses would have to be custom-designed for their processor bus. Right now Apple is using ATI's economies of scale in producing PC solutions to drive down the costs of Apple's solutions. But under this new design philosophy, ATI would have to customize for Apple -- which would drive up costs and increase development time, and ATI management may be totally uninterested in this "islanded" solution.

More than that, I'm not even sure that is where a substantial bottleneck resides. If you can't saturate the current bus (solution), then making it wider or faster gains you nothing. If a change makes the bus 10 times faster, but the bus is only where you spend 10% of your time, then you've really only gained a few percentage points in total performance -- and that may not be worth the costs. Lastly, because there is a bridge chip and a split bus in the current design, the graphics card can talk to other cards directly -- which in some cases may increase performance (by NOT holding up the main memory bus). I doubt that it is true in this case, but in some cases the design "improvement" could actually reduce overall performance (by putting more load on the processor / memory bus). System design is a lot more than just making components go faster!
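The "10x faster on 10% of the time" reasoning is just Amdahl's law, which is easy to check:

```python
def overall_speedup(fraction, component_speedup):
    """Amdahl's law: total speedup when only `fraction` of the
    time is affected by a component made `component_speedup`x faster."""
    return 1 / ((1 - fraction) + fraction / component_speedup)

# A bus made 10x faster, but only involved in 10% of total time:
print(round(overall_speedup(0.10, 10), 3))  # 1.099 -- barely a 10% gain
```

Even an infinitely fast bus would cap out at a 1/(1 - 0.10) = 1.11x overall gain in this scenario, so the component's share of total time matters far more than its raw speedup.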

This strategy gives Apple a generational leap in computer architecture and, most important of all, much more control over the most important part of their computer systems. It's a promise heard before, but reality disproved that optimism during the heady days of PowerPC's birth. Apple has already lived through a crisis like this with the Motorola 68k; it should take steps so that it doesn't repeat those mistakes with the PowerPC.

I don't see how changing their architecture to a more proprietary, higher-cost solution is likely to cure the problems Motorola (or IBM) has with management and vision. I'm not even sure that such a solution would offer any performance difference over the current architecture.

So Apple is likely to improve things with its own memory controllers. For example, the MaxBus (G4 / 7400) can interleave accesses to memory; Grackle (the current memory controller) does not support this. The next-generation memory controller, just by supporting this one function, could let the PowerPC interleave reads and writes, potentially doubling or even tripling memory performance without changing the bus architecture in any substantial way. Isn't that a better use of design dollars? Apple will still be able to use cutting-edge memories -- but Apple's size (large market) actually means that Apple has to wait a bit until suppliers can meet its huge demands. Jumping too soon into a technology is only a way for Apple to shoot itself in the head, by constricting supply and losing sales.
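A toy timing model shows why interleaving alone can be worth a generational bus change. The latency numbers are illustrative, not actual Grackle or MaxBus timings: without interleaving, each access waits for the previous one; with interleaved banks, accesses overlap.

```python
def serialized_ns(n, latency_ns):
    """Non-interleaved controller: each access waits for the last."""
    return n * latency_ns

def interleaved_ns(n, latency_ns, banks):
    """Interleaved banks overlap accesses: after the first access
    fills the pipeline, one completes every latency/banks ns."""
    return latency_ns + (n - 1) * latency_ns / banks

n, lat = 100, 100                 # 100 accesses at 100 ns each
print(serialized_ns(n, lat))      # -> 10000
print(interleaved_ns(n, lat, 2))  # -> 5050.0 (nearly 2x faster)
print(interleaved_ns(n, lat, 4))  # -> 2575.0 (nearly 4x faster)
```

That "double or even triple" figure falls right out of two- to four-way interleaving in this model, with no change to bus width or clock.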

So while I do not doubt that sexy new concepts like fabrics and switched channels are neat -- there is a time to implement them. The cutting edge can be the bleeding edge. Apple needs to remain at the front -- but not too far in front (or costs go up). Computers are complex systems -- and there are many, many ways to achieve similar goals. Think of a car. If you want it to go faster, you can put in a bigger motor, tune or turbocharge/supercharge the one you have, bottle it (Nitrous Oxide boost), or you can just lower the weight of the car. Many car nuts love the bottle and the horsepower -- but many wise manufacturers focus on better torque curves, tuning, lowering weight, and so on. Many techno-advocates are like car nuts -- they want more and more, and are personally willing to pay any price to get there -- but they are not looking at market realities and what consumers really want or need. (No offense to Tuan is intended.)

Created: 03/21/98
Updated: 11/09/02
