Processors
When do we need 64 bit?

By: David K. Every
© Copyright 1999

People are reading the "roadmaps" and hearing about 64 bit processing, and asking questions like "isn't the PPC already 64 or 128 bit?" Unfortunately, the size of a computer is not an easy thing to pin down -- there are a lot of "sizes" in a computer -- but I'll try to answer lots of questions (asked or unasked) about size: which sizes matter, and why.

There are three ways to measure a processor's "size":

How many bits of data a processor works with

How many bits a processor uses to address memory

How many bits can move around at once

History of data size

A processor works with certain data sizes at a time.

Microcomputers started at 4 bits (with the Intel 4004). That turned out to be too little data to do anything of value -- even then, a character took 8 bits, so with a 4 bit computer you always had to run multiple 4 bit instructions to process a single 8 bit chunk of data. Then 8 bits became the standard. 8 bits made sense since a single character of text (upper or lower case, and all numbers and symbols) took 7 or 8 bits to encode. So 8 bits was a good size that lasted for a few years.
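To make the "multiple 4 bit instructions" point concrete, here is a minimal sketch of adding two 8 bit values using nothing wider than 4 bit additions and a carry. The function name is made up for illustration, and this is not how the 4004 instruction set actually spells it -- it only shows the two-pass idea.

```python
def add8_on_4bit_alu(a, b):
    """Add two 8 bit values using only 4 bit additions plus a carry."""
    lo = (a & 0xF) + (b & 0xF)            # first pass: low nibbles
    carry = lo >> 4                       # carry out of the low nibble
    hi = (a >> 4) + (b >> 4) + carry      # second pass: high nibbles plus carry
    return ((hi & 0xF) << 4) | (lo & 0xF) # result wraps at 8 bits


print(hex(add8_on_4bit_alu(0x3A, 0x27)))
```

The same trick scales up: a 16 bit machine adds 32 bit numbers in two passes, which is exactly the "double-pass" cost that pushed designers toward wider processors.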

But an 8 bit number can only be a whole number between 0 and 255. To do heavy math, you needed to work with more bits at once. The more the merrier. 16 bits could get you a value between 0 and 65,535 -- which is a lot more detail in a single pass -- so 16 bits was better than 8. But if 16 bits was better for math, then 32 was better still. Lots of numbers in daily use are larger than 65,000 -- so 16 bits still required double passes to get things done. 32 bits was much better for integer math, since it allowed a range of 0 to about 4,000,000,000 (or about -2 billion to +2 billion signed). That was good enough for 99%+ of integer math.
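The ranges quoted above fall straight out of powers of two. A small sketch (the helper names are mine, and the signed range assumes the usual two's complement layout):

```python
def unsigned_range(bits):
    """Smallest and largest whole number an unsigned value of this width can hold."""
    return 0, 2**bits - 1


def signed_range(bits):
    """Range when one bit is given up for the sign (two's complement assumed)."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for bits in (8, 16, 32):
    print(bits, unsigned_range(bits), signed_range(bits))
```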

The Motorola processors basically skipped from 8 bit (6800) to 32 bit (68000) processors (around 1980) because 16 bit was only marginally better than 8 bit, and you really wanted 32 bits for most work. Intel made a half-assed step from 8 bit (8080) to 16 bit around the same time as Motorola -- but sort of got lodged in 16 bit for a while (with the 8088, 8086 and 80286), and it took them a while (6-8 years, or until about 1987) to get with the program and catch on. Intel was ever the wallflower when it came to good technology and design.

Well, 32 bits turned out to be good enough for most integer math for many years. Heck, I've been using them with Motorola processors since 1979. Before 32 bit computing, the sizes were changing every few years -- so a 20 year run sorta proves the point. 32 bits meant you could move 4 characters around at once, and work with large integers. Great stuff.

But not all math is integer math (whole numbers between 0 and 4 billion, or between -2 billion and 2 billion if you steal one bit to denote positive or negative) -- what if you want decimals or fractions (real numbers)? Well, you can cheat and encode some fractions or decimal values using 32 bits, but you lose resolution (or total range). So they are only good for some things. To encode a real number with accuracy good enough for most science and math applications (and plenty of range) you need 64 bits. But again, these are not usually used for integers -- this is for floating point. At first (in the 80s) people added special "floating point processors" or FPUs (Floating Point Units) to help the main processor do this kind of math -- and make microcomputers behave like big mainframes and lab computers. By the early 90s, floating point units got added to the main processors (and became integral) -- and we've stayed there ever since: 32 bit ints, 64 bit floating point, and 8 bit characters.
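Here is a quick way to see the lost resolution for yourself. This sketch assumes the common IEEE 754 single (32 bit) and double (64 bit) formats, which is what Python's `struct` codes `f` and `d` use; the article itself doesn't name a format, so treat that as my assumption.

```python
import struct


def round_trip(value, fmt):
    """Store a number in the given floating point format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


n = 16777217.0                # 2**24 + 1, one past what a 32 bit float can count to exactly
print(round_trip(n, "<f"))    # 32 bit float: the last digit gets rounded away
print(round_trip(n, "<d"))    # 64 bit float: stored exactly
```

A 32 bit float only has about 24 bits of precision to spend, so whole numbers past roughly 16 million start skipping values; the 64 bit format has about 53 bits and keeps going.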

How many bits of data

Well, about a decade or two after those sizes settled in, things are finally changing again. The PowerPC AltiVec (Velocity Engine) is 128 bits wide -- and we may have 64 bit integers in the near future.

So when people ask about the size of the PowerPC (G4) -- currently it is a 32 bit (integer), 64 bit (floating point), and 128 bit (vector) processor. The size of the data you are working with depends on what you are doing. When you need to move large chunks of data around, you just use the AltiVec unit, and spew data around in large 128 bit chunks. When you are dealing with most simple things (like counters, loops and addresses) you deal with 32 bits, and real numbers go into the 64 bit floating point unit.

The big change that broke the 32 bit int barrier is that the vector unit is a way for a 128 bit computer to work with smaller chunks of data. So instead of being just a super-high resolution 128 bit computer, it really behaves like 16 individual 8 bit computers, or 8 individual 16 bit computers, or 4 individual 32 bit computers -- all in parallel, at the same time. The data sizes of 8, 16 and 32 bits were good enough for most things -- but now we can pair them up and make things work faster. So we still don't really need 64 bit integers to work with larger data (especially if we have 64 bit floating point and 128 bit vectors).
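The "many little computers in parallel" idea can be simulated in a few lines. This is only a sketch of the concept using plain Python integers as 128 bit registers; the function names are mine, it is not AltiVec code, and real hardware does all the lanes in one instruction rather than in a loop.

```python
def pack(values, lane_bits):
    """Pack a list of small numbers side by side into one wide integer."""
    mask = (1 << lane_bits) - 1
    wide = 0
    for i, v in enumerate(values):
        wide |= (v & mask) << (i * lane_bits)
    return wide


def unpack(wide, lane_bits, total_bits=128):
    """Split a wide integer back into its lanes."""
    mask = (1 << lane_bits) - 1
    return [(wide >> s) & mask for s in range(0, total_bits, lane_bits)]


def packed_add(a, b, lane_bits, total_bits=128):
    """Add lane by lane; a carry never spills into the neighboring lane."""
    mask = (1 << lane_bits) - 1
    result = 0
    for s in range(0, total_bits, lane_bits):
        lane = ((a >> s) & mask) + ((b >> s) & mask)
        result |= (lane & mask) << s      # each lane wraps on overflow
    return result


a = pack(range(16), 8)          # sixteen 8 bit values in one 128 bit register
b = pack([10] * 16, 8)
print(unpack(packed_add(a, b, 8), 8))
```

Change `lane_bits` to 16 or 32 and the same 128 bits act as 8 or 4 wider lanes instead.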

Yet everything changes, and things are marching forward. After nearly 30 (or 40) years (depending on how you count it) we are evolving away from 8 bits to represent a character -- and going to 16 bits (Unicode). This allows a character to represent any character (in any character set) in the world -- not just Roman characters. So we are seeing some reasons to migrate towards larger data sizes. But not much reason -- a 16 bit character still fits nicely into a 32 bit integer, and the 128 bit AltiVec can move around 8 Unicode characters at once (more than older processors could move bytes) -- so we aren't forced to go to 64 bit ints for that reason.

The bigger reason for the sudden move is memory. While 32 bit integers are good enough for 99% of computing math, things are easier to do in your computer if your processor's integers match the total size of its addresses. So read on to get more of the picture of why we will go to 64 bit integers, even if they aren't going to be a huge win (in performance).

How many bits of address

Now a computer has an address for each and every memory location. 8 bits of address means that your computer can address 256 locations -- each usually being a byte long. 256 addresses wasn't much -- so even 8 bit computers would often work with 16 bits of address, enabling them to work with 65,536 bytes (64K of memory). You'd be surprised what we could do with computers back then, even with that little memory.

Nowadays we have 32 bit addresses -- and a 32 bit address can deal with 4 billion locations (4 gigabytes of memory). 32 bit addresses have been standard for quite some time, and will be for a while. Let's face it, we aren't bumping our heads on the 4 gigabyte RAM limit yet, and we have a few years before that much memory is going to be standard on a personal computer. But we will eventually get to where we need more than 4 gigabytes -- and at the rate things are growing, it will be in the next 5 or 10 years. So designers are looking at jumping to 64 bits, or roughly 16 exabytes (quintillions of bytes) of memory, to prepare for the future.
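All of those limits are just two raised to the number of address bits, assuming one byte per address as described above. A tiny sketch (the function name is mine):

```python
def addressable_bytes(address_bits):
    """How many byte-sized locations an address of this width can reach."""
    return 2 ** address_bits


for bits in (8, 16, 32, 64):
    print(f"{bits:2d} bit addresses reach {addressable_bytes(bits):,} bytes")
```

Note that every extra address bit doubles the reach, so going from 32 to 64 bits doesn't double the limit; it multiplies it by about 4 billion.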

To give you some perspective: 16 exabytes of memory would cost around $6 Billion (not counting volume discounts) at today's RAM prices.
The naming goes: mega (million), giga (billion), tera (trillion), peta (quadrillion), exa (quintillion). While these are the popular terms, they are not technically correct. Peta means quadrillion -- but a petabyte usually means 2^50 bytes, which is really 1.1259 quadrillion. And exabyte is worse: it means 2^60, or 1.15... quintillion. So 16 exabytes is actually about 18.4 quintillion bytes of memory. But what's an extra 2.4 quintillion bytes of memory among friends?
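If you want to check how far the power-of-two "bytes" drift from the round decimal names, the arithmetic is short. A sketch, with the helper name being my own:

```python
def drift(power_of_two, power_of_ten):
    """How many decimal units a binary unit really contains."""
    return 2**power_of_two / 10**power_of_ten


print(round(drift(50, 15), 4))   # one binary petabyte, in quadrillions of bytes
print(round(drift(60, 18), 4))   # one binary exabyte, in quintillions of bytes
print(round(drift(64, 18), 4))   # a full 64 bit address space, in quintillions of bytes
```

The gap grows with every step up the naming ladder, which is why it is barely noticeable at kilobytes and hard to ignore at exabytes.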

64 bits of addressing is a heck of a lot of memory, and as I said, 32 bits is good enough for most users today (and for the next 4 or 5 years or so). So 64 bit addressing is a waste of time for my needs, and for almost every home user (for now). Servers and scientific computers can use more, because they are often working with huge chunks of data. They can also get by with less memory addressing -- but it creates a little more software work for them, and no one likes more work.

And remember, there is a relationship between your address size and your data size. It is better if they are the same, but they don't have to be. Just because you have 64 bit addresses doesn't mean that you have to go to 64 bit data -- for many years, Intel proved that: while they only had 16 bit computers, they had 20 bit addressing -- and older 8 bit computers had 16 bits of addressing. So it isn't hard to work around the problems -- it just makes a little more software work, since your computer can't do the math (in a single instruction) to directly calculate any location in memory. But there are many techniques to get around this, like segmenting, relative offsets, branch islands, paged memory and other little "tricks" to solve the problems. Most aren't even hard to work with, since compilers and the OS can hide the pain (if done well).
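As an example of one of those tricks, here is roughly how the 16 bit Intel parts reached 20 bits of address: a 16 bit segment value is shifted left 4 bits and added to a 16 bit offset. This is a sketch of that well-known scheme; the function name is mine.

```python
def physical_address(segment, offset):
    """Combine a 16 bit segment and a 16 bit offset into a 20 bit address."""
    return ((segment << 4) + offset) & 0xFFFFF   # wraps at 1 megabyte


print(hex(physical_address(0x1234, 0x0010)))
print(hex(physical_address(0x1235, 0x0000)))     # a different pair, same location
```

Neither number is bigger than 16 bits, so the processor never needs 20 bit math in a single register. The cost is that software has to keep track of which segment it is in, which is the "little more software work" mentioned above.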

The reason Intel processors sucked with their 64K segments and the 640K barrier was that the Operating Systems (DOS and Windows) sucked. Many others did a far better job with those same processor limitations. So segmenting isn't great, but it is a known problem that was easily worked around by everyone but Microsoft.

Because things will grow in the future, it is a good idea to bump your data size when you bump your address size, and get it over with (and avoid the minor issues of segmenting). Since we are going to bump address sizes in the near future with Merced and PowerPC, I expect that 64 bit integers will become the norm for those reasons -- not because of speed.

How many bits can move at once

Now ironically, just because a computer (processor) works with certain sized data (in registers) doesn't mean that is the same size (number of bits) that it moves in, out and around. In fact, there are a few levels here as well. There is the size of the internal registers, the size of the math it can do at once (the ALU, or Arithmetic Logic Unit), the size of the cache (width), and the size of the bus (the channel or pipe from the cache to the memory).

The path from the processor to memory (the bus, or memory bus) may be a different width than the path from the processor to the internal cache. We used to care the most about the path from the processor to main memory (before caches) -- but nowadays, 95% of the time (or more) when the processor is accessing something, it is getting it from the cache. So the cache width is the most important, right? Not as much as you might think, because main memory is up to 10 times slower. So there is a balancing act in design, between all the sizes in your system.

Interestingly, the PowerPC (called a 32 bit processor) has a 64 bit bus. If you go off-chip, it moves 64 bits at the same time. Even for integers, it has to move 64 bits to the processor (from memory), even if it only needs to see 32 bits of what it loads.
And differing size issues are nothing new. The 68000 was a 32 bit computer (that worked with 32 bits internally, had a 32 bit ALU, and had 32 bit registers) -- but it only had a 16 bit bus. The Intel 8088, meanwhile, was an 8 bit computer that could pair registers (to pretend to be 16 bits) and had an 8 bit bus and a 16 bit ALU. Yet the press and PC-advocates called both the 8088 and the 68000 16 bit computers -- just so you know that the Intel bias is nothing new.

Internally, the PowerPC (G3) has a 64 bit bus (as does the Pentium) -- but the G4 has a 128 bit bus to cache, so when you need to move lots of data around inside the processor the G4 is much faster. Externally, the current G4s only have 64 bits to memory, but they are talking about doing 128 bits to L2 cache -- and they can do 128 bits to memory, but it would cost a lot in redesigned systems.
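Why does the width of the pipe matter? Because it decides how many trips it takes to move the same data. A back-of-the-envelope sketch: the function name and the 32 byte block size are my own assumptions for illustration, not figures from any processor manual.

```python
def transfers_needed(num_bytes, bus_bits):
    """Number of trips across the bus to move num_bytes of data."""
    bytes_per_transfer = bus_bits // 8
    return -(-num_bytes // bytes_per_transfer)   # ceiling division


BLOCK_BYTES = 32   # assumed block of data, for illustration only
for bus_bits in (16, 64, 128):
    print(f"{bus_bits:3d} bit bus: {transfers_needed(BLOCK_BYTES, bus_bits)} transfers")
```

Doubling the width halves the trips, which is where the speedup from a 128 bit path comes from; it also shows why a 32 bit value on a 16 bit bus costs two trips.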

Another internal channel determines how well the processor can do math -- the ALU. The ALU is where math is actually done inside the chip -- all the rest of the processor is basically for moving things around and doing simpler instructions (loops, branches, conditionals, etc.). The G3 has a 32 bit ALU. So to do most floating point instructions (64 bits), it takes the G3 ALU 2 cycles (passes). The G4 has a 64 bit ALU, so it takes a single cycle -- and in fact, the G4 also has another 128 bit ALU to do vector instructions (and it can do so at the same time it is doing integer and floating point). The Pentiums have a 64 bit ALU (mostly), but for some things it is as slow as (or slower than) the 32 bit ALU in the G3, yet for others it is as fast as the ALU in the G4 -- but it is never as good as the 128 bit ALU in the G4 for vectors.

So there are a lot of little areas where data is moving around -- and they are all different sizes for different reasons.

Will 64 bits matter?

Will going from a 32 bit processor to a 64 bit processor matter to the PPC? Not for most things (not by itself). In fact, having larger addresses can mean that you have to move around twice as much memory to get the same work done. Since memory is the bigger bottleneck (over the processor) right now, this means that working in 64 bit mode (addresses) is likely slower! But most of that will be hidden, and minor -- and it will have a few advantages for large databases, and for people that need huge amounts of storage. Working with 64 bit data can be slower too (if you only need to work with bytes, or 16 or 32 bits, all the extra is wasted effort) -- but most of the time it won't matter, and in a few cases it can be faster. So it matters what you are doing. Nowadays the 128 bit AltiVec unit is already in the processor, and it is better than 64 bit integers for almost anything you want to do with large chunks of data anyway. So in the real world, if the speed increases (going to the 64 bit versions of the PowerPC), it will be because of other optimizations and improvements and not because of the "larger" data size.
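To see where the extra memory traffic comes from, take a simple linked list node that holds one 32 bit integer and one pointer to the next node. A sketch using Python's `struct` module to count bytes; the node layout and function name are hypothetical, and the sizes are packed (no padding).

```python
import struct


def node_bytes(pointer_bits):
    """Packed size of a node holding a 32 bit integer and one pointer."""
    pointer_code = {32: "I", 64: "Q"}[pointer_bits]
    return struct.calcsize("<i" + pointer_code)   # "<" means no padding


print(node_bytes(32), "bytes per node with 32 bit pointers")
print(node_bytes(64), "bytes per node with 64 bit pointers")
```

The useful data didn't change, but every node got bigger, so fewer of them fit in the cache and more bytes cross the bus. Real compilers usually pad the 64 bit version out even further for alignment, so the penalty can be worse than this packed count suggests.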

What about compatibility?

I haven't read the programming specification -- but I'm sure that the old (32 bit) software will still run on new 64 bit PowerPCs. Older Apps might not take advantage of the full memory size, or might be a few percentage points slower than they could be (if recompiled) -- but they will still run. And recompiling to take advantage of the 64 bits should be a pretty painless process -- at least for PowerPCs. When Intel goes with IA64 (64 bit), they are changing their entire instruction set (to catch up with the rest of the industry, and finally add RISC concepts) -- and that will cause them lots of compatibility issues. They will have a backwards-compatible mode -- but that is a much bigger deal, and will run far slower than their native mode -- so the x86 camp will not make the migration to 64 bit very easily.

Conclusion

There is a Murphy's Law of communication (or should be) -- that no matter which way you mean something, someone else will assume you mean it a different way. So in the discussion about a computer's "size", people love to fight over which metric you mean (since they are always talking about the OTHER way of measuring).

I care mostly about data size (internal) and the size to memory. Those are the areas that matter the most to me (and to most users). Currently the PowerPCs (G4s) are 64 bit or 128 bit when I need them to be -- and they aren't wasting memory with 64 bit addresses. I consider the G4 to be a 128 bit computer -- but it is not a FULL 128 bit computer. To be perfectly correct, a 128 bit computer should not only be 128 bits internally, and have 128 bit pipes (like the G4) -- but it should also address 128 bits of memory (so you don't run into that 16 exabyte barrier) and have a 128 bit bus. Something to look forward to, I guess.

Created: 09/18/99
Updated: 11/09/02
