i-Bench
What does this matter?

By: David K. Every
©Copyright 1999


Ziff-Davis has created a new "benchmark" for measuring Internet performance -- called i-Bench. Great -- just what the world needs: another misleading benchmark. It shows PCs as being dramatically faster than Macs at doing some web things. Of course, after reviewing what I could (the online information isn't complete), I have some technical "issues" with the i-Bench methodology -- but I don't doubt their conclusions (much beyond healthy skepticism); I just disagree with everything they imply!

I have philosophical issues with handing out benchmarks "objectively" (that is, without a frame of reference). Benchmarks are confusing and misleading, especially when people don't look into the deeper issues. So there may be some technical issues with this benchmark, or maybe not -- but the real issue is that many people will miss the point, and the far bigger ramifications of what is going on.

Thresholds of performance

Dave's first rule of benchmarks: "Some benchmarks are more significant than others!"

I know that sounds obvious -- but many people don't get it. I understand people's need to measure themselves (or their things) against one another -- healthy competition and all that -- but sometimes these things just go too far. Do you consider yourself a better person because you are 1" taller than a friend, or have longer nose hair? Let's keep some perspective!

I am constantly shaking my head listening to people who care about the difference between some game doing 120 and 123 fps (frames per second) on a "QuakeMark" or something like that. People are missing the point -- thresholds of performance! For most games the threshold is around 20-30 fps. Once you are going faster than that, it really doesn't matter much. Oh, it can matter a little, but I guarantee you the difference in scores between one player at 30 fps and another at 120 fps will be almost completely skill (or psychology) -- not hardware. Once the threshold of human perception and response has been surpassed, who cares? And once you cross the threshold of monitor performance (a typical monitor refreshes at around 60 Hz), specs beyond that just won't matter.

To go truly geek -- what little performance difference there is between the 30 fps and 120 fps versions of a game is not even what people think it is. It isn't the frame rate (throughput) that matters -- what matters is the latency! If both players shoot at each other at the same time, odds are the game with the faster frame rate will respond first. Latency doesn't have to be tied to frame rate -- but in many games it is. In some cases a game could be built so that the slower frame rate version has lower latency (better response time), because it isn't wasting potential performance updating the screen when it should be responding to the user. So the significant question of why it matters was never asked or addressed, because people were paying attention to an insignificant benchmark.
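
To put rough numbers on it, here is a back-of-the-envelope sketch in Python. The ~200 ms human reaction time and the 60 Hz monitor are assumed ballpark figures of mine, not anything i-Bench measured:

    # Worst-case input-to-display delay, if latency is tied to frame rate.
    def frame_time_ms(fps):
        return 1000.0 / fps

    for fps in (30, 60, 120):
        print(f"{fps:3d} fps -> {frame_time_ms(fps):5.1f} ms per frame")

    # 30 fps -> 33.3 ms, 120 fps -> 8.3 ms: a ~25 ms difference, buried
    # inside the ~200 ms it takes a human to react -- and invisible past
    # the monitor's own 60 Hz (16.7 ms) refresh anyway.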

This is a key point -- humans are slow. Once you pass a human's threshold of performance (the lowest common denominator), it doesn't really matter how fast you go! If I created a new Compact Disc player that sampled at 166 kHz with 128-bit samples, the specs would sound great -- but while my dog or a dolphin might appreciate the increased fidelity, I wouldn't be able to. In fact, the only thing those "increased" specs would do is waste memory and performance, and probably increase potential errors and other negatives.
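
A quick sketch of what those "better" specs would cost. The CD figures (44.1 kHz, 16-bit, stereo) are the real ones; the 166 kHz / 128-bit player is the hypothetical from above:

    def bytes_per_second(sample_rate_hz, bits_per_sample, channels=2):
        return sample_rate_hz * (bits_per_sample // 8) * channels

    cd = bytes_per_second(44_100, 16)       # real CD audio: ~176 KB/s
    hyper = bytes_per_second(166_000, 128)  # hypothetical player: ~5.3 MB/s

    print(f"CD audio:    {cd:>9,} bytes/s")
    print(f"Hyper audio: {hyper:>9,} bytes/s ({hyper / cd:.0f}x the data)")
    # Roughly 30x the data, to reproduce frequencies no human can hear.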

There are things that can raise the threshold of performance -- most notably, repetition. If you have to repeat something 1,000 times, then little differences can add up to significant amounts -- and additively cross the threshold of performance. But if you aren't doing that, then it probably doesn't matter. Faster is often better -- but not always, and there are costs. So whenever you hear of a benchmark, think "thresholds of performance". How fast is fast enough?
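
A minimal sketch of that repetition math, with made-up numbers (a hypothetical 0.4-second saving per operation, and a one-minute threshold for "time I actually notice"):

    saving_per_op = 0.4   # seconds saved per operation (hypothetical)
    threshold = 60.0      # a minute of saved time: plausibly noticeable

    for repetitions in (1, 10, 100, 1_000, 10_000):
        total = saving_per_op * repetitions
        note = "  <- now it matters" if total >= threshold else ""
        print(f"{repetitions:>6} reps: {total:8.1f} s saved{note}")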

Does it matter to me?

Dave's second rule of benchmarking: "Does it matter to me?"

Look at any benchmark, and the lengths of time involved, and ask yourself that question.

If you are using something like Photoshop, doing a 3D render, or encoding MP3s, then there is a significant difference between 30 seconds and 15 seconds, or 10 minutes versus 5 minutes. You will likely decide that the time differential is significant for you -- after all, you are waiting for long, perceptible amounts of time. On the other hand, if the benchmark measures something absolutely insignificant, like the difference between .1 seconds and .0001 seconds, then you need to remember: "it doesn't matter to me" -- even though the relative speedup there (1,000x) utterly dwarfs the one in the previous example!
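
The arithmetic is worth spelling out, because the ratio and the absolute saving point in opposite directions. A small sketch using the numbers from above:

    # What you feel is the absolute time saved, not the speedup ratio.
    cases = [
        ("MP3 encode",   600.0, 300.0),    # 10 minutes vs. 5 minutes
        ("Page element",   0.1,   0.0001), # imperceptible either way
    ]
    for name, slow, fast in cases:
        print(f"{name:12s} speedup {slow / fast:8,.0f}x, "
              f"saves {slow - fast:6.1f} s of waiting")
    # The 2x case saves 300 seconds you actually sit through; the 1,000x
    # case saves a tenth of a second you will never feel.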

So sometimes a 100% performance increase matters -- and sometimes a 10,000% increase is irrelevant! Whatever complaints I have about speed on a Mac, they are not centered around browsing.

Quality goes beyond benchmarks

Dave's third, and most important rule of benchmarking: "Benchmarks are not related to quality or real world performance".

Pardon the car analogy -- but I bought my car based on a variety of things: I wanted comfort, handling, styling, quality, reliability, performance, gas mileage and so on (probably in something close to that order). Notice that the subjective measurements were the most critical items. I wanted a car that performed well -- but I wasn't going to sacrifice looks, reliability or creature comforts for it. Sure, I like to know that my car does well in some benchmarks -- but many objectively measurable specs, like mileage, horsepower, RPMs and so on, were insignificant and often misleading! My car has 100 less horsepower than cars far inferior to it, and yet it could objectively outperform them (if you used the right benchmarks).

If you really know the right benchmarks to look at -- the power curve, torque-to-weight (with gearing), or real-world performance tests -- then you can get a better idea of how a car stacks up. However, even the better performance benchmarks (quarter-mile times, skid pad numbers, stopping distance, lateral G's, etc.) weren't going to change my mind about which was the better car for me -- nor should they have. Heck, a motorcycle costing 1/4 as much could blow my car away in most benchmarks, but that doesn't make it a better car -- so once again, just reading benchmarks is misleading.

Lazy magazines with low-quality reviewers (who don't understand the many subjective differences that make up a car) just print the "objective" benchmarks. Good reviewers and magazines state all the subjective reasons why they preferred one vehicle over another -- and quite often those are not the cars with the best benchmarks. Once again, the specs mislead, and aren't all there is to quality and performance.

Computers are the same way. One machine (say a PC) can do something that looks faster, but later costs you far more time. The difference between machines in the time to load 100,000 web pages won't make up for a single time that I have to reinstall Windows and all my applications. And there are many operations in computers that make the machine slower but do important protective things that save the user time in the long term. In cars, anti-lock brakes usually increase stopping distance (a negative) -- but they give drivers more control while stopping. If you give people the benchmarks without explaining the whys, then you are doing no service to the truth or to the user's understanding. Sometimes productivity and usability are disassociated from (or diametrically opposed to) performance.

Conclusion

There are of course some serious technical questions that benchmarks like i-Bench bring up -- like why is the Mac showing up substantially slower than a PC in PC Magazine's tests? I could get into the technicalities of latency versus throughput, and how their latency-measuring benchmarks may be less important than a throughput measurement. It certainly wasn't a full-function test -- I didn't see browse speeds while doing a download in the background (something users are likely to do). I didn't see an explanation of how they are comparing a proprietary solution (like Microsoft's integrated browser) against Apple's more open solution (which allows any browser to work). Or we could talk configurations (and the suspicions I have about the design of the benchmarks themselves). But the important thing is that most of their "performance" doesn't matter.

The tests show that, for right now, some things users shouldn't care much about perform better on the PC than on the Mac. I can live with that. Apple, Netscape (AOL), Microsoft and a few others should care -- and should look into what is going on and why. However, I basically couldn't care less about i-Bench -- nor should most users. Why? It is insignificant, not relevant to me, not within my threshold of performance, and it is misleading about true productivity (my performance). I may use i-Bench to compare the relative performance of one machine to another, or one browser to another (where it has a little value). But comparing one platform to another, or implying that it says something about users and productivity, is laughable.

I ran i-Bench and looked at what it was doing -- and it isn't doing things that I do. When I used i-Bench on my Mac, it loaded pages that were displaying themselves in about half a second or less. In order to get a significant difference in time (and sampling), they had to reload the same pages hundreds of times and measure the collective amount. When *I* load a web page, I have to read it -- that is why I'm loading it, after all. The difference between a Mac loading a page in .5 seconds and a PC loading one in .1 seconds is utterly irrelevant to me. It isn't that their measurements are completely beyond my threshold of perception -- I can notice that my NT box is a little faster at loading pages than my Mac -- so overall, I imagine their conclusions have some validity. But it doesn't matter to me, because I load a page on my Mac in .5 seconds and then read the thing for 3 minutes -- increasing the load speed will make no difference to my workflow. There is no new Mac sold today that is so slow at displaying even the most complex web pages that I should care.
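
To make that concrete, a tiny sketch of the load time as a share of the whole read-a-page session. The 3-minute read time is from my own usage above, and the load times are the rough ones quoted:

    read_time = 180.0  # seconds spent actually reading the page

    for platform, load_time in (("Mac", 0.5), ("PC", 0.1)):
        total = load_time + read_time
        print(f"{platform}: load is {100 * load_time / total:.2f}% "
              "of the session")
    # Mac: 0.28%, PC: 0.06%. A 5x faster load shortens the whole session
    # by about a quarter of one percent.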

This is the true issue with benchmarks used as marketing. PC companies, magazines and users get so caught up in stupid and inane measurements that they miss the important points of computers. (If they got the important points, more of them would be buying Macs.) Every test I tried in i-Bench measured something irrelevant to my usage. I use RealAudio, and video, and QuickTime, and 3D, and virtual reality -- and the Mac is beyond the functional threshold of perception. So more is not better, and wouldn't affect my life at all. Most importantly, more performance doesn't change the quality or true usability of the system.

In fact, there is a significant chance that in this case, as in many others, more is worse! Engineering is about tradeoffs. Throwing processor resources at making a screen display in .1 seconds (instead of .5) can pull those resources away from more critical parts of the system. So instead of the computer doing important work, it is interrupting itself all the time to make things go faster -- even when that serves no function for you.

The significant facts about which platform offers the better user experience were all dodged. I have and use both an NT box and a Mac -- and 99% of the time that I browse, I do so on the Mac. Why? I can do things that are significant to me. I keep a folder of URLs in the Finder in a popup window (at the bottom of the screen). I can navigate that window and save or open URLs using spring-loaded folders. When I change my default browser (from IE to Navigator or vice versa), all my URLs work with the browser I want. I use Sherlock to do Internet finds all the time. Files unstuff themselves. I don't have to worry about the security issues of ActiveX-enabled pages, and so on. All of these things enhance my productivity, and none of them can be done on the PC (Windows), or at least not as easily. Notice that the benchmarks fail to mention any of these facts that matter to users? Things like this are what help productivity and quality (and are why, for me, the Mac is the better tool) -- and because of these lies of omission, i-Bench completely misses the point.


Created: 11/07/99
Updated: 11/09/02

