
Innovation: File Type/Creator
Who innovated what

By: David K. Every
©Copyright 1999

The Mac innovated many things in filing systems, and took things in new and better directions. One area that gets little credit, because only Mac geeks truly appreciate it, is the Type/Creator in the Filing System.

File Extensions for "Type"

An Operating System (or File System) needs to know a file's "type": what is it? If it is an application, the OS needs to know that it should be "run" when opened -- some other types need to be run as part of startup, and some are temporary files that the OS can remove if it is running out of space. Basically, the more powerful you want to make the OS (or the smarter about working with files), the more it has to know about what the files are.

In the 60's, Operating Systems used file extensions to denote what "type" a file was. An extension is usually a special delimiter in the file name (like '.') followed by an abbreviation of a few letters -- like ".app" to define application, ".exe" for executable, ".text" or ".txt" for text files, and so forth.

File extensions have many problems. One is that a user can easily rename the file (and the extension), either intentionally or by mistake, and then the system thinks the file is something other than what it is. If the user forgets the proper extension, the file is basically unusable (just useless indecipherable gibberish). This is compounded by the fact that many CLI's (Command Line Interfaces / Shells) have the ability to do wildcard and scripted renames, and it is very easy to batch rename a bunch of things wrong. It would be much better if users didn't have to know anything about these "extensions", and their behavior was hidden.

Furthermore, because extensions use a special delimiter, you can't (or shouldn't) just add a '.' in a file name -- basically because a file named "txt.txt.txt" would drive the computer insane, or at least the users. So extensions start to restrict file naming -- compounded by CLIs that use other special characters for wildcard characters, and so on.

When Microsoft stole, er, borrowed CP/M (without permission, to make a clone OS called MS-DOS), they just used the same old file extensions that had been around forever -- which were 3 characters long. 3 characters are not very descriptive -- they did allow for about 60,000 different combinations, but most are nonsensical. Realistically you probably have more like 20,000. That may sound like a lot, but remember, many apps use multiple versions or extensions (say 3 - 5 each) -- developers already collide with each other. This limitation (3 characters) made sense when computers had 16K of RAM and everything was accessed by typing -- but these extensions were an anachronism about 20 years ago.
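The arithmetic behind those figures is easy to check. This is only a rough sketch: it assumes extensions are case-insensitive and drawn from the 26 letters and 10 digits, which is my assumption and not a statement of the exact DOS rules. A figure near 60,000 corresponds to roughly 39 usable characters.

```python
# Count possible short extensions under an ASSUMED alphabet.
# Assumption: case-insensitive letters plus digits (36 symbols). The real
# set of characters DOS accepted is not modeled here.
import string

ALPHABET = string.ascii_uppercase + string.digits  # 36 symbols


def extension_count(alphabet_size: int, max_len: int = 3) -> int:
    """Number of distinct extensions of length 1..max_len."""
    return sum(alphabet_size ** n for n in range(1, max_len + 1))


exactly_three = len(ALPHABET) ** 3          # three-character extensions only
up_to_three = extension_count(len(ALPHABET))  # one, two, or three characters

print(exactly_three)  # 46656
print(up_to_three)    # 47988
print(39 ** 3)        # 59319, i.e. "about 60,000" needs ~39 symbols
```

Either way the order of magnitude is the same: tens of thousands of codes, most of them unpronounceable.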

Some other systems (CLI's) improved things, and allowed more characters after the '.', so you could have 4 or 5 or undefined size. Digital Equipment (DEC) even added versioning with another extension -- so you have "test.txt;1" and "test.txt;2". Whenever you saved a file it would tack on the ";" and next number in the series, so you had all previous versions of the file. And when you asked for "test.txt" it always gave you the highest number (latest revision). Then there were special commands to flush older versions and renumber (start over). As you can imagine, in many ways this was a nice feature 5% of the time -- and added confusion and clutter the other 95% of the time. But at least DEC was trying to add value and innovate -- and it saved my butt on a few occasions, so it wasn't all bad.
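To make the DEC behavior concrete, here is a small sketch of the ";N" scheme as described above. The function names and the list-of-strings "directory" are made up for illustration; this is not how DEC implemented it, and renumbering is left out.

```python
# Sketch of ";N" version handling. All names here are hypothetical.
import re
from typing import List, Optional

VERSIONED = re.compile(r"^(?P<base>.+);(?P<version>\d+)$")


def versions_of(directory: List[str], name: str) -> List[int]:
    """All version numbers present for `name`."""
    found = []
    for entry in directory:
        match = VERSIONED.match(entry)
        if match and match.group("base") == name:
            found.append(int(match.group("version")))
    return found


def latest_version(directory: List[str], name: str) -> Optional[str]:
    """Asking for the plain name returns the highest-numbered entry."""
    versions = versions_of(directory, name)
    return f"{name};{max(versions)}" if versions else None


def save(directory: List[str], name: str) -> str:
    """Saving never overwrites; it appends the next number in the series."""
    versions = versions_of(directory, name)
    entry = f"{name};{max(versions) + 1 if versions else 1}"
    directory.append(entry)
    return entry


def purge(directory: List[str], name: str) -> None:
    """Flush every version of `name` except the latest one."""
    keep = latest_version(directory, name)
    directory[:] = [
        e for e in directory
        if e == keep or e.rsplit(";", 1)[0] != name
    ]


files = ["test.txt;1", "test.txt;2", "notes.txt;1"]
save(files, "test.txt")
print(latest_version(files, "test.txt"))  # test.txt;3
purge(files, "test.txt")
print(files)  # ['notes.txt;1', 'test.txt;3']
```

The clutter is visible even in this toy: every save grows the directory until someone remembers to purge.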

Some others tried to make extensions (and file names) case sensitive, to give people more choices -- but that meant "name.DOC" and "name.doc" were two different file types -- which was even more confusing to users, so that wasn't a good idea at all.


Apple created a far more interesting way of solving this problem of "how to define the file's type". Apple created a four letter file type. But file type wasn't a file name extension -- they built this type-code into the Operating System (File System) itself. So the type isn't just tacked on to the end of file names -- it is a hidden and integral part of the file that users don't normally have to see, and can't accidentally rename (1).

(1) Of course programmers and geeks can intentionally rename them with many different tools -- and these tools are better and more versatile than manually renaming the file anyway.
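The difference is easiest to see side by side. The two classes below are illustrative models I made up, not real file system structures: one derives its type from the name, the other stores a four-character type code next to the name.

```python
# Toy models only; neither class mirrors an actual on-disk format.
from dataclasses import dataclass


@dataclass
class ExtensionFile:
    """File whose type is whatever follows the last '.' in its name."""
    name: str

    @property
    def file_type(self) -> str:
        # No '.' in the name means no recoverable type at all.
        return self.name.rsplit(".", 1)[1] if "." in self.name else ""


@dataclass
class MacStyleFile:
    """File whose type code is stored beside the name, not inside it."""
    name: str
    file_type: str  # four-character code such as "TEXT"


dos_file = ExtensionFile("report.txt")
mac_file = MacStyleFile("report", "TEXT")

# Rename both; only the extension-based file changes type.
dos_file.name = "report.exe"
mac_file.name = "report.exe"

print(dos_file.file_type)  # exe
print(mac_file.file_type)  # TEXT
```

Renaming touches one field in the second model and two facts in the first, which is the whole problem.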

Users can still see what type a file is -- but not by some weird abbreviation or little suffix (".xls" for Excel Spreadsheet). On the Mac, type is denoted by what a file's icon looks like. This is much clearer since the icon (image) can denote a lot of information, a lot more than a 3 character code can. For example, if it looks like a little painting or drawing, then chances are it is a painting or drawing file. If it also has a hand, or the icon's outline is diamond shaped, then it is an application that creates painting files (or drawing files). If it looks like a piece of paper that has a corner folded over, then it is a file / document. If it looks like a puzzle piece, then it is a system extension. If it has a little slider control on the bottom, then it is a system control -- and so on. It is a very descriptive visual language with very specific rules. Even in list mode (when icons are too small to see clearly) there is a column that shows the file "kind" or type, which is a readable explanation like "Microsoft Excel Spreadsheet", instead of ".xls". The whole idea of icons representing type (and creator) in a file system was another new concept (automatic type-specific file icons). This had not been done before; Xerox PARC had used icons to denote actions / buttons (controls) -- but not files, and they didn't have resource forks in which to embed a file's type. That was all Apple's innovation.

Apple's file types were only 4 characters long (again, leftovers of when a few characters in size mattered), but Apple allowed upper and lower case, and special symbols, and so on. There are a lot more possible file types (2). While 4 characters may not sound much more descriptive than 3 characters, you have to remember that it is 33% larger -- and better still because it is case sensitive. So on the PC you get "txt" files, and on the Mac they are "TEXT" -- on the PC you have ".htm" files, and on the Mac you have ".html". Even that single extra character is a lot more versatile than people realize.

(2) There are well over 100,000,000 options -- and that is only if you use the readable characters. Technically you can embed any 4 bytes (an unsigned long), so you have 4 billion possibilities. Apple also split out "creator" from "type" (more on that later in this article), which makes it much more versatile as well, so technically there is a full 64 bits' worth of possibilities (18.4 quintillion possibilities -- 1.84e19). You aren't going to run out of possibilities soon.
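Those counts can be verified in a few lines. The sketch below also shows how a four-character code packs into one 32-bit number; the helper name is mine, and big-endian byte order is assumed. Printable ASCII alone gives about 81 million codes, and the total passes 100 million once even a handful of the Mac's extended characters are counted as readable.

```python
# Arithmetic on four-character codes. `fourcc_to_int` is a hypothetical
# helper name; big-endian packing is an assumption of this sketch.
import struct


def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into a single unsigned 32-bit integer."""
    raw = code.encode("mac_roman")
    if len(raw) != 4:
        raise ValueError("type and creator codes are exactly four bytes")
    return struct.unpack(">I", raw)[0]


print(hex(fourcc_to_int("TEXT")))  # 0x54455854

print(95 ** 4)   # 81450625   (the 95 printable ASCII characters only)
print(101 ** 4)  # 104060401  (just six more readable characters)
print(2 ** 32)   # 4294967296 (any four bytes: about 4 billion)
print(2 ** 64)   # 18446744073709551616 (type + creator: about 1.84e19)
```

Because the code is just a number, comparing two types is a single integer comparison rather than a string match on a file name.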

Best of all, all the ambiguity and confusion about file type was hidden from users. Users don't care what type a file is -- the file should know what it is (thus saving the users from having to know). Exposing the extension lets users rename files incorrectly and lose the type, or simply be bothered by it, and it makes users know something that they don't need to know. Users shouldn't be changing something from an Application (.exe) to a file (.txt) anyway -- so it was stupid to expose that. On the Mac, programmers and power users can still get to type codes -- but they don't have to. 95% of Mac users know little or nothing about TYPE, all DOS users had to know about extensions, and too many Windows users still have to know.

Someone I was talking to (Chris Cooley) was explaining why he hates file extensions changing the kind of file something is. He basically used an analogy, saying, "if someone loses a male tabby cat named Sidney, then someone else finds it and renames it Fifi, then the male cat shouldn't change into a female toy poodle!" Name and type should be separate -- unless you work for some company in Redmond.

Type + Creator

Apple didn't just stop with doing a better "type" code -- while Apple was at it, they thought of many other problems and then solved them too.

Sometimes you care about a file's type -- is it an Application, Document, System Extension, and so on. Also Apple created many standard file types that were universal -- like "PICT" for picture, "MooV" (Movie), "SND" (Sound), and many more. Standard formats for many of these types (supported by the OS) were new as well -- but that doesn't have too much to do with Type/Creator. But the point is there were lots of types, and Apple expanded these ideas in many ways.

Apple also separated out a file's "creator" -- and creator codes were something completely new. Not only did files have a type (what they were), they also knew who created them (which Application made them and who they belonged to). This means that on a Mac you can have one text file (type "TEXT") that was created by Microsoft Word (creator "MSWD"), which will be opened by Word whenever you double-click the file -- and you can have another text file (type "TEXT") that was created by Simple Text ("ttxt" -- short for Teach Text, which was the original name), which will be opened by Simple Text when you double-click that file. Yet they are both the same file type, and the other Application can still open that type. Windows can't do that! Files don't know who they belong to -- so all files of a particular type can only belong to one app. On Windows, if you have a text file, then the last application installed (or set up to run all files of that type) will be the program that runs when you double-click on any text (.txt) file. You can only hide that by making proprietary formats for each program -- which then will be more confusing to open in other programs. This is because text files don't know who created them like on the Mac. Once again, Windows did a poor job of imitating the Mac's simplicity. Users don't see this right away, so many don't realize how much better the Mac is.
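A rough sketch of the two dispatch models makes the contrast plain. The lookup tables and function names are invented for illustration, and "Notepad" simply stands in for whichever program last claimed ".txt"; none of this is an actual Mac or Windows API.

```python
# Toy dispatch tables; all names and table contents are hypothetical.
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Document:
    name: str
    file_type: str  # what the file is, e.g. "TEXT"
    creator: str    # which application it belongs to, e.g. "ttxt"


APP_BY_CREATOR: Dict[str, str] = {
    "MSWD": "Microsoft Word",  # Word's creator code
    "ttxt": "SimpleText",
}
APPS_BY_TYPE: Dict[str, List[str]] = {
    "TEXT": ["SimpleText", "Microsoft Word"],
}
APP_BY_EXTENSION: Dict[str, str] = {"txt": "Notepad"}  # one owner per extension


def open_by_creator(doc: Document) -> str:
    """Prefer the creating app; fall back to any app that reads the type."""
    if doc.creator in APP_BY_CREATOR:
        return APP_BY_CREATOR[doc.creator]
    return APPS_BY_TYPE[doc.file_type][0]


def open_by_extension(filename: str) -> str:
    """Every file with the same extension goes to the same application."""
    return APP_BY_EXTENSION[filename.rsplit(".", 1)[1]]


memo = Document("Memo", "TEXT", "MSWD")
readme = Document("Read Me", "TEXT", "ttxt")

print(open_by_creator(memo))    # Microsoft Word
print(open_by_creator(readme))  # SimpleText
print(open_by_extension("memo.txt"), open_by_extension("readme.txt"))  # Notepad Notepad
```

Two files of the same type go to two different programs in the first model, and to one program in the second no matter who made them.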

Desktop Database

This Type/Creator is completely hidden from the user and is far more powerful, and versatile, and descriptive, than the type extensions of Windows and previous systems. It just works, and it is transparent. For example, when you double-click (run) a file, the Mac knows to look up the type of the file and then what to do with it. If it is an Application, it just runs... if it is a file, it runs the Application and then opens that file. This simple and obvious behavior required some more innovations. Apple had to create a Desktop Database.

When you add a file or Application to a Mac drive, the OS knows to look at the file's type/creator and add that to the Desktop Database (if it doesn't already have a record for that type+creator). That way the Mac (Desktop Database) knows what every file's icon should look like (because it knows every file's type+creator, and has an icon for that in the database). The Desktop Database also lets the OS remember what program should be run for any particular type/creator code (what to do when you double-click a file), and so on. It is all very powerful and was never done before Apple did it (innovation).
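As a mental model, the bookkeeping described above can be sketched as two lookup tables. To be clear, this is my own simplified illustration: the class, method names, icon names, and paths are all hypothetical, and it says nothing about Apple's real data format.

```python
# Simplified model of the idea; not Apple's actual structures or API.
from typing import Dict, Optional, Tuple


class DesktopDatabase:
    def __init__(self) -> None:
        self.icons: Dict[Tuple[str, str], str] = {}  # (type, creator) -> icon
        self.app_paths: Dict[str, str] = {}          # creator -> app location

    def register_application(
        self, creator: str, path: str, icons: Dict[str, str]
    ) -> None:
        """Called when an application appears on (or moves around) a drive."""
        self.app_paths[creator] = path  # always track the current location
        for file_type, icon in icons.items():
            # Only add a record if this type+creator pair is not known yet.
            self.icons.setdefault((file_type, creator), icon)

    def icon_for(self, file_type: str, creator: str) -> str:
        return self.icons.get((file_type, creator), "generic-document")

    def application_for(self, creator: str) -> Optional[str]:
        return self.app_paths.get(creator)


db = DesktopDatabase()
db.register_application(
    "ttxt",
    "Macintosh HD:Applications:SimpleText",
    {"APPL": "simpletext-app", "TEXT": "simpletext-document"},
)

print(db.icon_for("TEXT", "ttxt"))  # simpletext-document
print(db.icon_for("TEXT", "????"))  # generic-document

# Moving the application only changes where the creator code points.
db.register_application("ttxt", "Macintosh HD:Stuff:SimpleText", {})
print(db.application_for("ttxt"))   # Macintosh HD:Stuff:SimpleText
```

Since documents refer to a creator code and never to a path, nothing in the documents themselves has to change when the application moves.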

It took Microsoft over a decade to mimic this behavior properly -- it wasn't until Windows 95 (about 11 or 12 years later) that they created what is called a registry file. The registry file is like a primitive desktop database, and it tells the OS what Applications and files are bound together, what Icons should be used for what files and so on, all cross-referenced by file extension. Still, the registry file can't handle type and creator as two distinct things -- they are sort of mashed into this one thing that has a lot less versatility. It was a primitive hack to try to mimic what Apple designed into the system (instead of bolted on after the fact). This is why the registry file has far more problems than the Desktop Database, and why the Desktop Database is automatic and can rebuild itself, and so on -- while too many PC geeks need to know how to manually edit the registry file to fix things.

I think NeXT and UNIX UI's use a similar system to the registry file, and they are also dependent on paths for where Apps will be contained, and so you can't just move Apps anywhere. The Mac can handle just copying a program anywhere on a drive, or moving it, and having it work. On Windows you "install" and "uninstall" so that it can add things and fix things in the registry file (when it decides to work). On UNIX you don't move files and expect things to still work.


I still think the Mac's type and creator are great ways of solving the problem, and clearer than most. Apple did get bitten a little bit, because other systems don't use Type/Creator, and so they don't know how to handle things that they should. The Mac goes a long way with automagically remapping file extensions to type/creator -- but it still has a ways to go.

I think Apple has been talking about a naming convention for files (that are being transferred across the Internet) so that a file can recreate its type and creator information just from a code embedded in the file's name (but this can be tricky since it could take 8 characters, or more, tacked on to the name). Once transferred, the file would automatically generate type/creator and strip out the extra characters -- but we'll see if this ever comes about.
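Just to show the shape of the idea, here is one way such a scheme could work. The format is entirely made up for this sketch (a "#" marker followed by the 4-character type and 4-character creator); it is not Apple's convention, and real codes containing odd characters would need escaping.

```python
# Purely hypothetical naming scheme; not an actual Apple convention.
from typing import Optional, Tuple

MARKER = "#"  # invented separator between the name and the 8 code characters


def encode_name(name: str, file_type: str, creator: str) -> str:
    """Tack the type and creator onto the name before transfer."""
    if len(file_type) != 4 or len(creator) != 4:
        raise ValueError("type and creator must be four characters each")
    return f"{name}{MARKER}{file_type}{creator}"


def decode_name(transferred: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Strip the extra characters and recover (name, type, creator)."""
    suffix_len = len(MARKER) + 8
    if len(transferred) > suffix_len and transferred[-suffix_len] == MARKER:
        codes = transferred[-8:]
        return transferred[:-suffix_len], codes[:4], codes[4:]
    return transferred, None, None  # no embedded codes found


wire_name = encode_name("Read Me", "TEXT", "ttxt")
print(wire_name)                 # Read Me#TEXTttxt
print(decode_name(wire_name))    # ('Read Me', 'TEXT', 'ttxt')
print(decode_name("plain.txt"))  # ('plain.txt', None, None)
```

Even this toy shows why it is tricky: the name grows by nine characters, and an ordinary name that happens to have a "#" nine characters from its end would be misread.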

Apple pushed the boundaries -- and the leading edge is the bleeding edge. I'd take the cuts for the rewards (most of the time) -- and the rewards have been a far, far better user experience, and a better experience still for programmers and geeks. I've had a few IS people tell me that they love the Registry File, with quips like, "If it weren't for that pathetic Registry File [and file extensions], I wouldn't have anything to fix all day". They were joking... well, half joking. So Type and Creator were both innovative -- certainly the way they were implemented was. The Desktop Database had never been done before either. All of these things together contributed to a far better user experience, and an interface that is far harder to break, can repair and update itself, and just works. Maybe Windows will get there some day -- or maybe people will just keep tolerating hacky half-assed solutions by saying, "close enough".

Created: 11/22/98
Revised: 05/04/99
Updated: 11/09/02
