How Computer Languages Work
Understanding languages and OOD, OOP and Frameworks.

By: David K. Every
©Copyright 1999

Around here, and elsewhere, we computer geeks throw around TLA's (Three Letter Acronyms) and terms without ever concerning ourselves with the poor individuals who aren't "in-the-know". One perplexed individual asked me, "What the heck is a Framework, and OOD?" Poor guy didn't know what he was asking, or whom he was asking. Since the question is valid, and many others may have the same question, an article was born. Because I'm an engineer, this is the closest thing you (and he) will get to an easy answer.

Low Level Code

A computer executes very, very simple instructions, very quickly. These instructions are made up from patterns of 0's and 1's. Instructions in this form are called machine code since only a machine would want to work with it (see how processors work to learn more; that article is not yet available).

These 1's and 0's are just electronic signals inside the computer, each of which is either on or off. The two-state nature of this signal is called "binary", and the value is called a "bit". Bit is short for "binary digit".

Early computers (40's and 50's) weren't even programmed; they were literally "wired" together with little wires on a patch board to make connections for each "1", and the absence of a wire (connection) represented a "0". Wiring these things was really slow and hard to follow, and finding errors when something was wrong was nearly impossible. If something was wrong you literally had to follow every single wire on the board to search for the wrong one, and there were tens of thousands of these wires. To add a new program, designers had to go back to the patch boards and wire a new program.

There is a rumor that the term "bug" came from this time: insects would wander into a computer, their fried carcasses would cause extra connections, and these "bugs" would cause things to not work right. While it is true this happened, it is urban myth that the term came from this time. The term "bug" is much, much older than the 40's, and was used to describe glitches in machinery for decades before the computer -- but for the same reason. Bugs would fall into machinery (especially delicate mechanisms like clocks or music boxes) and mess them up! Computer "bugs" were just the carry-over of this older term.

Later, machine code was not "hard wired"; it was instead stored as values in "memory" (around the 1950's and 1960's). This was better, and allowed quicker changes, but humans would still make lots of errors trying to deal with patterns of 1's and 0's. So people started using mnemonics, substituting understandable names (or abbreviations) for these bit patterns (each of which was an "instruction" to the computer).

Mnemonics are just human descriptions for the bit pattern: for example move, jump, or branch, with some parameters. These words would get converted from their human form (their name) into their numeric form (1's and 0's) by a program called an "assembler". This mnemonic representation got called "assembly" language, because of the assembler program that does the conversion.
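To make the assembler idea concrete, here is a minimal sketch of a toy one in Python. The three mnemonics, their opcode numbers and the 8-bit layout (4 bits of opcode, 4 bits of operand) are all invented for illustration; no real processor uses this encoding.

```python
# Toy assembler: turns human-readable mnemonics into bit patterns.
# ASSUMPTION: the opcodes and the 8-bit instruction layout are made up
# for this example and do not belong to any real processor.
OPCODES = {"MOVE": 0b0001, "ADD": 0b0010, "JUMP": 0b0011}


def assemble(line):
    """Convert one line such as 'ADD 5' into a string of eight bits."""
    mnemonic, operand = line.split()
    opcode = OPCODES[mnemonic]        # look up the number behind the name
    value = int(operand)
    if not 0 <= value <= 15:
        raise ValueError("operand must fit in 4 bits")
    word = (opcode << 4) | value      # upper 4 bits: opcode, lower 4: operand
    return format(word, "08b")


program = ["MOVE 7", "ADD 5", "JUMP 0"]
machine_code = [assemble(line) for line in program]

for source, bits in zip(program, machine_code):
    print(f"{source:8} -> {bits}")
```

The programmer reads and writes the left-hand column; the machine only ever sees the right-hand one.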

Assembly and machine code are very detailed. Each instruction does very little real work. These instructions just change a few switch states, and do things that make sense electrically, but not as much sense to humans trying to model a problem (or solution) -- so programmers don't get a lot of work done with assembly.

Assembly language is so detailed that instead of asking someone to "eat", it would be like saying -- "Reach out your right hand by extending your right arm and grasp the multi-pointed utensil with the fingers of your right hand. Place food on the fork by shoveling or stabbing it with said fork. Open your mouth and bend your right arm and bring the pointed utensil towards your mouth. Keep going until the food is inside your mouth but stop bringing the fork forward (so you do not push the fork through the back of your head). Close mouth around fork (not too hard), and remove fork. Repeat the entire process until the plate is clean."... well, this isn't detailed enough, but you get the point.

After dealing with computers for 10 or 20 years, many programmers become very literal (too literal?). Just ask my very tolerant wife when she asks me to "move" because I'm blocking the Television and my initial response is to move about, without actually getting out of the way. Or ask people that ask me simple questions and get multi-paragraph answers.

High Level Code

Assembly became known as "a low level language" because programmers are talking to the computer at a very low level (you are speaking the computer's language). To increase programmer productivity, we created "higher" level languages (late 1950's - 70's).

FORTRAN (short for formula translator) is an example of one of the earlier "high" level languages. Basically, a high level language allows a programmer to use less detailed instructions, and to do more with each instruction (just like "eat" means the same thing as all those detailed instructions in the earlier example). This progress allows programmers to be more productive, and to write larger, more complex programs that can do more. But all code is only as organized as the programmer makes it -- and many programmers are not very organized (programming was a new kind of art, and there was little training -- some results were elegant, while others looked like the product of an acid tripping kindergartner doing his best imitation of Picasso).
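The "do more with each instruction" idea can be sketched in a few lines of Python. This is only an illustration: the first version spells out each small step, roughly the way low-level code must, while the second says the same thing in one statement. The variable names are made up for the example.

```python
numbers = [3, 8, 5]

# Spelled out one small step at a time, roughly how low-level code works:
total = 0
index = 0
while index < len(numbers):
    total = total + numbers[index]   # fetch one value and add it
    index = index + 1                # move on to the next slot

# One high-level instruction that means the same thing:
high_level_total = sum(numbers)

print(total, high_level_total)
```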

The nightmarish result of a bad programmer is called "spaghetti code", because looking at it later is like trying to follow an individual spaghetti noodle through a plate of spaghetti -- it goes everywhere and nowhere, and is all interwoven. Code in higher level languages could get larger and more complex than before (with lower level languages), but it would still eventually collapse under its own complexity. That is, the program would become so large and complex that it would be unmaintainable and unfixable, which was costing people lots of money. The size of that collapsing point was now larger, but it would still happen.

Part of the problem with the first high-level languages is that they could deal with only a few types of data. Programmers used those very primitive data types to try to construct everything. So programmers often used arrays of primitive types to describe more complex types -- since there was no "natural" representation of what they wanted, and they couldn't create one (easily). Programmers had to decode and recode these arrays in many different places in the code, and any errors would be catastrophic. Programmers would also just use something called a "goto" to jump around in the code (to "go" from one place in the code "to" another). This contributed to "spaghetti" code, and needed to be improved.

There had to be a better way to make languages.

Procedural Code and Data Structures

Procedural code and data structures were meant to cure the problems of spaghetti code. It was assumed that newer languages could cure the problems. That assumption was wrong. The problem of spaghetti code was 90% the fault of bad programmers or weak design, and better languages only helped with the remaining 10%. But newer languages did help a little bit. What these languages offered is called "Data Structures" and "Procedural Design" (the results are categorized as procedural code). The procedural languages became popular in the 70's and 80's with languages like Pascal and C.

Before Procedural code, programs were just one long stream of code. Programmers would jump around in that code (Memory) but there weren't any formal boundaries as to where one thing ended and the other started. (Think of an old document written on a scroll of paper.) When a programmer wanted to go from one place to another it was always a pain to find where the other thing was. Procedures were a way to divide the code into separate chunks (think of a modern book with separate pages and chapters). Instead of a programmer having to deal with a stream of random thoughts on a scroll, there was now a way to organize things into chapter groupings, and send people to a particular page. Much better.

Technically languages had procedures before procedural languages became popular. But it was procedural languages that defined how they should be used, in a predictable manner, and popularized and standardized their use.

Data structures allowed programmers to group complex sets of data into "groups" or structures. Instead of programmers having to decode an array of a primitive type of data in hundreds of places in the code (with any errors causing very bad results), the programmer would build a construct (structure) using more data types, and the programmer could pass around references to these structures. Think of the older way (using primitives) as trying to represent every denomination of money using only pennies -- while the new structures allowed programmers to work with all the different types of coins, and $1, $5, $10, $20, $50 and $100 bills... or to just write a check for the exact amount.
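A small sketch of the difference, in Python. The employee record and its fields are invented for illustration. The first half packs the record into a bare array that every routine must decode by position; the second half uses a named structure instead.

```python
from dataclasses import dataclass

# Old style: an employee packed into a bare array of numbers.
# Every routine must remember that slot 0 is the ID, slot 1 the age,
# and slot 2 the salary in cents. Get an index wrong and the data is ruined.
employee_array = [1042, 37, 5250000]


def give_raise_array(emp, percent):
    emp[2] = emp[2] + emp[2] * percent // 100   # magic index 2 = salary


# Structured style: the fields have names, so there is nothing to decode.
@dataclass
class Employee:
    employee_id: int
    age: int
    salary_cents: int


def give_raise(emp, percent):
    emp.salary_cents += emp.salary_cents * percent // 100


employee = Employee(employee_id=1042, age=37, salary_cents=5250000)
give_raise(employee, 10)          # pass around a reference to the structure
print(employee)
```

Both versions do the same arithmetic; the structured one simply makes it much harder to touch the wrong piece of data by accident.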

There is a joke that FORTRAN programmers can program FORTRAN in any language. In other words, just because the language changed does not mean that the way it was used would change as well. The result was that bad programmers could now write spaghetti code with meatballs and clam sauce. Code was now broken up into many poorly written pages and chapters, instead of being all one continuous scroll of poorly written prose -- time to go back to the language drawing boards.

To be fair, good programmers could do a lot more with each generation of language, and write better and clearer code. But the ratio of good programmers to bad programmers seemed to remain constant. Good programmers could also write pretty readable code in the older styles as well. (Can you guess which camp of programmers the author puts himself in?) So progress in languages has helped -- but not as much as was hoped.

Code sizes have grown tremendously throughout computer history. Programs in the 50's were a few dozen or maybe hundreds of instructions (lines of code), but in the 70's or early 80's, many programs were likely to be 100,000's of lines of code (instructions), and in the 90's it is millions of lines -- and remember that what each instruction can do is a lot more powerful as well. Also remember that a single error can bring the whole program down, bring the computer to a crashing halt, or send you a bill from the IRS for $100,000,000 -- that is of course if the IRS used computers.

I think the IRS is still a bunch of sadistic accountants doing it all by hand, which is what makes them so mean -- but that is a separate topic.

With procedural code, programmers learned to group their code into procedures (hence the name). Each procedure tried to do only one thing well, if it was written by a good programmer; or each procedure did nothing well if it was written by a not so good programmer. Programmers also learned how to "prepackage" lots of functions (procedures) into "libraries". Many of these libraries of code were large sets of prewritten functions. Other programmers could buy these libraries, instead of having to write it all themselves, and increase their productivity by using this prewritten code -- in fact, that is basically what Operating Systems are.
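Here is a rough sketch of that style in Python: three small procedures that each do one job, with one of them leaning on a prewritten library (Python's standard `statistics` module) instead of redoing the math by hand. The procedure names and the score data are made up for the example.

```python
import statistics  # a prewritten library that ships with Python


def read_scores(text):
    """Procedure 1: turn a string like '90, 72, 84' into a list of numbers."""
    return [int(part) for part in text.split(",")]


def summarize(scores):
    """Procedure 2: describe the scores, using library code for the average."""
    return {"mean": statistics.mean(scores), "highest": max(scores)}


def report(text):
    """Procedure 3: combine the two procedures above into one result."""
    summary = summarize(read_scores(text))
    return f"mean={summary['mean']} highest={summary['highest']}"


print(report("90, 72, 84"))
```

Each procedure can be read, tested and fixed separately, which is the whole point of dividing the scroll into pages.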

Early computers had little or no Operating System; every program had to not only do its own job, but also control all the hardware of the computer. An Operating System is just a bunch of prewritten code (a library) to allow programmers to control the hardware (and do other things) without having to write all the code themselves. The parts of the OS code that are built into the computer itself, and don't have to be loaded from disk, are called the BIOS (Basic Input/Output System). But an OS is not just the programmer's view of the computer, but also the user's view -- so part of the "Operating System" is also the program that comes with a computer that lets users control the computer.

This dual functionality confuses users -- there is the programming Operating System, and then there is the "shell" part of the OS, which is the program that users see. As time progresses, the OS's are doing more and more, so that programmers have to do less and less. Early OS's only controlled the hardware (and some hard-core weenies still refuse to call anything above that hardware interface layer "the OS", but that's why they are weenies). Later, OS's started handling things like drawing graphics, copying files, showing movies, playing sound or handling printing. Now an Operating System is almost considered "whatever programs and libraries come with your computer".

Some types of code are given special names. If the code is only controlling a piece of hardware, then it is often called a "Driver". It is that piece of code that is in control of the hardware, and all programmers must ask that driver to do things for them. If the code is an OS library that is meant to help application writers create programs quicker (or deal with the rest of the OS), then that code is called an API (Application Programming Interface). There is a lot of functionality (routines) that every application programmer was having to create themselves, so to make them more productive, the OS makers started bundling these API's (libraries of prewritten procedures) for them. These differ from other OS libraries, which are procedures that hardware manufacturers or low-level programmers want to use (and App-writers couldn't care less about). So we started getting layers of functionality, with different programmers interested in different layers. There once was a difference between an API and an OS library (routine), with higher layers being called API's, but the distinction was lost years ago. Now almost all libraries delivered with the OS are called "API's" -- but not always. If it were consistent then lay-people would be less confused -- and we programmers want to continue to speak an undecipherable dialect of techno-lingo and gibberish.

OOD and Frameworks

Language designers have tried, once again, to improve programmers' productivity. However, this solution is much broader than the other solutions, and it required a paradigm shift in how we think of programs. Instead of thinking of code as a sequence of instructions grouped into procedure stew, it requires new concepts. This process is called OOD (Object Oriented Design), and it is carried out using OOP (Object Oriented Programming) with new languages like ObjectPascal, C++ or Java.

What this all means is that code is not just grouped into procedures, but into groups of procedures that are called "Objects". Not only is code grouped into "objects", but all of the data that works with that code is in the same object as well. This grouping concept is called "encapsulation", and it makes finding things in code easier.

Instead of code being able to jump willy nilly around, and instead of any line of code being able to alter any piece of data, all the code is encapsulated into Objects -- where all the data and code is localized. Each object can only alter its own data, and no one else's -- and vice versa. Each object only has free rein to jump around inside of itself -- with a more formal way to communicate with other objects (called messaging). This is called "protection", and means that bugs are more localized and are easier to find. If an object's data gets screwed up -- then you at least know that you only have to look at the code inside that object (instead of all the code in the entire application or system). This saves lots of money.
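As a minimal sketch of an object that keeps its data and code together, consider this Python class. The bank account is an invented example. Note one assumption: Python only protects data by convention (the leading underscore) plus a read-only property, whereas languages like C++ and Java can enforce the protection outright.

```python
class BankAccount:
    """An object bundling its data (the balance) with the only code
    that is supposed to change that data."""

    def __init__(self, opening_balance):
        self._balance = opening_balance   # leading underscore: internal data

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    @property
    def balance(self):
        return self._balance              # read-only view for everyone else


account = BankAccount(100)
account.deposit(50)      # other code "messages" the object...
account.withdraw(30)     # ...instead of poking at its data directly
print(account.balance)
```

If the balance ever ends up wrong, the suspects are the handful of lines inside `BankAccount`, not the whole program.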

Before OOP/OOD it was not very common to use other people's code. Programmers might use the OS, or an occasional library -- but almost all code was written uniquely for each application. This was wasteful because much of the code in a program was similar to code in other programs, but each program was hand made (with lots of unique parts). So Objects and OOD were a way to bring standard "components", and the industrial revolution and assembly line, to code. Grouping functionality into objects allows those objects to be more easily used among different applications. More code reuse means lower costs and increased productivity. Furthermore there is a concept called inheritance, which is where one object can "inherit" many of the traits (code and data) of another object, and the programmer only has to write what is different about the new object type. The programmer only has to debug those differences as well. (A closely related concept, polymorphism, lets other code treat those related objects interchangeably.) Even more code reuse.
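The "inherit" idea can be sketched like this in Python. The shape classes are invented for illustration. Each new type writes only what is different about it and reuses the rest from its parent.

```python
class Shape:
    """Base object: holds the traits every shape shares."""

    def __init__(self, name):
        self.name = name

    def area(self):
        raise NotImplementedError("each kind of shape supplies its own area")

    def describe(self):
        # Written once here, reused by every shape below.
        return f"{self.name} with area {self.area()}"


class Rectangle(Shape):
    """Inherits describe(); only adds what is different: its size and area."""

    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    """A square is a rectangle with equal sides, so almost nothing is new."""

    def __init__(self, side):
        super().__init__(side, side)
        self.name = "square"


for shape in [Rectangle(2, 5), Square(3)]:
    print(shape.describe())   # the same call works on every kind of shape
```

The loop at the end does not care which kind of shape it is holding; treating related objects interchangeably like that is what is usually meant by polymorphism.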

So what do you call groups of Objects -- especially prepackaged ones? The magic term is "Framework". A Framework of Objects is like a super-library, but because of concepts like "encapsulation" and "polymorphism" there is much more power and productivity for programmers. Parts of the Operating System can be prepackaged as Frameworks. Many times an application shell is prewritten for you -- so the programmer starts with a completely working application, and he only has to add to it (instead of having to write it all himself). This is called an Application Framework. There are also many other types of Frameworks that make programmers more productive.
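A very small sketch of an application framework, in Python. The `Application` class stands in for a hypothetical prewritten shell; it is not any real framework's API. The shell already "works" on its own, and the programmer only overrides the part that should behave differently.

```python
class Application:
    """Stand-in for a prewritten application shell (hypothetical framework).
    It already runs start to finish without any extra code."""

    def __init__(self):
        self.log = []

    def run(self, events):
        # The framework owns the main loop and calls the hooks below.
        self.on_start()
        for event in events:
            self.on_event(event)
        self.on_quit()
        return self.log

    def on_start(self):
        self.log.append("started")

    def on_event(self, event):
        self.log.append(f"ignored {event}")

    def on_quit(self):
        self.log.append("quit")


class NotepadApp(Application):
    """The programmer's part: only what is new about this application."""

    def on_event(self, event):
        if event == "save":
            self.log.append("document saved")
        else:
            super().on_event(event)   # fall back to the prewritten behavior


print(NotepadApp().run(["click", "save"]))
```

Everything except the few lines in `NotepadApp` came prewritten, which is where the productivity gain comes from.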

The big point is that the more code written by someone else, and the better that code is, the less that a programmer has to write. This means that a programmer can do the same job in much less time; or in the same amount of time, a programmer can do a lot more work (and create a better program). This is what OOD does, and why Frameworks are so cool to programmers. The cooler something is to a programmer, the cooler the results to the user.


I believe this article will give you a very good understanding of the basics of computer languages, as well as an overview of what OOD/OOP and Frameworks are. I've also brutally whipped some poor reader, with a multi-page answer, for asking a simple question. Imagine if he had asked a complex question. His suffering will not be in vain, if others have learned from his torture.

Just so I've asked -- are there any more questions?

[For more information you can read Java (and more)]

Created: 05/24/97
Updated: 11/09/02
