1.1 Programs, Languages, and Compilers
We are all familiar with the computer's ability to perform a wide variety of tasks. For instance, we can use it to play games, write a letter or a book, perform accounting functions for a company, learn a foreign language, listen to music on a CD, send a fax, or search for information on the Internet. How is this possible, all on the same machine? The answer lies with programming: the creation of a sequence of instructions that the computer can perform (we say execute) to accomplish each task. This sequence of instructions is called a program. Each task requires a different program:
To play a game, we need a game-playing program.
To write a letter or a book, we need a word processing program.
To do accounts, we need an accounting program.
To learn Spanish, we need a program that teaches Spanish.
To listen to a CD, we need a music-playing program.
To send a fax, we need a fax-sending program.
To use the Internet, we need a program called a Web browser.
For every task we want to perform, we need an appropriate program. And in order for the computer to run a program, the program must be stored (we sometimes say loaded) in the computer's memory.
But what is the nature of a program? First, we need to know that computers are built to execute instructions written in what is called machine language. In machine language, everything is expressed in terms of the binary number system, that is, 1s and 0s. Each computer has its own machine language, and the computer can execute instructions written in that language only.
The instructions themselves are very simple: for example, add or subtract two numbers, compare one number with another, or copy a number from one place to another. How, then, can the computer perform such a wide variety of tasks, solving such a wide variety of problems, with such simple instructions?
The answer is that no matter how complex an activity may seem, it can usually be broken down into a series of simple steps. It is the ability to analyze a complex problem and express its solution in terms of simple computer instructions that is one of the hallmarks of a good programmer.
Machine language is considered a low-level programming language. In the early days of computing (1940s and 50s) programmers had to write programs in machine language, that is, express all their instructions using 1s and 0s.
To make life a little easier for them, assembly language was developed. This was closely related to machine language, but it allowed the programmer to use mnemonic instruction codes (such as ADD) and names for storage locations (such as sum) rather than strings of binary digits (bits). For instance, a programmer could refer to a number by sum rather than have to remember that the number was stored in memory location 1000011101101011.
A program called an assembler is used to convert an assembly language program into machine language. Still, programming this way had several drawbacks:
It was very tedious and error prone.
It forced the programmer to think in terms of the machine rather than in terms of his problem.
A program written in the machine language of one computer could not be run on a computer with a different machine language. Changing your computer could mean having to rewrite all your programs.
To overcome these problems, high-level or problem-oriented languages were developed in the late 1950s and 60s. The most popular of these were FORTRAN (FORmula TRANslation) and COBOL (COmmon Business-Oriented Language). FORTRAN was designed for solving scientific and engineering problems that involved a great deal of numerical computation. COBOL was designed to solve the data-processing problems of the business community.
The idea was to allow the programmer to think about a problem in terms familiar to him and relevant to the problem rather than have to worry about the machine. So, for instance, if he wanted to know the larger of two quantities, A and B, he could write
IF A IS GREATER THAN B THEN BIGGER = A ELSE BIGGER = B
rather than have to fiddle with several machine or assembly language instructions to get the same result. Thus high-level languages enabled the programmer to concentrate on solving the problem at hand, without the added burden of worrying about the idiosyncrasies of a particular machine.
However, the computer still could only execute instructions written in machine language. A program called a compiler is used to translate a program written in a high-level language to machine language.
Thus we speak of a FORTRAN compiler or a COBOL compiler for translating FORTRAN and COBOL programs, respectively. But that's not the whole story. Since each computer has its own machine language, we must have, say, a FORTRAN compiler for a Lenovo ThinkPad computer and a FORTRAN compiler for a MacBook computer.
1.2 How a Computer Solves a Problem
Solving a problem on a computer involves the following activities:
Define the problem.
Analyze the problem.
Develop an algorithm (a method) for solving the problem.
Write the computer program that implements the algorithm.
Test and debug (find the errors in) the program.
Document the program. (Explain how the program works and how to use it.)
Maintain the program.
There is normally some overlap of these activities. For example, with a large program, a portion may be written and tested before another portion is written. Also, documentation should be done at the same time as all the other activities; each activity produces its own items of documentation that will be part of the final program documentation.
1.2.1 Define the Problem
Suppose we want to help a child work out the areas of squares. This defines a problem to be solved. However, a brief analysis reveals that the definition is not complete or specific enough to proceed with developing a program. Talking with the child might reveal that she needs a program that requests her to enter the length of a side of the square; the program then prints the area of the square.
1.2.2 Analyze the Problem
We further analyze the problem to