Jan Graba An Introduction to Network Programming with Java 3rd ed. 2013 Java 7 Compatible 10.1007/978-1-4471-5254-5_1 Springer-Verlag London 2013
1. Basic Concepts, Protocols and Terminology
Learning Objectives
After reading this chapter, you should:
have a high-level appreciation of the basic means by which messages are sent and received on modern networks;
be familiar with the most important protocols used on networks;
understand the addressing mechanism used on the Internet;
understand the basic principles of client/server programming.
The fundamental purpose of this opening chapter is to introduce the underpinning network principles and associated terminology with which the reader will need to be familiar in order to make sense of the later chapters of this book. The material covered here is entirely generic (as far as any programming language is concerned) and it is not until the next chapter that we shall begin to consider how Java may be used in network programming. If the meaning of any term covered here is not clear when that term is later encountered in context, the reader should refer back to this chapter to refresh his/her memory.
It would be very easy to make this chapter considerably larger than it currently is, simply by including a great deal of dry, technical material that would be unlikely to be of any practical use to the intended readers of this book. However, this chapter is intentionally brief, the author having avoided the inclusion of material that is not of relevance to the use of Java for network programming. The reader who already has a sound grasp of network concepts may safely skip this chapter entirely.
1.1 Clients, Servers and Peers
The most common categories of network software nowadays are clients and servers. These two categories have a symbiotic relationship and the term client/server programming has become very widely used in recent years. It is important to distinguish firstly between a server and the machine upon which the server is running (called the host machine), since I.T. workers often refer loosely to the host machine as 'the server'. Though this common usage has no detrimental practical effects for the majority of I.T. tasks, those I.T. personnel who are unaware of the distinction and subsequently undertake network programming are likely to suffer a significant amount of conceptual confusion until the distinction is made known to them.
A server, as the name implies, provides a service of some kind. This service is provided for clients that connect to the server's host machine specifically for the purpose of accessing the service. Thus, it is the clients that initiate a dialogue with the server. (These clients, of course, are also programs and are not human clients!) Common services provided by such servers include the 'serving up' of Web pages (by Web servers) and the downloading of files from servers' host machines via the File Transfer Protocol (FTP servers). For the former service, the corresponding client programs would be Web browsers (such as Firefox, Chrome or Internet Explorer). Though a client and its corresponding server will normally run on different machines in a real-world application, it is perfectly possible for such programs to run on the same machine. Indeed, it is often very convenient (as will be seen in subsequent chapters) for server and client(s) to be run on the same machine, since this provides a simple 'sandbox' within which such applications may be tested before being released (or, more likely, before final testing on separate machines). This avoids the need for multiple machines and multiple testing personnel.
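Although Java itself is not introduced properly until the next chapter, a brief sketch may help to make the 'same machine' arrangement concrete. The class name LoopbackDemo, its method names and the "ECHO: " prefix are invented purely for illustration; only the standard java.net and java.io classes are assumed. The server and the client both run on the local machine, and it is the client that initiates the dialogue.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demonstration class: a server and a client sharing one machine.
final class LoopbackDemo {
    private LoopbackDemo() {}

    /** Starts a thread that accepts one client and echoes back one line of text. */
    static Thread startEchoServer(final ServerSocket listener) {
        Thread server = new Thread(new Runnable() {
            @Override
            public void run() {
                try (Socket link = listener.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(link.getInputStream()));
                     PrintWriter out = new PrintWriter(link.getOutputStream(), true)) {
                    out.println("ECHO: " + in.readLine());
                } catch (IOException ioEx) {
                    ioEx.printStackTrace();
                }
            }
        });
        server.start();
        return server;
    }

    /** The client initiates the dialogue, sends one line and returns the reply. */
    static String exchange(String message) throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to choose any free port.
        try (ServerSocket listener = new ServerSocket(0, 50, loopback)) {
            Thread server = startEchoServer(listener);
            String reply;
            try (Socket link = new Socket(loopback, listener.getLocalPort());
                 PrintWriter out = new PrintWriter(link.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(link.getInputStream()))) {
                out.println(message);
                reply = in.readLine();
            }
            server.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(exchange("hello")); // prints "ECHO: hello"
    }
}
```

No second machine and no second tester is needed to run this, which is precisely the convenience described above. In a real application the two halves would normally be separate programs.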
In some applications, such as messaging services, it is possible for programs on users' machines to communicate directly with each other in what is called peer-to-peer (or P2P) mode. However, for many applications, this is either not possible or prohibitively costly in terms of the number of simultaneous connections required. For example, the World Wide Web simply does not allow clients to communicate directly with each other. However, some applications use a server as an intermediary, in order to provide 'simulated' peer-to-peer facilities. Alternatively, both ends of the dialogue may act as both client and server. Peer-to-peer systems are beyond the intended scope of this text, though, and no further mention will be made of them.
1.2 Ports and Sockets
These entities lie at the heart of network communications. For anybody not already familiar with the use of these terms in a network programming context, the two words very probably conjure up images of hardware components. However, although they are closely associated with the hardware communication links between computers within a network, ports and sockets are not themselves hardware elements, but abstract concepts that allow the programmer to make use of those communication links.
A port is a logical connection to a computer (as opposed to a physical connection) and is identified by a number in the range 1–65535. This number has no correspondence with the number of physical connections to the computer, of which there may be only one (even though the number of ports used on that machine may be much greater than this). Ports are implemented upon all computers attached to a network, but it is only those machines that have server programs running on them for which the network programmer will refer explicitly to port numbers. Each port may be dedicated to a particular server/service (though the number of available ports will normally greatly exceed the number that is actually used). Port numbers in the range 1–1023 are normally set aside for the use of specified standard services, often referred to as 'well-known' services. For example, port 80 is normally used by Web servers. Some of the more common well-known services are listed in a later section. Application programs wishing to use ports for non-standard services should avoid using port numbers 1–1023. (A range of 1024–65535 should be more than enough for even the most prolific of network programmers!)
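The port ranges just described can be captured in a few lines of Java. This is only an illustrative sketch: the class PortRanges and its method names are hypothetical and do not belong to any standard library.

```java
// Hypothetical helper class that encodes the port ranges described in the text.
final class PortRanges {
    static final int MIN_PORT = 1;
    static final int MAX_WELL_KNOWN_PORT = 1023;
    static final int MAX_PORT = 65535;

    private PortRanges() {}

    /** Returns true if the number is a valid port (1 to 65535). */
    static boolean isValid(int port) {
        return port >= MIN_PORT && port <= MAX_PORT;
    }

    /** Returns true if the port is set aside for well-known services (1 to 1023). */
    static boolean isWellKnown(int port) {
        return port >= MIN_PORT && port <= MAX_WELL_KNOWN_PORT;
    }

    /** Returns true if the port suits a non-standard service (1024 to 65535). */
    static boolean isAvailableForApplications(int port) {
        return port > MAX_WELL_KNOWN_PORT && port <= MAX_PORT;
    }

    public static void main(String[] args) {
        int[] samples = {80, 1023, 1024, 65535, 70000};
        for (int port : samples) {
            System.out.println(port + ": valid=" + isValid(port)
                    + ", wellKnown=" + isWellKnown(port)
                    + ", application=" + isAvailableForApplications(port));
        }
    }
}
```

A check of this kind might be applied to a port number read from a configuration file or from the command line before any attempt is made to use it.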
For each port supplying a service, there is a server program waiting for any requests. All such programs run together in parallel on the host machine. When a client attempts to make connection with a particular server program, it supplies the port number of the associated service. The host machine examines the port number and passes the client's transmission to the appropriate server program for processing.
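The way in which the port number selects the service can be sketched as follows. This is a minimal illustration only, under several assumptions: the class PortDispatchDemo, its methods and the names "service A" and "service B" are invented, both services run inside one program for brevity (in practice they would be separate server programs), and the operating system is left to choose two free ports.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demonstration class: two services, each waiting on its own port.
final class PortDispatchDemo {
    private PortDispatchDemo() {}

    /** Waits for one client on the given port and sends it the service's name. */
    static Thread serveOnce(final ServerSocket listener, final String serviceName) {
        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                try (Socket link = listener.accept();
                     PrintWriter out = new PrintWriter(link.getOutputStream(), true)) {
                    out.println(serviceName);
                } catch (IOException ioEx) {
                    ioEx.printStackTrace();
                }
            }
        });
        worker.start();
        return worker;
    }

    /** Connects to the given port on this machine and returns the line received. */
    static String ask(int port) throws IOException {
        try (Socket link = new Socket(InetAddress.getLoopbackAddress(), port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(link.getInputStream()))) {
            return in.readLine();
        }
    }

    /** Runs two services on separate ports and queries each by its port number. */
    static String[] queryBothServices() throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to choose any free port.
        try (ServerSocket first = new ServerSocket(0, 50, loopback);
             ServerSocket second = new ServerSocket(0, 50, loopback)) {
            Thread a = serveOnce(first, "service A");
            Thread b = serveOnce(second, "service B");
            String fromSecond = ask(second.getLocalPort());
            String fromFirst = ask(first.getLocalPort());
            a.join();
            b.join();
            return new String[] {fromFirst, fromSecond};
        }
    }

    public static void main(String[] args) throws Exception {
        for (String reply : queryBothServices()) {
            System.out.println(reply);
        }
    }
}
```

The client never names the service it wants. It supplies only a port number, and the host's operating system hands the connection to whichever program is waiting on that port.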
In most applications, of course, there are likely to be multiple clients wanting the same service at the same time. A common example of this requirement is that of multiple browsers (quite possibly thousands of them) wanting Web pages from the same server. The server, of course, needs some way of distinguishing between clients and keeping their dialogues separate from each other. This is achieved via the use of sockets. As stated earlier, a socket is an abstract concept and not an element of computer hardware. It is used to indicate one of the two end-points of a communication link between two processes. When a client wishes to make connection to a server, it will create a socket at its end of the communication link. Upon receiving the client's initial request (on a particular port number), the server will create a new socket at its end that will be dedicated to communication with that particular client. Just as one hardware link to a server may be associated with many ports, so too may one port be associated with many sockets. More will be said about sockets later in this book.
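The association of one port with many sockets can be observed directly. In the following sketch (the class SharedPortDemo and its nested class Link are hypothetical names), two clients connect to a single listening port on the local machine. Each call to accept gives the server a separate socket, and the two sockets are told apart by the port number at the client's end of each link.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demonstration class: one server port associated with two sockets.
final class SharedPortDemo {
    private SharedPortDemo() {}

    /** Records the two port numbers that identify one server-side socket. */
    static final class Link {
        final int serverPort; // port at the server's end of the link
        final int clientPort; // port at the client's end of the link

        Link(Socket serverSide) {
            this.serverPort = serverSide.getLocalPort();
            this.clientPort = serverSide.getPort();
        }
    }

    /** Connects two clients to one listening port and describes the server's sockets. */
    static Link[] acceptTwoClients() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system to choose any free port.
        try (ServerSocket listener = new ServerSocket(0, 50, loopback);
             Socket firstClient = new Socket(loopback, listener.getLocalPort());
             Socket secondClient = new Socket(loopback, listener.getLocalPort());
             Socket firstLink = listener.accept();
             Socket secondLink = listener.accept()) {
            return new Link[] {new Link(firstLink), new Link(secondLink)};
        }
    }

    public static void main(String[] args) throws IOException {
        for (Link link : acceptTwoClients()) {
            System.out.println("server port " + link.serverPort
                    + " <-> client port " + link.clientPort);
        }
    }
}
```

Both lines of output show the same server port but different client ports. The actual numbers will vary from run to run, since they are chosen by the operating system.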