The TCP/IP Model
The TCP/IP model was developed by the U.S. Department of Defense (DoD) and grew out of the need for a network that could survive any conditions, including a nuclear war. Within a few years of its public release, the TCP/IP model became the most popular networking model, and it is now the core of the Internet.
In a world where data is transmitted over wires, microwaves, satellite links, and optical fiber, there is a need to transmit data reliably over any medium and under any circumstances. Let's see how the TCP/IP model can do that.
First of all, the TCP/IP model consists of four layers as in the following figure:
So, the layers of the TCP/IP model are: Application, Transport, Internet, and Network Access.
Although some layers of the TCP/IP model share their names with layers of the OSI model, they do not cover exactly the same functions.
The TCP/IP Application Layer
The TCP/IP application layer handles high-level protocols, representation, encoding, and dialog control. The Application layer in the TCP/IP model defines not only the application, but also how data is formatted, and how sessions are initialized and destroyed. As an analogy to the OSI model, the TCP/IP application layer handles the functions found at the three upper layers in the OSI model—application, presentation, and session. This way, all application-related issues found in the OSI model are combined into one layer.
The application layer in the TCP/IP model includes protocols like FTP, SMTP, etc., with all their issues regarding data representation and dialog control. The application layer ensures that the data is properly packaged before it is passed to the transport layer.
The TCP/IP Transport Layer
The transport layer provides transport services for the application layer by creating logical connections between the source host and the destination host.
In the TCP/IP model, two protocols are found at the transport layer:
- Transmission Control Protocol (TCP)
- User Datagram Protocol (UDP)
TCP is a connection-oriented protocol and provides reliable data transfer between endpoints.
TCP breaks messages into segments, reassembles them at the destination, and passes them to the upper layer (application).
A TCP segment contains:
- Source Port: The port number used by the sending host to send data
- Destination Port: The port number used by the receiving host to receive data
- Sequence Number: The SEQ number of the segment, used to ensure the data arrives in the correct order
- Acknowledgement Number: The ACK number is the next expected TCP octet from the other host.
- Header Length (HLEN): Number of 32-bit words in the header
- Reserved: Reserved bits, set to zero
- Code Bits: Control flags (such as SYN, ACK, FIN, and RST) used, for example, to set up or terminate a session
- Window: The number of octets that the sender of this segment is willing to accept
- Checksum: Calculated checksum of the header and data fields
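- Urgent: A pointer that indicates the end of the urgent data
- Options: Optional settings. The best known is the maximum TCP segment size (MSS); later extensions added others, such as window scaling, selective acknowledgement (SACK), and timestamps.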
- Data: The data from the upper layer (application)
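To make the field list above more concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of a TCP header. The function name, the dictionary keys, and the sample segment are illustrative assumptions, and only the six classic code bits are decoded.

```python
import struct

# Bit values of the six classic code bits (flags) in the TCP header.
TCP_FLAGS = {0x01: "FIN", 0x02: "SYN", 0x04: "RST",
             0x08: "PSH", 0x10: "ACK", 0x20: "URG"}

def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into the fields listed above."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    (src_port, dst_port, seq, ack, offset_and_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_and_flags >> 12) * 4  # HLEN counts 32-bit words
    if header_len < 20 or header_len > len(segment):
        raise ValueError("invalid header length")
    flags = [name for bit, name in sorted(TCP_FLAGS.items())
             if offset_and_flags & bit]
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgement_number": ack,
        "header_length": header_len,      # in bytes
        "code_bits": flags,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
        "options": segment[20:header_len],
        "data": segment[header_len:],
    }

# Hypothetical sample: a SYN from port 49152 to port 80, no options, 2 data bytes.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0,
                     (5 << 12) | 0x02, 65535, 0, 0) + b"hi"
print(parse_tcp_header(sample))
```

The "!" in the format string selects network byte order (big-endian), which is how all multi-byte TCP header fields are transmitted.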
Connection-oriented means that TCP must establish a connection between the two hosts before it starts sending data. This is done with a three-way handshake, in which the two hosts synchronize (SYN) their sequence numbers.
First, the initiating host sends a SYN packet to the receiving host, carrying its initial sequence number (SEQ). The receiving host replies with a packet that has both the SYN and ACK flags set (SYN-ACK), containing its own sequence number and the initiating host's SEQ number incremented by 1. This tells the initiating host that its packet was received successfully and informs it of the receiving host's SEQ number. Finally, the initiating host sends an ACK packet to the receiving host, containing the receiving host's SEQ number incremented by 1. This tells the receiving host that the initiating host received its packet.
The process described above is called synchronization (the three-way handshake), and it is necessary because the network doesn't have a global clock and TCP implementations may use different mechanisms to choose their initial sequence numbers.
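The number exchange in the handshake can be sketched in a few lines of Python. This is only a model of the three packets, not a network implementation; the function name and the dictionary layout are assumptions made for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake packets as dictionaries.

    client_isn and server_isn are the initial sequence numbers that each
    host picked on its own. Sequence numbers are 32 bits wide, so they
    wrap around at 2**32.
    """
    wrap = 2 ** 32
    syn = {"from": "client", "flags": "SYN",
           "seq": client_isn, "ack": None}
    syn_ack = {"from": "server", "flags": "SYN+ACK",
               "seq": server_isn, "ack": (client_isn + 1) % wrap}
    ack = {"from": "client", "flags": "ACK",
           "seq": (client_isn + 1) % wrap, "ack": (server_isn + 1) % wrap}
    return [syn, syn_ack, ack]

for packet in three_way_handshake(client_isn=100, server_isn=300):
    print(packet)
```

With the example values, the server acknowledges 101 and the client acknowledges 301, which matches the "incremented by 1" rule described above.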
After the synchronization is performed, TCP uses a process called windowing for flow control and ACK packets for the reliability of the data transmission.
Windowing is a process in which each host adapts the amount of data it sends before waiting for an ACK packet to the window size advertised by the other host. For example, see the following figure:
The sending host sends three packets before expecting an ACK packet, but the receiving host can only process two. The receiving host sends back an ACK packet telling the sender which packet to send next (packet 3) and specifies a window size of 2. The sending host sends packet 3 again (together with packet 4), still advertising its own window size of 3. The receiver then sends ACK 5, meaning that it is waiting for the fifth packet, and again specifies a window size of 2. From this point on, the sender sends only two packets before waiting for an ACK packet from the receiver.
Flow control is a mechanism that keeps a sender from transmitting data faster than it can be accepted. For example, without any such limit, a host connected to the Internet through a router with a 64-kilobit-per-second link could push data at 100 megabits per second toward that router when sending to a computer at the other end of the world. In TCP, each host advertises a window size, meaning the amount of data it is prepared to accept from the other host at once, while the related congestion control mechanisms slow the sender down when a link along the path is the bottleneck.
ACK packets are sent by the receiving host to indicate that the last packet has been received and that it is now waiting for the packet after it. If a packet gets lost along the way, the missing acknowledgement forces the sending host to resend it, which ensures reliable communication.
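The interplay of windowing, ACK packets, and retransmission can be sketched with a small simulation. This is a deliberately simplified model, not how a real TCP stack works: it counts whole packets instead of bytes, it assumes the receiver discards everything after a gap, and the function and parameter names are invented for the example.

```python
def simulate_transfer(total_packets: int, window: int, lost_once=()) -> list:
    """Return the bursts of packet numbers the sender transmits, in order.

    window     -- packets the sender may send before waiting for an ACK
    lost_once  -- packet numbers that are dropped the first time they are sent
    """
    lost = set(lost_once)
    next_expected = 1   # receiver side: the packet number its next ACK asks for
    bursts = []
    while next_expected <= total_packets:
        last = min(next_expected + window - 1, total_packets)
        burst = list(range(next_expected, last + 1))
        bursts.append(burst)
        for number in burst:
            if number in lost:
                lost.discard(number)  # the retransmission will get through
                break                 # simplification: the rest of the burst is ignored
            next_expected = number + 1
        # The receiver now sends "ACK next_expected", and the sender
        # starts its next burst from that packet number.
    return bursts

print(simulate_transfer(total_packets=5, window=3, lost_once={3}))
# → [[1, 2, 3], [3, 4, 5]]
```

In the example run, packet 3 is lost, the receiver answers with ACK 3, and the sender starts its next burst from packet 3 again.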
Applications that need reliable data transmission use TCP as their transport protocol. Examples of such applications are FTP, HTTP, SMTP, Telnet, and SSH.
UDP is a much simpler protocol than TCP, and in many ways its opposite. UDP is a transport layer protocol that doesn't need to establish a connection with the other host before sending data. This means that UDP is connectionless.
A UDP segment contains:
- Source Port: The port number used by the sending host to send data
- Destination Port: The port number used by the receiving host to receive data
- Length: The number of bytes in header and data
- Checksum: Calculated checksum of the header and data fields
- Data: The data from the upper layer (application)
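Because the UDP header has only four fields of 16 bits each, it is easy to build and parse by hand. The following Python sketch does both; the function names are illustrative, and the checksum is simply set to 0, which in IPv4 means "no checksum was computed" (an assumption made here to keep the example short).

```python
import struct

UDP_HEADER_LEN = 8  # four 16-bit fields

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header to the payload."""
    length = UDP_HEADER_LEN + len(payload)  # header plus data, in bytes
    checksum = 0  # assumption: checksum left out, which IPv4 permits
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a raw UDP datagram into header fields and data."""
    if len(datagram) < UDP_HEADER_LEN:
        raise ValueError("a UDP header is 8 bytes long")
    src_port, dst_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_LEN])
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": datagram[UDP_HEADER_LEN:length],
    }

datagram = build_udp_datagram(5000, 53, b"abc")
print(parse_udp_datagram(datagram))
```

Compared with the TCP header, there are no sequence numbers, acknowledgement numbers, or window fields to fill in.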
Also, UDP doesn't have any mechanisms for flow control and doesn't retransmit data if data gets lost. This means that UDP provides unreliable delivery. However, data retransmission and error handling can be implemented at the application layer, whenever it is needed.
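What "connectionless" means for a program can be seen in the following Python sketch, which sends one datagram between two sockets on the local machine. It assumes that the loopback address 127.0.0.1 is available; the function name and the 2-second timeout are choices made for this example.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        # No connect() and no handshake: the datagram is simply sent.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data

print(udp_roundtrip(b"hello"))
```

If the datagram were lost, recvfrom() would time out and nothing would be retransmitted; handling that case is left to the application, as described above.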
Now, you are probably wondering: if TCP has so many great features, why use UDP?
A first answer to that question is that some applications don't need every segment to be delivered and put back in sequence. Let's take for instance H.323, which is used for Voice over IP (VoIP). Voice over IP is a way to carry real-time conversations over an IP network. If the voice stream used TCP, then whenever data was lost due to network congestion, the sending host would have to retransmit all the lost data while the new telephone input, encapsulated into new data, waited to be sent. This would be very bad for a conversation on a network with delays higher than 100 milliseconds.
A second reason for using UDP is that a simpler protocol needs less processing capacity. For example, DNS uses UDP for handling DNS requests from clients. Think about a very large network that usually has two or three DNS servers. If TCP were used to handle DNS requests, the DNS servers would have to establish a TCP connection with a client for every DNS request. This would require more processing capacity from the DNS servers and would be slower than UDP.
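To give an idea of how little is involved in such a request, the following Python sketch builds a DNS query message for an address (A) record. A message like this fits in a single UDP datagram. The sketch only constructs the bytes and does not send anything; the function name, the default query ID, and the host name example.com are placeholders chosen for illustration.

```python
import struct

def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query for the A record of hostname."""
    # Header: ID, flags (0x0100 = recursion desired), one question,
    # and zero answer, authority, and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, ended by a zero byte.
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in labels) + b"\x00"
    # Question type 1 (A record), class 1 (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

query = build_dns_query("example.com")
print(len(query), "bytes:", query.hex())
```

The whole request for example.com is 29 bytes long, so no connection setup is needed to carry it.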
Another example is TFTP, which runs over UDP and is used for file transfer, typically by routers that load their operating system images from a server. TFTP is much simpler than FTP, and it is far easier to implement in a router's bootloader than FTP is.
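How simple TFTP is can be seen from its read request, which is the first packet a client sends to ask for a file. The Python sketch below builds such a request; the function name and the file name image.bin are hypothetical, and nothing is actually sent.

```python
import struct

TFTP_OPCODE_READ_REQUEST = 1  # opcode of a read request (RRQ)

def build_tftp_read_request(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, file name, 0, mode, 0."""
    return (struct.pack("!H", TFTP_OPCODE_READ_REQUEST)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")

print(build_tftp_read_request("image.bin"))
# → b'\x00\x01image.bin\x00octet\x00'
```

There is no login, no separate control connection, and no negotiation in this packet, which is what makes the protocol attractive for a small bootloader.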
The TCP/IP Internet Layer
The Internet layer in the TCP/IP model has the functions of OSI Layer 3 (network). The purpose of the Internet layer is to select a path (preferably the best path) in the network for end-to-end delivery.
The main protocol found at the Internet layer is IP (Internet Protocol), which provides connectionless, best-effort delivery routing of packets. IP handles logical addressing, and its primary concern is to find the best path between the endpoints, without caring about the contents of the packet. IP does not check the data it carries and does not retransmit lost packets (its only check is a checksum over its own header), and for this reason it is called an unreliable protocol. These functions are instead handled by the transport layer (TCP) and/or the application layer.
IP encapsulates data from the transport layer in IP packets. IP packets don't use trailers when encapsulating TCP or UDP data. Let's see what an IP packet looks like:
The fields contained in the IP header signify:
- Version: Specifies the format of the IP packet header. The 4-bit version field contains the number 4 if it is an IPv4 packet, and 6 if it is an IPv6 packet. However, this field is not used to distinguish between IPv4 and IPv6 packets. The protocol type field present in the Layer 2 envelope is used for that.
- IP header length (HLEN): Indicates the datagram header length in 32-bit words. This is the total length of all header information, and includes the two variable-length header fields.
- Type of service (ToS): 8 bits that specify the level of importance that has been assigned by a particular upper-layer protocol.
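- Total length: 16 bits that specify the length of the entire packet in bytes. This includes the data and header. To get the length of the data payload, subtract the header length (HLEN multiplied by 4, because HLEN counts 32-bit words) from the total length.
- Identification: 16 bits that identify the current datagram. All fragments of the same datagram carry the same value, so the receiver can tell which fragments belong together.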
- Flags: A 3-bit field in which the two low-order bits control fragmentation. One bit specifies if the packet can be fragmented, and the other indicates if the packet is the last fragment in a series of fragmented packets.
- Fragment offset: 13 bits that are used to help piece together datagram fragments. This field allows the next field to start on a 16-bit boundary.
- Time to Live (TTL): A field that specifies the number of hops a packet may travel. This number is decreased by one as the packet travels through a router. When the counter reaches zero, the packet is discarded. This prevents packets from looping endlessly.
- Protocol: 8 bits that indicate which upper-layer protocol, such as TCP or UDP, receives incoming packets after the IP processes have been completed.
- Header checksum: 16 bits that help ensure IP header integrity.
- Source address: 32 bits that specify the IP address of the node from which the packet was sent.
- Destination address: 32 bits that specify the IP address of the node to which the data is sent.
- Options: Allows IP to support various options such as security. The length of this field varies.
- Padding: Extra zeros are added to this field to ensure that the IP header is always a multiple of 32 bits.
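The header fields above can be unpacked with a short Python sketch, which also shows the payload-length calculation and the header checksum. The function names and dictionary keys are illustrative assumptions, options are not decoded, and the sample header is a made-up packet between two private addresses.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of all 16-bit words in the header."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                      # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4_header(packet: bytes) -> dict:
    """Unpack the fixed 20-byte part of an IPv4 header."""
    if len(packet) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (version_hlen, tos, total_length, identification, flags_offset,
     ttl, protocol, checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    header_len = (version_hlen & 0x0F) * 4  # HLEN counts 32-bit words
    flags = flags_offset >> 13
    return {
        "version": version_hlen >> 4,
        "header_length": header_len,        # in bytes
        "type_of_service": tos,
        "total_length": total_length,
        "payload_length": total_length - header_len,
        "identification": identification,
        "dont_fragment": bool(flags & 0b010),
        "more_fragments": bool(flags & 0b001),
        "fragment_offset": flags_offset & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,               # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": ".".join(str(b) for b in src),
        "destination": ".".join(str(b) for b in dst),
    }

# Made-up sample header: a 115-byte UDP packet with the "Don't Fragment" bit set.
sample_header = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(parse_ipv4_header(sample_header))
print(ipv4_checksum(sample_header))  # 0 means the header checksum is valid
```

When the checksum is computed over a header that already contains a correct checksum, the result is 0, which is how a receiver verifies header integrity.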
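Data is not a part of the IP header. It contains upper-layer information (TCP segments or UDP datagrams) and has a variable length. Because the total length field is 16 bits wide, the header and data together can be at most 65,535 bytes (64 KB).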
If an IP packet needs to go out on an interface that has an MTU (Maximum Transmission Unit) smaller than the size of the IP packet, the Internet Protocol needs to fragment that packet into smaller packets that fit the MTU of that interface. If the "Don't Fragment" bit in the Flags field of the IP packet is set to 1 and the packet is larger than the MTU of the interface, the packet will be dropped.
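The fragmentation rule can be sketched as follows in Python. The sketch assumes a header without options (20 bytes) and relies on the fact that the fragment offset field counts units of 8 bytes, so every fragment except the last must carry a multiple of 8 data bytes. The function name and the return format are invented for this example.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20,
             dont_fragment: bool = False) -> list:
    """Return (fragment_offset, data_bytes, more_fragments) per fragment.

    An empty list means the packet was dropped because it did not fit
    and the "Don't Fragment" bit was set.
    """
    if header_len + payload_len <= mtu:
        return [(0, payload_len, False)]     # fits, no fragmentation needed
    if dont_fragment:
        return []                            # too big and must not be fragmented
    max_data = (mtu - header_len) // 8 * 8   # round down to a multiple of 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len     # "more fragments" flag
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments

print(fragment(payload_len=4000, mtu=1500))
# → [(0, 1480, True), (185, 1480, True), (370, 1040, False)]
```

In the example, 4000 bytes of data crossing an interface with an MTU of 1500 bytes are split into three fragments, and only the last one has the "more fragments" flag cleared.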
Besides IP, the following protocols are found at the Internet layer:
- ICMP: Internet Control Message Protocol provides control and messaging capabilities to the Internet Protocol (IP). ICMP is a very important protocol because most of the troubleshooting of IP networks is done by using ICMP messages. The most important aspect of ICMP involves the types of messages that it returns and how to interpret them.
- ARP: Address Resolution Protocol is used to determine the MAC address for a given IP address.
- RARP: Reverse Address Resolution Protocol is used to determine the IP address for a given MAC address.
The TCP/IP Network Access Layer
The network access layer in TCP/IP, also called the host-to-network layer, allows an IP packet to make a physical link to the network media.
As you can see, ARP and RARP are found at both the Internet and network access layers. You can also see that the TCP/IP network access layer contains the LAN and WAN technologies that are found at the OSI physical and data link layers.
Network access layer protocols map IP addresses to hardware addresses and encapsulate IP packets into frames. Drivers for network interfaces, modems, and WAN interfaces also operate at the TCP/IP network access layer.
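As an illustration of the encapsulation step, the following Python sketch wraps an IP packet in an Ethernet header. The function names and the MAC addresses are hypothetical; the two type values are the ones Ethernet uses to mark IPv4 and IPv6 payloads. The frame check sequence that the network interface appends is left out.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD

def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError("a MAC address has 6 bytes")
    return raw

def build_ethernet_frame(dst_mac: str, src_mac: str, ip_packet: bytes,
                         ethertype: int = ETHERTYPE_IPV4) -> bytes:
    """Prepend destination MAC, source MAC, and type field to the packet."""
    header = (mac_to_bytes(dst_mac) + mac_to_bytes(src_mac)
              + struct.pack("!H", ethertype))
    return header + ip_packet

# Hypothetical addresses and a dummy 20-byte packet standing in for real IP data.
dummy_packet = b"\x45" + b"\x00" * 19
frame = build_ethernet_frame("02:00:00:00:00:02", "02:00:00:00:00:01",
                             dummy_packet)
print(len(frame), frame[:14].hex())
```

In a real host, the destination MAC address would come from an ARP lookup for the next hop's IP address, which is the address mapping mentioned above.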
TCP/IP Protocol Suite Summary
To have an overview of the TCP/IP model, take a look at the following diagram:
Applications that need to transfer data reliably, such as FTP, HTTP, SMTP, and DNS zone transfers, use TCP, while applications that can work with a simpler protocol, such as TFTP and DNS requests, use UDP.
Both TCP and UDP then use IP for end-to-end delivery (routing) and physical interfaces to send the data.
Let's see what the email example we gave with the OSI model looks like with TCP/IP. So, you are in a company LAN and you want to send an email:
Layer 4: You use an email client (like Outlook Express for example) that has SMTP and POP3 functions according to TCP/IP Layer 4 (application). You send the email, formatted in ASCII or HTML. The application then creates a data unit formatted in ASCII or HTML. The email client uses the operating system to open a session for inter-host communication. All those functions are performed at TCP/IP Layer 4 (application).
Layer 3: A TCP socket with the SMTP server is opened by the operating system. A virtual circuit is opened between your computer and the email server using TCP according to TCP/IP Layer 3 (transport).
Layer 2: Your computer looks up a route to the IP address of the SMTP server in the routing table of the operating system. If no specific route is found, it forwards the packet to the company router for path determination. The IP protocol is at TCP/IP Layer 2 (Internet).
Layer 1: The IP packet is encapsulated in an Ethernet frame. The Ethernet frame is converted to electrical signals that are sent over the CAT5 cable. Those functions are performed at TCP/IP Layer 1 (network access).