ch1
Introduction
TCP (Transmission Control Protocol)
- Reliable: TCP ensures that data sent from one application is received by another in the exact order it was sent. If data is lost or arrives out of order, TCP handles retransmissions and reordering.
- Connection-Oriented: Before data transmission begins, TCP establishes a logical connection between the sender and receiver, requiring a handshake process.
- Point-to-Point: Communication occurs between two endpoints. Each connection is exclusive to the pair of applications communicating.
- Stream-Oriented: Data is treated as a continuous stream of bytes, allowing flexible message sizes. The protocol does not preserve message boundaries, so data can be split or grouped differently from how it was sent.
UDP (User Datagram Protocol)
- Unreliable: UDP does not guarantee delivery or ordering, and it never retransmits. Its checksum lets the receiver detect and discard corrupted datagrams, but detecting and recovering lost data is left to the application layer.
- Connectionless: There's no need for a connection setup before data transmission, making UDP simpler and faster for certain use cases.
- Multicast Capable: UDP supports one-to-one (unicast), one-to-all on a local network (broadcast), and one-to-many group (multicast) communication, enabling more versatile messaging patterns than TCP's point-to-point model.
- Datagram-Oriented: Data is sent in discrete packets (datagrams) with defined boundaries. Each datagram is self-contained, and its delivery is independent of other datagrams.
Comparison and Use Cases
- Reliability vs. Speed: TCP's reliability and connection-oriented nature make it suitable for applications where data integrity and order are critical, such as web browsing (HTTP/HTTPS), email (SMTP, IMAP/POP3), and file transfers (FTP). On the other hand, UDP's simplicity and speed make it well-suited for time-sensitive applications where occasional data loss is acceptable, such as live video or audio streaming, VoIP, and online gaming.
- Complexity vs. Simplicity: TCP's retransmission, ordering, and flow control add complexity to the protocol but spare developers from implementing these features in their applications. UDP's simplicity offers flexibility but requires developers to handle loss detection, reordering, and recovery themselves if the application needs them.
- Use Cases: TCP is used when accurate, complete delivery matters more than latency, which makes it the choice for most internet applications. UDP is chosen for broadcasting and streaming where low latency is the priority and the application can tolerate some data loss.
More on Stream-Oriented vs. Datagram-Oriented
TCP (Transmission Control Protocol)
- Stream-Oriented: TCP is stream-oriented, meaning that it treats data as a continuous stream of bytes. When an application sends data over TCP, it is interpreted as part of this byte stream.
- Segmentation: TCP may break the data into smaller segments for transmission over the network. Segments can vary in size, and the IP packets that carry them may follow different paths to reach the destination.
- In practice, this means that if a sender application sends a particular block of data, there is no guarantee that the receiver application gets it as the same block in a single read. The message may be split into as many parts as the protocol chooses, or merged with neighboring data, and each part is delivered separately, though always in the correct order.
- Ordered Delivery: TCP ensures that these segments are delivered to the receiver application in the correct order. Even if segments arrive out of order, TCP reorders them before delivering them to the application.
- Reliability: TCP provides reliable, connection-oriented communication. It guarantees that data is delivered correctly and in order. If a segment is lost or corrupted during transmission, TCP automatically retransmits it.
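Because TCP does not preserve message boundaries, an application that exchanges discrete messages has to mark those boundaries itself. The sketch below shows one common approach, a length prefix in front of each message. The 4-byte big-endian header and the names `frame` and `FrameDecoder` are assumptions made for illustration and are not part of TCP.

```python
import struct
from typing import List

# Assumed framing scheme: a 4-byte big-endian length in front of every message.
HEADER = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find where it ends."""
    return HEADER.pack(len(message)) + message


class FrameDecoder:
    """Reassembles whole messages from arbitrarily sized chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every message that is now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


if __name__ == "__main__":
    stream = frame(b"hello") + frame(b"world!")
    decoder = FrameDecoder()
    # Simulate TCP handing over the stream in pieces unrelated to the sends.
    for chunk in (stream[:3], stream[3:11], stream[11:]):
        print(decoder.feed(chunk))
```

In a real program, each result of `recv()` on a TCP socket would be passed to `feed()`; the decoder returns complete messages regardless of how the stream happened to be split.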
UDP (User Datagram Protocol)
- Datagram-Oriented: UDP is datagram-oriented, meaning that it treats data as distinct, independent messages called datagrams. Each datagram is treated as a separate entity.
- Whole Message Delivery: Unlike TCP, UDP delivers each datagram either as a whole or not at all; the receiver never sees a partial datagram or two datagrams merged together. If a datagram is lost, the protocol does not attempt to deliver it again. There is no concept of retransmission or guaranteed delivery.
- Unordered Delivery: UDP does not guarantee the order of delivery. Datagrams may arrive at the receiver out of order, and it's up to the application to handle this accordingly.
- Low Overhead: UDP is lightweight and has lower overhead compared to TCP. It's often used for real-time applications or scenarios where low latency is more critical than guaranteed delivery.
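The datagram behavior described above can be observed with two UDP sockets on the loopback interface. This is a minimal sketch: the function name `exchange_datagrams`, the loopback address, and the 2-second timeout are assumptions chosen for the example.

```python
import socket


def exchange_datagrams(messages):
    """Send each message as one UDP datagram over loopback and read them back."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP may drop data, so never wait forever
        address = receiver.getsockname()

        for message in messages:
            sender.sendto(message, address)  # no connection setup is needed

        received = []
        for _ in messages:
            try:
                # Each call returns exactly one datagram, never a merged pair.
                data, _source = receiver.recvfrom(65535)
            except socket.timeout:
                break  # a datagram was lost; UDP will not resend it
            received.append(data)
        return received
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(exchange_datagrams([b"first", b"second message"]))
```

On loopback, loss and reordering are rare, so the two payloads normally come back unchanged, one per `recvfrom()` call. Across a real network the application would need its own sequence numbers or acknowledgements if it cares about either.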
Endpoints
Endpoints in Network Communication
- Definition: An endpoint is a combination of an IP address and a protocol port number that uniquely identifies a network service on a host. It serves as a specific destination or source in network communications.
IP Address Representation
- Dot-decimal and Hexadecimal: IPv4 addresses are typically represented in dot-decimal notation (e.g., 192.168.10.112), while IPv6 addresses use colon-separated hexadecimal notation (e.g., FE36::0404:C3FA:EF1E:3829).
- DNS Names: Instead of direct IP addresses, DNS names (e.g., www.google.com) can be used, which require resolution to their corresponding IP addresses before communication can proceed.
- Integer Values: Though technically feasible, representing IP addresses as integer values is rarely used due to poor readability.
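The representations listed above can be converted into one another with Python's standard `ipaddress` module. The helper name `describe` is an assumption for this sketch; the addresses are the ones used in the examples above.

```python
import ipaddress


def describe(text):
    """Parse a textual IP address and report its other representations."""
    address = ipaddress.ip_address(text)
    return {
        "version": address.version,    # 4 or 6
        "canonical": str(address),     # normalized textual form
        "integer": int(address),       # the same address as a plain integer
    }


if __name__ == "__main__":
    print(describe("192.168.10.112"))
    print(describe("FE36::0404:C3FA:EF1E:3829"))
    # An integer can be turned back into an address object as well.
    print(ipaddress.ip_address(3232238192))
```

The integer form of 192.168.10.112 is 3232238192, which illustrates why this representation is rarely shown to users.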
Client Application Usage
- Obtaining Server Endpoints: Client applications must know the server's endpoint to initiate communication. This information can be provided directly by the user, through command-line arguments, or read from a configuration file.
- DNS Name Resolution: If given a DNS name, the client application must resolve it to an actual IP address. This process might yield multiple IP addresses, requiring the client to attempt connections sequentially until successful.
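The resolve-then-try-each-address procedure can be sketched with the standard `socket` module. The function name `connect_to_first_reachable` and the 3-second timeout are assumptions for this example; Python's built-in `socket.create_connection` performs a similar loop and is usually the simpler choice.

```python
import socket


def connect_to_first_reachable(host, port, timeout=3.0):
    """Resolve host and try each resulting address until one accepts the connection."""
    last_error = None
    # getaddrinfo may return several entries, for example IPv6 and IPv4 addresses.
    candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as error:
            last_error = error  # address family not usable on this host
            continue
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock  # the first address that works wins
        except OSError as error:
            last_error = error
            sock.close()
    raise last_error or OSError(f"no addresses found for {host!r}")
```

A caller would use it as `sock = connect_to_first_reachable("www.google.com", 80)` and close the socket when done. If every address fails, the error from the last attempt is raised.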
Server Application Considerations
- Listening on Endpoints: Server applications use endpoints to inform the operating system where to listen for incoming client messages. The chosen endpoint specifies the IP address and port number for incoming connections.
- Multiple Network Interfaces: Hosts with multiple network interfaces and IP addresses pose a challenge for server applications in deciding which IP address(es) to listen on for incoming messages.
- Listening on All Addresses: To ensure no incoming messages are missed, server applications often listen on all available IP addresses on the host by binding to the wildcard address (0.0.0.0 for IPv4, :: for IPv6). This way the server receives every message directed to its port, regardless of which of the host's IP addresses the client used.
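The choice between one address and all addresses comes down to what the server passes to `bind()`. The sketch below covers IPv4 and TCP only; the function name `open_listener` is an assumption for this example.

```python
import socket


def open_listener(port, address=""):
    """Open a TCP listening socket.

    An empty address means "all IPv4 addresses of this host" (0.0.0.0);
    passing a specific address restricts the server to that interface.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts of the server on the same port.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((address, port))
    server.listen()
    return server


if __name__ == "__main__":
    everywhere = open_listener(0)               # port 0: the OS picks a free port
    loopback_only = open_listener(0, "127.0.0.1")
    print(everywhere.getsockname())             # address part is '0.0.0.0'
    print(loopback_only.getsockname())          # address part is '127.0.0.1'
    everywhere.close()
    loopback_only.close()
```

Binding to a specific address is the better choice when a service should be reachable only from one network, for example a loopback-only administration port.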