Backend server development often uses TCP and UDP without much thought. HTTP APIs and WebSocket run on TCP. Voice/video streaming and DNS use UDP. When workloads with different reliability and performance requirements — chat servers, ad servers — enter the picture, understanding the transport layer becomes necessary.

Transport Layer

The transport layer handles data delivery between applications. While IP finds the route to a host, the transport layer determines which process on that host receives the data. Port numbers serve this purpose.

Both TCP and UDP run on top of IP. The difference is the choice between reliability and speed.

TCP

TCP (Transmission Control Protocol) is a connection-oriented protocol. It establishes a connection before sending data and retransmits on loss.

Connection Establishment

TCP establishes connections through a 3-way handshake.

sequenceDiagram
    participant C as Client
    participant S as Server

    C->>S: SYN (seq=x)
    Note right of S: SYN received, preparing
    S->>C: SYN-ACK (seq=y, ack=x+1)
    Note left of C: SYN-ACK received
    C->>S: ACK (ack=y+1)
    Note over C,S: Connection established

The client sends a SYN packet along with its sequence number. The server responds with SYN-ACK, providing the server-side sequence number. The client sends ACK to complete the connection. The exchanged sequence numbers serve as reference points for tracking order in subsequent data transfer.
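The sequence and acknowledgment numbers in the exchange above follow a simple rule: each side acknowledges the other's initial sequence number plus one. The sketch below only models that arithmetic. The `Packet` type and `three_way_handshake` function are hypothetical names for illustration, not a real socket API, and the 32-bit wraparound is the one detail added beyond the diagram.

```python
from typing import List, NamedTuple, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class Packet(NamedTuple):
    sender: str
    flags: str
    seq: Optional[int]
    ack: Optional[int]


def three_way_handshake(client_isn: int, server_isn: int) -> List[Packet]:
    """Return the three handshake packets for the given initial sequence numbers."""
    syn = Packet("client", "SYN", seq=client_isn, ack=None)
    # The server acknowledges the SYN by asking for the next number: x + 1.
    syn_ack = Packet("server", "SYN-ACK", seq=server_isn,
                     ack=(client_isn + 1) % SEQ_SPACE)
    # The client acknowledges the server's SYN the same way: y + 1.
    ack = Packet("client", "ACK", seq=(client_isn + 1) % SEQ_SPACE,
                 ack=(server_isn + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]
```

Calling `three_way_handshake(100, 300)` yields a SYN with seq=100, a SYN-ACK with seq=300 and ack=101, and a final ACK with ack=301.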

Connection Termination

Connection termination uses a 4-way handshake. TCP operates in full-duplex mode, so each direction must be closed separately.

sequenceDiagram
    participant A as Initiator
    participant B as Peer

    A->>B: FIN
    Note right of B: FIN received, inbound closed
    B->>A: ACK
    Note over B: Finishes sending remaining data
    B->>A: FIN
    Note left of A: FIN received, outbound closed
    A->>B: ACK
    Note over A: Enters TIME_WAIT state
    Note over A,B: Connection released

When one side sends FIN, it signals “no more data to send.” The peer responds with ACK, finishes sending its remaining data, then sends its own FIN. The initiator enters TIME_WAIT after sending the final ACK, allowing time for delayed packets that may still be in the network.
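The termination sequence can also be read as a small state machine on each side. The sketch below uses the standard TCP state names (only TIME_WAIT appears in the text above) and covers just the ordinary close. Simultaneous close and resets are left out, and the event strings are made up for this example.

```python
# (state, event) -> next state, restricted to the ordinary FIN exchange
TRANSITIONS = {
    # Initiator (active close)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",    # the final ACK is sent here
    ("TIME_WAIT", "timer expires"): "CLOSED",
    # Peer (passive close)
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # ACK sent; may still send data
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Return every state visited while applying the events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an unmodeled event
        path.append(state)
    return path
```

The peer stays in CLOSE_WAIT for as long as it still has data to send, which is why the two FINs are separate steps rather than one combined packet.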

Segmentation

Data sent by the application is split into segments by TCP. A single send() call transmitting 4 KB (4096 bytes) gets divided into multiple segments, each no larger than the MSS (Maximum Segment Size), which is 1460 bytes in this example.

flowchart LR
    A["Application\n4KB payload"] --> B["TCP"]
    B --> C["Segment 1\nseq=1\n1460 bytes"]
    B --> D["Segment 2\nseq=1461\n1460 bytes"]
    B --> E["Segment 3\nseq=2921\n1176 bytes"]

The receiving TCP reassembles arriving segments in sequence number order and delivers them to the application. Segmentation and reassembly happen transparently — the application sees only the original continuous byte stream.
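Segmentation and reassembly can be sketched in a few lines. This is an illustration under simplifying assumptions, not how a kernel does it: `segment` and `reassemble` are hypothetical helpers, sequence numbers start at 1 as in the diagram, and every segment is assumed to arrive exactly once.

```python
def segment(payload: bytes, mss: int = 1460, first_seq: int = 1):
    """Split payload into (seq, chunk) pairs of at most mss bytes each."""
    return [(first_seq + offset, payload[offset:offset + mss])
            for offset in range(0, len(payload), mss)]


def reassemble(segments) -> bytes:
    """Rebuild the byte stream from segments given in any arrival order."""
    return b"".join(chunk for _seq, chunk in sorted(segments))
```

For a 4096-byte payload this produces segments starting at sequence numbers 1, 1461, and 2921, and `reassemble` returns the original bytes even when the list is shuffled.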

Reliability Guarantees

Segments can be lost or arrive out of order as they traverse the network. TCP guarantees data delivery through two mechanisms.

Ordering: Each segment carries a sequence number. The receiver reassembles data in original order using these numbers. Even when segments arrive out of order, the application receives sorted data.

Retransmission: After sending data, the sender waits for an ACK. If no ACK arrives within the RTO (Retransmission Timeout), it resends the data.
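The text does not say how the RTO is chosen. A common approach, assumed here, is the smoothed estimator standardized in RFC 6298: track a smoothed RTT and its variation, then set the timeout a few deviations above the mean. The class name and the `min_rto` parameter are illustrative, and clock granularity is ignored.

```python
class RtoEstimator:
    """RTO estimate in the style of RFC 6298 (clock granularity ignored)."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variation

    def __init__(self, min_rto: float = 1.0):
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto  # RFC 6298 recommends 1 s; real stacks often use less

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return max(self.min_rto, self.srtt + 4 * self.rttvar)
```

Because the variation term is multiplied by four, a path with jittery RTTs gets a much more conservative timeout than a stable one with the same average.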

ACK numbers use cumulative acknowledgment. “ACK 3” means “everything before 3 has been received; expecting 3 next.”

Normal Flow

sequenceDiagram
    participant S as Sender
    participant R as Receiver
    S->>R: Segment 1
    R->>S: ACK 2
    S->>R: Segment 2
    R->>S: ACK 3

When segments arrive in order, the receiver sends an ACK with the next expected number. ACK 2 means “received 1, expecting 2 next.”

Loss Scenario

sequenceDiagram
    participant S as Sender
    participant R as Receiver
    S->>R: Segment 1, 2
    R->>S: ACK 3
    S--xR: Segment 3 [lost]
    S->>R: Segment 4, 5, 6
    R->>S: ACK 3 (dup 1)
    R->>S: ACK 3 (dup 2)
    R->>S: ACK 3 (dup 3)
    Note over S: 3 duplicates → loss detected
    S->>R: Segment 3 [retransmitted]
    R->>S: ACK 7

The receiver cannot advance the ACK number past 3 because Segment 3 is missing, even as Segments 4, 5, and 6 arrive. Each of those out-of-order arrivals makes it repeat ACK 3. When the sender sees 3 duplicate ACKs, it retransmits immediately without waiting for a timeout. Once the receiver gets the retransmitted Segment 3, it combines it with the buffered Segments 4 through 6 and sends ACK 7.
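The receiver side of cumulative acknowledgment can be sketched as follows. This is a simplified model that numbers whole segments from 1, as the diagrams do, rather than bytes. `receiver_acks` is a hypothetical helper.

```python
def receiver_acks(arrivals):
    """Return the cumulative ACK number sent after each arriving segment."""
    buffered = set()      # segments received but not yet delivered in order
    next_expected = 1
    acks = []
    for seg in arrivals:
        buffered.add(seg)
        # Deliver every segment that is now contiguous with what came before.
        while next_expected in buffered:
            buffered.remove(next_expected)
            next_expected += 1
        acks.append(next_expected)
    return acks
```

With the arrival order 1, 2, 4, 5, 6, 3, the ACK number stays at 3 while the gap exists and jumps straight to 7 once the missing segment shows up, because the buffered segments are released together.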

Flow Control

Exceeding the receiver’s processing capacity causes data loss. TCP prevents this with the sliding window mechanism.

The receiver advertises its available buffer size as the receive window, rwnd. The sender does not send more unacknowledged data than the rwnd allows.

Awaiting ACKs:

block-beta
    columns 10
    block:window["Send Window (rwnd = 4)"]:4
        s3["3 sent"]
        s4["4 sent"]
        s5["5 ready"]
        s6["6 ready"]
    end
    s7["7"]
    s8["8"]
    s9["9"]
    s10["10"]
    s11["11"]
    s12["12"]

    style s3 fill:#4CAF50
    style s4 fill:#4CAF50
    style s5 fill:#42A5F5
    style s6 fill:#42A5F5

Segments 3 and 4 have been sent. Segments 5 and 6 are inside the window but not yet transmitted. Segment 7 onward is outside the window and cannot be sent.

After the receiver acknowledges Segment 3 (ACK 4), the window slides right:

block-beta
    columns 10
    s3["3 ✓"]
    block:window["Send Window (rwnd = 4)"]:4
        s4["4 sent"]
        s5["5 sent"]
        s6["6 ready"]
        s7["7 ready"]
    end
    s8["8"]
    s9["9"]
    s10["10"]
    s11["11"]
    s12["12"]

    style s3 fill:#9E9E9E
    style s4 fill:#4CAF50
    style s5 fill:#4CAF50
    style s6 fill:#42A5F5
    style s7 fill:#42A5F5

When ACK 4 returns, confirming Segment 3, the window shifts one position right. Segment 3 leaves the window as complete and Segment 7 enters it. This repeats with each ACK. When the receiver’s buffer fills up, it advertises rwnd=0 to pause transmission. When buffer space opens, it advertises a new window size to resume.

The sliding window overcomes the limitations of Stop-and-Wait, where only one packet can be in transit at a time. With sliding windows, multiple packets within the window range transmit continuously while awaiting ACKs.
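The window bookkeeping reduces to a little arithmetic on the sender side. The sketch below counts whole segments to match the diagrams, whereas real TCP counts bytes. `window_state` and its parameter names are assumptions for illustration.

```python
def window_state(base: int, next_to_send: int, rwnd: int):
    """Classify segments for a sender whose oldest unacknowledged segment is base.

    Assumes base <= next_to_send. The window covers base .. base + rwnd - 1.
    """
    limit = base + rwnd  # first segment number outside the window
    return {
        "in_flight": list(range(base, next_to_send)),
        "ready": list(range(next_to_send, max(limit, next_to_send))),
        "first_blocked": limit,
    }
```

`window_state(3, 5, 4)` reports segments 3 and 4 in flight, 5 and 6 ready, and 7 as the first blocked segment. Advancing `base` by one shifts every boundary by one, and `rwnd=0` leaves nothing ready to send.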

Two retransmission strategies exist. Go-Back-N retransmits all packets from the lost one onward; it is simple to implement but causes unnecessary retransmissions. Selective Repeat retransmits only the lost packets; it requires receiver-side buffering but uses the network more efficiently. TCP is a hybrid of the two: it keeps cumulative ACKs, buffers out-of-order segments at the receiver, and, when both sides negotiate the SACK option, retransmits only the missing segments as Selective Repeat does.

Congestion Control

While flow control protects the receiver’s capacity, congestion control protects network path capacity. When the network is congested, router buffers overflow and packets are lost.

TCP manages a variable called the congestion window, cwnd. The amount of unacknowledged data the sender may have in flight is limited by the smaller of rwnd and cwnd.

Slow Start

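At connection start, network capacity is unknown. In the classic algorithm, cwnd begins at 1 MSS (modern stacks typically start larger, around 10 MSS) and increases by 1 MSS for each ACK received. Because a full window of ACKs arrives every RTT, cwnd doubles each RTT, which is exponential growth.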
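Why "1 MSS per ACK" turns into doubling per RTT is easier to see when one round is spelled out. The sketch below assumes every segment is acknowledged individually (no delayed ACKs) and nothing is lost. The function names are illustrative.

```python
def slow_start_round(cwnd: int) -> int:
    """One RTT of slow start: send cwnd segments, grow by 1 MSS per ACK."""
    acks = cwnd  # assumption: each segment of the round gets its own ACK
    for _ in range(acks):
        cwnd += 1
    return cwnd


def slow_start(rounds: int, cwnd: int = 1):
    """Return cwnd (in MSS) at the start of each RTT."""
    history = [cwnd]
    for _ in range(rounds):
        cwnd = slow_start_round(cwnd)
        history.append(cwnd)
    return history
```

A window of n segments produces n ACKs, each adding one segment, so the next round starts at 2n.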

---
config:
    xyChart:
        xAxis:
            label: "RTT"
        yAxis:
            label: "cwnd (MSS)"
---
xychart-beta
    x-axis ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
    y-axis 0 --> 40
    line "cwnd" [1, 2, 4, 8, 16, 17, 18, 19, 20, 10, 11]

When cwnd reaches ssthresh (slow start threshold), TCP switches to congestion avoidance. In the graph above, cwnd reaches ssthresh (16) at RTT 4, after which growth becomes linear. Packet loss is detected after cwnd reaches 20, and at RTT 9 cwnd drops to half that value, 10.

Congestion Avoidance

After ssthresh, cwnd increases by only 1 MSS per RTT — linear growth. Transmission volume increases gradually until packet loss is detected. This is the AIMD strategy — Additive Increase, Multiplicative Decrease. It probes available bandwidth gradually without overloading the network.

Fast Retransmit

There are two ways to detect packet loss: timeout (RTO expiry) and duplicate ACKs. Waiting for a timeout can take hundreds of milliseconds to seconds.

Fast retransmit triggers immediate retransmission when 3 duplicate ACKs arrive, without waiting for timeout. Duplicate ACKs themselves signal that “data after the lost packet is arriving, but something in between is missing.”
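On the sender side, the trigger is just a counter over incoming ACK numbers. A minimal sketch, numbering whole segments as the diagrams do. `fast_retransmits` is a hypothetical helper and the threshold of 3 comes from the text above.

```python
def fast_retransmits(acks, threshold: int = 3):
    """Return the segments a sender would fast-retransmit for a stream of ACKs."""
    last_ack = None
    dup_count = 0
    retransmitted = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # The repeated ACK number names the segment the receiver is missing.
                retransmitted.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmitted
```

The first ACK carrying a given number does not count as a duplicate, so the ACK stream 2, 3, 3, 3, 3, 7 triggers a retransmission of segment 3, while 2, 3, 3, 5 (a brief reordering) does not.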

Fast Recovery

Returning to slow start after fast retransmit causes a sharp throughput drop. Fast recovery prevents this. Instead of dropping cwnd to 1, it halves cwnd and resumes directly from congestion avoidance.

Arriving duplicate ACKs indicate the network is not completely blocked — some packets are getting through. No need to retreat all the way to slow start.

flowchart TD
    A[Connection start] --> B["Slow Start\ncwnd exponential growth"]
    B -->|cwnd >= ssthresh| C["Congestion Avoidance\ncwnd linear growth"]
    B -->|Timeout| D["ssthresh = cwnd/2\ncwnd = 1 MSS"]
    C -->|3 dup ACKs| E["Fast Retransmit + Fast Recovery\nssthresh = cwnd/2\ncwnd = ssthresh"]
    C -->|Timeout| D
    D --> B
    E --> C

Timeout signals severe congestion — cwnd resets to 1 MSS. Loss detected via duplicate ACKs indicates milder congestion — cwnd halves.
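The transitions above can be put together in a small simulation with one event per RTT. It is a deliberately coarse Reno-style sketch: window inflation during fast recovery is skipped, slow start is capped at ssthresh, and the floor of 2 MSS on ssthresh is an added assumption. The event names are invented for this example.

```python
def simulate_cwnd(events, ssthresh: int = 16, cwnd: int = 1):
    """Apply one event per RTT: 'ack', 'dup3' (3 duplicate ACKs) or 'timeout'.

    Returns cwnd (in MSS) before the first event and after each event.
    """
    history = [cwnd]
    for event in events:
        if event == "ack":
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: double per RTT
            else:
                cwnd += 1                       # congestion avoidance: +1 MSS per RTT
        elif event == "dup3":
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh                     # fast recovery: continue from half
        elif event == "timeout":
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1                            # severe congestion: back to slow start
        else:
            raise ValueError(f"unknown event: {event}")
        history.append(cwnd)
    return history
```

Eight clean RTTs followed by a triple-duplicate-ACK loss and one more clean RTT give 1, 2, 4, 8, 16, 17, 18, 19, 20, 10, 11, the same shape as the cwnd graph earlier in this section. Replacing the loss event with a timeout sends cwnd back to 1 instead of 10.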

Implementation Variants

The four core algorithms share the same skeleton, but specific behavior varies by implementation.

TCP Reno: The first implementation to integrate slow start, congestion avoidance, fast retransmit, and fast recovery. Handles one packet loss per window efficiently.

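TCP NewReno: Addresses Reno’s limitation with multiple losses in a single window. After a fast retransmit, an ACK that acknowledges some but not all of the data outstanding when the loss was detected (a partial ACK) indicates that another segment was lost. NewReno stays in fast recovery and retransmits the next missing segment rather than waiting for a timeout and falling back to slow start.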

TCP CUBIC: The default congestion control algorithm in Linux. Uses a cubic function instead of RTT-proportional linear increase during congestion avoidance. Utilizes available bandwidth faster on high-bandwidth, long-distance networks.

BBR (Bottleneck Bandwidth and Round-trip propagation time): A model-based algorithm developed by Google. Determines transmission rate based on measured bandwidth and RTT rather than packet loss. Estimates the actual bottleneck bandwidth of the network path and aims for high throughput without excessively filling router buffers. Google has reported deploying BBR for its own services, including YouTube.
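To make CUBIC's "cubic function" concrete, the sketch below evaluates the window growth curve as specified in RFC 8312. The constants are that RFC's defaults and are assumptions relative to this article, which gives no formula. Only the curve itself is shown, not the rest of the algorithm.

```python
C = 0.4     # scaling constant (RFC 8312 default), assumed here
BETA = 0.7  # window is cut to BETA * w_max after a loss (RFC 8312 default)


def cubic_window(t: float, w_max: float) -> float:
    """Window in MSS, t seconds after a loss that happened at window w_max."""
    # k is the time the curve needs to climb back to w_max.
    k = (w_max * (1 - BETA) / C) ** (1 / 3)
    return C * (t - k) ** 3 + w_max
```

The curve starts at 0.7 × w_max right after the loss, flattens as it approaches w_max, then accelerates past it to probe for more bandwidth. The input is elapsed time rather than a count of RTTs, which is why growth does not slow down on long-distance paths.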

UDP

UDP (User Datagram Protocol) is a connectionless protocol. It transmits data immediately without a handshake.
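With the standard `socket` module, the absence of a handshake is visible directly in the code: there is no `connect()` or `accept()` step before data moves. The sketch below sends one datagram over the loopback interface. `udp_roundtrip` is a hypothetical helper, and the timeout stands in for the fact that a lost datagram would simply never arrive.

```python
import socket


def udp_roundtrip(payload: bytes, timeout: float = 1.0) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(timeout)     # nothing retransmits a lost datagram
        # No connection setup: the first packet sent already carries the payload.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
```

On a real network the `recvfrom` call could time out, and deciding what to do then is left entirely to the application.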

Structure

The UDP header is 8 bytes: source port, destination port, length, and checksum. Compared to TCP’s minimum 20-byte header, the overhead is minimal.

block-beta
    columns 4
    block:tcp["TCP Header (20+ bytes)"]:4
        t1["Source Port"]
        t2["Dest Port"]
        t3["Sequence Number"]
        t4["ACK Number"]
        t5["Flags"]
        t6["Window Size"]
        t7["Checksum"]
        t8["Options..."]
    end

    space:4

    block:udp["UDP Header (8 bytes)"]:4
        u1["Source Port"]
        u2["Dest Port"]
        u3["Length"]
        u4["Checksum"]
    end

    style tcp fill:#E3F2FD
    style udp fill:#E8F5E9

No ordering. No retransmission. Lost packets are the application’s responsibility. No flow control or congestion control either.
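The 8-byte header described above is small enough to build and parse by hand with the standard `struct` module. The helper names are illustrative, and the checksum is left at 0 rather than computed, which IPv4 permits.

```python
import struct

# Network byte order, four unsigned 16-bit fields:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_datagram(src_port: int, dst_port: int, payload: bytes,
                   checksum: int = 0) -> bytes:
    """Prefix payload with a UDP header (checksum not computed in this sketch)."""
    length = UDP_HEADER.size + len(payload)  # length covers header plus payload
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_datagram(datagram: bytes) -> dict:
    """Split a raw UDP datagram into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {"src_port": src, "dst_port": dst, "length": length,
            "checksum": checksum, "payload": datagram[UDP_HEADER.size:length]}
```

There is no sequence number, ACK number, or window field to parse, so nothing in the header could support ordering or retransmission.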

Why It Exists

TCP’s reliability comes with latency. The 3-way handshake takes at least 1 RTT. Retransmission adds more delay. Congestion control may throttle transmission speed.

For real-time workloads, this latency matters more than reliability. In a voice call, audio arriving 0.5 seconds late via retransmission disrupts the conversation. In a game, a stale position update arriving late is useless. In these cases, dropping lost data beats retransmitting it.
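One common way an application handles this on top of UDP, sketched here as an assumption rather than something the protocol provides, is to number its own updates and keep only the newest. The class and method names are hypothetical.

```python
class LatestStateReceiver:
    """Keep only the newest update; late or duplicate datagrams are dropped."""

    def __init__(self):
        self.latest_seq = -1
        self.state = None

    def on_datagram(self, seq: int, state) -> bool:
        """Apply the update if it is newer than anything seen so far.

        Returns True when the update was applied, False when it was stale.
        """
        if seq <= self.latest_seq:
            return False  # stale: a newer position is already displayed
        self.latest_seq = seq
        self.state = state
        return True
```

If update 2 is lost or arrives after update 3, the receiver never waits for it. Under TCP, update 3 would have been held back until update 2 was retransmitted.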

Comparison

flowchart LR
    subgraph TCP
        direction TB
        tc1[Connection-oriented]
        tc2[Ordered delivery]
        tc3[Retransmission]
        tc4[Flow/Congestion control]
        tc5[20+ byte header]
    end

    subgraph UDP
        direction TB
        uc1[Connectionless]
        uc2[No ordering]
        uc3[No retransmission]
        uc4[No control]
        uc5[8 byte header]
    end

    TCP --- Reliability
    UDP --- Speed

When to Choose Which

TCP fits when:

  • Data integrity is essential: web traffic (HTTP/HTTPS), file transfer (FTP), email (SMTP)
  • Database communication: query results must arrive correctly
  • API calls: lost requests or responses are unacceptable

UDP fits when:

  • Real-time streaming: video, voice calls (VoIP)
  • Online gaming: position and state updates
  • DNS: small request/response exchanges that need speed
  • IoT sensor data: periodic transmission, some loss acceptable

QUIC: The foundation protocol for HTTP/3. It implements TCP-like reliability (retransmission, ordering) and TLS 1.3 encryption on top of UDP. It combines the transport and TLS handshakes, so a new connection needs one round trip instead of the two or more that TCP plus TLS require, while preserving reliability. Unlike TCP, which is implemented in the OS kernel and difficult to modify, QUIC is typically implemented in user space as a library, enabling faster iteration.
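The latency saving is easiest to see as round-trip arithmetic. The counts below are the usual textbook figures for a full handshake before the first application byte can be sent. They are assumptions added for illustration, and real deployments vary with session resumption and options such as TCP Fast Open.

```python
# Round trips spent on setup before application data can be sent.
HANDSHAKE_RTTS = {
    "TCP + TLS 1.2": 1 + 2,        # TCP handshake, then a 2-RTT TLS handshake
    "TCP + TLS 1.3": 1 + 1,        # TCP handshake, then a 1-RTT TLS handshake
    "QUIC (new connection)": 1,    # transport and TLS 1.3 handshakes combined
    "QUIC (0-RTT resumption)": 0,  # data rides along with the first packet
}


def setup_delay_ms(rtt_ms: float) -> dict:
    """Setup delay in milliseconds for each stack at a given round-trip time."""
    return {name: rtts * rtt_ms for name, rtts in HANDSHAKE_RTTS.items()}
```

At a 50 ms round-trip time, this works out to 100 ms of setup for TCP with TLS 1.3 versus 50 ms for a new QUIC connection.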

It is easy to overlook the transport layer in backend development. But why HTTP runs on TCP, why WebSocket chose TCP, and why DNS uses UDP all start here. In the end, the protocol choice comes down to whether the workload needs reliability more or needs to cut latency more.