Sockets Tutorial with Python 3 part 1 - sending and receiving data

TL;DR

Use socket.AF_INET for IPv4 addressing and socket.SOCK_STREAM for TCP byte-stream communication.

Briefing

A basic TCP socket setup in Python can reliably send and receive messages, but it also exposes a key reality of networking: TCP delivers a byte stream, not discrete “messages.” That difference drives the need for buffering logic and—eventually—protocol design (like sending a header that tells the receiver how much data to expect).

The walkthrough starts by creating two files, server.py and client.py, using Python’s built-in socket library. On the server side, it creates a TCP streaming socket with socket.AF_INET (IPv4) and socket.SOCK_STREAM (TCP). The server binds to a host/port tuple—using socket.gethostname() and port 1234—then calls listen(5) to queue up to five incoming connection attempts. In an infinite loop, it accepts a connection via accept(), which returns a new client socket object plus the client’s address. After printing a connection-established message, the server sends a UTF-8 encoded payload (“welcome to the server”) using client_socket.send(...).
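The server steps above can be sketched roughly as follows. This is a minimal sketch, not the tutorial's exact server.py: the function names make_server and serve, the backlog and max_clients parameters, and the 127.0.0.1 / port 0 defaults are assumptions added so the loop can be stopped and tested. The tutorial itself binds to socket.gethostname() on port 1234 and loops forever.

```python
import socket


def make_server(host="127.0.0.1", port=0, backlog=5):
    """Create a listening TCP socket (names and defaults are illustrative)."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # IPv4 + TCP
    server_socket.bind((host, port))  # port 0 asks the OS for any free port
    server_socket.listen(backlog)     # queue of pending connection attempts
    return server_socket


def serve(server_socket, greeting="welcome to the server", max_clients=None):
    """Accept clients and send each one the greeting.

    max_clients=None loops forever, like the tutorial's server; a number
    stops after that many clients (an assumption added for testing).
    """
    handled = 0
    while max_clients is None or handled < max_clients:
        # accept() blocks until a client connects, then returns a new
        # per-connection socket plus the client's address.
        client_socket, address = server_socket.accept()
        print(f"Connection from {address} has been established.")
        client_socket.send(greeting.encode("utf-8"))  # str -> bytes before sending
        handled += 1
```

To mirror the tutorial's setup, one would call make_server(socket.gethostname(), 1234) and then serve(...) with no client limit.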

On the client side, the same socket type is created, but instead of binding it calls connect((host, port)). With the connection established, the client receives bytes using recv(1024), then decodes them as UTF-8 and prints the resulting text. Running server.py and then client.py demonstrates the simplest case: the client receives the message and exits. The immediate disconnect behavior is treated as a limitation of the minimal example rather than a full “long-lived” chat-style design.
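A client along these lines might look like the sketch below. The function name receive_once and its parameters are hypothetical; the tutorial's client.py runs the same calls at module level against socket.gethostname() and port 1234.

```python
import socket


def receive_once(host, port, buffer_size=1024):
    """Connect, perform a single recv(), and return the decoded text."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_socket:
        client_socket.connect((host, port))   # connect instead of bind/listen
        msg = client_socket.recv(buffer_size)  # returns at most buffer_size bytes
    return msg.decode("utf-8")                 # bytes -> str
```

Because only one recv() call is made, the result is whatever bytes were available at that moment, up to buffer_size. For a short greeting that is usually the whole message, but nothing guarantees it.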

The tutorial then highlights the practical buffering problem. Because TCP is a stream, recv() returns at most the number of bytes requested; if more data is pending, a single call returns only part of the payload. The example reduces the buffer size to 8 bytes, reruns the client, and shows that only a fragment (“welcome”) arrives unless the client keeps reading. Wrapping recv() in a while True loop lets the client accumulate the full message in a full_message string by decoding each chunk and appending it.
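The chunking effect can be reproduced in a single process. In this sketch, socket.socketpair() stands in for the tutorial's client/server connection (an assumption made to keep the example self-contained; it is a connected stream pair, not necessarily TCP), and the helper read_chunks with its expected_bytes stop condition is hypothetical: it works only because the demo already knows the payload length.

```python
import socket


def read_chunks(sock, buffer_size, expected_bytes):
    """Call recv(buffer_size) repeatedly until expected_bytes have arrived."""
    chunks = []
    received = 0
    while received < expected_bytes:
        chunk = sock.recv(buffer_size)
        if not chunk:  # empty result: the peer closed early
            break
        chunks.append(chunk)
        received += len(chunk)
    return chunks


# A connected pair of stream sockets plays the role of server and client.
server_side, client_side = socket.socketpair()
payload = "welcome to the server".encode("utf-8")  # 21 bytes
server_side.sendall(payload)

chunks = read_chunks(client_side, 8, len(payload))
# Per-chunk decoding is safe here only because the payload is ASCII; a
# multi-byte UTF-8 character could otherwise be split across two chunks.
full_message = "".join(chunk.decode("utf-8") for chunk in chunks)
print(chunks)
print(full_message)

server_side.close()
client_side.close()
```

With an 8-byte buffer the 21-byte greeting needs at least three reads. For non-ASCII text, collecting the raw bytes and decoding once at the end is the safer design.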

However, the stream doesn’t end just because the message “feels complete.” Without a clear termination condition, the client hangs: the connection remains open, so recv() blocks waiting for more bytes and the loop never stops on its own. The fix demonstrated is to rely on connection closure as the end signal. Once the server closes the connection after sending, recv() returns an empty bytes object, and the client can break out of the loop and print the full message. The takeaway is that real socket applications typically avoid guessing by defining a protocol, often a header that specifies the payload length, so the receiver knows exactly how much data to read.
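A read-until-closed loop of the kind described here could be sketched as below. The function name recv_until_closed is hypothetical, and the pattern works only if the sender actually closes its end of the connection after sending.

```python
import socket


def recv_until_closed(sock, buffer_size=8):
    """Accumulate decoded chunks until the peer closes the connection."""
    full_message = ""
    while True:
        chunk = sock.recv(buffer_size)
        if len(chunk) == 0:  # b"" means the other side closed the connection
            break
        # Per-chunk decoding assumes ASCII-only text, as in the tutorial's greeting.
        full_message += chunk.decode("utf-8")
    return full_message
```

If the sender never closes, the final recv() call blocks indefinitely, which is exactly the hang described above.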

The next installment is teased as the place where buffering and keeping the stream open are handled more robustly, likely with a clearer message framing approach.

Cornell Notes

The tutorial builds a minimal TCP client/server in Python using socket.AF_INET and socket.SOCK_STREAM. The server binds to a host/port, listens with a queue size of 5, accepts a connection, and sends a UTF-8 message. The client connects, then receives bytes with recv(buffer_size) and decodes them to text. A central lesson follows: TCP is a byte stream, so recv() returns up to N bytes and may split a logical message across multiple reads. To reconstruct full data, the client must loop and buffer chunks until the stream ends or until a protocol tells it how much data to expect—an approach promised for the next video.

Why does the code use socket.AF_INET and socket.SOCK_STREAM, and what do they imply?

socket.AF_INET selects IPv4 addressing, meaning the server and client communicate using an IPv4 host/port tuple. socket.SOCK_STREAM selects a TCP streaming socket, which delivers data as a continuous byte stream rather than preserving message boundaries. That stream behavior is why recv() must be handled carefully.

What role do bind(), listen(5), and accept() play on the server?

bind((host, port)) attaches the server socket to an IP/port so clients know where to connect. listen(5) prepares the server for incoming connections and sets a backlog queue of 5 pending connection attempts. accept() blocks until a client connects, then returns (1) a new client socket object for that connection and (2) the client’s address information.

How does the client receive data, and why is recv(1024) not guaranteed to read the whole message?

The client calls recv(1024) to read up to 1024 bytes from the TCP stream. TCP does not guarantee “one recv per send”: the bytes from one send may arrive across several reads, and bytes from several sends may arrive in a single read, depending on timing and buffer sizes. If the logical message exceeds the buffer size, a single recv() call necessarily returns only part of it.
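The lack of a one-to-one mapping between send and recv calls can be observed with a small experiment. This sketch again uses socket.socketpair() as a stand-in for a TCP connection (an assumption for self-containment), and the two message strings are made up for illustration. The number of reads it reports depends on the platform and timing, so no particular count should be expected.

```python
import socket

left, right = socket.socketpair()

# Two separate sends on the writing side...
left.sendall(b"first message;")
left.sendall(b"second message;")
left.close()  # closing lets the reader detect the end of the stream

# ...do not have to show up as two separate reads on the other side.
reads = []
while True:
    chunk = right.recv(1024)
    if not chunk:  # empty result: the writer closed its end
        break
    reads.append(chunk)
right.close()

print(f"{len(reads)} recv() call(s) returned data for 2 sends: {reads}")
```

Whatever the count, the concatenated bytes are the same, which is why the receiver rather than the transport has to decide where one message ends and the next begins.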

What changes when the buffer size is reduced to 8 bytes?

With a smaller recv buffer, the client receives only a small fragment of the server’s UTF-8 message per read. The tutorial demonstrates that without a loop, the client prints only the first chunk (e.g., “welcome”). Adding a while True loop and appending decoded chunks reconstructs the full message.

Why can a buffering loop hang, and what termination condition is used in the example?

A while True recv loop can hang because the TCP stream remains open until the connection closes. If the code doesn’t know when the message ends, it keeps waiting for more bytes. In the example, the client breaks out of the loop once the server closes the connection, which it detects when recv() returns an empty result. Connection closure therefore serves as the end signal.

What protocol improvement is foreshadowed for real applications?

Instead of relying on connection closure, the tutorial suggests sending a header that tells the client how much data will follow. With a length indicator, the client can read exactly that many bytes and stop cleanly, avoiding indefinite blocking and making message framing reliable.
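One way such a length header might work is sketched below. The 4-byte big-endian prefix and the names HEADER, send_message, recv_exact, and recv_message are assumptions chosen for illustration; the summary does not say which header format the tutorial goes on to use.

```python
import socket
import struct

# Assumed header format: a 4-byte unsigned big-endian payload length.
HEADER = struct.Struct("!I")


def send_message(sock, text):
    """Send the payload length first, then the UTF-8 payload itself."""
    payload = text.encode("utf-8")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, count):
    """Keep calling recv() until exactly count bytes have been collected."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("connection closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock):
    """Read one framed message: the header, then exactly that many bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length).decode("utf-8")
```

With framing like this, the connection can stay open and carry several messages in sequence: each recv_message call stops at a message boundary instead of waiting for a disconnect. Decoding happens once per complete payload, so multi-byte characters are never split.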

Review Questions

In TCP, what does recv(buffer_size) return, and how does that affect reconstructing a logical message?
What server-side calls are responsible for preparing for connections and creating a per-client socket?
Why is relying on connection closure as an end-of-message signal fragile, and what alternative framing method is hinted at?

Key Points

1. Use socket.AF_INET for IPv4 addressing and socket.SOCK_STREAM for TCP byte-stream communication.
2. Bind the server to a (host, port) tuple, then call listen(5) to queue incoming connection attempts.
3. Call accept() to create a dedicated client socket for each connected client.
4. Decode received bytes (e.g., UTF-8) after recv() returns raw byte chunks.
5. Treat TCP as a stream: recv(N) may return partial data, so buffer and loop to assemble full messages.
6. Avoid infinite recv loops by using a clear termination condition; connection closure works in the demo but is not ideal.
7. For robust messaging, send framing information (like a payload length header) so the receiver knows exactly how much to read.

Highlights

TCP delivers an ordered byte stream and does not preserve message boundaries, so recv(1024) can return only part of what was sent.

Reducing the buffer size to 8 bytes shows how a single logical message can arrive in multiple chunks.

A recv loop needs an end condition; otherwise it waits indefinitely because the connection stays open.

Real socket apps typically add message framing (e.g., a length header) rather than relying on disconnects.

Topics

TCP Sockets
Client-Server
recv Buffering
Message Framing
UTF-8 Encoding

Mentioned

TCP
IPv4
UTF-8