i created my own protocol for my games...

The PrimeTime · 5 min read

Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Every packet starts with a version byte so clients can detect incompatible protocol changes and react (reset/update/ban) rather than misparse data.

Briefing

A custom game-network protocol is built from scratch, starting with a compact, versioned packet header and ending with a streaming “framer” that can reliably extract complete packets from a TCP byte stream. The core payoff is control: the protocol defines exactly how messages are encoded, how much payload data is carried, and how framing handles the messy reality that network reads can return partial packets—or multiple packets at once.

The packet format is intentionally simple, with a fixed 4-byte header. The first byte carries a protocol version, enabling safe evolution when new flags or fields are needed. The next byte splits into two parts: a 2-bit encoding type (allowing up to four payload formats, such as Protocol Buffers, JSON, or a custom scheme) and a 6-bit message type (room for up to 64 message types). Two more bytes store the payload length, so the receiver knows how many bytes belong to the packet’s data section; a 2-byte length caps the payload at 65,535 bytes. Multi-byte fields are written in network byte order (big endian), so every machine interprets them the same way.
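The layout described above can be sketched in Go roughly as follows. This is a minimal illustration, not the video's actual code: the `Packet` struct, the `Encode` function, the version value `1`, and the choice to put the encoding in the top two bits of the second byte are all assumptions made for the example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerSize = 4 // version + encoding/type + 2-byte length
	version    = 1 // hypothetical current protocol version
)

// Packet is a hypothetical in-memory form of one message.
type Packet struct {
	Encoding uint8 // 0..3, assumed to live in the top 2 bits of byte 1
	Type     uint8 // 0..63, assumed to live in the low 6 bits of byte 1
	Data     []byte
}

// Encode lays the packet out as:
// version | encoding+type | length (big endian) | data.
func Encode(p Packet) ([]byte, error) {
	if p.Encoding > 3 || p.Type > 63 {
		return nil, errors.New("encoding or type out of range")
	}
	if len(p.Data) > 0xFFFF {
		return nil, errors.New("payload too large for a 2-byte length")
	}
	out := make([]byte, headerSize+len(p.Data))
	out[0] = version
	out[1] = p.Encoding<<6 | p.Type
	binary.BigEndian.PutUint16(out[2:4], uint16(len(p.Data)))
	copy(out[headerSize:], p.Data)
	return out, nil
}

func main() {
	raw, err := Encode(Packet{Encoding: 1, Type: 5, Data: []byte("hi")})
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", raw) // 01 45 00 02 68 69
}
```

Returning an error for out-of-range fields is one possible choice here; the video's code reportedly prefers assertions for states that should never happen.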

Beyond the packet structure, the video highlights a separate layer: framing. Because TCP delivers a byte stream and does not preserve message boundaries, the receiver must reconstruct packet boundaries using the header’s length field. The framing logic keeps an internal buffer and loops: it reads from a reader into that buffer, checks whether enough bytes exist to parse a full header and, once the payload length is known, whether the full packet has arrived. If the buffer holds more than one packet’s worth of data, the framer copies out each completed packet and shifts the remaining bytes forward so parsing can continue without losing data.
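A buffer-and-shift framer along these lines might look like the sketch below. The `Framer` type and its `Push` method are hypothetical names; instead of wrapping a reader, the example takes each chunk as an argument so that partial and combined reads are easy to simulate.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const headerSize = 4

// Framer accumulates stream bytes and cuts them into complete packets.
type Framer struct {
	buf []byte
}

// Push appends one chunk (as a single read might return it) and returns
// every complete packet now available. Leftover bytes stay buffered.
func (f *Framer) Push(chunk []byte) [][]byte {
	f.buf = append(f.buf, chunk...)
	var packets [][]byte
	for len(f.buf) >= headerSize {
		total := headerSize + int(binary.BigEndian.Uint16(f.buf[2:4]))
		if len(f.buf) < total {
			break // payload not fully received yet
		}
		pkt := make([]byte, total)
		copy(pkt, f.buf[:total])
		packets = append(packets, pkt)
		// Shift the remaining bytes to the front of the buffer.
		n := copy(f.buf, f.buf[total:])
		f.buf = f.buf[:n]
	}
	return packets
}

func main() {
	a := []byte{1, 0x45, 0, 2, 'h', 'i'}
	b := []byte{1, 0x46, 0, 1, '!'}
	stream := append(append([]byte{}, a...), b...)

	var f Framer
	fmt.Println(len(f.Push(stream[:3])))  // 0: header incomplete
	fmt.Println(len(f.Push(stream[3:8]))) // 1: first packet done, second partial
	fmt.Println(len(f.Push(stream[8:])))  // 1: second packet completed
}
```

Copying each packet out before shifting matters: the shift overwrites the front of the buffer, so a slice that merely pointed into it would be corrupted.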

The framing approach is described as “almost identical to” WebSocket framing, with the key differences that WebSocket uses variable-length length fields and masks client-to-server frames. Here, the protocol relies on a fixed header and a straightforward length check. When parsing detects invalid conditions, such as version mismatches, impossible payload sizes, or inconsistent length fields, the system uses assertions to crash early, treating unexpected states as fundamentally wrong.

To keep the implementation trustworthy, the protocol includes both unit tests for encoding/decoding and a dedicated framer test. The framer test writes the same packet five times into a buffer, runs the framer concurrently (using an error group), and verifies that exactly five packets are emitted before cancellation. A notable bug is also called out: an earlier parsing condition only allowed one packet per read cycle, so additional packets in the same TCP chunk sat unparsed in the buffer until more data arrived.
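The shape of that five-packet test can be approximated with the standard library alone. Two substitutions are worth flagging: a plain goroutine with an error channel stands in for the error group, and the hypothetical `frame` function uses `io.ReadFull` to read exactly one header and one payload at a time, which is a simpler technique than the buffer-shifting framer the video describes.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const headerSize = 4

// frame reads packets from r until EOF and sends each one on out.
func frame(r io.Reader, out chan<- []byte) error {
	defer close(out)
	for {
		header := make([]byte, headerSize)
		if _, err := io.ReadFull(r, header); err != nil {
			if err == io.EOF {
				return nil // clean end of stream between packets
			}
			return err
		}
		payload := make([]byte, binary.BigEndian.Uint16(header[2:4]))
		if _, err := io.ReadFull(r, payload); err != nil {
			return err
		}
		out <- append(header, payload...)
	}
}

// countPackets writes pkt n times into a buffer, frames it concurrently,
// and reports how many packets came out.
func countPackets(pkt []byte, n int) (int, error) {
	stream := bytes.NewBuffer(bytes.Repeat(pkt, n))
	out := make(chan []byte)
	errc := make(chan error, 1)
	go func() { errc <- frame(stream, out) }()

	count := 0
	for range out {
		count++
	}
	return count, <-errc
}

func main() {
	pkt := []byte{1, 0x45, 0, 2, 'h', 'i'}
	n, err := countPackets(pkt, 5)
	fmt.Println(n, err) // 5 <nil>
}
```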

Finally, the protocol is positioned as groundwork for a larger architecture: authentication and matchmaking are envisioned as a “reverse proxy” layer inside a game server. The stated goal is to support roughly 60 to 100,000 concurrent players for a project called Vim Arcade, with the code hosted on GitHub under the ThePrimeagen account.

Cornell Notes

The protocol design separates two concerns: a fixed packet header and a streaming framer that reconstructs packet boundaries from TCP bytes. Each packet begins with a version byte, then an encoding/type byte (2 bits for encoding, 6 bits for message type), followed by a 2-byte payload length. Network order (big endian) is used so different machines interpret fields consistently. The framer reads into an internal buffer, parses headers only when enough bytes exist, and emits complete packets while shifting leftover bytes forward—handling cases where TCP delivers multiple packets at once. Unit tests validate encoding/decoding and framer behavior, including a test that expects exactly five packets from a buffer containing five concatenated packets.

Why include a version byte as the first field in every packet?

The version byte makes protocol evolution survivable. When new fields, flags, or message semantics are introduced, the receiver can detect whether it understands the incoming format. That enables controlled behavior like rejecting old clients, forcing updates, or resetting game state when the protocol version doesn’t match.
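As a small sketch of that check, the receiver can inspect the first byte before touching anything else. The names `currentVersion`, `ErrVersionMismatch`, and `checkVersion` are hypothetical, and returning an error (rather than asserting) is an assumption that leaves the reject/update/reset decision to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

const currentVersion = 1 // hypothetical version this build understands

// ErrVersionMismatch lets callers decide how to react (reject, update, reset).
var ErrVersionMismatch = errors.New("protocol version mismatch")

// checkVersion inspects only the first byte of a packet.
func checkVersion(packet []byte) error {
	if len(packet) == 0 {
		return errors.New("empty packet")
	}
	if packet[0] != currentVersion {
		return fmt.Errorf("%w: got %d, want %d", ErrVersionMismatch, packet[0], currentVersion)
	}
	return nil
}

func main() {
	fmt.Println(checkVersion([]byte{1, 0x45, 0, 0})) // <nil>
	fmt.Println(checkVersion([]byte{2, 0x45, 0, 0})) // protocol version mismatch: got 2, want 1
}
```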

How does the protocol fit multiple payload formats into a single packet design?

A 2-bit encoding field is embedded in the second byte, giving four encoding options. That allows the payload to be interpreted as Protocol Buffers, JSON, or a custom binary format depending on the encoding value, while keeping the rest of the header structure stable.
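On the receiving side, the encoding value can drive how the payload is decoded. The sketch below is illustrative only: the numeric encoding values, the `Move` message, the two-byte "custom" scheme, and the assumption that the encoding occupies the top two bits are all invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical encoding values; only the 2-bit width comes from the text.
const (
	EncodingJSON   = 0
	EncodingCustom = 1
)

// Move is a hypothetical game message.
type Move struct {
	X int `json:"x"`
	Y int `json:"y"`
}

// splitTypeByte separates the second header byte into encoding and
// message type, assuming the encoding sits in the top 2 bits.
func splitTypeByte(b byte) (encoding, msgType uint8) {
	return b >> 6, b & 0x3F
}

// decodeMove interprets payload according to the encoding value.
func decodeMove(encoding uint8, payload []byte) (Move, error) {
	switch encoding {
	case EncodingJSON:
		var m Move
		err := json.Unmarshal(payload, &m)
		return m, err
	case EncodingCustom:
		// Hypothetical custom scheme: exactly two bytes, x then y.
		if len(payload) != 2 {
			return Move{}, fmt.Errorf("custom payload: want 2 bytes, got %d", len(payload))
		}
		return Move{X: int(payload[0]), Y: int(payload[1])}, nil
	default:
		return Move{}, fmt.Errorf("unsupported encoding %d", encoding)
	}
}

func main() {
	enc, typ := splitTypeByte(0x45) // binary 01 000101
	fmt.Println(enc, typ)           // 1 5
	fmt.Println(decodeMove(EncodingJSON, []byte(`{"x":3,"y":4}`)))
	fmt.Println(decodeMove(EncodingCustom, []byte{3, 4}))
}
```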

What problem does the length field solve, and why is it essential for TCP?

TCP does not preserve message boundaries; reads can return partial packets or multiple packets together. The 2-byte payload length tells the receiver exactly how many bytes belong to the current packet’s data section. The framer uses that length to decide when a full packet is available to parse and emit.

How does the framer handle cases where a single TCP read contains more than one packet?

It accumulates bytes in an internal buffer. When the buffer holds enough bytes for at least one full packet (header + payload), it copies out the completed packet and then shifts any remaining bytes forward. That shifting ensures the next packet can be parsed immediately from leftover data rather than waiting for another read.

What kinds of failures trigger immediate crashes via assertions?

Assertions are used to enforce invariants like correct protocol version, non-empty/valid payload expectations, and consistency between the declared payload length and the actual encoded data length. The approach treats these mismatches as “fundamentally stupid” states that should never occur in a correct implementation.
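Go has no built-in assert, so an approach like this usually relies on a small helper that panics. The `assert` helper and `payloadOf` function below are hypothetical, shown only to illustrate crashing early on the invariants listed above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	headerSize     = 4
	currentVersion = 1 // hypothetical
)

// assert panics with msg when cond is false.
func assert(cond bool, msg string) {
	if !cond {
		panic("assertion failed: " + msg)
	}
}

// payloadOf returns the data section of one complete packet, crashing on
// any invariant violation instead of returning an error.
func payloadOf(packet []byte) []byte {
	assert(len(packet) >= headerSize, "packet shorter than header")
	assert(packet[0] == currentVersion, "version mismatch")
	declared := int(binary.BigEndian.Uint16(packet[2:4]))
	assert(declared == len(packet)-headerSize, "declared length differs from actual data length")
	return packet[headerSize:]
}

func main() {
	fmt.Printf("%s\n", payloadOf([]byte{1, 0x45, 0, 2, 'h', 'i'})) // hi

	// The recover here exists only so the demo can print the failure.
	defer func() { fmt.Println("recovered:", recover()) }()
	payloadOf([]byte{1, 0x45, 0, 9, 'h', 'i'}) // length field claims 9 bytes
}
```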

What was the framing bug discovered during testing?

An earlier parsing condition only allowed one packet to be read out per cycle. If multiple packets arrived in the same buffer chunk, only the first would be emitted and the rest would wait until more data arrived. The fix was to loop parsing so it can emit all complete packets currently present in the buffer.
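The difference between the buggy and fixed behavior comes down to an `if` versus a loop. The following sketch reproduces the idea with hypothetical helpers (`next`, `drainOnce`, `drainAll`); it is not the original code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const headerSize = 4

// next returns the first complete packet in buf and the remaining bytes,
// or ok=false when buf does not yet hold a full packet.
func next(buf []byte) (pkt, rest []byte, ok bool) {
	if len(buf) < headerSize {
		return nil, buf, false
	}
	total := headerSize + int(binary.BigEndian.Uint16(buf[2:4]))
	if len(buf) < total {
		return nil, buf, false
	}
	return buf[:total], buf[total:], true
}

// drainOnce mirrors the bug: at most one packet is taken per read cycle.
func drainOnce(buf []byte) (packets [][]byte, rest []byte) {
	if pkt, r, ok := next(buf); ok {
		return [][]byte{pkt}, r
	}
	return nil, buf
}

// drainAll mirrors the fix: keep parsing until no complete packet remains.
func drainAll(buf []byte) (packets [][]byte, rest []byte) {
	rest = buf
	for {
		pkt, r, ok := next(rest)
		if !ok {
			return packets, rest
		}
		packets = append(packets, pkt)
		rest = r
	}
}

func main() {
	one := []byte{1, 0x45, 0, 1, 'a'}
	chunk := append(append([]byte{}, one...), one...) // two packets in one read
	buggy, left := drainOnce(chunk)
	fixed, _ := drainAll(chunk)
	fmt.Println(len(buggy), len(left), len(fixed)) // 1 5 2
}
```

With `drainOnce`, the second packet is not lost, but it stays stuck in the leftover bytes until another read triggers parsing again.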

Review Questions

  1. What exact fields are in the packet header, and how do they determine how many bytes the receiver should read for the payload?
  2. Describe the framer’s algorithm for parsing from an internal buffer. How does it avoid losing bytes when multiple packets arrive together?
  3. Why does the protocol insist on network byte order, and what could go wrong if it were ignored?

Key Points

  1. Every packet starts with a version byte so clients can detect incompatible protocol changes and react (reset/update/ban) rather than misparse data.
  2. The header packs encoding and message type into a single byte: 2 bits for encoding (up to four payload formats) and 6 bits for message type (up to 64 message types).
  3. A 2-byte payload length field is the receiver’s anchor for reconstructing packet boundaries over TCP’s byte-stream behavior.
  4. Parsing must be separated from framing: the framer reconstructs complete packets using the header length, while the packet object handles encoding/type/data access.
  5. Framing must support “multiple packets in one read” by emitting all complete packets found in the buffer and shifting leftover bytes forward.
  6. Network order (big endian) is used for multi-byte fields like payload length to ensure cross-platform correctness.
  7. Unit tests validate both encoding/decoding and framing, including a concurrency-based framer test that expects exactly five emitted packets from five concatenated packets.
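To make the byte-order point concrete, here is a short sketch of what happens when a big-endian length is read back with the wrong byte order. The helper names `putLength`, `lengthBE`, and `lengthLE` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putLength writes n in network byte order (most significant byte first).
func putLength(n uint16) []byte {
	b := make([]byte, 2)
	binary.BigEndian.PutUint16(b, n)
	return b
}

// lengthBE reads the field the way the protocol intends.
func lengthBE(b []byte) uint16 { return binary.BigEndian.Uint16(b) }

// lengthLE reads the same bytes with the wrong byte order, as a receiver
// that ignored network order might.
func lengthLE(b []byte) uint16 { return binary.LittleEndian.Uint16(b) }

func main() {
	wire := putLength(258) // 0x0102
	fmt.Printf("on the wire: % x\n", wire)        // on the wire: 01 02
	fmt.Println(lengthBE(wire), lengthLE(wire)) // 258 513
}
```

A receiver that read 513 instead of 258 would wait for bytes that belong to the next packet, and every packet boundary after that point would be wrong.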

Highlights

The protocol’s header is deliberately compact: version (1 byte), encoding/type (1 byte), and payload length (2 bytes).
The framer is the real complexity: it reconstructs packet boundaries from TCP by buffering, parsing, emitting, and shifting leftover bytes.
A key bug was dropping additional packets when multiple arrived in one chunk; the fix was to parse in a loop until no complete packet remains.
Assertions are used aggressively to crash on invariant violations like version mismatches or length inconsistencies, turning unexpected states into immediate failures.

Topics

  • Custom Protocol Design
  • Packet Framing
  • Network Byte Order
  • TCP Stream Parsing
  • Game Server Architecture