Understanding bytes in Go by building a TCP protocol

Some of my newsletter subscribers have asked me a few times what is the easiest way to think about byte slices, or using Go’s syntax: []byte. Folks that have experience with low-level languages, where working with bytes is widespread, usually do not have challenges comprehending what []byte means and how to use it.

But, if you come from a dynamic or a high-level language background, although everything we do ends up being a bunch of bytes, higher-level languages hide such details from us. Coming from a Ruby background, I sympathize with the folks that find this challenging.

Inspired by the conversations I had with these readers, I decided to try to help out. So, without beating around the bush - let’s explore how we can build a simple TCP-based application protocol. And thoroughly understand bytes in the process.

Standing on shoulders of giants #

The protocol that we will be implementing will work on top of TCP, because:

  1. TCP is high level enough, so we don’t have to worry about too many low-level connection details
  2. Go has excellent TCP support through the net package
  3. The interfaces that the net packages provide allow us to play with bytes (and slices of bytes)
  4. TCP provides certain guarantees when it comes to data transfer, so we are left with less to worry about
  5. Standing on the shoulders of giants is easier than reinventing the wheel

Now, because protocol design is no small feat, we will be implementing a very tiny and toyish protocol. I am by no means a protocol designer, so what we are going to be working on here is merely an example. But, if you have any experience with protocols, please do not hesitate to drop me a comment (or a message) and let me know what I got wrong.

Enter SLCK #

Let us imagine that Slack, the omnipresent collaboration and communication app, wants to turn their chat design into an open internet protocol. Being an open internet protocol would allow anyone to create a Slack-compliant server by implementing the SLCK protocol and build their version of Slack.

In theory, having an open protocol would allow Slack to become distributed, with people hosting their Slack servers. If people would host their SLCK servers, the servers will communicate with a cluster (or inter-server) protocol. With a cluster protocol (between servers) and a client protocol (between client and server), two SLCK servers will have the ability to communicate between themselves and with clients.

But, a cluster protocol is not what we will explore here, as it is significantly more challenging to implement. That’s why we will focus only on the client SLCK protocol.

Now, to make the client SLCK protocol production-ready would be a massive effort, and it is beyond what an article can cover. But, I believe that it’s a great example through which we can learn more about working with bytes.

Without further ado, let’s talk about SLCK.

SLCK design #

SLCK is a text-based wire protocol, meaning that the data that travels on the wire is not binary, but just ASCII text. The advantage of text-based protocols is that a client can practically open a TCP connection to a server that implements the protocol and talk to it by sending ASCII characters.

Clients can connect to and communicate with a SLCK server through a TCP socket, using a set of commands and conventions that we will define next.

Protocol conventions #

SLCK follows a few simple but important conventions:

  • It has TLV (type-length-value) format
  • It has a control line & a content line
  • It uses a carriage return followed by a line feed (\r\n) as delimiter
  • It contains a fixed list of subjects (tags) representing actions
  • And as already mentioned, it uses ASCII encoded text on the wire

Or, if we put all of these conventions in combination, a SLCK message would look like:

MSG #general 11\r\nHello World

So, why an ASCII-encoded text-based wire protocol? Well, the advantage of such protocols is that once the specification (usually called spec) of the protocol is public, anyone can write an implementation. Thus, such a protocol can be the backbone on top of which an ecosystem can be born.

More practically, having a simple protocol makes it easy to work with it. We do not need fancy clients to talk to a server that implements SLCK. A connection through telnet would suffice, and the messages sent to the server can be written by anyone that understands the protocol specification, just by hand.

Protocol commands (subjects and options) #

SLCK has a few different commands:

ID     Sent by  Description
REG    Client   Register as client
JOIN   Client   Join a channel
LEAVE  Client   Leave a channel
MSG    Both     Send or receive a message to/from entity (channel or user)
CHNS   Client   List available channels
USRS   Client   List users
OK     Server   Command acknowledgement
ERR    Server   Error

Let’s explore each of them:

REG #

When a client connects to a server, they can register as a client using the REG command. It takes an identifier as an argument, which is the client’s username.

Syntax:

REG <handle>

where:

  • handle: name of the user

JOIN #

When a client connects to a server, they can join a channel using the JOIN command. It takes an identifier as an argument, which is the channel ID.

Syntax:

JOIN <channel-id>

where:

  • channel-id: ID of the channel

LEAVE #

Once a user has joined a channel, they can leave the channel using the LEAVE command, with the channel ID as argument.

Syntax:

LEAVE <channel-id>

where:

  • channel-id: ID of the channel

Example 1: to leave the #general channel, the client can send:

LEAVE #general

MSG #

To send a message to a channel or a user, the client can use the MSG command, with the channel or user identifier as argument, followed by the body length and the body itself.

Syntax:

MSG <entity-id> <length>\r\n[payload]

where:

  • entity-id: the ID of the channel or user
  • length: payload length
  • payload: the message body

Example 1: send a Hello everyone! message to the #general channel:

MSG #general 15\r\nHello everyone!

Example 2: send a Hey! message to @jane:

MSG @jane 4\r\nHey!
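
To make the relationship between the length and the payload concrete, here is a minimal sketch of how a client could build a MSG frame. The encodeMsg helper is a made-up name for illustration, and the sketch assumes the delimiter on the wire is the two bytes CR and LF. The length is computed from the payload, so the two can never disagree:

```go
package main

import (
	"fmt"
	"strconv"
)

// encodeMsg (hypothetical helper) builds a SLCK MSG frame: the control
// line, the \r\n delimiter, and then the payload.
func encodeMsg(entity string, payload []byte) []byte {
	frame := []byte("MSG " + entity + " ")
	// The length is the number of bytes in the payload, written as ASCII digits.
	frame = strconv.AppendInt(frame, int64(len(payload)), 10)
	frame = append(frame, "\r\n"...)
	return append(frame, payload...)
}

func main() {
	fmt.Printf("%q\n", encodeMsg("#general", []byte("Hello World")))
	fmt.Printf("%q\n", encodeMsg("@jane", []byte("Hey!")))
}
```

Printing with %q keeps the \r\n visible instead of moving the cursor, which is handy when debugging a wire protocol.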

CHNS #

To list all available channels, the client can send the CHNS message. The server will reply with the list of available channels.

Syntax:

CHNS

USRS #

To list all users, the client can send the USRS message. The server will reply with the list of available users.

Syntax:

USRS

OK/ERR #

When the server receives a command, it can reply with OK or ERR.

OK does not have any text after it; think of it as an HTTP 204.

ERR <error-message> is the format of the errors returned by the server to the client. Protocol errors never result in the server closing the connection. That means that although an ERR has been returned, the server is still maintaining the connection with the client.

Example 1: Protocol error due to bad username selected during registration:

ERR Username must begin with @

Example 2: Protocol error due to bad channel ID sent with JOIN:

ERR Channel ID must begin with #

Implementing a server #

Now that we have the basics of the SLCK protocol in place, we can move on to the implementation. There are many ways to create the server, but as long as it implements the SLCK protocol correctly, the clients will not care about what happens under the hood of the server.

In continuation, I will explain my approach to building the SLCK TCP server, and while we are on it, we will learn lots about bytes and slices of bytes.

(If you would like to jump ahead, the full code of the SLCK protocol server implementation can be found in this repo.)

Server design #

The server design will have four different parts: a client (user), a channel (chat room), a command (from the client to the server), and a hub - the server that manages it all.

Let’s take it from the most straightforward piece towards the most complicated.

Commands #

Commands are what flows from the clients to the hub. Each received command from the user, such as REG, MSG, and the others, has to be appropriately parsed, validated, and handled.

Each command is of type command. The type definition is as follows:

type command struct {
	id        ID
	recipient string
	sender    string
	body      []byte
}

The four attributes of the type, including their explanation:

  • id - the identification of the command, which can be one of the protocol commands.
  • recipient - who/what is the receiver of the command. It can be a @user or a #channel.
  • sender - the sender of the command, which is the @username of a user.
  • body - the body of the command sent by the sender to the receiver.

The flow of commands will be: a client receives the wire-protocol message, parses it, and turns it into a command that the client sends to the hub.

Additionally, the command also uses a type ID, which is a defined type with int as its underlying type. We use ID so we can control the valid command types using constants and iota:

type ID int

const (
	REG ID = iota
	JOIN
	LEAVE
	MSG
	CHNS
	USRS
)

Although the clients have to work with the raw strings that they receive from the network, internally in the server, we map the wire commands to their constant counterparts. This way, we establish strict control of all the command types, enforced by Go’s compiler. Using this approach, we assure that the id will always be a valid command type.
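
One way to picture the mapping from wire keywords to constants is a lookup table. This is only a sketch: the wireNames map and the parseID function are names I made up for illustration, and the server in this article does the mapping with a switch instead.

```go
package main

import "fmt"

type ID int

const (
	REG ID = iota
	JOIN
	LEAVE
	MSG
	CHNS
	USRS
)

// wireNames is a hypothetical lookup table from wire keyword to ID.
var wireNames = map[string]ID{
	"REG":   REG,
	"JOIN":  JOIN,
	"LEAVE": LEAVE,
	"MSG":   MSG,
	"CHNS":  CHNS,
	"USRS":  USRS,
}

// parseID reports whether name is a known client command keyword.
func parseID(name string) (ID, bool) {
	id, ok := wireNames[name]
	return id, ok
}

func main() {
	id, ok := parseID("MSG")
	fmt.Println(id, ok) // 3 true

	_, ok = parseID("NOPE")
	fmt.Println(ok) // false
}
```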

Channels #

Channels in the SLCK protocol lingo are just chat rooms. It’s worth mentioning they have nothing in common with Go channels, except the name.

A channel is just a type with two attributes:

type channel struct {
	name    string
	clients map[*client]bool
}

The name of the channel is just a string that contains the unique name of the channel. The clients map is a set of *clients that are part of the channel at a given time. Having the list of clients available allows us to easily broadcast messages to all clients in the channel, such as:

func (c *channel) broadcast(s string, m []byte) {
	msg := append([]byte(s), ": "...)
	msg = append(msg, m...)
	msg = append(msg, '\n')

	for cl := range c.clients {
		cl.conn.Write(msg)
	}
}

Which brings us to the client itself.

Client #

A client is a wrapper around the TCP connection. It encapsulates all the functionality around accepting messages from the TCP connection, parsing the messages, validating their structure and content, and sending them for further processing and handling to the hub.

Let’s look closer into the client type:

type client struct {
	conn       net.Conn
	outbound   chan<- command
	register   chan<- *client
	deregister chan<- *client
	username   string
}

The five attributes of the client type, in order:

  • conn - the TCP connection itself (of type net.Conn)
  • outbound - a send-only channel of type command. This channel will be the connection between the client and the hub, through which the client will send commands to the hub
  • register - a send-only channel of type *client through which the client will let the hub know that it wants to register itself with the hub (a.k.a. the chat server)
  • deregister - a send-only channel of type *client through which the client will let the hub know that the user has closed the socket, so the hub should deregister the client (by removing it from the clients map and from all channels)
  • username - the username of the user (of type string) that is sitting behind the TCP connection

If this is a bit confusing, worry not. It will get more evident once you see the whole thing in action.

Now, let’s move on to the client’s methods. Once we instantiate a client, it can listen for incoming messages over the TCP connection. To do that, the client has a method called read:

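func (c *client) read() error {
	// Create the reader once: a new bufio.Reader per iteration could
	// drop bytes that were buffered but not yet consumed.
	reader := bufio.NewReader(c.conn)
	for {
		msg, err := reader.ReadBytes('\n')
		if err == io.EOF {
			// Connection closed, deregister client
			c.deregister <- c
			return nil
		}
		if err != nil {
			return err
		}

		c.handle(msg)
	}
}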

read loops endlessly using a for loop, and accepts incoming messages from the conn attribute (the TCP connection). Once the message (msg) is received, it will pass it on to the handle method, which will process it.

In case the err returned is io.EOF, meaning the user has closed the connection, the client will notify the hub through the deregister channel. The hub will remove the deregistered client from the clients map and from all of the channels that the client participated in.
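
If you want to see the reading loop at work without opening a socket, here is a small sketch that runs the same ReadBytes loop over an in-memory reader. The readLines and fromString helpers are made up for this example; anything that implements io.Reader, including a net.Conn, could be passed to readLines.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines reads newline-terminated messages from r until EOF.
// A single bufio.Reader is used for the whole stream.
func readLines(r io.Reader) ([]string, error) {
	br := bufio.NewReader(r)
	var lines []string
	for {
		msg, err := br.ReadBytes('\n')
		if err == io.EOF {
			return lines, nil // the other side closed the stream
		}
		if err != nil {
			return lines, err
		}
		lines = append(lines, string(msg)) // msg still ends with '\n'
	}
}

// fromString lets us try readLines without a network connection.
func fromString(s string) ([]string, error) {
	return readLines(strings.NewReader(s))
}

func main() {
	lines, err := fromString("REG @jane\nJOIN #general\n")
	fmt.Printf("%q %v\n", lines, err)
}
```

Note that ReadBytes returns the delimiter as part of the message, which is why the parsing code trims whitespace before looking at the content.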

Handling bytes in handle using the bytes package #

Because of the protocol definition, we know the structure of the commands that the chat server might receive from the user. That’s what the client’s handle method does - it gets the raw messages from the socket and parses the bytes to make meaning out of them.

func (c *client) handle(message []byte) {
	cmd := bytes.ToUpper(bytes.TrimSpace(bytes.Split(message, []byte(" "))[0]))
	args := bytes.TrimSpace(bytes.TrimPrefix(message, cmd))

	switch string(cmd) {
	case "REG":
		if err := c.reg(args); err != nil {
			c.err(err)
		}
	case "JOIN":
		if err := c.join(args); err != nil {
			c.err(err)
		}
	case "LEAVE":
		if err := c.leave(args); err != nil {
			c.err(err)
		}
	case "MSG":
		if err := c.msg(args); err != nil {
			c.err(err)
		}
	case "CHNS":
		c.chns()
	case "USRS":
		c.usrs()
	default:
		c.err(fmt.Errorf("Unknown command %s", cmd))
	}
}

Handling messages is where we get to see the slice of bytes ([]byte) type in action. So, what happens here? Let’s break it down.

Given that our SLCK protocol is a text-based wire protocol, the bytes that are flowing on the TCP connection are, in fact, plain ASCII text. Each byte (or octet, because a byte is eight bits) has a decimal value between 0 and 255, which gives 256 possible values (2 to the power of 8). That means that each of the octets can contain any of the characters in the extended ASCII encoding. (Refer to this ASCII table to see all of the available characters.)

Having a text-based protocol allows us to easily convert each of the bytes that arrive through the TCP connection into meaningful text. That’s why each byte in the []byte slice represents one character. Because each byte in the slice []byte is a character, converting a []byte into a string is as easy as: s := string(slice).
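
A tiny sketch makes this tangible. The describe helper is made up for the example; it returns both views of the same bytes, the decimal values and the text:

```go
package main

import "fmt"

// describe returns the decimal byte values and the text form of b.
func describe(b []byte) (string, string) {
	return fmt.Sprint(b), string(b)
}

func main() {
	raw := []byte("REG @jane")

	nums, text := describe(raw)
	fmt.Println(nums) // [82 69 71 32 64 106 97 110 101]
	fmt.Println(text) // REG @jane

	// A single byte can be compared with a character literal directly.
	fmt.Println(raw[0], raw[0] == 'R') // 82 true
}
```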

And Go is good at handling bytes. For example, it has a bytes package which lets us work with []byte, instead of converting them into strings every time we want to work with bytes.

Given that all SLCK commands begin with a single word followed by a space, we can simply take the first word from the []byte, upcase it and compare it with the valid keywords of the protocol. “But, how are we supposed to take a word from a slice of bytes?” you might ask. Since bytes are not words, we have to either compare them byte-by-byte or use the standard library’s bytes package. To keep things simple, we will use the bytes package. (You can check this snippet to compare the two approaches.)

In the first line of the handle method, we take the first part of the received message, and we upcase it. Then, on the second line, we remove the first part from the rest of the message. The split allows us to have the command (cmd) and the rest of the command arguments (args) in separate variables.

cmd := bytes.ToUpper(bytes.TrimSpace(bytes.Split(message, []byte(" "))[0]))
args := bytes.TrimSpace(bytes.TrimPrefix(message, cmd))
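
As an aside, the two lines above trim the upcased cmd from the original message, so a lowercase keyword such as reg would stay in args. Here is a sketch of an alternative that separates the keyword before upcasing it. It assumes Go 1.18 or newer, where bytes.Cut is available, and splitCommand is a name I made up for the example:

```go
package main

import (
	"bytes"
	"fmt"
)

// splitCommand separates the leading keyword from its arguments.
// The keyword is upcased; the arguments are returned trimmed.
func splitCommand(message []byte) (cmd, args []byte) {
	word, rest, _ := bytes.Cut(bytes.TrimSpace(message), []byte(" "))
	return bytes.ToUpper(word), bytes.TrimSpace(rest)
}

func main() {
	cmd, args := splitCommand([]byte("reg @jane\n"))
	fmt.Printf("%s | %s\n", cmd, args) // REG | @jane

	cmd, args = splitCommand([]byte("CHNS\r\n"))
	fmt.Printf("%s | %q\n", cmd, args) // CHNS | ""
}
```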

Next, in handle’s switch construct, we handle all of the different commands. For example, handling the REG command is done using the reg and err methods:

func (c *client) reg(args []byte) error {
	u := bytes.TrimSpace(args)
	// Check for an empty username first, so u[0] cannot panic
	if len(u) == 0 {
		return fmt.Errorf("Username cannot be blank")
	}
	if u[0] != '@' {
		return fmt.Errorf("Username must begin with @")
	}

	c.username = string(u)
	c.register <- c

	return nil
}

The reg method takes the args slice, and it removes any space bytes (using bytes.TrimSpace). Given that the second argument of the REG command is the @username of the user, it checks that the passed username is not blank and that it begins with @. Once it does that, it converts the username to a string, and it assigns it to the client (c) itself. From then on, the client has an assigned username.

As a second step, it sends the client itself through the register channel. This channel is read by the hub (the chat server), which will do more validation of the username before it successfully registers the client.

func (c *client) err(e error) {
	c.conn.Write([]byte("ERR " + e.Error() + "\n"))
}

The err func simply takes an error and sends its contents back to the user, using the underlying TCP connection of the client.

We will come back to the other commands and methods once we have thoroughly explored the chat server.

The hub #

The hub is the central entity that the clients connect and register with. The hub also manages the available channels (chat rooms), broadcasting messages to said channels, and relaying messages (private/direct messages) between clients.

All of the above functionality means that the hub is the central place of all communications, hence the name.

First, let’s explore the hub type with all of its attributes:

type hub struct {
	channels        map[string]*channel
	clients         map[string]*client
	commands        chan command
	deregistrations chan *client
	registrations   chan *client
}

The attributes, and their explanations, in order:

  • channels - a map of the channels (chat rooms), with the name of the channel as the key and the *channel as value
  • clients - a map of the clients (connected users), with the username as the key and the *client as value
  • commands - a channel of command that are flowing from the clients to the hub, that the hub will validate and execute
  • deregistrations - a channel of *client through which a client deregisters itself, through which the hub will be informed that the user has closed the connection and it will clean up any references to that client
  • registrations - a channel of *client through which new clients register themselves to the hub, through which the hub will accept the new client, validate their username and add them to the clients map
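
The constructor for the hub is not shown in this article, so here is a minimal sketch of what it could look like, assuming it is called newHub. The client, channel, and command types below are trimmed-down stand-ins so the example compiles on its own:

```go
package main

import "fmt"

// Trimmed-down stand-ins for the article's types.
type client struct{ username string }

type channel struct {
	name    string
	clients map[*client]bool
}

type command struct{ sender, recipient string }

type hub struct {
	channels        map[string]*channel
	clients         map[string]*client
	commands        chan command
	deregistrations chan *client
	registrations   chan *client
}

// newHub returns a hub with every map and channel initialized.
// Writing to a nil map panics and sending on a nil channel blocks
// forever, so no field can be left at its zero value.
func newHub() *hub {
	return &hub{
		channels:        make(map[string]*channel),
		clients:         make(map[string]*client),
		commands:        make(chan command),
		deregistrations: make(chan *client),
		registrations:   make(chan *client),
	}
}

func main() {
	h := newHub()
	h.clients["@jane"] = &client{username: "@jane"}
	fmt.Println(len(h.clients), h.commands != nil) // 1 true
}
```

The channels here are unbuffered, which is a design choice of this sketch: a client blocks on a send until the hub is ready to receive.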

So, how does the hub function? It all begins with the run method:

func (h *hub) run() {
	for {
		select {
		case client := <-h.registrations:
			h.register(client)
		case client := <-h.deregistrations:
			h.deregister(client)
		case cmd := <-h.commands:
			switch cmd.id {
			case JOIN:
				h.joinChannel(cmd.sender, cmd.recipient)
			case LEAVE:
				h.leaveChannel(cmd.sender, cmd.recipient)
			case MSG:
				h.message(cmd.sender, cmd.recipient, cmd.body)
			case USRS:
				h.listUsers(cmd.sender)
			case CHNS:
				h.listChannels(cmd.sender)
			default:
				// Freak out?
			}
		}
	}
}

When we establish a new hub instance (which we will see later), we execute the run method in a goroutine. The goroutine will run the for loop indefinitely, processing the registrations, deregistrations, and the commands channels. Messages arriving through the registrations and deregistrations channels will be handled differently from the messages that will come from the commands channel.

run will receive messages through the registrations channel, and it will send them to the register method for processing:

func (h *hub) register(c *client) {
	if _, exists := h.clients[c.username]; exists {
		c.username = ""
		c.conn.Write([]byte("ERR username taken\n"))
	} else {
		h.clients[c.username] = c
		c.conn.Write([]byte("OK\n"))
	}
}

The register method will check if the hub already has a user with the given username, and it will react accordingly. If the username is taken, it will remove the username from the client and respond with an error. If the username is not taken, then it will add the client to the clients map, with the username as a key and the client reference as a value.

run will receive messages through the deregistrations channel, and it will send them to the deregister method for processing:

func (h *hub) deregister(c *client) {
	if _, exists := h.clients[c.username]; exists {
		delete(h.clients, c.username)

		for _, channel := range h.channels {
			delete(channel.clients, c)
		}
	}
}

The deregister method will check if the hub already has a user with the given username. If it finds the user, it will remove it from the hub’s clients map. Also, it will go through the map of channels and it will try to remove it from each of the channel’s clients map.

When it comes to handling commands, things are a bit different. Each of the commands, as we already established, has an id attribute. For each of the commands that we receive, we do a switch on the id attribute, which will invoke a different method. For example, to join a channel, the id must be of value JOIN, which will invoke the joinChannel function, with the command’s sender and recipient attributes.

The joinChannel function receives the username (u) and the channel (c) as arguments. Then, if it finds the channel, it will add the client to the channel’s clients map. Otherwise, it will first create the channel, using the newChannel constructor, and then add the client as the first client to the channel:

func (h *hub) joinChannel(u string, c string) {
	if client, ok := h.clients[u]; ok {
		if channel, ok := h.channels[c]; ok {
			// Channel exists, join
			channel.clients[client] = true
		} else {
			// Channel doesn't exist, create and join
			h.channels[c] = newChannel(c)
			h.channels[c].clients[client] = true
		}
	}
}
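
The newChannel constructor is not shown above, so the following is only a guess at its shape. The sketch also shows how a map[*client]bool behaves as a set when clients join and leave:

```go
package main

import "fmt"

type client struct{ username string }

type channel struct {
	name    string
	clients map[*client]bool
}

// newChannel is an assumed constructor: it only has to make sure the
// clients map is ready for writes.
func newChannel(name string) *channel {
	return &channel{name: name, clients: make(map[*client]bool)}
}

func main() {
	jane := &client{username: "@jane"}
	general := newChannel("#general")

	general.clients[jane] = true // join
	fmt.Println(general.clients[jane], len(general.clients)) // true 1

	delete(general.clients, jane) // leave; deleting a missing key is a no-op
	fmt.Println(general.clients[jane], len(general.clients)) // false 0
}
```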

Now, let’s zoom out and see how a client wraps a TCP connection. Then, we will see how bytes are flowing from the user, through the client, to the hub ending up with the receiver (another channel or user).

Sending messages #

The core functionality of a chat server and the purpose of our SLCK protocol is sending and receiving messages. Let’s follow the flow of the bytes and see how we can implement sending messages between clients.

The structure of the MSG command is as follows:

MSG <entity-id> <length>\r\n[payload]

For example, to send a Hello! message to the #general channel:

MSG #general 6\r\nHello!

Or, to send a Hey! message to @jane:

MSG @jane 4\r\nHey!

Once the user sends the MSG command, the client’s handle method accepts it. Then, in handle we extract the message and the command, and we invoke the msg method of the client:

func (c *client) handle(message []byte) {
	cmd := bytes.ToUpper(bytes.TrimSpace(bytes.Split(message, []byte(" "))[0]))
	args := bytes.TrimSpace(bytes.TrimPrefix(message, cmd))

	switch string(cmd) {
	// Some other stuff here...
	case "MSG":
		if err := c.msg(args); err != nil {
			c.err(err)
		}
	// Some other stuff here...
	default:
		c.err(fmt.Errorf("Unknown command %s", cmd))
	}
}

The args are passed to the msg method, which does some heavy lifting.

func (c *client) msg(args []byte) error {
	args = bytes.TrimSpace(args)
	// Guard against empty args, so args[0] cannot panic
	if len(args) == 0 || (args[0] != '#' && args[0] != '@') {
		return fmt.Errorf("recipient must be a channel ('#name') or user ('@user')")
	}

	recipient := bytes.Split(args, []byte(" "))[0]
	if len(recipient) < 2 {
		// Only the '#' or '@' marker was sent, without a name
		return fmt.Errorf("recipient must have a name")
	}

	// More stuff here...
}

The msg method’s first step is to check if the first argument of the message begins with a @ or # - a user or a channel name. If that’s correct, we extract the recipient, which can be the username or the channel name.

func (c *client) msg(args []byte) error {
	// The stuff from above here...

	args = bytes.TrimSpace(bytes.TrimPrefix(args, recipient))
	l := bytes.Split(args, DELIMITER)[0]
	length, err := strconv.Atoi(string(l))
	if err != nil {
		return fmt.Errorf("body length must be present")
	}
	if length == 0 {
		return fmt.Errorf("body length must be at least 1")
	}

	padding := len(l) + len(DELIMITER) // Size of the body length + the delimiter
	body := args[padding : padding+length]

	c.outbound <- command{
		recipient: string(recipient), // keep the '#' or '@' marker, the hub switches on it
		sender:    c.username,
		body:      body,
		id:        MSG,
	}

	return nil
}

Next, we extract the argument that follows the recipient, which, according to the protocol specification, is the length of the body in bytes. Having the size of the body (the length of the bytes) as part of the command that is sent by the client allows the server to slice off the bytes it needs from the body efficiently.

Let’s see an example:

MSG #general 39\r\nHey folks, hope you are all doing well!

In the handle method, we sliced off MSG, and we sent the rest of the bytes to the msg method. In msg, we checked if the next argument is a channel or a username, which it is. Then, we pick up the 39, and we store it in the l variable.

Having l hold 39 is not enough, though: l is still a slice of bytes, and the slice that represents the ASCII text 39 is [51 57]. These are the byte values of the characters 3 and 9 in ASCII. To make our Go code understand [51 57] as the number 39, we have to convert the bytes into a string, so they become "39", and then use the strconv package to convert the string to an int:

l := bytes.Split(args, DELIMITER)[0]
length, err := strconv.Atoi(string(l))
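
Here is the same conversion as a small runnable sketch. The parseLength helper is a made-up name that wraps the strconv.Atoi call from above:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseLength turns ASCII digits such as [51 57] into the int 39.
func parseLength(l []byte) (int, error) {
	return strconv.Atoi(string(l))
}

func main() {
	l := []byte("39")
	fmt.Println(l) // [51 57]

	n, err := parseLength(l)
	fmt.Println(n, err) // 39 <nil>

	// Anything that is not a number is rejected with an error.
	_, err = parseLength([]byte("3x"))
	fmt.Println(err != nil) // true
}
```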

Once we have the length of the body, we validate that it will be at least one byte. Next, we take the remaining bytes from the args, and we slice off the length amount of bytes from the args:

padding := len(l) + len(DELIMITER) // Size of the body length + the delimiter
body := args[padding : padding+length]

In context of our example above:

MSG #general 39\r\nHey folks, hope you are all doing well!

we take the length (2) of the 39 plus the length of the \r\n delimiter and then take the body out of the args slice of bytes by using the “slice” operator (:). The slicing operation gives us all of the bytes from just after the delimiter to the end of the body, meaning the body becomes:

Hey folks, hope you are all doing well!

If the length were less than 39, then the body would end up being shorter, because the user has sent the wrong body size to the server. Conversely, if we tried to slice off more bytes than args contains, the slice expression would panic. Because an unrecovered panic in any goroutine terminates the whole Go program, the server would crash.

Given that body now contains the message itself, the last step of the msg method is to send the new command it received through the outbound channel to the hub:

c.outbound <- command{
	recipient: string(recipient),
	sender:    c.username,
	body:      body,
	id:        MSG,
}

The command has a recipient (channel or user ID), the sender, which is the username of the message author, the body containing the body of the message, and the id that’s the identifier of the command - MSG in this case.

Then, the hub which infinitely loops and reads from the commands channel (which is the same channel as the outbound channel of the client), will pick up the message from the client and process it in the message method:

func (h *hub) message(u string, r string, m []byte) {
	if sender, ok := h.clients[u]; ok {
		switch r[0] {
		case '#':
			if channel, ok := h.channels[r]; ok {
				if _, ok := channel.clients[sender]; ok {
					channel.broadcast(sender.username, m)
				}
			}
		case '@':
			if user, ok := h.clients[r]; ok {
				user.conn.Write(append(m, '\n'))
			}
		}
	}
}

The message method will check if the sender username (u) is present in the list of active clients (h.clients). If the client is active, then it will check if the first byte of the recipient (r) is a # or a @. Based on the result of the switch, it will either broadcast the message (m) to the channel, or find the recipient (r) in the h.clients map and send the message through the recipient’s TCP connection.

Let’s see this in action. I will start the server (using go run .) and open two telnet sessions with the server:

$ telnet 127.0.0.1 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Given that SLCK is an ASCII wire protocol, we can just send some commands to the server right away. Let’s first register both clients. The first one will be @jane:

REG @jane
OK

And the second client will be @john:

REG @john
OK

In both cases the registration with the server went well, so the server replied with OK. Now, let’s make both of the clients join the #general channel. If any of the users listed the channels they would get an error:

CHNS
ERR no channels found

Great, let’s make them both join #general, so the channel would be created. First @john:

JOIN #general
OK

@jane right after:

CHNS
#general

JOIN #general
OK

Now that both are in the channel, we can also send a message from @jane to #general, and it should pop up on @john’s screen too.

Sending the message:

MSG #general 5\r\nHello

And in @john’s screen we can see:

jane: Hello

Voilà! @john received the message. Let’s say @john would like to send @jane a direct message. He sends:

MSG @jane 3\r\nHey

And on @jane’s screen she will see:

@john: Hey

Tying it all together #

Now that we went through the client, hub, channel, and command, we need to see the last piece of the puzzle - the main function that ties it all together.

In the main func, we will initialize a TCP listener, through which we can accept a new TCP connection. Then, we will establish a new hub and invoke run in a separate goroutine.

Once the hub is running, we will infinitely loop using a for. Within the for loop, we will accept new TCP connections, wrap them in a new client and spin them off in a new goroutine.

package main

import (
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":8081")
	if err != nil {
		// Without a listener the server cannot do anything useful
		log.Fatalf("%v", err)
	}

	hub := newHub()
	go hub.run()

	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Printf("%v", err)
			continue // no usable connection, so wait for the next one
		}

		c := newClient(
			conn,
			hub.commands,
			hub.registrations,
			hub.deregistrations,
		)
		go c.read()
	}
}

As a refresher, within the read function of the client, we also infinitely loop using a for and accept new incoming TCP messages:

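func (c *client) read() error {
	// Create the reader once: a new bufio.Reader per iteration could
	// drop bytes that were buffered but not yet consumed.
	reader := bufio.NewReader(c.conn)
	for {
		msg, err := reader.ReadBytes('\n')
		if err == io.EOF {
			// Connection closed, deregister client
			c.deregister <- c
			return nil
		}

		if err != nil {
			return err
		}

		c.handle(msg)
	}
}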

Running the client’s read method in a separate goroutine allows us to spin off as many goroutines as we have connections. That leaves the main goroutine free to accept new connections and just spin them off into goroutines. Once the goroutine is running, it will take care of itself until it crashes or the client exits.

The pitfall of this approach is that we have a single hub instance, which means that there’s only one goroutine that is accepting messages from what can be thousands of clients. While having a single hub instance simplifies the design, it also introduces a single point of failure. If the hub.run goroutine crashes/exits for whatever reason, the server will be rendered useless, although all of the client connections will be working fine.

The full code of the SLCK protocol server implementation can be found in this repo.

Notable shortcuts #

Before we wrap up here, I would like to highlight a few of the shortcuts we took while building this server implementation. Cutting these corners was with a purpose - not making this long article even longer.

First, we are missing resource locking when creating the channels or when a user joins/leaves a channel. If multiple people would join the same channel at the same time, it is possible to get a concurrent writes issue.
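
As a starting point for the locking, here is a sketch of a channel whose clients set is guarded by a sync.Mutex. The safeChannel type and the joinMany helper are made up for this example and are not part of the server code:

```go
package main

import (
	"fmt"
	"sync"
)

type client struct{ username string }

// safeChannel is a hypothetical variant of channel whose clients set
// is guarded by a mutex.
type safeChannel struct {
	mu      sync.Mutex
	clients map[*client]bool
}

func (c *safeChannel) join(cl *client) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.clients[cl] = true
}

func (c *safeChannel) leave(cl *client) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.clients, cl)
}

func (c *safeChannel) size() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.clients)
}

// joinMany makes n distinct clients join concurrently and returns the
// resulting member count.
func joinMany(n int) int {
	ch := &safeChannel{clients: make(map[*client]bool)}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ch.join(&client{})
		}()
	}
	wg.Wait()
	return ch.size()
}

func main() {
	fmt.Println(joinMany(100)) // 100
}
```

Running a program like this with go run -race is a quick way to check that the map is no longer written concurrently.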

Second, our server does not have a graceful shutdown. A production-ready implementation would gracefully shut down all of the connections, informing the clients about the shutdown. Then, it would potentially save some state on disk before shutting down.

Another shortcut we took was validation of the body size in the msg method. When we are performing the slicing of the message body, we do not take into consideration if there are enough bytes in the message. If a client sends a body size larger than the body itself, we might slice off more bytes than available, which would result in a panic and a slice out of bounds error.
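
To give you a head start on this one, here is a sketch of a bounds-checked version of the body parsing. It assumes the delimiter is the two bytes CR and LF and Go 1.18 or newer (for bytes.Cut). The parseBody function and the delimiter variable are names I made up for the example:

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"strconv"
)

// delimiter is assumed to be the two bytes CR and LF.
var delimiter = []byte("\r\n")

// parseBody expects "<length>\r\n<payload>" and returns exactly length
// bytes of payload, or an error instead of panicking.
func parseBody(args []byte) ([]byte, error) {
	l, rest, found := bytes.Cut(args, delimiter)
	if !found {
		return nil, errors.New("missing delimiter after body length")
	}
	length, err := strconv.Atoi(string(l))
	if err != nil || length < 1 {
		return nil, errors.New("body length must be a positive number")
	}
	if length > len(rest) {
		return nil, fmt.Errorf("body length %d exceeds the %d bytes received", length, len(rest))
	}
	return rest[:length], nil
}

func main() {
	body, err := parseBody([]byte("5\r\nHello"))
	fmt.Printf("%s %v\n", body, err) // Hello <nil>

	_, err = parseBody([]byte("50\r\nHello"))
	fmt.Println(err) // body length 50 exceeds the 5 bytes received
}
```

Returning an error keeps the failure inside the protocol: the client gets an ERR reply and the connection stays open.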

If you would like to play more with our chat server, I recommend starting with adding each of these missing functionalities to it. And drop me the link to the repo in the comments, so I can see how you pulled it off.


Changelog:

  • 2020-04-04 10:10UTC - Fixed the client type definition, which was missing the deregister channel, as pointed out by Rene C. over email.