socketserver: the Python networking module you didn’t know you needed

I occasionally spend time randomly surfing the Python standard library docs; there is a lot of useful functionality included in the language’s standard distribution, such as, for instance, the socketserver module, which I didn’t know about until this evening and which is one of the most useful I’ve seen in a while. As ever, the docs are straightforward in their self-description:

The socketserver module simplifies the task of writing network servers.

This is something of an understatement. To demonstrate this, here is a simple CaaS (capitalization as a service) server written with socketserver and one with socket:

import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind(('', 50006))
    s.listen()
    conn, addr = s.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data: break
            conn.sendall(data.upper())

And here is the same functionality with socketserver:

import socketserver

class CaaSHandler(socketserver.StreamRequestHandler):
    def handle(self):
        data = self.rfile.readline()
        self.wfile.write(data.upper())

if __name__ == "__main__":
    server = socketserver.TCPServer(('', 50007), CaaSHandler)
    server.serve_forever()

Both of these take connections synchronously and sequentially, capitalize the data they receive, and return it. The main difference is that the socketserver version can accept as much data as there is memory, while the socket version can accept only a limited amount (1024 bytes in this example).

This is because socketserver's StreamRequestHandler provides the file-like objects rfile and wfile, which expose all the normal luxuries of Python 3 files, like readline and read. The parent class of the handler you write will deal with setting the buffer size, looping until a newline or EOF is encountered, and dealing with client-first and server-first protocols. We could just as easily add a welcome message/prompt to the program; just make the CaaSHandler class look like this:

class CaaSHandler(socketserver.StreamRequestHandler):
    def handle(self):
        self.wfile.write(b"Enter some data to be capitalized:\n")
        data = self.rfile.readline()
        self.wfile.write(data.upper())

without any changes to the client’s behavior. Adding that functionality in the socket version is somewhat nontrivial; how, for instance, one would handle both clients that expect to send data first and clients that expect to receive it first is less than obvious.

The second useful facility that socketserver provides is the xxxServer classes. I used TCPServer here, passing it a (host, port) tuple and my handler class, CaaSHandler. I could also have used UDPServer for datagrams or UnixStreamServer/UnixDatagramServer for Unix sockets.

The socketserver module also provides mixins for threading and forking servers, which makes writing asynchronous network services much less painful than using socket and threading or even asyncio.
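As a sketch (the port number and class names here are my own, not from the example above), turning the CaaS server into a threaded one takes only a mixin class:

```python
import socketserver

class CaaSHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Same handler as before: read a line, send it back capitalized.
        data = self.rfile.readline()
        self.wfile.write(data.upper())

# The mixin comes first so its request-handling override takes effect:
# each connection is now handled in its own thread.
class ThreadedCaaSServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    pass

# To serve (this call blocks until interrupted):
# server = ThreadedCaaSServer(('', 50008), CaaSHandler)
# server.serve_forever()
```

The handler class is untouched; only the server's class composition changes, which is exactly what makes the mixins so pleasant.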

Full Stack By Example 02: Dice… on the Web!

In the last post, we created a simple set of functions that can generate dice rolls with the appropriate probability distribution. Now, we’re going to make it available on the web!

To do this, we’ll need the Flask web framework. Luckily, Python has a great package manager that makes installing things such as Flask really simple:

pip install flask

Depending on your system setup you might need to add sudo in front of this command for it to work.

Now that Flask is installed, open up the file with your dice roller code in it.

As a reminder, it should look like this:

def roll_die(number_of_sides):
    " Simulate rolling a single die with number_of_sides sides"
    import random # We need randint from the random module
    # This is exactly what we were doing by hand before: a random number between 1 and the number of sides
    random_number = random.randint(1, number_of_sides)
    # Give back the number we generated
    return random_number

def roll_dice(number_of_dice, number_of_sides):
    " Return a random number in the same distribution as rolling number_of_dice dice of number_of_sides sides "
    import random # we need randint from the random module
    accumulator = 0 # This variable will "accumulate" a value as the loop runs 
    for roll_number in range(number_of_dice):
        # We don't actually use roll_number for anything, it's just a placeholder
        # You could use it for debugging messages if you want
        accumulator += roll_die(number_of_sides)
    return accumulator

That’s our “business logic” – the part of the program that does actual work for the user. We’ll add a bit of additional code before it to set up the Flask framework (at the top of the file):

from flask import Flask  # Brings in the Flask object from the module we installed
app = Flask(__name__)   # Creates a new Flask app.
app.config['DEBUG'] = True

This imports the Flask object, which is the main way we will interact with the Flask library. Then, we create a new app (called app) and set it to DEBUG mode.

Finally, add the following below the definition of roll_dice:

@app.route("/")
def main():
    return "Result of roll: {}".format(roll_dice(1, 6))

if __name__ == "__main__":
    app.run()
This is a little more complicated, so let’s break it down. First, @app.route("/") tells Flask that the following function defines what should be done when someone fetches the index page of the website. Below that is the function definition, which does only one thing: roll some dice using our previously-defined dice rolling function, put the result in a string, and return it. Below that is a bit of “magic” code; __name__ is a variable that Python sets for you so your code can check whether it’s the main program or a library. This if checks if the current script is the main one, and if so, it runs the app.

If you’ve followed along so far, you should be able to run python3 filename, where filename is whatever you called the file, and see something a bit like this:

 * Running on (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger pin code: 118-607-219

 - - [21/Oct/2016 16:38:01] "GET / HTTP/1.1" 200 -

Let’s analyze this. First is an IP address ( – the loopback address, since we connected from the same machine. That tells you who accessed your web server. Then comes the date and time, followed by the request they sent: GET / HTTP/1.1. This is in three parts: GET is the “method”, / is the resource being asked for, and HTTP/1.1 is the version of the protocol being used. Finally, the number 200 represents the status code returned by the server. 200 means “OK” – everything went through as expected.
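To make that three-part structure concrete, here’s a tiny, hypothetical parser (not part of our Flask app) that splits a request line:

```python
# Hypothetical helper: split an HTTP request line into its three parts.
def parse_request_line(line):
    method, resource, version = line.split(" ")
    return method, resource, version

print(parse_request_line("GET / HTTP/1.1"))
# → ('GET', '/', 'HTTP/1.1')
```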

Let’s break it! Try, instead of asking for localhost:5000, typing in localhost:5000/this_page_is_not_real. You’ll get back an error saying 404 Not Found, and in the console you’ll see something like: - - [21/Oct/2016 16:44:00] "GET /this_page_is_not_real HTTP/1.1" 404 -

Most of this is the same as the successful request, but you’ll see that / has been replaced with /this_page_is_not_real, because we requested a different resource, and the 200 OK status has been replaced with 404, which means Not Found. We asked for a page that doesn’t exist, and the server told us so.

Congratulations! You’ve just built a web service. Granted, it doesn’t do much, but it works, and it acts just like every other web site out there as far as your web browser is concerned. In the next post, you’ll create a form that allows the user to specify what kind and how many dice to roll.

Rewriting tinyhttpd in Rust, Part One

In 1999, J. David Blackstone, or, as he is known online, jdavidb, was taking CSE 4344 (Network Concepts) at UT Arlington. Those were the glory days of Sparc Solaris, and Blackstone wrote, for his college course, a C program called tinyhttpd. It is, essentially, a very short version of the immensely complex programs that seem to run the world these days: web servers. Unlike the million-line behemoths (think Apache, nginx, et cetera), tinyhttpd is an HTTP 1.1 web server in 532 lines of well-commented C.

HTTP 1.1 is a ubiquitously supported protocol that is useful for a great many applications, and in this modern era of embedded (a.k.a. “Internet of Things”) computing applications, small web servers have never been more important.

This program is also a small, manageable example of a legacy application – an old program written for an obsolete operating system that still gets the job done, but exposes any organization using it to not only the cost of maintaining ancient operating systems and hardware, but also to the risk of the security vulnerabilities present in tinyhttpd itself and the software it needs to run.


For the purposes of these posts, I’ll be looking at tinyhttpd from the perspective of a company that uses it internally, and wants to transition to a more modular, portable, and maintainable design, rather than one which either ships it as a product or buys it as a product from another company and wants to replace it; these situations are similar, but have additional challenges.

The first thing to do is to analyze the existing source. I’ve gone ahead and created a GitHub repository to host both the old and new source code, and I’ll link to specific commits in these posts. For instance, here is the commit with nothing but the unmodified source of the legacy app.

The first thing to do is to build the existing app. In order not to clutter the repository with object files, I created a .gitignore file from GitHub’s default C gitignores. Now all I have to do is run make, right?

11:32:58: leo [~/Projects/rtinyhttpd/legacy]
$ make
gcc -W -Wall -lpthread -o httpd httpd.c
/tmp/ccbqEOVd.o: In function `main':
httpd.c:(.text+0x1a85): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
Makefile:4: recipe for target 'httpd' failed
make: *** [httpd] Error 1

What’s this, it doesn’t compile? Well, you’ll remember I mentioned it was written for an ancient version of Sparc Solaris – that’s the whole reason we’re rewriting it. Luckily, the original author anticipated this. Looking at legacy/httpd.c (where the error is), I see this comment at the top:

/* This program compiles for Sparc Solaris 2.6.
 * To compile for Linux:
 * 1) Comment out the #include <pthread.h> line.
 * 2) Comment out the line that defines the variable newthread.
 * 3) Comment out the two lines that run pthread_create().
 * 4) Uncomment the line that runs accept_request().
 * 5) Remove -lsocket from the Makefile.
 */

I made a note of this in my analysis folder and made those changes – except that they didn’t fully apply. The Makefile didn’t have -lsocket, and there was only one occurrence of pthread_create. The changes did make the app build, but it didn’t work!

In order to figure out what was happening, I looked up the manual page for pthread_create. It’s part of the POSIX threading API, and it is definitely available on Linux. Furthermore, if we look at the main() function, we can see why commenting out those lines caused a problem – it’s an infinite loop that does nothing but accept connections!

while (1)
{
 client_sock = accept(server_sock, (struct sockaddr *)&client_name, &client_name_len);
 if (client_sock == -1)
  error_die("accept");
 accept_request(&client_sock);
 // Commented out in order to build on Linux
 /* if (pthread_create(&newthread , NULL, (void *)accept_request, (void *)&client_sock) != 0)
  perror("pthread_create"); */
}

So, we need to get POSIX threads working to make this app run properly. (Note that this problem isn’t an uncommon one when looking at legacy apps; there is often not a good set of build instructions.)

In our case, luckily, this is easy: just revert the commenting and change -lpthread in the Makefile to -pthread, as mentioned on the manual page.

Doing this allows the app to build and run correctly, binding to port 9999. When I open localhost:9999 in my web browser, I get a page back. Success!

Now that we have a compiling and running version of the legacy tinyhttpd, it’s time to go through the source code. Luckily for us, tinyhttpd is entirely contained in a single file. Let’s start off with the top:

/* J. David's webserver */
/* This is a simple webserver.
 * Created November 1999 by J. David Blackstone.
 * CSE 4344 (Network concepts), Prof. Zeigler
 * University of Texas at Arlington
 */
/* This program compiles for Sparc Solaris 2.6.
 * To compile for Linux:
 * 1) Comment out the #include <pthread.h> line.
 * 2) Comment out the line that defines the variable newthread.
 * 3) Comment out the two lines that run pthread_create().
 * 4) Uncomment the line that runs accept_request().
 * 5) Remove -lsocket from the Makefile.
 */

Here is some information which will often be included in legacy programs – some short information about the author and purpose of the program, and some (in this case out of date and inaccurate) information about building and running the program. Removing the misleading lines makes this section a lot more concise and is probably a good idea.

Skipping the #includes, which aren’t very helpful in this case, we find two #define statements:

#define ISspace(x) isspace((int)(x))

#define SERVER_STRING "Server: jdbhttpd/0.1.0\r\n"

The SERVER_STRING definition is pretty straightforward; it’s an identifier of the software, which will be sent to clients. In our version, I would prefer to not include the \r\n terminator in the definition itself. As to the ISspace definition, though, I’m not immediately sure. A quick search of the source shows no definition of a function isspace taking an integer, so it’s probably coming from one of the includes.

If this program had multiple files, I’d search through them next; but, as there are none, I’m going straight to the Internet. Turns out, it does just what you’d expect – it checks if a given integer represents whitespace or not. This definition simply allows calling it directly on char values without writing out an explicit cast every time. I’ve made a note of this in my analysis documents.

After the macros, we can see forward declarations (prototypes) of all the functions used in the program.

void accept_request(void *);
void bad_request(int);
void cat(int, FILE *);
void cannot_execute(int);
void error_die(const char *);
void execute_cgi(int, const char *, const char *, const char *);
int get_line(int, char *, int);
void headers(int, const char *);
void not_found(int);
void serve_file(int, const char *);
int startup(u_short *);
void unimplemented(int);

Because they have no comments, these declarations are not particularly useful, so let’s go down to the bottom of the file and look at which functions are called in the program’s entry function, main().

int main(void) {
 int server_sock = -1;
 int client_sock = -1;
 u_short port = 9999;
 struct sockaddr_in client_name;
 socklen_t client_name_len = sizeof(client_name);
 pthread_t newthread;

 signal(SIGPIPE, SIG_IGN);

 server_sock = startup(&port);
 printf("httpd running on port %d\n", port);

 while (1)
 {
  client_sock = accept(server_sock, (struct sockaddr *)&client_name, &client_name_len);
  if (client_sock == -1)
   error_die("accept");
  if (pthread_create(&newthread , NULL, (void *)accept_request, (void *)&client_sock) != 0)
   perror("pthread_create");
 }

Let’s break this down further. This function takes no parameters (void), meaning that the program accepts no arguments or command line options. This probably means it’s not very customizable, something I’d like to change in the rewritten version.

After the function signature come the definitions of some local variables: server_sock, client_sock, port, client_name, client_name_len, and newthread. server_sock and client_sock are just ints, but they represent file handles, as we’ll see in a moment. port is clearly a port number. client_name is the address of the client, and client_name_len is its length.

Below that, the program uses signal() to ignore SIGPIPE, the signal that programs receive when they write to a file handle which has been closed. It seems to me that this should be handled more appropriately in the rewrite.

Immediately afterward, the server_sock variable is filled by the result of the function startup, which is given a pointer to the port number. This seems odd to me – why does it need a reference and not just the value? – so I look at that function’s definition. It is commented with:

/* This function starts the process of listening for web connections
 * on a specified port. If the port is 0, then dynamically allocate a
 * port and modify the original port variable to reflect the actual
 * port.
 * Parameters: pointer to variable containing the port to connect on
 * Returns: the socket */

That makes more sense now – it allows dynamically generating a port number. That’s useful, but the functionality isn’t exposed through the command line interface, which is annoying. In our program, I’d like to expose that, and I’d also like to move away from the C convention of modifying inputs. In the rewrite, I think I’ll return a tuple. Since this is a fairly complex idea, I’ll take this time to write some notes down.

That’s enough to understand a bit more about main. After a simple status message, the program moves on to the main loop:

while (1)
{
  client_sock = accept(server_sock, (struct sockaddr *)&client_name, &client_name_len);
  if (client_sock == -1)
    error_die("accept");
  if (pthread_create(&newthread , NULL, (void *)accept_request, (void *)&client_sock) != 0)
    perror("pthread_create");
}

This is an infinite loop which accepts a connection, as can be seen if we look up accept(), which is where client_sock gets its value. accept() returns a file handle representing the socket, or -1 if it fails for some reason, and the next few lines check for that eventuality. This in-band error code is another suboptimal design imposed by C’s lack of algebraic data types – in Rust, this idea can be represented with an Option or a Result.

The next few lines try (and handle errors for) spawning a new thread that runs accept_request. Looking at the comments here is not quite as illuminating as one might hope:

/* A request has caused a call to accept() on the server port to
 * return. Process the request appropriately.
 * Parameters: the socket connected to the client */

I’m not really sure what processing the request “appropriately” entails. For now, though, it’s enough to know that this is the main function for dealing with incoming requests.

The only code after this is cleanup code we won’t need in the rewrite, so we have enough info to write a short pseudocode summary of the server:

Open and configure a server socket

Until the process is terminated:
     Wait for a client to request a connection
     Try to open a connection with that client
     Open a new thread to deal with that connection's request

That’s a lot simpler than one might have imagined from the length of this post, and it doesn’t tell us much about the actual functionality of the server, but it gives you a good idea of the process one often has to go through to understand legacy code.

Now that we have examined the basic structure of the server’s execution, I’m going to dive into the actual functionality and logic of the server, which is encapsulated primarily in the function accept_request, whose signature is void accept_request(void *arg). This signature is totally unrevealing, and dereferencing its argument in Rust would require an unsafe block; the function takes a raw pointer with no type information at all. We’ll have to do quite a bit of work to understand what the function actually does.

First of all, are there any clues about what the argument might represent? Well, we can look back at how the function is called:

pthread_create(&newthread , NULL, (void *)accept_request, (void *)&client_sock)

This is a little complicated, but essentially a new thread is being spawned which will execute accept_request(&client_sock). This is the only place this function is called, so the argument is presumably expected to be only a pointer to an integer file descriptor to a socket – but the compiler knows none of that! That’s a lot of unchecked assumptions and unsafe memory access. Rust, and more importantly the Rust standard library, has better invariant checking, which will make the re-implementation a great deal safer and thus easier to extend.
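For contrast, here is a sketch of the same hand-off in safe Rust (with a plain integer standing in for the real socket descriptor): the spawned closure is fully type-checked, so the "pointer to an int" assumption becomes a compiler-verified fact.

```rust
use std::thread;

// A stand-in for the real request handler; note the typed parameter.
fn accept_request(client_sock: i32) {
    println!("handling socket {}", client_sock);
}

fn main() {
    let client_sock: i32 = 42; // stand-in for a real file descriptor
    // The closure captures client_sock by value, and the compiler
    // checks the argument type end to end - no void pointers involved.
    let handle = thread::spawn(move || accept_request(client_sock));
    handle.join().unwrap();
}
```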

Moving on to the body of the function, we see the creation of a lot of local variables which I’ll go into as they’re used. It is important to note, though, that there is a group of buffers created with absolute lengths. These appear, at first glance, to be possible introduction points for overflow vulnerabilities – something that is mitigated by the Rust idiom of defaulting to using Vecs instead of arrays.

One of these buffers, of length 1024, is populated using the function get_line, which, according to the comments above its definition, reads a line into a buffer and null-terminates it, with length checking, and returns the number of bytes stored. That buffer is printed and dissected over the course of the next 90 lines or so.

Now that it’s clear how I dissect each line of code, I’m going to move a bit faster, translating the entire program into pseudocode function by function. What we currently have is this:

Open and configure a server socket

Until the process is terminated:
 Wait for a client to request a connection
 Try to open a connection with that client
 Open a new thread to deal with that connection's request

And we’re examining the idea of “dealing with the client”. This is all done in the accept_request function, whose pseudocode looks a bit like this:

accept_request takes a socket connecting to the client
     read a line from the client
     log (to stdout) the received request
     copy what is assumed to be the method into another buffer
     if the method isn't GET or POST:
          handle an unimplemented method somehow
     if the method is POST:
          make a note that this request will require executing a CGI script

     copy what is assumed to be the URL into another buffer
     if the method is GET and the url has a ? in it:
          make a note that this request will require executing a CGI script
     construct the path to the requested resource by prepending "htdocs" to the url
          (note - I'd like to make this customizable in the rewrite)

     if the URL is /:
          add 'index.html' to the path

     if the resource being requested doesn't exist:
          handle a not found error somehow
     if the resource is a directory:
          append "/index.html" to the file path
          (note- the existence of THIS file isn't checked!)
     if the file is executable:
          make a note that this request will require executing a CGI script
     if this request requires executing a CGI script:
          handle executing a CGI script
     otherwise:
          handle serving a static file

This analysis is pretty revealing: essentially all this function does is determine some properties of a request and then pass it off to be handled appropriately by other functions.

This particular function should be fairly easy to translate into more efficient Rust code, especially if we look at using Rust’s more advanced type system. In particular, rather than having a large number of buffers, I’d like to use slices and ADTs. For example, I might create an enum HTTPMethod:

enum HTTPMethod {
    Get,
    Post,
}
Then I could use a match expression to appropriately dispatch the request, whether to the static server, CGI handler, or error response.
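Filled in as a compilable sketch (the variant and function names here are my guesses, not the eventual code), that dispatch might look like this:

```rust
enum HTTPMethod {
    Get,
    Post,
}

// Hypothetical dispatch: decide how each request should be handled,
// mirroring the flag-setting logic in accept_request's pseudocode.
fn dispatch(method: HTTPMethod, url_has_query: bool) -> &'static str {
    match method {
        HTTPMethod::Post => "cgi",
        HTTPMethod::Get if url_has_query => "cgi",
        HTTPMethod::Get => "static",
    }
}

fn main() {
    assert_eq!(dispatch(HTTPMethod::Post, false), "cgi");
    assert_eq!(dispatch(HTTPMethod::Get, true), "cgi");
    assert_eq!(dispatch(HTTPMethod::Get, false), "static");
    println!("dispatch sketch works");
}
```

An unimplemented-method error response would become another variant plus another match arm, with the compiler checking that every case is handled.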

In the next post, I’ll take a look at the handler functions and how they handle the various conditions and actions a request can trigger – unimplemented method, resource not found, static file serving, and CGI execution. I’ll also discuss the Rust idioms that can be used to better model the intended behavior and internal structure of this server.

Am I in a Terminal?

Sometimes, it can be useful to know if your program is running in a terminal. Since Python 3.3, this functionality has been available in the os module:

#!/usr/bin/env python3

# Test if this Python script is running in a terminal or not.

import os

try:
    size = os.get_terminal_size()
    print("I am in a terminal of size {}x{}"
        .format(size[0], size[1]))
except OSError:
    print("I am not in a terminal.")

Here is an example of it in operation:

$ ./ 
I am in a terminal of size 80x24

$ ./ | tee
I am not in a terminal.

This is useful for many reasons. For example, scripts which have interactive “beautifications” like progress bars, no-freeze spinners, and animations should cease these antics when piped into the input of other programs or redirected to files. Additionally, programs being run from scripts can disable all performance-impacting interactivity, including interactive KeyboardInterrupt handling; if a user Ctrl+C’s a script, they want it to stop, immediately, not ask to quit.
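Relatedly, for per-stream checks (say, stdout is piped into another program but stderr is still attached to the terminal), isatty() answers the same question for each stream individually:

```python
import sys

# isatty() reports whether one particular stream is attached to a
# terminal, so a script can keep progress output on stderr even when
# stdout is being piped or redirected.
for name, stream in (("stdin", sys.stdin), ("stdout", sys.stdout), ("stderr", sys.stderr)):
    print("{} is a tty: {}".format(name, stream.isatty()))
```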

Learning Japanese the Python Way

Now that I’m in college, I’m taking a lot of non-computer science classes, and one of them is Japanese. I’m just starting out, and I need to be able to rapidly read numbers in Japanese and think about them without translating them consciously. I could make a bunch of flash cards, or use a service like Quizlet… or I could write some Python!

For those of you who are unfamiliar, Japanese doesn’t have the ridiculous numerical system that English does. One through ten are defined, and eleven is simply (ten)(one). Twenty three, for example, is (two)(ten)(three) (に じゅう さん). This means that rather than having a long list of numbers and special cases, I can just have the numbers zero to ten “hard coded”.

After that, the program is pretty simple: if the number is 10 or less, simply look it up. If it’s between 11 and 19, build it with じゅう plus the ones digit. If it’s 20 or larger, build it with the tens digit plus じゅう plus the ones digit.
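That digit-composition logic can be sketched like this (my own romanized stand-in, limited to 0 through 99 for simplicity; the real program stores the kana strings and is considerably longer and more error-checked):

```python
# Romanized words for zero through ten; index matches the number.
DIGITS = ["zero", "ichi", "ni", "san", "yon", "go",
          "roku", "nana", "hachi", "kyuu", "juu"]

def japanese_number(n):
    "Compose the Japanese reading of an integer from 0 to 99."
    if n <= 10:
        return DIGITS[n]          # hard-coded base cases
    if n < 20:
        return "juu " + DIGITS[n % 10]   # (ten)(one), e.g. 13 = juu san
    tens, ones = divmod(n, 10)
    word = DIGITS[tens] + " juu"         # (two)(ten), e.g. 20 = ni juu
    if ones:
        word += " " + DIGITS[ones]       # (two)(ten)(three) = ni juu san
    return word

print(japanese_number(23))
# → ni juu san
```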

The interactive part is pretty simple too: it runs a loop that randomly generates numbers, checking that they haven’t been done before, translates them, and asks me to translate them back. If I succeed, it moves on; if not, it doesn’t record the number as having been completed, so I have to do it again at some point in the same run.

This simple program came out to 136 lines of very verbose and error-checked Python. It’s a good piece of code for a beginner to try and modify – for example, can you get it to incorporate the alternate form of four (し) as well as the primary form? Can you make one that teaches Kanji numbers? (I plan to do both of those things at some point.)

Why Linux on the PC Needs a Focus on Hardware Support

A few days ago, I had an interesting and somewhat frustrating experience with a friend of mine. Their laptop was dying, so they asked me to give them some suggestions for a new one.

Their requirements were a computer with a display that was good for reading, enough power to be responsive and able to multitask well, and rapidly accessible storage, but not necessarily a lot of it. Of course, I immediately thought of the System76 Lemur, which happened to be on sale at the time; however, after going through a whole list of pros and cons of Ubuntu with the friend, they told me that they wanted to go with Windows or Mac OS X as they “didn’t have time to tinker”.

They’re right, of course, but this really got under my skin. The thing is, if you’re not trying to game on Linux, there are absolutely no problems using it on a desktop in almost all cases. Desktop hardware is nicely standardized and “just works”. But on a laptop? Nope. It’s a crapshoot as to whether your wireless Internet and Bluetooth will work, or if your touchpad’s multi-touch will be usable. On my Dell Inspiron 7000-series laptop, which otherwise works almost flawlessly with Ubuntu GNOME, the wireless chipset will occasionally forget that any network exists other than a single one to which I’m not connected.

Why is this? Well, it’s because laptops are often very, very custom. They have custom form factor motherboards with non-standard sets of features. Battery life is often the primary concern rather than compliance, and release cycles are very tight, so if a new hardware system is developed, drivers are produced for the target platform (almost always Windows) and released there. The Linux community has to hack together our own after the fact.

Windows runs on most laptops, and has a lot of big issues. Privacy concerns, resource overutilization, extremely poor real-time performance, and a massive lack of customization are the obvious ones, along with a downright byzantine user interface without much power to back it up (in the consumer versions, that is). Mac OS X looks simple on the surface while still exposing the massive power of a UNIX to its power users and developers (although this is becoming progressively less true). On the other hand, it costs an absolute fortune to buy into that ecosystem, and that is where Linux comes in.

In reality, Linux is a modern UNIX like Mac OS X, and it is far more flexible and powerful, but to many people, it’s just “Windows that costs less”. What we need to be is a Mac OS X that can run anywhere. Linux needs to be simple on the surface, which most DEs accomplish brilliantly, while exposing the power of the underlying OS, which isn’t hard given a terminal emulator. Where Linux falls short is the “runs anywhere” part.

Porting Deucalion to Rust

A few months ago, I made a proof-of-concept for an RPG engine based on SFML and the Tiled map editor, called Deucalion. Over time, the source code became unwieldy, leaked a great deal of memory, and was nearly impossible to build. I ended up spending more time configuring build systems than actually working on the code, and I abandoned it in favor of SBrain and schoolwork.

Recently, though, the Rust game development story has gotten a lot better, and I’ve gotten a bit of free time. With the help of a friend of mine, Dan Janes, I’ve been porting the existing code to Rust and refining the design for the game-dev-facing API. It’s been interesting, since it’s my first time running a project on which I am not the sole contributor.

I’ve certainly run into some problems because of the relative immaturity of the Rust ecosystem – for example, many projects don’t use the standard Error trait, which makes the handy macros that rely on it, such as try!, nearly impossible to use – but I’ve also found that as a whole, the community is very responsive to having these issues pointed out and solved.

Deucalion isn’t quite at the level it was before I decided to port it – I’m still struggling to get tilemaps to draw with decent performance, and a lot of design work needs to be done – but it’s doing better than I thought it would, and I’ve discovered some of the best features of Rust so far.

For example, while Rust doesn’t have exceptions (because exception handling requires a heavyweight runtime), the convention of returning Result<T, Error> from functions that might fail allows programs to act as if it did. Deucalion implements a single, shared error type, DeucalionError, that encapsulates every possible error (currently IoError, LuaError, TiledError, NotImplementedError, and Other), allowing callers of risky functions to act according to the actual failure that occurred.
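A simplified sketch of that pattern (I’m abbreviating the variant list and inventing a caller purely for illustration):

```rust
use std::io;

// A single shared error type wrapping every failure mode; the real
// DeucalionError has more variants (LuaError, TiledError, ...).
#[derive(Debug)]
enum DeucalionError {
    Io(io::Error),
    NotImplemented(String),
}

// A From impl lets the ? operator convert io::Errors automatically.
impl From<io::Error> for DeucalionError {
    fn from(e: io::Error) -> Self {
        DeucalionError::Io(e)
    }
}

// A hypothetical risky function returning the shared error type.
fn load_tilemap() -> Result<(), DeucalionError> {
    Err(DeucalionError::NotImplemented("tilemap drawing".to_string()))
}

fn main() {
    // Callers can match on the actual failure that occurred.
    match load_tilemap() {
        Ok(()) => println!("loaded"),
        Err(e) => println!("failed: {:?}", e),
    }
}
```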

I also like the module system much more than I thought I would at first. While learning when and where to use mod vs use can be a bit of a hassle, the fact that multiple includes create an automatic compiler error is very welcome when compared with C++.

Rust is a great language, and its ecosystem is on its way to becoming as good as that of Python or Ruby. I’m excited for every step along the way.

Full Stack By Example 01: Dice

To begin, we’re going to make a simple dice roller application. We want a page that, given a number of dice, a number of sides, and a number of rolls, returns a list of random numbers in the proper distribution. For example, rolling 2 six-sided dice (2d6) is not the same as rolling a 12-sided die, since with two dice a total of 6 is much more likely than, say, 2 or 3, and a total of 1 is impossible.

For this, we need to be able to generate random numbers. Python’s random module gives us the useful function random.randint(lower, upper), which returns a random integer between lower and upper, inclusive.

A fair six-sided die will only ever give an integer between 1 and 6, evenly distributed: random.randint(1, 6).

We could make a dice roller like this:

import random
print("d4:", random.randint(1,4))
print("d6:", random.randint(1,6))
print("d12:", random.randint(1,12))

and it would work just fine, printing out random numbers every time it was run. Imagine, however, that we wanted to roll more than one die of each type. This requires a little arithmetic.

Two six-sided dice do not have an even chance of producing every number between 1 and 12; rather, it is impossible for them to produce 1 (each die’s minimum is 1, and there are two of them), and they are much more likely to produce 7 than 12, because there are many combinations of rolls that sum to 7 and only one that sums to 12.
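We can check this arithmetic by enumerating every ordered pair of faces. This is a quick sketch using the standard library’s itertools and collections modules (it isn’t part of the dice roller itself, just a way to see the distribution):

```python
from collections import Counter
from itertools import product

# Count every ordered pair of faces from two six-sided dice.
two_d6 = Counter(a + b for a, b in product(range(1, 7), repeat=2))

print(two_d6[7])   # 6 combinations sum to 7
print(two_d6[12])  # only (6, 6) sums to 12
print(two_d6[1])   # no combination sums to 1, so this is 0
```

Out of the 36 equally likely pairs, six of them sum to 7, which is why 7 comes up so often.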

We can solve this simply by adding together calls to random.randint():

import random
print("3d6:", random.randint(1,6) + random.randint(1,6) + random.randint(1,6))

This totally works and has the correct distribution, but it’s kind of gross and requires a lot of retyping. This is where we go from simple technical knowledge to properly decomposing the problem, and you should try to do this before looking at the answer below: how would we modularize this solution to deal with any number of dice of any geometry (any number of sides)?

Don’t be afraid to try this on your own!


We could make a more general dice-roller with a function which takes as input the number of dice and the number of sides. At first, let’s deal with only the number of sides:

def roll_die(number_of_sides):
    """Simulate rolling a single die with number_of_sides sides."""
    import random # We need randint from the random module
    # This is exactly what we were doing by hand before: a random number between 1 and the number of sides
    random_number = random.randint(1, number_of_sides)
    # Give back the number we generated
    return random_number
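To convince yourself this behaves like a fair die, you can roll it many times and check which faces appear. This sketch repeats roll_die from above so it can run on its own:

```python
import random

def roll_die(number_of_sides):
    """Simulate rolling a single die with number_of_sides sides."""
    return random.randint(1, number_of_sides)

# After many rolls of a d6, every face from 1 to 6 should have
# appeared at least once, and nothing outside that range ever will.
faces = {roll_die(6) for _ in range(10_000)}
print(sorted(faces))
```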

We now, however, have the problem of adding multiple dice. Again, we could do roll_die(6) + roll_die(6) + roll_die(6), but this is only marginally better; it looks much cleaner, but doing 1000d6 would still be somewhat problematic.

Luckily, if computers are good at one thing, it’s repeating instructions over and over. All we need to do is make a variable to keep track of the total, just like a person would do on a bit of paper if they had to roll a die a bunch of times. This is called an accumulator, because it’s accumulating a value over time, and that’s what I’ll call the variable. As with all variables, though, you can call it anything you like.

def roll_dice(number_of_dice, number_of_sides):
    """Return a random number in the same distribution as rolling number_of_dice dice of number_of_sides sides."""
    import random # we need randint from the random module
    accumulator = 0 # This variable will "accumulate" a value as the loop runs 
    for roll_number in range(number_of_dice):
        # We don't actually use roll_number for anything, it's just a placeholder
        # You could use it for debugging messages if you want
        accumulator += roll_die(number_of_sides)
    return accumulator

First, this isolates the importing of the random module to where it belongs – we only use functions from random in this function, so the rest of the program doesn’t need to know about it.

Second, this function is very general. We could write print(roll_dice(3, 6)) to achieve the same result as above, or print(roll_dice(1024, 6)), or even print(roll_dice(1, 15)); the computer doesn’t care that fair 15-sided dice are physically impossible.
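As an aside, once the loop version makes sense, Python’s built-in sum can express the same accumulator pattern in one line. This is an equivalent sketch, not a replacement for understanding the loop:

```python
import random

def roll_dice(number_of_dice, number_of_sides):
    """Same distribution as the loop version, written with sum()."""
    # sum() does the accumulating for us, one single-die roll at a time.
    return sum(random.randint(1, number_of_sides)
               for _ in range(number_of_dice))

print("3d6:", roll_dice(3, 6))
```

The generator expression produces one roll per die, and sum plays the role of our accumulator variable.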

Decomposing and generalizing problems is a very important skill – much more important, in fact, than understanding any individual language.

In the next post, we’ll create a simple web service that rolls dice for anyone who accesses it.