I Come Not To Bury Dice Server 3.4

… but to praise it. Dice Server 3 had been running for about 8 or so years with nary a hiccup, until very recently. It was my first attempt at writing an on-line service and my first attempt at writing object-oriented Perl code and its longevity is a testament at the level of effort I had put into its architecture. The randomizer was very good, it being based on the Blum Blum Shub cryptographically secure random bit generator. The bit pool files were robust enough to handle the workload and, for the most part, users were happy with it.

That said, it was insecure and almost unmaintainable. Perl as a language is great if you need to bat something out quickly. It’s not very readable, however. I might feel differently if I were a 100% Perl coder, but I’m not. The pool files were regenerated by an external C program b/c the Perl version was glacially slow. To launch the C program, the Perl BitPool library shell’d out to the program when the pool file ran out of bits. While the C program was faster, it still wasn’t very fast and it was possible that an impatient user might kill the connection while bits were being generated. I did have some signal trapping in my Perl scripts to detect this condition, but the C program was probably left as a zombie process in those situations. If I were to hazard a guess as to why numerous copies of DiceServer.pl were being spawned and hogging CPU power, this is the scenario I would put my money on.

To serialize dice requests, I built my own Perl interface to the SysV IPC semaphore mechanism. This worked wonderfully. However, early on, there were times when the code wasn’t robust enough to recreate the semaphore in the event of a server reboot. Once that got fixed, all was good. In fact, one of the reasons behind the signal trapping in the script was to release the semaphore if it was being held.

One part of 3.4’s architecture that I didn’t care for was its reliance on individual files. Configuration was via Windows-style .INI files. Bit pools were individual files. The Perl library that I wrote to handle .ini files was an all or nothing proposition. I would convert the entire file to a Perl hash and write the entire hash out if any changes were made. As a shared resource among multiple requests, I relied on the semaphores to serialize the access to these shared resources.

The biggest insecurity in 3.4’s architecture was its biggest feature, the reporting of dice results via e-mail. Spammers, unfortunately, took it upon themselves to try to co-opt this feature for their own purposes, which led to my having to build banned lists of IP addresses and to add HTML detection in the request’s comments. Luckily, depending on your point of view, my dice server logs used the // character sequence as a tagged data delimiter. When spammers put http:// in the comments field, it completely threw the dice server logging off kilter. That’s when I added the spam detector.

When the problems with the spammers arose about a year or so ago, I began to reconsider 3.4’s design and how much the Internet and its tools have changed. I rarely use e-mail anymore. At one time, I relied on mailing lists to keep abreast of events and information in various fields and interests. No more. Getting bombarded by spam or by flaming threads is not useful, even when there was some good stuff buried in there. It took too much effort to find the silver among the dross. E-mail still has its uses, but it’s too insecure a medium to rely on. For most Interent-driven communications, I rely on forums. It’s more two-dimensional than mailing lists. I can safely ignore topics while perusing others, all on the same forum. It’s a pull mentality instead of a push mentality. When I subscribe to a forum, I retrieve information. When I subscribe to a mailing list, I receive whatever the list spits out. I prefer the former.

3.4 suffered from its freedom. Anyone could visit the site and send a dice request to any e-mail address they wished. Since e-mail is insecure, it is up to the users of the e-mail protocols to secure its usage.

So, with that lengthy preamble out of the way, let me illustrate the key design features of Dice Server 4:

1) 32 parallel randomizers.

Dice Server 3.4 had 8 parallel randomizers for those times when other user’s requests were being handled. Everyone used #1, occasionally, if #1 was busy, then #2 would be used, and so on. Looking at the seed pool files, noone ever got past randomizer #3, even though the code supported up to 8 and the seed files had randomizer seeds for 32.

Dice Server 4 will use all 32 and use some form of cooperative load balancing among them.

2) Time-based bit pool management.

Random bit generation is an expensive process in terms of time and CPU usage. Waiting until the bit pools are empty to generate more bits is too late. It wastes the time of the unfortunate user who emptied the pool and confuses the user into thinking that a refresh is needed to clear things up. Instead an hourly process will check the states of the bit pools and top them off as needed.

This method may result in bit pools running empty before regeneration. More on this later.

3) MySQL

Instead of relying on individual files and rolling my own form of database, I will use a real database. Serialization of requests will be made via application advisory locks, namely the GET_LOCK() call. Seed values will be in one table, bit pools in another as records.

4) Improved bit economy and probability distribution.

Dice Server 3.4 always retrieved 16 bits per die, whether it was a 100-sided die or a 2-sided coin flip. Considering the computing cost of generating a random bit, more care should have been exercised. Only one bit needs to be read to handle a 2 outcome random event.

On the other hand, the results of a 6-sided die roll do not distribute evenly across a randomizer based on bits. In other words, 6 does not divide evenly into any 2n range. There will always be some remainder.

Dice Server 3.4 handled this by dividing the 16 random bits against 216, which would generate a floating point number in the range 0.0 <= r < 1.0. Multiplying this random number by the number of possible outcomes (N) would return a floating point number in the range 0.0 <= r’ < N. Truncating the fraction would result in an integer in the range 0 <= R < N. While this produces a correct result, the extra outcomes are permanently distributed to fixed values within the range.

For example: let’s say we use 3 bits to generate a random integer in the range 0..7 to return a random integer in the range 1..6. We need to narrow the integer range to 0..5, we’ll simply apply +1 to get the 1..6 result we desire. Using the procedure in the preceding paragraph, we’ll divide the given integer by 8.0. This will give possible results of: { 0.0, 0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875 }. Multiplying by 6.0 and discarding fractions gives results of: { 0, 0, 1, 2, 3, 3, 4, 5}. Note the repeated values of 0 and 3. With this randomizer, a d6 will generate a 1 and a 4 25% of the time (don’t forget the +1) and the other results 12.5% of the time. That’s not good enough.

There are two ways to solve this problem. The first is to simply increase the number of random bits and the matching divisor. Let’s compare the preceding 3-bit version against a 32-bit version. 23 mod 6 = 2, meaning that there are two extra outcomes if we use 3 bits, as we just proved. 232 mod 6 = 4 meaning that there are 4 extra outcomes. To determine the effects of the errors introduced by these extra outcomes, we can apply the following formula: E / N / E, where E is the # of extra outcomes and N is the range of random integers. The result will the extra percentage that some values will receive. In our 3 bit example, E = 2 and N = 8. The result is 12.5%, meaning that some outcomes will have an extra 12.5% chance of occurring. Applying the same equation to our 32 bit process yields a miniscule value, 0.0000000233%. (Dividing 4 into 4.3 billion will do that.)

The 16-bit method employed by Dice Server 3.4 had an added outcome percentage of 0.0015%. Not much, but measurable over time.

While throwing more bits at the problem reduces the error, there is a distinct lack of bit economy. Plus, the error distribution is fixed to specific outcomes. In the case of rolling a d6, the outcomes of 1 and 4 would receive the benefit no matter how many random bits we threw at the problem. If we could distribute the error to the other outcomes, however, we could reduce the error to 0% over time.

Going back to our 3 bit process above, the extra outcomes per roll are 1 and 4. However, if we could change the extra outcomes to 2 and 5 on the second roll and 3 and 6 on third roll, then the probabilities across all three rolls will have evened out. Let’s see if this is true:

{ 1, 1, 2, 3, 4, 4, 5, 6 }
{ 1, 2, 2, 3, 4, 5, 5, 6 }
{ 1, 2, 3, 3, 4, 5, 6, 6 }

We have 24 total possible outcomes, 3 requests of 3 random bits. Counting the results, we have 4 of each of the 6 desired outcomes, a perfect distribution.

Error diffusion, of which this is an example, is used as a dithering tool when reducing a color value from one larger ranged value to a smaller one. Back in the days when 1-bit monochrome displays were used, the display of black & white photographs could be viewed with some measure of clarity due to the way that the pixel value errors were spread across, diffused, across the image.

Taken as a whole, we can knock the error down to 0% per outcome range and maximize bit economy. However, on a per request basis, we need to take as many bits as reasonable to avoid a reliance on the error impact and an ability to predict how the error diffusor will impact the process.

For example, let’s say we rely fully on error diffusion and only use 3 bits to generate a random number in the range 1..6. Let’s assume also that our diffusor uses a sliding window to apply the error, such that, on the first roll, 1 & 2 will get the extra events, 3 & 4 on the next and 5 & 6 on the next. If this method were to be employed in a competitive game, a player could adjust their decisions based on the results of each roll. If the die rolled high, and had rolled unusually high on each third prior roll, give or take a few low rolls, then they could reasonably conclude that the next roll would probably roll low as the 5/6 diffusor would give way to the 1/2 diffusor.

One way to get around this would be to try to maintain the per roll average in the application of extra outcomes. As the average roll for a d6 is 3.5, then, if we choose 3.5 as the average of our extra outcomes, like 1 & 6, 2 & 5, 3 & 4, then the diffusors activities could be masked. Still, throwing more bits at the problem, to lessen the per roll impact of errors, would have better results.

5) Modes of Operation

3.4 had one mode of operation. Anyone can do anything. No more. It’s time to lock the doors.

Dice Server 4 will initially support two modes of operation: open and authenticated. Open mode will operate like Dice Server 3.4, but w/o e-mail capability. If you need to just roll some dice, you can do still do that. However, there will be no transmission and only minimal saving of the results. An hourly process will purge open request logs from the system. Open mode dice results will only be saved for a maximum of two hours.

Authenticated mode requires user verification. Users will have to register and can create one or more “named” request spaces. Users can share their spaces with other known users, so that players in a game can share one game’s request space.

Named request spaces will host a variety of features, including:

  • request logs that never expire
  • log filtering
  • randomizer load balancing
  • named pre-defined roll requests, such as “Anti-aircraft Fire: 2d6″ (future)
  • named pre-defined roll requests with defined combat results tables: “Europa AA Fire” (future)
  • comment inclusion (future)
  • shared resource tracking (aka card decks) (future)

All open requests will use randomizer #1. If the bit pool empties before the next hourly regeneration, an error will be generated for each open request. Named request spaces will be assigned a randomizer upon creation that is based on current load levels in order to minimize the chances of hitting an empty bit pool. However, if a bit pool gets emptied, the named request spaces can switch to another underutilized bit pool. For each bit pool, the system will keep track of request usage and bit usage.

6) Reduced reliance on e-mail

With named request spaces and log filtering, one can avoid using e-mails altogether. However, the transmission, at least, of a notification of a dice request to the users of a named request space would still be supported.

Comments are closed.