|Update: Even newer, even-lower-latency database engine is live|
August 4, 2018
You probably feel like you've heard about this issue too much already.
But making a lag-free server is a top priority, and getting the database engine right is the most important part.
Last week, I introduced a custom-coded database engine that made null-lookups (when you find out that nothing is there on the map) extremely fast. These are the most common database actions---imagine someone walking around in the wilderness, and we need to find out that nothing human-made is on each and every map cell that they're exploring. That's a lot of null-lookups.
However, the architecture of this new database engine also made inserts---a much less common operation---quite a bit slower. The hash table is spread across the data file, and that means that newly inserted data is spread randomly across the data file. KISSDB and my previous replacement STACKDB did not distribute the data across the disk in hash table order, but I never thought about why.
It turns out that writing in a bunch of random locations in a file is really slow, because it causes cache misses constantly. So it's best to write all new data at the end of the file, in an arbitrary order, instead of treating the file like one big hash table.
Of course, this doesn't matter most of the time.
Except when loading the tutorial map into the world for a player. At that moment, we insert thousands of new records into the database. This was taking something like 8 seconds on server1 with the new database engine. That's an 8-second lag for EVERYONE every time any player loads the tutorial map.
The latest database engine does things quite a bit differently, keeping the entire hash table in RAM and keeping data on the disk in the order that they are inserted. Thus, when a big sequence of inserts happens, like when the tutorial is loaded into the world for someone, all of those inserts happen in order at the end of the file, without a single file seek along the way.
And, given that the entire table is kept in RAM, null-lookups, and all other operations, are substantially faster than in any previous database engine.
The result of all this is that Server1 is back online and hopefully more lag-free than ever. 43 players are currently on it, and it's only using 7% CPU for that load.
There's still a small bit of optimization work to be done. When the map database is huge, as it is on Sever1, tutorial map inserts are still a bit slower than I would like them to be (a little over 1 second), but given that it's 9pm on Friday night, and my kids are waiting to play League Of Legends with me, that sounds like something I will tackle on Monday morning.
And yes, the bison is coming soon... promise!