Update: New low-latency database engine is live July 28, 2018
With weary coding fingertips I type to let you know that a very long week has paid off.
 More profiling of the bedraggled Server1, which has the largest map data set of any server, revealed that file IO inefficiencies in the custom-coded StackDB were to blame. StackDB was designed to quickly answer questions about recently-accessed map cells---assuming that people in cities are often looking at the same stuff, so that stuff should be kept near the top of the stack. The old off-the-shelf KissDB did not do that, meaning that the newest stuff was the slowest to access as the data set grew.
However, none of these optimizations addressed what is actually the most common case: asking about a map cell that isn't in the database at all. When you wander around in the wilderness and look at the empty ground, we have to ask the database to confirm that that patch of ground is indeed empty. Maybe someone visited that spot earlier and dropped something there that you should see.
It turns out that in both KissDB and StackDB, this is the slowest operation of all. A non-existent cell can never be at the top of the stack, because it doesn't exist, which means that we need to walk to the bottom of the stack to find out for sure that it doesn't exist.
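The cost of a miss in a recency-ordered stack can be shown with a toy model. This is only an in-memory sketch, not StackDB's actual code: the class name `MoveToFrontStack` and the probe counter are made up for illustration, and a real engine would be walking records in a file rather than a Python list.

```python
class MoveToFrontStack:
    """Toy in-memory stand-in for a most-recently-used stack of records."""

    def __init__(self):
        self.items = []  # index 0 is the top of the stack

    def put(self, key, value):
        # New records go on top (no duplicate check in this toy).
        self.items.insert(0, (key, value))

    def get(self, key):
        """Return (value, probes); value is None on a miss."""
        for depth, (k, v) in enumerate(self.items):
            if k == key:
                # Hit: move the record to the top so the next lookup is cheap.
                self.items.insert(0, self.items.pop(depth))
                return v, depth + 1
        # Miss: every record was examined before we could say "not here".
        return None, len(self.items)


stack = MoveToFrontStack()
for x in range(1000):
    stack.put((x, 0), "item")

_, hit_probes = stack.get((999, 0))    # most recently added cell
_, miss_probes = stack.get((5000, 0))  # cell that was never stored
print(hit_probes, miss_probes)  # 1 1000
```

A hit near the top costs one probe, while the lookup for an empty cell has to touch all 1000 records.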
Finally, KissDB and StackDB are both hash table systems, but both of them use fixed size hash tables. In the case of Server1, there were 15 million data records crammed into an 80,000 slot hash table. This means lots of pages to look through in each slot (KissDB) or deep stacks to wade to the bottom of (StackDB) to find out that a given map cell really isn't there, and therefore is empty.
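To put a number on "crammed": assuming the hash spreads records evenly across slots (an idealization), the average depth per slot follows directly from the two figures above.

```python
records = 15_000_000  # data records on Server1, from the post
slots = 80_000        # fixed hash table size, from the post

avg_per_slot = records / slots
print(avg_per_slot)  # 187.5
```

So a lookup for a cell that isn't there has to rule out roughly 187 records on average before it can answer "empty".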
Even worse, the architecture of both engines requires loads of random-access disk seeks to move through the pages or the stack. And disk seeks are extremely slow, relatively speaking, especially when they are jumping around a huge file and missing the cache over and over.
LinearDB, my latest custom-coded database engine, has an ever-expanding hash table based on a very clever algorithm---which I did not invent---called Linear Hashing. The hash table grows gradually along with the data, essentially never letting it get too many layers deep. In addition, a kind of "mini map" of data fingerprints is kept in RAM, allowing us to ask questions about map cells that don't exist without touching the disk at all.
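The two ideas here can be sketched together in a small in-memory model. This is not LinearDB's actual code or file layout. The names (`LinearHashTable`, `fingerprint`, `disk_reads`), the 16-bit fingerprint width, and the load threshold are all assumptions chosen for illustration, and the `buckets` list merely stands in for the on-disk file.

```python
def fingerprint(h):
    # 16 bits of the hash; the width is an assumption for this sketch.
    return (h >> 20) & 0xFFFF


class LinearHashTable:
    """In-memory toy of linear hashing with RAM-resident fingerprints."""

    def __init__(self, initial_buckets=4, max_load=2.0):
        self.n0 = initial_buckets   # bucket count before any doubling
        self.level = 0              # completed doubling rounds
        self.split = 0              # next bucket to split this round
        self.max_load = max_load    # average records per bucket allowed
        self.buckets = [[] for _ in range(initial_buckets)]  # "disk"
        self.prints = [[] for _ in range(initial_buckets)]   # RAM
        self.count = 0
        self.disk_reads = 0

    def _index(self, h):
        i = h % (self.n0 << self.level)
        if i < self.split:
            # Already split this round, so use the next round's modulus.
            i = h % (self.n0 << (self.level + 1))
        return i

    def get(self, key):
        h = hash(key)
        i = self._index(h)
        if fingerprint(h) not in self.prints[i]:
            return None  # answered from RAM, no "disk" access
        self.disk_reads += 1
        for k, v in self.buckets[i]:
            if k == key:
                return v
        return None  # fingerprint collision: rare false positive

    def put(self, key, value):
        h = hash(key)
        i = self._index(h)
        bucket = self.buckets[i]
        for j, (k, _) in enumerate(bucket):
            if k == key:
                bucket[j] = (key, value)
                return
        bucket.append((key, value))
        self.prints[i].append(fingerprint(h))
        self.count += 1
        if self.count / len(self.buckets) > self.max_load:
            self._split_one()

    def _split_one(self):
        # Grow by exactly one bucket, redistributing only the split bucket.
        old = self.buckets[self.split]
        self.buckets[self.split] = []
        self.prints[self.split] = []
        self.buckets.append([])
        self.prints.append([])
        self.split += 1
        if self.split == self.n0 << self.level:
            self.level += 1
            self.split = 0
        for k, v in old:
            h = hash(k)
            i = self._index(h)
            self.buckets[i].append((k, v))
            self.prints[i].append(fingerprint(h))


db = LinearHashTable()
for x in range(10_000):
    db.put((x, 0), "path")

# Ask about 10,000 cells that were never stored.
misses = sum(db.get((x, 1)) is None for x in range(10_000))
print(len(db.buckets), misses, db.disk_reads)
```

Each insert triggers at most one bucket split, so the table grows a little at a time instead of rehashing everything at once. In this sketch, `disk_reads` counts only the lookups where a fingerprint happened to match, which should be a small fraction of the misses.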
The performance gains here are pretty astounding.
On a simple benchmark where a single player walks in a straight line through the wilderness for a minute, the old database engine performed 1.9 million disk seeks and 3.8 million disk reads.
During the same single-player journey, the new database engine performs fewer than 4700 seeks and 1600 reads.
Yes, that's a 427x and 2400x reduction in disk seeks and reads, respectively.
According to a system call time profiler, this results in approximately 180x less time spent waiting for the disk. In other words, this part of the server is now one hundred and eighty times faster than it used to be.
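The seek and read ratios quoted above can be sanity-checked from the rounded figures in the post. The new-engine counts are stated as upper bounds, so the ratios computed here are lower bounds on the improvement. The quoted 427x and 2400x presumably come from the exact counts, which the post does not give.

```python
old_seeks, old_reads = 1_900_000, 3_800_000
new_seeks_max, new_reads_max = 4_700, 1_600  # "fewer than" bounds

seek_ratio = old_seeks / new_seeks_max
read_ratio = old_reads / new_reads_max
print(round(seek_ratio), round(read_ratio))  # 404 2375
```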
Server1 is the only server that has this new engine installed so far. It has the biggest data sets and was seeing the most lag with the old database, so it's the best server to stress test with. It's now back in circulation at the top of the list and seems to be lag-free. I'll be incrementally increasing the player cap over the next few days and seeing how it handles the load.
Assuming that all goes well, I will be rolling the new database engine out to the other servers next week. The end of server lag is almost in sight.
A big thanks goes out to sc0rp, who spent many hours discussing the intricacies of these systems with me in the forums, and filled my head with all sorts of great ideas. I had never even heard of linear hashing until sc0rp told me about it.
| Update: Server lag optimization and client improvements July 21, 2018
With the recent influx of new players to the game, the servers have been struggling to keep up.
Several months ago, I spent a lot of time on server database optimization, which made the servers around seven times more efficient than they were originally. But as maps fill up with player content, there's more and more information to process, and the load generally grows over time. In the past week, Server1, which has the most extensively settled map, had gotten very laggy. It was time to take another look.
This round of profiling revealed a bunch of hot spots that weren't database-related. The optimization process involves running a server with a profiler (I use the amazing Callgrind profiler from the Valgrind project), finding the most obvious hotspot, figuring out if there's a way to speed it up or---even better---skip that operation entirely, and then repeating with a new build to find the next biggest hotspot. I ended up going through this process nine times, fixing nine hotspots along the way. Some of these changes resulted in 7x speedups to certain parts of the server code.
However, even after several days of intensive work that showed huge performance gains in the profiler, when I finally brought Server1 back online for public use and a load of 38 players, the lag returned. There are more issues afoot with Server1 than just slow code. CPU usage jumped back up to 70%, while Server3 sits happily at around 20% with exactly the same player load.
So, Server1 and Server2 will remain "on ice" over the weekend, at the end of the server list where no one will use them by default, until Monday when I will resume diagnosing Server1's lag issues.
You may have also noticed that the connection management features of the client have been greatly improved. You can now specify a custom server from the SETTINGS screen, and copy/paste server addresses to share with friends. Bugs in the twin matching have also been resolved, so joining twin games is reliable. Some issues that caused logins to fail have also been fixed.
| Update: Twins July 14, 2018
 You want to play with friends.
But how can you do that without subverting the entire premise of the game? This is a game about being born as the helpless baby of a total stranger, and depending on that stranger for your survival. It's also about playing a small part in a family of strangers. You didn't know your mother in real life before you were born, and that's what makes her so special. Imagine if your mother was really your college roommate in disguise, and you knew it. Imagine if your mother talked to you over voice chat, and she sounded just like your college roommate.
But you want to play with friends. There's only one solution here, and I never said it was going to be pretty.
How can you play with friends and yet still be born as a helpless baby of a total stranger? Only if you and your friends are ALL born as helpless babies to the same total stranger. Twins. Triplets. Quadruplets.
If you act like you've known each other since before birth, you have---you shared a womb together, after all. To others in the game, you almost seem to read each other's minds. And you look identical. And your poor mother---one baby is hard enough. It's not easy being a twin.
Also, a bunch of little bugs have been fixed along the way.
| Update: Curses July 6, 2018
griefers.... Griefers! GRIEFERS!!!
Yes, you can kill them, just like in real life, but there's one little wrinkle: this game has reincarnation. Thus, you may be dogged by one wayward soul life after life after life. You just can't seem to get rid of them. Even if everyone in your village agrees, you're helpless, long-term, to deal with this handful of trouble-makers.
A metaphysical problem needs a metaphysical solution: The brand new curse system.
Everyone starts out with a curse score of zero and a single curse token to spend. They spend their token by saying CURSE JOHN SMITH. Mr. Smith's curse score goes up by one.
If you have no curse token, you earn another one after playing/living for two hours. You have a max of one token.
If you have a curse score, it goes down by one point after playing/living for one hour.
If Mr. Smith's curse score ever reaches 10, he will be born into his next life marked with inverse-color speech bubbles.
And yes, cursing is done by name, so unique baby names are more important than ever. If two people share a name, the one that received the name first gets the curse point. And nameless people cannot be referred to by name. Name your babies, people.
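The token and score rules above can be modeled in a few lines. This is a sketch of the rules as stated, not the server's code. The `Player` class and its field names are hypothetical. It also assumes the mark is decided by the score a player holds at the moment of checking, which the post does not spell out.

```python
from dataclasses import dataclass

CURSE_THRESHOLD = 10      # score that triggers the mark
TOKEN_REGEN_HOURS = 2.0   # hours lived to earn a token back
SCORE_DECAY_HOURS = 1.0   # hours lived per point of score decay


@dataclass
class Player:
    name: str
    curse_score: int = 0
    tokens: int = 1                    # everyone starts with one token
    hours_without_token: float = 0.0
    hours_since_decay: float = 0.0

    def curse(self, target):
        """Spend the token on target; returns False if no token is held."""
        if self.tokens == 0:
            return False
        self.tokens = 0
        self.hours_without_token = 0.0
        target.curse_score += 1
        return True

    def live(self, hours):
        """Advance this player's lived time, applying regen and decay."""
        if self.tokens == 0:
            self.hours_without_token += hours
            if self.hours_without_token >= TOKEN_REGEN_HOURS:
                self.tokens = 1  # max of one token
                self.hours_without_token = 0.0
        if self.curse_score > 0:
            self.hours_since_decay += hours
            while (self.hours_since_decay >= SCORE_DECAY_HOURS
                   and self.curse_score > 0):
                self.curse_score -= 1
                self.hours_since_decay -= SCORE_DECAY_HOURS
            if self.curse_score == 0:
                self.hours_since_decay = 0.0

    def marked(self):
        return self.curse_score >= CURSE_THRESHOLD


john = Player("JOHN SMITH")
villagers = [Player(f"VILLAGER {i}") for i in range(10)]
for v in villagers:
    v.curse(john)

marked_at_ten = john.marked()
second_try = villagers[0].curse(john)  # token already spent
john.live(1.0)                         # score decays from 10 to 9
villagers[0].live(2.0)                 # token regenerates
third_try = villagers[0].curse(john)   # score back up to 10
print(marked_at_ten, second_try, john.curse_score, third_try)
# True False 10 True
```

With tokens regenerating at half the rate that scores decay, it takes sustained agreement from many players to keep someone at the threshold.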
Also in this update, along with a few fixes, is a new home arrow distance estimate if the home arrow (or bell tower) is more than 1000 tiles away. And the tutorial has been updated with a few new things. Even veteran players might want to take another look at it, especially past the end area.
| Update: Tutorial June 30, 2018
If you bounced off the game in the past because it was too hard to figure out, this is a great time to give it another shot.
This week's update brings a brand new tutorial.
This has always been a game that just throws you right in there, and that will remain the case, so don't worry. The tutorial mostly explains the nuances of the controls and systems. There are a lot of little things that just aren't explained anywhere else. Some of these things are lost on veteran players.
Also added, and explained in the tutorial, is the new crafting-hint filter feature. If you want to make a hatchet, for example, you can cull down the hint list to show only steps that are relevant along the way to the hatchet. This greatly reduces the "sharp stone does 46 things" problem, and should allow pretty much everyone to be able to figure out the final crafting challenge at the tail end of the tutorial.
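One way to think about such a filter is as a walk backward through the crafting graph from the target item. The sketch below is a guess at the general idea, not the game's implementation. The recipes in `RECIPES` are made up for illustration and are not the game's real crafting data, and `relevant_steps` is a hypothetical helper.

```python
# product -> the two things combined to make it (made-up recipes)
RECIPES = {
    "sharp stone": ("stone", "big rock"),
    "handle": ("sharp stone", "branch"),
    "rope": ("thread", "thread"),
    "bound handle": ("handle", "rope"),
    "hatchet": ("bound handle", "sharp stone"),
    "skewer": ("sharp stone", "sapling"),  # uses sharp stone, not needed
    "basket": ("reed", "reed"),            # unrelated
}


def relevant_steps(target, recipes):
    """Return the craftable products that lie on some path to target."""
    needed, todo = set(), [target]
    while todo:
        item = todo.pop()
        if item in needed or item not in recipes:
            continue  # already visited, or a raw material
        needed.add(item)
        todo.extend(recipes[item])
    return needed


steps = relevant_steps("hatchet", RECIPES)
print(sorted(steps))
# ['bound handle', 'handle', 'hatchet', 'rope', 'sharp stone']
```

The skewer recipe also uses a sharp stone, but it is dropped from the list because nothing on the way to the hatchet depends on it.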
You have to get through the tutorial, and solve that last crafting challenge, to actually play the game, so I'm really betting the farm on this one. At least you'll know that if a baby is born to you in-game, that player is not totally clueless.
This new filter system also just might make wikis and other external crafting guides unnecessary, but we'll see how it goes.
A bunch of little content issues have been fixed. Committing murder now "counts" as living a full time block in a family line, just like being a victim of murder does. Griefers can no longer kill and then suicide before 30 minutes elapse to come back at the same family again. The lineage ban has been upped to 1.5 hours of life lived in other lines before returning to a given family line.