One Hour One Life Forums

a multiplayer game of parenting and civilization building

You are not logged in.

#1 2018-07-30 23:17:45

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Inserts in database are slow now

StackDB made inserts the fastest operation, rather stupidly, because inserts are the least common operation.

LinearDB, as a side effect of the way that it works, makes inserts the slowest operation.  Inserts are the only operation that incrementally expand the table, and because of the way linear probing works, expanding the table can involve moving a bunch of cells on the disk.

This is generally okay, because inserts are relatively rare---a few per active player per second at most.  And furthermore, inserts don't usually expand the table, but replace existing cells in the table (as things are moved around in a town, for example).  So the expensive kind of insert happens very rarely.

EXCEPT....

In the case of the tutorial map being loaded for you.  The tutorial map currently includes 2200+ cells that need to be inserted into the map out in the wilderness.  Those locations are eventually reused---after people finish their tutorials---but it takes a while for this to start happening.

I'm currently clocking about 3 seconds of "pre-load" time when I try to play the tutorial on server1, before the map is sent to my client.  That's 3 seconds where the server isn't doing anything for anyone else.  Three seconds of bouncing for everyone.  Every time someone loads the tutorial.

Offline

#2 2018-07-30 23:54:53

Uncle Gus
Moderator
Registered: 2018-02-28
Posts: 567

Re: Inserts in database are slow now

Owie

Offline

#3 2018-07-31 00:12:28

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

It seems like some kind of "lazy update" is possible with the insert.

Due to linear probing, when we need to expand the table by a cell, we often have to re-hash and reinsert a bunch of cells (in the full run of non-empty cells created by previous linear probing).  The full run of non-empty cells to the right of what we're splitting.

However, we will eventually get around to taking a look at those other cells to the right, later on, when we expand the table more.

Seems possible to rehash only the current cell, and if it moves out of the way, leave an "empty cell" marker in its place to not mess up linear probing in the future.

Then, for all cells before the current split point, mod them with both original and expanded table size, and check both places for them.

Offline

#4 2018-07-31 01:27:51

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

Ugg... actually, that lazy method won't work, because it will allow contiguous regions of cells in the table to grow forever, which is bad for linear probing.

Feel kinda stuck here.

Offline

#5 2018-07-31 05:41:57

Go! Bwah!
Member
Registered: 2018-03-16
Posts: 204

Re: Inserts in database are slow now

Would running a dedicated tutorial server with StackDB and aggressive map culling work?  It smells kinda bad but maybe this problem doesn't have to be solved elegantly right this second.


I like to go by "Eve Scripps" and name my kids after medications smile

Offline

#6 2018-07-31 06:29:08

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

I thought about it!

There's a few things:

1.  The number of people who might be playing the tutorial at any one time has no bound.  If the game surges in popularity, there could be hundreds.  Too many to fit on one server, maybe.  So I'd need separate load balancing for the tutorial servers.

2.  Maintaining separate code bases for the tutorial servers seems troublesome.  Currently, all servers run the same code, and they update automatically with pulls from the latest code on github.

3.  The problem with inserts being slow.... is... well, a problem.  Seems like it could be a general problem that could bite me down the road.  Inserts shouldn't be slow...

Offline

#7 2018-07-31 07:41:50

third.eye
Member
Registered: 2018-07-19
Posts: 24

Re: Inserts in database are slow now

I don't know if it's in your interest because of the easteregg or maybe it's not easy codewise, but could you just get rid of tutorial on servers and make it a local client thing?

Offline

#8 2018-07-31 14:04:02

Christoffer
Member
From: Sweden
Registered: 2018-04-06
Posts: 148
Website

Re: Inserts in database are slow now

jasonrohrer wrote:

3.  The problem with inserts being slow.... is... well, a problem.  Seems like it could be a general problem that could bite me down the road.  Inserts shouldn't be slow...

I think you have the right instinct here. Inserting a value shouldn’t take on the order of milliseconds. I would focus there first.

If it turns out that it’s huge effort to improve, I would go with reducing the number of inserts as a second approach. By making the DB block size configurable instead of single tile, you can try out 4x4, 8x8 etc... I’m sure that’s not done without its own effort, but there is the added gain that it’s an upgrade of your system from an engineering perspective (just don’t pick a fixed size - make it configurable, so you can change it easily ).

Offline

#9 2018-07-31 14:15:25

PunkyFickle
Member
Registered: 2018-07-13
Posts: 69

Re: Inserts in database are slow now

jasonrohrer wrote:

Too many to fit on one server, maybe.

A temporary solution could be to direct tutorial players to low traffic/empty servers (?).

third.eye wrote:

could you just get rid of tutorial on servers and make it a local client thing?

Local tuts could be technically interesting if feasible, but solving the slow inserts issue is the priority. And it would make Operation Save the Noobs impossible. And that would be a huge shame!

Offline

#10 2018-07-31 17:37:28

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

Well, part of the reason for having tutorials run on the servers is that it's a quick and dirty hack.  It just works, and was like 300 lines of code on the server (to load the map).  Actually, most of that code was already there, because I had the power to load test maps (for testing new content).

All of the game logic is on the server, none is on the client.  Thus, playing the tutorial on the server was the only realistic way forward for a one-person dev team.

Side benefit is being able to "escape" from the tutorial and find other players.

Offline

#11 2018-07-31 17:37:43

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

Also, sc0rp, where are you?  :-)

Offline

#12 2018-07-31 17:46:17

Chard
Moderator
Registered: 2018-03-04
Posts: 125

Re: Inserts in database are slow now

jasonrohrer wrote:

Also, sc0rp, where are you?  :-)

Server could keep a tutorial zone ready at all times and once it is populated, starts loading a new one over a few seconds, 20-30 objects at a time.

Edit: "fun" side effect, if you can find the tutorial zone you can mess it up before a player arrives in it.

Offline

#13 2018-07-31 18:07:51

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

Yeah, Chard, I thought about this too.

But that will also slow server startup time.

Well, actually not necessarily.  The inserts are only slow if the hash table is already at max fullness, because every insert after that has to expand the table and move items around.  So, my idea was to make the hash table big enough, at startup, to contain the current map items AND 200 tutorial areas, and then pre-load them.  Seems like a pretty ugly work-around for slow inserts, though.

Offline

#14 2018-07-31 18:11:29

Chard
Moderator
Registered: 2018-03-04
Posts: 125

Re: Inserts in database are slow now

jasonrohrer wrote:

Yeah, Chard, I thought about this too.

But that will also slow server startup time.

Well, actually not necessarily.  The inserts are only slow if the hash table is already at max fullness, because every insert after that has to expand the table and move items around.  So, my idea was to make the hash table big enough, at startup, to contain the current map items AND 200 tutorial areas, and then pre-load them.  Seems like a pretty ugly work-around for slow inserts, though.

What works works in my book. It wouldn't substantially slow startup time to load one tutorial area and you don't even need to do that, as long as you don't load the entire tutorial area in one go it wouldn't cause lag.

Maybe there is a way to improve insert times and if there is that's clearly worth considering. But tutorials are the edge case, you work around them, and whatever works works.

Offline

#15 2018-07-31 19:36:45

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

I think I've figured out how to speed up inserts.

When splitting a bucket leaves a hole that would trip up linear probing, I'm currently rehashing the whole island to the right.  This is safe, and always leaves the table in a consistent state, but it can be slow.  A single insert can trigger the re-hashing and re-insert of 100s of items in an island, resulting in lots of disk touches.

There are other methods for dealing with deletion:

https://en.wikipedia.org/wiki/Linear_probing#Deletion

The first method, where you search to the right for a cell with which to fill this hole, and then move it, and then search further to the right to find a cell to fill the new hole, and repeat, immediately returns the table to a correct state with far fewer disk touches than what I'm currently doing, but still an unknown number of disk touches per insert.

The second method, where you add a "something used to be here" marker to the cell, has the problem of letting the islands keep growing.  However, in my implementation, islands actually don't result in more disk touches, because the island is walked through in RAM only.  So bigger islands aren't so dangerous.

Finally, the "something used to be here" marker is only important to remember until we have split the entire table.  Once we've done that, all the markers that we left behind along the way can essentially vanish, breaking up the islands again.  These markers help us find cells further to the right during linear probes, but after we've split the whole table, every cell "further to the right" of one of these markers has been rehashed, so the marker isn't needed anymore.

The new idea is to have these markers start at 2 and "count up" each time the table doubles in size.  Thus, when the marker goes up from 2 to 3, we can start ignoring the old 2 markers, and treat them as ordinary empty cells.  Then later, when the table doubles again, we can start ignoring the 3-markers, and treat them as empty cells, instead paying attention to the newer 4 markers.

The other case we have to worry about is at the end of the table.  Say we have a 10-slot table, and we insert 9, 19, and 29.  The first one will go in slot 9, but the next two will "wrap around" back to 0 and 1 due to linear probing.

Now if we expand the table by one slot due to linear hashing, and nothing lands immediately in the new slot, we can't put an empty cell there, because we won't be able to find 19 and 20.  We'll start looking at slot 9, and walking to the right, encounter an empty cell and stop looking.

So we need to put a "something used to be here" flag in the new cell added to the end of the table.  BUT... we need to keep this cell active even after the table finished doubling (once we get up to 20 cells).  Suppose we were using flag=5 when splitting the cells 0-9, and leaving holes in there.  Once we're done doing that, and we have grown to 20 cells, we will forget about flag=5 and start using flag=6 for future splits.

So, while we are splitting the 10-cell table and adding new empty cells at the end, we should be putting flag=6 in those new cells at the end.  After we're done splitting, we still need to remember them.... until we finally finish splitting the 20-cell table into a 40-cell table.  Then we will have rehashed everything again, and we can forget about flag=6 in favor of flag=7.


The flag increments every time the table doubles in size, meaning that an 8-bit flag can accommodate table sizes on the order of the number of atoms in the known universe.... i.e, that ought to be enough for anyone.


The down-side to this approach is bigger islands, that linger up until the next split, but I'm not sure this is such a big problem, given that the islands are traversed in RAM anyway.

Offline

#16 2018-07-31 20:50:45

Uncle Gus
Moderator
Registered: 2018-02-28
Posts: 567

Re: Inserts in database are slow now

Re:tutorials, why not just have a server dedicated to tutorials and redirect them all there?

Offline

#17 2018-07-31 21:10:44

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

Offline

#18 2018-07-31 21:14:56

Uncle Gus
Moderator
Registered: 2018-02-28
Posts: 567

Re: Inserts in database are slow now

Didn't read lol

Offline

#19 2018-07-31 21:27:04

Christoffer
Member
From: Sweden
Registered: 2018-04-06
Posts: 148
Website

Re: Inserts in database are slow now

Imagine if one server is all tutorials, and everyone on it would experience three seconds lag every time a new player started. That would be even worse, on so many accounts.

Offline

#20 2018-07-31 21:49:45

PunkyFickle
Member
Registered: 2018-07-13
Posts: 69

Re: Inserts in database are slow now

Flintstone wrote:

Imagine if one server is all tutorials, and everyone on it would experience three seconds lag every time a new player started. That would be even worse, on so many accounts.

And a such a critical moment in a player's game experience.

So, what about directing players to low population servers (as a temporary fix)? You have the population size information server side and you know if someone loads the tutorial, so couldn't you select a server at the player's connection rather than using the 'go to the highest priority server' algorithm? (If I understood well how it currently work)

Offline

#21 2018-07-31 22:07:50

YAHG
Member
Registered: 2018-04-06
Posts: 1,347

Re: Inserts in database are slow now

What if it just ran tutorials on a local server? They won't notice the lag because they are loading anyways.

The running into other people int he tutorial is kinda cool but could give it up.


"be prepared and one person cant kill all city, if he can, then you deserve it"  -pein
https://kazetsukai.github.io/onetech/#
https://onehouronelife.com/forums/viewtopic.php?id=1438

Offline

#22 2018-07-31 23:50:28

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,804

Re: Inserts in database are slow now

Well, that would entail getting the server built and running smoothly on Mac and Windows...

Also, the above method of speeding up inserts (https://onehouronelife.com/forums/viewt … 331#p26449) doesn't work well in practice.  The islands (with deleted element holes) grow to be huge.  You do way less disk touching, but the RAM operations become so large that it's way slower anyway.  And if you have an island full of holes, and you do an insert, you have to search the whole island anyway, because your insert might be a replace.  When I was devising it, I thought, "Oh, great, you can just insert in the first hole you find," but that's not the case if replacement is going to work correctly without wasting space or leaving duplicate records around.

I've found one small optimization in the table expansion (we don't need to delete and re-insert a cell that hashes to the same bin as before), but we still need to read every cell in the island, at least, and many of them need to be re-inserted.  And because you want to deal with these cells one at a time (to avoid slurping them all into RAM), there's a lot of random disk access going on as you read a cell and then write it elsewhere in the table, then read the next cell in the island, repeat.

Ran some tests inserting 200,000 items into a table that starts with only 2000 cells (so lots of table expansion is happening... an expansion every insert, almost), and I see this:

Inserted 234256
12 cells read on average per table expand (233257 expands), worst read = 1053
5 cells moved on average per table expand (233257 expands), worst moved = 520

So you can see that this is pretty bad... on average, 12 reads per insert is bad, but worst case... 1053 reads for a single insert?  There's a huge island in there, and a huge portion of it needs to be re-inserted.

Offline

Board footer

Powered by FluxBB