One Hour One Life Forums

a multiplayer game of parenting and civilization building


#26 2018-07-24 16:11:57

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

So, instead of linear probing or chains to handle collisions, why not always handle any collision by adding a bucket at the end and rehashing the bucket pointed to by S?  Then try inserting again, if there's a collision, add a bucket, and so on?

I'm not sure if I understand your question correctly.

If you're asking why not rehash only the bucket that had the collision: lookup needs some deterministic order of buckets to find items later on.  If you keep adding buckets in arbitrary places, you'll need to keep some mapping to find them later.  With a naive implementation that mapping will grow to be as big as the hash table itself; a non-naive implementation leads you to virtual/extendible hashing.

If you're asking why not rehash everything up to the bucket that had the collision: that's really just a more elaborate way of doubling the whole table in one go, which is the problem you want to solve.  On the first collision you will, on average, need to rehash half the elements in the table.  On a collision in an already-split bucket you'd have to first finish rehashing all the remaining elements and then start another rehash pass up to the collision bucket.  So in the long run it's just doubling on collision.

jasonrohrer wrote:

Otherwise, you need some other metric to determine when you need to add a bucket (how long is the list in this bucket, or how many failed linear probing steps did I execute?)

Typical choices are:
1. Count the number of inserted items.  You know how many elements you can store, so every time you go past the expected load, add a bucket.
2. Add a bucket every Nth insert.  If your bucket holds 16 elements and you want 50% load, just add a new bucket every 8 inserts (see the sketch after this list).  That's pretty much equivalent to point 1, but you need to start with one bucket and do something sane on restarts.
3. Add a bucket every time some bucket overflows.
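
To make it concrete, here's a minimal sketch of linear hashing with the "every Nth insert" trigger from point 2.  It's my own illustration with hypothetical names (LinHash, bucket_for, rehash_bucket), not code from KissDB, StackDB or anything else in this thread.  The split pointer walks forward one bucket at a time, and lookups stay deterministic because buckets below the split pointer are addressed with the next round's (double-sized) hash.

#include <stdint.h>

/* Hypothetical linear hash table state: n0 initial buckets, `level`
   completed doubling rounds, `split` = next bucket to split. */
typedef struct {
    uint32_t n0;
    uint32_t level;
    uint32_t split;
    uint64_t inserts;
} LinHash;

/* Map a key hash to a bucket index.  Buckets below `split` were already
   split this round, so they are addressed with the double-sized hash. */
static uint32_t bucket_for(const LinHash *t, uint64_t h) {
    uint32_t roundSize = t->n0 << t->level;              /* n0 * 2^level */
    uint32_t b = (uint32_t)(h % roundSize);
    if (b < t->split)
        b = (uint32_t)(h % ((uint64_t)roundSize << 1));
    return b;
}

/* Hypothetical helper: rehash the entries of bucket `from` with the
   double-sized hash, keeping those that still map to `from` and moving
   the rest to the newly added bucket `to`. */
void rehash_bucket(uint32_t from, uint32_t to);

/* Call after every insert.  With 16-slot buckets and a 50% load target,
   add one bucket (and split one) every 8 inserts. */
static void maybe_split(LinHash *t) {
    if (++t->inserts % 8 != 0)
        return;
    uint32_t roundSize = t->n0 << t->level;
    rehash_bucket(t->split, t->split + roundSize);       /* add bucket at the end */
    t->split++;
    if (t->split == roundSize) {                         /* doubling round finished */
        t->split = 0;
        t->level++;
    }
}

Only one bucket's worth of entries moves per split, which is the whole point: the "double the table" cost gets spread over many inserts.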

jasonrohrer wrote:

Seems like a collision is a good enough signal that a new bucket is needed, and the new bucket operation is cheap enough that we don't need to do it sparingly (it's not like we're doubling the table at the first sign of a collision, we're just adding one new bucket).

Yes, that's one of the approaches.  It will stabilize at some load that depends mainly on how many items you have per bucket.  The drawback is that you cannot control the load, so you may not like the performance.  E.g. it will still be O(1), but you may need 8 reads per lookup/insert on average, with high variation on occasional long chains.

jasonrohrer wrote:

HOWEVER, I also did spend some time today looking around for simple, single C-file databases, just to see how they were doing things.  Besides KISSDB, there's really nothing else out there, especially not something with a permissive license.  Most of the other options are bloated, heavy-weight libraries.

So, if you finish your implementation, I'd highly recommend putting it on GitHub so that other people can use it.

Maybe I'll put it there.  But IMHO it has a very narrow use case: when your dataset is big enough that you'd constantly need to tweak KissDB parameters, when you don't need readers concurrent with a writer, and when you don't need perfect durability, so a few records going missing on power loss is not something you care about.  Quite a rare combination IMO.

jasonrohrer wrote:

It would be really interesting to see how it performs on various test data sets when compared to KISSDB and other similar embedded databases like Kyoto or LMDB.

It should be equal to or faster than KissDB for all cases that matter.  It'll be faster than LMDB, but we're not competing in the same category.  LMDB does full ACID, readers don't block the writer, and your values can have variable size.  I can't provide that in a simple way.

jasonrohrer wrote:

And why not use LMDB?  Looking at this header, my eyes glaze over:

http://www.openldap.org/devel/gitweb.cg … 34;hb=HEAD

It's a pretty solid choice.  You will be better off with it than with any hand-made solution.  You can also implement all the further optimizations I've suggested on top of it (prefetch thread, grouping data).

The other option to consider is Berkeley DB.  Version 5 has a quite permissive license.  But most people have been moving away from it since Oracle acquired it and changed the license in v6 to the Affero GPL.


If you want a quick solution with minimal changes to the code, there is another option: move back to KissDB.  IIRC you rewrite the whole database on shutdown to wipe old data.  Instead of wiping, calculate the number of elements and allocate a new KissDB at 25% load.  That should work great until the next restart.  There is only one small fix I'd make to ensure there are no problems: KissDB uses a very bad hash function, so even at 25% load you may end up with some long chain that wastes a lot of memory on all the (almost empty) hash tables.  Replace it with MurmurHash2 or SipHash.
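
For reference, the 32-bit MurmurHash2 is tiny.  This is a sketch from memory of Austin Appleby's public-domain reference implementation, so double-check it against the original before dropping it in (the memcpy is there because the keys aren't guaranteed to be aligned):

#include <stdint.h>
#include <string.h>

static uint32_t MurmurHash2(const void *key, int len, uint32_t seed) {
    const uint32_t m = 0x5bd1e995;
    const int r = 24;
    uint32_t h = seed ^ (uint32_t)len;
    const unsigned char *data = (const unsigned char *)key;

    while (len >= 4) {
        uint32_t k;
        memcpy(&k, data, 4);          /* avoid unaligned reads */
        k *= m;  k ^= k >> r;  k *= m;
        h *= m;  h ^= k;
        data += 4;
        len -= 4;
    }
    switch (len) {                    /* trailing bytes */
        case 3: h ^= (uint32_t)data[2] << 16;  /* fall through */
        case 2: h ^= (uint32_t)data[1] << 8;   /* fall through */
        case 1: h ^= (uint32_t)data[0];
                h *= m;
    }
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;
    return h;
}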

Last edited by sc0rp (2018-07-24 16:23:07)

Offline

#27 2018-07-24 16:35:24

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

wondible wrote:

This jogged a memory of some posts I've seen

"A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set"."

https://en.wikipedia.org/wiki/Bloom_filter

I thought about this.  It's not that useful in this case.  It needs ~10 bits per key for a 1% false positive rate, with random access all over it, so you'd pretty much have to keep it in RAM, and it grows proportionally to the number of records in the db.  You'd need to completely recreate it from time to time to keep the 1% false positive rate as the number of records grows.  And it only deals with part of the problem: null lookups.  Getting old data will still cause lag, with hundreds of reads multiplied by a hundred tiles / a thousand items and sub-items.  Fixing that needs a fix to the underlying structure anyway, and then your Bloom filter becomes useless.  So it's a waste of time IMO.
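
(The ~10 bits per key figure falls out of the standard Bloom filter sizing formulas; a quick check, just my own sketch:)

#include <math.h>
#include <stdio.h>

/* m/n = -ln(p) / (ln 2)^2 bits per key, k = (m/n) * ln 2 hash functions.
   For p = 1% this prints roughly 9.59 bits/key and 6.6 hashes. */
int main(void) {
    double p = 0.01;                                   /* target false positive rate */
    double bitsPerKey = -log(p) / (log(2) * log(2));
    double hashCount  = bitsPerKey * log(2);
    printf("bits/key = %.2f, hash functions = %.1f\n", bitsPerKey, hashCount);
    return 0;
}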

Offline

#28 2018-07-24 17:43:59

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

The funny thing about "add a new bucket when you see a bucket overflow" is that the bucket that randomly overflowed may fall after the current split point, so you split a bucket that doesn't help you with your current overflow problem.  I guess "overflow" is kind of a squishy term here, where we mean, "It's over the limit, but we'll keep letting it grow even bigger until we eventually get around to splitting it."

Looked at Berkeley DB, and the source download is something like 42 MB compressed.  Um... it's bigger than my entire game.  No thanks.

Some early testing, long ago, found that KISSDB was an order of magnitude faster than some other options.  I think I tried SQLite and straight-up MySQL.  There's a lot of overhead involved in all the extra stuff that is just not needed here.

Of course, I don't think those early tests got to the point where KISSDB was way overloaded, so the lack of indexing, etc., didn't show.  I'm guessing that MySQL would have started coming out ahead in the end once KISS got overloaded.  Indexed MySQL is O(log n) with a large constant, where KISSDB is O(n) with a tiny constant.


Thus, I disagree, and believe that a custom-coded solution is better here.  Because I can understand its inner workings, I can profile it directly and find bottlenecks that might have very specific work-arounds, given my particular application.  Trying to do something similar with any of the gigantic, off-the-shelf options would be ridiculous.

Offline

#29 2018-07-24 19:27:23

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

sc0rp, STACKDB is 7x faster than KISSDB for common operations, and uses infinitely less RAM (because it uses no RAM), and slightly less disk space.  The reason we could ever support 100+ players on one server was the switch to STACKDB.  Compare this benchmark:

KISSDB:

Opening DB (table size 80000) used 1472 bytes, took -0.000000 sec
Inserted 1999396
Inserts used 19202048 bytes, took 36.488000 sec
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 30.227000 sec
Checksum = 2123813000
Last-inserted lookup for 500 batchs of 2500 (1250000/1250000 hits)
Last-inserted lookup used 0 bytes, took 28.654000 sec
Checksum = 3471250000
Random lookup for non-existing 500 batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 57.921000 sec
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 638976 bytes, took 65.352000 sec
Iterated 2002395, checksum 2833656880
Iterating used 0 bytes, took 2.281000 sec

real	3m40.928s
user	0m52.496s
sys	2m47.916s
-rw-rw-r-- 1 jasonrohrer2 jasonrohrer2 43869016 Jul 24 12:16 test.db

STACKDB

Opening DB (table size 80000) used 1496 bytes, took 0.004000 sec
Inserted 1999396
Inserts used 0 bytes, took 11.338000 sec
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 4.563000 sec
Checksum = 2123813000
Last-inserted lookup for 500 batchs of 2500 (1250000/1250000 hits)
Last-inserted lookup used 0 bytes, took 2.033000 sec
Checksum = 3471250000
Random lookup for non-existing 500 batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 3.774000 sec
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 0 bytes, took 5.525000 sec
Iterated 2002395, checksum 2833656880
Iterating used 0 bytes, took 3.008000 sec

real	0m30.247s
user	0m6.892s
sys	0m23.340s
-rw-rw-r-- 1 jasonrohrer2 jasonrohrer2 41327915 Jul 24 12:17 test.db

This benchmark is showing performance with a 2500% load factor.


So, the idea of going back to KISSDB, with a weekly table resize....

The main problem with KISSDB is that the latest data is the slowest to access.  With civs dying out all week long, you want the latest data to be the fastest to access.

However, that benchmark shows some other issues with KISSDB.  Even random access or NULL-result access is way slower.  I can't remember why, except that I spent quite a while profiling STACKDB.

I'll post another benchmark in a bit with a 50% load factor for both.

Offline

#30 2018-07-24 19:42:33

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Okay, another benchmark with 50% loading (table size of 4,000,000).

Here's KISSDB:

Opening DB (table size 4000000) used 1472 bytes, took 0.000000 sec
Inserted 1999396
Inserts used 64004096 bytes, took 70.637000 sec
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 2.839000 sec
Checksum = 2123813000
Last-inserted lookup for 500 batchs of 2500 (1250000/1250000 hits)
Last-inserted lookup used 0 bytes, took 2.056000 sec
Checksum = 3471250000
Random lookup for non-existing 500 batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 1.365000 sec
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 31997952 bytes, took 6.675000 sec
Iterated 2002395, checksum 2833656880
Iterating used 0 bytes, took 2.839000 sec

real	1m26.416s
user	0m7.724s
sys	0m24.396s
-rw-rw-r-- 1 jasonrohrer2 jasonrohrer2 120028792 Jul 24 12:32 test.db

STACKDB

Opening DB (table size 4000000) used 1496 bytes, took 0.712000 sec
Inserted 1999396
Inserts used 0 bytes, took 136.304000 sec
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 5.093000 sec
Checksum = 2123813000
Last-inserted lookup for 500 batchs of 2500 (1250000/1250000 hits)
Last-inserted lookup used 0 bytes, took 3.058000 sec
Checksum = 3471250000
Random lookup for non-existing 500 batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 2.468000 sec
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 0 bytes, took 7.326000 sec
Iterated 2002395, checksum 2833656880
Iterating used 0 bytes, took 4.709000 sec

real	2m39.675s
user	0m9.108s
sys	0m33.276s
-rw-rw-r-- 1 jasonrohrer2 jasonrohrer2 104047915 Jul 24 12:37 test.db

Somewhat surprising that insert performance for both gets so much worse, with StackDB taking the biggest hit.

I think that the random access issue is becoming relevant, because the table in StackDB may be too big to fit in the disk cache.

On collision, KISSDB is going to insert a whole new page of the entire table, so that's not good either.

Also, the "Used ______ bytes" lines record changes in the total size of allocated memory.  So you can see that KISSDB is using quite a bit of RAM here.
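
(For anyone who wants to reproduce that kind of measurement: one option on Linux/glibc, not necessarily what this test harness does, is to diff the allocator's in-use counter before and after each phase.  heap_bytes_in_use is a hypothetical helper name.)

#include <malloc.h>
#include <stddef.h>

/* Requires glibc 2.33+ for mallinfo2(); uordblks is the total number of
   bytes currently allocated by malloc. */
static size_t heap_bytes_in_use(void) {
    struct mallinfo2 mi = mallinfo2();
    return mi.uordblks;
}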


Also, interesting to note that STACKDB with 2500% loading is 2.8x faster overall than KISSDB with 50% loading, but also that all the slowdown in KISSDB comes from INSERT, which is a relatively rare operation.

KISS with 50% loading is on-par or better than STACK with 2500% loading for the other operations.

Anyway, it's clear that "just make the table bigger" isn't going to solve all of the problems here.  With 12M map cells recorded, we'd need a KISS table with 24M cells, which gets doubled when there are collisions, and KISS stores these tables entirely in RAM.  I'm pretty sure I'd be exhausting system memory.

Offline

#31 2018-07-24 19:45:32

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

sc0rp, STACKDB is 7x faster than KISSDB for common operations, and uses infinitely less RAM (because it uses no RAM), and slightly less disk space.  The reason we could ever support 100+ players on one server was the switch to STACKDB.  Compare this benchmark:

KISSDB:
Opening DB (table size 4000000)
Inserted 1999396

Raise the table size to 8M elements (i.e. reduce the load to 25%) and rerun the test.

EDIT: Alternatively, can you easily check the longest chain?  I want to make sure that the crappy hash function is not the main culprit.

Last edited by sc0rp (2018-07-24 20:06:09)

Offline

#32 2018-07-24 20:13:15

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Worst bin list length with 2500% loading in STACKDB is 29 in one bin.  Not horrible.

I've run some comparisons today between Murmur2 and djb2, and I'm not seeing a huge difference in terms of collisions.  I also looked at some online benchmarks, and there's very little difference reported by anyone else, either.

I'm running the 2500% load KISSDB test now to get the max bin depth, but it takes a while.  Interesting that it uses high CPU, while the 50% load STACKDB took quite a while but used almost no CPU (I thought it was frozen for a while there).  Thus, it seems likely that STACKDB is indeed blocked waiting for the disk when the table size gets big.

Ah.  Max bin depth for KISSDB is 31.

Thus, there's not some insane "hot spot" in the djb2 hash that's causing a bunch of values to get piled in one bin.

Offline

#33 2018-07-24 20:20:10

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Also, all this testing is happening on a magnetic platter hard drive.

Offline

#34 2018-07-24 20:29:57

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

On collision, KISSDB is going to insert a whole new page of the entire table, so that's not good either.

That's the weakest point of its design.  The goal is to get the right hash function and a load low enough to keep the longest chain at 2-3 items.  Almost all items should be in the first table.

jasonrohrer wrote:

Also the "Used ______ bytes" lines are recording changes in the total size of allocated memory.  So you can see that KISSDB is using quite a bit of RAM here

It uses 8 bytes per hash table pointer.  You can halve memory use by changing uint64 to uint32 (this will alter the disk format as well).

jasonrohrer wrote:

Also, interesting to note that STACKDB with 2500% loading is 2.8x faster overall than KISSDB with 50% loading, but also that all the slowdown in KISSDB comes from INSERT, which is a relatively rare operation.

I would ignore insert performance for now.  Most of it is from writing the huge hash table once.

jasonrohrer wrote:

KISS with 50% loading is on-par or better than STACK with 2500% loading for the other operations.

When you ignore the initial insert and the final seq scan, it's not:

KDB Random look used 0 bytes, took 2.839000 sec
SDB1 Random look used 0 bytes, took 4.563000 sec
SDB2 Random look used 0 bytes, took 5.093000 sec

KDB Last-inserted lookup used 0 bytes, took 2.056000 sec
SDB1 Last-inserted lookup used 0 bytes, took 2.033000 sec
SDB2 Last-inserted lookup used 0 bytes, took 3.058000 sec

KDB Random look/miss used 0 bytes, took 1.365000 sec
SDB1 Random look/miss used 0 bytes, took 3.774000 sec
SDB2 Random look/miss used 0 bytes, took 2.468000 sec

KDB Inserts after miss used 31997952 bytes, took 6.675000 sec
SDB1 Inserts after miss used 0 bytes, took 5.525000 sec
SDB2 Inserts after miss used 0 bytes, took 7.326000 sec

jasonrohrer wrote:

Anyway, it's clear that "just make the table bigger" isn't going to solve all of the problems here.  With 12M map cells recorded, we'd need a KISS table with 24M cells, which gets doubled when there are collisions, and KISS stores these tables entirely in RAM.  I'm pretty sure I'd be exhausting system memory.

12M or 120M rows?  It uses 8 bytes/hash entry/table.  With 3 tables it's 300M for 12M rows.  Changing uint64 to uint32 will halve it: 150M.  Well, I had a hundred times more RAM in my previous laptop.

BTW, I don't suggest that as a final solution - just a quick and dirty way to fix the lag issues for now, until a better solution is implemented.

Last edited by sc0rp (2018-07-24 20:34:06)

Offline

#35 2018-07-24 20:41:09

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

The micro Linodes that I'm using have 1GiB system RAM.

I just installed a fresh Ubuntu node for testing, and with nothing of mine running, there are 226M free.

The only point I can see to holding the hash table in RAM is that it speeds up NULL results.  And those are rather common, I suppose.

Offline

#36 2018-07-24 20:42:10

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

Ah.  Max bin depth for KISSDB is 31.

Thus, there's not some insane "hot spot" in the djb2 hash that's causing a bunch of values to get piled in one bin.

31 for 50% load?  If that's not a typo, it's horrible.  I would expect 2-4.

EDIT: I missed that it's the 2500% version.

Last edited by sc0rp (2018-07-24 20:56:49)

Offline

#37 2018-07-24 20:44:08

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

Also, all this testing is happening on a magnetic platter hard drive.

If it fits in cache, it shouldn't matter.  If it doesn't, it'll be 100-1000 times slower.

Last edited by sc0rp (2018-07-24 21:02:23)

Offline

#38 2018-07-24 20:53:29

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

The micro Linodes that I'm using have 1GiB system RAM.

I just installed a fresh Ubuntu node for testing, and with nothing of mine running, there are 226M free.

Where does it go?  Are you sure you're not counting cache as used?

Can you strip it?  I always run such systems with no GUI and nothing unimportant running, so I'm used to having 90%+ of RAM free after boot.  I just checked now: even with a simple GUI it's 188M/1536M, and everything else is free or cache.

jasonrohrer wrote:

The only point I can see to holding the hash table in RAM is that it speeds up NULL results.  And those are rather common, I suppose.

Yes, no I/O for a null result.  You could compress all pages except the first; they will take next to nothing in the 50% load case.  But it's a stopgap solution, so probably not worth looking into.

Last edited by sc0rp (2018-07-24 21:04:31)

Offline

#39 2018-07-24 21:15:28

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Yes, I could strip the Ubuntu build.  It's the default Linode build, so I don't think it has any GUI or anything installed.  Actually, it doesn't even come with a web server installed.

Here's top output sorted by RES column:

Tasks:  88 total,   1 running,  49 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1009052 total,   160104 free,    80288 used,   768660 buff/cache
KiB Swap:   262140 total,   262140 free,        0 used.   753708 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                    
  558 root      20   0  626184  19572  11008 S  0.0  1.9   0:00.29 snapd                                      
  538 root      20   0  170424  17104   9284 S  0.0  1.7   0:00.11 networkd-dispat                            
  377 root      19  -1   78456   9928   9252 S  0.0  1.0   0:00.09 systemd-journal                            
    1 root      20   0   77896   8964   6608 S  0.0  0.9   0:02.22 systemd                                    
  932 root      20   0   76624   7572   6544 S  0.0  0.8   0:00.01 systemd                                    
  921 root      20   0  105684   7048   6056 S  0.0  0.7   0:00.76 sshd                                       
  556 root      20   0  287512   6804   5992 S  0.0  0.7   0:00.07 accounts-daemon                            
  595 root      20   0  288872   6540   5756 S  0.0  0.6   0:00.00 polkitd                                    
  562 root      20   0   70592   6116   5428 S  0.0  0.6   0:00.06 systemd-logind                             
  873 root      20   0   72296   5728   5004 S  0.0  0.6   0:00.00 sshd                                       
 1040 root      20   0   22776   5340   3624 S  0.0  0.5   0:00.10 bash                                       
  514 root      10 -10   25880   5264   4036 S  0.0  0.5   0:00.00 iscsid                                     
  448 systemd+  20   0   70612   5220   4672 S  0.0  0.5   0:00.23 systemd-resolve                            
  436 systemd+  20   0   71820   5200   4612 S  0.0  0.5   0:00.30 systemd-network                            
  545 message+  20   0   50056   4416   3816 S  0.0  0.4   0:00.15 dbus-daemon                                
  559 syslog    20   0  263036   4172   3660 S  0.0  0.4   0:00.01 rsyslogd                                   
  391 root      20   0   45072   4028   3140 S  0.0  0.4   0:00.18 systemd-udevd

Offline

#40 2018-07-24 21:20:37

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Also, for the 50% loading version, SSD comes to the rescue (on Linode... I've got a magnetic drive at home):

KISSDB:

Opening DB (table size 4000000) used 5840 bytes, took 0.000000 sec
Inserted 1999396
Inserts used 64004096 bytes, took 12.989000 sec
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 2.601000 sec
Checksum = 2123813000
Last-inserted lookup for 500 batchs of 2500 (1250000/1250000 hits)
Last-inserted lookup used 0 bytes, took 3.004000 sec
Checksum = 3471250000
Random lookup for non-existing 500 batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 1.056000 sec
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 31997952 bytes, took 6.904000 sec
Iterated 2002395, checksum 2833656880
Iterating used 0 bytes, took 3.317000 sec
Max bin depth = 3

real	0m29.877s
user	0m9.105s
sys	0m20.553s
-rw-r--r-- 1 root root 120028792 Jul 24 21:17 test.db


STACKDB:

Opening DB (table size 4000000) used 5872 bytes, took 0.138000 sec
Inserted 1999396
Inserts used 0 bytes, took 15.490000 sec
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 5.108000 sec
Checksum = 2123813000
Last-inserted lookup for 500 batchs of 2500 (1250000/1250000 hits)
Last-inserted lookup used 0 bytes, took 3.579000 sec
Checksum = 3471250000
Random lookup for non-existing 500 batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 2.426000 sec
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 0 bytes, took 7.492000 sec
Iterated 2002395, checksum 2833656880
Iterating used 0 bytes, took 6.270000 sec
Max bin depth = 1

real	0m40.506s
user	0m11.433s
sys	0m28.945s
-rw-r--r-- 1 root root 104047915 Jul 24 21:10 test.db

But STACKDB is still way faster than KISS for 2500% loading, mostly because it was designed and optimized for that case (moving stuff to the top of the stack on lookup is expensive, but worth it if the stack is deep).
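
(The move-to-front idea, as a generic in-memory illustration with hypothetical names - not the actual STACKDB code, which works on disk-backed stacks of fixed-size records:)

#include <string.h>

typedef struct Node {
    struct Node *next;
    char key[16];
    char value[4];
} Node;

/* Look up a 16-byte key in one bin's chain.  On a hit, unlink the node
   and push it to the head, so repeated lookups of hot keys stay cheap
   even when the chain is deep. */
static Node *lookup_mtf(Node **head, const char key[16]) {
    Node *prev = NULL;
    for (Node *cur = *head; cur != NULL; prev = cur, cur = cur->next) {
        if (memcmp(cur->key, key, 16) == 0) {
            if (prev != NULL) {
                prev->next = cur->next;
                cur->next = *head;
                *head = cur;
            }
            return cur;
        }
    }
    return NULL;    /* miss */
}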

Offline

#41 2018-07-24 21:22:14

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

Yes, I could strip the ubuntu build.  It's the default linode build, so I don't think it has any GUI or anything installed.  Actually, it doesn't even come with a web server installed.

Here's top output sorted by RES column:

Tasks:  88 total,   1 running,  49 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1009052 total,   160104 free,    80288 used,   768660 buff/cache
KiB Swap:   262140 total,   262140 free,        0 used.   753708 avail Mem 

You don't need to do anything.  It uses only 80MB; 160+768 MB is free.  Almost all of it is in cache because that's the only sensible use for it in this situation.  Once some program starts using it, the system will reclaim almost all of it.

Offline

#42 2018-07-24 21:40:57

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

You now have 1.5M null lookups/s.  I don't think you need more than that.

So now the real test.  Set the key and value sizes the same as in the production database.  Insert a similar number of records.  Clear the cache (allocating 900MB for StackDB, or 900MB minus the size of the hash tables for KissDB, on the Linode will do).  Run one of the tests (e.g. null lookups).  Clear the cache.  Run the next test.  Rinse, repeat.  Try to give KissDB a hash table large enough for 50% load, if memory permits.
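
(If allocating memory to evict the cache is awkward, you can also drop the Linux page cache directly between phases - needs root.  flush_disk_caches is just a name I made up for the harness hook; writing "3" to drop_caches drops the page cache plus dentries and inodes.)

#include <stdio.h>
#include <unistd.h>

static void flush_disk_caches(void) {
    sync();                                              /* write out dirty pages first */
    FILE *f = fopen("/proc/sys/vm/drop_caches", "w");
    if (f != NULL) {
        fputs("3\n", f);
        fclose(f);
    }
}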

We need to reproduce the hotspot in StackDB.  Maybe just giving it a larger hash table will help enough, so there will be no need to go back to KissDB.

Last edited by sc0rp (2018-07-24 21:44:07)

Offline

#43 2018-07-24 22:13:44

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Just ran analysis on actual map.db from server1:

15,733,863 database records total (246 max hash bin depth).

These are 16-byte keys with 4-byte values.

Offline

#44 2018-07-24 23:38:13

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

I've set up a similar KISSDB benchmark test with 15M database records and 30M hash table bins.

Full system disk cache flush between each test.

The initial insert took 800+ seconds (which is still 16K inserts per second).

Other operations seemed to be pretty quick.

But now I'm stuck on the iteration test at the very end, which seems to be taking FOREVER.

top - 23:35:11 up  3:11,  2 users,  load average: 1.00, 1.00, 1.00
Tasks:  92 total,   1 running,  53 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  5.5 sy,  0.0 ni,  0.0 id, 93.8 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1009052 total,    60212 free,   865320 used,    83520 buff/cache
KiB Swap:   262140 total,    97196 free,   164944 used.    28704 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                    
 9077 root      20   0  951540 810260   1552 D  8.0 80.3   5:25.95 kissTest                                   
   34 root      20   0       0      0      0 S  0.7  0.0   0:17.14 kswapd0                                    
    1 root      20   0   77896   3512   2664 S  0.0  0.3   0:02.37 systemd                                    
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd                                   
    4 root       0 -20       0      0      0 I  0.0  0.0   0:00.00 kworker/0:0H 

It's using a lot of RAM, yes it is.

Offline

#45 2018-07-24 23:54:16

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Okay, it finally finished after 52 minutes:

Opening DB (table size 30000000) used 5840 bytes, took 0.001000 sec
Inserted 15752961
Inserts used 720003072 bytes, took 889.917000 sec
Flushing system disk caches.
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 3.932000 sec
Checksum = 186209500
Flushing system disk caches.
Last-inserted lookup for 500 batchs of 2401 (1200500/1200500 hits)
Last-inserted lookup used 0 bytes, took 3.122000 sec
Checksum = 283318000
Flushing system disk caches.
Random lookup for non-existing 500 repeating batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 2.265000 sec
Flushing system disk caches.
Random lookup for non-existing 1500000 non-repeating values (0/1500000 hits)
Random look/miss used 0 bytes, took 13.899000 sec
Flushing system disk caches.
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 240001024 bytes, took 10.301000 sec
Flushing system disk caches.
Iterated 15755961, checksum 1953944243
Iterating used 0 bytes, took 2200.038000 sec
Max bin depth = 4

real	52m3.596s
user	1m8.901s
sys	5m24.773s
-rw-r--r-- 1 root root 1275119280 Jul 24 23:12 test.db

I attached a debugger to it a few times to see what was going on, but that added an extra 3 minutes at most.

Clearly, using KISSDB in this capacity is not going to fly.

Offline

#46 2018-07-24 23:58:06

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Funny to compare STACKDB in top on the same data set:

top - 23:57:27 up  3:34,  2 users,  load average: 0.79, 0.47, 0.73
Tasks:  92 total,   2 running,  52 sleeping,   0 stopped,   0 zombie
%Cpu(s): 29.2 us, 70.4 sy,  0.0 ni,  0.0 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  1009052 total,   437720 free,    57072 used,   514260 buff/cache
KiB Swap:   262140 total,   223120 free,    39020 used.   806332 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                    
 9789 root      20   0   14036   1820   1676 R 99.3  0.2   1:33.23 kissTest                                   
 9087 root      20   0       0      0      0 I  0.3  0.0   0:00.04 kworker/u2:1                               
    1 root      20   0   77896   4640   4280 S  0.0  0.5   0:02.41 systemd                                    
    2 root      20   0       0      0      0 S  0.0  0.0   0:00.00 kthreadd                                   
    4 root       0 -20       0      0      0 I  0.0  0.0   0:00.00 kworker/0:0H                               
    6 root       0 -20       0      0      0 I  0.0  0.0   0:00.00 mm_percpu_wq  

Offline

#47 2018-07-25 00:08:49

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

jasonrohrer wrote:

Okay, it finally finished after 52 minutes:

Clearly, using KISSDB in this capacity is not going to fly.

Oh, it will.  We're very close.  It used 4 hash table pages, ~1GB of memory in total (30M slots x 8 bytes x 4 tables), so it had to constantly swap about 150MB of data in and out.

2 things:

1. Change the size of the hash pointers from 64 to 32 bits - this will halve memory usage.  This line, and all the relevant ones that deal with the hash table and its I/O (see the sketch after this list):

db->hash_table_size_bytes = sizeof(uint64_t) * (hash_table_size + 1); /* [hash_table_size] == next table */

2. Change the hash algorithm to MurmurHash or SipHash.  It may drop the worst-case chain from 4 buckets to 3 or 2.
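
A minimal sketch of point 1 for the quoted line, assuming every other place that reads or writes hash table entries is switched to uint32_t as well (note this changes the on-disk format, so existing databases would need a rebuild):

/* 32-bit hash slot offsets: half the memory of the uint64_t original.
   This is a sketch of the one quoted line, not a complete patch. */
db->hash_table_size_bytes = sizeof(uint32_t) * (hash_table_size + 1); /* [hash_table_size] == next table */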

Offline

#48 2018-07-25 00:12:35

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

Here's STACKDB on the same data set with an 80,000 bin table:

Opening DB (table size 80000) used 5872 bytes, took 0.004000 sec
Inserted 15752961
Inserts used 0 bytes, took 105.651000 sec
Flushing system disk caches.
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 16.904000 sec
Checksum = 186209500
Flushing system disk caches.
Last-inserted lookup for 500 batchs of 2401 (1200500/1200500 hits)
Last-inserted lookup used 0 bytes, took 4.498000 sec
Checksum = 283318000
Flushing system disk caches.
Random lookup for non-existing 500 repeating batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 29.529000 sec
Flushing system disk caches.
Random lookup for non-existing 1500000 non-repeating values (0/1500000 hits)
Random look/miss used 0 bytes, took 372.466000 sec
Flushing system disk caches.
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 0 bytes, took 25.118000 sec
Flushing system disk caches.
Iterated 15755961, checksum 1953944243
Iterating used 0 bytes, took 39.400000 sec
Max bin depth = 291

real	9m54.634s
user	1m54.917s
sys	7m6.532s
-rw-r--r-- 1 root root 443086923 Jul 25 00:05 test.db

You can see where it does poorly: the NULL-result lookups for non-repeating values.  For repeating NULL-result values, it remembers the last "not found" value at the top of each bin stack.

It's funny, but I guess I optimized for the wrong case here.

Offline

#49 2018-07-25 00:20:12

jasonrohrer
Administrator
Registered: 2017-02-13
Posts: 4,805

Re: Welp, found the actual source of lag

And here's STACKDB with a 10x bigger table:

Opening DB (table size 800000) used 5872 bytes, took 0.033000 sec
Inserted 15752961
Inserts used 0 bytes, took 111.969000 sec
Flushing system disk caches.
Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 10.327000 sec
Checksum = 186209500
Flushing system disk caches.
Last-inserted lookup for 500 batchs of 2401 (1200500/1200500 hits)
Last-inserted lookup used 0 bytes, took 3.470000 sec
Checksum = 283318000
Flushing system disk caches.
Random lookup for non-existing 500 repeating batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 11.715000 sec
Flushing system disk caches.
Random lookup for non-existing 1500000 non-repeating values (0/1500000 hits)
Random look/miss used 0 bytes, took 22.488000 sec
Flushing system disk caches.
Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 0 bytes, took 7.863000 sec
Flushing system disk caches.
Iterated 15755961, checksum 1953944243
Iterating used 0 bytes, took 43.319000 sec
Max bin depth = 34

real	3m32.298s
user	0m45.571s
sys	2m10.576s
-rw-r--r-- 1 root root 460366923 Jul 25 00:16 test.db

More than 10x improvement on the NULL-result case for non-repeating, non-existing values.  Still not on-par with KISS for that particular test.

Doing the thing to reduce KISS memory usage next.

Offline

#50 2018-07-25 00:31:45

sc0rp
Member
Registered: 2018-05-25
Posts: 740

Re: Welp, found the actual source of lag

Random lookup for 500 batchs of 3000 (1500000/1500000 hits)
Random look used 0 bytes, took 3.932000 sec

That's ~400k lookups/s - pretty good.

Last-inserted lookup for 500 batchs of 2401 (1200500/1200500 hits)
Last-inserted lookup used 0 bytes, took 3.122000 sec

Same.

Random lookup for non-existing 500 repeating batchs of 3000 (0/1500000 hits)
Random look/miss used 0 bytes, took 2.265000 sec

650k null lookups/s - when served from RAM & cache.

Random lookup for non-existing 1500000 non-repeating values (0/1500000 hits)
Random look/miss used 0 bytes, took 13.899000 sec

100k null lookups/s - when it needs to touch the disk half the time.  That's pretty good too.  SSDs can handle ~100k IOPS, so we're using 50% of capacity.  Pushing for more would require issuing requests in parallel.

Inserts for previously non-existing 500 batchs of 3000
Inserts after miss used 240001024 bytes, took 10.301000 sec

150k inserts/s.  Again, we're hitting the disk.  It will be faster for small batches.  Here we see the steady state, when things need to be pushed to disk.

Anyway, most of the time here is spent on adding the 4th hash table.  We're going from 50% to 55% load.  I'd recommend keeping it under 50% all the time, so after reducing memory usage just start at ~40% load.

Last edited by sc0rp (2018-07-25 00:58:08)

Offline
