Sharding

One of the performance issues with an LSM tree is write/read amplification. As a db grows, the number of levels (l) in the LSM tree increases logarithmically with the data size.

A record is rewritten by compaction about l times before it reaches the bottom level. Assuming the fanout of every level is n, each record amplifies write IO by a factor of O(l * n).

By splitting one LSM tree into several smaller ones, l becomes smaller for each tree, so write/read amplification is reduced.
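
The effect can be illustrated with a rough back-of-the-envelope sketch. It assumes a simplified leveled model in which the first level holds `base_size`, each further level is `fanout` times larger, and write amplification is approximated as l * n. The function names, the 256 MiB base size, and the shard count are hypothetical values chosen for illustration, not settings taken from this design.

```python
def lsm_levels(data_size: int, base_size: int, fanout: int) -> int:
    """Smallest number of levels whose bottom level can hold data_size.

    Simplified model (an assumption): level 1 holds base_size and every
    further level is `fanout` times larger than the one above it.
    """
    levels = 1
    capacity = base_size
    while capacity < data_size:
        capacity *= fanout
        levels += 1
    return levels


def write_amplification(levels: int, fanout: int) -> int:
    """Approximate write amplification as l * n, as described above."""
    return levels * fanout


# All sizes are in MiB; the numbers are illustrative only.
FANOUT = 10
BASE_SIZE = 256            # hypothetical size of the first level
TOTAL_SIZE = 1024 * 1024   # 1 TiB of data
SHARDS = 64                # hypothetical number of shards

single_levels = lsm_levels(TOTAL_SIZE, BASE_SIZE, FANOUT)
shard_levels = lsm_levels(TOTAL_SIZE // SHARDS, BASE_SIZE, FANOUT)

print("single LSM:", single_levels, "levels, write amp ~",
      write_amplification(single_levels, FANOUT))
print("per shard: ", shard_levels, "levels, write amp ~",
      write_amplification(shard_levels, FANOUT))
```

Under these assumptions a single 1 TiB tree needs 5 levels, while each of the 64 shards needs only 3, so the estimated write amplification drops from about 50 to about 30. Real numbers depend on the compaction strategy and level sizing of the actual engine.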