Paper reference: A Column Store Engine for Real-Time Streaming Analytics
Background:
According to the official website, MemSQL is “a high-performance data warehouse designed for the cloud that delivers ultra-fast insights of your live and historical data”. It uses row-storage and lock-free engine for data in memory and column-storage for data in the disk.
MemSQL could also store all data into the disk. In its most durable state, MemSQL will not lose any transactions which have been acknowledged. And it implements “Read Commited” isolation level for transactions.
I heard about MemSQL in early 2012, but don’t know it has become a OLTP-and-OLAP system until several days ago.
What is the problem?
To fulfil OLAP jobs, MemSQL has to store data into a disk with columnar-storage-format. But MemSQL still needs to process OLTP requests such as random INSERT or UPDATE. Therefore it stores data into a data structure named “Segment”. And by connecting different segments into an ordered segments list (which named “Sorted Runs”), MemSQL could balance the requirements between frequent INSERT/UPDATE operations and SCAN/GROUP BY operations.
For example:
1. Users INSERT three keys: 1, 92, 107. MemSQL will create a segment that contains the three keys:
[1, 92, 107]
2. Users continue to INSERT two keys: 63, 84. The segments list are:
[1, 92, 107]
[63, 84]
3. After many INSERT operations, the segments become:
[1, 92, 107]
[2, 17, 42]
[63, 84]
[110, 118, 172]
Now,MemSQL makes these segments into “Sorted Runs”, which have a basic order for keys:
When the SELECT comes, MemSQL could find the row quickly by just looking up two ordered segment-lists. Uses could also SCAN two segment-lists effectively to do OLAP tasks since all the data are stored in columnar-format.
What happens if users INSERT more rows? MemSQL will merge the old big Sorted-Runs and create a new segment for freshly-insert-data, which could keep the number of Sorted-Runs acceptable.
In practice, MemSQL column store engine uses a constant of 8, so that the biggest sorted run has at least of all the segments, the second biggest sorted run has at least of the remaining segments, and so forth. This strategy seems just like
LSM tree in
LevelDB of Google. The difference between LevelDB and MemSQL is LevelDB store every key-value pair respectively but MemSQL store a batch of rows into one segment.
If INSERT operations come when MemSQL is merging, the small merging actions will be aborted and relaunch. For big merging actions, it will barely skip any missing or updated segments, for skipping some segments will not ruin the new merged Sorted-Runs.
As we can see, MemSQL endeavours to avoid locking for in-memory data operations, which makes its performance significantly superior.
There are also some practical considerations for MemSQL.
- If only one merger is working at big Sort-Runs, it will cost too much time and the small Sorted-Runs will become tremendous for intensive INSERT operations. So MemSQL launch two mergers: Fast Merger and Slow Merger.
- MemSQL could accumulate some rows before batching them into a segment, which will decrease the fragment of data.
- MemSQL also create special Commands for users to sort all the segments, or just decrease the level of Sorted-Runs.
- Columnar-Storage format in memory makes using SIMD instruments of CPU possible. This shed some light on me: maybe one day we can run machine learning jobs on MemSQL directly 🙂