tl;dr – Mongo multi-key indexes don’t scale well with high-cardinality arrays. This is the same bad behavior that KairosDB is experiencing with its own indexes.
This post explores a technique for using MongoDB as a general-purpose time-series database. It was investigated as a possible temporary workaround until KairosDB adds support for high-cardinality tags. In particular, this describes using a MongoDB multi-key index for associating arbitrary key-value metadata with time-series metrics in MongoDB and the associated performance problems.
Motivation
We’ve been using KairosDB in production for close to a year now. KairosDB is a general-purpose time-series database built on Cassandra. Each “metric” consists of a name, value, and a set of associated “tags” (key-value metadata). These tags are extremely useful, as they provide structured metadata for slicing, filtering, and grouping the stats.
The main issue restricting us from adopting it more widely is its poor support for high-cardinality tags; that is, tag keys with a large number of distinct values, such as IP addresses or other unique identifiers. Unfortunately, these types of values are also a prime use case for tags in the first place. You can read all about this issue on the KairosDB user group, as it’s one of the most well-known open issues. A few months ago I gave a presentation, Building a Scalable Distributed Stats System, which describes a work-around for this issue when there’s a small number of high-cardinality tag keys.
However, our new use case requires a set of high-cardinality tag keys that is dynamic and unknown a priori. The KairosDB team is looking into fixing this issue but hasn’t yet resolved it, so I wanted to investigate whether we could use MongoDB temporarily as a backing store behind the Kairos API.
Why MongoDB? Because it’s easy to use, we know how to scale it (even if it’s painful), and atomic increments are a powerful bonus.
MongoDB Schema
The first task in evaluating MongoDB for this general-purpose use case is to propose a schema of sorts; we need something flexible enough to use the same underlying model and update operations as KairosDB while allowing efficient querying using MongoDB indexes. The initial schema looked something like:
{ "timestamp": , "name": , "value": , "tags": [ "key1=val1", ... ] } |
You might be wondering why “tags” is an array of strings rather than a true subdocument. The answer is indexing. Ideally, we could use a hashed index on a proper “tags” subdocument; however, as the documentation states, “you cannot create compound indexes that have hashed index fields.” Instead, we try to use a multi-key index on an array of values. We can combine this multi-key index on tags with the timestamp and name to create a compound index by which we can query for specific metrics. If we call our collection metrics, then we create the index like so:
db.metrics.ensureIndex({timeStamp: 1, name: 1, tags: 1})
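Since tags arrive as a key-value map, we flatten them into a sorted array of “key=val” strings so that equivalent tag sets always produce the same array. Here’s a minimal sketch for the mongo shell (the tagsToArray helper is hypothetical, just for illustration):

// Hypothetical helper: flatten a tags map into a sorted array of
// "key=val" strings, so equivalent tag sets compare (and index)
// identically regardless of the order the tags were supplied in.
function tagsToArray(tags) {
  return Object.keys(tags).sort().map(function (k) {
    return k + '=' + tags[k];
  });
}

tagsToArray({tag2: 'val2', tag1: 'val1'})  // [ "tag1=val1", "tag2=val2" ]

Sorting matters here: MongoDB compares arrays element by element, so without it, ['tag2=val2', 'tag1=val1'] and ['tag1=val1', 'tag2=val2'] would be treated as different documents by our upserts.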
Query Plan Explanation
Before we went any further with this proof-of-concept, I wanted to understand whether these indexes were likely to be performant for our query and update operations. If you don’t know about MongoDB’s explain() operator, I want you to stop what you’re doing right now and go read: cursor.explain()
Finished? Good. Hopefully you can see where we’re going with this now. We’ll execute an example query with various documents in the database and let MongoDB walk us through the query operations.
Let’s get a baseline with an empty collection.
db.metrics.find({'timeStamp':1234,'name':'metric1','tags':['tag1=val1','tag2=val2']}).explain()
Amongst the output, you should see
"cursor" : "BtreeCursor timeStamp_1_name_1_tags_1 multi", "isMultiKey" : true, "n" : 0, "nscannedObjects" : 0, "nscanned" : 0, "nscannedObjectsAllPlans" : 0, "nscannedAllPlans" : 0, "scanAndOrder" : false, "indexOnly" : false, |
This confirms that we’re using our new multi-key compound index. Although the indexOnly: false line may look scary, it simply means that the query must return fields that aren’t in the index; namely, the value field is stored only in the document, so the document itself must be consulted. This StackOverflow article helped me understand this output field better.
Let’s review the most important fields for our use case. From the documentation:
- n is the number of documents that match the query
- nscanned is the total number of index entries scanned
- nscannedObjects is the total number of documents scanned
Since there are no index entries or documents yet, all three values are 0 initially.
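Since we’ll be re-running the same explain after every update, a tiny shell helper (purely a convenience for this walkthrough, not anything MongoDB provides) keeps the output focused on those three fields:

// Convenience helper for this walkthrough: run explain() on a metrics
// query and print only the three fields we care about.
function explainMetric(query) {
  var plan = db.metrics.find(query).explain();
  printjson({n: plan.n, nscanned: plan.nscanned, nscannedObjects: plan.nscannedObjects});
}

explainMetric({timeStamp: 1234, name: 'metric1', tags: ['tag1=val1', 'tag2=val2']})
// With an empty collection: { "n" : 0, "nscanned" : 0, "nscannedObjects" : 0 }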
Okay, now let’s add the first metric.
db.metrics.update({'timeStamp':1234,'name':'metric1','tags':['tag1=val1','tag2=val2']}, {'$inc':{'value':1}}, {upsert:true})
Here we’re just atomically incrementing the value field by one, creating the document if it doesn’t already exist (that’s the upsert: true option). Let’s run the same explain request to see what the query plan looks like now.
"n" : 1, "nscannedObjects" : 1, "nscanned" : 2, |
We see that the query now scans two index entries and one document. That seems reasonable: a multi-key index creates one index entry per array element, so our single document with two tags appears in the index twice.
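You can also peek at the key ranges the cursor actually scans via the indexBounds section of the explain output (the exact shape of these bounds varies by server version, so treat this as a sketch):

// Inspect the scanned key ranges for our compound index; the tags bounds
// show the range scan over the per-element index entries.
printjson(db.metrics.find(
  {timeStamp: 1234, name: 'metric1', tags: ['tag1=val1', 'tag2=val2']}
).explain().indexBounds)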
What if we insert a record with a new name but the same tags?
db.metrics.update({'timeStamp':1234,'name':'metric2','tags':['tag1=val1','tag2=val2']}, {'$inc':{'value':1}}, {upsert:true})
The query plan for the original document would still look like
"n" : 1, "nscannedObjects" : 1, "nscanned" : 2, |
Great, so it seems like the query criteria select only the single correct document.
Now let’s insert a record with the same name but a different value for “tag2”.
db.metrics.update({'timeStamp':1234,'name':'metric1','tags':['tag1=val1','tag2=other']}, {'$inc':{'value':1}}, {upsert:true})
Let’s look at the query plan now.
"n" : 1, "nscannedObjects" : 2, "nscanned" : 3, |
Uh-oh. This doesn’t look too good. Adding one new value for the second tag increased the number of scanned index entries and documents by one.
What happens if we add a new value for “tag1” instead?
db.metrics.update({'timeStamp':1234,'name':'metric1','tags':['tag1=other','tag2=val2']}, {'$inc':{'value':1}}, {upsert:true})
Let’s look at the query plan now.
"n" : 1, "nscannedObjects" : 2, "nscanned" : 3, |
Well, that’s not so bad; the counts are the same before and after this insert. Still, in the worst case, the number of scans increases linearly with the number of tag permutations.
If you continue with this exercise, you’ll start to see the pattern. Essentially, each element in the tags array adds a new entry to the index. Since the query performs a range scan over those tags entries, the cost depends on where each new entry falls in the index: if the new value is for the last tag, it lands near or at the end of the scanned range, depending on the new and previous values of the final tag.
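To see this directly, here’s a quick shell experiment that upserts many distinct values for tag2 and then re-explains a query for a single permutation. Based on the pattern above, we’d expect nscanned to grow roughly linearly with the number of distinct tag values:

// Upsert 100 metrics that differ only in the value of tag2.
for (var i = 0; i < 100; i++) {
  db.metrics.update(
    {timeStamp: 1234, name: 'metric1', tags: ['tag1=val1', 'tag2=val' + i]},
    {$inc: {value: 1}},
    {upsert: true});
}

// Re-explain a query for a single permutation; nscanned grows with the
// number of distinct tag values, even though n stays at 1.
var plan = db.metrics.find(
  {timeStamp: 1234, name: 'metric1', tags: ['tag1=val1', 'tag2=val0']}).explain();
printjson({n: plan.n, nscanned: plan.nscanned, nscannedObjects: plan.nscannedObjects});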
What we’ve learned is that MongoDB multi-key indexes don’t scale well with high-cardinality arrays. Since that leaves us in the same position as Cassandra-backed KairosDB, it’s back to the drawing board for me.
It feels like others must have solved these stats problems before. From real-time pre-write aggregation to attaching high-cardinality metadata, we must be reinventing the wheel. Is everything that is good at these tasks proprietary?
What systems do you use for real-time stats?