You can’t use prime numbers to cheat on CRDTs
Conflict-free Replicated Data Types, or CRDTs, are a simple way to make a data structure that works well in an eventually consistent environment, such as the Riak distributed database my employer makes. I’ve had “google CRDTs” on my to-do list for months; each time I had a hard time getting in to it. Last week, somebody linked me to Kyle Kingsbury’s Ruby implementation of them, and I was very very happy to see that the README file is awesome.
Seriously: https://github.com/aphyr/meangirls has an awesome README, easily the best CRDT resource I’ve read. Thanks, Kyle and Sean Cribbs!
Counters and Primary Keys
One thing that you get with a consistent database (such as Postgres, MySQL, and Redis) is a counter that you can use to generate nice-looking IDs that are not enormous. Think the “943” in https://github.com/blog/943-rubyconf-block-party, vs. the “182302468733018112” in https://twitter.com/#!/HIMANSHU/status/182302468733018112.
You can use a G-Counter (see the meangirls README) to count things, but it doesn’t work as a primary key: if two nodes increment their entries at the same time, they’ll both have the same value.
{
'type': 'g-counter',
'e': {
'a': 1,
'b': 5,
'c': 2
}
}
If a and b increment at the same time, they’ll both get a new value of (1+5+2) + 1 = 9. Since the whole point of a primary key is to be unique, that won’t do.
Prime numbers
Now, what if each node has a different prime number that they increment by? Let’s make nodes a, b, and c use 3, 5, and 7, respectively.
{
'type': 'g-counter',
'e': {
'a': 3,
'b': 25,
'c': 14
}
}
This counter has a value of (3+25+14) = 42.
Now, if a and b increment at the same time, a will see (3+25+14) + 3 = 45, b will see (3+25+14) + 5 = 47. When c looks later and merges them, it’ll come up with 49, but that’s not a problem: we just didn’t want a and b to increment to the same value.
Here’s the problem: if a increments 5 times in the same duration that b increments 3 times without their versions getting merged, they’ll land on the same number, since 5 * 3 = 3 * 5.
Big prime numbers
You can use prime numbers larger than
3
and
5
to mitigate this a bit. I’ll be using the RSA key generator from
http://ats.oka.nu/titaniumcore/js/crypto/RSA.sample1.html
set to 16 bit keys for this.
Here are two bigger primes:
a = 185399
and
b = 125729
. You’d have to increment
a
125729 times and
b
185399 times without a merge to run into a conflict. Let’s see the friendly numbers from the previous example (
c = 200819
):
{
'type': 'g-counter',
'e': {
'a': 185399,
'b': 628645,
'c': 401638
}
}
This counter has a total of…
1215682
. Not quite a friendly
943
, but not as hostile as
FD22C314836F11E19425AB7961E300D6
either.