You can't use prime numbers to cheat on CRDTs
Message from 2022
This post is pretty old! Opinions and technical information in it are almost certainly oudated. Commands and configurations will probably not work. Consider the age of the content before putting any of it into practice.
Conflict-free Replicated Data Types, or CRDTs, are a simple way to make a data structure that works well in an eventually consistent environment, such as the Riak distributed database my employer makes. I’ve had “google CRDTs” on my to-do list for months; each time I had a hard time getting in to it. Last week, somebody linked me to Kyle Kingsbury’s Ruby implementation of them, and I was very very happy to see that the README file is awesome.
Seriously: https://github.com/aphyr/meangirls has an awesome README, easily the best CRDT resource I’ve read. Thanks, Kyle and Sean Cribbs!
Counters and Primary Keys
One thing that you get with a consistent database (such as Postgres, MySQL, and Redis) is a counter that you can use to generate nice-looking IDs that are not enormous. Think the “943” in https://github.com/blog/943-rubyconf-block-party, vs. the “182302468733018112” in https://twitter.com/#!/HIMANSHU/status/182302468733018112.
You can use a G-Counter (see the meangirls README) to count things, but it doesn’t work as a primary key: if two nodes increment their entries at the same time, they’ll both have the same value.
{
'type': 'g-counter',
'e': {
'a': 1,
'b': 5,
'c': 2
}
}
If a
and b
increment at the same time, they’ll both get a new value of (1+5+2) + 1 = 9
. Since the whole point of a primary key is to be unique, that won’t do.
Prime numbers
Now, what if each node has a different prime number that they increment by? Let’s make nodes a
, b
, and c
use 3
, 5
, and 7
, respectively.
{
'type': 'g-counter',
'e': {
'a': 3,
'b': 25,
'c': 14
}
}
This counter has a value of (3+25+14) = 42
.
Now, if a
and b
increment at the same time, a
will see (3+25+14) + 3 = 45
, b
will see (3+25+14) + 5 = 47
. When c
looks later and merges them, it’ll come up with 49
, but that’s not a problem: we just didn’t want a
and b
to increment to the same value.
Here’s the problem: if a
increments 5 times in the same duration that b
increments 3 times without their versions getting merged, they’ll land on the same number, since 5 * 3 = 3 * 5
.
Big prime numbers
You can use prime numbers larger than
3
and
5
to mitigate this a bit. I’ll be using the RSA key generator from
http://ats.oka.nu/titaniumcore/js/crypto/RSA.sample1.html
set to 16 bit keys for this.
Here are two bigger primes:
a = 185399
and
b = 125729
. You’d have to increment
a
125729 times and
b
185399 times without a merge to run into a conflict. Let’s see the friendly numbers from the previous example (
c = 200819
):
{
'type': 'g-counter',
'e': {
'a': 185399,
'b': 628645,
'c': 401638
}
}
This counter has a total of…
1215682
. Not quite a friendly
943
, but not as hostile as
FD22C314836F11E19425AB7961E300D6
either.