Swarm/Lamport timestamps

Swarm/Lamport timestamps

Lamport timestamps are necessary to sort out event order and causality issues in a distributed system. The original article defines Lamport clocks as, essentially, a tuple of a number (plain timestamp) and a process id. Each process stamps its new event with a number, assuming it is greater than any previously known timestamp number. Concurrent events created by different processes may have equal timestamp numbers. Such tie is resolved in an arbitrary but consistent way, namely by comparing (unique) process ids.

Lamport: total order definition

Timing of an event may turn equally useful from the end-user perspective (remember that “1m”, “2h” grey labels on Twitter, for example). But, using the same timestamp both ways (for causal and wall clock time) is actually quite tricky. On one hand, computer clocks tend to be less precise than necessary, especially considering that events propagate by the network with the speed of light. That is especially true for end-user computer clock which may be off by an hour. On the other hand, logical Lamport time is generally not supposed to reflect wall clock time at all; “logical timestamps” means “counters” in most of the cases.

The rich man approach to the problem is to install atomic clocks and satellite receivers to datacenters so machine time is very precise. Then, computer clocks become somewhat useful for handling causality/order issues (see Google Spanner).

Swarm is an operation-based CRDT engine where every event is marked with a Lamport timestamp. In our past projects we used Lamport timestamps literally defined as a pair of integers. That approach was not good enough for the general case, so finally we settled on the following scheme:

Swarm specifiers (compound event identifiers)

The Lamport timestamp is !8V7N809+Walt~ssn. Both parts of the timestamp are different from the “literal” scheme. First, the “number” component is an actual timestamp in base64. 8V7N8 is 142374344 seconds since 1 Jan 2010, while 09 is a sequence number used to create multiple events a second. Second, the “process id” is a combination of a user id (like Walt) and a session id (like ssn). It is an extremely useful feature to see the author of an event straight from the event id. Session id is needed to denote different sessions of the same user (like Chrome on a home PC and Safari on a laptop). While sessions share same data and same credentials, in terms of the Lamport model they are concurrent processes generating independent events. These compound timestamps can be compared lexicographically, have all the necessary logical timestamp features and (tadaam) provide basic information on the event, such as wall clock time and the author id.

That scheme assumes some level of clock synchronization between replicas. Practically, web clients with bad clocks must coordinate with their server to stay reasonably correct.

Swarm implements two other timestamp schemes: purely logical LamportClocks and MinutePreciseClocks if you do not care about seconds. It is up to the developer to pick the most convenient one.

I hope, that explains that slightly unfamiliar look of Lamport timestamps in Swarm.