Synchronisation protocol

Patching the folders

Headers

It should be clear by now that we only synchronize metadata. These are mere UTF-8 encoded streams of fields, each consisting of a name, a colon and a value, and delimited by newlines. As they are similar to Internet message headers, we call such messages headers. A header is ended by a blank line. For instance, here is a header for an incoming email:

sc-type: mail
sc-from: someone@gmail.com
sc-start: 2008 12 30 17 54 07
sc-extid: <261201305n1fa888458b76082@mail.gmail.com>
sc-descr: Re: What about this enlargement thing ?
sc-resource: 51/08/B4BE6483472BA4688AB0E826ADAB40777B44;
    name="noname.txt"; type="text/plain; charset=ISO-8859-1"
sc-resource: 35/D4/B7AF479D2D8917890360CDEAB21BF083E4D6;
    name="noname.html"; type="text/html; charset=ISO-8859-1"
sc-resource: D4/94/068419D2A7FD8142F498A1D22F8F1D6A8376;
    type=text/plain

Notice that the header is typed by a special field, sc-type. Types are used to identify the creator of the message as well as the best plugin to render it. Nevertheless, it is encouraged to use duck typing to handle messages. For instance, it is best if a calendar application is able to render any message that has a date and a description, rather than rely too strictly on a specific header type.

Also, the header has a start date (sc-start), an external id (sc-extid) which is used to reply to the email, a description (sc-descr) which, for an email, contains the email's subject, an originator (sc-from) and several additional resources (sc-resource).

These resources are references to external content files, stored independently. For emails, the SMTP gateway builds a file for each MIME part of every incoming email.
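The header format described above can be read with very little code. The following is a minimal sketch, not a reference implementation: the function name parse_header is hypothetical, and it assumes (as the indented lines of the example suggest, but the text does not state) that a line starting with whitespace continues the previous field. Fields are kept as a list of pairs rather than a dictionary because a name such as sc-resource may appear several times.

```python
def parse_header(text):
    """Parse one header into a list of (name, value) pairs.

    Assumption: a line starting with whitespace continues the previous
    field and is joined to it with a single space.
    """
    fields = []
    for line in text.splitlines():
        if line == "":
            break  # a blank line ends the header
        if line[0] in " \t" and fields:
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            # Split on the first colon only: values may contain colons.
            name, _, value = line.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields


SAMPLE = (
    "sc-type: mail\n"
    "sc-from: someone@gmail.com\n"
    "sc-resource: D4/94/068419D2A7FD8142F498A1D22F8F1D6A8376;\n"
    "    type=text/plain\n"
    "\n"
)

fields = parse_header(SAMPLE)
```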

Folders

Each folder is synchronized independently, so that a client can subscribe to only a subset of them. We use a per-folder version number, which is changed to a new unique value (not necessarily incremented) each time the folder's content changes.

The only two possible changes are adding a header and removing a header. Such changes are called patches. A patch is thus merely an action (encoded + or -) followed by a header (with its terminating blank line).

Folders are organized in a hierarchy similar to a file system directory hierarchy, with the only difference that the same folder can appear in several other folders. We thus speak of mount points rather than mere subfolders.

This whole tree of folders is itself managed with patches: to mount a folder onto another one, simply patch the parent folder with a header having the special type dir, with the additional fields sc-name (the name of the subfolder in its parent) and sc-dirId (the unique identifier of this folder). While directory IDs are unique, the same directory may thus appear under various names at several locations (not unlike UNIX symbolic links to the same directory).

There is thus no need for a dedicated mechanism to synchronize the folder hierarchy.
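To illustrate how mount points turn a path into a dirId, here is a small sketch. The in-memory layout (a dictionary mapping each dirId to the list of headers it contains), the function name resolve, and the identifiers "root", "42" and "7" are all assumptions made for the example; only the dir type and the sc-name and sc-dirId fields come from the description above.

```python
def resolve(folders, path, root_id="root"):
    """Walk `path` from the root, following dir headers as mount points.

    `folders` maps a dirId to the list of headers (dicts) it contains.
    This layout and the "root" identifier are illustrative assumptions.
    """
    dir_id = root_id
    for name in [part for part in path.split("/") if part]:
        for header in folders.get(dir_id, []):
            if header.get("sc-type") == "dir" and header.get("sc-name") == name:
                dir_id = header["sc-dirId"]
                break
        else:
            raise KeyError(f"no folder named {name!r} mounted in {dir_id}")
    return dir_id


# The folder with dirId "42" is mounted twice, under two different names.
folders = {
    "root": [
        {"sc-type": "dir", "sc-name": "inbox", "sc-dirId": "42"},
        {"sc-type": "dir", "sc-name": "shared", "sc-dirId": "7"},
    ],
    "7": [{"sc-type": "dir", "sc-name": "mail", "sc-dirId": "42"}],
    "42": [],
}
```

Both "/inbox" and "/shared/mail" resolve to the same dirId here, which is the property that distinguishes mount points from plain subfolders.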

The fact that a folder changes its version with each new patch implies that any patch can be identified by the version of the folder that follows it. Let's call this the header version, although it is really the folder's version after this header was appended. This is used in several places to locate a given header; for instance, to remove a header from a folder you write a new patch removing (action is -) the header whose version is given by a field named sc-target. This is shorter than copying the whole header (and also more efficient, provided headers are indexed by version).
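The following sketch shows one way a client could keep a folder's content, indexing headers by header version so that a removal patch can find its target directly. The class name Folder and its methods are hypothetical, and versions are assumed here to be integers, which the text does not guarantee.

```python
class Folder:
    """In-memory folder whose headers are indexed by header version."""

    def __init__(self, version=0):
        self.version = version
        self.headers = {}  # header version -> header (dict of fields)

    def apply(self, action, header, new_version):
        """Apply one patch and move the folder to `new_version`."""
        if action == "+":
            self.headers[new_version] = header
        elif action == "-":
            # The removal header names its target by version in sc-target.
            # Assumption: versions are integers.
            target = int(header["sc-target"])
            self.headers.pop(target, None)
        else:
            raise ValueError(f"unknown action {action!r}")
        self.version = new_version


folder = Folder()
folder.apply("+", {"sc-type": "mail", "sc-descr": "first"}, 1)
folder.apply("+", {"sc-type": "mail", "sc-descr": "second"}, 2)
folder.apply("-", {"sc-target": "1"}, 3)  # removes the header added at version 1
```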

To receive new patches on a folder, clients have to subscribe to it. They will then begin to receive all new patches on this folder, along with their corresponding versions, so that they can track the folder's content.

Protocol

The protocol used for synchronizing folders between the server and all the clients is fully human readable and consists of very few commands, which are described hereafter. Commands may be prefixed by a sequence number (seqnum) used to associate answers with queries, so that clients do not have to wait for the server's answer before submitting a new command. If no seqnum is present, the peer will not send a response (but will process the command nonetheless).

The general format for commands is:

[seqnum] keyword [param1] [param2] [...] {<CRLF>|<LF>}

Some commands may be followed by a header. Notice that we accept LF as well as CRLF as line endings.

All answers follow this pattern:

-seqnum keyword status [(comments)] {<CRLF>|<LF>}

Notice that, to distinguish commands from answers, the sequence number of an answer is negative: it is equal to minus the sequence number of the query, which is thus required to be positive.

The status is a numerical value loosely modelled on the usual status codes (200 for OK, 4XX and 5XX for various errors, etc.).
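The command and answer formats above are simple enough to handle with plain string operations. This sketch uses hypothetical helper names (format_command, parse_answer) and returns the answer comments verbatim, since the text does not say whether the parentheses around them are literal.

```python
def format_command(keyword, *params, seqnum=None):
    """Build a command line; without a seqnum no answer will be sent back."""
    parts = [] if seqnum is None else [str(seqnum)]
    parts.append(keyword)
    parts.extend(str(param) for param in params)
    return " ".join(parts) + "\n"


def parse_answer(line):
    """Split '-seqnum keyword status [comments]' into its parts.

    Returns the positive seqnum of the matching query, the keyword,
    the numeric status, and the remaining comments left untouched.
    """
    seq, keyword, rest = line.rstrip("\r\n").split(" ", 2)
    if not seq.startswith("-"):
        raise ValueError("an answer must carry a negative sequence number")
    status, _, comments = rest.partition(" ")
    return -int(seq), keyword, int(status), comments


subscribe = format_command("sub", "42", 17, seqnum=3)
answer = parse_answer("-3 sub 200\r\n")
```

Matching answers to pending queries then amounts to a dictionary lookup on the returned seqnum.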

Authentication

Most commands require some sort of authentication. A user sets their identity through the auth command:

[seqnum] auth username

Subscription

The format is:

[seqnum] sub dirId last_version

Here dirId is the identifier of the folder we want to subscribe to, and last_version is the last known version of this folder. If successful, the client will then start receiving patches for this folder (see the Patches section below).

Unsubscription is also possible:

[seqnum] unsub dirId

Patch submission

To add a new header, use this command:

[seqnum] put folder

followed by your header. Here folder may be a dirId or a path to a folder mounted anywhere, relative to the folder tree root (whose name is "/").

To remove a header:

[seqnum] rem folder

followed by a header consisting of an sc-target field identifying the header to be deleted.

Once a header is deleted, it is no longer visible to clients that subscribe to this folder for the first time; clients that have already synchronized it receive the deletion patch and delete the header on their side.

Notice that the server may add some fields to a submitted header. For instance it will add a dirId to a patch that creates a new subfolder.

If these commands succeed, the server's answer contains the new version number (i.e. this header's version) as the answer comments. In any case, a client subscribed to this folder also receives the patch it just submitted (augmented by any fields the server may have added).
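As an illustration of the two submission commands, this sketch builds the complete text a client would send: the command line, then the header, then the terminating blank line. The function names are hypothetical, LF is used as the line ending (CRLF would be equally acceptable), and the folder path and version number are made-up values.

```python
def encode_header(fields):
    """Serialize (name, value) pairs, ending with the blank line."""
    return "".join(f"{name}: {value}\n" for name, value in fields) + "\n"


def put_message(seqnum, folder, fields):
    """Command adding a new header to `folder` (a dirId or a path)."""
    return f"{seqnum} put {folder}\n" + encode_header(fields)


def rem_message(seqnum, folder, target_version):
    """Command removing the header whose version is `target_version`."""
    return f"{seqnum} rem {folder}\n" + encode_header([("sc-target", target_version)])


add = put_message(1, "/inbox", [("sc-type", "mail"), ("sc-descr", "hello")])
remove = rem_message(2, "/inbox", 18)
```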

Patches

All the commands above are queries from a client to the server. The server may also use the connection to send patches to the client. To receive patches for a folder, the client must first subscribe to it, informing the server of its current version number for this folder (see the sub command above).

A patch looks as follows:

PATCH dirId old_version new_version {+|-}

followed by the header to be added to or removed from the specified folder. The patch is meant to be applied to the folder at version old_version and brings it up to version new_version, which is therefore also what we called the header version.

It is not required that new_version = old_version+1 (although this is often the case).

A patch may look similar to a command to add or remove a message, except that no answer is expected from the client (in other words, the client cannot refuse the patch). The protocol is asymmetric for another reason: we do not want the server to remember anything about the clients between connections (such as the last version synchronized).

If for some reason a client does not receive a patch it can merely subscribe to the same folder again. Also, notice that having a per folder version number (instead of a global one) allows a client to subscribe and unsubscribe several times from a folder without the need to retransmit the whole folder's content. Also, as we use a dirId instead of the folder path, the folder may be moved around without a need for the clients to download its content anew.
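A client can use old_version to detect a missed patch before applying a new one. The sketch below assumes integer versions and whitespace-separated fields on the PATCH line; the function names and the known_versions dictionary (dirId to last applied version) are hypothetical.

```python
def parse_patch_line(line):
    """Split 'PATCH dirId old_version new_version {+|-}' into its parts."""
    keyword, dir_id, old, new, action = line.split()
    if keyword != "PATCH" or action not in ("+", "-"):
        raise ValueError(f"not a patch line: {line!r}")
    return dir_id, int(old), int(new), action


def accept_patch(known_versions, line):
    """Return True and record the new version if the patch applies cleanly."""
    dir_id, old, new, _action = parse_patch_line(line)
    if known_versions.get(dir_id) != old:
        return False  # out of sync: the client may subscribe again
    known_versions[dir_id] = new
    return True


versions = {"42": 17}
first = accept_patch(versions, "PATCH 42 17 18 +\n")
stale = accept_patch(versions, "PATCH 42 17 19 -\n")  # folder is already at 18
```

When accept_patch returns False, the client has missed a patch and can subscribe again with the version it still holds.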

Quit

To disconnect cleanly, use the quit command:

[seqnum] quit

Transport

The transport protocol used for connections to the central server may be stream or datagram oriented, with the proviso that a whole command must fit into a single datagram.

Use cases

Usual case

Full synchronisation