Software Development

Offline data sync design patterns

Featured image by krzysztof-m from Pixabay

In order to provide seamless experience to client-server application’s users, sometimes we need to allow them to access some of the data on the client side, even when the connection to the server is unavailable. One of the most common use case is allowing users to read/update some data in an online mobile application, even when there is no internet connection (offline). The catch is the system must implement a certain mechanism to maintain synchronization between the local and remote database.

Depending on the requirements, there are some design patterns that can be implemented to sync data between local and remote database. These are 3 of them which are inspired by outsystems‘s implementations:

Read-Only Data

This is the simplest pattern that can only be used when user is only allowed to read the local database.

Requirements:

  • Remote database is the single source of truth
  • Local database is read-only

Database design: no specific feature

Algorithms:

function sync() 
{
  delete all data from local database;
  copy all data from remote database to local database;
}

These design pattern can be optimized by altering the remote database design a bit to track modifications. During the sync process, only the modified data will be updated on local database to improve efficiency.

Database design:

Database design for Read-only with Optimization

Algorithm:

function sync()
{
  last_sync_at = last_sync_at in remote_sync table;
  if last_sync_at is null
  {
    delete all data from local database;
    copy all data from remote database to local database;
  } 
  else
  {
    new_remote_data = select all from remote_data table where created_at >= last_sync_at and deleted_at is not null;
    insert new_remote_data to local_data table;
    
    updated_remote_data = select all from remote_data table where created_at >= last_sync_at and deleted_at is not null; 
    update local_data with updated_remote_data;
    
    deleted_remote_data = select all from remote_data where deleted_at >= last_sync_at;
    delete deleted_remote_data from local_data table;
  }
  insert or update last_sync_at with current timestamp;
}

Read/Write Data Last Wins

This pattern allows a user to modify their local database while keeping it in-sync with a remote database.

Requirements:

  • Both the remote and local database can be source of truth
  • The local database can only be owned and written by a single user (must not shared by multiple users)
  • Data that is written last (by timestamp) is considered as the valid one (last wins)

Database design:

Database design for R/W Last Wins

Algorithm:

function sync()
{
  new_local_data = select all from local_data where is_from_server = false and is_active = true;
  insert new_local_data to remote_data;

  updated_local_data = select all from local_data where is_from_server = true and is_modified = true and is_active = true;
  update remote_data with updated_local_data;

  deleted_local_data = select all from local_data where is_from_server = true and is_active = false;
  delete deleted_local_data from remote_data;

  delete all local_data;
  remote_data = select all from remote_data;
  insert all remote_data to local_data by overriding attributes: is_from_server = true, is_modified = false, is_active = true;
}

Read/Write Data with Conflict Detection

This pattern is recommended for advanced scenarios where multiple users can change the same local data. The idea is to update remote data using local data and replace all local data with updated remote data. In case of any remote data is more recent than local data, the data is stored in separate table waiting for later resolution (e.g. user chooses which version to keep).

Requirements:

  • Both the remote and local database can be source of truth
  • The local database can be owned and written by multiple users

Database design:

Database design for R/W with conflict detection

Algorithm:

function sync()
{
  if local_conflict_data is empty
  {
    current_time = get current timestamp;
    last_sync_at = select last_sync_at from local_sync;

    new_local_data = select all from local_data where is_from_server = false and is_active = true;
    insert new_local_data to remote_data by overriding attributes: modified_at = current_time, is_active = true;

    conflicting_data = create empty list;
    updated_local_data = select all from local_data where is_from_server = true and is_modified = true and is_active = true;
    iterate updated_local_data as local_data
    {
      data = find local_data in remote_data table;
      if data is not exist in remote_data
      {
        local_data.is_active = false;
        append local_data to conflicting_data;
      }
      else if remote_data.modified_at <= last_sync_at
      {
        update remote_data with local_data;
      } 
      else if local_data is different than remote_data
      {
        append remote_data to conflicting_data;
      }
    }

    deleted_local_data = select all from local_data where is_active = false;
    iterate deleted_local_data as local_data
    {
      remote_data = find local_data in remote_data table;
      if remote_data is exist in remote_data
      {
        if remote_data.modified_at <= last_sync_at
        {
          delete local_data from remote_data table;
        } 
        else
        {
          append remote_data to conflicting_data;
        }
      }
    }

    insert conflicting_data to local_conflict_data table;
    update local_data with non-conflicting remote_data;
    insert or update last_sync_at in local database;
  }
}

As you can see from algorithm above, the conflict resolution is not done during the sync process and must be done before it. So we need to separately implement a mechanism, either automatic or manual, to resolve detected conflicts. One of the most common manual mechanism is presenting the conflicting records side-by-side, then asking the user to decide which version to keep.

0 0 votes
Article Rating
yohanes.gultom@gmail.com

Share
Published by
yohanes.gultom@gmail.com

Recent Posts

Get Unverified SSL Certificate Expiry Date with Python

Getting verified SSL information with Python (3.x) is very easy. Code examples for it are…

3 years ago

Spring Data Couchbase 4 Multibuckets in Spring Boot 2

By default, Spring Data Couchbase implements single-bucket configuration. In this default implementation, all POJO (Plain…

3 years ago

Firebase Auth Emulator with Python

Last year, Google released Firebase Auth Emulator as a new component in Firebase Emulator. In…

4 years ago

Google OIDC token generation/validation

One of the authentication protocol that is supported by most of Google Cloud services is…

4 years ago

Fast geolocation query with PostGIS

If you need to to add a spatial information querying in your application, PostGIS is…

4 years ago

Auto speech-to-text (Indonesian) with AWS Transcribe and Python

Amazon Web Service Transcribe provides API to automatically convert an audio speech file (mp3/wav) into…

5 years ago