Keeping partly-offline sqlite db in sync with postgresql Keeping partly-offline sqlite db in sync with postgresql sqlite sqlite

Keeping partly-offline sqlite db in sync with postgresql


It doesn't usually fall within the tasks of an ORM to sync data between databases, so you will likely have to implement it yourself. I don't know of any solution that will handle syncing for you given your choice of databases.

There are a couple important design choices to consider:

  • how do you figure out what data changed ( i.e. inserted, updated or deleted )
  • what is the most efficient way to package the change-log
  • will you have to deal with conflicts ? and how will you do that.

The most efficient way to figure out what changed is to have the database tell you that directly. Bottled water can offer some inspiration in this regard. The idea is to tap into the event log postgres would use for replication. You will need something like Kafka to keep track of what each of your clients already knows. This will allow you to optimize your server for writes, as you won't have clients querying trying to figure out what changed since they were last online.The same can be achieved on the sqlight end with event callbacks, you'll just have to trade some storage space on the client to retain the changes to be sent to the server. If that sounds like too much infrastructure for your needs, it's something that you can easily implement with SQL and pooling as well, but I would still think of it as an event log, and consider how it's implemented a detail - possibly allowing for a more efficient implementation lather on.

The best way to structure and package your change log will depend on your applications requirements, available band-with, etc. You could use standard formats such as json, compress and encrypt if needed.

It will be much simpler to design your application as such to avoid conflicts, and possibly flow data in a single direction, or partition your data so that it always flows in a single direction for a specific partition.

One final taught is that with such an architecture you would be getting incremental updates, some of which might be missed for unplanned reasons ( system failure, bugs, dropped messages, etc ). You could have some built in heuristic to check that your data matches, like at least checking the number of records on each side, with some way to recover such a fault, at a minimal a way to manually re-fetch the data from the authoritative source, i.e. if the server is authoritative, the client should be able to discard it's data and re-fetch it. You might need such a mechanism anyway for cases wen the client is reinstalled, etc.