Before you connect your database to Stitch for replication, you might want the details on how we’ll replicate your data. What queries will Stitch run? Will the load from Stitch be enough to negatively impact performance? What can you do to insulate your database?
In this doc, we’ll cover what the Stitch replicator does and our recommendations for ensuring high availability and performance.
When you connect a database - like MySQL, Postgres, or MongoDB - as an input, Stitch only needs read-only access to the databases, tables, collections, and columns you want to sync.
The first part of the replication process is called a structure sync. This process will detect any changes to the structure of your database.
For example: you add a new column to one of the tables you’re syncing in Stitch. The structure sync will detect the new column and after the replication job completes, display it in the Stitch app.
The queries Stitch runs to detect these changes can vary depending on the type of database integration you’ve connected.
|Database||Queries we run|
|Microsoft SQL Server (MSSQL)||Structure syncs query the
For every collection in the database - even those that aren’t set to sync - we also run the following queries:
|Postgres||We run queries on the following tables in the
The second part of the replication process is called a data sync. This is where Stitch actually pulls data out of your database for replication.
The method Stitch uses is the same for all databases, but differs depending on the Replication Method that each table uses.
For tables using Full Table Replication, Stitch runs a single query (shown below) and reads out of the resulting cursor in batches. Note that the Mongo equivalent for this query looks a little different.
select column_a, column_b <,...> from table_a
For tables using Incremental Replication, Stitch runs a single query (shown below) and reads out of the associated cursor in batches. Note that the Mongo equivalent for this query looks a little different.
select column_a, column_b <,...> from table_a where replication_key_column >=’last_bookmark_value’ order by replication_key_column
While we make every effort to ensure the queries that Stitch runs don’t impart significant load on your databases, we still have some recommendations for guaranteeing database performance: