Before You Get Started:
Take note of the following:
For tables using Incremental Replication, Stitch currently loads data into Google BigQuery in an append-only fashion. This means that as time goes on, tables will wind up containing many different versions of the same row.
Data stored this way can provide insights and historical details about how those rows have changed over time - creating a timeline of the status changes of an order record, for example - but in some cases, you might just want the latest version of the table.
In this doc, we'll cover:
In each Stitch-generated integration table, you'll see a few columns prepended with
_sdc. The column we'll focus on here is the
_sdc_sequence column. This column is a Unix epoch (down to the milisecond) attached to the record during replication and can help determine the order of all the versions of a row.
Stitch uses these sequence values in a few places to correctly order rows for loading, but it can be also used to grab the latest version of a record in an append-only table.
Let's take a look at an example. Assume we have an
orders table that contains:
_sdccolumns added by Stitch, and
If you wanted to create a snapshot of the latest version of this table, you could run a query like this:
SELECT * FROM [stitch-analytics-bigquery-123:ecommerce.orders] o INNER JOIN ( SELECT MAX(_sdc_sequence) AS seq, id FROM [stitch-analytics-bigquery-123:ecommerce.orders] GROUP BY id ) oo ON o.id = oo.id AND o._sdc_sequence = oo.seq
This approach uses a subquery to get a single list of every row's Primary Key and maximum sequence number. It then joins the original table to both the Primary Key and maximum sequence, which makes all other column values available for querying.
To make this easier, you can turn queries like the one in this doc into a view. We recommend this approach because a view will encapsulate all the logic and simplify the process of querying against the latest version of your data.
In the Related section below, you'll find links to how to create views in each of the data warehouses supported by Stitch.