Please note that our BigQuery destination is currently in open beta. The information in this article is subject to change.
Google BigQuery is a fully managed, cloud-based big data analytics web service for processing very large read-only data sets. BigQuery was designed for analyzing data on the order of billions of rows, using a SQL-like syntax.
For more information, check out Google's BigQuery overview.
Currently the Stitch BigQuery destination is in open beta. We encourage you to participate and give us feedback, but please consider the following first:
We appreciate your patience and feedback as we work to perfect this destination.
The Stitch BigQuery destination is inherently different from our Amazon Redshift destination. Before testing out BigQuery, take note of the following so you know what to expect.
BigQuery's pricing model is based on usage rather than a fixed rate, meaning your bill can vary over time. Before fully committing to BigQuery as your data warehouse, we recommend familiarizing yourself with the BigQuery pricing model and how using Stitch may impact your costs.
Click here for more info on how a BigQuery-Stitch partnership may impact your warehousing costs.
Unlike connecting Stitch to Redshift, setting up BigQuery isn't simply a matter of having warehouse credentials. In addition to completing the authorization process inside Stitch, we also require a user that:
Click here for more info on setting up BigQuery and connecting it to Stitch.
BigQuery was originally designed as an append-only data store, and the initial release of our BigQuery destination follows a similar paradigm.
This means that updates to existing rows in incrementally replicated tables are appended as new rows to the end of the table, creating a record of how the rows have changed over time. When querying your data, you'll need to account for append-only replication.
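To make the effect of append-only replication concrete, here is a minimal Python sketch of the "keep only the latest version of each row" logic a query would need to apply. The sample rows and the column names `id`, `status`, and `sequence` are assumptions made up for illustration, not the actual columns Stitch writes; substitute your table's primary key and whichever column orders row versions.

```python
# Hypothetical contents of an append-only table: each update to the record
# with id=1 was appended as a new row instead of overwriting the old one.
# "sequence" is an assumed column that orders the versions of a row.
rows = [
    {"id": 1, "status": "trial", "sequence": 100},
    {"id": 2, "status": "active", "sequence": 101},
    {"id": 1, "status": "active", "sequence": 102},
    {"id": 1, "status": "cancelled", "sequence": 103},
]


def latest_versions(rows, key="id", order_by="sequence"):
    """Return one row per key: the version with the highest order_by value."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # Keep this row if it is the first seen for its key or newer than
        # the version kept so far.
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


current_state = latest_versions(rows)
```

Without a step like this, a simple count over the table would report four records here even though only two distinct records exist in the source.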
Click here for more info on how Stitch replicates data to BigQuery.
When nested data is replicated to Redshift, Stitch de-nests records, breaking them apart into subtables. This is by design, as Redshift doesn't natively support nested records.
Unlike Redshift, BigQuery excels at supporting nested records. This means that Stitch will not de-nest records that are sent to BigQuery.
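The difference between the two approaches can be sketched in Python. This is an illustration only, not Stitch's actual implementation: the sample record, the `denest` helper, and the `parent_order_id` and `position` column names are all hypothetical.

```python
# A hypothetical source record containing a nested list of line items.
record = {
    "order_id": 7,
    "customer": "acme",
    "line_items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-9", "qty": 1},
    ],
}


def denest(record, key, nested_field):
    """Split a record into a flat parent row and a list of subtable rows.

    Each subtable row carries the parent's key and its position in the
    nested list so it can be joined back to the parent row.
    """
    parent = {k: v for k, v in record.items() if k != nested_field}
    children = [
        {f"parent_{key}": record[key], "position": i, **item}
        for i, item in enumerate(record[nested_field])
    ]
    return parent, children


# De-nested shape (subtable style): two separate row sets that must be joined.
parent_row, line_item_rows = denest(record, "order_id", "line_items")

# Nested shape: the record is kept whole, with line_items left in place.
nested_row = record
```

In the de-nested shape, reading an order together with its line items requires a join on the parent key. In the nested shape, the line items travel with the order in a single row.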
Click here for more info on nested record support for BigQuery and Stitch.
Because data can come from a variety of integrations, each of which may structure or handle data differently, Stitch will encounter a range of scenarios when replicating and loading your data. Familiarizing yourself with how these scenarios are handled will help you understand what's happening and diagnose issues.
Click here for more info on what those scenarios are and how Stitch handles them.