Do you see a discrepancy between your source integration and your data warehouse? This article will walk you through some common data discrepancy “gotchas” that can help you pinpoint the root of the issue.
If you still have questions, provide the information outlined below to our support team. This will help us validate your request and resolve the discrepancy more quickly.
Before you reach out to support about a data discrepancy, please check the following:
While Stitch is designed to quickly and efficiently process large amounts of data, it can take some time to replicate and load your data into your data warehouse. What looks like missing data may actually be incomplete processing, meaning Stitch hasn’t finished loading all the data into your data warehouse.
This is especially true for SaaS historical syncs, which may also be affected by API quotas. Most data discrepancies can be solved by simply waiting and giving Stitch time to process and load the data.
Are you querying for the same timeframe in your data warehouse and in your data source? Keep in mind that historical syncs for SaaS integrations have varying default start dates.
For the majority of SaaS integrations, a historical sync goes back one year from the Stitch connection date. We recommend checking out the Syncing Historical SaaS Integration Data article to see if the default starting date corresponds with the discrepancy.
Have you accounted for any timezone variation between the data source and your data warehouse? If your data source is configured to report in a certain timezone, those timestamps will be converted to UTC in Redshift.
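To illustrate the timezone point, here is a minimal Python sketch. It assumes a source that reports in `America/New_York`; that timezone, the sample timestamp, and the `to_utc` helper are all hypothetical choices for illustration. A record created late in the evening in the source falls on the next calendar day once expressed in UTC, which can make daily counts look mismatched.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(local_naive: datetime, source_tz: str) -> datetime:
    """Interpret a naive source timestamp in source_tz and convert it to UTC."""
    return local_naive.replace(tzinfo=ZoneInfo(source_tz)).astimezone(timezone.utc)

# Hypothetical record created late in the evening, New York time.
created_local = datetime(2023, 1, 15, 21, 30)
created_utc = to_utc(created_local, "America/New_York")

print("source (local):", created_local.isoformat())
print("warehouse (UTC):", created_utc.isoformat())
```

When comparing a daily total, apply the same conversion to the boundaries of the date range before querying the warehouse.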
Check the Replication Method settings for the affected table. To access them, click into the table in the Integration Details page, then click the Table Settings button.
If your database table is set to replicate incrementally, ensure that the proper Replication Key is being used. Remember:
NULL values are only replicated during the first replication for that integration.
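One way to see why NULL Replication Key values can go missing is to model incremental replication as a bookmark comparison. The sketch below is a simplified assumption, not Stitch's actual implementation: the `orders` table, the `updated_at` key, and the `incremental_fetch` helper are hypothetical, and SQLite stands in for the source database. In SQL, a comparison against NULL never evaluates to true, so rows with a NULL key are skipped once a bookmark exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders (id, updated_at) VALUES (?, ?)",
    [(1, "2023-01-01"), (2, "2023-02-01"), (3, None)],
)

def incremental_fetch(conn, last_key):
    """Return rows whose replication key is at or after the saved bookmark."""
    if last_key is None:
        # First run: no bookmark yet, so every row is read, NULL keys included.
        return conn.execute(
            "SELECT id, updated_at FROM orders ORDER BY id"
        ).fetchall()
    # Later runs: NULL >= anything is never true, so NULL-key rows are skipped.
    return conn.execute(
        "SELECT id, updated_at FROM orders WHERE updated_at >= ? ORDER BY id",
        (last_key,),
    ).fetchall()

first_run = incremental_fetch(conn, None)

# Two rows arrive after the first run, one of them with a NULL key.
conn.execute("INSERT INTO orders (id, updated_at) VALUES (4, NULL)")
conn.execute("INSERT INTO orders (id, updated_at) VALUES (5, '2023-03-01')")
later_run = incremental_fetch(conn, "2023-02-01")

print("first run ids:", [row[0] for row in first_run])
print("later run ids:", [row[0] for row in later_run])
```

In this model, row 4 never reaches the warehouse. If the missing records share a NULL Replication Key value in the source, that is a likely explanation.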
Check the Replication Frequency for the integration, accessed by clicking the Integration Settings button in the Integration Details page. If the missing records were created very recently, you may need to wait for an update of your data to complete before they appear in your data warehouse.
Check to see if the SaaS integration where you noticed the discrepancy is experiencing downtime. Here's a list of all our integrations' status pages. You can always see Stitch’s status on our status page.
Take a look at the data and identify whether there are any patterns in the discrepancy, such as records missing over a specific timeframe or field value discrepancies affecting only certain types of records. Could these correspond with any recent changes in your source integration?
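A quick way to look for such patterns is to group the missing records by date. The following is a minimal Python sketch under stated assumptions: `source_rows` and `warehouse_ids` are hypothetical samples you would export from your source and your data warehouse, and `missing_by_day` is an illustrative helper, not part of Stitch.

```python
from collections import Counter

def missing_by_day(source_rows, warehouse_ids):
    """Count source records absent from the warehouse, grouped by creation date."""
    missing_dates = [
        created for rec_id, created in source_rows if rec_id not in warehouse_ids
    ]
    return Counter(missing_dates)

# Hypothetical exports: (record id, creation date) from the source,
# and the set of record ids found in the data warehouse.
source_rows = [
    (1, "2023-03-01"),
    (2, "2023-03-02"),
    (3, "2023-03-02"),
    (4, "2023-03-03"),
]
warehouse_ids = {1, 4}

gaps = missing_by_day(source_rows, warehouse_ids)
for day, count in sorted(gaps.items()):
    print(day, count)
```

If the gaps cluster on one date, compare that date against recent changes or downtime in the source integration.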
Make sure you’re using a SQL client to directly query your data warehouse. This will ensure that the discrepancy isn’t the result of report refresh lags, third-party bugs or downtime, or any other type of data delay.
If the discrepancy can’t be explained by any of the points above, please reach out to support and provide the following information. Note that there are different sections for row count versus field value discrepancies.
[source_integration_schema].[table_name] and provide us with the results:
[data_warehouse_schema].[table_name] and provide us with the results:
updated_at value (when applicable)