Syncing Historical SaaS Integration Data

When you connect a SaaS integration, Stitch syncs not only that integration’s recent data but its historical data as well. Unless you define otherwise, Stitch uses the integration's default starting date as the point from which to begin syncing historical data.

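While it's possible to change the starting date for the majority of SaaS integrations, there are a few things you should keep in mind before doing so. This doc covers default starting dates, considerations for changing start dates, how to change an integration's start date, and a rollup of the default starting dates for each integration.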

Default Starting Dates

Each integration has a default starting date, which tells Stitch how far back to query for historical data. The majority of integrations have a default starting date of one year before the Stitch connection date.

For example, if an integration has a default starting date of one year and was connected to Stitch on October 1, 2016, a historical sync would go back to October 1, 2015.
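The date arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name `default_start_date` is hypothetical, and the handling of a February 29 connection date is an assumption, not documented Stitch behavior.

```python
from datetime import date


def default_start_date(connected_on: date, years_back: int = 1) -> date:
    """Return the date a historical sync would go back to (hypothetical helper)."""
    try:
        return connected_on.replace(year=connected_on.year - years_back)
    except ValueError:
        # Assumption: a February 29 connection date falls back to February 28
        # when the target year is not a leap year.
        return connected_on.replace(year=connected_on.year - years_back, day=28)


# The example from this section: connected on October 1, 2016.
print(default_start_date(date(2016, 10, 1)))  # 2015-10-01
```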

A rollup of the default starting dates for all SaaS integrations can be found in the last section of this doc.

Default Starting Dates & Data Discrepancies

If you believe you’re missing data, try to narrow it down to a specific timeframe. If that timeframe falls before the integration's starting date, this may be the root cause of the discrepancy. Check out the Data Discrepancy Troubleshooting Guide for more troubleshooting tips.
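As a rough illustration of this check, the sketch below compares a timeframe of suspected missing data against a starting date. The function name `classify_gap` and its return labels are hypothetical and are not part of Stitch.

```python
from datetime import date


def classify_gap(gap_start: date, gap_end: date, start_date: date) -> str:
    """Describe where a missing-data timeframe sits relative to the starting date."""
    if gap_end < start_date:
        # The whole timeframe predates the starting date, so it was never synced.
        return "before starting date"
    if gap_start < start_date:
        # Only the earlier part of the timeframe predates the starting date.
        return "partly before starting date"
    # The timeframe is covered by the sync, so the cause lies elsewhere.
    return "within synced range"


start = date(2015, 10, 1)
print(classify_gap(date(2015, 8, 1), date(2015, 8, 31), start))  # before starting date
```

A result of "within synced range" suggests the starting date is not the cause and other troubleshooting steps are worth trying.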

Considerations for Changing Start Dates

An integration's start date can be defined when you initially connect the integration to Stitch or at any point afterward.

Before changing an integration's start date, we recommend considering the following points. Note that these points shouldn't cause worry or discourage you from setting up historical syncs - they're only intended to give you a comprehensive look at this process.

  1. This process cannot be undone. Once a historical sync is queued, there's no way to stop it.
  2. Depending on the integration, there may be limitations. Webhook-based integrations like SendGrid, for example, don't retain historical data.

    Check out the Default Starting Dates Rollup section for specifics.
  3. This process will result in higher row counts. Some integrations, like Mixpanel, can contain large (sometimes astronomical) amounts of data.
  4. This process may re-replicate recent data. For example: you set up an integration and the original sync contained data only for 2016. You then set up a historical sync for this integration with a start date of January 1, 2015. This will replicate data for all of 2015 and all of 2016, including the 2016 data that was already replicated.
  5. This process may result in stale reporting. When a historical sync is run, no recent data will be retrieved until the replication and loading of the historical data is complete. The volume of data to be synced and the design of the provider's API can both affect how long a historical sync will take.

    For example: we're aware it can take quite a bit of time to retrieve and replicate Facebook Ads data, due to the design of their API and the sheer amount of data that's available.
  6. The time a historical sync takes may be affected by an integration's API quota. Some integrations, like Salesforce and Marketo, use API quotas, which limit your API usage. While our integrations are designed not to consume all of your available quota, if you also use the integration's API elsewhere, the combined usage may exhaust your quota.

    As Stitch will be unable to continue replicating data once the quota has been consumed, this can extend the length of time the historical sync will take, thus affecting the freshness of your reports.
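The re-replication described in point 4 can be pictured as two date ranges: the newly added history and the data that gets replicated again. The sketch below models only that example. The function name `historical_sync_ranges` and the `today` value are hypothetical, and actual behavior may vary by integration.

```python
from datetime import date, timedelta


def historical_sync_ranges(new_start: date, original_start: date, today: date):
    """Split a historical sync into (new_history, re_replicated) date ranges."""
    if new_start >= original_start:
        # Assumption: a later start date adds no new history;
        # everything from the new start date onward is replicated again.
        return None, (new_start, today)
    new_history = (new_start, original_start - timedelta(days=1))
    re_replicated = (original_start, today)
    return new_history, re_replicated


# Point 4's example: the original sync covered 2016, the new start date is January 1, 2015.
new_history, re_replicated = historical_sync_ranges(
    new_start=date(2015, 1, 1),
    original_start=date(2016, 1, 1),
    today=date(2016, 10, 1),  # hypothetical "current" date
)
print(new_history)    # (datetime.date(2015, 1, 1), datetime.date(2015, 12, 31))
print(re_replicated)  # (datetime.date(2016, 1, 1), datetime.date(2016, 10, 1))
```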

Changing an Integration's Start Date

Note that this feature is not available for some integrations. Pardot, for example, doesn't use date-based replication, making this date-based approach incompatible.

For New Integrations

If you don't want to use the integration's default starting date when connecting a new integration, you can do the following:

  1. After defining the rest of the integration's settings, locate the Sync Historical Data section.
  2. Uncheck the Use Integration Default box.
  3. Define the new starting date using the drop-down.
  4. When finished, click the Save Integration button.

Note that it may take some time for Stitch to perform a structure sync for the integration and begin replicating data.

For Already-Connected Integrations

  1. From the Stitch Dashboard page, click into the integration.
  2. In the Integration Details page, click the Integration Settings button in the top-right corner.
  3. Scroll down to the Sync Historical Data section.
  4. In the Start Date section, click the Change Date link.
  5. Define the new starting date using the drop-down.
  6. Click the Reset Date button.
  7. When prompted, click OK to confirm and proceed with changing the starting date.

If successful, a confirmation message will display indicating the historical sync has been queued. The integration's historical data will begin replicating according to that integration's defined Replication Frequency.

Default Starting Dates Rollup

Check out the table below for each integration's default starting date and any limitations on historical syncs. Note that in most cases it's possible to retrieve data from before the default starting date, but the amount of data Stitch replicates may be massive.

Integration | Default Starting Date (from date of Stitch connection)
Adroll | 1 year
Autopilot | 1 year
Bing Ads | 1 year
Close.io | 1 year
Desk.com | 1 year
Facebook Ads | 1 year*
Google AdWords (NEW) | 30 days
Google Adwords (deprecated) | 15 days
Google ECommerce | 15 days
Google Analytics | 30 days
HubSpot | 1 year
Intercom | 1 year
JIRA | 1 year
MailChimp | 1 year
Mandrill | Webhook ✝
Marketo | 1 year
Mixpanel | 7 days**
NetSuite | 1 year
Pardot | 28 days
Pipedrive | 1 year
Quickbooks | 1/1/2010
Recurly | 1 year
Salesforce | 1 year
Segment | Webhook ✝
SendGrid | Webhook ✝
Shopify | 1 year
Square | 1 year
Stripe | 1 year
Trello | 1 year
Xero | 1 year
Zendesk | 1 year
Zopim | 1 year
Zuora | 1 year

* Facebook Ads: We can run a historical sync all the way back to January 1, 2013.

** Mixpanel: Going back further than 7 days is possible, but note that the potential amount of data to be replicated could be massive.

✝ Webhook integrations do not, by design, have historical data. Some integrations - like Segment - may be able to “replay” data depending on their app’s abilities and/or your account level. Contact that integration's support team if you have questions.
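For anyone who wants to work with these defaults programmatically, the rollup can be encoded as a small lookup. The sketch below covers only an excerpt of the table. The names `DEFAULTS` and `default_start`, the rule format, and the February 29 fallback are all assumptions made for illustration and are not part of any Stitch API.

```python
from datetime import date, timedelta
from typing import Optional

# Excerpt of the rollup table. Each rule is ("years", n), ("days", n),
# ("fixed", date), or None for webhook integrations with no historical data.
DEFAULTS = {
    "Facebook Ads": ("years", 1),
    "Google Analytics": ("days", 30),
    "Mixpanel": ("days", 7),
    "Quickbooks": ("fixed", date(2010, 1, 1)),
    "Salesforce": ("years", 1),
    "SendGrid": None,
}


def default_start(integration: str, connected_on: date) -> Optional[date]:
    """Return the default starting date, or None if there is no historical data."""
    rule = DEFAULTS[integration]
    if rule is None:
        return None
    kind, value = rule
    if kind == "fixed":
        return value
    if kind == "days":
        return connected_on - timedelta(days=value)
    # kind == "years"
    try:
        return connected_on.replace(year=connected_on.year - value)
    except ValueError:
        # Assumption: a February 29 connection date falls back to February 28.
        return connected_on.replace(year=connected_on.year - value, day=28)


connected = date(2016, 10, 1)
print(default_start("Mixpanel", connected))    # 2016-09-24
print(default_start("Salesforce", connected))  # 2015-10-01
print(default_start("SendGrid", connected))    # None
```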
