Stitch Documentation
has moved!

Please update your bookmarks to https://www.stitchdata.com/docs

If you're not automatically redirected after 5 seconds, click here.

Expected SendGrid Data

Heads Up!

Now that your SendGrid data is connected to Stitch, what's next? In this article, we'll cover:

SendGrid Schema

Stitch will create two tables in your data warehouse for each SendGrid integration:

  • events - This is a top-level table and contains details about an event.
  • events__category - This is a subtable of events and contains the event category. For example: Processed, Dropped, Delivered.

The schema of these tables will contain two “types” of columns: columns used by Stitch (prepended with _sdc) and the columns sent by the provider’s webhook API.

Aside from the Stitch columns, the schema of these tables will depend entirely on SendGrid's webhook API. With the exception of the _sdc fields (and a Primary Key if one isn’t provided), Stitch does not augment Incoming Webhooks data nor does it have any control over the fields sent by the webhook provider.

events
Primary Key:
event:email:timestamp

The events table contains details about the event, such as the event category (event) - for example, Processed - email category (category), email address, IP address, timestamp, status, reason, attempt, cert error, SMTP ID, SendGrid event and message IDs, and TLS.

Note that a few columns will only be populated for specific event types:

  • type will be populated only for bounce events. This indicates the type of bounce: bounce, blocked, or expired.
  • attempt will be populated only for deferred events. This indicates the # of delivery attempts for the email.
  • reason will be populated only for bounce and dropped events. This contains the reason for the bounce or dropped email (ex: dropped address).

Additionally, if you utilize SendGrid’s newsletter feature, you’ll also see these columns in the events table:

  • newsletter__newsletter_user_list_id
  • newsletter__newsletter_id
  • newsletter__newsletter_send_id

If a column contains only NULL values, Stitch won’t replicate these columns to your data warehouse. For example: SendGrid gives you the option to associate emails with a newsletter list. If you don’t utilize this feature, then the columns specific to newsletter data won’t show up in the events table.

events__category
Primary Key:
The composite key - made up of the event, email, timestamp, and row ID - can be used to tie this table back to its parent, events: _sdc_source_key_event: _sdc_source_key_email: _sdc_source_key_timestamp: _sdc_level_0_id

The events__category table contains event category data. If you created a category called “Service Notification,” for example, that would show up in this table.

Replicating SendGrid Data

After you’ve successfully connected your SendGrid integration, Stitch will continuously replicate your webhook data into your data warehouse. Currently, this version of Stitch’s SendGrid integration uses Append-Only Replication.

Append-Only Replication is a type of Incremental Replication where newly replicated data is appended to the end of a table. Existing rows are not updated - any updates will be added to the table as new rows. Data stored this way can provide insights and historical details about how those rows have changed over time.

Querying SendGrid Webhook Data

If you simply want the latest version of the object - or objects, if you elected to track more than one during the setup - in the integration’s table (events), you’ll have to adjust your querying strategy to account for the append-only method of replication. This is a little different than querying records that are updated using updated_at Incremental Replication.

To do this, you can use the _sdc_sequence column and the table’s Primary Key. The _sdc_sequence is a Unix epoch (down to the millisecond) attached to the record during replication and can help determine the order of all the versions of a row.

If you wanted to create a snapshot of the latest version of this table, you could run a query like this:

SELECT * FROM [stitch-redshift:stitch-sendgrid.events] o
INNER JOIN (
    SELECT
        MAX(_sdc_sequence) AS seq,
        [primary_key]
    FROM [stitch-redshift:stitch-sendgrid.events]
    GROUP BY [primary_key]) oo
ON o.[primary_key] = oo.[primary_key]
AND o._sdc_sequence = oo.seq

This approach uses a subquery to get a single list of every row’s Primary Key and maximum sequence number. It then joins the original table to both the Primary Key and maximum sequence, which makes all other column values available for querying.

Related

Was this article helpful?
0 out of 0 found this helpful

Comments

Questions or suggestions? If something in our documentation is unclear, let us know in the comments!