Stitch Documentation
has moved!

Please update your bookmarks to https://www.stitchdata.com/docs

If you're not automatically redirected after 5 seconds, click here.

Postgres: Multiple Data Types & Column Splitting

Important!
Please note that our Postgres destination is currently in open beta. The information in this article is subject to change.

Stitch requires that there only be one data type per column to properly type and store your data. When a single column contains multiple data types, Stitch may not be able to load the data losslessly - or perfectly intact - into your data warehouse.

To compensate for this, when Stitch detects a column that contains multiple data types, a new column is created for each data type. The addition of new columns will also change the structure of the realized table in your data warehouse.

Let’s say we have an orders table in an integration called rep_sales. During the structure sync, Stitch will create a schema called rep_sales in your data warehouse and inside this schema, a table called orders.

The table is structured as follows, excluding the _sdc columns:

Column           | Data Type
-----------------+--------------+
id [pk]          | BIGINT
rep_name         | TEXT
order_amount     | BIGINT
order_confirmed  | BOOLEAN


During the first sync, the following rows are replicated:

id [pk] | rep_name    | order_amount | order_confirmed
--------+-------------+--------------+----------------+
1       | Marty McFly | 12           |
2       | Doc Brown   | 10           | true


During the next sync, the following rows are replicated:

id [pk] | rep_name                         | order_amount | order_confirmed
--------+----------------------------------+--------------+----------------+
3       | Marty McFly                      | 13.25        | true           
4       | Doc Brown                        | 11           | yes


These rows are decidedly different than those replicated during the first sync. Specifically:

  1. order_amount can sometimes be a decimal.
  2. order_confirmed can sometimes be a string.

To accommodate the data, Stitch will split the mixed data types into new columns - one for each data type - and append the column name with the data type.

This would modify the table structure to the following:

Column                   | Data Type
-------------------------+---------------+
id [pk]                  | BIGINT
rep_name                 | TEXT
order_amount__bigint     | BIGINT
order_amount__double     | DOUBLE
order_confirmed__boolean | BOOLEAN
order_confirmed__string  | TEXT

What happens if data can’t fit into the table?

You’ll experience some missing data on your end. To troubleshoot, we recommend taking a look at some of the more common causes for dropped (or missing) data:

  • A Primary Key for the table is missing.
  • The table name is longer than Postgres’s 63 character limit.
  • The column name is longer than Postgres’s 59 character limit.
  • Stitch’s permissions to your data warehouse are revoked.
  • You have column names with __string, __bigint, etc. These are reserved names for Stitch.

If you’re stumped or need some help pinpointing the issue, please reach out to support.

Was this article helpful?
0 out of 0 found this helpful

Comments

Questions or suggestions? If something in our documentation is unclear, let us know in the comments!