Stitch Documentation
has moved!

Please update your bookmarks to https://www.stitchdata.com/docs

If you're not automatically redirected after 5 seconds, click here.

Postgres: Expected Data Loading Behavior

Important!
Please note that our Postgres destination is currently in open beta. The information in this article is subject to change.

Because data can come from a variety of integrations and all those integrations may structure or handle data differently, Stitch will likely encounter numerous scenarios when replicating and loading your data. In this doc, you'll find a number of these scenarios broken out by type:

New Table Scenarios

Summary Scenario Expected Behavior
Multiple Primary Keys First batch of data arrives with multiple Primary Keys Table is created with Primary Key constraint (table_name_pkey); Primary Key columns have NOT-NULL constraints
No Primary Keys First batch of data arrives without Primary Keys Table is created without Primary Key and NOT-NULL constraints
Some empty columns First batch of data arrives with some columns that are completely empty Only columns that are populated in at least one record are created
ALL empty columns First batch of data arrives with columns that are all null Table is created and contains only the Stitch replication (_sdc) columns
Table name is too long Data arrives with a table name that exceeds the maximum length for Postgres. Table name limit is 63 characters. Postgres will reject all data for the table
Column name is too long Data arrives with a column name that exceeds the maximum length for Postgres. Column name limit is 59 characters. Columns with names longer than 59 characters will be rejected; columns that have names less than 59 characters will persist to Postgres
Too many columns A table arrives with more columns than the data warehouse allows Postgres will reject all data for the table
Duplicate fields Data arrives with two column names that canonicalize to the same name Postgres rejects the entire record containing the two columns. Note that Postgres is case-sensitive.
Negative/positive infinity, NaN Special cases of FLOAT/DOUBLE values Supported
Mixed case column name Data arrives with columns named LiKeThIs Case is maintained
Column name with spaces Data arrives with column names containing spaces Spaces are maintained
Unsupported special characters in column name Data arrives with column names containing special characters that are unsupported by Postgres Special characters are removed

 

Data Loading Scenarios

Summary Scenario Expected Behavior
Data for existing table version Data arrives that updates existing records in the data warehouse (matched on Primary Key) Also known as an upsert, the original row is deleted and a new row is created with the updated data
Data with a new table version A full set of data for a table (set to Full Table Replication) arrives Table is truncated and new data is loaded in a single atomic action
Timestamps out of range Data arrives with a date that is out of range for the data warehouse to handle. Postgres’s range is 4713 BC to 294276 AD. Postgres will reject records that contains that fall outside the supported time range
VARCHARs max width Data arrives within a text column that exceeds the Postgres size maximum. The VARCHAR limit is 1GB/row. Stitch will stop processing data for that table
Decimal range Data arrives within a decimal column that exceeds the Postgres size limit Postgres will reject the row if there are:
  • > 131,072 digits BEFORE the decimal point, or
  • > 16,383 digits AFTER the decimal point
Integer range Data arrives within an integer column that exceeds the Postgres size limit Postgres will reject records that exceed the size limit

 

Data Typing Scenarios

Summary Scenario Expected Behavior
Split data types (same batch) Initial batch of data contains mixed data types within a single column. For example: a column contains string and INTEGER data. Two columns will be created; column names are appended with the data type.

For example: age_fl, age_bi
Split data types (different batch) Initial batch of data contains single data type columns, but a subsequent batch contains mixed data types in a single column. For example: a column contains string and INTEGER data. Column will be “split” and the column name will be appended with the data type.

For example: age_fl, age_bi
VARCHARs Data arrives in a VARCHAR or text format of varying lengths Postgres stores all VARCHAR/TEXT data as TEXT
Dates with timezones Timestamp data arrives with no timestamp information Data type is changed to TIMESTAMP WITH TIME ZONE and adjusted to show as UTC
Decimals Data arrives in decimal format with an arbitrary amount of precision Postgres stores all decimals without precision

 

Schema Changing Scenarios

Summary Scenario Expected Behavior
New column A new column is added to an existing table New column is appended to the end of the table
Column removed A column is no longer seen within rows of data coming for an existing table Default NULL values are placed into the column
Column added to nested table Data arrives with a nested table that contains a field that did not exist in that nested table before New field is added to the nested table

 

Nested Data Handling Scenarios

Summary Scenario Expected Behavior
Nested JSON objects Data arrives that is nested with many top-level properties Keys in the object are de-nested and become columns in the table where the column name follows this naming convention:
top_level_key__secondary__third__etc
Nested JSON top-level arrays Data arrives that is nested and amongst the nested data are properties that contain arrays A subtable is created. Subtables follow this naming convention:
top_level_table__array_name
Nested JSON sub-level arrays Data arrives that is nested and amongst the nested data are properties that contain arrays, which themselves contain properties that are arrays See above
Nested JSON new columns in arrays The sub-level arrays in nested JSON data have a varying set of columns See above

 

Direct Data Warehouse Changes

Summary Scenario Expected Behavior
You add columns You add additional columns to a Stitch-generated table in Postgres Fine, as long as nullable
You apply indexes You apply indexes to columns created by Stitch No issue; Stitch will not remove them
You remove Stitch (_sdc) columns from a table You remove one or more of the _sdc columns created by Stitch Inserting and updating data will break as they require _sdc_table_version and _sdc_sequence columns.
You remove a data column You remove one or more of the data columns created by Stitch from the source Removing a column with a data type suffix (ex: __boolean) will break the table. Otherwise, the column will be recreated.
You revoke permissions to schemas and/or tables You remove Stitch’s access to create schemas/tables or to write to Stitch-created tables Data replication will stop until sufficient permissions are re-instated
You revoke data warehouse permissions You remove access to the data warehouse as a whole Data will fail to load and you will be notified
Was this article helpful?
0 out of 0 found this helpful

Comments

Questions or suggestions? If something in our documentation is unclear, let us know in the comments!