Connecting MongoDB on Servers Using Auth Mode

Important!
Stitch supports Mongo versions 2.4+.
If you’re on an older version, our integration may not work correctly. Please reach out to support if you run into issues.

In this article, we'll walk you through how to connect MongoDB databases on servers using authentication to Stitch via an SSH tunnel.

If your server is using the default option (which is No Auth), use the instructions located here.

Auth mode requires every user who connects to Mongo to have a username and password. These credentials must be validated before the user will be granted access to the database.

Because this setup is technical, we recommend looping in a developer to help you out. To connect MongoDB, you'll need to do the following:

  1. Index the fields you want to use as Replication Keys
  2. Retrieve the Stitch public key
  3. Whitelist the Stitch IP addresses
  4. Create a Linux user for Stitch
  5. Create a Mongo user for Stitch
  6. Enter the connection info into Stitch
  7. Define the Replication Frequency
  8. Select databases and tables to sync

Before You Start: Some Background

Stitch uses a standalone server connection to connect to your MongoDB instance. What this means is that if you want Stitch to run on secondary instances, you have to give Stitch a host IP for one of your secondary instances.

In the case of Mongos (Sharded Mongo), Stitch will always attempt to run data sync queries on your secondaries by default and you can provide the host IP for the master node.

Indexing Fields for Replication Keys

Before you jump into the actual setup, you should consider how the tables in your Mongo database are updated. Our Mongo integration uses Incremental Replication to replicate Mongo data, which means that only new and updated data will be replicated to your data warehouse when a sync runs. Stitch uses a field you designate - called a Replication Key - to identify new and updated data.

Once you've figured out the Replication Keys for your tables, the next step is to apply an index, if the fields aren't indexed already. Stitch only allows indexed fields to be set as Replication Keys for Mongo.
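As a rough illustration of applying an index, the sketch below prints a mongo shell statement that builds an ascending index on a single field. The collection name (orders) and field name (updated_at) are hypothetical examples, not names Stitch requires. It uses createIndex, which is available in Mongo 3.0+; older versions use ensureIndex instead.

```shell
#!/bin/sh
# Print a mongo shell statement that builds an ascending index on one field.
# The collection and field names passed in below are hypothetical examples.
replication_key_index_js() {
  printf 'db.getCollection("%s").createIndex({ "%s": 1 })\n' "$1" "$2"
}

replication_key_index_js "orders" "updated_at"
```

Review the printed statement, then run it in a mongo shell session against the database that holds the collection.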

For more info on why we require this or guidance on how to select a Replication Key, check out the Selecting & Changing Mongo Replication Keys doc before continuing.

Retrieving the Stitch Public Key

The Public Key is used to authorize the Stitch Linux user. In the next step, we'll create the user and import the key.

To retrieve the key:

  1. On the Stitch dashboard page, click the Add an Integration button.
  2. Click the MongoDB icon.
  3. When the credentials page displays, click the Encryption Type menu and select the SSH Tunnel option.
  4. The Public Key will display, along with the other SSH fields.

Leave this page open throughout the tutorial - you'll need it to complete the rest of the setup.

Whitelisting the Stitch IP Addresses

For the connection to be successful, you must configure your firewall to allow access from our IP addresses. Whitelist the following IPs before continuing onto the next step:

  • 54.88.76.97/32
  • 52.23.137.21/32
  • 52.204.223.208/32
  • 52.204.228.32/32
  • 52.204.230.227/32
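How you whitelist these addresses depends on your firewall. As one possible sketch, assuming the server uses iptables and SSH listens on port 22, the script below prints one rule per Stitch IP without applying anything, so the rules can be reviewed first. The function name and the SSH_PORT variable are illustrative only.

```shell
#!/bin/sh
# Stitch IP addresses from the list above.
STITCH_IPS="54.88.76.97/32 52.23.137.21/32 52.204.223.208/32 52.204.228.32/32 52.204.230.227/32"

# Assumption: SSH listens on port 22 unless SSH_PORT is set.
SSH_PORT="${SSH_PORT:-22}"

# Print one iptables rule per Stitch IP. Nothing is applied to the firewall.
print_stitch_rules() {
  for ip in $STITCH_IPS; do
    echo "iptables -A INPUT -p tcp -s $ip --dport $SSH_PORT -j ACCEPT"
  done
}

print_stitch_rules
```

If the rules look right for your environment, run them as root. Servers behind a cloud security group or a different firewall tool will need the equivalent rules configured there instead.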

Creating a Stitch Linux User

Important!
If the sshd_config file associated with the server is not set to the default option, only certain users will have server access - this will prevent a successful connection to Stitch. In these cases, it's necessary to add the Stitch user to a directive like AllowUsers in sshd_config to allow the user access to the server.

The server you create this user on can be a production or slave machine as long as it contains real-time (or frequently updated) data. You may restrict this user any way you like as long as it retains the right to connect to the MongoDB server.

Note that anything inside square brackets - [like this], for example - is something you need to define when running the commands yourself.

To create the new user, run the following commands as root on your Linux server:

adduser [stitch username]
mkdir -p /home/[stitch username]/.ssh

To ensure the user has access to the database, we need to import the Public Key into authorized_keys. Copy the entire key into the authorized_keys file as follows:

touch /home/[stitch username]/.ssh/authorized_keys
echo "[PASTE KEY HERE]" >> /home/[stitch username]/.ssh/authorized_keys

To finish creating the user, alter the permissions on the /home/[stitch username] directory to allow access via SSH:

chown -R [stitch username]:[stitch username] /home/[stitch username]
chmod -R 700 /home/[stitch username]/.ssh
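Before moving on, it can help to confirm the result of the commands above. The following is a minimal sketch of such a check, not an official Stitch tool: the function name is made up for illustration, and /home/stitch in the usage comment is a hypothetical path.

```shell
#!/bin/sh
# Check that a user's .ssh directory looks the way the steps above leave it.
# Usage: check_stitch_ssh /home/stitch   (hypothetical home directory)
check_stitch_ssh() {
  ssh_dir="$1/.ssh"
  keys="$ssh_dir/authorized_keys"

  # The .ssh directory must exist with mode 700.
  if [ -z "$(find "$ssh_dir" -prune -perm 700 2>/dev/null)" ]; then
    echo "FAIL: $ssh_dir is missing or not mode 700"
    return 1
  fi

  # authorized_keys must exist and contain at least one key.
  if [ ! -s "$keys" ]; then
    echo "FAIL: $keys is missing or empty"
    return 1
  fi

  echo "OK: $ssh_dir looks ready"
}
```

Run it as root with the Stitch user's home directory as the only argument. It checks the directory mode and that the key file is not empty. It does not validate the key itself.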

Creating a Stitch Mongo User

Important!
This section is written for Mongo versions 3.0+ and may differ for other versions. You should verify your version and refer to the Mongo documentation for your version before continuing.

To successfully connect and replicate your Mongo data, our user requires the ability to:

  • run the listDatabases command. We require this permission so we can detect the databases available for syncing.
  • run the listIndexes command. Because Stitch can only detect indexed fields in Mongo, we require this permission to identify fields that can be used as Replication Keys.
  • run the dbVersion command. While this isn’t mandatory, it’s beneficial for us to have access to this to troubleshoot any connection or replication issues that may arise.
  • COUNT and query on all the databases you want to sync. We require these permissions to replicate your data.

You can assign a role to our user if you like, as long as the role has the necessary permissions to perform the items listed above.

When connecting to multiple databases, you can add the user by logging into MongoDB as an admin user and running the following command. We’re using the createUser command, but older versions may use addUser. Documentation for addUser can be found here.

Replace [database name] with the name of the database where the user is created and authenticated:

use [database name]
db.createUser({
  user: "[stitch username]",
  pwd: "[secure password here]",
  roles: ["roles here", "if you want them"]
})
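For a more concrete picture of the roles field, here is one possible sketch. It assumes the user is created in the admin database and that the built-in readAnyDatabase role is acceptable in your environment. In Mongo 3.0+ that role grants read access on all databases except local and config, plus the listDatabases action. Your own setup may call for narrower, per-database read roles. The function name and the sample credentials are placeholders.

```shell
#!/bin/sh
# Print a createUser call for the Stitch Mongo user.
# Assumptions: the user lives in the admin database and the built-in
# readAnyDatabase role is acceptable in your environment.
# The password is embedded as-is, so avoid double quotes and backslashes in it.
stitch_create_user_js() {
  cat <<EOF
db.getSiblingDB("admin").createUser({
  user: "$1",
  pwd: "$2",
  roles: [ { role: "readAnyDatabase", db: "admin" } ]
})
EOF
}

stitch_create_user_js "stitch" "change-me"
```

Review the printed statement, then paste it into a mongo shell session where you are logged in as an admin user. With this approach, admin is the authentication database.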

After you've created the user, the next step is to enter the connection and user info into Stitch.

Entering the Connection Info into Stitch

If you’re not still on the Mongo credentials page, open it now:

  1. From the Stitch Dashboard page, click the Add an Integration button.
  2. Click the MongoDB icon.
  3. Fill in the following fields:
    • Integration Name: This is the name that will display on the Stitch dashboard for the integration; it’ll also be used to create the schema in your data warehouse.

      For example, the name “Mongo Marketing” would create a schema called mongo_marketing in the data warehouse.
    • Host: The host for your MongoDB server
    • Username: The Stitch Mongo username
    • Password: The Stitch Mongo user password
    • Port: MongoDB's port on your server (27017 by default)
    • Database Name: The name of the authentication database. This is the database used to create the Stitch Mongo user.
  4. In the Encryption Type menu, select the SSH Tunnel option.
  5. An additional set of fields will display. Fill in the following:
    • Remote Address: The IP address or hostname of the server we will SSH into
    • SSH Username: The Stitch Linux (SSH) user’s username
    • SSH Port: The SSH port on your server (22 by default)

Defining the Replication Frequency

The Replication Frequency controls how often Stitch will attempt to replicate data from your Mongo database. By default the frequency is set to 30 minutes, but you can change it to better suit your needs.

When you're finished, click the Save Integration button to complete the setup.

Selecting Databases and Tables to Sync

Now that your Mongo database is connected to Stitch, the last step is to select the databases and collections you want to sync and define the Replication Keys.

When selecting tables to sync, you might notice that there are more tables listed than what's in your native database. Stitch is built to de-nest any nested object arrays and turn them into subtables, which will increase the number of tables available for syncing. This can result in a higher number of rows synced than what’s in your native database.
