MongoDB tools

Overview

MongoDB is a widely used NoSQL document database.

Data Management supports MongoDB versions 6.x, 7.x, and 8.x, including Atlas clusters.

MongoDB documents are based on BSON, which can be thought of as a binary JSON with added type support (including timestamp, decimal, and binary types). MongoDB supports unique indexes and indexing of array elements, and does not directly support joins.

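Data Management's support for MongoDB includes shared connection settings and the following tools: MongoDB Input, MongoDB Output, MongoDB Deleter, MongoDB Updater, MongoDB Array Updater, MongoDB Key Query, and MongoDB Executor.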

MongoDB tool connection settings

Data Management's MongoDB tools use shared settings, which allows you to define a single set of configuration properties (typically access credentials) to share across multiple tools in your Data Management Site. You can override these settings on a per-tool basis by opening the Connection settings section on the tool's Properties pane, selecting Override, and specifying values for that specific tool.

To define MongoDB shared tool settings

  1. Open the Tools folder under Settings in the repository.

  2. Select the MongoDB tab, and then configure the tool properties for your environment.

Property

Description

Connection URI

The MongoDB URI connection string (as defined in the MongoDB documentation).

Explicit username/password

If selected, you must specify an explicit Username and Password (or Key Vault reference). While the username and password can be embedded in the Connection URI string, it is often preferable for security reasons to specify these separately, since the password will be encrypted.

To configure default shared tool settings from a MongoDB tool's Properties pane, open the Connection settings section, and select Edit default settings.
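As a rough illustration of why embedding credentials in the Connection URI is awkward, the following Python sketch assembles a URI using only the standard library. The host, database, and account values are invented for the example, and build_mongo_uri is a hypothetical helper, not part of Data Management. Reserved characters in an embedded username or password have to be percent-encoded, which is avoided entirely when you supply them through the separate Username and Password properties.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, database, username=None, password=None, port=27017):
    """Return a mongodb:// URI; credentials are optional and percent-encoded."""
    credentials = ""
    if username is not None and password is not None:
        # Characters such as '@', ':' and '/' must be escaped inside a URI.
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return f"mongodb://{credentials}{host}:{port}/{database}"


# Hypothetical values, for illustration only.
uri_with_password = build_mongo_uri("db.example.com", "sales", "app_user", "p@ss/word")
uri_without_password = build_mongo_uri("db.example.com", "sales")
```

The second form, without credentials, is the shape you would typically paste into Connection URI when Explicit username/password is selected.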

To override MongoDB tool shared settings

  1. Select the desired MongoDB tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Open the Connection settings section and select Override.

  4. Specify new values for the tool.

MongoDB Input

The MongoDB input tool reads documents from a collection of a MongoDB database and sends those documents to its output connector. The tool makes no attempt to interpret the documents; instead the documents are stored in a field of type Document.

MongoDB Input tool configuration parameters

The MongoDB Input tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Query style

Select the type of query to perform: Form or JSON.

Query

If Query style is Form, use the Edit filter terms grid to construct a query: select Field, Operation, and Type, and specify Values.

If Query style is JSON, enter a query in JSON form to filter the documents returned. The query string must correspond to MongoDB's specification. Some examples:

  • Equality on zip_code field: { "zip_code":"02134" }

  • Names starting with "Z": { "name" : { "$regex" : "^Z.*" } }

  • Salary greater than or equal to 10000: { "salary" : { "$gte" : 10000 } }

Select Find field names to generate a list of field names discovered in existing documents. If you do not enter a query, the tool will return all documents in the collection (unless a limit is specified on the Options tab).
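To get a feel for how JSON-style queries select documents, the sketch below parses three queries with Python's json module and evaluates them against sample documents using a deliberately tiny matcher. The matcher supports only equality, $regex, and $gte; it is an illustrative stand-in, not the MongoDB server's logic, and the sample documents are made up.

```python
import json
import re


def matches(document, query):
    """Return True if document satisfies a small subset of MongoDB query syntax.

    Supported: implicit equality, $regex, and $gte. This is an illustrative
    stand-in, not the server's matching logic.
    """
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            for operator, operand in condition.items():
                if operator == "$regex":
                    # $regex is unanchored: it matches anywhere unless ^ is used.
                    if not isinstance(value, str) or re.search(operand, value) is None:
                        return False
                elif operator == "$gte":
                    if value is None or value < operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {operator}")
        elif value != condition:
            return False
    return True


# Hypothetical sample documents.
people = [
    {"name": "Zoe", "zip_code": "02134", "salary": 12000},
    {"name": "Liz", "zip_code": "80301", "salary": 9000},
]

# json.loads rejects malformed text, such as a missing closing brace.
starts_with_z = json.loads('{ "name": { "$regex": "^Z" } }')
well_paid = json.loads('{ "salary": { "$gte": 10000 } }')
in_zip = json.loads('{ "zip_code": "02134" }')

names_starting_with_z = [p["name"] for p in people if matches(p, starts_with_z)]
```

Running a query string through json.loads before pasting it into the Query box is a cheap way to catch unbalanced braces or missing quotes.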

Options

Parameter

Description

Sort by
Sort field
Sort descending

If Sort by is selected, the MongoDB server will sort documents on the specified Sort field before returning them. Select Find field names to generate a list of field names discovered in existing documents. Optionally, you may select Sort descending to sort values in descending order.

MongoDB will not sort more than 100MB of data. To query and sort a large amount of data, use Data Management's Sort tool downstream from the MongoDB Input tool.

Limit records
Process only the first

If selected, enter the maximum number of records to be returned. For test runs on large databases, this can significantly reduce run-time.

Enable trigger input

See Trigger input and output.

Only return selected fields

If selected, specify a list of field names to be returned. This will reduce the size and complexity of the returned documents and improve performance. Select Find field names to generate a list of field names discovered in existing documents.
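One way to picture the options above is as the fields of a MongoDB find command. The Python sketch below builds such a command document from tool-like settings. How Data Management actually issues its queries is not documented here, so treat the mapping as an assumption; build_find_command and the collection name are hypothetical.

```python
def build_find_command(collection, query=None, sort_field=None,
                       sort_descending=False, limit=None, fields=None):
    """Assemble a MongoDB `find` database command document from tool-like options."""
    command = {"find": collection, "filter": query or {}}
    if sort_field:
        # 1 sorts ascending, -1 sorts descending.
        command["sort"] = {sort_field: -1 if sort_descending else 1}
    if limit:
        command["limit"] = limit
    if fields:
        # A projection value of 1 includes the field in returned documents.
        command["projection"] = {name: 1 for name in fields}
    return command


find_command = build_find_command(
    "customers",  # hypothetical collection name
    query={"salary": {"$gte": 10000}},
    sort_field="salary",
    sort_descending=True,
    limit=100,
    fields=["name", "salary"],
)
```

With no options set, the command reduces to an empty filter, which corresponds to returning every document in the collection.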

Configure the MongoDB Input tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Input tool:

  1. Select the MongoDB Input tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database and Collection.

  5. Select Query style and choose Form or JSON:

    • If Query style is Form, use the Edit filter terms grid to construct a query: select Field, Operation, and Type, and specify Values.

    • If Query style is JSON, enter a JSON query to filter the documents returned, or leave blank to return all documents in the Collection. You can select Find field names to generate a list of field names discovered in existing documents.

  6. Select the Options tab to configure how document fields are returned, or Enable Trigger input.

  7. Optionally, select the Sample tab, and then choose Refresh Sample data to view a sample of the input data.

  8. Optionally, go to the Execution tab, and then set Web service options.

MongoDB Output

The MongoDB Output tool inserts new documents into a collection, or optionally replaces documents that already exist and inserts those that do not (upsert behavior).

MongoDB Output tool configuration parameters

The MongoDB Output tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Input document field

Select the field from the upstream connection that contains documents to insert. This field must exist and be of type “Document”.

Processing mode

Select processing mode:

  • Create: inserts new documents into the collection. If a document with the same Target key field value already exists (or if there is another unique index violation), the operation fails and the project aborts.

  • Upsert: attempts to insert new documents into the collection. If a document with the same Target key field value already exists (or if there is another unique index violation), the existing document is replaced by the new document. If Target key field is not the default _id, it should be indexed; otherwise the update operation will be inefficient.

Target key field

Key field for documents in the target Database and Collection. In MongoDB, the default primary key field is _id. We suggest that you use this rather than creating another key field.
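The difference between the two processing modes can be sketched with an in-memory stand-in for a collection. The Python below is only a model of the semantics described above (Create fails on an existing key, Upsert replaces the whole document); it does not talk to MongoDB, and write_document and DuplicateKeyError are hypothetical names.

```python
class DuplicateKeyError(Exception):
    """Raised when Create mode meets an existing key (stand-in for a server error)."""


def write_document(collection, document, mode="Create", key_field="_id"):
    """Apply one document to an in-memory 'collection' (a dict keyed by key value)."""
    key = document[key_field]
    if mode == "Create":
        if key in collection:
            raise DuplicateKeyError(f"duplicate {key_field}: {key!r}")
        collection[key] = document
    elif mode == "Upsert":
        # Replace the whole existing document, or insert if none exists.
        collection[key] = document
    else:
        raise ValueError(f"unknown processing mode: {mode}")


target = {}
write_document(target, {"_id": 1, "name": "Ann", "city": "Boston"})
write_document(target, {"_id": 1, "name": "Ann B."}, mode="Upsert")

try:
    write_document(target, {"_id": 1, "name": "Other"}, mode="Create")
    create_failed = False
except DuplicateKeyError:
    create_failed = True
```

Note that after the Upsert the city field is gone: the existing document is replaced, not merged. To change individual fields while keeping the rest, the MongoDB Updater tool is the better fit.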

Options

Parameter

Description

Enable trigger output

See Trigger input and output.

Block size

Controls how many records are sent to the MongoDB server at one time. If the connection to the server has some latency, you may achieve increased performance by increasing this value. However, higher settings will use more memory.
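The trade-off behind Block size can be sketched as simple batching: larger blocks mean fewer round trips to the server but more records held in memory at once. The Python generator below is a generic illustration of that idea, not the tool's implementation.

```python
from itertools import islice


def blocks(records, block_size):
    """Yield successive lists of at most block_size records."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    iterator = iter(records)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


# 10 records with a block size of 4 produce 3 round trips.
block_lengths = [len(block) for block in blocks(range(10), 4)]
```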

Configure the MongoDB Output tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Output tool:

  1. Select the MongoDB Output tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database and Collection.

  5. Select Processing mode and choose either Create or Upsert:

    • Create: inserts new documents into the collection. If a document with the same Target key field value already exists (or if there is another unique index violation), the operation fails and the project aborts.

    • Upsert: attempts to insert new documents into the collection. If a document with the same Target key field value already exists (or if there is another unique index violation), the existing document is replaced by the new document. If Target key field is not the default _id, it should be indexed; otherwise the update operation will be inefficient.

  6. Optionally, select Target key field and specify the key field for documents in the target Database and Collection. In MongoDB, the default primary key field is _id. If you do not explicitly set an _id field in your documents, MongoDB will synthesize a new one for you.

  7. Select the Options tab to Enable Trigger output or set Block size.

  8. Optionally, go to the Execution tab, and then set Report options.

MongoDB Deleter

The MongoDB Deleter tool deletes documents in the target collection that match keys read from the input connection.

MongoDB Deleter tool configuration parameters

The MongoDB Deleter tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Input key field
Document key field

Select an Input key field and a Document key field. Input key field values will be read from source documents. The tool will delete all documents in the target collection with Document key field values matching any of the input values.

The _id field is the default unique primary key for MongoDB collections, and is the usual choice. Choose other key fields if you need more flexibility (for example, deleting multiple documents per key, such as "all documents in this ZIP code"). You may also need to choose another key if _id is not the natural key available in the source data.

Reducing duplicate key fields by using an upstream Unique tool may improve performance if there are a large number of duplicate keys.
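Conceptually, deleting by key amounts to a filter that matches any of the input values. The Python sketch below builds such a filter with MongoDB's $in operator and removes duplicate keys first, which plays the role of the upstream Unique tool. The exact operations the Deleter sends are not documented here, so this is an assumption about the shape of the request; the field name and values are made up.

```python
def build_delete_filter(input_keys, document_key_field="_id"):
    """Return a MongoDB filter matching documents whose key is in input_keys."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique_keys = list(dict.fromkeys(input_keys))
    return {document_key_field: {"$in": unique_keys}}


# Hypothetical input: ZIP codes read from upstream records, with repeats.
delete_filter = build_delete_filter(["02134", "80301", "02134"], "zip_code")
```

Because zip_code is not unique, a filter like this can remove many documents per input key, which is the "all documents in this ZIP code" behavior described above.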

Options

Parameter

Description

Enable trigger output

See Trigger input and output.

Block size

Controls how many delete operations are sent to the MongoDB server at one time. If the connection to the server has some latency, you may achieve increased performance by increasing this value. However, higher settings will use more memory.

Field conversion options

See Document database field conversion options.

Configure the MongoDB Deleter tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Deleter tool:

  1. Select the MongoDB Deleter tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database and Collection.

  5. Select an Input key field.

  6. Enter a Document key field. The tool will delete all documents in the target collection whose Document key field matches any of the input values.

  7. Select the Options tab to Enable Trigger output or set Block size.

  8. Optionally, open the Field conversion section on the Options tab and configure Document database field conversion options. If the conversion options do not match the target database, the key values will not match.

  9. Optionally, go to the Execution tab, and then set Report options.

The _id field is the natural unique primary key for MongoDB collections, and is the usual choice. However, choosing other key fields can support behavior like deleting multiple documents per key (for example, "all documents in this ZIP Code"). You may also need to choose another key if _id is not the natural key available in the source data.

If there are a great number of duplicate keys, you may improve performance by reducing duplicate key fields with an upstream Unique tool.

MongoDB Updater

The MongoDB Updater tool alters existing documents in a MongoDB collection by replacing values with those from a Document field in the input records. Each update field must be specified separately.

MongoDB Updater tool configuration parameters

The MongoDB Updater tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Input document field

Select the field from the upstream connection that contains documents containing update values.

Update key

Select the update key, which is used to determine which records to update. The _id field is the default unique primary key for MongoDB collections, and is the usual choice. Choose a different update key if you need more flexibility (for example, updating multiple documents at once). The update key must exist in the input documents, or the project will abort with an error.

Update fields

Enter a list of fields that will be copied from the input source to target documents. Select Find field names to generate a list of field names discovered in existing documents.
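A field-by-field update of this kind resembles a MongoDB $set update: a filter on the update key plus the listed fields and their new values. The Python sketch below builds that pair of documents from one input document. It is a model under assumptions: build_update is a hypothetical helper, and skipping update fields that are absent from the input is a choice made for the example, not documented tool behavior.

```python
def build_update(input_document, update_key, update_fields):
    """Return (filter, update) documents for a field-by-field update."""
    if update_key not in input_document:
        # Mirrors the documented behavior: a missing update key is an error.
        raise KeyError(f"update key {update_key!r} missing from input document")
    filter_doc = {update_key: input_document[update_key]}
    # Copy only the listed fields that are present in the input document.
    changes = {f: input_document[f] for f in update_fields if f in input_document}
    return filter_doc, {"$set": changes}


filter_doc, update_doc = build_update(
    {"_id": 7, "email": "ann@example.com", "phone": "555-0100", "notes": "n/a"},
    update_key="_id",
    update_fields=["email", "phone"],
)
```

The notes field is left out of the update because it is not listed in Update fields, so the target document keeps whatever value it already has.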

Options

Parameter

Description

Include nulls in document

Select to add explicit null values to the documents.

Block size

Controls how many records are sent to the MongoDB server at one time. If the connection to the server has some latency, you may achieve increased performance by increasing this value. However, higher settings will use more memory.

Enable trigger output

See Trigger input and output.

Configure the MongoDB Updater tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Updater tool:

  1. Select the MongoDB Updater tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database and Collection.

  5. Select an Input document field.

  6. Enter an Update key, used to determine which records to update.

  7. To update the entire document with a new document, use the MongoDB Output tool with Processing mode set to Upsert.

  8. Select one or more Update fields that will be copied from the source to target documents. You can select Find field names to populate the list of field names discovered in existing documents.

  9. Select the Options tab to Include nulls in document, Enable Trigger output, or set Block size.

  10. Optionally, go to the Execution tab, and then set Report options.

MongoDB Array Updater

The MongoDB Array Updater tool alters existing documents in a MongoDB collection by adding data to an array as an atomic document database operation (MongoDB syntax $push). Enable the Unique result option to add a value to the target array only if the value does not already exist in the array (MongoDB syntax $addToSet).
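The difference between $push and $addToSet can be sketched in a few lines of Python. The first function builds the update document; the second applies it to an in-memory document so the effect is visible without a server. Both are hypothetical helpers for illustration, and the in-memory version ignores details of the real operators, such as modifiers.

```python
def build_array_update(target_field, value, unique_result=False):
    """Return the update document for appending value to an array field."""
    operator = "$addToSet" if unique_result else "$push"
    return {operator: {target_field: value}}


def apply_array_update(document, update):
    """Apply a $push or $addToSet update to an in-memory document (illustration)."""
    for operator, changes in update.items():
        for field, value in changes.items():
            array = document.setdefault(field, [])
            if operator == "$push" or value not in array:
                array.append(value)
    return document


order = {"_id": 1, "tags": ["new"]}
apply_array_update(order, build_array_update("tags", "new"))                      # $push
apply_array_update(order, build_array_update("tags", "new", unique_result=True))  # $addToSet
apply_array_update(order, build_array_update("tags", "paid", unique_result=True))
```

The plain $push adds a second "new" entry, while the $addToSet calls add only the value that is not already present.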

MongoDB Array Updater tool configuration parameters

The MongoDB Array Updater tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Input document field

Select the array field from the upstream connection that contains documents containing update values.

Update key

Select the update key, which is used to determine which records to update. The _id field is the default unique primary key for MongoDB collections, and is the usual choice. Choose a different update key if you need more flexibility (for example, updating multiple documents at once). The update key must exist in the input documents, or the project will abort with an error.

Target field

Enter one or more fields to update. Select Find field names to generate a list of field names discovered in existing documents.

Options

Parameter

Description

Unique result

Select to add a value to the target array only if the value does not already exist in the array (MongoDB syntax $addToSet). This ensures that there will be no duplicate items added by the update.

Block size

Controls how many records are sent to the MongoDB server at one time. If the connection to the server has some latency, you may achieve increased performance by increasing this value. However, higher settings will use more memory.

Enable trigger output

See Trigger input and output.

Configure the MongoDB Array Updater tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Array Updater tool:

  1. Select the MongoDB Array Updater tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database and Collection.

  5. Select an Input document field.

  6. Enter an Update key, used to determine which records to update.

  7. Select one or more Target fields to update. You can select Find field names to populate the list of field names discovered in existing documents.

  8. Select the Options tab to select Unique result, Enable trigger output, or set Block size.

  9. Optionally, go to the Execution tab, and then set Report options.

MongoDB Key Query

The MongoDB Key Query tool returns all documents from the source collection whose key fields match any of the input key field values.

MongoDB Key Query tool configuration parameters

The MongoDB Key Query tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Input key field
Document key field

Select an Input key field and Document key field. Select Find field names to generate a list of field names discovered in existing documents. The tool will return all documents in the target collection whose Document key field value matches any of the Input key field values.
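A key lookup of this kind can be pictured as one $in filter per block of input keys. The Python sketch below splits the keys into blocks and builds a filter for each. Whether the tool batches exactly this way is an assumption; key_query_filters, the field name, and the key values are hypothetical.

```python
def key_query_filters(input_keys, document_key_field="_id", block_size=1000):
    """Return one $in filter per block of input keys."""
    keys = list(input_keys)
    return [
        {document_key_field: {"$in": keys[start:start + block_size]}}
        for start in range(0, len(keys), block_size)
    ]


# Hypothetical keys and a deliberately small block size.
filters = key_query_filters(["A1", "B2", "C3"], "customer_id", block_size=2)
```

If the Document key field is not _id, an index on that field keeps each of these lookups efficient.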

Options

Parameter

Description

Block size

Controls the number of keys that are sent to the MongoDB server at one time. If the connection to the server has some latency, you may achieve increased performance by increasing this value. However, higher settings will use more memory.

Field conversion options

See Document database field conversion options.

Configure the MongoDB Key Query tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Key Query tool:

  1. Select the MongoDB Key Query tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database and Collection.

  5. Select an Input key field, and then choose a Document key field. You can select Find field names to generate a list of field names discovered in existing documents.

  6. Optionally, select the Options tab to set Block size.

  7. Optionally, open the Field conversion section on the Options tab and configure Document database field conversion options. If the defined conversion options do not match the target database, the key values will not match.

  8. Optionally, select the Sample tab, and then choose Refresh Sample data to view a sample of the input data.

  9. Optionally, go to the Execution tab, and then set Report options and Web service options.

The _id field is the natural unique primary key for MongoDB collections, and is the usual choice. However, choosing other key fields can support behavior like returning multiple documents per key (for example, "all documents in this ZIP Code"). You may also need to choose another key if _id is not the natural key available in the source data.

MongoDB Executor

The MongoDB Executor tool can perform the following commands on the configured database:

  • Drop collection

  • Clear collection

  • Drop index on collection

  • Create index on a collection, specifying field list, Unique, and Sparse options

MongoDB Executor tool configuration parameters

The MongoDB Executor tool has two sets of configuration parameters in addition to the standard execution options.

Configuration

Parameter

Description

Override connection settings

See MongoDB shared settings.

Database

Select a database from the list of existing databases. You must have a valid MongoDB connection configured in order for this list to be populated.

Operation

Select one or more Collection actions:

  • CLEAR: remove all documents from the collection.

  • DROP: delete the collection and all associated configuration (indexes, validation, and options created outside Data Management).

Select one or more Index actions:

  • CREATE: create an index. No error is generated if the index already exists. However, if the index exists with an incompatible specification, the index will be dropped and re-created.

  • DROP: delete an index. No error is generated if the index does not exist.

Collection

Select a collection from the list of existing collections. You must have a valid MongoDB connection to an existing database in order for this list to be populated.

Index name

Name of the index to CREATE or DROP.

Sparse

Select to create a sparse index. A sparse index contains entries only for documents that have the indexed field, so it saves space when the field is present in few documents.

Unique

Select to create a unique index. This enforces uniqueness across the specified fields.

The _id field always has a unique index.

An index that is both sparse and unique prevents the collection from having documents with duplicate values for a field, but allows multiple documents that omit the key.

Fields

Comma-separated list of field names on which to index. Indexes are ascending on all fields.

Enable shell commands
Enter shell commands

Select to enter database commands in JSON form, one per line. You can use this to perform operations that are unavailable in Collection actions and Index actions. See MongoDB Database Commands for details.
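For orientation, the Python sketch below writes out a few MongoDB database commands as JSON, one per line. createIndexes, dropIndexes, delete, and drop are standard MongoDB database commands, but whether the Executor accepts each of them in exactly this form is an assumption to verify in your environment. The collection and index names are invented.

```python
import json


def create_index_command(collection, index_name, fields, unique=False, sparse=False):
    """Build a MongoDB `createIndexes` command for an ascending index."""
    index = {"key": {field: 1 for field in fields}, "name": index_name}
    if unique:
        index["unique"] = True
    if sparse:
        index["sparse"] = True
    return {"createIndexes": collection, "indexes": [index]}


commands = [
    create_index_command("customers", "email_idx", ["email"], unique=True, sparse=True),
    {"dropIndexes": "customers", "index": "old_idx"},          # drop one index by name
    {"delete": "staging", "deletes": [{"q": {}, "limit": 0}]},  # remove all documents
    {"drop": "scratch"},                                        # drop the collection
]

# One JSON command per line, matching the "one per line" rule above.
shell_text = "\n".join(json.dumps(command) for command in commands)
```

The four commands correspond roughly to the CREATE and DROP index actions and the CLEAR and DROP collection actions, so in most cases the grids are the simpler route.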

Options

Parameter

Description

Enable trigger input
Enable trigger output

See Trigger input and output.

Configure the MongoDB Executor tool

Before configuring a MongoDB tool, you should have a MongoDB connection defined in tool connection settings.

To configure the MongoDB Executor tool:

  1. Select the MongoDB Executor tool.

  2. Go to the Configuration tab on the Properties pane.

  3. Optionally, override shared settings:

    1. Open the Connection settings section.

    2. Select Override.

    3. Specify new values for the tool.

  4. Select a Database.

  5. Use the Collection actions and/or Index actions grids to configure commands. You may add as many commands as you like.

To drop a collection

In the Collection actions grid:

  1. Select a grid cell under Operation and choose DROP.

  2. Choose the corresponding grid cell under Collection.

  3. Select the collection you want to drop.

To clear a collection

In the Collection actions grid:

  1. Select a grid cell under Operation and choose CLEAR.

  2. Choose the corresponding grid cell under Collection.

  3. Select the collection you want to clear.

To create an index
  1. In the Index actions grid:

    1. Select a grid cell under Operation and choose CREATE.

    2. Choose the corresponding grid cell under Collection.

    3. Select the collection you want to index.

  2. Select the corresponding grid cell under Index name and enter a name for the new index.

  3. Optionally, select Sparse and/or Unique to select those index types.

  4. Select the corresponding grid cell under Fields and enter one or more field names (separated by commas) to be indexed.

To drop an index
  1. In the Index actions grid:

    1. Select a grid cell under Operation and choose DROP.

    2. Choose the corresponding grid cell under Collection.

    3. Select the collection that contains the index you want to drop.

  2. Select the corresponding grid cell under Index name and enter a name for the index you want to drop.

  3. Select the corresponding grid cell under Fields and enter one or more field names (separated by commas) whose indexes will be dropped.

  4. Optionally, select Enable shell commands and enter database commands in JSON form, one per line.

  5. Select the Options tab to Enable Trigger input or output.

  6. Optionally, go to the Execution tab, and then set Report options.
