Microsoft Azure

Microsoft Azure is a cloud computing service for building, testing, deploying, and managing applications and services through a global network of Microsoft-managed data centers. Data Management supports three types of Azure data storage: ADLS (Azure Data Lake Store) Gen1, ADLS Gen2, and Azure Blob Storage. All three support Azure Active Directory (AD) for authentication. For Data Management to authenticate with an Azure storage account, you must provide Azure AD authentication credentials. Check with your system administrator to determine which authentication method your organization uses and to obtain the appropriate credentials.

Microsoft ADLS

There are two versions of ADLS: Gen1 and Gen2. Both use Azure Active Directory (AD) for authentication. For Data Management to authenticate with Azure Data Lake Store, you must configure service-to-service authentication: create an Azure Active Directory application, use the AD application to generate authentication credentials, and provide those credentials to Data Management. Consult your system administrator for details.
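As a rough illustration of how service-to-service credentials fit together, the sketch below builds (but does not send) a standard Azure AD client-credentials token request. How Data Management uses the credentials internally is not described here, so treat this as an assumption about the general Azure AD flow. The function name `build_token_request` and the example values are hypothetical, and the scope value passed in the usage example is an assumption as well.

```python
from urllib.parse import urlencode

# Azure AD v2.0 token endpoint pattern for the client-credentials grant.
TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_token_request(tenant_id, application_id, client_secret, scope):
    """Return (url, form_body) for a client-credentials token request.

    Nothing is sent over the network; this only shows where the
    Tenant ID, Application ID, and Client secret end up.
    """
    for name, value in (("tenant_id", tenant_id),
                        ("application_id", application_id),
                        ("client_secret", client_secret)):
        if not value:
            raise ValueError(f"{name} must not be empty")
    url = TOKEN_URL.format(tenant_id=tenant_id)
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": application_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body


# Hypothetical placeholder values; the scope shown is an assumption.
url, body = build_token_request(
    "example-tenant", "example-app", "example-secret",
    scope="https://storage.azure.com/.default",
)
```

The Tenant ID selects the directory in the URL, while the Application ID and Client secret travel in the form body as `client_id` and `client_secret`.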

Authentication credentials are version-specific, and are not interchangeable between Gen1 and Gen2. If you have both Gen1 and Gen2 Data Lake Stores, you must configure each with its own credentials. While a misconfigured connection may successfully authenticate, functionality will be limited and unexpected behavior may occur.

Once you have configured access to an ADLS account, any Data Management browse dialog that has access to the account will display a new ADLS file system icon at the root level of the browse window. Expanding this item displays a list of all the authenticated ADLS accounts.

image-20240322-200112.png

Note the DFS path prefix adls:/// in the above image. If you configured access to an ADLS Gen2 account, any browse dialog that has access to the account will display the DFS path prefix adl2:/// and a new file system labeled ADLS Gen2 Accounts. Expanding this item displays a list of all the authenticated ADLS Gen2 accounts:

image-20240322-200152.png
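The two prefixes can be told apart mechanically. The following is a minimal sketch, assuming only the prefixes named above; the helper `classify_adls_path` and the example path are hypothetical, and the layout of the text after the prefix is not specified by this page.

```python
# DFS path prefixes as shown in the browse dialogs described above.
ADLS_PREFIXES = {
    "adls:///": "ADLS Gen1",
    "adl2:///": "ADLS Gen2",
}


def classify_adls_path(path):
    """Return (generation_label, remainder) for a recognized ADLS prefix."""
    for prefix, label in ADLS_PREFIXES.items():
        if path.startswith(prefix):
            return label, path[len(prefix):]
    raise ValueError(f"not an ADLS path: {path!r}")


# Hypothetical path used only for illustration.
label, remainder = classify_adls_path("adl2:///exampleaccount/data/file.csv")
```

Because credentials are not interchangeable between Gen1 and Gen2, a check like this can help catch a path that points at the wrong generation before a connection is attempted.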

Configure access to ADLS Gen1

To configure access to ADLS Gen1:

  1. In the repository, open the Settings folder, and then open the Cloud folder.

  2. Select the Azure icon, and then go to the Properties pane.

  3. On the ADLS Settings tab, enter the authentication credentials generated by the Azure AD application:

    • Tenant ID is the Directory ID associated with the AD application.

    • Application ID is the Application ID associated with the AD application.

    • Client secret is the Authentication ID associated with the AD application. This may be a Password or Key Vault reference.

  4. Select in the ADLS accounts grid and enter the names of the ADLS accounts to access using these authentication credentials.

  5. Optionally, configure Tuning settings:

    • Read ahead queue depth sets the queue depth to be used for parallelized read-aheads of files. Accept the default value of 15 to maximize read performance, or set a new value from 0 (no read-aheads) to 20.

    • Buffer size (MB) sets the size of the tool's internal read buffer. Accept the default value of 4 MB to maximize read performance, or set a new value from 1 to 4.
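The ranges and defaults for the Tuning settings can be captured in a small validation sketch. The class name `AdlsTuning` and its field names are hypothetical, and treating both settings as whole numbers is an assumption; only the defaults and ranges come from the steps above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdlsTuning:
    """Hypothetical container for the two ADLS Gen1 tuning settings."""
    read_ahead_queue_depth: int = 15  # documented default; 0 disables read-aheads
    buffer_size_mb: int = 4           # documented default and maximum

    def __post_init__(self):
        # Documented range: 0 (no read-aheads) to 20.
        if not 0 <= self.read_ahead_queue_depth <= 20:
            raise ValueError("read_ahead_queue_depth must be between 0 and 20")
        # Documented range: 1 to 4 MB.
        if not 1 <= self.buffer_size_mb <= 4:
            raise ValueError("buffer_size_mb must be between 1 and 4")


defaults = AdlsTuning()
no_read_ahead = AdlsTuning(read_ahead_queue_depth=0, buffer_size_mb=1)
```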

Configure access to ADLS Gen2

To configure access to ADLS Gen2:

  1. In the repository, open the Settings folder, and then open the Cloud folder.

  2. Select the Azure icon, and then go to the Properties pane.

  3. On the ADLS Gen2 Settings tab, select in the ADLS accounts grid and enter the names of ADLS accounts.

  4. For each account, select the Authentication type and enter the authentication credentials generated by the Azure AD application for that account. Authentication credentials are version-specific, and are not interchangeable between Gen1 and Gen2.

Authentication type and credential:

    • Access key: May be a Password or Key Vault reference.

    • OAuth2 Service Principal:

        • Tenant ID is the Directory ID associated with the AD application.

        • Application ID is the Application ID associated with the AD application.

        • Client secret is the Authentication ID associated with the AD application. This may be a Password or Key Vault reference.
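Which credentials are needed depends on the Authentication type chosen for each account. The sketch below checks a per-account entry for missing values; the function `missing_credentials` and the field keys (`access_key`, `tenant_id`, and so on) are hypothetical names chosen for illustration, not Data Management identifiers.

```python
# Credential fields expected for each ADLS Gen2 authentication type.
# The dictionary keys on the right are hypothetical field names.
GEN2_REQUIRED_FIELDS = {
    "Access key": ("access_key",),
    "OAuth2 Service Principal": ("tenant_id", "application_id", "client_secret"),
}


def missing_credentials(auth_type, credentials):
    """Return the names of required fields that are absent or empty."""
    try:
        required = GEN2_REQUIRED_FIELDS[auth_type]
    except KeyError:
        raise ValueError(f"unknown authentication type: {auth_type!r}") from None
    return [name for name in required if not credentials.get(name)]


# An OAuth2 entry that is missing its client secret.
gaps = missing_credentials(
    "OAuth2 Service Principal",
    {"tenant_id": "example-tenant", "application_id": "example-app"},
)
```

Here `gaps` lists `client_secret`, the one field the example entry does not supply.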

Azure Blob Storage

Azure Blob Storage (ABS) is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing large volumes of unstructured data, such as text or binary data.

To access ABS from Data Management, you need to know the Account name and authentication credentials for an Azure account that has been configured with a Storage Account and Azure Blob Storage. Consult your system administrator for details.

Once you have configured access to this account, any Data Management browse dialog that has access to the account will display a new file system icon labeled Azure Blob Storage at the root level of the browse window. Expanding this item displays a list of the containers in the account.

image-20240322-200535.png

URLs and paths: ABS versus WASB

Note that the URL shown in the above screenshot begins with abs:///. The same URL with the abs: scheme specifier will work everywhere for native access to the same blobs stored in ABS.

Directories and files in ABS and WASB

Technically, Azure does not have directories. In ABS and WASB paths, the slash character / in a path is actually just another character in the file name. However, Azure recognizes the slash character as an indicator of a "virtual directory," giving Data Management a way to enumerate paths and files that map to the file system displayed in a browse window. Note that empty virtual directories do not persist.
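To make the "virtual directory" idea concrete, the sketch below derives one directory level from a flat list of blob names by splitting on the slash character. It is an illustration of the concept only, not how Data Management or Azure implements enumeration; the function `list_virtual_dir` and the blob names are hypothetical.

```python
def list_virtual_dir(blob_names, prefix=""):
    """List the immediate children of `prefix` among flat blob names.

    Returns (directories, files). A child whose remaining name contains
    a "/" is reported as a virtual directory; otherwise it is a file.
    """
    dirs, files = set(), set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue
        head, sep, _ = rest.partition("/")
        if sep:
            dirs.add(head + "/")
        else:
            files.add(head)
    return sorted(dirs), sorted(files)


blobs = [
    "reports/2024/q1.csv",
    "reports/2024/q2.csv",
    "reports/readme.txt",
    "raw.bin",
]
top_level = list_virtual_dir(blobs)
inside_reports = list_virtual_dir(blobs, "reports/")

# Removing the last blobs under "reports/2024/" makes that virtual
# directory disappear, because nothing stores the directory itself.
remaining = [b for b in blobs if not b.startswith("reports/2024/")]
after_delete = list_virtual_dir(remaining, "reports/")
```

The last call shows why empty virtual directories do not persist: the directory exists only as a shared prefix of blob names.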

Append operations are not supported

Data Management supports ABS block blobs. You can modify an existing block blob by inserting, replacing, or deleting blocks. However, append operations are not supported. Enabling the Append to existing file option in a Flat File Output tool or a CSV Output tool will result in an error.
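A pre-flight check along these lines could surface the problem before a job runs. This is a minimal sketch under stated assumptions: the function `check_output_target` and its parameters are hypothetical, and it relies on the abs:/// prefix to recognize Azure Blob Storage targets.

```python
ABS_PREFIX = "abs:///"


def check_output_target(path, append_to_existing):
    """Reject an append request aimed at an Azure Blob Storage path.

    Returns the path unchanged when the combination is acceptable.
    """
    if append_to_existing and path.startswith(ABS_PREFIX):
        raise ValueError(
            "Append to existing file is not supported for Azure Blob Storage"
        )
    return path


# Overwriting is fine; only the append option is rejected for abs:/// paths.
ok = check_output_target("abs:///examplecontainer/out.csv", append_to_existing=False)
```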

Configure access to Azure Blob Storage

To configure access to Azure Blob Storage:

  1. In the repository, open the Settings folder, and then open the Cloud folder.

  2. Select the Azure icon, and then go to the Properties pane.

  3. Select the Blob Storage Settings tab, and then enter the Account name.

  4. Select the Authentication type and enter the authentication credentials generated by the Azure AD application for that account:

Authentication type and credential:

    • Connection string: May be a Password or Key Vault reference.

    • OAuth2 Service Principal:

        • Tenant ID is the Directory ID associated with the AD application.

        • Application ID is the Application ID associated with the AD application.

        • Client secret is the Authentication ID associated with the AD application. This may be a Password or Key Vault reference.
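For the Connection string option, it can help to confirm that the string names the same account entered as the Account name. The sketch below splits a connection string in the usual Azure Storage `Key=Value;Key=Value` form; the function `parse_connection_string` is hypothetical, and the account name and key in the example are placeholders, not real credentials.

```python
def parse_connection_string(conn_str):
    """Split a 'Key=Value;Key=Value' connection string into a dict.

    Each segment is split on its first '=' only, so values that end in
    '=' padding (as account keys often do) are preserved intact.
    """
    parts = {}
    for segment in conn_str.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing or doubled semicolon
        key, sep, value = segment.partition("=")
        if not sep or not key.strip():
            raise ValueError("malformed connection string segment")
        parts[key.strip()] = value
    return parts


# Placeholder values for illustration only.
example = (
    "DefaultEndpointsProtocol=https;"
    "AccountName=exampleaccount;"
    "AccountKey=ZmFrZS1rZXk=;"
    "EndpointSuffix=core.windows.net"
)
parsed = parse_connection_string(example)
account_matches = parsed.get("AccountName") == "exampleaccount"
```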
