Amazon S3

This article describes how to connect Tableau to Amazon S3 with the driverless Amazon S3 connector and how to set up the data source.

Before you begin

Before you begin, gather the following connection information:

  • The AWS region of your S3 bucket.

  • The S3 bucket name.

  • Your AWS IAM access key for your S3 bucket (key ID and secret access key).

Permissions

Make sure that your AWS IAM user has read permissions for your S3 bucket.
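As an illustration of what read permissions might look like, the following Python sketch builds a minimal read-only IAM policy document for a single bucket. The bucket name example-bucket is a placeholder, and the choice of the s3:ListBucket and s3:GetObject actions is an assumption about what read access typically requires. It is not a list confirmed for this connector, so check with your AWS administrator before using it.

```python
import json


def read_only_policy(bucket_name: str, partition: str = "aws") -> dict:
    """Build a minimal read-only S3 policy for one bucket.

    Assumption: listing the bucket and getting objects is enough for read
    access. The partition is "aws" for most regions ("aws-cn" for AWS China).
    """
    bucket_arn = f"arn:{partition}:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Reading is granted on the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a hypothetical bucket name.
    print(json.dumps(read_only_policy("example-bucket"), indent=2))
```

The printed JSON can be reviewed and attached to the IAM user whose access key you enter in Tableau.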

Recommendations

These recommendations can help increase performance:

  • Run the client in an AWS environment (for example, Tableau Server installed on an EC2 instance, or Tableau Cloud).
  • Access buckets in the same region as the client where possible. Same-region access performs better than cross-region access, and cross-region access may incur additional costs due to data egress fees.

Make the connection and set up the data source

  1. Start Tableau and under Connect, select Amazon S3 from the list of Additional Connectors.
  2. Select Install and Restart Tableau to install the connector.
  3. After Tableau restarts, go to Connect, and then select Amazon S3 from the list of installed connectors.
  4. Enter your Bucket Region, Bucket Name, Access Key ID, and Secret Access Key.
  5. Select Sign In.
  6. In the contents of your bucket, select a file you want to connect to.
  7. Select Connect.

Set up the data source

Complete the following steps to set up the data source.

  1. (Optional) Select the default data source name at the top of the page.
  2. Enter a unique data source name for use in Tableau.
  3. Drag one or more files you want to connect to from the left pane into the canvas.
  4. To start your analysis, select the Sheet 1 tab.

Union your data

You can union files from your S3 bucket. For more information about union, see Union Your Data.

Wildcard unions

To perform a wildcard union with files in subfolders, the root folder or bucket must have at least one file included in the union that matches the structure of the files in subfolders. This file is the first file that you connect to when creating the union.

Wildcard union works for CSV files but isn't supported for Excel files.
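The root-file requirement above can be checked ahead of time with a small sketch like the following. It takes a list of object keys and returns the root-level CSV files whose names match a wildcard pattern. The function name, the sample keys, and the use of Unix shell-style matching (Python's fnmatch module) are all assumptions for illustration; Tableau's own pattern matching may behave differently, and the sketch does not compare column structure between files.

```python
from fnmatch import fnmatchcase
import posixpath


def root_files_for_union(keys, pattern):
    """Return root-level CSV keys whose file name matches the pattern.

    A key is treated as root-level when it contains no "/" separator.
    """
    matches = []
    for key in keys:
        in_root = "/" not in key
        name = posixpath.basename(key)
        is_csv = name.lower().endswith(".csv")
        if in_root and is_csv and fnmatchcase(name, pattern):
            matches.append(key)
    return matches


# Hypothetical bucket contents.
keys = [
    "sales_2023.csv",
    "archive/sales_2021.csv",
    "archive/sales_2022.csv",
    "notes.xlsx",
]
print(root_files_for_union(keys, "sales_*.csv"))  # ['sales_2023.csv']
```

An empty result suggests that the bucket root has no file to start the union from, so a wildcard union that includes subfolders is unlikely to work as described.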

Known issues and limitations

The following sections describe known issues and limitations that can affect how successful you are when using the Amazon S3 connector.

Note: This connector isn’t currently supported in Tableau Prep Web Authoring or virtual connections.

Authentication known issues and limitations

  • Only AWS IAM user authentication with an access key ID and secret access key, without a session token, is supported.

Publishing known issues and limitations

  • Workbooks and data sources must be published using the ‘Embedded password’ authentication option. ‘Prompt user’ isn’t currently supported.

Union known issues and limitations

  • Only Tableau Desktop supports a wildcard union.
  • Web Authoring only supports user-defined manual union (dragging files).
  • Wildcard union doesn't support Excel files.

File type known issues and limitations

  • Parquet, .csv, compressed .gz, and Excel files are supported at this time.
  • All data is, by design, imported in string format.
  • Only UTF-8 encoding is supported.
  • You can’t union or join across multiple file types in a single connection (for example, Parquet and .csv together).
  • Only comma-delimited .csv files are currently supported.
  • The file size limit is 15 GB.
  • The cumulative result set of a join or union can’t exceed 15 GB.
  • Excel files currently can’t exceed ~100 MB due to performance issues with the Excel file parser in the connector.
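The file type and size limits above can be turned into a quick pre-flight check before you upload or connect to a file. The following sketch is an illustration only: the function names are hypothetical, the extension list (including .xlsx and .xls for Excel) is an assumption, and the limits are interpreted as decimal units (1 GB = 1000^3 bytes), which this article doesn't specify.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Assumption: decimal units. The article does not say whether GB means 10^9 or 2^30 bytes.
MAX_FILE_BYTES = 15 * 1000**3
MAX_EXCEL_BYTES = 100 * 1000**2

# Assumption: these extensions stand for the supported file types.
EXCEL_SUFFIXES = {".xlsx", ".xls"}
SUPPORTED_SUFFIXES = {".parquet", ".csv", ".gz"} | EXCEL_SUFFIXES


def check_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems found for one file (empty if none)."""
    problems = []
    suffix = PurePosixPath(name).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds the 15 GB limit")
    if suffix in EXCEL_SUFFIXES and size_bytes > MAX_EXCEL_BYTES:
        problems.append("Excel file exceeds roughly 100 MB")
    return problems


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(check_file("report.xlsx", 200 * 1000**2))
```

Because the check only looks at the file name and size, it can't detect a non-comma delimiter or an unsupported Parquet feature; those still need to be verified separately.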

Additional Parquet file known issues and limitations

  • Parquet files must be in the format described in our Hyper API documentation.
  • Nested columns and therefore the nested types MAP and LIST aren’t supported.
  • The types BSON, UUID, and ENUM aren’t supported.
  • The physical type FIXED_LEN_BYTE_ARRAY without any logical or converted type isn’t supported.
  • The type DECIMAL is only supported up to 8 bytes (18 decimal digits). Consider using double if you need more than 18 decimal digits.
  • The types TIME_MILLIS and TIME_NANOS aren’t supported. Consider using TIME_MICROS instead.
  • The deprecated BIT_PACKED encoding isn’t supported. No recent Parquet files should use this encoding, as it has been deprecated for over half a decade.
  • The DELTA_LENGTH_BYTE_ARRAY encoding and the recent BYTE_STREAM_SPLIT encoding aren’t supported, as they aren’t written by any library.
  • Supported compressions are SNAPPY, GZIP, ZSTD, and LZ4_RAW.
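The 18-digit figure in the DECIMAL limitation above is consistent with storing the unscaled value in a signed 8-byte integer, which is an assumption about the underlying representation rather than something this article states. The short sketch below checks the arithmetic: every 18-digit value fits in a signed 64-bit integer, but not every 19-digit value does.

```python
INT64_MAX = 2**63 - 1  # largest value of a signed 8-byte integer


def fits_in_8_bytes(precision: int) -> bool:
    """Return True if every decimal with this many digits fits in int64."""
    largest_unscaled = 10**precision - 1  # for example, 999 for precision 3
    return largest_unscaled <= INT64_MAX


print(fits_in_8_bytes(18))  # True
print(fits_in_8_bytes(19))  # False
```

If you switch such a column to double, keep in mind that a double holds only about 15 to 17 significant decimal digits, so very long values may be rounded.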

Connecting to data in China regions

Starting in version 2.1.4 of the Amazon S3 connector, connecting to data in AWS China regions is possible, with these limitations:

  • Tableau Desktop, Tableau Server, and Tableau Cloud can connect to Excel files with no restrictions.

  • CSV and Parquet files aren’t accessible in Tableau Desktop and Tableau Cloud.

  • Tableau Server can be configured to access CSV and Parquet files using this TSM command:

    tsm configuration set -k hyper.external_allow_custom_endpoints -v 1 --force-keys

