Amazon S3

This article describes how to connect Tableau to Amazon S3 by using the driverless Amazon S3 connector, and how to set up the data source.

Before you begin

Before you begin, gather the following connection information:

  • The AWS region of your S3 bucket.

  • The S3 bucket name.

  • The AWS IAM access key for your S3 bucket (access key ID and secret access key).

Permissions

Ensure that your AWS IAM user or role has read permissions for your S3 bucket.
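As an illustration, a read-only IAM policy for a single bucket could be generated with the following minimal sketch. This article does not list the exact actions that the connector needs, so the choice of s3:ListBucket and s3:GetObject is an assumption based on the usual S3 read actions. The function name build_read_only_policy and the bucket name example-bucket are hypothetical.

```python
import json


def build_read_only_policy(bucket_name: str) -> dict:
    """Return an IAM policy document that allows listing and reading one bucket.

    Assumption: listing and reading objects is enough for "read permissions".
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Reading is granted on the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder, not a real bucket.
    print(json.dumps(build_read_only_policy("example-bucket"), indent=2))
```

The printed JSON can be reviewed by your AWS administrator before it is attached to the IAM user whose access key you enter in Tableau.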

Recommendations

The following recommendations can help increase performance.

  • For the best performance, run the client in an AWS environment (for example, Tableau Desktop or Tableau Server installed on an EC2 instance, or Tableau Cloud).
  • Cross-region bucket access works, but performance is worse than same-region access. Data egress fees might also add cost.

Make the connection and set up the data source

  1. Start Tableau and under Connect, select Amazon S3 from the list of Additional Connectors.
  2. Select Install and Restart Tableau to install the connector.
  3. After Tableau restarts, go to Connect, and then select Amazon S3 from the list of installed connectors.
  4. Enter your Bucket Region, Bucket Name, Access Key ID, and Secret Access Key.
  5. Select Sign In.
  6. In the contents of your bucket, select a file you want to connect to.
  7. Select Connect.

Set up the data source

Complete the following steps to set up the data source.

  1. (Optional) Select the default data source name at the top of the page, and then enter a unique data source name for use in Tableau.
  2. Drag one or more files that you want to connect to from the left pane onto the canvas.
  3. To start your analysis, select the Sheet 1 tab.

Union your data

You can union files from your S3 bucket. For more information about unions, see Union Your Data. To perform a wildcard union that includes files in subfolders, the root folder or bucket must contain at least one file whose structure matches the structure of the files in the subfolders. Connect to this file first when you create the union.
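Before you create a wildcard union, it can help to confirm that a matching file sits directly in the root folder or bucket. The following is a minimal sketch of that check over a list of object keys. The function name root_level_matches is hypothetical, the pattern matching uses Python's fnmatchcase (Tableau's own wildcard rules may differ), and the sketch checks only file names, not whether the file structure matches.

```python
from fnmatch import fnmatchcase


def root_level_matches(keys, pattern, root=""):
    """Return the keys that sit directly in `root` and match `pattern`.

    Keys inside subfolders of `root` are ignored, because the requirement
    is about a file in the root folder or bucket itself.
    """
    # Normalize the root so that "data" and "data/" behave the same way.
    prefix = root if not root or root.endswith("/") else root + "/"
    matches = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        # A "/" in the remainder means the object is in a subfolder.
        if rest and "/" not in rest and fnmatchcase(rest, pattern):
            matches.append(key)
    return matches


if __name__ == "__main__":
    # Hypothetical bucket listing.
    keys = ["sales_2023.csv", "2022/sales_q1.csv", "2022/sales_q2.csv"]
    candidates = root_level_matches(keys, "sales_*.csv")
    if candidates:
        print("Connect to this file first:", candidates[0])
    else:
        print("No matching file in the root; add one before the union.")
```

If the function returns an empty list, the wildcard union has no root-level file to start from.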

Known issues and limitations

The following sections describe known issues and limitations that can affect your use of the Amazon S3 connector.

Note: This connector isn’t currently supported in Tableau Prep Web Authoring or virtual connections.

Authentication known issues and limitations
  • Only AWS IAM user authentication with an access key ID and secret access key is supported. Session tokens aren't supported.

Publishing known issues and limitations
  • Workbooks and data sources must be published using the ‘Embedded password’ authentication option. ‘Prompt user’ isn’t currently supported.

Union known issues and limitations
  • Only Tableau Desktop supports wildcard unions.
  • Web authoring supports only manual unions that you define by dragging files.

File type known issues and limitations
  • Parquet, .csv, compressed .gz, and Excel files are supported at this time.
  • All data is, by design, imported in string format.
  • You can’t union or join across multiple file types in a single connection (for example, Parquet and .csv together).
  • Only comma-delimited .csv files are currently supported.
  • The cumulative query and file size limit is 15 GB.
  • Excel files currently can’t exceed approximately 100 MB due to performance issues with the Excel file parser in the connector.
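The file type limits above can be checked before you connect. The following is a minimal sketch that takes a list of object keys and sizes. Several details are assumptions and not part of this article: the file extensions used to detect each type, the use of decimal units for GB and MB, reading the 15 GB limit as the sum of the selected file sizes, and treating .gz as its own file type. The function name check_files is hypothetical.

```python
import os

# Assumption: decimal units (1 GB = 1000**3 bytes).
GB = 1000 ** 3
MB = 1000 ** 2

TOTAL_LIMIT = 15 * GB   # cumulative limit from the list above
EXCEL_LIMIT = 100 * MB  # approximate Excel limit from the list above

# Assumption: these extensions identify the supported file types.
FILE_TYPES = {
    ".parquet": "parquet",
    ".csv": "csv",
    ".gz": "gz",
    ".xlsx": "excel",
    ".xls": "excel",
}


def check_files(files):
    """Return a list of problems for `files`, a list of (key, size_in_bytes)."""
    problems = []
    kinds = set()
    total = 0
    for key, size in files:
        ext = os.path.splitext(key)[1].lower()
        kind = FILE_TYPES.get(ext)
        if kind is None:
            problems.append(f"{key}: unsupported file type")
            continue
        kinds.add(kind)
        total += size
        if kind == "excel" and size > EXCEL_LIMIT:
            problems.append(f"{key}: Excel file larger than about 100 MB")
    # Mixing file types in one connection isn't supported.
    if len(kinds) > 1:
        problems.append(
            "multiple file types in one connection: " + ", ".join(sorted(kinds))
        )
    if total > TOTAL_LIMIT:
        problems.append("cumulative file size exceeds 15 GB")
    return problems


if __name__ == "__main__":
    # Hypothetical selection of files.
    for problem in check_files([("a.csv", 2 * GB), ("b.parquet", 1 * GB)]):
        print(problem)
```

An empty result means that none of the limits covered by the sketch were hit. It does not check delimiters or the contents of the files.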

Additional Parquet file known issues and limitations
  • Parquet files must be in the format described in our Hyper API documentation.
  • Nested columns, and therefore the nested types MAP and LIST, aren’t supported.
  • The types BSON, UUID, and ENUM aren’t supported.
  • The physical type FIXED_LEN_BYTE_ARRAY without any logical or converted type isn’t supported.
  • The type DECIMAL is only supported up to 8 bytes (18 decimal digits). Consider using double if you need more than 18 decimal digits.
  • The types TIME_MILLIS and TIME_NANOS aren’t supported. Consider using TIME_MICROS instead.
  • The deprecated BIT_PACKED encoding isn’t supported. No recent Parquet files should use this encoding, because it has been deprecated for more than half a decade.
  • The DELTA_LENGTH_BYTE_ARRAY encoding and the recent BYTE_STREAM_SPLIT encoding aren’t supported, as they aren’t written by any library. If you encounter any Parquet files using these encodings, let us know.
  • Supported compressions are SNAPPY, GZIP, ZSTD, and LZ4_RAW.
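The Parquet limits above can be expressed as a per-column check. The following is a minimal sketch that works on plain dictionaries, not on Parquet files directly. The dictionary layout (name, physical_type, logical_type, decimal_precision, encodings, compression) and the function name check_column are hypothetical, and you would need to fill the dictionaries from the footer metadata that your Parquet library exposes. Treating UNCOMPRESSED as acceptable is also an assumption, because this article lists only the supported compression codecs.

```python
# Type, encoding, and compression names are taken from the list above.
UNSUPPORTED_LOGICAL = {
    "MAP", "LIST", "BSON", "UUID", "ENUM", "TIME_MILLIS", "TIME_NANOS",
}
UNSUPPORTED_ENCODINGS = {
    "BIT_PACKED", "DELTA_LENGTH_BYTE_ARRAY", "BYTE_STREAM_SPLIT",
}
SUPPORTED_COMPRESSIONS = {"SNAPPY", "GZIP", "ZSTD", "LZ4_RAW"}
# Assumption: uncompressed columns are also accepted.
ACCEPTED_COMPRESSIONS = SUPPORTED_COMPRESSIONS | {"UNCOMPRESSED"}


def check_column(col):
    """Return a list of issues for one column described by the dict `col`."""
    issues = []
    logical = col.get("logical_type")
    if logical in UNSUPPORTED_LOGICAL:
        issues.append(f"logical type {logical} isn't supported")
    if col.get("physical_type") == "FIXED_LEN_BYTE_ARRAY" and logical is None:
        issues.append("FIXED_LEN_BYTE_ARRAY without a logical type isn't supported")
    # DECIMAL is supported up to 8 bytes, which is 18 decimal digits.
    if logical == "DECIMAL" and col.get("decimal_precision", 0) > 18:
        issues.append("DECIMAL with more than 18 digits isn't supported")
    for encoding in sorted(set(col.get("encodings", ())) & UNSUPPORTED_ENCODINGS):
        issues.append(f"encoding {encoding} isn't supported")
    compression = col.get("compression", "UNCOMPRESSED")
    if compression not in ACCEPTED_COMPRESSIONS:
        issues.append(f"compression {compression} isn't supported")
    return issues


if __name__ == "__main__":
    # Hypothetical column description.
    column = {
        "name": "amount",
        "physical_type": "FIXED_LEN_BYTE_ARRAY",
        "logical_type": "DECIMAL",
        "decimal_precision": 38,
        "encodings": ["PLAIN", "RLE"],
        "compression": "SNAPPY",
    }
    for issue in check_column(column):
        print(f"{column['name']}: {issue}")
```

For a column that reports a DECIMAL issue, the list above suggests using double. For TIME_MILLIS or TIME_NANOS, it suggests TIME_MICROS.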
