Additional Functions

REGEXP_REPLACE(string, pattern, replacement)

Returns a copy of the given string where the regular expression pattern is replaced by the replacement string. This function is available for Text File, Hadoop Hive, Google BigQuery, PostgreSQL, Tableau Data Extract, Microsoft Excel, Salesforce, Vertica, Pivotal Greenplum, Teradata (version 14.1 and above), Snowflake, and Oracle data sources.

For Tableau data extracts, the pattern and the replacement must be constants.

For information on regular expression syntax, see your data source's documentation. For Tableau extracts, regular expression syntax conforms to the standards of the current International Components for Unicode (ICU), an open source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. See the Regular Expressions page in the online ICU User Guide.

Example

REGEXP_REPLACE('abc 123', '\s', '-') = 'abc-123'
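As a rough cross-check outside Tableau, the same substitution can be sketched in Python. This assumes Python's built-in re module, whose syntax overlaps with ICU for a simple pattern like this one but is not identical to it in general.

```python
import re

# Replace each whitespace character with a hyphen,
# mirroring REGEXP_REPLACE('abc 123', '\s', '-').
result = re.sub(r"\s", "-", "abc 123")
print(result)  # abc-123
```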

REGEXP_MATCH(string, pattern)

Returns true if a substring of the specified string matches the regular expression pattern. This function is available for Text File, Google BigQuery, PostgreSQL, Tableau Data Extract, Microsoft Excel, Salesforce, Vertica, Pivotal Greenplum, Teradata (version 14.1 and above), Impala 2.3.0 (through Cloudera Hadoop data sources), Snowflake, and Oracle data sources.

For Tableau data extracts, the pattern must be a constant.

For information on regular expression syntax, see your data source's documentation. For Tableau extracts, regular expression syntax conforms to the standards of the current International Components for Unicode (ICU), an open source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. See the Regular Expressions page in the online ICU User Guide.

Example

REGEXP_MATCH('-([1234].[The.Market])-', '\[\s*(\w*\.)(\w*\s*\])') = true
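To see which part of the string makes this example true, the pattern can be tried in Python. This is only a sketch: it assumes Python's re module behaves like the data source's regex engine for this pattern, which need not hold for every engine.

```python
import re

text = "-([1234].[The.Market])-"
pattern = r"\[\s*(\w*\.)(\w*\s*\])"

# re.search scans for the pattern anywhere in the string, which corresponds
# to the "substring matches" behavior described above.
match = re.search(pattern, text)
matched = match is not None
print(matched)         # True
print(match.group(0))  # [The.Market]
```

The attempt starting at '[1234]' fails because the pattern requires a literal dot right after the word characters, and ']' comes next instead. The match is found at '[The.Market]'.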

REGEXP_EXTRACT(string, pattern)

Returns the portion of the string that matches the regular expression pattern. This function is available for Text File, Hadoop Hive, Google BigQuery, PostgreSQL, Tableau Data Extract, Microsoft Excel, Salesforce, Vertica, Pivotal Greenplum, Teradata (version 14.1 and above), Snowflake, and Oracle data sources.

For Tableau data extracts, the pattern must be a constant.

For information on regular expression syntax, see your data source's documentation. For Tableau extracts, regular expression syntax conforms to the standards of the current International Components for Unicode (ICU), an open source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. See the Regular Expressions page in the online ICU User Guide.

Example

REGEXP_EXTRACT('abc 123', '[a-z]+\s+(\d+)') = '123'
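The example returns the text captured by the parenthesized group rather than the whole match. A minimal Python sketch of that behavior, assuming the re module as a stand-in for the data source's regex engine:

```python
import re

match = re.search(r"[a-z]+\s+(\d+)", "abc 123")
# group(1) is the text captured by the first parenthesized group.
extracted = match.group(1) if match else None
print(extracted)  # 123
```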

REGEXP_EXTRACT_NTH(string, pattern, index)

Returns the portion of the string matched by the nth capturing group of the regular expression pattern, where n is the given index. If index is 0, the entire string is returned. This function is available for Text File, PostgreSQL, Tableau Data Extract, Microsoft Excel, Salesforce, Vertica, Pivotal Greenplum, Teradata (version 14.1 and above), and Oracle data sources.

For Tableau data extracts, the pattern must be a constant.

For information on regular expression syntax, see your data source's documentation. For Tableau extracts, regular expression syntax conforms to the standards of the current International Components for Unicode (ICU), an open source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. See the Regular Expressions page in the online ICU User Guide.

Example

REGEXP_EXTRACT_NTH('abc 123', '([a-z]+)\s+(\d+)', 2) = '123'
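Selecting a group by index can be sketched in Python as follows. The helper name regexp_extract_nth is hypothetical, and returning None when nothing matches is an assumption of this sketch, not documented Tableau behavior.

```python
import re

def regexp_extract_nth(string, pattern, index):
    """Hypothetical helper: return capturing group `index`, or None if no match."""
    match = re.search(pattern, string)
    return match.group(index) if match else None

first = regexp_extract_nth("abc 123", r"([a-z]+)\s+(\d+)", 1)
second = regexp_extract_nth("abc 123", r"([a-z]+)\s+(\d+)", 2)
print(first)   # abc
print(second)  # 123
```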

Hadoop Hive Specific Functions

Note: Only the PARSE_URL and PARSE_URL_QUERY functions are available for Cloudera Impala data sources.

GET_JSON_OBJECT(JSON string, JSON path)

Returns the JSON object within the JSON string based on the JSON path.
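The lookup can be illustrated with a small Python sketch. It assumes the path uses '$' for the root, '.key' for a child, and '[n]' for an array index. The helper name, the sample document, and the choice to return None for a missing path are all assumptions made for illustration.

```python
import json
import re

def get_json_object(json_string, json_path):
    """Hypothetical helper: resolve a simple '$.key[0].key' style path."""
    if not json_path.startswith("$"):
        return None
    current = json.loads(json_string)
    # Each step is either a .key lookup or an [index] lookup.
    for key, index in re.findall(r"\.([^.\[\]]+)|\[(\d+)\]", json_path[1:]):
        try:
            current = current[key] if key else current[int(index)]
        except (KeyError, IndexError, TypeError):
            return None
    return current

# Made-up sample data, not taken from the documentation.
doc = '{"store": {"owner": "amy", "fruit": [{"type": "apple"}, {"type": "pear"}]}}'
print(get_json_object(doc, "$.store.owner"))          # amy
print(get_json_object(doc, "$.store.fruit[1].type"))  # pear
```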

PARSE_URL(string, url_part)

Returns a component of the given URL string where the component is defined by url_part. Valid url_part values include: 'HOST', 'PATH', 'QUERY', 'REF', 'PROTOCOL', 'AUTHORITY', 'FILE' and 'USERINFO'.

Example

PARSE_URL('http://www.tableau.com', 'HOST') = 'www.tableau.com'
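For comparison, Python's urllib.parse splits a URL into similar components. The correspondence suggested in the comments (HOST to hostname, PROTOCOL to scheme) is an assumption of this sketch, not a statement about how the data source implements PARSE_URL.

```python
from urllib.parse import urlparse

parts = urlparse("http://www.tableau.com")
print(parts.hostname)  # www.tableau.com  (roughly 'HOST')
print(parts.scheme)    # http             (roughly 'PROTOCOL')
```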

PARSE_URL_QUERY(string, key)

Returns the value of the specified query parameter in the given URL string. The query parameter is defined by the key.

Example

PARSE_URL_QUERY('http://www.tableau.com?page=1&cat=4', 'page') = '1'
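The same query-parameter lookup can be sketched with Python's urllib.parse. Taking the first value when a key repeats is an assumption of this sketch.

```python
from urllib.parse import urlparse, parse_qs

url = "http://www.tableau.com?page=1&cat=4"
# parse_qs maps each key to a list of values, since keys may repeat.
params = parse_qs(urlparse(url).query)
page_value = params["page"][0]
print(page_value)  # 1
```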

XPATH_BOOLEAN(XML string, XPath expression string)

Returns true if the XPath expression matches a node or evaluates to true.

Example

XPATH_BOOLEAN('<values> <value id="0">1</value><value id="1">5</value></values>', 'values/value[@id="1"] = 5') = true
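The check can be approximated in Python with xml.etree.ElementTree, which supports only a subset of XPath. In this sketch the comparison is done in Python rather than inside the XPath expression, the path is written relative to the root element, and the closing </values> tag is included because the parser requires well-formed XML.

```python
import xml.etree.ElementTree as ET

xml_string = '<values><value id="0">1</value><value id="1">5</value></values>'
root = ET.fromstring(xml_string)  # root is the <values> element

# Find the <value> whose id attribute is "1", then compare its text to 5.
node = root.find('value[@id="1"]')
result = node is not None and float(node.text) == 5
print(result)  # True
```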

XPATH_DOUBLE(XML string, XPath expression string)

Returns the double-precision floating-point value of the XPath expression.

Example

XPATH_DOUBLE('<values><value>1.0</value><value>5.5</value> </values>', 'sum(values/*)') = 6.5
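The numeric XPath functions in this section all reduce an expression to a number. A minimal Python sketch of the sum in this example follows. It iterates over the child elements directly because xml.etree.ElementTree does not evaluate XPath functions such as sum().

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<values><value>1.0</value><value>5.5</value></values>")

# Add up the text of every child element of <values> as a float.
total = sum(float(child.text) for child in root)
print(total)  # 6.5
```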

XPATH_FLOAT(XML string, XPath expression string)

Returns the single-precision floating-point value of the XPath expression.

Example

XPATH_FLOAT('<values><value>1.0</value><value>5.5</value> </values>', 'sum(values/*)') = 6.5

XPATH_INT(XML string, XPath expression string)

Returns the numerical value of the XPath expression, or zero if the XPath expression cannot evaluate to a number.

Example

XPATH_INT('<values><value>1</value><value>5</value> </values>', 'sum(values/*)') = 6

XPATH_LONG(XML string, XPath expression string)

Returns the numerical value of the XPath expression, or zero if the XPath expression cannot evaluate to a number.

Example

XPATH_LONG('<values><value>1</value><value>5</value> </values>', 'sum(values/*)') = 6

XPATH_SHORT(XML string, XPath expression string)

Returns the numerical value of the XPath expression, or zero if the XPath expression cannot evaluate to a number.

Example

XPATH_SHORT('<values><value>1</value><value>5</value> </values>', 'sum(values/*)') = 6

XPATH_STRING(XML string, XPath expression string)

Returns the text of the first matching node.

Example

XPATH_STRING('<sites ><url domain="org">http://www.w3.org</url> <url domain="com">http://www.tableau.com</url></sites>', 'sites/url[@domain="com"]') = 'http://www.tableau.com'
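A comparable lookup in Python, again using xml.etree.ElementTree and a path written relative to the root <sites> element. Returning an empty string when no node matches is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

xml_string = (
    '<sites><url domain="org">http://www.w3.org</url> '
    '<url domain="com">http://www.tableau.com</url></sites>'
)
root = ET.fromstring(xml_string)

# find() returns the first element matching the path, or None.
node = root.find('url[@domain="com"]')
text = node.text if node is not None else ""
print(text)  # http://www.tableau.com
```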

Google BigQuery Specific Functions

DOMAIN(string_url)

Given a URL string, returns the domain as a string.

Example

DOMAIN('http://www.google.com:80/index.html') = 'google.com'

GROUP_CONCAT(expression)

Concatenates values from each record into a single comma-delimited string. This function acts like a SUM() for strings.

Example

GROUP_CONCAT(Region) = "Central,East,West"
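The aggregation is analogous to joining a list of strings with commas. In this Python sketch the list of region values is hard-coded for illustration.

```python
# Values of the Region field across the records being aggregated (illustrative).
regions = ["Central", "East", "West"]
combined = ",".join(regions)
print(combined)  # Central,East,West
```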

HOST(string_url)

Given a URL string, returns the host name as a string.

Example

HOST('http://www.google.com:80/index.html') = 'www.google.com:80'
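In Python's urllib.parse, the closest counterpart to the result shown here is netloc, which keeps the port. The hostname attribute drops it. This is a sketch for comparison only, not a description of how BigQuery computes HOST.

```python
from urllib.parse import urlparse

parts = urlparse("http://www.google.com:80/index.html")
print(parts.netloc)    # www.google.com:80
print(parts.hostname)  # www.google.com
```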

LOG2(number)

Returns the logarithm base 2 of a number.

Example

LOG2(16) = 4.00
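The value can be checked with Python's math.log2, which returns a float:

```python
import math

value = math.log2(16)
print(value)  # 4.0
```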

LTRIM_THIS(string, string)

Returns the first string with any leading occurrence of the second string removed.

Example

LTRIM_THIS('[-Sales-]','[-') = 'Sales-]'

RTRIM_THIS(string, string)

Returns the first string with any trailing occurrence of the second string removed.

Example

RTRIM_THIS('[-Market-]','-]') = '[-Market'
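LTRIM_THIS and RTRIM_THIS can be sketched in Python as prefix and suffix removal. The helper names are hypothetical, and this sketch assumes that only a single leading or trailing occurrence is removed. The text above does not say what happens when the second string repeats.

```python
def ltrim_this(string, prefix):
    """Hypothetical helper: drop one leading copy of prefix, if present."""
    if prefix and string.startswith(prefix):
        return string[len(prefix):]
    return string

def rtrim_this(string, suffix):
    """Hypothetical helper: drop one trailing copy of suffix, if present."""
    if suffix and string.endswith(suffix):
        return string[:-len(suffix)]
    return string

print(ltrim_this("[-Sales-]", "[-"))   # Sales-]
print(rtrim_this("[-Market-]", "-]"))  # [-Market
```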

TIMESTAMP_TO_USEC(expression)

Converts a TIMESTAMP data type to a UNIX timestamp in microseconds.

Example

TIMESTAMP_TO_USEC(#2012-10-01 01:02:03#) = 1349053323000000

USEC_TO_TIMESTAMP(expression)

Converts a UNIX timestamp in microseconds to a TIMESTAMP data type.

Example

USEC_TO_TIMESTAMP(1349053323000000) = #2012-10-01 01:02:03#
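Both conversions can be verified with Python's datetime module. This sketch assumes the timestamp is interpreted as UTC, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def timestamp_to_usec(ts):
    """Hypothetical helper: whole microseconds elapsed since the UNIX epoch."""
    return (ts - EPOCH) // timedelta(microseconds=1)

def usec_to_timestamp(usec):
    """Hypothetical helper: inverse of timestamp_to_usec."""
    return EPOCH + timedelta(microseconds=usec)

ts = datetime(2012, 10, 1, 1, 2, 3, tzinfo=timezone.utc)
usec = timestamp_to_usec(ts)
print(usec)                     # 1349053323000000
print(usec_to_timestamp(usec))  # 2012-10-01 01:02:03+00:00
```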

TLD(string_url)

Given a URL string, returns the top-level domain plus any country domain in the URL.

Example

TLD('http://www.google.com:80/index.html') = '.com'

TLD('http://www.google.co.uk:80/index.html') = '.co.uk'
