SQL Dialects#

Data sources we connect to support different SQL languages, for example, Oracle provider supports PL/SQL that differs in many areas from TSQL that Querona supports. Because of this differences, Querona needs to be configured to convert queries from one SQL language dialect to another. SQL Dialects provide the functionality to map TSQL language elements between Querona’s SQL and provider SQL.

While Querona has a lot of dialects configured out-of-the-box, there are ADO.net, JDBC or ODBC drivers that are less popular and are not included with Querona. Users might also decide not to use a specific provider functionality or just do things differently.

For example, provider SQL can be missing an aggregate function like SUM(). In such case, Querona has to provide the implementation of SUM(). Even if function parity exists, thanks to the mapping configuration, we can decide what SQL functionality we’d like to push-down to the provider.

Note

If the feature-set of a provider SQL is different than TSQL, Querona fills the gaps using its in-memory function implementations or using a default federator.

SQL dialect in action#

Let’s assume that we have connected Querona to MySql and use a single table called Shop.Sales. We want to execute in Querona the following SQL statement, which gives us a sum of all sales:

SELECT SUM(Total)
  FROM Shop.Sales

Let’s assume that MySql dialect has a defined mapping of the SUM() function. The resulting SQL executed in MySql would look like this:

SELECT SUM(`Total`)
  FROM `Shop`.`Sales`

Notice that the SQL statement was rewritten to SQL format of MySql database server, along with the SUM() function. It means that the whole statement will execute in one step in MySql database and only results of that query will be transferred from MySql to Querona.

Now let’s assume that MySql dialect has not defined the mapping of the SUM() function. In such case, the execution of the statement will be different. Due to lack of SUM() function mapping, the execution plan will be a two-step. The first step is to get the required data from MySql and transfer it to Querona for processing of SUM() function.

The first SQL statement will look like that:

SELECT `Total`  -- notice the lack of SUM() function
  FROM `Shop`.`Sales`

The second step will consist of processing the data obtained in the first step, using a default federator - usually the built-in Apache Spark.

The SQL statement will be similar to the following:

SELECT sum('Total')
  FROM <temporary_object_name_holding_data_from_step_one>

SQL dialect configuration is an essential part of the query execution and optimization engine. It enables a lot of flexibility in many scenarios, such as:

  • Configuring a new provider without coding, even for less popular exotic data sources that Querona vendor does not have access or built-in support.

  • Configuring a dialect that does not depend on a given technology (ADO.net, JDBC, ODBC) or dialect for each technology.

  • Control over optimizer behavior of what SQL features should be pushed down to the provider. For example, a given functionality in provider may be buggy or perform poorly. In such case, we can disable it and rely on a better option provided by Querona.

Features of SQL dialects#

Querona supports many data-access technologies, such as ADO.Net, JDBC, ODBC, REST, and others. Each technology has its requirements, may support the same features in different ways, or support features not present anywhere else.

To find the common denominator, we introduced the following features to SQL dialects:

  • Connection string definition customization,

  • Schema analyzers with configuration,

  • Query features configuration,

  • SQL features configuration,

  • Type mappings,

  • Function mappings.

We will describe each of them in turn.

Note

SQL dialect configuration requires administrative privileges and is available in web GUI under Administer ‣ SQL Dialects. You can use the SQL dialect configuration to tweak the existing dialect definitions or create a new dialect.

Connection definition customization#

Connection definition provides means to set a connection string template, which will be used by a new connection wizard. Identifier quote character is also configurable.

Query features#

Query feature

Description

CROSS JOIN

True if data provider’s SQL supports CROSS JOIN clause.

EXCEPT

True if data provider’s SQL supports EXCEPT operator.

FROM with a subquery

True if data provider’s SQL FROM clause supports subqueries.

FULL JOIN

True if data provider’s SQL supports a FULL JOIN operator.

GROUP BY

True if data provider’s SQL supports GROUP BY clause.

GROUP BY CUBE

True if data provider’s SQL supports GROUP BY CUBE clause.

GROUP BY GROUPING SETS

True if data provider’s SQL supports GROUP BY GROUPING SETS clause.

GROUP BY ROLLUP

True if data provider’s SQL supports GROUP BY ROLLUP clause.

HAVING

True if data provider’s SQL supports HAVING clause.

INNER JOIN

True if data provider’s SQL supports INNER JOIN operator.

INTERSECT

True if data provider’s SQL supports INTERSECT operator.

LEFT JOIN

True if data provider’s SQL supports LEFT JOIN clause.

Multisource FROM

True if data provider’s SQL supports a multi-source FROM clause.

ORDER BY

True if data provider’s SQL supports ORDER BY clause.

OVER

True if data provider’s SQL supports OVER clause.

Paging

True if data provider’s SQL supports paging.

Parametrized queries

True if data provider’s SQL supports parametrized queries.

PARTITION BY inside OVER

True if data provider’s SQL supports PARTITION BY inside an OVER clause.

RIGHT JOIN

True if data provider’s SQL supports RIGHT JOIN clause.

SELECT…INTO

True if data provider’s SQL supports SELECT...INTO clause.

TOP

True if data provider’s SQL supports TOP clause and which variant: TOP(<n>), LIMIT <n> or ROWS <n>. To allow TOP clause to be pushed-down to a source provider Literal value and Column reference SQL features must be enabled.

UNION

True if data provider’s SQL supports UNION operator.

UNION ALL

True if data provider’s SQL supports UNION ALL operator.

SQL features#

SQL features configuration section allows you to map SQL from source to SQL features in Querona.

SQL feature

Operator

Description

Arithmetic: And

&

True if data provider’s SQL supports arithmetic AND operator.

Arithmetic: Divide

/

True if data provider’s SQL supports arithmetic DIVISION operator.

Arithmetic: Subtract

-

True if data provider’s SQL supports arithmetic SUBTRACTION and unary NEGATIVE operators. The unary NEGATIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values.

Arithmetic: Modulo

%

True if data provider’s SQL supports arithmetic MODULUS operator.

Arithmetic: Multiply

*

True if data provider’s SQL supports arithmetic MULTIPLICATION operator.

Arithmetic: Not

~

True if data provider’s SQL supports arithmetic NEGATION operator.

Arithmetic: Or

\

True if data provider’s SQL supports arithmetic OR operator.

Arithmetic: Plus

+

True if data provider’s SQL supports arithmetic ADDITION and UNARY POSITIVE operator. The unary POSITIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values.

Arithmetic: Xor

^

True if data provider’s SQL supports bitwise XOR operator.

Bitwise NOT expr.

~

True if data provider’s SQL supports bitwise NOT operator.

CAST

True if data provider’s SQL supports a CAST (expression AS data_type [ (length) ]) expression.

Column reference

True if data provider’s SQL supports column reference in expression.

Comparison: Equal

=

True if data provider’s SQL supports comparison EQUAL operator.

Comparison: Greater

>

True if data provider’s SQL supports comparison GREATER THAN operator.

Comparison: Greater or equal

>=

True if data provider’s SQL supports comparison GREATER THAN OR EQUAL operator.

Comparison: Less

<

True if data provider’s SQL supports comparison LESS THAN operator.

Comparison: Less or equal

<=

True if data provider’s SQL supports comparison LESS OR EQUAL THAN operator.

Comparison: Not equal

<>

True if data provider’s SQL supports comparison NOT EQUAL operator.

Concatenate strings

+

True if data provider’s SQL supports a string concatenation using + operator.

CONVERT

True if data provider’s SQL supports a CONVERT(data_type [ (length) ] , expression [ , style ] ) expression.

Date and time predefined value expr.

True if data provider’s SQL supports a predefined date value like in DATEPART(`month`, $1).

Function call

True if data provider’s SQL supports functions.

Literal value

True if data provider’s SQL supports literals.

Literal value in projection

True if data provider’s SQL supports literals in projection.

Negative expr.

True if data provider’s SQL supports unary negative operator.

Parameter reference

True if data provider’s SQL supports parameter references in expressions.

Predicate: AND

True if data provider’s SQL supports AND predicate.

Predicate: BETWEEN

True if data provider’s SQL supports BETWEEN predicate.

Predicate: EXISTS

True if data provider’s SQL supports EXISTS predicate.

Predicate: IN (expression)

True if data provider’s SQL supports a composite predicate IN <values>.

Predicate: IN (subquery)

True if data provider’s SQL supports a composite predicate IN (subquery).

Predicate: IS NULL

True if data provider’s SQL supports IS NULL predicate.

Predicate: LIKE

True if data provider’s SQL supports LIKE predicate.

Predicate: NOT

True if data provider’s SQL supports NOT predicate.

Predicate: OR

True if data provider’s SQL supports OR predicate.

Searched CASE

True if data provider’s SQL supports a Searched CASE expression, for example CASE WHEN case_operand = when_operand1 THEN result1 WHEN case_operand = when_operand2 THEN result2 ... END.

SELECT scalar expr.

True if data provider’s SQL supports a select scalar expression, which is a nested SELECT statement that returns one column and one row.

Simple CASE

True if data provider’s SQL supports a simple CASE expression.

Microsoft’s documentation#

Type mappings#

Type mappings section allows you to map data types from source to data types in Querona.

SQL type mapping column name

Scope

Description

System type

System

Data type in Querona, read only. Type system is compatible with Microsoft SQL Server 2012.

Target storage type

System

Data storage type in Querona.

Cast as

Provider

Data type that will be used to cast values to when preparing a query for provider.

Provider import types (CSV)

Provider

Comma-separated list of provider type names, including aliases.

Variable length

System

True if a provider data type has variable length.

Variable precision

System

True if a provider data type has variable precision.

Variable scale

System

True if a provider data type has variable scale.

For example, Teradata provider reports its supported binary data types are VARBYTE and BLOB. We store such data in Querona as VarBinary type. When we format a query to Teradata that uses any of types (VARBYTE, BLOB) we cast it to VARBYTE. The type mappings for the scenario above look like this:

SQL type mapping column name

Value

System type

VarBinary

Target storage type

VarBinary

Cast as

VARBYTE

Provider import types (CSV)

VARBYTE, BLOB

Variable length

True

Variable precision

False

Variable scale

False

Function mappings#

Function mappings section allows you to map provider functions to Querona functions. Left table column represents the Querona function declaration and is read-only. Right table column holds the function declaration supported by provider and is user-editable.

To map a function declaration find its name in the left column and enter the provider function declaration to the right column. Function parameters are represented by a dollar sign followed by one-based parameter ordinal number, for example, $3 represents the third parameter of Querona’s function.

Example, representative variants of function mappings:

Function name

Mapping to provider function

Comment

CEILING($1)

CEIL($1)

Querona function CEILING(parameter) has a different name in provider and maps to CEIL(parameter).

CHARINDEX($1, $2)

CHARINDEX($2, $1)

Querona function declaration maps to the same function name, but with reversed parameters ($1,$2)->($2,$1).

COUNT($1)

COUNT($1)

Querona function declaration maps to exactly the same function in provider.

COUNT_BIG($1)

<empty>

COUNT_BIG($1) function is not supported by provider and will be simulated by Querona.

YEAR($1)

DATEPART(year, $1)

YEAR($1) function does not directly exist in provider, but is substituted with DATEPART(year, $1).

QUOTENAME($1)

CONCAT (‘’’, REPLACE($1, ‘’’, CONCAT(‘’’,’’’)), ‘’’)

QUOTENAME($1) function does not exist in provider, but can be simulated with a compound expression.

Date/time function mappings#

Date/time function mappings section allows you to map all variants of the following functions:

Example variants of function mappings:

Function name

Mapping to provider function

Comment

DATEADD(day, $1, $2)

DATE_ADD($2, INTERVAL $1 day)

Querona function variant DATEADD(day, $1, $2) mapped to provider function DATE_ADD($2, INTERVAL $1 day).

DATEADD(nanosecond, $1, $2)

NANOSECOND($1)

Querona function variant DATEADD(nanosecond, $1, $2) mapped to provider function NANOSECOND($1).

Schema analyzers#

Schema analyzers are built-in, configurable adapters that read metadata from a specific provider and perform necessary steps to make the metadata importable to Querona.

We support the following analyzers:

  • Custom with restrictions,

  • Custom with Select,

  • JDBC.

Standard settings for all providers are:

Setting Name

Type of value

Description

Length field name

text

Column name in metadata table returned by the provider that holds length information. Usually a COLUMN_SIZE, COLUMNSIZE, CHARACTER_MAXIMUM_LENGTH.

Name field name

text

Column name in metadata table returned by the provider that holds column name. Usually a COLUMN_NAME.

Nullability check field return type

list of values

Type of value returned by the provider in the column designated in ‘Nullable field name’ setting. Supported values are Bool, Int, Text:Yes/No, Text:Y/N.

Nullable field name

text

Column name in metadata table returned by the provider that holds column nullability. Usually a NULLABLE, IS_NULLABLE.

Precision field name

text

Column name in metadata table returned by the provider that holds precision. Usually a NUM_PREC_RADIX, NUMERIC_PRECISION.

Scale field name

text

Column name in metadata table returned by the provider that holds scale. Usually a DECIMAL_DIGITS, NUMERIC_SCALE.

Type field name

text

Column name in metadata table returned by the provider that holds type name. Usually a TYPE_NAME, DATA_TYPE.

Field type retrieved by

list of values

Designates a method to extract type information from column designated in ‘Type field name’. Supported methods are Odbc code, Type.

Table’s metadata loading method

list of values

Supported methods are Simple object name restriction, Fully qualified object name restriction.

Table list loading method

list of values

Supported methods are With restrictions, Without restrictions.

Custom query for table’s metadata

text

Query to use to extract table’s metadata from the provider.

Divide Unicode column length by 2

true/false

True when column length returned from provider holds the length doubled for Unicode characters.

Use database name in restrictions

true/false

True when database name has to be used in restrictions. False otherwise.

Use schema name in restrictions

true/false

True when schema name has to be used in restrictions. False otherwise.

Use table name in restrictions

true/false

True when table name has to be used in restrictions. False otherwise.