SQL Dialects#
Data sources we connect to support different SQL languages, for example, Oracle provider supports PL/SQL that differs in many areas from TSQL that Querona supports. Because of this differences, Querona needs to be configured to convert queries from one SQL language dialect to another. SQL Dialects provide the functionality to map TSQL language elements between Querona’s SQL and provider SQL.
While Querona has a lot of dialects configured out-of-the-box, there are ADO.net, JDBC or ODBC drivers that are less popular and are not included with Querona. Users might also decide not to use a specific provider functionality or just do things differently.
For example, provider SQL can be missing an aggregate function like SUM()
.
In such case, Querona has to provide the implementation of SUM()
.
Even if function parity exists, thanks to the mapping configuration, we can decide what SQL functionality we’d like to push-down to the provider.
Note
If the feature-set of a provider SQL is different than TSQL, Querona fills the gaps using its in-memory function implementations or using a default federator.
SQL dialect in action#
Let’s assume that we have connected Querona to MySql and use a single table called Shop.Sales
.
We want to execute in Querona the following SQL statement, which gives us a sum of all sales:
SELECT SUM(Total)
FROM Shop.Sales
Let’s assume that MySql dialect has a defined mapping of the SUM()
function.
The resulting SQL executed in MySql would look like this:
SELECT SUM(`Total`)
FROM `Shop`.`Sales`
Notice that the SQL statement was rewritten to SQL format of MySql database server, along with the SUM()
function.
It means that the whole statement will execute in one step in MySql database
and only results of that query will be transferred from MySql to Querona.
Now let’s assume that MySql dialect has not defined the mapping of the SUM()
function.
In such case, the execution of the statement will be different.
Due to lack of SUM()
function mapping, the execution plan will be a two-step.
The first step is to get the required data from MySql and transfer it to Querona for processing of SUM()
function.
The first SQL statement will look like that:
SELECT `Total` -- notice the lack of SUM() function
FROM `Shop`.`Sales`
The second step will consist of processing the data obtained in the first step, using a default federator - usually the built-in Apache Spark.
The SQL statement will be similar to the following:
SELECT sum('Total')
FROM <temporary_object_name_holding_data_from_step_one>
SQL dialect configuration is an essential part of the query execution and optimization engine. It enables a lot of flexibility in many scenarios, such as:
Configuring a new provider without coding, even for less popular exotic data sources that Querona vendor does not have access or built-in support.
Configuring a dialect that does not depend on a given technology (ADO.net, JDBC, ODBC) or dialect for each technology.
Control over optimizer behavior of what SQL features should be pushed down to the provider. For example, a given functionality in provider may be buggy or perform poorly. In such case, we can disable it and rely on a better option provided by Querona.
Features of SQL dialects#
Querona supports many data-access technologies, such as ADO.Net, JDBC, ODBC, REST, and others. Each technology has its requirements, may support the same features in different ways, or support features not present anywhere else.
To find the common denominator, we introduced the following features to SQL dialects:
Connection string definition customization,
Schema analyzers with configuration,
Query features configuration,
SQL features configuration,
Type mappings,
Function mappings.
We will describe each of them in turn.
Note
SQL dialect configuration requires administrative privileges and is available in web GUI under
. You can use the SQL dialect configuration to tweak the existing dialect definitions or create a new dialect.Connection definition customization#
Connection definition provides means to set a connection string template, which will be used by a new connection wizard. Identifier quote character is also configurable.
Query features#
Query feature |
Description |
---|---|
CROSS JOIN |
True if data provider’s SQL supports |
EXCEPT |
True if data provider’s SQL supports |
FROM with a subquery |
True if data provider’s SQL |
FULL JOIN |
True if data provider’s SQL supports a |
GROUP BY |
True if data provider’s SQL supports |
GROUP BY CUBE |
True if data provider’s SQL supports |
GROUP BY GROUPING SETS |
True if data provider’s SQL supports |
GROUP BY ROLLUP |
True if data provider’s SQL supports |
HAVING |
True if data provider’s SQL supports |
INNER JOIN |
True if data provider’s SQL supports |
INTERSECT |
True if data provider’s SQL supports |
LEFT JOIN |
True if data provider’s SQL supports |
Multisource FROM |
True if data provider’s SQL supports a multi-source |
ORDER BY |
True if data provider’s SQL supports |
OVER |
True if data provider’s SQL supports |
Paging |
True if data provider’s SQL supports paging. |
Parametrized queries |
True if data provider’s SQL supports parametrized queries. |
PARTITION BY inside OVER |
True if data provider’s SQL supports |
RIGHT JOIN |
True if data provider’s SQL supports |
SELECT…INTO |
True if data provider’s SQL supports |
TOP |
True if data provider’s SQL supports |
UNION |
True if data provider’s SQL supports |
UNION ALL |
True if data provider’s SQL supports |
SQL features#
SQL features configuration section allows you to map SQL from source to SQL features in Querona.
SQL feature |
Operator |
Description |
---|---|---|
Arithmetic: And |
& |
True if data provider’s SQL supports arithmetic AND operator. |
Arithmetic: Divide |
/ |
True if data provider’s SQL supports arithmetic DIVISION operator. |
Arithmetic: Subtract |
- |
True if data provider’s SQL supports arithmetic SUBTRACTION and unary NEGATIVE operators. The unary NEGATIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values. |
Arithmetic: Modulo |
% |
True if data provider’s SQL supports arithmetic MODULUS operator. |
Arithmetic: Multiply |
* |
True if data provider’s SQL supports arithmetic MULTIPLICATION operator. |
Arithmetic: Not |
~ |
True if data provider’s SQL supports arithmetic NEGATION operator. |
Arithmetic: Or |
\ |
True if data provider’s SQL supports arithmetic OR operator. |
Arithmetic: Plus |
+ |
True if data provider’s SQL supports arithmetic ADDITION and UNARY POSITIVE operator. The unary POSITIVE operator can also be used to perform arithmetic operations on datetime and smalldatetime values. |
Arithmetic: Xor |
^ |
True if data provider’s SQL supports bitwise XOR operator. |
Bitwise NOT expr. |
~ |
True if data provider’s SQL supports bitwise NOT operator. |
CAST |
True if data provider’s SQL supports a |
|
Column reference |
True if data provider’s SQL supports column reference in expression. |
|
Comparison: Equal |
= |
True if data provider’s SQL supports comparison EQUAL operator. |
Comparison: Greater |
> |
True if data provider’s SQL supports comparison GREATER THAN operator. |
Comparison: Greater or equal |
>= |
True if data provider’s SQL supports comparison GREATER THAN OR EQUAL operator. |
Comparison: Less |
< |
True if data provider’s SQL supports comparison LESS THAN operator. |
Comparison: Less or equal |
<= |
True if data provider’s SQL supports comparison LESS OR EQUAL THAN operator. |
Comparison: Not equal |
<> |
True if data provider’s SQL supports comparison NOT EQUAL operator. |
Concatenate strings |
+ |
True if data provider’s SQL supports a string concatenation using + operator. |
CONVERT |
True if data provider’s SQL supports a |
|
Date and time predefined value expr. |
True if data provider’s SQL supports a predefined date value like in |
|
Function call |
True if data provider’s SQL supports functions. |
|
Literal value |
True if data provider’s SQL supports literals. |
|
Literal value in projection |
True if data provider’s SQL supports literals in projection. |
|
Negative expr. |
True if data provider’s SQL supports unary negative operator. |
|
Parameter reference |
True if data provider’s SQL supports parameter references in expressions. |
|
Predicate: AND |
True if data provider’s SQL supports |
|
Predicate: BETWEEN |
True if data provider’s SQL supports |
|
Predicate: EXISTS |
True if data provider’s SQL supports |
|
Predicate: IN (expression) |
True if data provider’s SQL supports a composite predicate |
|
Predicate: IN (subquery) |
True if data provider’s SQL supports a composite predicate |
|
Predicate: IS NULL |
True if data provider’s SQL supports |
|
Predicate: LIKE |
True if data provider’s SQL supports |
|
Predicate: NOT |
True if data provider’s SQL supports |
|
Predicate: OR |
True if data provider’s SQL supports |
|
Searched CASE |
True if data provider’s SQL supports a Searched |
|
SELECT scalar expr. |
True if data provider’s SQL supports a select scalar expression,
which is a nested |
|
Simple CASE |
True if data provider’s SQL supports a simple |
Microsoft’s documentation#
Type mappings#
Type mappings section allows you to map data types from source to data types in Querona.
SQL type mapping column name |
Scope |
Description |
---|---|---|
System type |
System |
Data type in Querona, read only. Type system is compatible with Microsoft SQL Server 2012. |
Target storage type |
System |
Data storage type in Querona. |
Cast as |
Provider |
Data type that will be used to cast values to when preparing a query for provider. |
Provider import types (CSV) |
Provider |
Comma-separated list of provider type names, including aliases. |
Variable length |
System |
True if a provider data type has variable length. |
Variable precision |
System |
True if a provider data type has variable precision. |
Variable scale |
System |
True if a provider data type has variable scale. |
For example, Teradata provider reports its supported binary data types are VARBYTE and BLOB. We store such data in Querona as VarBinary type. When we format a query to Teradata that uses any of types (VARBYTE, BLOB) we cast it to VARBYTE. The type mappings for the scenario above look like this:
SQL type mapping column name |
Value |
---|---|
System type |
VarBinary |
Target storage type |
VarBinary |
Cast as |
VARBYTE |
Provider import types (CSV) |
VARBYTE, BLOB |
Variable length |
True |
Variable precision |
False |
Variable scale |
False |
Function mappings#
Function mappings section allows you to map provider functions to Querona functions. Left table column represents the Querona function declaration and is read-only. Right table column holds the function declaration supported by provider and is user-editable.
To map a function declaration find its name in the left column and enter the provider function declaration to the right column. Function parameters are represented by a dollar sign followed by one-based parameter ordinal number, for example, $3 represents the third parameter of Querona’s function.
Example, representative variants of function mappings:
Function name |
Mapping to provider function |
Comment |
---|---|---|
CEILING($1) |
CEIL($1) |
Querona function CEILING(parameter) has a different name in provider and maps to CEIL(parameter). |
CHARINDEX($1, $2) |
CHARINDEX($2, $1) |
Querona function declaration maps to the same function name, but with reversed parameters ($1,$2)->($2,$1). |
COUNT($1) |
COUNT($1) |
Querona function declaration maps to exactly the same function in provider. |
COUNT_BIG($1) |
<empty> |
COUNT_BIG($1) function is not supported by provider and will be simulated by Querona. |
YEAR($1) |
DATEPART(year, $1) |
YEAR($1) function does not directly exist in provider, but is substituted with DATEPART(year, $1). |
QUOTENAME($1) |
CONCAT (‘’’, REPLACE($1, ‘’’, CONCAT(‘’’,’’’)), ‘’’) |
QUOTENAME($1) function does not exist in provider, but can be simulated with a compound expression. |
Date/time function mappings#
Date/time function mappings section allows you to map all variants of the following functions:
Example variants of function mappings:
Function name |
Mapping to provider function |
Comment |
---|---|---|
DATEADD(day, $1, $2) |
DATE_ADD($2, INTERVAL $1 day) |
Querona function variant DATEADD(day, $1, $2) mapped to provider function DATE_ADD($2, INTERVAL $1 day). |
DATEADD(nanosecond, $1, $2) |
NANOSECOND($1) |
Querona function variant DATEADD(nanosecond, $1, $2) mapped to provider function NANOSECOND($1). |
Schema analyzers#
Schema analyzers are built-in, configurable adapters that read metadata from a specific provider and perform necessary steps to make the metadata importable to Querona.
We support the following analyzers:
Custom with restrictions,
Custom with Select,
JDBC.
Standard settings for all providers are:
Setting Name |
Type of value |
Description |
---|---|---|
Length field name |
text |
Column name in metadata table returned by the provider that holds length information. Usually a COLUMN_SIZE, COLUMNSIZE, CHARACTER_MAXIMUM_LENGTH. |
Name field name |
text |
Column name in metadata table returned by the provider that holds column name. Usually a COLUMN_NAME. |
Nullability check field return type |
list of values |
Type of value returned by the provider in the column designated in ‘Nullable field name’ setting. Supported values are Bool, Int, Text:Yes/No, Text:Y/N. |
Nullable field name |
text |
Column name in metadata table returned by the provider that holds column nullability. Usually a NULLABLE, IS_NULLABLE. |
Precision field name |
text |
Column name in metadata table returned by the provider that holds precision. Usually a NUM_PREC_RADIX, NUMERIC_PRECISION. |
Scale field name |
text |
Column name in metadata table returned by the provider that holds scale. Usually a DECIMAL_DIGITS, NUMERIC_SCALE. |
Type field name |
text |
Column name in metadata table returned by the provider that holds type name. Usually a TYPE_NAME, DATA_TYPE. |
Field type retrieved by |
list of values |
Designates a method to extract type information from column designated in ‘Type field name’. Supported methods are Odbc code, Type. |
Table’s metadata loading method |
list of values |
Supported methods are Simple object name restriction, Fully qualified object name restriction. |
Table list loading method |
list of values |
Supported methods are With restrictions, Without restrictions. |
Custom query for table’s metadata |
text |
Query to use to extract table’s metadata from the provider. |
Divide Unicode column length by 2 |
true/false |
True when column length returned from provider holds the length doubled for Unicode characters. |
Use database name in restrictions |
true/false |
True when database name has to be used in restrictions. False otherwise. |
Use schema name in restrictions |
true/false |
True when schema name has to be used in restrictions. False otherwise. |
Use table name in restrictions |
true/false |
True when table name has to be used in restrictions. False otherwise. |