Instance configuration#

There are two modes of connecting to and working with Apache Spark:

  • using one or more locally managed instances,

  • using an existing cluster (e.g. Azure HDInsight).

We can configure and run as many instances as needed, as long as they use different listening ports.


To communicate with Spark, Querona uses a Driver: a Java application that acts as a proxy, relaying requests to and from Spark. Once started, the Driver listens for incoming connections from Querona.

Incoming requests are authenticated using an API key. When required, the Driver opens a reverse connection to Querona via TDS, using the authentication method set on the Spark connection.
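Since every instance must listen on a distinct port, it can help to verify that a chosen driver port is reachable (or still free) before configuring a connection. A minimal sketch in Python; the host and port in the example are assumptions, substitute your own:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical driver port; replace with the port configured for your instance.
    print(is_port_open("localhost", 5555))
```

If the check returns False for a port you intend to assign to a new instance, the port is free; if it returns False for a running instance's port, the Driver may not have started.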

Local instance configuration#

Navigate to Administer ‣ Spark Instances.

Main menu

Select the default Spark instance, usually found under a name like “Spark 3.x”.

Details menu

The following configuration settings can be set for a given instance:





A friendly name of the instance, e.g. “Spark 3”.


The port that the driver will open to communicate with Querona.



The protocol that the driver will use to exchange data with Querona. Possible values: Thrift, http


Spark dialect

The dialect version used to communicate with the driver. Possible values: 2.0, 3.0.

Driver API key

The security key used to authenticate incoming requests; it must match the key given when configuring the connection to Spark. Using the default value is discouraged.
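Because using the default API key is discouraged, you will want a strong random key. One generic way to generate one in Python (an illustration only; Querona does not mandate this particular format):

```python
import secrets


def generate_api_key(nbytes: int = 32) -> str:
    """Generate a URL-safe random API key with nbytes of entropy."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    # Print a fresh key to paste into the instance configuration.
    print(generate_api_key())
```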


Driver version

The version of the driver to be deployed to the instance; it should match the Spark version in use.


Delta Lake version

The version of the Delta Lake libraries to use. Each Delta Lake version is compatible with one or more Spark versions. Setting Auto instructs Querona to infer the latest compatible Delta Lake version from the Spark binaries; setting None disables loading of the Delta Lake libraries. For more information about Delta Lake and Spark compatibility, see the Delta Lake releases.

Default value: 3.0 (Spark 3.5)


The root directory of the Spark distribution in use.



The directory where the Spark configuration is stored.


Log directory

The directory where Spark log files are generated. No spaces are allowed in the path. See also: Enable Log Tracing


Spark master

Spark Master URL


Driver memory

Amount of memory to use for the driver process.


Executor memory

Amount of memory to use per executor process.


Total executor cores

The number of cores to use on each executor.



The value of the _JAVA_OPTIONS environment variable picked up by Spark. Note: setting the value too low, e.g. “-Xms1G”, might trigger the following startup error: “‘Initial’ is not recognized as an internal or external command”.


Sets whether the given instance should be started automatically together with Querona.



Sets whether the given instance should be kept alive even when the master Querona process is stopped.


Enable Log Tracing

Checking this flag will cause Querona to start tracing and forwarding Spark log entries. This is useful when using logging software such as SEQ, to keep all logs in one place.
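To make the Spark master, memory, and core settings concrete, here is a hypothetical filled-in configuration. All values are illustrative examples, not recommendations; common Spark master URLs include local[*] (run locally using all available cores) and spark://host:7077 (a standalone cluster master).

```
Spark master:           local[*]
Driver memory:          4g
Executor memory:        8g
Total executor cores:   8
_JAVA_OPTIONS:          -Xmx8G
```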


Advanced settings#

Querona stores a few configuration files in C:\ProgramData\Querona\conf\spark (by default):

  • Querona-specific HDFS configuration,

  • Querona-specific Hive configuration,

  • Querona configuration parameters,

  • Spark environment variables configuration.
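As an illustration, the Spark environment-variables file on Windows typically uses batch set syntax. A hypothetical fragment follows; the variable values are placeholders, so adjust them to your installation:

```
REM Hypothetical example; adjust values to your installation.
set SPARK_LOCAL_IP=127.0.0.1
set SPARK_LOG_DIR=C:\ProgramData\Querona\logs\spark
```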

Adding a managed instance#

Navigate to Administer ‣ Spark Instances.

To add a new local Spark instance:

  1. Click the Add button.

  2. Give your instance a unique name.

  3. Set the values of the required settings and click Save.

To access your new instance, you need to create a Connection configuration targeting it.