Product: TIBCO Spotfire®
Connecting Spotfire Data Science to Impala via JDBC
Follow the steps below to connect Spotfire Data Science to Impala. This example uses Apache Impala 2.2 with JDBC API version 4.1.
1. Download the Impala JDBC driver (a set of JAR files) from Cloudera, choosing the version that matches your environment: http://www.cloudera.com/downloads/connectors/impala/jdbc/2-5-5.html. Copy the JAR files to both the $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc_driver/Public and $CHORUS_HOME/shared/libraries directories, then change the ownership of the copies to the user who runs Spotfire Data Science (usually 'chorus').
2. Create a new directory for Impala, $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc/impala, and copy the driver.properties file from the $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc/default directory into it.
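Steps 1 and 2 can be scripted. The following is a minimal Ruby sketch, not an official installer: the method name stage_impala_driver and its parameters are hypothetical, and it assumes the driver JAR files have already been downloaded and unpacked locally. It uses only Ruby's standard FileUtils module.

```ruby
require 'fileutils'

# Hypothetical helper (not part of Spotfire Data Science): copies the Impala
# driver JARs into both driver directories (step 1) and seeds the impala
# directory with a copy of the default driver.properties (step 2).
def stage_impala_driver(chorus_home, jar_paths, owner: nil)
  repo = File.join(chorus_home, 'shared', 'ALPINE_DATA_REPOSITORY')
  jar_dirs = [
    File.join(repo, 'jdbc_driver', 'Public'),
    File.join(chorus_home, 'shared', 'libraries')
  ]

  # Step 1: copy every JAR into both directories.
  jar_copies = []
  jar_dirs.each do |dir|
    FileUtils.mkdir_p(dir) # no-op when the directory already exists
    jar_paths.each do |jar|
      target = File.join(dir, File.basename(jar))
      FileUtils.cp(jar, target)
      jar_copies << target
    end
  end

  # Changing ownership normally requires root, so it is optional here.
  FileUtils.chown(owner, nil, jar_copies) if owner

  # Step 2: create jdbc/impala and copy the default driver.properties into it.
  impala_dir = File.join(repo, 'jdbc', 'impala')
  FileUtils.mkdir_p(impala_dir)
  props = File.join(impala_dir, 'driver.properties')
  FileUtils.cp(File.join(repo, 'jdbc', 'default', 'driver.properties'), props)

  { jars: jar_copies, properties: props }
end
```

A call might look like stage_impala_driver(ENV.fetch('CHORUS_HOME'), Dir.glob('/tmp/impala_jdbc/*.jar'), owner: 'chorus'), where /tmp/impala_jdbc is a placeholder for wherever the JAR files were unpacked.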
3. Edit the $CHORUS_HOME/shared/ALPINE_DATA_REPOSITORY/jdbc/impala/driver.properties file so that it sets the Impala driver class:
# Specify the JDBC class driver for the desired database type.
# Examples:
# Oracle = oracle.jdbc.driver.OracleDriver
# Greenplum = org.postgresql.Driver
# DB2 = com.ibm.db2.jcc.DB2Driver
# Netezza = org.netezza.Driver
# PostgreSQL = org.postgresql.Driver
# SQLServer = com.microsoft.sqlserver.jdbc.SQLServerDriver
# MySQL = com.mysql.jdbc.Driver
# Teradata = com.teradata.jdbc.TeraDriver
# Vertica = com.vertica.jdbc.Driver
# Sybase = com.sybase.jdbc2.jdbc.SybDriver
# Informix = com.informix.jdbc.IfxDriver
# SAPDB = com.sap.dbtech.jdbc.DriverSapDB
# InterBase = interbase.interclient.Driver
# HSqlDB = org.hsqldb.jdbcDriver
# MariaDB = org.mariadb.jdbc.Driver
# MySQL = com.mysql.jdbc.Driver
# Make sure to use your specific JDBC API Version
driverClass = com.cloudera.impala.jdbc41.Driver
# Add this so that double quotes can be used
identifierQuotation=
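One way to confirm the edit is to read the file back. The sketch below is illustrative only: read_driver_properties is a hypothetical helper that handles just the simple "key = value" lines and "#" comments shown above, not the full Java properties syntax.

```ruby
# Hypothetical helper: parses simple "key = value" lines, skipping blank
# lines and "#" comments. It does not implement the full Java .properties
# syntax (escapes, line continuations, ":" separators).
def read_driver_properties(text)
  text.each_line.with_object({}) do |line, props|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split('=', 2)
    next if value.nil? # not a key/value line
    props[key.strip] = value.strip
  end
end

sample = <<~PROPS
  # Make sure to use your specific JDBC API Version
  driverClass = com.cloudera.impala.jdbc41.Driver
  identifierQuotation=
PROPS

props = read_driver_properties(sample)
puts props['driverClass']                # prints com.cloudera.impala.jdbc41.Driver
puts props['identifierQuotation'].empty? # prints true
```

To check the real file, pass File.read of the driver.properties path from step 3 instead of the sample text.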
4. Edit the additional_jdbc_drivers.rb file (its path is similar to /usr/local/chorus/releases/5.9.1.0.3973-5d95f7c97/components/core/app/mixins/sequel/extensions/additional_jdbc_drivers.rb) and add an entry for the Impala driver class, so that the content looks similar to this:
module Sequel
  module AdditionalJdbcDrivers
    MAP = {
      mariadb: ->(db) { org.mariadb.jdbc.Driver },
      teradata: ->(db) { com.teradata.jdbc.TeraDriver },
      vertica: ->(db) { com.vertica.jdbc.Driver },
      hive2: ->(db) { org.apache.hive.jdbc.HiveDriver },
      hive: ->(db) { org.apache.hadoop.hive.jdbc.HiveDriver },
      impala: ->(db) { com.cloudera.impala.jdbc41.Driver }
    }
    MAP.each do |key, driver|
      ::Sequel::JDBC::DATABASE_SETUP[key] = driver
    end
  end
end
Note: The change to the additional_jdbc_drivers.rb file must be reapplied after every Spotfire Data Science upgrade.
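Because the change has to be repeated after each upgrade, it can help to check which installed releases still lack the Impala entry. The following sketch is an assumption-laden illustration: drivers_files_missing_impala is a hypothetical helper, and the relative path is taken from the example path in step 4, so it may differ in your installation.

```ruby
# Relative location of the file inside a release directory, taken from the
# example path in step 4; your installation may differ.
DRIVERS_FILE = File.join('components', 'core', 'app', 'mixins', 'sequel',
                         'extensions', 'additional_jdbc_drivers.rb')

# Hypothetical helper: returns the additional_jdbc_drivers.rb files under
# <chorus_root>/releases/* that do not yet contain an "impala:" entry.
def drivers_files_missing_impala(chorus_root)
  pattern = File.join(chorus_root, 'releases', '*', DRIVERS_FILE)
  Dir.glob(pattern).sort.reject do |path|
    # A line that starts (after indentation) with "impala: ->" counts as present.
    File.read(path).match?(/^\s*impala:\s*->/)
  end
end
```

For example, drivers_files_missing_impala('/usr/local/chorus') returns an empty array when every release found under that root already has the entry.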
5. Restart Spotfire Data Science and create the data connection using a URL similar to the following (you can copy your Impala connection URL):
jdbc:impala://myServer:21050
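The impala part of this URL should match the impala: key added in step 4, on the assumption that the driver is looked up by the JDBC subprotocol name. The sketch below only illustrates that relationship with a hypothetical helper, jdbc_subprotocol; it is not the lookup code used by Sequel or Spotfire Data Science.

```ruby
# Hypothetical helper: extracts the subprotocol (the part between "jdbc:"
# and the next ":") from a JDBC URL, or returns nil if the URL has no such part.
def jdbc_subprotocol(url)
  match = /\Ajdbc:([^:]+):/.match(url)
  match && match[1].to_sym
end

# Keys from the MAP in step 4.
registered = [:mariadb, :teradata, :vertica, :hive2, :hive, :impala]

key = jdbc_subprotocol('jdbc:impala://myServer:21050')
puts key.inspect              # prints :impala
puts registered.include?(key) # prints true
```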
6. Make sure that the MEM_LIMIT setting in your Impala configuration allows adequate memory. For more information, see: http://www.cloudera.com/documentation/enterprise/5-5-x/topics/impala_config_options.html#config_options