Date Posted:
Product: TIBCO Spotfire®
Problem:
Hadoop - Native Connection from Alpine 6.2 to Hive on Kerberized CDH 5.x
Solution:
Native Connection from Alpine 6.2 to Hive on Kerberized CDH 5.x
To create a native connection from Alpine 6.2 to Hive on a Kerberized CDH 5.x cluster, go through the following steps:
1. Make sure that the Hive hostname is properly configured in the /etc/hosts file of the Alpine server, for example as shown below.
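A minimal example of such entries, using the hostnames of this example cluster. The IP addresses below are placeholders and must be replaced with the real addresses of your cluster nodes:
192.0.2.11   cm.alpinenow.local    cm
192.0.2.12   nn1.alpinenow.local   nn1
192.0.2.13   nn2.alpinenow.local   nn2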
2. Add a new data connection - Hadoop Hive (see the attached screenshot).
3. Configure the parameters on the first page accordingly (see the attached screenshot).
4. Then configure the additional parameters (the values below are taken from the example alpinenow.local cluster; adjust them to match your environment):
alpine.principal=alpine/chorus.alpinenow.local@ALPINENOW.LOCAL
alpine.keytab=/home/chorus/keytab/alpine.keytab
mapreduce.jobhistory.address=nn2.alpinenow.local:10020
hive.hiveserver2.uris=jdbc:hive2://cm.alpinenow.local:10000/default
hive.metastore.kerberos.principal=hive/_HOST@ALPINENOW.LOCAL
hive.server2.authentication.kerberos.principal=hive/_HOST@ALPINENOW.LOCAL
hive.metastore.client.connect.retry.delay=1
hive.metastore.client.socket.timeout=600
dfs.client.failover.proxy.provider.nameservice1=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.datanode.kerberos.principal=hdfs/_HOST@ALPINENOW.LOCAL
dfs.ha.automatic-failover.enabled.nameservice1=true
dfs.ha.namenodes.nameservice1=namenode64,namenode72
dfs.namenode.http-address.nameservice1.namenode64=nn1.alpinenow.local:50070
dfs.namenode.http-address.nameservice1.namenode72=nn2.alpinenow.local:50070
dfs.namenode.https-address.nameservice1.namenode64=nn1.alpinenow.local:50470
dfs.namenode.https-address.nameservice1.namenode72=nn2.alpinenow.local:50470
dfs.namenode.kerberos.principal=hdfs/_HOST@ALPINENOW.LOCAL
dfs.namenode.rpc-address.nameservice1.namenode64=nn1.alpinenow.local:8020
dfs.namenode.rpc-address.nameservice1.namenode72=nn2.alpinenow.local:8020
dfs.namenode.servicerpc-address.nameservice1.namenode64=nn1.alpinenow.local:8022
dfs.namenode.servicerpc-address.nameservice1.namenode72=nn2.alpinenow.local:8022
dfs.nameservices=nameservice1
ha.zookeeper.quorum=cm.alpinenow.local:2181,nn1.alpinenow.local:2181,nn2.alpinenow.local:2181
hadoop.rpc.protection=authentication
hadoop.security.authentication=kerberos
mapreduce.jobhistory.principal=mapred/_HOST@ALPINENOW.LOCAL
mapreduce.jobhistory.webapp.address=nn2.alpinenow.local:19888
yarn.app.mapreduce.am.staging-dir=/tmp/hadoop-yarn/staging
yarn.resourcemanager.admin.address=nn1.alpinenow.local:8033
yarn.resourcemanager.principal=yarn/_HOST@ALPINENOW.LOCAL
yarn.resourcemanager.resource-tracker.address=nn1.alpinenow.local:8031
yarn.resourcemanager.scheduler.address=nn1.alpinenow.local:8030
hive.server2.enable.doAs=true
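Once the parameters are saved, it can be useful to confirm from the Alpine server that the keytab and the HiveServer2 URI configured above actually work. A minimal check, assuming the Kerberos client utilities and the beeline client are installed on the Alpine server (both assumptions, not requirements of Alpine itself):
# Obtain a ticket with the Alpine keytab and principal from step 4
kinit -kt /home/chorus/keytab/alpine.keytab alpine/chorus.alpinenow.local@ALPINENOW.LOCAL
# Connect to HiveServer2 using the Hive service principal
beeline -u "jdbc:hive2://cm.alpinenow.local:10000/default;principal=hive/cm.alpinenow.local@ALPINENOW.LOCAL"
If beeline can connect and run "show databases;", the Kerberos and HiveServer2 settings used by Alpine are consistent with the cluster.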
5. If high availability for the Resource Manager (RM HA) is enabled on the cluster, add the following additional parameter, with a comma-separated list of Resource Manager hostnames as its value, to support RM HA from the Alpine side:
failover_resource_manager_hosts=nn1.alpinenow.local,nn2.alpinenow.local
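For reference, the Resource Manager HA properties to look for in yarn-site.xml typically include the ones below. The property names are standard YARN HA settings; the values shown are only illustrative, based on this example cluster, and must be copied from your own yarn-site.xml:
yarn.resourcemanager.ha.enabled=true
yarn.resourcemanager.ha.rm-ids=rm1,rm2
yarn.resourcemanager.hostname.rm1=nn1.alpinenow.local
yarn.resourcemanager.hostname.rm2=nn2.alpinenow.local
yarn.resourcemanager.address.rm1=nn1.alpinenow.local:8032
yarn.resourcemanager.address.rm2=nn2.alpinenow.local:8032
yarn.resourcemanager.webapp.address.rm1=nn1.alpinenow.local:8088
yarn.resourcemanager.webapp.address.rm2=nn2.alpinenow.local:8088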
Note: The failover_resource_manager_hosts parameter needs to be combined with other parameters if SSL is enabled for the Resource Manager. In that case, find and add all of the Resource Manager HA parameters from the yarn-site.xml file. For more information, see the page "Connecting Alpine to a cluster with Resource Manager High Availability enabled".
6. If "data in transit" encryption is enabled and set up with the following Hadoop parameters on the cluster side:
dfs.encrypt.data.transfer=true
dfs.data.transfer.protection=Privacy
hadoop.rpc.protection=Privacy
dfs.encrypt.data.transfer.algorithm=AES/CTR/NoPadding
dfs.encrypt.data.transfer.cipher.key.bitlength=256
then add these two lines to the Alpine connection's additional parameters list:
dfs.data.transfer.protection=privacy
hadoop.rpc.protection=privacy
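As a quick sanity check of the encrypted-transport settings, you can verify that HDFS is still reachable from the Alpine server once these parameters are in place. A minimal check, assuming the Hadoop client tools are installed on the Alpine server (an assumption, not a requirement of Alpine itself):
# Obtain a ticket with the Alpine keytab and principal from step 4
kinit -kt /home/chorus/keytab/alpine.keytab alpine/chorus.alpinenow.local@ALPINENOW.LOCAL
# List a directory through the HA nameservice; this exercises RPC with hadoop.rpc.protection=privacy
hadoop fs -ls hdfs://nameservice1/tmp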