Connect a standalone Indexima cluster to a kerberized instance
Kerberos configuration on Indexima cluster
Install the Kerberos client packages on each node. For example, on CentOS:
yum install krb5-workstation
Edit the /etc/krb5.conf file:
- Modify the default domain
- Set the admin_server and kdc locations
- Possible problems:
  - Comment out the renew_lifetime parameter
  - Modify default_ccache_name so that it uses the /tmp directory instead of the keyring
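The edits listed above could look like the following sketch. The realm EXAMPLE.COM, the host kdc.example.com, and the scratch path are placeholders assumed for illustration, not values from this guide. The file is written to a scratch location so it can be reviewed before it replaces /etc/krb5.conf.

```shell
#!/usr/bin/env bash
# Sketch only: EXAMPLE.COM and kdc.example.com are placeholder values.
# Write to a scratch file first; review it, then copy it to /etc/krb5.conf.
KRB5_SKETCH="${KRB5_SKETCH:-/tmp/krb5.conf.sketch}"

cat > "$KRB5_SKETCH" <<'EOF'
[libdefaults]
  default_realm = EXAMPLE.COM
# renew_lifetime is commented out (see "Possible problems")
# renew_lifetime = 7d
# File-based ticket cache under /tmp instead of the kernel keyring
  default_ccache_name = FILE:/tmp/krb5cc_%{uid}

[realms]
  EXAMPLE.COM = {
    kdc = kdc.example.com
    admin_server = kdc.example.com
  }

[domain_realm]
  .example.com = EXAMPLE.COM
  example.com = EXAMPLE.COM
EOF

echo "Wrote $KRB5_SKETCH"
```

The heredoc delimiter is quoted so that the shell leaves `%{uid}` untouched; the Kerberos library expands it at run time.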
HDFS configuration
You will have to choose a user and a keytab to connect to your kerberized cluster. This user needs to be declared as a proxy user in your HDFS configuration.
After modification, you will need to restart your HDFS cluster.
Example with impala as the proxy user:
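As a sketch of such a declaration, the helper below prints Hadoop's standard proxy-user properties (`hadoop.proxyuser.<user>.hosts` and `hadoop.proxyuser.<user>.groups`), which normally go into core-site.xml. The function name `proxy_user_properties` is hypothetical, and the `*` values are permissive placeholders that you would likely want to narrow to specific hosts and groups.

```shell
# Sketch only: prints the two standard Hadoop proxy-user properties
# for the given user. "*" is a permissive placeholder value.
proxy_user_properties() {
  local user="$1"
  cat <<EOF
<property>
  <name>hadoop.proxyuser.${user}.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.${user}.groups</name>
  <value>*</value>
</property>
EOF
}

# Print the fragment for the impala user
proxy_user_properties impala
```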
Galactica configuration modification
You can either create a dedicated keytab for indexima on your kerberized cluster or reuse an existing keytab, such as the one for the user chosen in the HDFS configuration step.
You need to copy this keytab to each machine of the Indexima cluster.
Create a jaas.conf file containing a login module entry flagged as required.
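The exact contents of the jaas.conf file are not preserved here, so the following is only a sketch built from the standard options of the JDK's Krb5LoginModule. The entry name `Client`, the keytab path, the principal, and the scratch output path are all assumptions; check which entry name your driver expects.

```shell
#!/usr/bin/env bash
# Sketch only: entry name, keytab path and principal are placeholders.
JAAS_SKETCH="${JAAS_SKETCH:-/tmp/jaas.conf.sketch}"

cat > "$JAAS_SKETCH" <<'EOF'
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/path/to/indexima.keytab"
  principal="indexima@EXAMPLE.COM"
  storeKey=true
  doNotPrompt=true;
};
EOF

echo "Wrote $JAAS_SKETCH"
```

A JVM usually locates such a file through the `java.security.auth.login.config` system property.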
Add the following line
Additional actions to connect to a Kerberized Impala
Run kinit manually
Depending on your Impala driver version, you may need to run kinit manually as the user you chose for connecting to your Impala cluster.
kinit -kt ... (specify your user and keytab)
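A small wrapper can make this step less error-prone. This is a sketch: the function name `kinit_with_keytab` is hypothetical, and the keytab path and principal in the usage comment are placeholders to replace with your own.

```shell
# Sketch only: the caller supplies the keytab path and the principal.
kinit_with_keytab() {
  local keytab="$1" principal="$2"
  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab" >&2
    return 1
  fi
  # -k: authenticate with a keytab, -t: path to that keytab
  kinit -kt "$keytab" "$principal" || return 1
  # List the ticket cache so the new ticket can be checked
  klist
}

# Usage (placeholder values):
#   kinit_with_keytab /path/to/user.keytab user@EXAMPLE.COM
```

Checking that the keytab is readable first separates a file-permission problem from a Kerberos authentication failure.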
Table creation from Impala
create table from_impala from my_impala_table
IN 'jdbc:impala://impala_server_adress:impalaPort;AuthMech=1;KrbRealm=XXX.COM;KrbHostFQDN=ip-FQDN-adress-;KrbServiceName=impala'
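The connection string combines several Kerberos-related driver properties. With the Cloudera Impala JDBC driver, `AuthMech=1` selects Kerberos authentication, `KrbRealm` is the Kerberos realm, `KrbHostFQDN` is the fully qualified name of the Impala host, and `KrbServiceName` is the service principal name. The sketch below assembles such a URL; the function name `impala_jdbc_url`, the host names, and the realm are placeholders, and 21050 is assumed as the port because it is the usual Impala JDBC default.

```shell
# Sketch only: builds a Kerberos-enabled Impala JDBC URL from its parts.
impala_jdbc_url() {
  local host="$1" port="$2" realm="$3" host_fqdn="$4"
  printf 'jdbc:impala://%s:%s;AuthMech=1;KrbRealm=%s;KrbHostFQDN=%s;KrbServiceName=impala\n' \
    "$host" "$port" "$realm" "$host_fqdn"
}

# Placeholder values for illustration
impala_jdbc_url impala01.example.com 21050 EXAMPLE.COM impala01.example.com
```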
Table load from Impala
You can load data from Impala with a JDBC load, but an HDFS load is more efficient:
load data inpath 'hdfs://ipadress:8020/user/hive/warehouse/xxx' into table from_impala format parquet
Help for debug purposes
impala-shell installation
This section is not mandatory but may be useful for debugging purposes.
yum install python-pip gcc gcc-c++ cyrus-sasl-devel
pip install impala-shell
Try to connect to the remote Impala instance
kinit -kt ... (your keytab) ...
impala-shell -k
Check if you can browse tables and data from Impala.
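For a non-interactive check, impala-shell can run a single query and exit. In this sketch, `-k` enables Kerberos, `-i` names the impalad to connect to, and `-q` passes the query. The function name `check_impala` and the host in the usage comment are placeholders, and a valid ticket from kinit is assumed to exist already.

```shell
# Sketch only: runs one query against a kerberized impalad and exits.
check_impala() {
  local impalad="$1"   # host:port of an impalad
  impala-shell -k -i "$impalad" -q 'show tables'
}

# Usage (placeholder host; 21000 is the usual impala-shell port):
#   check_impala impala01.example.com:21000
```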