Data Catalog Integration
Indexima integrates with the Apache Atlas data catalog and with any data catalog that exposes the same interface as Atlas. This integration lets an organization's metadata management and governance tools build a catalog of the data assets connected to Indexima.
Installation & Deployment
Download and deploy the JAR file
To use the Apache Atlas integration, the Atlas library needs to be deployed to each Indexima node.
Download the appropriate version of the library, indexima-atlas-lib-[VERSION].jar, from https://download.indexima.com/release and deploy it to the /galactica/atlas directory on each Indexima node.
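The deployment step above can be scripted when the cluster has more than a few nodes. The following is a minimal sketch, assuming SSH access to every node; the function name, the DRY_RUN switch, and the host names in the usage note are hypothetical, and the mkdir call is only a precaution in case the target directory does not exist yet.

```shell
#!/bin/sh
# Sketch: copy the Atlas library to /galactica/atlas on every Indexima node.
# Assumes passwordless SSH to each node; the function name and DRY_RUN are illustrative.

# deploy_atlas_lib JAR NODE...
# With DRY_RUN=1 the commands are printed instead of executed.
deploy_atlas_lib() {
    jar="$1"
    shift
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $node mkdir -p /galactica/atlas"
            echo "scp $jar $node:/galactica/atlas/"
        else
            # Create the target directory if needed, then copy the JAR
            ssh "$node" mkdir -p /galactica/atlas || return 1
            scp "$jar" "$node:/galactica/atlas/" || return 1
        fi
    done
}
```

A call might look like `deploy_atlas_lib indexima-atlas-lib-[VERSION].jar node1 node2`, with [VERSION] and the node names replaced by your own values. Setting DRY_RUN=1 first shows what would be executed without touching the nodes.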
Configure file atlas-application.properties
Please refer to atlas-application.properties to configure this file.
Restart the Indexima service after adding the Atlas library and the atlas-application.properties file so that the service loads Atlas.
Authentication Configuration
Indexima can authenticate to Atlas with the FILE, LDAP, or KERBEROS authentication mechanism.
Rights
Whichever user connects to Atlas (a dedicated user or the Indexima user that runs the application), make sure this user has WRITE rights in Atlas.
FILE and LDAP authentication
For FILE and LDAP authentication, the user that connects to Atlas is provided with the parameters atlas.user and atlas.password.
The parameter atlas.enable activates or deactivates the Atlas integration (see galactica.conf).
Example of Atlas activation with a FILE or LDAP connection: execute the following commands in the Indexima console:
SET_ atlas.user=[ATLAS_ADMIN_USER];
SET_ atlas.password=[ATLAS_PASSWORD];
SET_ atlas.enable=true;
As with any dynamic parameter, dynamically set Atlas parameters are stored in the warehouse, in the galactica_ext.conf file. The Atlas password is automatically encrypted when set.
Please note that any change to atlas.user or atlas.password must be followed by setting atlas.enable=true again for the change to take effect.
Note: If the atlas parameters are added directly in galactica.conf (not recommended for dynamic parameters), the atlas password must not be encrypted.
Kerberos authentication
For Kerberos authentication, first add atlas.authentication.method.kerberos=true to the atlas-application.properties file. The Atlas integration is then enabled with the following command in the Indexima console:
SET_ atlas.enable=true;
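Editing atlas-application.properties by hand on every node is error-prone, so the Kerberos property can also be set with a small helper. This is a sketch only: set_property is a hypothetical helper, not an Indexima tool, and it assumes the file uses plain key=value lines with no spaces around the equals sign.

```shell
#!/bin/sh
# Sketch: set one key in a Java-style properties file (hypothetical helper).

# set_property FILE KEY VALUE
# Drops any existing "KEY=..." line, then appends the new definition.
set_property() {
    file="$1"
    key="$2"
    value="$3"
    tmp="$file.tmp.$$"
    touch "$file"
    # Keep every line that does not start with "KEY=" (literal match, no regex)
    awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$file"
}
```

For example, `set_property /path/to/atlas-application.properties atlas.authentication.method.kerberos true`, where the path is a placeholder for wherever the file is deployed on the node. The Indexima service still needs a restart to pick up changes to this file.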
Data Catalog Feed
Initialization
To initialize Atlas right after enabling the integration, run ./start-node.sh --import-atlas on an Indexima node to send all objects that already exist to Atlas.
Operations captured
Once the Apache Atlas integration is enabled, any creation of an object in Indexima triggers a call that propagates this object to Atlas. The object's metadata, author, creation timestamp, and lineage are available in Atlas. Indexima objects are modeled as standard Hive objects in Atlas, as described in https://atlas.apache.org/#/HookHive.
The following Hive operations are currently captured:
create database/table/view
alter database/table/view
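Because Indexima objects appear as standard Hive objects, one way to check that the feed works is to query the Atlas basic search REST endpoint for hive_db or hive_table entities. The sketch below assumes FILE or LDAP credentials accepted over HTTP basic authentication; the function names, the ATLAS_USER and ATLAS_PASSWORD environment variables, and the host in the usage note are illustrative.

```shell
#!/bin/sh
# Sketch: list entities of a given type through the Atlas v2 basic search API.
# ATLAS_USER / ATLAS_PASSWORD are hypothetical environment variable names.

# atlas_search_url BASE_URL TYPE_NAME [LIMIT]
# Builds the search URL; a trailing slash on BASE_URL is dropped.
atlas_search_url() {
    printf '%s/api/atlas/v2/search/basic?typeName=%s&limit=%s\n' \
        "${1%/}" "$2" "${3:-25}"
}

# atlas_list_entities BASE_URL TYPE_NAME
# Prints the JSON search result, or fails if Atlas returns an HTTP error.
atlas_list_entities() {
    curl --silent --fail \
        --user "${ATLAS_USER}:${ATLAS_PASSWORD}" \
        --header 'Accept: application/json' \
        "$(atlas_search_url "$1" "$2")"
}
```

For instance, `atlas_list_entities http://atlas.example.com:21000 hive_table` should return the tables created in Indexima once the integration is enabled and the initial import has run.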