General requirements
System and sizing requirements
Indexima cluster nodes
Supported operating systems: CentOS, Amazon Linux, Debian, Red Hat, openSUSE, Ubuntu, Windows Server 2016+
Number of machines: 2 minimum for production use
CPU: 4 cores minimum
RAM: 16 GB minimum
Local storage: 100 GB available
Administrator account / root privileges on the machines
For production use, we recommend running on CentOS, Amazon Linux, or Debian.
Sizing (number of machines, cores, and amount of memory) depends heavily on the volume of data, the query patterns, and the number of queries. Contact the Indexima team for help sizing your infrastructure.
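The minimums above can be checked on a candidate node before installation. The following is a minimal sketch, not an Indexima tool: the function names `check_node` and `probe_local_node` are hypothetical, and the probe assumes a Linux machine, since `os.sysconf` is not available on Windows.

```python
import os
import shutil

# Minimums taken from the list above; real sizing depends on the workload.
MIN_CORES = 4
MIN_RAM_GB = 16
MIN_DISK_GB = 100


def check_node(cores, ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"CPU: {cores} cores, need at least {MIN_CORES}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB, need at least {MIN_RAM_GB}")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {free_disk_gb:.1f} GB free, need at least {MIN_DISK_GB}")
    return problems


def probe_local_node(path="/"):
    """Read core count, physical memory and free disk space (Linux only)."""
    cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(path).free
    gib = 1024 ** 3
    return cores, ram_bytes / gib, free_bytes / gib
```

Calling `check_node(*probe_local_node())` on a node returns the list of unmet minimums. A machine sold as 16 GB usually reports slightly less usable memory, so a small tolerance on the RAM threshold may be appropriate.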
Shared storage
Shared storage accessible by all Indexima nodes (e.g. NFS, S3, HDFS, Ceph, …). See the Storage Compatibility Matrix.
Minimum 200 GB available (depending on the use case).
Indexima console (optional)
1 machine
Supported operating systems: same as the Indexima nodes
CPU: 2 cores
RAM: 8 GB
Storage: 40 GB available
Administrator account / root privileges
The Indexima console can also run on one of the Indexima nodes.
Ansible installer (optional)
1 machine
Supported operating systems: CentOS, Amazon Linux, Debian
CPU: 2 cores
RAM: 8 GB
Storage: 40 GB available
Administrator account / root privileges
Ansible is not required for a YARN deployment.
Ansible is not compatible with Windows Server.
The Ansible installer can also run on one of the Indexima nodes.
Network requirements
This table lists all network requirements.
Source | Destination | Port | Protocol | Description |
---|---|---|---|---|
IP Cluster Indexima | Internet | 443 | TCP | Download packages from the internet to install the software prerequisites (Java / Hadoop / Indexima installer / Ansible). Installation from local zip packages is also possible. |
Private client network | IP Cluster Indexima | 8082 | TCP | Connect to Indexima web console / Connect to Indexima API |
Private client network | IP Cluster Indexima | 9999 | TCP | Nodes status web page |
IP Cluster Indexima | IP Cluster Indexima | All | All | Inter-node communication inside Indexima Cluster |
Ansible machine | IP Cluster Indexima | 22 | TCP | Ansible connection to Indexima cluster to install and configure Indexima |
Data consumers (dataviz) | IP Cluster Indexima | 10000 | All | SQL entrypoint to query Indexima core engine |
IP Cluster Indexima | Data sources | N/A | TCP | Requests between Indexima and data sources |
IP Cluster Indexima | Indexima warehouse | N/A | TCP | Requests between Indexima and the shared storage for Indexima warehouse |
This list of network requirements is valid for the default Indexima configuration. Network ports can be configured through Indexima configuration.
A network load balancer is advised to distribute SQL queries across the Indexima Core nodes (port 10000). See Load balancing for more details.
A valid SSL certificate is required to activate SSL encryption (HTTPS access to the Indexima console, or SSL encryption between the nodes and the console).
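Once firewall rules are in place, the client-facing ports from the table can be probed from a machine on the client network. This is a rough sketch using only the Python standard library. The helper names `port_is_open` and `check_ports` are illustrative, and the port list assumes the default Indexima configuration described above.

```python
import socket

# Default client-facing ports from the table above; they can be changed
# in the Indexima configuration, so adjust this mapping if needed.
DEFAULT_PORTS = {
    8082: "web console / API",
    9999: "node status page",
    10000: "SQL entrypoint",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def check_ports(host, ports=DEFAULT_PORTS):
    """Map each port number to True/False depending on whether it accepted a connection."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, `check_ports("indexima-node-1")` (a hypothetical host name) returns a dictionary keyed by port. A `False` entry only shows that a TCP connection could not be opened from the current machine. It does not distinguish a blocked port from a service that is not running.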
Software requirements
The following software is required to run Indexima:
Software | Version | Install Link |
---|---|---|
Java JDK | 8 | Linux: https://openjdk.java.net/install/ |
Hadoop libraries for standalone deployment | >= 2.8.3 | https://archive.apache.org/dist/hadoop/common/hadoop-2.8.3/hadoop-2.8.3.tar.gz https://archive.apache.org/dist/hadoop/common/hadoop-3.1.4/hadoop-3.1.4.tar.gz |
Tez libraries for standalone deployment with Hadoop 3 | 0.9.2 | With a Hadoop 3 setup, add the following libraries to the galactica/tez folder: |
Hadoop libraries for YARN deployment | N/A | See compatibility matrix: Cloudera Compatibility matrix |
JDBC drivers for connecting to data sources | N/A | See compatibility matrix: Data Source Compatibility matrix |
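To confirm that a node has the Java version listed in the table, the output of `java -version` can be parsed. This is a sketch under the assumption that `java` is on the `PATH`. The function names are hypothetical, and the regular expression targets the usual `version "..."` banner rather than every vendor format.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output, or None if absent."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_292").
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Run `java -version`; raises FileNotFoundError if java is not installed."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # The JVM writes its version banner to stderr, not stdout.
    return java_major_version(result.stderr + result.stdout)
```

A result of `8` from `installed_java_major()` matches the version in the table.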
If you are deploying on Windows, you will also need Winutils for Hadoop. You can download it at https://download.indexima.com/libs/winutils.exe. Then copy it into the bin folder of the hadoop-2.8.3 libraries installed previously.
When deploying Indexima with YARN, we assume that the Hadoop libraries are already present on the machines. You will be able to simply use the yarn classpath command later on for the configuration.
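Before the configuration step, it can be useful to confirm that `yarn classpath` works on each machine and to inspect what it returns. The sketch below assumes the `yarn` command is on the `PATH`. The helper names are illustrative and are not part of Indexima or Hadoop.

```python
import os
import subprocess


def split_classpath(output):
    """Split the single line printed by `yarn classpath` into its entries."""
    return [entry for entry in output.strip().split(os.pathsep) if entry]


def yarn_classpath_entries():
    """Run `yarn classpath` and return its entries; fails if yarn is missing or errors."""
    result = subprocess.run(
        ["yarn", "classpath"], capture_output=True, text=True, check=True
    )
    return split_classpath(result.stdout)
```

An empty list, or a `FileNotFoundError` from `yarn_classpath_entries()`, suggests the Hadoop libraries are not installed or not on the `PATH` of the current user.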