Indexima Advanced Analyzer
What is the Analyzer
The analyzer is the Indexima proprietary tool that analyzes the last queries and suggests Indexes (to create or to delete) in order to optimize the queries to come (based on the assumption that the upcoming queries would look like the last queries).
What is the Advanced Analyzer
The advanced analyzer is the interface that allows advanced users to:
- Choose and adjust some parameters of the analyzer.
- Review the suggestions of the analyzer.
- Choose to apply or not the analyzer suggestions (ie keep, create or delete indexes).
Access the Advanced Analyzer
- Click on the tab catalog of your INDEXIMA Interface.
- Select the table to be analyzed.
- Click on New Analysis then Advanced Analyzer.
Using the Advanced Analyzer
1. Define Parameters
Required parameters
Days
Number of past days to analyze in order to find the most optimum index configuration to keep performance objectives.
Hits
Minimum hits of past queries to suggest the creation of a new hyperindex. Specifying the value of 1 indicates that all queries will be analyzed.
Additional parameters
Minimize changes
- Yes: Analyzer will suggest hyperindexes based on the existing hyperindexes in order to optimize non already optimized queries.
- No: Analyzer won't consider the existing hyperindexes and will suggest hyperindexes that would optimize all queries.
Prejoins
- Yes: If Queries include joins (left or inner) with a dimension table, the analyzer will suggest precomputing the result of the join query.
- No: Suggestions would be made using only fields of the fact (or limited) table.
Index coefficient (%)
The factor that controls the number of indexes:
- When the parameter
analyser.default-merge-policy
is equal to COEF (default value), An index coefficient of 0 will try to minimize the number of indexes, whereas of coefficient of 100 will propose the maximum number of indexes. - When the parameter
analyser.default-merge-policy
is equal to MAX_INDEX, the analyzer will propose no more thananalyser.default-max-expected-indexes
indexes. See galactica-conf for parameters description.
Index specialization
It is possible to prequalify the content of an index in order to include or exclude suggestions of an index containing some typical contents such as SELECT clause, expression with high cardinality.
Specialization | Description |
---|---|
Fully specialized | Always add expressions in the suggested index |
Half specialized | Add expressions in the suggested index when they are in WHERE clauses and exclude those in the SELECT clauses |
Not specialized | Only add expressions in the suggested index if expressions contain high cardinality fields |
High cardinality
refers to columns with values that are very uncommon or unique. High cardinality column values are typically identification numbers, email addresses, or user names.
An example of a data table column with high cardinality would be a USERS table with a column named USER_ID. Each time a new user is created in the USERS table, a new number would be created in the USER_ID column to identify them uniquely.
Columns with potentially high cardinality are suggested by the Indexima Analyser in the High cardinality
field.
It is possible to add or remove columns from this field and run the analysis again to verify the impact of these suggested indexes.
Mandatory columns
List of columns to be present in the hyperindex even if no SELECT hits one of those. Those columns will be included in the suggestion list.
Index specialization
It is possible to prequalify the content of an index in order to include or exclude suggestions of an index containing some typical contents such as SELECT clause, expression with high cardinality.
Number of authorized ALL columns (only for External tables)
Define a maximum threshold of the "ALL" columns in HyperIndexes proposals.
This makes it possible to determine if we want to delegate more or fewer requests who fetch raw data.
For example: By authorizing the maximum of ALL columns, you allow the analyzer to make HyperIndexes with more retention on the underlying database.
2. Launch Analysis
Click on the Submit
button. After a few seconds, The Indexima Analyzer displays indexes to be deleted and indexes to be created in the result area
Understand the suggestions
Indexes are displayed with the following information:
- Number of hits by analyzed queries
- Index size (estimated size when it is the case of a new index to create)
3. Apply Analyzer suggestions
It is possible to apply actions:
- index by index
- in one batch
When you click on the Execute
button, a batch of SQL commands is sent to the Indexima Cluster.
The Indexima Cluster displays the status of the submission with all indexes altered.
Click on Check Progress
to jump to the Indexima Monitor Query tab.
All the index actions submitted by the Indexima Cluster are translated into SQL queries :
ALTER TABLE ADD [MAIN] INDEX
(...)- or
ALTER TABLE DROP INDEX IndexName
All users can continue to access the Indexima Cluster.
See which Hyperindexes have been created and deleted in the Query panel
Advanced Actions
Download SQL
The index actions submitted by the Indexima Analyser are translated into SQL queries ALTER TABLE.
By clicking Download you can download the SQL queries (ALTER Table statements) into a sql file
Edit HyperIndexes
Why edit Hyperindexes ?
Users may want to adjust indexes by removing or adding columns:
There are many reasons to remove a column from a proposed hyperindex:
Cardinality is very high; All hyperindex are stored in INDEXIMA CLUSTER node RAM. A high cardinality index can produce very large indexes.
Complex expressions or clauses; Such expressions can lead to wrong and inaccurate results. It's an indication that the data model needs to be reworked and data must be flattened or the star schema must be de-normalized to simplify JOINs.
How to edit Hyperindexes
- Slide the
Edit Mode
toggle. - Untick create box.
- Click on
Edit Index
.
This will open a new window.
Remove columns by clicking on the cross nearby. Once you are done, click on Update.
Edit HyperIndexes Define an Index as the main index
Under certain circumstances, you may need to change your main index
. This is also the purpose of the advanced mode.
The main index
is the index that queries use to find columns not present in hyperindex in order to fetch the fine data. Most of the time, this data is located on disks, explaining why the performance of the query is not optimum even if you have fast SSD disks.
After a period of use of the data, running several times the INDEXIMA ANALYSER to optimize the queries execution time, you can feel that a new main index can allow you to continue to improve your performance.
Slide the button Switch to Main Index
and clean when necessary columns of your new future main index. Then click on the button Update your index expression
Advanced Actions (2)
Reset Analytics for a table
The Analyzer uses past queries and a sample to make suggestions. Users may want to delete past queries used for building those suggestions.
Use the SQL Command DROP ANALYTICS to delete all the analytics related to a table.