
Apache Phoenix join query performance


By default, Phoenix uses hash joins, which require one side of the join to fit in memory. If you run into problems with very large tables, you can either increase the amount of memory allocated to Phoenix (a config setting) or add a query hint (i.e. SELECT /*+ USE_SORT_MERGE_JOIN */ ... FROM ...) to use a sort-merge join, which does not have that requirement. The Phoenix developers plan to auto-detect the ideal join algorithm in the future. Also note that Phoenix currently supports only a subset of join operations.
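To make the hint concrete, here is a minimal sketch. The tables `orders` and `customers` and their columns are hypothetical, made up only for illustration. The hint goes right after the SELECT keyword, and prefixing the query with EXPLAIN lets you check which join strategy the plan actually reports.

```sql
-- Hypothetical tables: orders (large) and customers (also too large to hash in memory).
-- The hint asks Phoenix to use a sort-merge join instead of the default hash join.
SELECT /*+ USE_SORT_MERGE_JOIN */
       o.order_id,
       c.customer_name
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id;

-- Same query with EXPLAIN in front, to inspect the plan before running it.
EXPLAIN
SELECT /*+ USE_SORT_MERGE_JOIN */
       o.order_id,
       c.customer_name
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id;
```

As for the memory setting, I believe the relevant property is `phoenix.query.maxServerCacheBytes` in hbase-site.xml, but please verify that against the tuning docs for your Phoenix version before changing it.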


Did you try the LHS/RHS guidance that the Phoenix documentation describes as a performance optimization (http://phoenix.apache.org/joins.html)? In an inner join, the RHS of the join is built as a hash table in the server cache, so make sure the smaller table is on the RHS. Were the columns you selected in the query part of the secondary index you created? If you have tried all of the above and still see latency in minutes, check the memory of the HBase region servers and whether it is sufficient to serve your query.
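A rough sketch of both suggestions, assuming hypothetical tables `orders` (large) and `customers` (small) and a hypothetical index name. Adjust the names to your schema.

```sql
-- Keep the large table on the LHS and the small table on the RHS,
-- so that the table built into the server-side hash cache is the small one.
SELECT o.order_id,
       o.amount,
       c.customer_name
FROM orders o            -- LHS: large table, scanned
JOIN customers c         -- RHS: small table, built as the hash table
  ON o.customer_id = c.customer_id;

-- A covered secondary index: every orders column the query reads
-- (customer_id, amount, plus the primary key order_id) is available in the index,
-- so the query does not need to go back to the data table.
CREATE INDEX idx_orders_customer
    ON orders (customer_id)
    INCLUDE (amount);
```

If the query selects a column that is neither indexed nor listed in INCLUDE, Phoenix may not use the index at all. Running EXPLAIN on the query should show whether the index table or the data table is being scanned.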