CISUC

A Hybrid Memory Data Cube Approach for High Dimension Relations

Authors

Abstract

Approaches based on inverted indexes, such as Frag-Cubing, are considered efficient in both runtime and main memory usage for computing and querying high-dimensional data cubes. These approaches do not compute all aggregations a priori. Instead, they index the occurrences of attribute values so that multidimensional queries can be answered quickly. Like any other main-memory cube solution, Frag-Cubing is limited by the available main memory: if the cube exceeds main memory capacity, external memory is required. The challenge in using external memory is defining the criteria for selecting which fragments of the cube should reside in main memory. In this paper, we implement and test H-Frag, an extension of Frag-Cubing that selects the cube fragments to be stored in main memory according to attribute frequencies and dimension cardinalities. In our experiment, H-Frag outperforms Frag-Cubing in both query response time and main memory usage. H-Frag sequentially computed a massive cube with 60 dimensions and 10^9 tuples in 64 hours, using 110 GB of RAM and 286 GB of external memory. This data cube answers complex queries in less than 40 seconds. Frag-Cubing could not compute such a cube on the same machine.
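
To illustrate the general inverted-index idea the abstract refers to, the following minimal sketch indexes each attribute value by the IDs of the tuples in which it occurs and answers a COUNT point query by intersecting those ID sets. It is not the Frag-Cubing or H-Frag implementation: the function names `build_inverted_index` and `count_query`, the toy relation, and the use of in-memory Python sets are all assumptions made for illustration, and the fragment selection between main and external memory is not modeled.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each (dimension, value) pair to the set of tuple IDs containing it.

    Hypothetical helper: no aggregate is precomputed, only occurrences are indexed.
    """
    index = defaultdict(set)
    for tid, row in enumerate(rows):
        for dim, value in enumerate(row):
            index[(dim, value)].add(tid)
    return index


def count_query(index, predicates):
    """Return the COUNT of tuples matching every (dimension, value) predicate."""
    if not predicates:
        raise ValueError("at least one predicate is required")
    # Intersect the smallest ID sets first to keep intermediate results small.
    tid_sets = sorted((index.get(p, set()) for p in predicates), key=len)
    result = set(tid_sets[0])
    for tids in tid_sets[1:]:
        result &= tids
    return len(result)


# Toy relation with three dimensions (illustrative data only).
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c1"),
    ("a2", "b1", "c1"),
    ("a1", "b1", "c2"),
]
index = build_inverted_index(rows)
matches = count_query(index, [(0, "a1"), (1, "b1")])
```

In this sketch a query touches only the ID sets of the values it names, which is why such indexes can answer multidimensional queries without materializing every aggregation in advance.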

Keywords

Data Cube, High Dimension, Inverted Index, OLAP, External Memory

Subject

Data Cube

Conference

Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1, April 2015
