CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees
- UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
- Lixin Fu, Associate Professor (Creator)
- Institution
- The University of North Carolina at Greensboro (UNCG )
- Web Site: http://library.uncg.edu/
Abstract: We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We are focusing on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), represents a drastic departure from existing relational (ROLAP) and multi-dimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called statistics tree (ST) during a one-time scan of the detailed data. In order to optimize the queries involving constraints on hierarchy levels of the underlying dimensions, we select and materialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family, which can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability when compared with existing MOLAP and ROLAP systems.
CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees
PDF (Portable Document Format)
1218 KB
Created on 8/27/2010
Views: 2276
Additional Information
- Publication
- Distributed and Parallel Databases, An International Journal. November, 2003. 14:3, 221-254.
- Language: English
- Date: 2003
- Keywords
- data cube, data warehouse, multi-dimensional OLAP, query processing, statistics tree