Efficient Evaluation of Sparse Data Cubes
- UNCG Author/Contributor (non-UNCG co-authors, if there are any, appear on document)
- Lixin Fu, Associate Professor (Creator)
- Institution
- The University of North Carolina at Greensboro (UNCG)
- Web Site: http://library.uncg.edu/
Abstract: Computing data cubes requires the aggregation of measures over arbitrary combinations of dimensions in a data set. Efficient data cube evaluation remains challenging because of the potentially very large sizes of input datasets (e.g., in the data warehousing context), the well-known curse of dimensionality, and the complexity of queries that need to be supported. This paper proposes a new dynamic data structure called SST (Sparse Statistics Trees) and a novel, interactive, and fast cube evaluation algorithm called CUPS (Cubing by Pruning SST), which is especially well suited to computing aggregates in cubes whose data sets are sparse. SST stores only the aggregations of non-empty cube cells instead of the detailed records. Furthermore, it retains in memory the dense cubes (a.k.a. iceberg cubes) whose aggregate values are above a threshold. Sparse cubes are stored on disk. This allows a fast, accurate approximation for queries. If users desire more refined answers, related sparse cubes are aggregated. SST is incrementally maintainable, which makes CUPS suitable for data warehousing and analysis of streaming data. Experimental results demonstrate the excellent performance and good scalability of our approach.
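The dense/sparse split described in the abstract can be illustrated with a minimal sketch. This is not the paper's SST or CUPS: it assumes flat dictionaries instead of a tree, a COUNT aggregate only, and an in-memory dictionary standing in for the on-disk sparse store. The names `build_cells`, `split_by_threshold`, and `query` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import product

STAR = "*"  # wildcard: the dimension is aggregated away


def build_cells(records, num_dims):
    """Count records in every non-empty cell of every cuboid (assumed COUNT aggregate)."""
    cells = defaultdict(int)
    for rec in records:
        # Each mask picks which dimensions keep their value and which become STAR.
        for mask in product((False, True), repeat=num_dims):
            cell = tuple(rec[i] if keep else STAR for i, keep in enumerate(mask))
            cells[cell] += 1
    return dict(cells)


def split_by_threshold(cells, threshold):
    """Dense cells (count >= threshold) stay in memory; the rest stand in for disk."""
    dense = {c: n for c, n in cells.items() if n >= threshold}
    sparse = {c: n for c, n in cells.items() if n < threshold}
    return dense, sparse


def query(dense, sparse, constraints, refine=False):
    """Sum counts over the cells selected by constraints.

    constraints holds, per dimension, either STAR or a set of allowed values.
    Without refine, only dense cells contribute (a fast approximation);
    with refine, the matching sparse cells are added to give the exact count.
    """
    axes = [[STAR] if c == STAR else sorted(c) for c in constraints]
    total = 0
    for cell in product(*axes):
        total += dense.get(cell, 0)
        if refine:
            total += sparse.get(cell, 0)
    return total


# Toy data set with two dimensions: (state, year).
records = [("NC", 2003), ("NC", 2003), ("NC", 2003), ("NC", 2004), ("VA", 2003)]
cells = build_cells(records, num_dims=2)
dense, sparse = split_by_threshold(cells, threshold=2)

approx = query(dense, sparse, [{"NC", "VA"}, {2003}])              # dense cells only
exact = query(dense, sparse, [{"NC", "VA"}, {2003}], refine=True)  # plus sparse cells
print(approx, exact)
```

In this toy run the approximate answer counts only the dense cell ("NC", 2003), and refinement adds the sparse cell ("VA", 2003). The sketch enumerates all 2^d cuboids per record, which is only workable for a handful of dimensions.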
PDF (Portable Document Format)
392 KB
Created on 1/1/2004
Additional Information
- Publication
- Advances in Web-Age Information Management: 5th International Conference (WAIM'04), Dalian, China, July 15-17, 2004: Springer-Verlag GmbH, 2004, pp. 336-345
- Language: English
- Date: 2004
- Keywords
- Computing data cubes, Data set