CISUC

Efficient compression of text attributes of data warehouse dimensions

Authors

Abstract

This paper presents and evaluates an approach that allows the compression of data in Relational Database Management Systems (RDBMS) using existing text compression algorithms. Although the proposed technique is completely general, we believe it is particularly advantageous for compressing medium-sized and large dimension tables in data warehouses. Dimensions usually have a large number of text attributes, and reducing the size of a medium or large dimension has a big impact on the execution time of queries that join that dimension with the fact tables. In general, the high complexity and long execution time of most data warehouse queries make the compression of dimension text attributes (and of any text attributes that may exist in the fact table, such as false facts) an effective approach to speeding up query response time. The proposed approach has been evaluated using the well-known TPC-H benchmark, and the results show that speed improvements greater than 40% can be achieved for most of the queries.

Subject

Data Warehousing

Conference

7th International Conference on Data Warehousing and Knowledge Discovery - DaWaK, August 2005


Cited by

Year 2013 : 6 citations

 1. Optimizing multi storage parallel backup for real time database systems. M Muthukumar, T Ravichandran. IJESAT. http://www.ijesat.org/Volumes/2012_Vol_02_Iss_05/IJESAT_2012_02_05_48.pdf
 2. Optimizing and enhancing parallel multi storage backup compression for real-time database systems. C Coimbatore. http://www.doaj.org/doaj?func=fulltext&aId=1108088
 3. Pervasive Business Intelligence. E Turricchia. 2013. http://amsdottorato.cib.unibo.it/5232/
 4. Query-relevant document representation for text clustering. M Makrehchi. … (ICDIM), 2010 Fifth International Conference on, 2010. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5664205
 5. Cost-Effective Data Allocation in Data Warehouse Striping. R Almeida, J Vieira, M Vieira, H Madeira… International …, 2012. http://www.doaj.org/doaj?func=fulltext&aId=1177338