A continuous data integration methodology for supporting real-time data warehousing
Authors
Abstract
A data warehouse provides information for analytical processing, decision making and data mining tools. As the concept of real-time enterprise evolves, the synchronism between transactional data and data warehouses, statically implemented, has been reviewed. Traditional data warehouse systems have static structures of their schemas and relationships between data, and therefore are not able to support any dynamics in their structure and content. Their data is only periodically updated because they are not prepared for continuous data integration. For these purposes, real-time data warehouses seem to be very promising. In this paper we present a methodology on how to adapt data warehouse schemas and user-end OLAP (On-Line Analytical Processing) queries for efficiently supporting real-time data integration. To accomplish this, we use techniques such as table structure replication and query predicate restrictions for selecting data, managing to enable continuous data integration in the data warehouse with minimum impact in query execution time. We demonstrate the functionality of the method by analyzing its impact in query performance using benchmark TPC-H executing query workloads while simultaneously performing continuous data integration at various insertion time rates.
Keywords
Real-time and active data warehousing, Continuous data integration, ETL, Refreshment loading process
Subject
Real-Time Data Warehousing
Conference
ICEIS 2007 - International Conference on Enterprise Information Systems, June 2007
PDF File
Cited by
No citations found