Testing SQL and NoSQL approaches for big data warehouse systems

Authors

Rafael Almeida
Jorge Bernardino
Pedro Furtado

Abstract

Decision support systems are increasingly used in the management of industrial enterprises, the state sector and the scientific community. As transactional and log data sources grow, and as the data warehousing systems that support decision-making face increasing size and speed requirements, the data becomes big and the queries run on these systems become very complex and time-consuming. This creates a need for large-scale processing within acceptable time intervals. In this paper, we assess whether SQL and NoSQL scalability platforms are able to process big data warehouses efficiently. We evaluate the database engine MySQL Cluster, which enables clustering and scalability of in-memory databases on a shared-nothing platform, and we compare it with the traditional database engine Microsoft SQL Server 2012 (non-parallel) and with Hive. In the experimental evaluation, we use a benchmark appropriate for evaluating decision support systems, the Star Schema Benchmark (SSB). The results allow us to estimate the improvement obtained by increasing the resources allocated to MySQL Cluster or Hive, and to better understand the scalability limitations and characteristics of these engines when they are used for decision support systems.

Journal

International Journal of Business Process Integration and Management, Vol. 7, No. 4, Inderscience, 2015
