A Review of Contemporary Data Quality Issues in Data Warehouse ETL Environment


Rupali Gill¹ and Jaiteg Singh²
¹Assistant Professor, School of Computer Sciences; ²Associate Professor, School of Computer Applications, Chitkara University, Punjab, India

Email: rupali.gill@chitkara.edu.in

Abstract: Extraction-transformation-loading (ETL) tools have become important pieces of software responsible for integrating heterogeneous information from several sources. Carrying out the ETL process is potentially complex, hard and time-consuming. Organisations nowadays are concerned about vast quantities of data, and data quality is tied to technical issues in the data warehouse environment. Research over the last few decades has laid increasing stress on data quality issues in the data warehouse ETL process. Data quality can be ensured by cleaning the data before loading it into the warehouse. Since the data is collected from various sources, it arrives in various formats, so standardizing those formats and cleaning the data are prerequisites for a clean data warehouse environment. Data quality attributes such as accuracy, correctness, consistency and timeliness are required for a knowledge discovery process. The purpose of this state-of-the-art review is to address data quality issues at all stages of data warehousing: 1) data sources, 2) data integration, 3) data staging, and 4) data warehouse modelling and schematic design, and to formulate a descriptive classification of the causes of these issues. The discovered knowledge is used to repair data deficiencies. This work proposes a framework for the quality of extraction, transformation and loading of data into a warehouse.


DOI: 10.15415/jotitt.2014.22012


LINK: http://dspace.chitkara.edu.in/jspui/bitstream/1/520/1/22012_JOTITT_Rupali.pdf

                                      










