We all know that systems are moving from legacy to new technology, and ETL is one of the common technique used to help this transformation. we can consolidate the scattered data for any organization while working with different data format and sources.
Now we will talk about the basic concept of ETL and how it has been tested
Dataware housing is when you are running a business which is generating data like we have to build a dataware house on the employee data . Then OLTP is no use, we have to maintain a OLAP database,because the data is so complex and data is not useful for analyses ,we cant make a business decision that is the reason we do the ETL process and thats where ETL comes into picture
ETL stands for Extracts Transform and Load.
we extract the data from the source and then transform the data by using etl tools, now in order to perform etl testing first ETL development has to be completed, there are many ETL tools like informatica, datastage, owd from oracle, msbi from Microsoft.
by using tools we extract the data from different source databases, transform the data according to the business rules and then load the data into the target database.
For testing we have to take the help from these development documents-
1. architecture docs - complete project architecture.
2. design docs- complete info about the component.
3. mapping docs - complete info about the source and target, data type of the component.
business rules and transformation rules will be implement on mapping level
The question is how to test-
1 .going through the docs.
2. identifying the scenarios.
3. identifying the test cases.
4. implementing or writing the test cases.
5. upload these in the quality management tool like QC.
Now real part of ETL testing started-
We test the code, environment, checking the access on databases , applying of SQL quarries for verifying after extraction , transformation comes into picture , for example we have to test that the 100 record from the employee database get successfully fetched into the department database and to check that all the 100 records loaded properly into the target or not we will be having different different business rules on the different t columns if one record is missing then we have to raise the defect in the QC , like this we do the testing by using different etl tools.