Automated Testing For Data Migration

Data Migration Testing, Part 4 of 4: Automation

Data migration testing revolves around verifying the integrity, content, and quality of the data between the source and destination databases, based on the given mapping rules.

Broadly, any data migration involves the following:

– Table-level mapping

– Element-level (column-level) mapping

– New elements in the new DB

– Old elements from the legacy DB to be ignored

– Data not to be converted from the legacy to the new system

– Count check

Table-level mapping defines how tables in the old DB map to tables in the new DB. Depending on the mapping rules and the design of the new system, the mapping can be one-to-one, one-to-many, or many-to-one.

Element-level mapping defines how each column in the legacy system maps to the columns in the new system; this, too, can be one-to-one, one-to-many, or many-to-one.
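
As a minimal sketch, consider a hypothetical many-to-one element mapping in which the legacy columns first_name and last_name are combined into full_name in the new system. Assuming both schemas are reachable from a single connection (for example via a linked server or database link) and using the ANSI || concatenation operator (CONCAT in some dialects), a validation query could flag every record where the rule does not hold:

    -- Hypothetical many-to-one element mapping: legacy first_name + last_name
    -- should equal full_name in the new system. Schema and column names are
    -- illustrative, not from any real project.
    SELECT l.customer_id,
           l.first_name,
           l.last_name,
           n.full_name
    FROM   legacy.customer l
    JOIN   newdb.customer  n ON n.customer_id = l.customer_id
    WHERE  n.full_name <> l.first_name || ' ' || l.last_name;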

New elements in the new DB are columns that are specific to the new DB and do not exist in the legacy system.

Ignored elements are legacy columns that are exempted from the mapping rules and never brought into the new system.

Not-to-be-converted data refers to legacy records that are exempted from conversion to the new DB, for whatever reason.

The count check is one of the most important validations: it ensures that the source record count is in sync with the target record count according to the given mapping rules.
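
For example, a count check for a one-to-one table mapping might look like the sketch below, assuming both databases are reachable from one connection and a hypothetical convert_flag column marks the records excluded from conversion:

    -- Count check: the source count, minus not-to-be-converted records,
    -- should match the target count. All names are illustrative.
    SELECT
        (SELECT COUNT(*) FROM legacy.customer WHERE convert_flag = 'Y') AS source_count,
        (SELECT COUNT(*) FROM newdb.customer)                           AS target_count;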

Essentially, all automated and manual testing in data migration projects focuses on the above validation points.

Because we deal with millions and millions of records, manual testing of a data migration project is next to impossible; the situation demands an automated process. Every time there is a new extraction of data or any change to the conversion code, we should be able to trigger the automated scripts to check the sanity and integrity of the data.

This can be achieved with a set of SQL scripts that compare the source data to the target data, per the mapping rules, for each pair of source and target tables.
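
A typical script of this kind, sketched here with illustrative table names, uses a set difference to list rows that exist in the source but are missing or altered in the target (EXCEPT is called MINUS in Oracle):

    -- Rows present in the source (restricted to convertible records) but
    -- missing or different in the target, for a one-to-one mapped table.
    SELECT customer_id, first_name, last_name
    FROM   legacy.customer
    WHERE  convert_flag = 'Y'
    EXCEPT
    SELECT customer_id, first_name, last_name
    FROM   newdb.customer;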

A simple methodology that avoids expensive off-the-shelf testing tools is to use Excel macros:

– Use Excel macros to connect to the source and target DBs.

– Store the SQL scripts that validate each table-level and element-level mapping in the macros.

– Write the result of each execution back to Excel, confirming whether the mapping passed or failed.
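
One way to shape each stored script, sketched below with illustrative names, is to have it return a single PASS/FAIL row that the macro simply writes back next to the test case:

    -- Each stored validation query returns one row; the macro writes the
    -- result, and the supporting counts, back to the worksheet.
    SELECT CASE WHEN s.cnt = t.cnt THEN 'PASS' ELSE 'FAIL' END AS result,
           s.cnt AS source_count,
           t.cnt AS target_count
    FROM  (SELECT COUNT(*) AS cnt FROM legacy.customer WHERE convert_flag = 'Y') s
    CROSS JOIN
          (SELECT COUNT(*) AS cnt FROM newdb.customer) t;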

The Pros:

1. User-friendly: no one needs to learn and understand the complexities of an expensive off-the-shelf tool.

2. All the SQL scripts used in the test execution are stored in one place and are easy to reference.

3. Any mapping-rule change simply means a change to the corresponding SQL script, which is easy to locate and update.

4. The test environment (source and destination DBs) is likely to change between executions depending on availability; the environment can be configured in Excel as a drop-down.

5. Since the actual execution happens against the DB, validating millions of records is not a challenge, and the Excel row limit never kicks in, because only the results of the execution are stored back in Excel.

6. It is easy to filter the failed test cases, focus on the issues, and re-run them once the issues are fixed.

7. The macros can be configured to run for a specific set of tables, or only the failed test cases, rather than the entire test suite, depending on the regression cycle needed for a given run, saving time and effort.

8. Since it is Excel, we can produce polished reports once the run is complete, without depending on an expensive tool and its reporting services.

9. We can even integrate this with the development team's unit tests and run it as they complete the coding for each table, thereby delivering more mature, less buggy code to QA.

10. On average, 2,000 test cases, with as many SQL queries, take about 30 minutes to execute and report, depending on the load on the DB at the time. This signifies the amount of time saved compared with executing that many test cases manually.

11. All of this is achieved without expensive data migration testing tools, saving the project considerable cost.

Conclusion

The major factor that should drive QA automation for any project, not just a migration project, is cost-benefit analysis. Expensive off-the-shelf automation tools make sense if and only if the returns justify the investment. If we can achieve even 80 percent of the efficiency with an inexpensive tool already at hand, we should use it and get the best out of it.

For a data migration project, the biggest challenge is always the sheer quantity of data to be tested while still producing a quality output. Most of the time, the very databases where the data is loaded, combined with a smart set of Excel macros, can achieve the desired results.

Data Migration Testing, Part 3 of 4

In general, the test scenarios are as follows:

If the migration is to the same type of database, then:

  • Verify that queries executed in the new database yield the same results as in the old one.
  • Verify that the number of records in the old and the new database is the same; use an appropriate automation tool for this.
  • Verify that there are no redundancies and that the new database works exactly as the old one did.
  • Verify that the schema, relationships, and table structures are unaltered or set back to match the old database image (see the schema-comparison sketch after this list).
  • Verify that changes made in the application update the new database with the correct values and types.
  • Verify that the new database connection is provided to all the components of the application: application, server, interfaces, firewall, network connectivity, etc.
  • Verify that the query performance (time taken to execute complex queries) of the new database is no worse than it was earlier.
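
As an illustration of the schema check above, in dialects that expose INFORMATION_SCHEMA and allow three-part names across databases on the same server (SQL Server, for example), a set difference can surface columns that changed or disappeared; old_db and new_db here are placeholder names:

    -- Columns present in the old database but missing, renamed, or retyped
    -- in the new one. old_db and new_db are illustrative database names.
    SELECT table_name, column_name, data_type
    FROM   old_db.information_schema.columns
    EXCEPT
    SELECT table_name, column_name, data_type
    FROM   new_db.information_schema.columns;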

The challenges faced in this testing mainly concern data. Below are a few of them:

#1) Data Quality:

Data carried over from the legacy application may prove to be of poor quality in the new/upgraded application. In such cases, data quality has to be improved to meet business standards.

Factors such as wrong assumptions, faulty data conversions after migration, invalid data entered in the legacy application itself, and poor data analysis lead to poor data quality. This results in high operational costs, increased data integration risks, and deviation from the purpose of the business.

#2) Data Mismatch:

Data migrated from the legacy to the new/upgraded application may turn out to mismatch in the new one. This may be due to a change in data type or storage format, or because the purpose for which the data is used has been redefined.

This results in a huge effort to make the necessary changes: either correct the mismatched data, or accept it and tweak it to the new purpose.

#3) Data Loss:

Data might be lost while migrating from the legacy to the new/upgraded application, whether in mandatory or non-mandatory fields. If the lost data belongs to non-mandatory fields, the record remains valid and the data can be updated again.

But if data for a mandatory field is lost, the record itself becomes void and cannot be reinstated. This results in significant data loss, and the data will have to be retrieved either from the backup database or from audit logs, if captured correctly.

#4) Data Volume:

This refers to data volumes that require a lot of time to migrate within the downtime window of the migration activity, e.g., scratch cards in the telecom industry or users on an intelligent network platform. The challenge here is that by the time the legacy data is cleared, a large amount of new data will have been created, which needs to be migrated again. Automation is the solution for huge data migrations.

#5) Simulation of a real-time environment (with the actual data):

Simulating the real-time environment in the testing lab is another real challenge: with the real data and the real system, testers run into kinds of issues that never surfaced during testing.

So data sampling, replication of the real environment, and identification of the volume of data involved in the migration are quite important while carrying out data migration testing.

#6) Simulation of the volume of data:

Teams need to study the data in the live system very carefully and come up with a typical analysis and sampling of the data.

E.g., users in the age group below 10 years, 10 to 30 years, etc. As far as possible, data from the live system should be obtained; if not, data needs to be created in the test environment, using automated tools to generate the volume (see the sketch below). Extrapolation can be used, wherever applicable, if the volume cannot be simulated.
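
As a sketch of such data generation, assuming PostgreSQL-style syntax and a hypothetical test_users table, a recursive CTE can manufacture a large volume of rows spread across the sampled age bands:

    -- Generate 100,000 synthetic users covering ages 5 through 74.
    -- Table and column names are illustrative.
    WITH RECURSIVE seq (n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 100000
    )
    INSERT INTO test_users (user_id, user_name, age)
    SELECT n,
           'user_' || n,
           MOD(n, 70) + 5
    FROM   seq;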

 

Data Migration Testing, Part 2 of 4

Verification Requirements in the Migrated Environment: 

It is highly important to have a verification process built into migration testing. By setting out these verification requirements up front, the team knows what must hold in the migrated environment for the migration to be called successful.

The following tests are designed for a hypothetical test case. 

  • Check whether all the data in the legacy system is migrated to the new application within the planned downtime. To ensure this, compare the record counts between the legacy and the new application for each table and view in the database. Also report the time taken to migrate, say, 10,000 records.
  • Check whether all the schema changes (fields and tables added or removed) specified for the new system are in place.
  • Data migrated from the legacy to the new application should retain its values and format unless specified otherwise. To ensure this, compare data values between the legacy and the new application's databases.
  • Test the migrated data against the new application, covering the maximum possible number of cases. To ensure full coverage of data migration verification, use an automated testing tool.
  • Check for database security.
  • Check data integrity for all possible sample records.
  • Check and ensure that the functionality supported in the legacy system works as expected in the new system.
  • Check the data flow within the application, covering most of the components.
  • The interfaces between components should be extensively tested, as data should not be modified, lost, or corrupted as it passes through them. Integration test cases can be used to verify this.
  • Check for legacy data redundancy: no legacy data should be duplicated during migration (see the duplicate-check sketch after this list).
  • Check for data mismatch cases, such as changed data types or changed storage formats.
  • All the field-level checks in the legacy application should be covered in the new application as well.
  • Any data added in the new application should not reflect back on the legacy system.
  • Updating the legacy application's data through the new application should be supported; the update should not reflect back on the legacy system.
  • Deleting the legacy application's data through the new application should be supported; the deletion should not remove the data in the legacy system as well.
  • Verify that the changes made to the legacy system support the new functionality delivered as part of the new system.
  • Verify that users from the legacy system can continue to use both the old and the new functionality, especially where changes are involved. Execute the test cases and compare against the test results stored during pre-migration testing.
  • Create new users on the system and carry out tests to ensure that functionality from the legacy as well as the new application supports the newly created users and works fine.
  • Carry out functionality-related tests with a variety of data samples (different age groups, users from different regions, etc.).
  • Verify that 'feature flags' are in place for the new features and that switching them on/off turns the features on and off.
  • Performance testing is important to ensure that migration to the new system/software has not degraded the performance of the system.
  • Carry out load and stress tests to ensure system stability.
  • Verify that the software upgrade has not opened up any security vulnerabilities; carry out security testing, especially in the areas changed during migration.
  • Usability is another aspect to verify: if the GUI layout/front end or any functionality has changed, assess the ease of use the end user experiences compared to the legacy system.

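As a sketch of the duplicate check referenced in the redundancy item above, with an illustrative business key:

    -- Business keys that occur more than once in the migrated table.
    SELECT customer_id, COUNT(*) AS occurrences
    FROM   newdb.customer
    GROUP  BY customer_id
    HAVING COUNT(*) > 1;
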
Since the scope of post-migration testing is large, it is ideal to separate out the important tests that must be run first to qualify the migration as successful, and to carry out the remaining tests later.

It is also advisable to automate the end-to-end functional test cases and other suitable test cases, so that testing time is reduced and results are available quickly.

 

Data Migration Testing, Part 1 of 4

Database Migration

The following is a four-part series of educational material examining in depth the processes of data migration, testing, and automation that are fundamental to any proper migration operation.

Below is a short summary of the content covered in the four-part series, followed by Part 1.

Reasons and benefits for which an organization may choose database migration:

  • An application can have multiple databases at the back end to support large volumes of customer data
  • Data enhancement can be achieved
  • Proper analysis of the data helps improve data quality
  • Data sampling and data cleansing help keep the database clean and effective
  • Data analytics can be carried out

Examples of Database Migration:

  • Migration from one RDBMS to another RDBMS
  • Migration from RDBMS to MongoDB
  • Upgrading from Informix HC4 to HC6 or HC7

Testing:

  • Ensuring that the legacy database is not updated during tests after migration.
  • Ensuring that the mappings at the field and table levels do not change.
  • Ensuring data is migrated accurately and completely.
  • Pre-migration and Post-migration testing activities.

Testing migration to the same type of database:

  • Verify that queries executed in the new database yield the same results as in the old one.
  • Verify that the number of records in the old and the new database is the same.
  • Verify that there are no redundancies and that the new database works exactly as the old one did.
  • Verify that the schema, relationships, and table structures are unaltered or set back to match the old database image.
  • Verify that changes made in the application update the new database with the correct values and types.
  • Verify that the new database connection is provided to all the components of the application (application, server, interfaces, firewall, network connectivity, etc.).
  • Verify that the query performance (time taken to execute complex queries) of the new database is no worse than it was earlier.

Automated Testing

  • Understanding the cost-benefit analysis of automated data migration testing.
  • The challenge of managing the quantity of data to be tested.
  • Ensuring the quality of the output. 

Conclusion

Considering the complexity involved in carrying out data migration testing, even missing a small aspect of verification can lead to failure or damage to valuable data. It is very important to carry out a careful and thorough examination and analysis of the system before and after migration, and to plan and design an effective migration strategy with robust tools and skilled teams.

Since migration has a huge impact on the quality of the application, a good amount of effort must be put in by the entire team to verify the entire system in all aspects: functionality, performance, security, usability, availability, reliability, compatibility, etc. This in turn will ensure successful migration testing.

 

Part 1, Overview of Data Migration Testing:

Data migration is the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another (Wikipedia, 2019). In many organizations, data needs to be migrated for a variety of reasons, ranging from newer servers to better applications. But what does this actually mean? In this paper we look thoroughly into the process of data migration and examine what is needed for a successful operation. Data migration can be broken down into two major processes: the migration itself and testing. From the testing point of view, this means the application has to be tested thoroughly end to end, alongside a successful migration from the existing system to the new system. Migration testing is the verification of the migration of the legacy system to the new system with minimal disruption/downtime, with data integrity and no loss of data, while ensuring that all the specified functional and non-functional aspects of the application are met post-migration.

The data migration testing strategy is an important piece of the operation, as it sets out proper parameters for how and when to evaluate the migration so that it runs smoothly throughout the entire operation. This minimizes the errors and risks that result from migration and allows migration testing to be performed effectively.

Some Important Activities in Testing:

1) Specialized team formation:

Form the testing team with members who have the required knowledge and experience, and provide training on the system being migrated.

2) Business risk analysis, possible errors analysis:

Current and ongoing business operations should not be hampered after migration; hence it is important to hold 'Business Risk Analysis' meetings involving the right stakeholders (Test Manager, Business Analyst, Architects, Product Owners, Business Owner, etc.) to identify the risks and the implementable mitigations. The testing should include scenarios that uncover those risks and verify that proper mitigations have been implemented.

Conduct ‘Possible Error Analysis’ using appropriate ‘Error Guessing Approaches’ and then design tests around these errors to unearth them during testing.

3) Migration scope analysis and identification:

Analyze and clearly define the scope of the migration test: when and what needs to be tested.

4) Identify the appropriate Tool for Migration:

While defining the strategy for this testing, automated or manual, identify the tools to be used, e.g., an automated tool to compare source and destination data.

5) Identify the appropriate Test Environment for Migration:

Identify separate pre-migration and post-migration environments for carrying out any verification required as part of testing. Understand and document the technical aspects of the legacy and new systems, to ensure that the test environment is set up accordingly.

6) Migration Test Specification Document and review:

Prepare a Migration Test Specification document that clearly describes the test approach, areas of testing, testing methods (automated, manual), testing methodology (black-box, white-box techniques), number of test cycles, testing schedule, approach to creating data and using live data (sensitive information must be masked), test environment specifications, tester qualifications, etc., and run a review session with the stakeholders.

7) Production launch of the migrated system:

Analyze and document the to-do list for the production migration and publish it well in advance.

 
