Quality Issue, Current Data Quality & Testing Trends...

ShushanTech offers Data Validation Solutions in various fields of software.

Big Data Testing

Big Data is growing at a rapid pace. According to IBM, 90% of the world’s data has been created in the past 2 years. And with Big Data comes bad data. Analyst firm Gartner says the average organization loses $14.2 million annually through poor Data Quality. An Experian Data Quality report states that 99% of organizations have a data quality strategy in place. This is disturbing because it suggests that these Data Quality practices are not finding the bad data that exists in their Big Data.

Current Data Quality & Testing Trends in Big Data

According to Gartner’s Magic Quadrant on Data Quality Tools, the characteristics of these tools are profiling, parsing, cleansing, masking, matching, and monitoring of big data. None of these characteristics deals with data validation in your Big Data store. Big Data testing is completely different. The primary goals of testing your Big Data are verifying data completeness, verifying data transformations, ensuring data quality, and automating regression testing. But the 2 main methods, Sampling (also known as “stare and compare”) and Minus Queries, have major flaws.

DvT - Data Validation Tool

Ensure data quality with the Data Validation Tool (DvT). DvT is the collaborative data testing solution that finds bad data in Big Data and provides a holistic view of your data's health. It ensures that the data you extract from sources remains intact in the target by analyzing your Big Data at every touchpoint and quickly pinpointing any differences.

Data Warehouse Testing

Automate the data testing of your Data Warehouse to accelerate testing cycles, reduce costs & risks and improve data quality.

How we do Data Warehouse Testing

According to authors Doug Vucevic and Wayne Yaddow in the book "Testing the Data Warehouse Practicum" (Trafford Publishing), some of the main challenges in data warehouse testing are:

  • Data Completeness
  • Data Transformation
  • Data Quality
  • Regression Testing

These tests compare huge volumes of data, so the only way to perform them in a reasonable time frame is to automate them.
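
As an illustration of what an automated Data Completeness test can look like, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not DvT's implementation. The table names (src_orders, dw_orders), the columns, and the helper functions are hypothetical, and it assumes source and target can be queried through the same connection.

```python
import sqlite3

def table_profile(conn, table, key, numeric_column):
    """Return (row_count, distinct_keys, column_sum) for one table.

    Identifiers are interpolated directly, so they must come from trusted
    test configuration, never from user input.
    """
    query = (
        f"SELECT COUNT(*), COUNT(DISTINCT {key}), "
        f"COALESCE(SUM({numeric_column}), 0) FROM {table}"
    )
    return conn.execute(query).fetchone()

def completeness_report(conn, source, target, key, numeric_column):
    """Return only the metrics that differ, as {metric: (source, target)}."""
    src = table_profile(conn, source, key, numeric_column)
    tgt = table_profile(conn, target, key, numeric_column)
    labels = ("row_count", "distinct_keys", "column_sum")
    return {name: (s, t) for name, s, t in zip(labels, src, tgt) if s != t}

# Hypothetical data: the warehouse load dropped order 3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER, amount INTEGER);
    CREATE TABLE dw_orders  (order_id INTEGER, amount INTEGER);
    INSERT INTO src_orders VALUES (1, 100), (2, 250), (3, 75);
    INSERT INTO dw_orders  VALUES (1, 100), (2, 250);
""")

report = completeness_report(conn, "src_orders", "dw_orders", "order_id", "amount")
print(report)
```

An empty report means the three metrics agree. Matching aggregates do not prove that every row is identical, so a check like this is a fast first pass rather than full verification.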

The 3 Biggest Issues with Data Warehouse Testing

  • The #1 method for comparing data between the sources and the target data warehouse is Sampling, also known as “Stare and Compare”. It is an attempt to verify data dumped into Excel spreadsheets by viewing or “eyeballing” it. Less than 10% of the data is usually verified, and reporting is manual.
  • The #2 method, MINUS queries, subtracts the data sets from each other twice (source minus target, then target minus source) and leaves you to analyze the leftover rows. It is inefficient and produces no audit trail or reporting.
  • Both methods require SQL programming, and very few testers, analysts or operations people know SQL.
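
To make the MINUS approach concrete, the following sketch runs the subtraction in both directions. It uses Python's standard-library sqlite3 module, where the operator is spelled EXCEPT (MINUS is the Oracle spelling). The tables src_customers and dw_customers and the function minus_both_ways are hypothetical names chosen for the example.

```python
import sqlite3

def minus_both_ways(conn, source, target):
    """Run the two subtractions a MINUS-query test needs.

    Returns (rows only in source, rows only in target). Table names are
    interpolated directly and must come from trusted configuration.
    """
    only_in_source = conn.execute(
        f"SELECT * FROM {source} EXCEPT SELECT * FROM {target}"
    ).fetchall()
    only_in_target = conn.execute(
        f"SELECT * FROM {target} EXCEPT SELECT * FROM {source}"
    ).fetchall()
    # Sort in Python so the result order does not depend on the engine.
    return sorted(only_in_source), sorted(only_in_target)

# Hypothetical data: one changed row, one missing row, one unexpected row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customers (id INTEGER, name TEXT);
    CREATE TABLE dw_customers  (id INTEGER, name TEXT);
    INSERT INTO src_customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO dw_customers  VALUES (1, 'Ann'), (2, 'Bobby'), (4, 'Dee');
""")

only_in_source, only_in_target = minus_both_ways(conn, "src_customers", "dw_customers")
print(only_in_source)  # [(2, 'Bob'), (3, 'Cy')]
print(only_in_target)  # [(2, 'Bobby'), (4, 'Dee')]
```

Each table is scanned twice, and the leftover rows still have to be matched up by hand to tell a changed row from a missing one. EXCEPT also works on distinct rows, so duplicated rows in the target go undetected.
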

Data Migration Testing: Easily Validate & Test the Data Migration Process

Migrating data has become one of the most challenging initiatives for IT managers. Although these projects yield high business benefits (such as cost savings, increased productivity, and improved data manageability), they tend to involve a high level of risk due to the volume and criticality of the data.

To reduce risk and ensure that the data has been migrated and transformed correctly, you need to implement a thorough validation and QA strategy. DvT helps you test your data quickly and easily.

Testing the Transformations (ETL)

For tables with transformations, you can create Table Pairs: one table aimed at the existing system and one at the new system. You can run the queries at any time, schedule them for a particular date and time, or trigger them after an event. Reports can then be automatically sent to your team.
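
The idea behind a table pair can be sketched as two queries that should return the same (key, value) rows: one applies the expected transformation to the existing system's data, the other reads what was actually loaded into the new system. The Python sketch below is an illustration only and does not show DvT's interface. The tables legacy_customers and new_customers and the function run_table_pair are hypothetical.

```python
import sqlite3

def run_table_pair(conn, source_query, target_query):
    """Compare two queries that each return (key, value) rows.

    Returns a list of (key, expected, actual) for every key whose values
    differ. A key missing on one side is reported with None on that side.
    """
    expected = dict(conn.execute(source_query).fetchall())
    actual = dict(conn.execute(target_query).fetchall())
    mismatches = []
    for key in sorted(expected.keys() | actual.keys()):
        if expected.get(key) != actual.get(key):
            mismatches.append((key, expected.get(key), actual.get(key)))
    return mismatches

# Hypothetical migration rule: full_name = first_name + ' ' + last_name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE legacy_customers (id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE new_customers    (id INTEGER, full_name TEXT);
    INSERT INTO legacy_customers VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
    INSERT INTO new_customers    VALUES (1, 'Ada Lovelace'), (2, 'Alan  Turing');
""")

mismatches = run_table_pair(
    conn,
    "SELECT id, first_name || ' ' || last_name FROM legacy_customers",
    "SELECT id, full_name FROM new_customers",
)
print(mismatches)  # [(2, 'Alan Turing', 'Alan  Turing')]
```

Keeping the transformation rule in the source-side query means the test states the business rule independently of the ETL code that is being tested.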

Database Upgrade Testing

Upgrading your database environment is a necessary component of your IT strategy. Testing throughout the upgrade process will mitigate the risk of lost, incomplete, or corrupted data, and help to ensure the success of your project.

The most important aspect of testing the database upgrade process is to make sure that the data within the older version transfers properly to the upgraded version, and to check that data integrity is maintained after the upgrade.
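
One way to check that data transferred properly is to fingerprint each table in the old and the upgraded database and compare the results. The sketch below is a minimal example under stated assumptions, not a description of how DvT works: both versions are stood in for by in-memory SQLite databases, and the accounts table and the table_fingerprint function are hypothetical.

```python
import hashlib
import sqlite3

def table_fingerprint(conn, table, key):
    """Return (row_count, sha256 hex digest) over all rows, ordered by key.

    Rows are hashed via repr(), so a value that changes type during the
    upgrade (for example 20 becoming 20.0) counts as a difference.
    """
    digest = hashlib.sha256()
    row_count = 0
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()

def make_db(rows):
    """Build a hypothetical database holding one accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER, owner TEXT, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
    return conn

old_db = make_db([(1, "alice", 10.5), (2, "bob", 20.0)])
good_upgrade = make_db([(1, "alice", 10.5), (2, "bob", 20.0)])
bad_upgrade = make_db([(1, "alice", 10.5), (2, "bo", 20.0)])  # truncated value

old_fp = table_fingerprint(old_db, "accounts", "id")
print(old_fp == table_fingerprint(good_upgrade, "accounts", "id"))  # True
print(old_fp == table_fingerprint(bad_upgrade, "accounts", "id"))   # False
```

A fingerprint only says whether a table differs, not where. When it does differ, a row-by-row comparison on the key is still needed to locate the affected records.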

How DvT Can Help

DvT supports the complex logic that is needed to ensure that the transitioned data reflects your business logic, especially if the structure of the data changed as a result of the upgrade (due to switching vendors, differing versions, etc.).

Using DvT to test the database upgrade gives your team the ability to:

  • Ensure data integrity after loading is complete
  • Prevent unexpected failures in production
  • Accomplish up to 100% data verification

Supported Data Sources

DvT supports databases, data marts, data warehouses, Hadoop, and flat files as either sources or targets.