Supported Platforms for Integration Testing in DW/Big Data

ITAS Coverage

  • Development and verification of process workflows
  • Records count comparison and data validation
  • Capability to add business rules validation
  • Native Hadoop integration
  • Validation of primary and foreign key constraints
  • Defining and executing jobs through Web Interface
  • Monitoring job execution and success/failure status
  • Tracking daily units of work via batch processing
  • IBM Tivoli and Control-M workload scheduler integration
  • Ability to integrate and execute custom scripts developed by the Discover team
  • File comparison
  • Execution of SQL statements and stored procedures
  • Support for multiple environments
  • Custom function validations for extremely complex business rules
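
As an illustration of what record count comparison and primary/foreign key validation typically involve, here is a minimal Python sketch using the standard-library sqlite3 module. It is not ITAS code and does not reflect how ITAS implements these checks; the table names (src_orders, dw_orders, dw_customers), column names, and helper functions are all hypothetical and exist only for this example.

```python
import sqlite3


def row_count(conn, table):
    """Return the number of rows in `table` (names are trusted in this sketch)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def duplicate_keys(conn, table, key):
    """Return key values that occur more than once (primary key violations)."""
    sql = (
        f"SELECT {key} FROM {table} "
        f"GROUP BY {key} HAVING COUNT(*) > 1 ORDER BY {key}"
    )
    return [row[0] for row in conn.execute(sql)]


def orphan_keys(conn, child, fk, parent, pk):
    """Return child foreign key values that have no matching parent row."""
    sql = (
        f"SELECT DISTINCT c.{fk} FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL ORDER BY c.{fk}"
    )
    return [row[0] for row in conn.execute(sql)]


def build_demo():
    """Create hypothetical source and warehouse tables with seeded defects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_orders (order_id INTEGER, customer_id INTEGER);
        CREATE TABLE dw_orders (order_id INTEGER, customer_id INTEGER);
        CREATE TABLE dw_customers (customer_id INTEGER);
        INSERT INTO src_orders VALUES (1, 10), (2, 11), (3, 12);
        -- The warehouse copy has one duplicated row and one orphaned customer.
        INSERT INTO dw_orders VALUES (1, 10), (2, 11), (2, 11), (3, 99);
        INSERT INTO dw_customers VALUES (10), (11), (12);
        """
    )
    return conn


conn = build_demo()
counts_match = row_count(conn, "src_orders") == row_count(conn, "dw_orders")
dupes = duplicate_keys(conn, "dw_orders", "order_id")
orphans = orphan_keys(conn, "dw_orders", "customer_id", "dw_customers", "customer_id")
print(counts_match, dupes, orphans)
```

The sketch builds SQL from table and column names directly, which is acceptable only because the names are fixed in the example; a real tool would validate or whitelist identifiers before doing so.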

Goals for Project Management Team

  • Improve data quality
  • Accelerate testing cycles
  • Reduce cost and risks

Recognizing the importance of testing

There are many reasons to thoroughly test the data warehouse and use a QA process that is specific to data and ETL testing. For example:

  • Source data is often huge in volume and originates from a variety of data repository types.
  • The quality of source data cannot be assumed and should often be profiled and cleaned.
  • Inconsistency and redundancy may exist in source data.
  • Many source data records may be rejected; ETL/stored procedure logs will contain messages that must be acted upon.
  • Source field values may be missing where they should always be present.
  • Source data history, business rules, and audits of source data may not be available.
  • Enterprise-wide data knowledge and business rules may not be available to verify data.
  • Because data must often pass through multiple ETL phases before being loaded into the data warehouse, ETL components must be thoroughly tested to ensure that the variety of data behaves as expected within each development phase.
  • Heterogeneous sources of data—such as mainframes, spreadsheets, and UNIX files—will be updated asynchronously through time and then incrementally loaded.
  • Transaction-level traceability will be difficult to attain in a data warehouse.
  • The data warehouse will be a strategic enterprise resource and heavily relied upon.
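
Two of the points above, missing source field values and redundancy in source data, lend themselves to a simple profiling check. The following Python sketch is one possible way to do it under stated assumptions: the profile function, the sample records, and the field names (id, email) are hypothetical and are not taken from any Fusion Systems tool.

```python
from collections import Counter


def profile(records, required_fields):
    """Count missing required values per field and exact duplicate records."""
    missing = Counter()
    for rec in records:
        for field in required_fields:
            value = rec.get(field)
            # Treat None, empty, and whitespace-only values as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1
    # Two records are duplicates when every field/value pair is identical.
    seen = Counter(tuple(sorted(rec.items())) for rec in records)
    duplicates = sum(n - 1 for n in seen.values() if n > 1)
    return {"missing": dict(missing), "duplicates": duplicates}


# Hypothetical source extract with seeded problems.
records = [
    {"id": "1", "email": "a@example.com"},
    {"id": "2", "email": ""},
    {"id": "2", "email": ""},
    {"id": "3", "email": None},
]
report = profile(records, ["id", "email"])
print(report)
```

Running a check like this before the ETL load gives an early count of records that are likely to be rejected, rather than discovering them later in the ETL or stored procedure logs.
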
Fusion Systems INC
2200 S Main St,
Suite 105,
Lombard, IL, 60148.


© Copyright 2015. Fusion Systems Inc. All rights reserved.