BISP 

Implementing ETL Solution using IBM DataStage


IBM InfoSphere DataStage and QualityStage 11.3



Data Warehouse Fundamentals

An introduction to Data Warehousing – Purpose of a Data Warehouse – Data Warehouse Architecture – Operational Data Store – OLTP vs. Warehouse Applications – Data Marts – Data Marts vs. Data Warehouses – Data Warehouse Life Cycle.

Data Modeling

Introduction to Data Modeling – Entity Relationship model (E-R model) – Data Modeling for Data Warehouse – Normalization process – Dimensions and fact tables – Star Schema and Snowflake Schemas.

ETL Design Process

Introduction to Extraction, Transformation & Loading – Types of ETL Tools – Key tools in the market.

Introduction to DataStage Version 11.3

DataStage introduction – IBM Information Server architecture – DataStage components – DataStage main functions – Client components.

DataStage Designer

Introduction to DataStage Designer – Importance of Parallelism – Pipeline Parallelism – Partition Parallelism – Partitioning and collecting – Symmetric Multi Processing (SMP) – Massively Parallel Processing (MPP) – Partition techniques – DataStage Repository Palette – Passive and Active stages – Job design overview – Designer work area – Annotations – Creating jobs – Importing flat file definitions – Managing the Metadata environment – Dataset management – Deletion of Dataset – Routines – Arguments.
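
As a rough illustration of the partitioning and collecting topics listed above, the following Python sketch contrasts round-robin partitioning with hash partitioning on a key column, then collects the partitions back into one stream. It is plain Python, not DataStage code; the function names and the sample `cust_id`/`amount` columns are hypothetical.

```python
from zlib import crc32

def round_robin_partition(rows, n):
    """Deal rows to n partitions in turn; spreads rows evenly, ignores keys."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, n, key):
    """Send rows with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        # crc32 gives a hash that is stable across runs (unlike built-in hash on str)
        h = crc32(str(row[key]).encode("utf-8"))
        parts[h % n].append(row)
    return parts

def collect(parts):
    """Gather all partitions back into a single list, partition by partition."""
    return [row for part in parts for row in part]

# Hypothetical sample data
rows = [{"cust_id": c, "amount": a}
        for c, a in [(1, 10), (2, 20), (1, 30), (3, 40), (2, 50)]]

rr = round_robin_partition(rows, 2)
hp = hash_partition(rows, 2, "cust_id")
```

Key-based operations such as joins, aggregations, and duplicate removal generally need rows with equal keys to sit in the same partition, which is why a hash-style method is usually chosen ahead of them, while round robin is a reasonable default when only an even spread matters.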

Working with Parallel Job Stages

Database Stages

Oracle Connector – Teradata Connector – ODBC

File Stages

Sequential file – Dataset – File set – Lookup file set – Complex Flat File Stage

Processing Stages

Copy – Filter – Funnel – Sort – Remove Duplicates – Aggregator – Modify – Compress – Expand – Decode – Encode – Switch – Pivot stage – Lookup – Join – Merge – Lookup, Join and Merge – Change Capture – Change Apply – Compare – Difference – Surrogate Key Generator – Transformer
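
To give a feel for what a change-capture style comparison does, here is a minimal Python sketch that compares a "before" and an "after" data set on a key and labels each row as an insert, edit, delete, or copy (unchanged). This only illustrates the idea and is not how the DataStage stage is implemented; the function name, the string labels, and the sample columns are assumptions made for the example.

```python
def change_capture(before, after, key, value_cols):
    """Compare two row lists on `key` and label each row with a change type."""
    before_by_key = {r[key]: r for r in before}
    after_by_key = {r[key]: r for r in after}
    changes = []
    for k, new in after_by_key.items():
        old = before_by_key.get(k)
        if old is None:
            changes.append(("insert", new))      # key only in "after"
        elif any(old[c] != new[c] for c in value_cols):
            changes.append(("edit", new))        # key in both, values differ
        else:
            changes.append(("copy", new))        # key in both, values equal
    for k, old in before_by_key.items():
        if k not in after_by_key:
            changes.append(("delete", old))      # key only in "before"
    return changes

# Hypothetical sample data
before = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Delhi"}, {"id": 3, "city": "Agra"}]
after = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Mumbai"}, {"id": 4, "city": "Goa"}]

result = change_capture(before, after, "id", ["city"])
```

The labelled rows can then drive a downstream apply step: inserts and edits are written to the target, deletes are removed or flagged, and copies are usually dropped.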

Debug Stages

Head – Tail – Peek – Column Generator – Row Generator – Write Range Map Stage.

Routines creation

Advanced Stages in Parallel Jobs (Version 11.3)

Range Lookup process – Surrogate Key Generator stage – Slowly Changing Dimension stage – iWay stage – FTP stage – Pivot Enterprise – Job performance analysis – Resource estimation – Performance Optimizer – Slowly Changing Dimensions implementation – Transformer stage looping condition – Transformer stage Last Row handling
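
The Slowly Changing Dimensions topic above can be sketched in a few lines of Python. The example shows one common variant, Type 2, in which a changed attribute expires the current dimension row and adds a new row with a fresh surrogate key. It is a simplified illustration under assumed column names (`sk`, `cust_id`, `city`, `start_date`, `end_date`, `current`), not the behaviour of the DataStage stage itself.

```python
from datetime import date

def apply_scd2(dimension, incoming, today):
    """Apply Type 2 changes: expire the current row and append a new version."""
    next_sk = max((r["sk"] for r in dimension), default=0) + 1
    current = {r["cust_id"]: r for r in dimension if r["current"]}
    for rec in incoming:
        old = current.get(rec["cust_id"])
        if old is not None and old["city"] == rec["city"]:
            continue                      # no change, keep the current row
        if old is not None:
            old["end_date"] = today       # close out the previous version
            old["current"] = False
        new = {"sk": next_sk, "cust_id": rec["cust_id"], "city": rec["city"],
               "start_date": today, "end_date": None, "current": True}
        dimension.append(new)
        current[rec["cust_id"]] = new
        next_sk += 1                      # simple in-memory surrogate key counter
    return dimension

# Hypothetical sample data
dim = [{"sk": 1, "cust_id": 100, "city": "Pune",
        "start_date": date(2020, 1, 1), "end_date": None, "current": True}]
incoming = [{"cust_id": 100, "city": "Mumbai"}, {"cust_id": 200, "city": "Delhi"}]

dim = apply_scd2(dim, incoming, date(2024, 1, 1))
```

After the call, customer 100 has two rows (the expired Pune row and the current Mumbai row) and customer 200 has one new current row, so history is preserved rather than overwritten.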

DataStage Director

Introduction to DataStage Director – Validating DataStage jobs – Executing DataStage jobs – Job execution status – Monitoring a job – Job log view – Job scheduling – Creating batches – Scheduling batches.

DataStage Administrator

DataStage project administration – Editing and adding projects – Deleting projects – Cleaning up project files – Environment variables – Environment management – Auto purging – Runtime Column Propagation (RCP) – Adding checkpoints for sequencers – NLS configuration – Generated OSH (Orchestrate engine) – System formats such as date and timestamp – Project protection – Version details.


Job Sequencers

Arranging job activities in a Sequencer – Triggers in Sequencer – Restartability – Recoverability – Notification activity – Terminator activity – Wait For File activity – Start Loop activity – Execute Command activity – Nested Condition activity – Exception Handler activity – User Variables activity – End Loop activity – Adding checkpoints

InfoSphere QualityStage


  • Why Data Quality

  • Data Quality Challenges

  • Types of Data Quality Tools Provided by IBM

  • Differences between Information Analyzer (IA) and QualityStage (QS)

  • QualityStage Architecture

  • DataStage Quality Stages

    • Investigate Stage

      • Default Class Descriptions

      • Word Investigation

      • Character Discrete Investigation

      • Character Concatenate Investigation


    • Standardize Stage

      • Standardize Process

      • Domain Specific Rule sets

      • Domain Preprocessing Rule sets

      • Creation of Custom Rule sets with Examples (SEPLIST/STRIPLIST, Classification File, Dictionary Files, Pattern Action File, Lookup Tables, Override Tables)

      • Introduction to Pattern Action language

      • Types of Patterns (Conditional and Unconditional)

      • Build customized Action statements using PAL with Examples

      • Standardization Quality Assessment (SQA) Report


    • Match Stage

      • Match Process

      • Creation of Match Passes

      • Match Frequency Stage & Reports

      • Unduplicate Match Stage with Examples

      • Reference Match Stage with Examples


    • Survive Stage

      • Importance of Survive Stage

      • Build Survive Process

      • Implementation of Survive Rules


  • Explanation about the entire Data Quality Life Cycle


IBM Information Server Administration

IBM InfoSphere DataStage administration – Opening the IBM Information Server Web console – Setting up a project in the console – Customizing the project dashboard – Setting up security – Creating users in the console – Assigning security roles to users and groups – Managing licenses – Managing active sessions – Managing logs – Managing schedules – Backing up and restoring IBM Information Server.




DATASTAGE 11.3 REAL TIME PROJECT


  • Project architecture and BRD discussion

  • Dimensional tables and fact tables with modeling

  • Flow of subject area discussion

  • Design of HLDs and LLDs for a project

  • Project flow-job design process with ETL Documents

  • Complex jobs discussion-unit testing process

  • System, User Acceptance, Regression and End-to-End Testing

  • Deployment Process of code to different phases

  • Creation of job design documents or overview docs, tech specs

  • Production support process

  • UNIX scripting for automation of code

  • Discussion of scheduling process with Control-M/Autosys

  • Fixing of Defects and Problem Tickets and Incidents


Additional Features


  • DataStage projects in the Banking, Insurance and Health Care domains.

  • DataStage Certification Guidance.

  • Performance Tuning of Parallel Jobs.

  • DataStage installation process and setup.

  • Comprehensive materials covering data warehousing basics, DataStage concepts, UNIX commands, shell scripting and databases.

Course Id:
ETL001 
Course Fees:
301 USD