Monday, 29 July 2013

Frequently Asked Informatica Questions



o    Which transformation should we use to normalize the COBOL and relational sources?
Normalizer transformation. When we drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation appears automatically, creating input and output ports for every column in the source.
 
o    Difference between static cache and dynamic cache?
With a dynamic cache, when a new row arrives the server checks the lookup cache to see whether the row already exists. If it does not, the row is inserted into both the target and the cache. With a static cache, the server checks the cache and writes to the target, but the cache itself is not updated.

If you cache the lookup table, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Informatica Server inserts or updates rows in the cache during the session. When you cache the target table as the lookup, you can look up values in the target and insert them if they do not exist, or update them if they do.
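The difference can be sketched in a few lines of Python. This is only an illustration of the idea, not Informatica code: the function names and the `id` key are hypothetical, and the cache is modelled as a plain set of keys.

```python
def load_with_static_cache(source_rows, target):
    """Static cache: built once from the target and never changed during the run."""
    cache = {row["id"] for row in target}       # snapshot taken before the run
    for row in source_rows:
        if row["id"] not in cache:
            target.append(row)                  # target grows, cache does not
    return target


def load_with_dynamic_cache(source_rows, target):
    """Dynamic cache: every insert into the target is mirrored in the cache."""
    cache = {row["id"] for row in target}
    for row in source_rows:
        if row["id"] not in cache:
            target.append(row)
            cache.add(row["id"])                # keeps the cache in step with the target
    return target


source = [{"id": 1}, {"id": 2}, {"id": 2}]      # id 2 arrives twice in one run
static_result = load_with_static_cache(source, [{"id": 1}])
dynamic_result = load_with_dynamic_cache(source, [{"id": 1}])
```

In this sketch the static version inserts id 2 twice because the cache never learns about the first insert, while the dynamic version inserts it once.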
o    What are the join types in joiner transformation?
The join types are Normal, Master Outer, Detail Outer, and Full Outer.
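As a rough illustration of what each join type keeps, here is a small Python sketch. It assumes the documented behaviour that a master outer join keeps all detail rows and a detail outer join keeps all master rows; the `joiner` function and the sample rows are hypothetical, and columns that Informatica would fill with NULL are simply absent here.

```python
def joiner(master, detail, key, join_type="normal"):
    """Join two lists of dicts on `key`, mimicking the four Joiner join types."""
    master_keys = {m[key] for m in master}
    detail_keys = {d[key] for d in detail}
    rows = []
    # Matching rows appear in every join type.
    for d in detail:
        for m in master:
            if m[key] == d[key]:
                rows.append({**m, **d})
    # Master outer and full outer also keep unmatched detail rows.
    if join_type in ("master_outer", "full_outer"):
        rows += [d for d in detail if d[key] not in master_keys]
    # Detail outer and full outer also keep unmatched master rows.
    if join_type in ("detail_outer", "full_outer"):
        rows += [m for m in master if m[key] not in detail_keys]
    return rows


master = [{"id": 1, "dept": "HR"}, {"id": 3, "dept": "IT"}]
detail = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
```

Calling `joiner(master, detail, "id", "master_outer")` on this data keeps ids 1 and 2, while `"detail_outer"` keeps ids 1 and 3.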
o    In which conditions can we not use the Joiner transformation (limitations of the Joiner transformation)?
We cannot use a Joiner transformation in the following situations:
1. Both input pipelines originate from the same Source Qualifier transformation.
2. Both input pipelines originate from the same Normalizer transformation.
3. Both input pipelines originate from the same Joiner transformation.
4. Either input pipeline contains an Update Strategy transformation.
5. We connect a Sequence Generator transformation directly before the Joiner transformation.

If the Joiner is configured to use sorted data, define the join condition so that it receives sorted data in the same order as the sort origin.

o    What is the Lookup transformation?
It is used to look up data in a relational table or view.

Lookup is a passive transformation used to look up data in a flat file or a relational table.
o    What are the differences between the Joiner transformation and the Source Qualifier transformation?
1. A Source Qualifier operates only on relational sources within the same schema. A Joiner can join heterogeneous sources or relational sources in different schemas. 2. A Source Qualifier requires at least one matching column to perform a join. A Joiner joins based on matching ports. 3. Additionally, a Joiner requires two separate input pipelines and should not have an Update Strategy or Sequence Generator upstream (this is no longer true from Infa 7.2).

1) A Joiner can join relational sources that come from different data sources, whereas in a Source Qualifier the relational sources must come from the same data source. 2) We need matching keys to join two relational sources in a Source Qualifier transformation, whereas a Joiner does not need matching keys to join two sources.
 
o    Why use the lookup transformation?
Used to look up data in a relational table or view.

From Informatica 7.1 onward, we can also look up data in a flat file.

A lookup is used to perform one of the following tasks: - to get a related value - to perform a calculation - to update a slowly changing dimension table

Generally we use a Lookup transformation to 1) get a related value from a key column value, 2) check whether a record already exists in the table, 3) maintain slowly changing dimension tables.

A Lookup transformation is used to check for matching values in source or target tables, to update slowly changing dimensions, and to perform some calculations.

o     How can you improve session performance in an Aggregator transformation?
By using incremental aggregation.

Place a Sorter transformation before the Aggregator and enable the Sorted Input option.

Yes, we can use the Sorted Input option to improve performance. The Aggregator transformation tends to reduce performance because it uses caches.
o    Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?
Yes, because a reusable transformation is not contained within any mapplet or mapping.
o    What is meant by lookup caches?
The session first reads all unique rows from the reference table or file to fill the local buffer; then, for each row received from the upstream transformation, it tries to match the row against the local buffer.

The Informatica Server builds a cache in memory when it processes the first row of a cached Lookup transformation.

- When the server runs a Lookup transformation, it builds a cache in memory when it processes the first row of data in the transformation.
- The server builds the cache and queries it for each row that enters the transformation.
- The server creates index and data cache files in the lookup cache directory and uses the server code page to create the files.
- The index cache contains condition values and the data cache contains output values.

The Informatica Server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Informatica Server stores condition values in the index cache and output values in the data cache.
o    What is the Source Qualifier transformation?
SQ is an active transformation. It performs one of the following tasks: joins data from the same source database, filters rows when PowerCenter reads source data, performs an outer join, or selects only distinct values from the source.

In a Source Qualifier transformation, a user can define join conditions, filter the data, and eliminate duplicates. The default Source Qualifier query can be overwritten using these options; this is known as SQL override.

When we add a relational or flat file source definition to a mapping, we need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the records that the Informatica Server reads when it runs a session.
 
o    How does the Informatica Server increase session performance through partitioning the source?
Partitioning the session improves performance by creating multiple connections to sources and targets and loading data in parallel pipelines.
o    What are the settings that you use to configure the Joiner transformation?
Master group flow, detail group flow, join condition, and type of join.

Take the table with fewer rows as the master and the table with more rows as the detail, and define the join condition. The Joiner puts all rows from the master table into the cache and checks the condition against the detail table rows.

1) Master source 2) Detail source 3) Type of join 4) Join condition
o    What are the rank caches?
The Informatica Server stores group information in an index cache and row data in a data cache.

When the server runs a session with a Rank transformation, it compares each input row with the rows in the data cache. If the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input row.
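The replace-when-out-ranked behaviour can be sketched with a small heap per group. This is a simplified model rather than the server's actual implementation; `top_n_per_group` and its parameters are hypothetical names.

```python
import heapq


def top_n_per_group(rows, group_key, rank_key, n):
    """Keep the n highest-ranked rows per group, replacing out-ranked rows."""
    cache = {}  # group value -> min-heap of (rank value, sequence, row)
    for seq, row in enumerate(rows):
        heap = cache.setdefault(row[group_key], [])
        entry = (row[rank_key], seq, row)   # seq breaks ties so dicts are never compared
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry[0] > heap[0][0]:
            heapq.heapreplace(heap, entry)  # input row out-ranks the weakest stored row
    return {g: [e[2] for e in sorted(h, reverse=True)] for g, h in cache.items()}


sales = [
    {"region": "east", "sales": 10},
    {"region": "east", "sales": 30},
    {"region": "east", "sales": 20},
    {"region": "west", "sales": 5},
]
top_two = top_n_per_group(sales, "region", "sales", 2)
```

Here the dictionary keys play the role of the index cache (group information) and the heaps play the role of the data cache (row data).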
o    What is Code Page Compatibility?
When two code pages are compatible, the characters encoded in the two code pages are virtually identical.

Compatibility between code pages is used for accurate data movement when the Informatica Server runs in the Unicode data movement mode. If the code pages are identical, there will not be any data loss. One code page can be a subset or superset of another. For accurate data movement, the target code page must be a superset of the source code page.
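The superset rule can be illustrated with Python's own codecs. This is only an analogy to show why a narrower target code page loses data; `can_move_without_loss` is a hypothetical helper, not an Informatica function.

```python
def can_move_without_loss(text, target_code_page):
    """Return True if every character in `text` exists in the target code page."""
    try:
        text.encode(target_code_page)
        return True
    except UnicodeEncodeError:
        return False


source_value = "café"  # representable in Latin-1, but not in 7-bit ASCII
fits_latin1 = can_move_without_loss(source_value, "latin-1")
fits_ascii = can_move_without_loss(source_value, "ascii")
```

Latin-1 is a superset of ASCII, so moving data from ASCII to Latin-1 is safe, while the reverse direction can fail as it does for this value.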
 
o    How can you create or import a flat file definition into the Warehouse Designer?
By giving the server connection path.

Create the file definition in the Warehouse Designer, import the file from the location where it exists, or modify the source definition if the structure is the same.

First create it in the Source Analyzer, then drag it into the Warehouse Designer. You can't create a flat file target definition directly.

There is no way to import a target definition as a file in the Informatica Designer. So when the target definition for a file is created in the Warehouse Designer, it is created as a table, and then in the session properties of that mapping it is specified as a file.

You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer and then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica Server runs the session, it creates and loads the flat file.
o     What is the aggregate cache in the Aggregator transformation?
The Aggregator transformation has two caches: the data cache and the index cache. The data cache holds the aggregate values or detail records; the index cache holds the group-by column values, that is, the unique values of the records.

When the PowerCenter Server runs a session with an Aggregator transformation, it stores data in the aggregate cache until it completes the aggregation calculation.

The Aggregator stores data in the aggregate cache until it completes aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.
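A minimal sketch of the two caches, assuming a simple SUM aggregate. The names `aggregate`, `index_cache`, and `data_cache` are illustrative only, and overflow to cache files is not modelled.

```python
def aggregate(rows, group_by, value):
    """Sum `value` per `group_by`, holding partial results until all rows are read."""
    index_cache = {}  # group-by value -> slot number
    data_cache = []   # slot number -> running aggregate
    for row in rows:
        slot = index_cache.setdefault(row[group_by], len(data_cache))
        if slot == len(data_cache):   # first row of a new group
            data_cache.append(0)
        data_cache[slot] += row[value]
    # Results are released only after the last input row has been processed.
    return {group: data_cache[slot] for group, slot in index_cache.items()}


orders = [
    {"dept": "A", "amt": 10},
    {"dept": "B", "amt": 5},
    {"dept": "A", "amt": 7},
]
totals = aggregate(orders, "dept", "amt")
```

Because nothing can be emitted until every row has been seen, unsorted input forces the whole set of groups to stay cached, which is why sorted input helps performance.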
o     How can you recognise whether the newly added rows in the source get inserted into the target?
In a Type 2 mapping we have three options for recognising newly added rows: i) version number, ii) flag value, iii) effective date range.

We can add a count aggregate column to the target and generate it before running the mapping. There may be a couple of different ways to do this, or we can run a SQL query after each run of the mapping to make sure the new data was inserted.

In the session, SrcSuccessRows can be compared with TgtSuccessRows.

Check the session log or check the target table.
o    What are the types of lookup?
Connected and unconnected.

Lookups can also be classified in other ways:
Based on connection: 1. Connected 2. Unconnected
Based on source type: 1. Flat file 2. Relational
Based on cache: 1. Cached 2. Uncached
Based on cache type: 1. Static 2. Dynamic
Based on reuse: 1. Persistent 2. Non-persistent
Based on input: 1. Sorted 2. Unsorted

A connected lookup sits in the pipeline and receives its input directly from it. An unconnected lookup is called from an expression in another transformation.
 
o    What types of metadata are stored in the repository?
Database connections, global objects, sources, targets, mappings, mapplets, sessions, shortcuts, transformations.

The repository stores metadata that describes how to transform and load source and target data.

Metadata is data about data.

Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

The following types of metadata are stored in the repository: database connections, global objects, mappings, mapplets, multidimensional metadata, reusable transformations, sessions and batches, shortcuts, source definitions, target definitions, and transformations.
o    What happens if Informatica server doesn't find the session parameter in the parameter file?
Workflow will fail.
o     Can you access a repository created in a previous version of Informatica?
We have to migrate the repository from the older version to the newer version. Then we can use that repository.
o    Can you build and maintain a data warehouse without using an ETL tool?
Yes, we can do that using PL/SQL or stored procedures when all the data is in the same database. If the sources are flat files, we can't do it through PL/SQL or stored procedures.
o    How do you identify the changed records in operational data?
In my project, the source system itself sends us the new and changed records from the last 24 hours.
o    Why wouldn't you go for a snowflake schema?
A snowflake schema performs worse than a star schema because retrieving the data requires more joins.
Snowflake is preferred in two cases:
    If you want to load the data into more hierarchical levels of information, for example yearly, quarterly, monthly, daily, hourly, and by-the-minute information, prefer snowflake.
    If the input data contains many low-cardinality elements, prefer snowflake. Low-cardinality examples: sex, marital status, etc. Low cardinality means the number of distinct values is very small compared to the total number of records.
o    Name some measures in your fact table?
Sales amount.
o    How many dimension tables did you have in your project, and can you name some dimensions (columns)?
Product Dimension: Product Key, Product Id, Product Type, Product Name, Batch Number.
Distributor Dimension: Distributor Key, Distributor Id, Distributor Location.
Customer Dimension: Customer Key, Customer Id, CName, Age, Status, Address, Contact.
Account Dimension: Account Key, Acct Id, Acct Type, Location, Balance.
o    How many Fact and Dimension tables are there in your project?
In my module (Sales) we have 4 Dimensions and 1 fact table
o    How many Data marts are there in your project?

There are 4 Data marts, Sales, Marketing, Finance and HR. In my module we are handling only sales data mart.
o     What is the daily data volume (in GB/records)? What is the size of the data extracted in the extraction process?
Approximately 40k records per file per day on average. Every day we get 8 files from 8 source systems.
o    What is the size of the database in your project?
It depends on the client's database; it might be in the GB range.
o    What is meant by clustering?
Clustering stores two (or more) related tables together in a single buffer so that the data can be retrieved more easily.
o    How is it determined whether a session can be considered to have heterogeneous targets?
It is considered heterogeneous when there is no primary key and foreign key relationship between the targets.
o     Under what circumstances can a target definition be edited from the Mapping Designer, within the mapping where that target definition is being used?
We can't edit the target definition in the Mapping Designer. We can edit the target only in the Warehouse Designer. In our projects, however, we haven't edited any of the targets. If a change is required to the target definition, we ask the DBA to make the change and then we import the definition again. We don't have permission to edit the source and target tables.
o     Can a Source Qualifier be used to perform an outer join when joining two databases?
No, we can't join two different databases in an SQL override.
o    If your source is a delimited flat file, where can you change the delimiter later?
In the session properties, go to the Mapping tab, click the file instance, click Set File Properties, and change the delimiter option.
o     If the index cache file capacity is 2 MB and the data cache is 1 MB, and you enter data requiring 3 MB for the index and 2 MB for the data, what will happen?
Nothing will happen. Based on the buffer size available on the server, we can change the cache sizes. The maximum cache size is 2 GB.
o     What is the difference between the NEXTVAL and CURRVAL ports in the Sequence Generator?
Assume that they are both connected to the input of another transformation.
CURRVAL is NEXTVAL plus the Increment By value, so with the default settings the first row gets NEXTVAL 1 and CURRVAL 2.
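The relationship between the two ports can be sketched as a Python generator. This assumes the documented rule that CURRVAL equals NEXTVAL plus the Increment By value when both ports are connected; `sequence_generator` and its parameters are hypothetical names, and end values and cycling are not modelled.

```python
from itertools import islice


def sequence_generator(start=1, increment_by=1):
    """Yield (NEXTVAL, CURRVAL) pairs; CURRVAL is NEXTVAL plus the increment."""
    nextval = start
    while True:
        yield nextval, nextval + increment_by
        nextval += increment_by


first_three = list(islice(sequence_generator(), 3))
by_fives = list(islice(sequence_generator(start=10, increment_by=5), 2))
```

With the defaults, `first_three` holds the pairs (1, 2), (2, 3), and (3, 4).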
o    How does a dynamic cache handle duplicate rows?
With a dynamic cache, the server flags each row through the NewLookupRow port as it passes the row on: 0 means the row is unchanged and the cache is not modified, 1 means the row is new and is inserted into the cache, and 2 means the row exists and is updated in the cache. Because the first occurrence of a key is inserted into the cache, a later duplicate in the same run is found there and is not inserted again.
o    How will you find out whether your mapping is correct without connecting a session?
Through the Debugger.
o    If you are using an Aggregator transformation in your mapping, does your source contain dimension or fact data?
It depends on the requirements. There is no limitation on the Aggregator; the source can be either a dimension or a fact.
o    My input is Oracle and my target is a flat file. Can I load it? How?
Yes. Create a flat file target definition in the Warehouse Designer whose structure matches the Oracle table, then develop the mapping according to the requirement and map it to that flat file target. The target file is created in the TgtFiles directory on the server system.
o    For a session, can I use 3 mappings?
No. A session can have only one mapping. We have to create a separate session for each mapping.
o    What are the types of loading procedures?
At the Informatica level there are two types of load: 1) normal load and 2) bulk load. At the project level, load procedures depend on the project requirement, for example daily or weekly loads.
o     Are you involved in high-level or low-level design? What is meant by high-level design and low-level design?
Low-Level Design:
Requirements should be in Excel format, describing the field-to-field validations and the business logic that need to be present. Mostly the onsite team does the low-level design.
High-Level Design:
Describes the Informatica flow from Source Qualifier to target; put simply, it is the flow chart of the Informatica mapping. The developer prepares this design document.
o     What are the dimension load methods?
Daily loads or weekly loads, based on the project requirement.
o    Where do we use a lookup: between source and stage, or between stage and target?
It depends on the requirement. There is no rule that a lookup must be used in a particular stage.
o     How will you do SQL tuning?
We can do SQL tuning using the Oracle optimizer and TOAD.
o    Did you use any tools for scheduling other than Workflow Manager or pmcmd?
Yes, third-party tools such as Control-M.
o    What is SQL mass updating?
A)
UPDATE (SELECT hs1.col1 AS hs1_col1,
               hs1.col2 AS hs1_col2,
               hs1.col3 AS hs1_col3,
               hs2.col1 AS hs2_col1,
               hs2.col2 AS hs2_col2,
               hs2.col3 AS hs2_col3
        FROM hs1, hs2
        WHERE hs1.sno = hs2.sno)
SET hs1_col1 = hs2_col1,
    hs1_col2 = hs2_col2,
    hs1_col3 = hs2_col3;
o     What is the unbound field exception in the Source Qualifier?
"TE_7020 Unbound field in Source Qualifier" when running a session
A) Problem Description:
When running a session, the session fails with the following error:
"TE_7020 Unbound field <field_name> in Source Qualifier <SQ_name>"
Solution:
This error will occur when there is an inconsistency between the Source Qualifier and the source table.
Either there is a field in the Source Qualifier that is not in the physical table or there is a column
of the source object that has no link to the corresponding port in the Source Qualifier.
To resolve this, re-import the source definition into the Source Analyzer in Designer.
Bring the new source definition into the mapping. This will also re-create the Source Qualifier.
Connect the new Source Qualifier to the rest of the mapping as before.
o    Using an unconnected lookup, how do you remove nulls and duplicates?
We can't handle nulls and duplicates in an unconnected lookup. We can handle them in a connected lookup with a dynamic cache.
o    I have 20 lookups, 10 joiners, and 1 normalizer. How will you improve the session performance?
We have to calculate the lookup and joiner cache sizes.
o     What is version controlling?
It is the method of differentiating the old build from the new build after changes are made to the existing code. The old code is v001, and each subsequent change increases the version number, for example v002. In my last company we didn't use any version control; we just deleted the old build and replaced it with the new code.
We don't maintain version control in Informatica. We maintain the code in VSS (Visual SourceSafe), which stores the code with versioning. Whenever a client change request comes in after production starts, we have to create another build.
o     How is the Sequence Generator transformation different from other transformations?
The Sequence Generator is unique among all transformations because we cannot add, edit, or delete its default ports (NEXTVAL and CURRVAL).

Unlike other transformations, we cannot override the Sequence Generator transformation properties at the session level. This protects the integrity of the sequence values generated.
 
o    What are the advantages of Sequence generator? Is it necessary, if so why?
We can make a Sequence Generator reusable, and use it in multiple mappings. We might reuse a Sequence Generator when we perform multiple loads to a single target.

For example, if we have a large input file that we separate into three sessions running in parallel, we can use a Sequence Generator to generate primary key values. If we use different Sequence Generators, the Informatica Server might accidentally generate duplicate key values. Instead, we can use the same reusable Sequence Generator for all three sessions to provide a unique value for each target row.
o     What are the uses of a Sequence Generator transformation?
We can perform the following tasks with a Sequence Generator transformation:
o    Create keys
o    Replace missing values
o    Cycle through a sequential range of numbers
o    What is Sequence Generator Transformation?
The Sequence Generator transformation generates numeric values. We can use the Sequence Generator to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.

The Sequence Generator transformation is a connected transformation. It contains two output ports that we can connect to one or more transformations.
o     What is the difference between connected lookup and unconnected lookup?
Differences between connected and unconnected lookups:

Connected Lookup | Unconnected Lookup
Receives input values directly from the pipeline. | Receives input values from the result of a :LKP expression in another transformation.
We can use a dynamic or static cache. | We can use a static cache.
Supports user-defined default values. | Does not support user-defined default values.
o     What are connected and unconnected Lookup transformations?
We can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or we can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.

An unconnected Lookup transformation exists separate from the pipeline in the mapping. We write an expression using the :LKP reference qualifier to call the lookup within another transformation.

A common use for unconnected Lookup transformations is to update slowly changing dimension tables.
o     What is a Lookup transformation and what are its uses?
We use a Lookup transformation in our mapping to look up data in a relational table, view or synonym.

We can use the Lookup transformation for the following purposes:

-    Get a related value. For example, our source table includes the employee ID, but we want to include the employee name in the target table to make summary data easier to read.
-    Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
-    Update slowly changing dimension tables. We can use a Lookup transformation to determine whether records already exist in the target.
 
o    What is a lookup table?
The lookup table can be a single table, or we can join multiple tables in the same database using a lookup query override. The Informatica Server queries the lookup table or an in-memory cache of the table for all incoming rows into the Lookup transformation.

If our mapping includes heterogeneous joins, we can use any of the mapping sources or mapping targets as the lookup table.
o    Where do you define update strategy?
We can set the Update strategy at two different levels:
-    Within a session. When you configure a session, you can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.
-    Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.
o     What is Update Strategy?
When we design our data warehouse, we need to decide what type of information to store in targets. As part of our target table design, we need to determine whether to maintain all the historic data or just the most recent changes.
The model we choose constitutes our update strategy, how to handle changes to existing records.

Update strategy flags a record for update, insert, delete, or reject. We use this transformation when we want to exert fine control over updates to a target, based on some condition we apply. For example, we might use the Update Strategy transformation to flag all customer records for update when the mailing address has changed, or flag all employee records for reject for people no longer working for the company.
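The flagging logic can be sketched in Python. The numeric values mirror Informatica's DD_INSERT, DD_UPDATE, DD_DELETE, and DD_REJECT constants (0, 1, 2, 3); the `flag_customer` function, the `active` field, and the sample data are hypothetical and only follow the customer example above.

```python
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_customer(row, existing):
    """Return the update-strategy flag for one customer row."""
    current = existing.get(row["id"])
    if not row.get("active", True):
        return DD_REJECT              # e.g. people no longer with the company
    if current is None:
        return DD_INSERT              # not yet in the target
    if current["address"] != row["address"]:
        return DD_UPDATE              # mailing address has changed
    return DD_REJECT                  # nothing to write


existing_customers = {1: {"address": "12 Elm St"}}
moved = flag_customer({"id": 1, "address": "9 Oak Ave"}, existing_customers)
brand_new = flag_customer({"id": 2, "address": "3 Pine Rd"}, existing_customers)
```

In a real mapping the same decision would typically be written as a conditional expression inside the Update Strategy transformation rather than as a function.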
o    What are the different types of Transformations?
a) Aggregator transformation: The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums. The Aggregator transformation is unlike the Expression transformation, in that you can use the Aggregator transformation to perform calculations on groups. The Expression transformation permits you to perform calculations on a row-by-row basis only.

b) Expression transformation: You can use the Expression transformations to calculate values in a single row before you write to the target. For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. You can use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

c) Filter transformation: The Filter transformation provides the means for filtering rows in a mapping. You pass all the rows from a source transformation through the Filter transformation, and then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.

d) Joiner transformation: While a Source Qualifier transformation can join data originating from a common source database, the Joiner transformation joins two related heterogeneous sources residing in different locations or file systems.
e) Lookup transformation: Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Import a lookup definition from any relational database to which both the Informatica Client and Server can connect. You can use multiple Lookup transformations in a mapping.
The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. Use the result of the lookup to pass to other transformations and the target.
o    What is a transformation?
A transformation is a repository object that generates, modifies, or passes data. You configure logic in a transformation that the Informatica Server uses to transform data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.
Each transformation has rules for configuring and connecting in a mapping.
You can create transformations to use once in a mapping, or you can create reusable transformations to use in multiple mappings.
 
o    What are the tools provided by Designer?
The Designer provides the following tools:
-    Source Analyzer. Use to import or create source definitions for flat file, XML, COBOL, ERP, and relational sources.
-    Warehouse Designer. Use to import or create target definitions.
-    Transformation Developer. Use to create reusable transformations.
-    Mapplet Designer. Use to create mapplets.
-    Mapping Designer. Use to create mappings.
o     What are the different types of Commit intervals?
The different commit intervals are:
-    Target-based commit. The Informatica Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
-    Source-based commit. The Informatica Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.
 
o    What is Event-Based Scheduling?
When you use event-based scheduling, the Informatica Server starts a session when it locates the specified indicator file. To use event-based scheduling, you need a shell command, script, or batch file to create an indicator file when all sources are available. The file must be created or sent to a directory local to the Informatica Server. The file can be of any format recognized by the Informatica Server operating system. The Informatica Server deletes the indicator file once the session starts.
o    What are Sessions and Batches?
Sessions and batches store information about how and when the Informatica Server moves data through mappings. You create a session for each mapping you want to run. You can group several sessions together in a batch. Use the Server Manager to create sessions and batches.
o    What are Reusable transformations?
You can design a transformation to be reused in multiple mappings within a folder, a repository, or a domain. Rather than recreate the same transformation each time, you can make the transformation reusable, then add instances of the transformation to individual mappings. Use the Transformation Developer tool in the Designer to create reusable transformations.
o    What are the types of loading in Informatica?
There are two types of loading: normal loading and bulk loading. Normal loading loads record by record and writes a log entry for each, so it takes comparatively longer to load data to the target. Bulk loading loads a number of records at a time to the target database, so it takes less time.
o    What is the difference between active transformation and passive transformation?
An active transformation can change the number of rows that pass through it, but a passive transformation cannot change the number of rows that pass through it.
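A short Python sketch makes the row-count distinction concrete. The two functions are hypothetical stand-ins for a Filter (active) and an Expression (passive) transformation, not Informatica APIs.

```python
def filter_transformation(rows, condition):
    """Active: the number of output rows may differ from the number of input rows."""
    return [row for row in rows if condition(row)]


def expression_transformation(rows, expression):
    """Passive: exactly one output row per input row."""
    return [expression(row) for row in rows]


salaries = [1000, 2500, 4000, 5500]
high_earners = filter_transformation(salaries, lambda s: s > 3000)
with_raise = expression_transformation(salaries, lambda s: s * 1.1)
```

Four rows enter both functions; the filter passes only two of them on, while the expression always returns four.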
o    What are the various types of transformation?
Various types of transformation are: Aggregator Transformation, Expression Transformation, Filter Transformation, Joiner Transformation, Lookup Transformation, Normalizer Transformation, Rank Transformation, Router Transformation, Sequence Generator Transformation, Stored Procedure Transformation, Sorter Transformation, Update Strategy Transformation, XML Source Qualifier Transformation, Advanced External Procedure Transformation, External Transformation.
o    What is the use of tracing levels in transformation?
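Tracing levels determine how much detail about the mapping and its transformations the server writes to the session log. The levels are Terse, Normal, Verbose Initialization, and Verbose Data.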
