Wednesday, December 22, 2010

Interview Questions

Question

Article I. Previous Projects

What was your last project?

Explain the business side of the project?

What was your exact role?

What was the size of the database?

Who was your manager?

Who did you worked with?

What the largest table? How many rows?

Give me some examples of the database tables?

Was there any external source for files?

Article II. Informatica

Section 2.01 Objects

What is the use of joiner transformation?

How to use sorted port?

What is passive and active transformation?

Why do we override mapping SQL?

How to get a previous row in an informatica mapping?

What is pre session and post session database and OS level options?

How do I

Section 2.02 Example of mapping

Give me an example of most complex mapping you have done? Why was it complex?

Explain to me how a type II dimension can be implemented?

Section 2.03 Exception handling

Section 2.04 Re-run

Section 2.05 Performance tuning

Article III. Data Modeling

What is a dimension?

What is a fact?

Are dimensions normalized?

What is a slowing changing, type II dimension?

Article IV. SQL

What is an outer join?

Does oracle support full outer join?

How to find duplicate rows in a table?

There is a table will addresses of employees. There can be more than one address for the same employee. How do I get the latest address?

What is connect by clause?

Article V. Oracle

Section 5.01 Objects

What is the difference between procedures, packages and functions?

What are inline views?

What are materialized views?

How to incrementally refresh a materialized view?

What are the types of indexes?

What is difference between bitmap and B-tree index?

What are the types of cursors?

Give me an example where ref cursor is used?

Why do we need cursors?

What is Auth id clause?

What is commit point?

What is a exception block?

While updating a table; how do you know, how many rows have been updated?

Section 5.02 Tables

What are PCT free and PCT used?

What is te difference between time stamp and date?

Section 5.03 SQLLDR

Give me all the options for SQLLDR?

What is training space option?

Can you uses function in SQLLDR?

Can you use functions in SQLLDR?

How do process the bad records?

What is the difference between bad and rejected files?

How do you skip records?

How do yo skip the 1^st few records?

Section 5.04 Performance Tuning

When do you use bit map index?

What is gather statistics vs. DBMS_STATS?

How to use show plan?

A query is running slow; what will you do to make is perform faster?

Article VI. Shell Script

What is .profile file?

How to remove header row?

How to remove a specific row from file?

How to count rows?

How to sum a specific column in a file?

How to write a shell script to ftp a file?

How to write a shell script to call a oracle procedure?

How to write a script a to call SQLLDR?

Article VII. Testing

How did you do Unit testing?

How dd you compare rows between two different databases?

Article VIII. Oracle interview

INDEX

What is an index?
What types of indexes exists in oracle?
What is a bitmap index?
When do you use bitmap index?
Partitioned Tables and Indexes in Oracle 9i
Hash Partitioned Global Indexes Oracle 10g
Partitioned Tables and Indexes in Oracle 10g
What are IOT?
What are function index ? how do they work?

JOINS

What king of joins exists?
What is an equi join?
What is an inner join?
What is a outer join? When do you use it? Give an example?

PACKAGES

What’s the difference between procedures, packages & functions?
When do you use packages?
What is author id clause?

Oracle database

What is a tablespace?
What is block?
What is transportable tablespaces
What is a rollback segment?
What are the various memory components in Oracle?
Explain the physical architecture of oracle?

TABLES

What is pct free/ pct used
How do u set pct free pct used for a table?
What is partitioning?
What type of partitioning is present in oracle?
What is

TUNING

Define the how will you tune an sql statement data is performing badly?
Step 1
Step 2
Step 3
Step 4
What is analyzing table?
What is the type of optimizers?
How do optimize star schema?
How do u use explain plan?

How do you find duplicates in a table?
How do you filter a data in a group by query?

PLSQL

What is cursor? When do you use a cursor? What type of cursors exists? What are the differences and when each of them is used?
What are built in packages?
How do you write to a file from oracle?
What is PLSQL block?
How do you do exception handling? What is error propagation?
What is %ROWTYPE, %TYPE
What is a pragama ?
What is ref cursor?
What is dynamic PLSQL

SECURITY

How do you implement row level security in oracle?

10.

What is a control file in SQL Loader?
What are synonyms?
What is data pump?

11. ORACLE other objects

What are materialized views?
How are materialized views used?
What are the materialized views refresh strategy?
What are IOT?
What are oracle streams?

Article IX. ETL interview

What is ETL?
How may etl resource have you managed?
Discuss UR etl strategy?
Discuss all steps, analysis, mapping document, design document , development, testing, migration, & performance tuning strategies for ETL?
How do you distribute work?
What is migration worthy ETL?
Discuss automation of ETL?
What is the biggest load you have managed?
How do you manage files?
How do you populate a type II dimensions?
Explain the most difficult etl process
How do you do reconciliation?
How do u manage exceptions?
What kind of etl exceptions exists? How are they managed?
What is error handling?
What is data quality?
how do assure data quality in etl process?

Article X. Data Warehousing interview

What is data warehouse?

Consolidation- Many systems/Subject areas/Location
Integration
Historical Store
Aggregation

What’s the difference between ODS/DWH & data mart?

What is R Kimball strategy vs Bill Inmon strategy?

What strategy do you prefer? Why? When?

Article XI. Data Modeling interview

What is data model?

What is 1^st normal form..2^nd..3^rd…

What’s the difference between logical & physical model?
What is Star schema, Snowflake schema, and 3nf modeling?

How does many to many relationship implemented in Modeling?

What notations are used for modeling?

What IS Type I & type II dim?

What is fact?
What is a dim?
What is a fact less fact?
What is a non-aggregate fact?
What is snapshot fact?
Have you done data modeling?
What is the difference between OLTP data model & OLAP data model?

Article XII. Informatica Interview

Article XIII. Unix interview

Ksh

Awk

Sed

How do u get success & failure of Script

How can U 2 run process in parallel?

Article XIV. Reporting & OALP interview

Have you done reporting?

What kind of tools have you used?

What’s the difference between Reporting & olap?

Whats OLAP? ROLAP?MOLAP?HOLAP?

Name some tools on MOLAP & ROLAP?

Article XV. General interview Questions

Tell me about yourself?

I have 10 year so experience in IT

90% of the time I have worked on Informatica.

My recent experience has been on Informatica.

I will start with my recent experience.

Tell about the current project

Talk about UR Role in Previous Project

Ask him if he wants details…

Project about

Role

How many people managing?

Tech aspects

I will like to come and give a personal interview.

Tuesday, June 15, 2010

Difference between OLTP and OLAP

OLTP

Current data
Short database transactions
Online update/insert/delete
Normalization is promoted
High volume transactions
Transaction recovery is necessary

OLAP
Current and historical data
Long database transactions
Batch update/insert/delete
Denormalization is promoted
Low volume transactions
Transaction recovery is not necessary

Sunday, June 13, 2010

Constraint Based Loading

http://www.geekinterview.com/question_details/30013

Chapter 3

Search for Deltas
Use of Minus
Audit Columns
XML DTD
Hint for performance

Sunday, May 16, 2010

Wednesday, May 12, 2010

Informatica Random

The Integration Service ensures that the target database code page is a superset of the source database code page.

Wednesday, May 5, 2010

Lookup (Informatica)

Transformation type:

Passive

Connected/Unconnected

Lookup transformation is used to look up data in a flat file, relational table, view, synonym or source qualifier.

Following tasks can be performed with a Lookup transformation:

Get a related value. Retrieve a value from the lookup table based on a value in the source. For example, the source has an employee ID. Retrieve the employee name from the lookup table.

Perform a calculation.Retrieve a value from a lookup table and use it in a calculation. For example, retrieve a sales tax percentage, calculate a tax, and return the tax to a target.

Update slowly changing dimension tables. Determine whether rows exist in a target.

Types of Lookups:

Relational or flat file lookup:
Use ODBC for relational. Use flat file wizard for flat file lookup.

Pipeline lookup:
Perform a lookup on application sources such as a JMS, MSMQ, or SAP. Drag the source into the mapping and associate the Lookup transformation with the source qualifier.

Connected or unconnected lookup:
A connected Lookup transformation receives source data, performs a lookup, and returns data to the pipeline. An unconnected Lookup transformation is not connected to a source or target. A transformation in the pipeline calls the Lookup transformation with a :LKP expression. The unconnected Lookup transformation returns one column to the calling transformation.
Points to remember:

Unconnected cache can be called by more than one ports.
Unconnected can be called by one or more tranformations.
More than one column can be returned from unconnected lookups by concatenating the fields in the sql override
Lookup cache persistant can be saved in a file and shared by lookups.
Lookup cache persistant is used when the cache is not going to change for a long time. This cache can be shared in multiple phases of the project.
To improve performance when processing relational or flat file lookup sources, create a pipeline Lookup transformation instead of a relational or flat file Lookup transformation.

Cached or uncached lookup:
Cache the lookup source to improve performance. If you cache the lookup source, you can use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session.
With a dynamic cache, the Integration Service inserts or updates rows in the cache. When you cache the target table as the lookup source, you can look up values in the cache to determine if the values exist in the target. The Lookup transformation marks rows to insert or update the target.

Option

Lookup
Type

Description

Lookup SQL Override

Relational

Overrides the default SQL statement to query the lookup table.

Specifies the SQL statement you want the Integration Service to use for querying lookup values. Use only with the lookup cache enabled.

For more information, see Lookup Query.

Lookup Table Name

Pipeline

Relational

The name of the table or source qualifier from which the transformation looks up and caches values. When you create the Lookup transformation, choose a source, target, or source qualifier as the lookup source. You can also import a table, view, or synonym from another database when you create the Lookup transformation.

If you enter a lookup SQL override, you do not need to enter the Lookup Table Name.

Lookup Source Filter

Relational

Restricts the lookups the Integration Service performs based on the value of data in Lookup transformation input ports. For more information about filtering source data from lookups, see Filtering Lookup Source Rows.

Lookup Caching Enabled

Flat File

Pipeline

Relational

Indicates whether the Integration Service caches lookup values during the session.

When you enable lookup caching, the Integration Service queries the lookup source once, caches the values, and looks up values in the cache during the session. Caching the lookup values can improve session performance.

When you disable caching, each time a row passes into the transformation, the Integration Service issues a select statement to the lookup source for lookup values.

Note: The Integration Service always caches flat file and pipeline lookups.

Lookup Policy on Multiple Match

Flat File

Pipeline

Relational

Determines which rows that the Lookup transformation returns when it finds multiple rows that match the lookup condition. You can select the first or last row returned from the cache or lookup source, or report an error. Or, you can allow the Lookup transformation to use any value. When you configure the Lookup transformation to return any matching value, the transformation returns the first value that matches the lookup condition. It creates an index based on the key ports instead of all Lookup transformation ports.

If you do not enable the Output Old Value On Update option, the Lookup Policy On Multiple Match option is set to Report Error for dynamic lookups. For more information about lookup caches, see Lookup Caches.

Lookup Condition

Flat File

Pipeline

Relational

Displays the lookup condition you set in the Condition tab.

Connection Information

Relational

Specifies the database containing the lookup table. You can define the database in the mapping, session, or parameter file:

-	Mapping. Select the connection object. You can also specify the database connection type. Type `Relational:` before the connection name if it is a relational connection. Type `Application:` before the connection name if it is an application connection.

Session. Use the $Source or $Target connection variable. If you use one of these variables, the lookup table must reside in the source or target database. Specify the database connection in the session properties for each variable. For more information about using $Source and $Target, see the Workflow Administration Guide.

-	Parameter file. Use the session parameter $DBConnectionName or $AppConnectionName, and define it in the parameter file.For more information about session parameters and parameter files, see the Workflow Administration Guide.

By default, the Designer specifies $Source if you choose an existing source table and $Target if you choose an existing target table when you create the Lookup transformation. You can override these values in the session properties.

The Integration Service fails the session if it cannot determine the type of database connection.

Source Type

Flat File

Pipeline

Relational

Indicates that the Lookup transformation reads values from a relational table, flat file, or source qualifier.

Tracing Level

Flat File

Pipeline

Relational

Sets the amount of detail included in the session log.

Lookup Cache Directory Name

Flat File

Pipeline

Relational

Specifies the directory used to build the lookup cache files when you configure the Lookup transformation to cache the lookup source. Also used to save the persistent lookup cache files when you select the Lookup Persistent option.

By default, the Integration Service uses the $PMCacheDir directory configured for the Integration Service.

Lookup Cache Persistent

Flat File Pipeline Relational

Indicates whether the Integration Service uses a persistent lookup cache, which consists of at least two cache files. If a Lookup transformation is configured for a persistent lookup cache and persistent lookup cache files do not exist, the Integration Service creates the files during the session. Use only with the lookup cache enabled.

Lookup Data Cache Size Lookup Index Cache Size

Flat File

Pipeline

Relational

Default is Auto. Indicates the maximum size the Integration Service allocates to the data cache and the index in memory. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at run-time. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.

If the Integration Service cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Integration Service cannot store all the data cache data in memory, it pages to disk.

Use with the lookup cache enabled.

Dynamic Lookup Cache

Flat File

Pipeline

Relational

Indicates to use a dynamic lookup cache. Inserts or updates rows in the lookup cache as it passes rows to the target table.

Use only with the lookup cache enabled.

Output Old Value On Update

Flat File

Pipeline

Relational

Use with dynamic caching enabled. When you enable this property, the Integration Service outputs old values out of the lookup/output ports. When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a new row in the cache, it outputs null values.

When you disable this property, the Integration Service outputs the same values out of the lookup/output and input/output ports.

This property is enabled by default.

Cache File Name Prefix

Flat File

Pipeline

Relational

Use with persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files. The Integration Service uses the file name prefix as the file name for the persistent cache files it saves to disk. Only enter the prefix. Do not enter .idx or .dat.

You can enter a parameter or variable for the file name prefix. Use any parameter or variable type that you can define in the parameter file. For information about using parameter files, see the Workflow Administration Guide.

If the named persistent cache files exist, the Integration Service builds the memory cache from the files. If the named persistent cache files do not exist, the Integration Service rebuilds the persistent cache files.

Recache From Lookup Source

Flat File

Pipeline

Relational

Use with the lookup cache enabled. When selected, the Integration Service rebuilds the lookup cache from the lookup source when it first calls the Lookup transformation instance.

If you use a persistent lookup cache, it rebuilds the persistent cache files before using the cache. If you do not use a persistent lookup cache, it rebuilds the lookup cache in memory before using the cache.

Insert Else Update

Flat File

Pipeline

Relational

Use with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of insert. When enabled, the Integration Service inserts new rows in the cache and updates existing rows When disabled, the Integration Service does not update existing rows.

For more information about defining the row type, see Using Update Strategy Transformations with a Dynamic Cache.

Update Else Insert

Flat File

Pipeline

Relational

Use with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of update.

When enabled, the Integration Service updates existing rows, and inserts a new row if it is new. When disabled, the Integration Service does not insert new rows. For more information about defining the row type, see Using Update Strategy Transformations with a Dynamic Cache.

Datetime Format

Flat File

Click the Open button to select a datetime format. Define the format and field width. Milliseconds, microseconds, or nanoseconds formats have a field width of 29.

If you do not select a datetime format for a port, you can enter any datetime format.Default is MM/DD/YYYY HH24:MI:SS. The Datetime format does not change the size of the port.

Thousand Separator

Flat File

If you do not define a thousand separator for a port, the Integration Service uses the properties defined here.

You can choose no separator, a comma, or a period. Default is no separator.

Decimal Separator

Flat File

If you do not define a decimal separator for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here.

You can choose a comma or a period decimal separator. Default is period.

Case-Sensitive String Comparison

Flat File

Pipeline

The Integration Service uses case-sensitive string comparisons when performing lookups on string columns.

For relational lookups, the case-sensitive comparison is based on the database support.

Null Ordering

Flat File

Pipeline

Determines how the Integration Service orders null values. You can choose to sort null values high or low. By default, the Integration Service sorts null values high. This overrides the Integration Service configuration to treat nulls in comparison operators as high, low, or null.

For relational lookups, null ordering is based on the database support.

Sorted Input

Flat File

Pipeline

Indicates whether or not the lookup file data is sorted. This increases lookup performance for file lookups. If you enable sorted input, and the condition columns are not grouped, the Integration Service fails the session. If the condition columns are grouped, but not sorted, the Integration Service processes the lookup as if you did not configure sorted input. For more information about sorted input, see Flat File Lookups.

Lookup Source is Static

Flat File

Pipeline

Relational

The lookup source does not change in a session.

Pre-build Lookup Cache

Allow the Integration Service to build the lookup cache before the Lookup transformation requires the data.

Auto.

-	Always allowed. The Integration Service builds the lookup cache before the Lookup transformation receives the first source row.

-	Always disallowed. The Integration Service builds the lookup cache when the Lookup transformation receives the source row.

Subsecond Precision

Relational

Specifies the subsecond precision for datetime ports.

For relational lookups, you can change the precision for databases that have an editable scale for datetime data. You can change subsecond precision for Oracle Timestamp, Informix Datetime, and Teradata Timestamp datatypes.

Enter a positive integer value from 0 to 9. Default is 6 (microseconds).

Differences Between Connected and Unconnected Lookups
Connected Lookup	Unconnected Lookup
Receives input values directly from the pipeline.	Receives input values from the result of a :LKP expression in another transformation.
Use a dynamic or static cache.	Use a static cache.
Cache includes the lookup source columns in the lookup condition and the lookup source columns that are output ports.	Cache includes all lookup/output ports in the lookup condition and the lookup/return port.
Can return multiple columns from the same row or insert into the dynamic lookup cache.	Designate one return port (R). Returns one column from each row.
If there is no match for the lookup condition, the Integration Service returns the default value for all output ports. If you configure dynamic caching, the Integration Service inserts rows into the cache or leaves it unchanged.	If there is no match for the lookup condition, the Integration Service returns NULL.
If there is a match for the lookup condition, the Integration Service returns the result of the lookup condition for all lookup/output ports. If you configure dynamic caching, the Integration Service either updates the row the in the cache or leaves the row unchanged.	If there is a match for the lookup condition, the Integration Service returns the result of the lookup condition into the return port.
Pass multiple output values to another transformation. Link lookup/output ports to another transformation.	Pass one output value to another transformation. The lookup/output/return port passes the value to the transformation calling :LKP expression.
Supports user-defined default values.	Does not support user-defined default values.

To be continued...