Thursday, January 5, 2017

Exports and Imports in ODI

Hello Friends!!

Let's try to explore imports and exports in ODI. This is in fact a very important part of being able to implement a development project successfully.
We should know how to export the development artefacts from one environment and import them into another environment.
The typical life cycle goes like this: develop projects in the development environment, then export the projects, models, etc. and check them in to SVN.
For a different environment, check out the same artefacts from SVN and import them into that environment's repository.
Now let's try to dig into this topic in a simple manner.
To understand it, let's first look at the concept of the internal ID.
An internal ID is a unique ID that ODI gives to each and every object that is developed inside it.

How is the internal ID generated?
When you create a work repository, you are asked for a repository ID; this can be any number from 0 to 999.
If the repository ID given is, say, 2, ODI pads it to 002 so that it becomes a three-digit number.
The internal ID of an object is then calculated by appending the repository ID to an auto-incremented number: <uniqueNo><RepoID>.
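
As a rough illustration (this is only a sketch of the numbering scheme described above, not ODI's actual code), the internal ID of the object whose counter value is 11 in a repository with ID 2 would be built like this:

# Sketch only: illustrates the <uniqueNo><RepoID> scheme described above.
def internal_id(unique_no, repo_id):
    # The repository ID is padded to three digits, e.g. 2 -> "002".
    return int(str(unique_no) + str(repo_id).zfill(3))

print(internal_id(11, 2))    # 11002
print(internal_id(11, 902))  # 11902 -- same counter, different repository ID

Two repositories created with the same repository ID will therefore hand out identical internal IDs, which is exactly why the situation described next causes trouble.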

So now that we understand that ODI identifies objects by their internal IDs, consider two different master/work repositories that were created with the same repository ID: even though they contain different objects, those objects can end up with identical internal IDs, and ODI will treat them as if they were the same data.

All the objects in a work repository are related to each other using these IDs; the repository tables are joined on them.
However, objects of the master repository and the work repository are related to each other using codes.
If objects from one repository are imported into another repository, there is a chance of a missing reference error; this happens because the imported object refers to an object ID that is not present in the new repository.
If you get an error like this, import the parent element first and then the child.

In a short while, let's explore how to import/export a project in a typical full development lifecycle for your customer.

First, let's explore the various modes of import/export; this will cover everything required to understand the development flow in a typical ODI project.

Smart Import and Smart Export: The Smart Import and Smart Export feature takes care of the parent-child dependencies of the objects that are being imported or exported.
Because ODI resolves these dependencies for you, the missing reference problem does not occur.

Import in Duplication mode: All the objects being imported are created in the target work repository with a new internal ID, built from the target repository's number (<uniqueNo><targetRepoID>).
However, in this mode, if the object being imported references another object ID, then either the object with that ID must already be present in the repository or you will have to fix the missing reference errors manually.
The IDs generated by Duplication mode (or any import) are assigned per object type.
So it is possible that a model folder has the ID 11101 while, in the same repository, 11101 is also the ID of a datastore.

Synonym Insert mode: 1. The same object ID is preserved in the target.
2. Missing references are not recalculated.
3. If an object with the same ID already exists in the repository, nothing is inserted.
4. If any of the incoming attributes violates a referential constraint, the import operation is aborted and an error message is thrown.
5. Note that sessions can only be imported in this mode.

Synonym Update mode:
This mode tries to modify the same object (with the same internal ID) in the repository.
This import type updates the objects already existing in the target Repository with the content of the export file.

If the object does not exist, the object is not imported.

Note that this import type does NOT delete child objects that exist in the repository but are not in the export file. For example, if the target repository contains a project with some variables and you want to replace it with one that contains no variables, this mode will update for example the project name but will not delete the variables under this project. The Synonym Mode INSERT_UPDATE should be used for this purpose.

Synonym Insert_Update mode:
If no ODI object exists in the target Repository with an identical ID, this import type will create a new object with the content of the export file. Already existing objects (with an identical ID) will be updated; the new ones, inserted.
Existing child objects will be updated, non-existing child objects will be inserted, and child objects existing in the repository but not in the export file will be deleted.

Dependencies between objects which are included into the export such as parent/child relationships are preserved. References to objects which are not included into the export are not recalculated.

This import type is not recommended when the export was done without the child components, because it will delete all sub-components of the existing object.

Import Replace mode:
This import type replaces an already existing object in the target repository by one object of the same object type specified in the import file.
This import type is only supported for scenarios, Knowledge Modules, actions, and action groups and replaces all children objects with the children objects from the imported object.

Note the following when using the Import Replace mode:

If your object is currently used by another ODI component (for example, a KM used by an integration interface), this relationship is not impacted by the import; the interfaces will automatically use the new KM in the project.
Here's a word of caution
When replacing a Knowledge module by another one, Oracle Data Integrator sets the options in the new module using option name matching with the old module's options. New options are set to the default value. It is advised to check the values of these options in the interfaces.

Replacing a KM by another one may lead to issues if the KMs are radically different. It is advised to check the interface's design and execution with the new KM.

So now suppose you are in a development environment and you are asked to set up your ODI workspace, given the URL of the SVN repository where the ODI artifacts are kept.
Follow these steps:
1. First ask your infra team for the work repository details and try to connect to that schema.
2. Once you connect to the schema, check whether the ODI objects have already been imported by some other developer; if yes, then you are spared this task.
3. If not, then you need to start the task yourself.
4. Check what sort of exports were produced for your project. If they are smart exports, nothing else is required; the dependencies are taken care of and you just need to smart import.
5. If the exports are not smart exports, then follow this order: first topology -> then model folders -> models (if model folders and models are exported separately) -> projects.   (In order to avoid conflicts, import things in Synonym Insert_Update mode.)
6. Once the import completes, the import report is generated. Bingo, check it and your project setup is done.

Thanks for Reading this!! 
Peace!!

Monday, December 5, 2016

Writing Custom Python Code in ODI

Namaskaram Friends!

Here is a short post on how to write custom python code inside ODI.
What does this mean?

If you look at the concept of procedures in ODI, you will find that procedures can be written for a number of technologies, using the technology drop-down.
This flexibility makes it a beautiful product: you can write your procedure in any given technology when the integration project involves many different technologies.

Python is a well known language these days, which is used in a number of industries because of its rich set of libraries.

In this post let's try to explore how to write Python code in ODI ( I am using ODI 11g version here).

Let's work on a very simple scenario:
Let's say we are working on a data integration project where the client gives us a location from which to pick a file and process it using interfaces (file to table).

The first step is, before we process the interface, to check whether the file exists in the given location/path or not.

To accomplish this task we will write simple python code.

Create a procedure, name it CheckFile, and select the technology as Jython.
Don't get confused: Jython is a Python language implementation that is tightly integrated with the Java platform, so all the Python code is still applicable here.

Here is the Python code (take care of the indentation, because Python uses indentation to define scope; otherwise the code will throw an error):

import os

def main():
    # The file path can also come from an ODI variable that reads the location
    # from a configuration table, which makes the solution even more configurable.
    file_path = "D:\\empdata.txt"
    if os.path.exists(file_path):
        print("path does exist")
        return 0
    else:
        # You can write your custom mitigation code here before failing the step.
        raise Exception("File not found: " + file_path)

if __name__ == "__main__":
    main()
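
As hinted in the comment, the hard-coded path can be replaced by an ODI variable so that the location is driven by a configuration table. A hedged example, assuming a project variable named FILE_PATH exists (the variable name here is hypothetical):

file_path = "#PROJECT_CODE.FILE_PATH"

ODI substitutes the variable's current value into the Jython text before the step is executed.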

The procedure checks whether the file exists in the mentioned path or not.

Create a package like this, with the CheckFile procedure as the first step followed by your interface:

When you execute the package, it first checks whether the file exists; if the file does not exist, it performs whatever you have written in the else part of your Python code.

Hope this was useful!!!

Peace

Tuesday, November 29, 2016

ODI Standalone Agents what are they and how to configure

The ODI environment comes with a concept of Agent.

If you have to give a crisp one-liner definition of what an ODI agent is, we would say that it is:
an ODI runtime process that orchestrates the ETL flow.

In other words, an ODI agent is a Java process that facilitates the execution of the code developed in ODI.

ODI 11g has two types of agent: the standalone agent and the J2EE agent.
The J2EE agent runs in an application server and inherently takes advantage of the server's enterprise features, such as high availability, connection pooling and so on. A standalone agent does not give you all these features, but it does not require an application server to host it.

In this article, let's try to explore how standalone agents can be configured and used in ODI.

Navigate to the <oracle installed location>\oracledi\agent\bin

The directories may change from installation to installation, based on where you have installed ODI.

You will see two files there: odiparams.bat and odiparams.sh. The installation folder contains both; based on the platform, choose which file needs to be edited.

So for a typical Windows installation, open the odiparams.bat file and modify the parameters below.

set ODI_MASTER_DRIVER=oracle.jdbc.OracleDriver
set ODI_MASTER_URL=jdbc:oracle:thin:@localhost:1521:ORCL
set ODI_MASTER_USER=<name of the master repository schema>

set ODI_MASTER_ENCODED_PASS=aYyp5IHh5AGwjgDJw2DgCy

How to generate the encoded password?

In the same folder, type the following command in cmd:

>> encode <master_repo_schema_password>
An encoded string appears.

Copy the encoded string and change the parameter value accordingly.

Similarly change the following parameters:

set ODI_SUPERVISOR=SUPERVISOR

set ODI_SUPERVISOR_ENCODED_PASS=h2yXu8fMrWKoE90kGUN5Extg (generate this string using the encode utility)

One more parameter:
set ODI_SECU_WORK_REP=WORKREP1 (this would be the name of the workrepo associated with the master repo)

Log in to ODI Studio and navigate to Topology.

Under Agents, right-click and select New Agent.
Enter the information:

Once the information is entered in the page, navigate to the agent/bin folder.
Execute the following command:

>> agent -NAME=<name of the Agent> -PORT=<port no>   (run this from the agent/bin folder; the parameter values are the same as those given while creating the agent in ODI Studio).

Once the agent is started, the cmd window will display something like this:

Go to the agent page in ODI Studio and click the Test button; since the agent is running, it will show the "agent successfully connected" message.
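
If you want a quick check outside the studio as well, here is a minimal Python sketch (assuming the agent runs on localhost and on the port you entered while creating it; 20910 is just an example) that simply verifies the agent's port is listening:

import socket

def agent_port_open(host, port, timeout=5):
    # Returns True if something is listening on the agent's host/port.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return True
    except socket.error:
        return False
    finally:
        s.close()

print(agent_port_open("localhost", 20910))

This only proves that the process is up and listening; the Test button in the studio performs the actual agent connection test.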

That's it, the standalone ODI agent is configured.
Please leave comments if there are any doubts regarding any concepts.

Friday, November 25, 2016

Resetting USER passwords in ODI 11g

Let's try to explore one small thing.

Suppose you installed ODI on your system and, after the installation was done, the user was created.
But because of other work you did not log in to ODI Studio, or someone messed up the password, and somehow you are now not able to access the studio because the password does not work.

The obvious solution is to reset the password for the user and start using the studio again.

Let's see how this is done.

As you may know, all the users created are stored in tables of the master repository.

Log in to the master repository schema in the database where the repository is hosted.

Execute the following Query:

SELECT * FROM SNP_USER WHERE WUSER_NAME = '<name of your user>';
You will see one column named PASS; to reset the password we will have to update this column with the new encoded password.

To get the encoded password, we will make use of the encode utility shipped with ODI.

Based on the platform/OS where ODI is installed, navigate to the following folder:

<ODI installed location>\Oracle_ODI1\oracledi\agent\bin

Open cmd/terminal and change to this directory.

>> cd <ODI installed location>\Oracle_ODI1\oracledi\agent\bin
>> encode <newpassword> (give just the password, arrows are only to show things clearly)

One encoded string will be generated.

Execute the following Query in master repository:

update snp_user
set PASS = 'aYya6Tx0FiJV,hpw94CKPy' (use the encoded string generated above)
where WUSER_NAME = '<name of your user>';

Commit the changes.

You are done. Go back to ODI Studio and use the new password (the normal string, not the encoded one).
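
If you have to do this for several environments, here is a minimal Python sketch of the same steps, purely as an optional convenience and under a few assumptions: the ODI encode utility (agent/bin/encode.bat or encode.sh) prints only the encoded string, the cx_Oracle driver is installed to run the update, and all names below are placeholders for your environment.

import subprocess
import cx_Oracle

def reset_odi_password(encode_cmd, odi_user, new_password,
                       repo_schema, repo_schema_pwd, db_dsn):
    # Encode the new password with the ODI encode utility
    # (assumes the utility prints only the encoded string on stdout).
    encoded = subprocess.check_output(encode_cmd + " " + new_password,
                                      shell=True).decode().strip()

    # Update the PASS column of SNP_USER in the master repository schema.
    conn = cx_Oracle.connect(repo_schema, repo_schema_pwd, db_dsn)
    try:
        cur = conn.cursor()
        cur.execute("UPDATE SNP_USER SET PASS = :p WHERE WUSER_NAME = :u",
                    p=encoded, u=odi_user)
        conn.commit()
    finally:
        conn.close()

# Example with placeholder values:
# reset_odi_password("D:\\Oracle_ODI1\\oracledi\\agent\\bin\\encode.bat",
#                    "SUPERVISOR", "newpassword",
#                    "ODI_MASTER", "<master schema password>", "localhost:1521/ORCL")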

Thanks for reading, see you in another post.

Saturday, October 24, 2015

Sunopsis Memory Engine in ODI

Hello Friends!!

So here is the requirement: using ODI we need to create interfaces for transferring data from a file (in one location) to a file (in another location).

You might be thinking we could directly move the files from one location to another using a secure copy command from Unix, but think of it this way: the files arrive at the source location as an everyday job, and our scheduled ODI jobs will pick the files and put them in the other location (how easy is that).

Now the constraint is that the client has not provided me any RDBMS to run my mappings in between, and to run ODI I need at least one RDBMS area for processing.

What to do now?

In a situation like this, the “Sunopsis Memory Engine” can help us.

The Sunopsis Memory Engine is actually an in-memory, pure Java based RDBMS (HSQL).

So our problem is solved: we can use this as the RDBMS area for processing. When you create the interface where the source and target datastores are both of File type, check the checkbox “Staging area different from target” and select the Sunopsis Memory Engine.


In the flow tab select the KMS:
LKM: File to SQL
IKM: SQL to File Append.



Save the interface and execute it…
Bingo!!! The file is generated at the other server.

Tuesday, April 14, 2015

PL/SQL- Package Data Session persistence, how to use it, using PRAGMA SERIALLY_REUSABLE

In this post I will try to show the session persistence of package data, the issues with the default behaviour, and what can be done to alter this behaviour.

Session: Whenever a client connects from a machine to a database server, a session is set up between the server and the client; this session is responsible for any communication between the two.
If the same user logs in from a different client/machine, a separate session is created.

In Oracle the data for a particular session resides in the PGA.

What is meant by package data persistence for a session?
The default behaviour of package data is that once the variables in a package are assigned, they hold their values until the session ends or disconnects.

Let's look at examples to understand the issue better.


Note: All code mentioned here can be executed in the HR schema provided with the Oracle 11g installation.

Create a new session using SQL Developer:

CREATE OR REPLACE PACKAGE DEMO1
AS
employee_name VARCHAR2(256);
PROCEDURE EMPLOYEE_SALARY(e_name IN VARCHAR2);
END DEMO1;

CREATE OR REPLACE PACKAGE BODY DEMO1
AS
PROCEDURE EMPLOYEE_SALARY(e_name IN VARCHAR2)
AS
BEGIN
SELECT last_name into employee_name FROM EMPLOYEES WHERE first_name=e_name;
END;
END;

This is a simple package that returns the last name of the employee when the first name of the employee is used as the parameter of the EMPLOYEE_SALARY procedure.

Execute the below block now:

SET SERVEROUTPUT ON ;
BEGIN
DEMO1.EMPLOYEE_SALARY('Neena');
dbms_output.put_line('The name of the employee is '||DEMO1.employee_name);
END;

The output of the block is:
anonymous block completed
The name of the employee is Kochhar

Now the DEMO1.employee_name variable holds this value, and until it is overwritten by another call with a different parameter value, the value will persist for the session.

So if you only execute the below block:

SET SERVEROUTPUT ON ;
BEGIN
dbms_output.put_line('The name of the employee is '||DEMO1.employee_name);
END;

The output will still be:
anonymous block completed
The name of the employee is Kochhar

Now execute the below block from a different session; for this, open SQL*Plus:

You will see that for this session the employee_name variable has not yet been assigned any value, as this is a new session.

This session-level data persistence is useful, but it can easily cause memory issues, because the package variables remain in the PGA for the whole session, eating up memory and degrading performance.

To overcome this behaviour we can make Oracle package data persist only for a single call within the session, using PRAGMA SERIALLY_REUSABLE.

Let's look at an example of this:

CREATE OR REPLACE PACKAGE DEMO2
AS
PRAGMA SERIALLY_REUSABLE;
employee_name VARCHAR2(256);
PROCEDURE EMPLOYEE_SALARY(e_name IN VARCHAR2);
END DEMO2;

CREATE OR REPLACE PACKAGE BODY DEMO2
AS
PRAGMA SERIALLY_REUSABLE;
PROCEDURE EMPLOYEE_SALARY(e_name IN VARCHAR2)
AS
BEGIN
SELECT last_name into employee_name FROM EMPLOYEES WHERE first_name=e_name;
END;
END;


SET SERVEROUTPUT ON ;
BEGIN
DEMO2.EMPLOYEE_SALARY('Nancy');
dbms_output.put_line('The name of the employee is '||DEMO2.employee_name);
END;

Output for this is 
anonymous block completed
The name of the employee is Greenberg

Execute the second block:

SET SERVEROUTPUT ON ;
BEGIN
dbms_output.put_line('The name of the employee is '||DEMO2.employee_name);
END;

Output is :
anonymous block completed
The name of the employee is 

Compared to the previous case, the data now persists only for a specific call and not for the whole session!!!!!

Hope this helps............

Wednesday, April 8, 2015

ODI- Static Error Control and Data Quality checks using CKM in Oracle Data Integrator

In order to perform data quality checks and initial cleaning of the data, ODI provides two kinds of DQ checks: Static Control and Flow Control.
Static Control -> This is used to check the quality of data that is already present in the datastore; that is why it is known as static.
Flow Control -> This is used to check the quality of data during the integration process, i.e. while the data is being loaded into the target table; at that time we can separate out the error records (the records that do not fulfil the constraints set for data quality).

In this post we will see Static Control.

Take the SCOTT schema and the EMP table from the schema.

Now add one record to the table:

INSERT INTO SCOTT.EMP
VALUES (1234,'Test','Test',null,sysdate,1000,NULL,10);


Assume that this record came in during the extract process and is bad data, and now you want to do a static check of your data to know its quality before initiating the transfer to the target database.

ODI part:
1. Reverse-engineer this table into some model in ODI.

2. Import your technology-specific CKM into the project, as shown in the screenshot.


3. In the EMP datastore, add one constraint (in this example, a condition on ENAME, as shown in the generated code below).

4. Start the DQ check process.
5. Once the session is complete, click on Control -> Errors; this will display the records that do not satisfy the constraint.

Understanding the process (CKM):
1. The operator window for your session will look something like this.


2. An SNP_CHECK_TAB table is created in the work schema of the default physical schema.
3. E$ tables are created in the work schema.
4. The Insert CK errors step will show code like this (the relevant filter portion):
from SCOTT.EMP  EMP
where not (
ENAME<>'Test'
)

This is where the data gets filtered out into the E$ table.