Category Archives: Rittman Mead

OBIEE Performance – Why Metrics Matter (and…Announcing obi-metrics-agent v2!)

One of the first steps to improve OBIEE performance is to determine why it is slow. That may sound obvious—can’t fix it if you don’t know what you’re fixing, right? Unfortunately, the “Drunk Man anti-method”, in which we merrily stumble from one change to another, maybe breaking things along the way and certainly having a headache at the end of it, is far too prevalent. This comes about partly through unawareness of a better method to follow, and partly encouraged by tuning documents comprising reams of configuration settings to “tune” and fiddle with without really knowing why or how to prove if they indeed actually fixed anything…

Determining the cause of performance problems is often a case of working out what it’s not just as much as what it is. This is for two important reasons. Firstly, we begin to narrow down the area of focus and analysis. Secondly, we know what to leave alone. If we can prove that, for example, the database is running the query behind a report quickly, then there is no point “tuning” the database, because the problem doesn’t lie there. Similarly, if we can see that a report taking 60 seconds in total to run spends 59 seconds of that in the database, fiddling with Java Heap Size settings on OBIEE is going to at the very, very most reduce our total runtime to…59 seconds! This kind of time profiling is important to do, and something that we produce automatically in our Performance Analytics Report:

timeprofile01
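To make the arithmetic concrete, here is a minimal sketch of that kind of time profile. The 59-second and 1-second split comes from the example above; the function name `best_case_runtime` and the component labels are purely illustrative and are not part of any Rittman Mead tool.

```python
def best_case_runtime(profile, tuned_component):
    """Total runtime if `tuned_component` took zero time (the most any tuning could achieve)."""
    return sum(secs for name, secs in profile.items() if name != tuned_component)


# Hypothetical profile of a 60-second report: seconds spent in each component.
profile = {"database": 59.0, "obiee": 1.0}

print(best_case_runtime(profile, "obiee"))     # ceiling on gains from tuning OBIEE
print(best_case_runtime(profile, "database"))  # ceiling on gains from tuning the database
```

Even a perfect fix to the OBIEE side leaves 59 seconds on the clock, which is why the profile should decide where the effort goes.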

So, how do we pinpoint what is, or isn’t, going wrong? We need data, and specifically, we need metrics. We need log files, too, maybe for the real nitty-gritty of explain plans, but a huge amount can be understood about a system by looking at the metrics available.

Any modern operating system, from Windows to Linux, AIX to Solaris, will have copious utilities that will expose important metrics such as CPU usage, disk throughput, and so on. These can often be of great assistance in diagnosing performance problems.

OBIEE DMS Metrics

When it comes to OBIEE itself, we are spoilt by the performance counters that, since 11g (and still in 12c), have been exposed through the Dynamic Monitoring Service (DMS). They were even there in 10g too, but accessed through JMX. These metrics give us information ranging from things like the number of logged in users, through how many connections are open to a given database, down to real low-level internals like how many threads are in use for handling LDAP lookups. Crucially, there are also metrics showing current and peak levels of queueing within the various internal systems in OBIEE, which is where DMS becomes particularly important.

By being able to prove that OBIEE has, for example, run out of available connections to the database, we can confidently state that by changing a given configuration parameter we will alleviate a bottleneck. Not only that, but we can monitor and determine how many connections we really do need at a given workload level. The chart below illustrates this. The capacity of the connection pool is plotted against the number of busy connections. As the number of active sessions increases so does the pressure on the connection pool, until it hits capacity at which point queueing starts—which now means queries are waiting for a connection to the database before they can even begin to execute (and it’s at this point we’d expect to see response times suffer).
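The reasoning above (busy connections climbing until they hit pool capacity, then queueing) can be sketched in a few lines. The sample values and the field names `capacity`, `busy` and `queued` are assumptions made for illustration; they are not the actual DMS metric names.

```python
from typing import NamedTuple, Optional


class PoolSample(NamedTuple):
    minute: int    # sample time, minutes since the start of the test
    capacity: int  # configured size of the connection pool
    busy: int      # connections in use at sample time
    queued: int    # requests waiting for a connection


def first_saturation(samples) -> Optional[PoolSample]:
    """Return the first sample where the pool is full and requests are queueing."""
    for s in samples:
        if s.busy >= s.capacity and s.queued > 0:
            return s
    return None


# Hypothetical samples taken while the number of active sessions ramps up.
samples = [
    PoolSample(0, 10, 3, 0),
    PoolSample(5, 10, 7, 0),
    PoolSample(10, 10, 10, 0),  # at capacity, but nothing is waiting yet
    PoolSample(15, 10, 10, 4),  # queueing starts here
]

hit = first_saturation(samples)
print(hit)
```

From the sample where queueing begins onwards, slower response times would be the expected symptom, and the pool size is the parameter worth revisiting.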

So this is the kind of valuable information that is just not available anywhere other than the DMS metrics, and you can see from the above illustration just how useful it is. To access DMS metrics in OBIEE 11g and 12c, you have several options available out of the box:

Some of these are useful for programmatically scraping the data, others for interactively checking values at a point in time.

obi-metrics-agent – v2

At Rittman Mead, we always recommend collecting and storing DMS metrics (alongside others, including OS) all the time—not just if you find yourself with performance problems. That way you can compare before and after states, you can track historical trends—and you’re all set to hit the ground running with your diagnostics when (if) you do hit performance problems.

You can capture DMS metrics with the BI Management Pack in Enterprise Manager, you can write something yourself, or you can take advantage of an open-source tool from Rittman Mead, obi-metrics-agent.

I wrote about obi-metrics-agent originally when we first open-sourced it almost two years ago. The principle in version 2 is still the same; we’ve just rewritten it in Jython to remove the need for external dependencies such as Python and its associated libraries. We’ve also added native InfluxDB output, and retained the option to send data in the original carbon/graphite protocol.

You can run obi-metrics-agent and just write the DMS data to CSV, but our recommendation is always to persist it straight to a time series data store such as InfluxDB. Once you’ve collected the data you can analyse and monitor it with several tools, our favourite being Grafana (read more about this here).
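For readers who have not met InfluxDB before, the sketch below formats a single data point in InfluxDB's line protocol (measurement, tags, fields, timestamp). It is not taken from obi-metrics-agent, and the measurement name `obi_dms`, the tags and the field names are invented for illustration. It handles only integer and float fields.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character of `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _fmt(value):
    """Render a field value: integers take an 'i' suffix, floats are written as-is."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("this sketch only handles int and float fields")
    return f"{value}i" if isinstance(value, int) else repr(value)


def to_line(measurement, tags, fields, ts_ns):
    """Format one point as: measurement,tag=value field=value timestamp"""
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(tags[key], ',= ')}"
    body = ",".join(f"{_escape(k, ',= ')}={_fmt(v)}" for k, v in fields.items())
    return f"{head} {body} {ts_ns}"


# Hypothetical DMS-style reading; the timestamp is in nanoseconds.
line = to_line(
    "obi_dms",
    {"metric": "connection pool", "host": "bi01"},
    {"busy": 10, "utilisation": 0.85},
    1456000000000000000,
)
print(line)
```

Tags are written in sorted key order here, which keeps output stable and makes the lines easy to compare.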

As part of our Performance Analytics Service we’ve built a set of Performance Analytics Dashboards, making available a full-stack view of OBIEE metrics (including DMS, OS, and even Oracle ASH data), as seen in this video:

If you’d like to find out more about these and the Performance Analytics service offered by Rittman Mead, please get in touch. You can download obi-metrics-agent itself freely from our github repository.

The post OBIEE Performance – Why Metrics Matter (and…Announcing obi-metrics-agent v2!) appeared first on Rittman Mead Consulting.

Becky’s BI Apps Corner: Incrementals and Future dated Employee records

During the last few posts, we have delved into a few of the many interesting aspects of a BI Apps installation. Today I wanted to change gears a bit and talk about what starts to happen when you are past installation and configuration and begin running load plans. On a client project, I recently worked through a unique constraint error on W_EMPLOYEE_D that I found really interesting because of how it related to the incremental logic in the knowledge module (KM). Before I can really get into the workaround, we need to understand how incremental loads work in general for BI Apps.

High Level Overview

In the initial run, the load will grab a full set of data, i.e. all data from the source system, based on the data load parameters set during configuration. The same load plan will be used to load data incrementally, picking up only data that has changed since the most recent load plan has completed (Last Extract Date). The pre-built mappings have incremental change capture built into the knowledge module logic. When a load runs, it will extract records that have changed or been created since the Last Extract Date. The load plan determines which rows to extract by using the formula Source Last Updated Date >= (Last Extract Date – Prune Days).
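The extract formula is easy to try out in isolation. The sketch below applies Source Last Updated Date >= (Last Extract Date – Prune Days) to a few made-up rows; the dates, the one-day prune window and the function name `is_extracted` are assumptions for illustration, not values from a real BI Apps configuration.

```python
from datetime import datetime, timedelta


def is_extracted(last_updated, last_extract_date, prune_days):
    """Apply: Source Last Updated Date >= (Last Extract Date - Prune Days)."""
    return last_updated >= last_extract_date - timedelta(days=prune_days)


# Hypothetical Last Extract Date: the previous load finished at 02:00 on 20 Feb.
last_extract = datetime(2016, 2, 20, 2, 0)

rows = {
    "changed after last extract": datetime(2016, 2, 20, 9, 30),
    "changed inside prune window": datetime(2016, 2, 19, 23, 0),
    "unchanged for weeks": datetime(2016, 1, 5, 12, 0),
}

picked = [name for name, ts in rows.items() if is_extracted(ts, last_extract, prune_days=1)]
print(picked)
```

The second row shows what Prune Days buys you: a change stamped shortly before the last extract is picked up again rather than missed.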

In the weeds

Is it an incremental load? How does that get decided? Actually, that isn’t decided at the load plan level. Each individual package (run as a scenario) starts with a step that refreshes a variable called #IS_INCREMENTAL.

This variable’s refresh logic, shown in the below screenshot, will determine if this package previously completed successfully. After every successful completion an entry gets made into W_ETL_LOAD_DATES with the package name and date timestamp, amongst other audit information.

So we have a scenario running now with #IS_INCREMENTAL set to ‘Y’. What does the Knowledge Module (KM) do? Incremental runs normally have steps to load an I$ table (flow table) from the source logic and update the records in the target table based on the DETECTION_STRATEGY option in the KM. For Fact table loads, the option accepts the following values, each shown with its explanation.

  • OUTER: Outer join to target table when populating flow table in order to determine insert/update/useless records
  • NOT_EXISTS: NOT EXISTS clause is used when populating flow table in order to exclude records, which identically exist in target.
  • POST_FLOW: all records from source are loaded into flow table. After that an update statement is used to flag all rows in flow table, which identically exist in target.
  • NONE: all records from source are loaded into flow table. All target records are updated even when a target record is identical to flow table record.

In most cases, the option OUTER is used for facts, which updates the records based on primary keys (PKs). Incremental decisions are based on the values of system date columns such as CHANGED_ON_DT, AUX1_CHANGED_ON_DT, AUX2_CHANGED_ON_DT, AUX3_CHANGED_ON_DT and AUX4_CHANGED_ON_DT, populated from the source. This performs better than the NOT_EXISTS and POST_FLOW options, which compare each and every column to identify the records present.

For Slowly Changing Dimensions (like W_EMPLOYEE_D), the DETECTION_STRATEGY option accepts the following values, each shown with its explanation.

  • MINUS: MINUS clause is used when populating flow table in order to exclude records, which identically exist in target.
  • NOT_EXISTS: NOT EXISTS clause is used when populating flow table in order to exclude records, which identically exist in target.

The default option is NOT_EXISTS and Incremental decisions are based on PK’s and the date columns.
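To show what "exclude records which identically exist in target" means in practice, here is a small sketch using an in-memory SQLite database. The table names `stage_ds` and `target_d` and their columns are invented stand-ins, and this is not the SQL the KM actually generates. SQLite spells Oracle's MINUS as EXCEPT.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table stage_ds (integration_id text, name text, dept text);
    create table target_d (integration_id text, name text, dept text);
    insert into target_d values ('E1', 'Ann', 'Sales'), ('E2', 'Bob', 'HR');
    insert into stage_ds values
        ('E1', 'Ann', 'Sales'),    -- identical to target: excluded from the flow
        ('E2', 'Bob', 'Finance'),  -- changed: flows through
        ('E3', 'Cy',  'IT');       -- new: flows through
""")

# NOT_EXISTS style: keep staged rows that have no identical twin in the target.
not_exists_rows = con.execute("""
    select s.integration_id, s.name, s.dept
    from stage_ds s
    where not exists (
        select 1 from target_d t
        where t.integration_id = s.integration_id
          and t.name = s.name
          and t.dept = s.dept)
    order by s.integration_id
""").fetchall()

# MINUS style: the same idea expressed as a set difference.
minus_rows = con.execute("""
    select integration_id, name, dept from stage_ds
    except
    select integration_id, name, dept from target_d
    order by 1
""").fetchall()

print(not_exists_rows)
print(minus_rows)
```

Both strategies leave only the changed and the new row in the flow, which is the set the KM then has to insert or update.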

Future dated rows

Imagine now that during a full load, all records from the source tables for EMPLOYEES are brought forward into the data warehouse table W_EMPLOYEE_D. One of those records is an entry with an effective start date 2 weeks in the future. For W_EMPLOYEE_D one of the columns in the primary key is the effective start date. Fast forward two weeks to the date when the future dated row’s effective start date is the current date. During the incremental load on that date, the incremental logic for this one record compares the primary keys and all of the change indicator columns, and sees that the effective start date is greater than the last extract date from last night. This incremental comparison incorrectly determines that the record needs to be added to the dimension table, even though it is already there. Now we have an ERROR! The familiar unique constraint on the _U1 unique index rears its ugly head. On top of that, troubleshooting this duplicate does not turn up any duplicate records in the usual places (I$, DS, source tables, nada!). Isolating the two identical records and tracking them back to the source tables, however, turns up just the one record. The only clue is that the effective start date is today’s date. After a second occurrence, plus discussions and back and forth on an SR, a workaround is now available.

Workaround

Step 1. Remove any Future dated rows in W_EMPLOYEE_D

Step 2. Add a filter on the interface to prevent future dated rows from loading into W_EMPLOYEE_DS until they are <= current date.
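The idea behind the Step 2 filter can be sketched as follows. The actual filter expression used on the ODI interface is not shown in this post, so the row layout, the `effective_start` key and the function name `rows_to_stage` are all assumptions for illustration.

```python
from datetime import date


def rows_to_stage(rows, today):
    """Keep only rows whose effective start date is on or before `today`."""
    return [r for r in rows if r["effective_start"] <= today]


# Hypothetical source rows; E2 is future dated when viewed from 16 Feb.
source_rows = [
    {"employee": "E1", "effective_start": date(2016, 1, 4)},
    {"employee": "E2", "effective_start": date(2016, 3, 1)},
]

held_back = rows_to_stage(source_rows, today=date(2016, 2, 16))
released = rows_to_stage(source_rows, today=date(2016, 3, 1))
print([r["employee"] for r in held_back], [r["employee"] for r in released])
```

The future dated row is simply held back until its effective start date arrives, so it reaches the staging table on the day it takes effect rather than two weeks earlier.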

At our client, this mapping continues to run without any additional errors. The steps here are most likely version specific, and this issue is a bug known to Oracle, so please don’t hesitate to open an SR if you are hitting this specific issue, as a quick turnaround is very likely.

There are some other odds and ends about how incremental load plans work and I plan to gather them up and have another post about those in the coming weeks. If you want to learn more ins and outs of incrementals and more, join me for the upcoming remote ODI for BIApps course on March 14th-16th. We have only a few spots left so sign up here.


Data Integration Tips: Oracle Data Integrator – Quotes in Variables

It’s Sunday night, and we have just enough time for another quick Data Integration Tip from Rittman Mead. Another recent, yet somewhat trivial, challenge I ran into this week was passing quotation marks through to a Scenario as part of an ODI 11g Variable value (this is also applicable to ODI 12c). Let’s get right into the example.

To set the stage, I’m working on a project that includes integration with Hyperion Planning. This specific task is a command line call to CalcMgrCmdLineLauncher.cmd to execute a specific business rule against the data in the Planning application. In this case, our business rule name contains spaces and, for various reasons, the name cannot be changed to remove the spaces. The ODI 11g project is set up to run many of these planning business rules by calling a generic Scenario, based on a Package, passing variables to make it dynamic.

The command for the OdiStartScen ODI Tool in the calling Package looks like this:

OdiStartScen "-SCEN_NAME=UTIL_EXECUTE_CALCMGR" "-SCEN_VERSION=001" "-SYNC_MODE=1" "-EPM.PV_PLANNING_CALCMGR_RULE=Revenue and Burden All Contracts"

As you can see, there are double quotes around each of the parameters and their values being passed into the ODI Tool, including the Scenario name and version, the synchronous flag, and additional variables such as the PV_PLANNING_CALCMGR_RULE. What I found was that the Scenario had issues accepting the ODI variable value with spaces.

To work around this issue, I simply escaped the double quotes around the variable value with spaces by doubling them. The doubled double quotes allow the " character to be passed through as a part of the variable string value.

OdiStartScen "-SCEN_NAME=UTIL_EXECUTE_CALCMGR" "-SCEN_VERSION=001" "-SYNC_MODE=1" "-EPM.PV_PLANNING_CALCMGR_RULE=""Revenue and Burden All Contracts"""

Now, the Scenario UTIL_EXECUTE_CALCMGR can accept the variable value with spaces, and the necessary double quotes, and pass it on through to the CalcMgrCmdLineLauncher.cmd command.
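If you generate these tool calls from a script, the escaping rule is simple to apply mechanically. The helper below is a hypothetical sketch and not an ODI API; it only builds the command text. It wraps any variable value that contains a space in doubled double quotes and reproduces the command shown above.

```python
def quote_variable(name: str, value: str) -> str:
    """Build one OdiStartScen variable parameter, doubling the quotes around values with spaces."""
    if " " in value:
        value = f'""{value}""'
    return f'"-{name}={value}"'


def start_scen_command(scen_name: str, version: str, variables: dict) -> str:
    """Assemble the full OdiStartScen call as a single string (synchronous mode)."""
    parts = [
        "OdiStartScen",
        f'"-SCEN_NAME={scen_name}"',
        f'"-SCEN_VERSION={version}"',
        '"-SYNC_MODE=1"',
    ]
    parts += [quote_variable(name, value) for name, value in variables.items()]
    return " ".join(parts)


command = start_scen_command(
    "UTIL_EXECUTE_CALCMGR",
    "001",
    {"EPM.PV_PLANNING_CALCMGR_RULE": "Revenue and Burden All Contracts"},
)
print(command)
```

Values without spaces are left alone, so existing calls that already work are not changed by the helper.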

So there you have it, another Data Integration Tip from the folks at Rittman Mead. Simple as it was, I hope you find it useful! Check back regularly for more DI Tips and have a great week!

 


Introduction to Oracle R Training

Rittman Mead is thrilled to announce our new Introduction to Oracle R courses!

With the launch of 12c comes an exciting extension to OBIEE’s analytical capabilities—embedded R execution. Oracle has made several investments in the R language and its related technologies (Oracle R, ROracle, ORAAH, and Oracle R Enterprise). Based on our experience deploying advanced analytics solutions from within business intelligence departments, we view this simplified pairing of OBIEE and R as a great way forward in delivering advanced analytics across organizations.

Oracle R

But the world of R is a big one, with hundreds of open-source R packages and countless training avenues. With so many confusing options on the market, we wanted to simplify things for you. It can be difficult to know where to start, so we’ve designed an R course to help you bridge the gap between existing BI/DW skills and the new skills required to confidently derive insights with R. We want to help you look beyond the dashboard and delve into novel analytical techniques.

The Advanced Analytics course dives deeply into the analytical capabilities of 12c and R. We’ve designed the training so that prior knowledge of R isn’t assumed. The course builds on existing SQL and data visualization skills—covering R use cases from predictive analytics and time series forecasting to natural language processing and many others.


We want to share with you our vision of R-enhanced business intelligence that provides more insights, predictions, and actionable discoveries, all while scaling effortlessly with your data. With our training, you’ll learn to expand your analytic skills and become an invaluable asset to your company’s BI operation.

We are excited about what R brings to the world of BI and can’t wait to share our knowledge.

We will follow up this blog post with one about the business value of our new Introduction to Oracle R course and another post about the more technical aspects of the course.


Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table

Well, it’s been a while since I’ve posted one of our Rittman Mead Data Integration Tips, so I thought a recent challenge might be the next great candidate. I was working through the Kimball ETL Subsystems and the Error Event Schema, Subsystem 5 if you’re familiar with the methodology, and attempting to build the schema from Oracle Data Integrator 12c (ODI 12c) metadata tables. If you caught my presentation “A Walk Through the Kimball ETL Subsystems with Oracle Data Integration” at Oracle OpenWorld 2015, then you’ll know a bit more about where I’m coming from. You can still catch my presentation on this topic at both IOUG Collaborate16 and ODTUG KScope16 (being invited to speak at each of these events is always an honor). Now this Data Integration Tip isn’t solely related to the Kimball ETL Subsystems, though the solution did prove rather useful for my presentation (and upcoming article in the IOUG Select Journal). It’s actually an interesting twist in how Oracle Data Integrator mapping metadata is stored differently between the ODI 11g and ODI 12c repositories.

The challenge is to find the target table, or Datastore, for a given Mapping in ODI 12c, or Interface in ODI 11g. When I say “find”, I mean query the ODI work repository tables and return the target table, or tables in 12c, for a given Mapping. Luckily, I’m equipped with some guidance from our friends at My Oracle Support. If you look at support document Oracle Data Integrator 11g and 12c Repository Description (Doc ID 1903225.1) you’ll find the data dictionaries for both ODI 11g (11.1.1.6.0+) and ODI 12c (12.1.2, 12.1.3, & 12.2.1). These are invaluable resources from Oracle – though they may be works in progress and somewhat incomplete!

Let’s take a look first at ODI 11g and how simple things used to be. Back when Interfaces were the mechanism for extracting, transforming, and loading, we were allowed only a single target Datastore. Ahh, those were the good ol’ days! Digging into the repository we really only had one place to look for the target table – SNP_POP. This table in the ODI 11g repository, SNP_POP (which essentially stands for Sunopsis – Populate), contains a column I_TABLE. This identifying column represents the target table for that particular Interface. Here’s a query that ties it all together.

select
    p.pop_name interface_name,
    t.table_name target_table,
    m.cod_mod model_code
from snp_pop p inner join snp_table t on p.i_table = t.i_table
    inner join snp_model m on t.i_mod = m.i_mod;

As you can see, the key to capturing the target table for an Interface is simply in the SNP_POP.I_TABLE column. Because there is only one target, we can easily figure it out.

interface-target-table

Now, ODI 12c is where the real challenge lies. As you may know, with the move from Interfaces in 11g to flow based Mappings in 12c, we were allowed to do new and exciting things, such as load multiple target tables from a single Mapping. We may also have a case where a Datastore component maps to a Filter component, which then maps to another Datastore component, etc. As you can see in the image below, we can have lots of tables, and lots of tables that may be sources or targets, but we’re only interested in the final target table (or tables, for that matter!).

odi12c-map

Ok, so let’s dig into the Work Repository now. It seems an ODI Mapping can’t be that much more difficult than an Interface, right? Well…

First, there are quite a few tables related to the Mapping itself.

SNP_MAPPING
SNP_MAP_ATTR
SNP_MAP_ATTR_INFO
SNP_MAP_COMP
SNP_MAP_COMP_TYPE
SNP_MAP_CONN
SNP_MAP_CP
SNP_MAP_CP_ROLE
SNP_MAP_DATA_TYPE
SNP_MAP_EXPR
SNP_MAP_EXPR_REF
SNP_MAP_PROP
SNP_MAP_PROP_DEF
SNP_MAP_REF
SNP_MAP_REF_PP

Whoa…we’ve got some work to do. Lucky for you, I’ve already done the work. Let’s look at how it all fits together. We can start with the Mapping (SNP_MAPPING) table. We also have different components on the Mapping, such as Lookups, etc, so we can join in the table SNP_MAP_COMP as well. Here’s the information we’ll be able to see with that simple join.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping

odi12c-map-component

That’s interesting, we’ve captured all components in the mapping. But there are still quite a lot of non-targets here. Ok, maybe if we add the connection points for each component we can find the input and output for each. Components each have an input connector point, allowing an output from a different component to flow into it.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp

That just added the connection point information for each component and whether it is an INPUT or OUTPUT. Not extremely useful by itself, so let’s dig a bit deeper. How about adding the SNP_MAP_REF table? This table seems to contain a reference to all types of other attributes about the Mapping and its Components. We also need to consider that Datastores, just as any other Component, will have both an input and output. Right now, the dataset shows both the input and output connectors for each Component. Let’s remove the input connection points to limit our result set.

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
where cp.direction = 'O' --output connection point only

Joining the reference table has now allowed us to focus on the Datastores in the Mapping and their OUTPUT connection point. What I really want is to see only the Datastores that do not have their OUTPUT connection point connected to an INPUT connector. Therefore, if the OUTPUT is empty, it must be the target table!

select ...
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
where cp.direction = 'O' and --output connection point only
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point

The SNP_MAP_CONN table stores the connections between components in the ODI 12c mappings. Each row records the connection point where a connection starts, so excluding every output connection point that appears as a starting point leaves only the components that nothing flows out of. Here’s what we get as a result.

mapping target tables found

Hey, now we’re onto something here. In fact, this is what I was looking for! Target table(s) in a single Mapping in ODI12c. Not quite as simple as ODI11g, but with a bit of SQL and understanding of the repository tables, you can do it. Here’s the final query again, joining in the SNP_TABLE & SNP_MODEL tables to complete the dataset.

select
    m.name mapping_name,
    mr.qualified_name,
    mc.name datastore_alias,
    t.table_name target_table,
    mdl.cod_mod model_code
from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
    inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
    inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
    inner join snp_table t on mr.i_ref_id = t.i_table
    inner join snp_model mdl on t.i_mod = mdl.i_mod
where cp.direction = 'O' and --output connection point
    cp.i_map_cp not in
        (select i_start_map_cp from snp_map_conn) --not a starting connection point
;
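If you want to experiment with the logic before pointing it at a real work repository, the sketch below runs the final query against a toy copy of the tables in an in-memory SQLite database. The toy schema contains only the columns the query touches, and every id, name and row in it is invented; a real repository has many more columns and rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table snp_mapping  (i_mapping int, name text);
    create table snp_map_comp (i_map_comp int, i_owner_mapping int, name text, i_map_ref int);
    create table snp_map_cp   (i_map_cp int, i_owner_map_comp int, direction text);
    create table snp_map_ref  (i_map_ref int, qualified_name text, i_ref_id int);
    create table snp_map_conn (i_map_conn int, i_start_map_cp int, i_end_map_cp int);
    create table snp_table    (i_table int, table_name text, i_mod int);
    create table snp_model    (i_mod int, cod_mod text);

    -- One toy mapping: SRC_CUSTOMER -> FILTER -> W_CUSTOMER_D
    insert into snp_mapping values (1, 'MAP_LOAD_CUSTOMER');
    insert into snp_model values (1, 'DW_MODEL');
    insert into snp_table values (10, 'SRC_CUSTOMER', 1), (11, 'W_CUSTOMER_D', 1);
    insert into snp_map_ref values
        (100, 'DW_MODEL.SRC_CUSTOMER', 10), (101, 'DW_MODEL.W_CUSTOMER_D', 11);
    insert into snp_map_comp values
        (1, 1, 'SRC_CUSTOMER', 100), (2, 1, 'FILTER', null), (3, 1, 'W_CUSTOMER_D', 101);
    -- Every component gets an input (I) and an output (O) connection point.
    insert into snp_map_cp values
        (1, 1, 'I'), (2, 1, 'O'), (3, 2, 'I'), (4, 2, 'O'), (5, 3, 'I'), (6, 3, 'O');
    -- Source output feeds the filter input; filter output feeds the target input.
    insert into snp_map_conn values (1, 2, 3), (2, 4, 5);
""")

rows = con.execute("""
    select
        m.name mapping_name,
        mr.qualified_name,
        mc.name datastore_alias,
        t.table_name target_table,
        mdl.cod_mod model_code
    from snp_mapping m inner join snp_map_comp mc on m.i_mapping = mc.i_owner_mapping
        inner join snp_map_cp cp on mc.i_map_comp = cp.i_owner_map_comp
        inner join snp_map_ref mr on mc.i_map_ref = mr.i_map_ref
        inner join snp_table t on mr.i_ref_id = t.i_table
        inner join snp_model mdl on t.i_mod = mdl.i_mod
    where cp.direction = 'O' and
        cp.i_map_cp not in
            (select i_start_map_cp from snp_map_conn)
""").fetchall()

print(rows)
```

One general SQL caveat: if `i_start_map_cp` could ever be NULL, `NOT IN` would return no rows at all, and a `NOT EXISTS` subquery would be the safer form. Whether that column is nullable in a real repository is not something covered here, so check it against the data dictionary.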

Please let me know if you find this Data Integration Tip useful or if you have a better way of accessing the target table in a mapping. One of my Rittman Mead colleagues asked, “why not just use the ODI Java API?”. For accessing the ODI repository, I do prefer using some Groovy script and the API. But in this case, I’m interested in building out a dimensional schema and writing ETL to load the dimensions and facts, which lends itself to SQL rather than Groovy script.

As always, if your team needs help around Oracle Data Integrator, or Oracle Data Integration Solutions in general, drop us a line at info@rittmanmead.com. Or feel free to reach out to me directly via email (michael.rainey@rittmanmead.com) or Twitter (@mRainey). Cheers!
