Tag Archives: Performance
Introducing obi-metrics-agent – an Open-Source OBIEE Metrics Collector
Understanding what is going on inside OBIEE is important for being able to diagnose issues that arise, monitor its health, and dig deep into its behaviour under stress in a load test. OBIEE exposes a set of metrics through the Dynamic Monitoring Service (DMS) and viewable through Enterprise Manager (EM) Fusion Middleware Control. EM is a great tool but doesn’t meet all requirements for accessing these metrics, primarily because it doesn’t retain any history.
obi-metrics-agent is a tool that extracts OBIEE’s performance metrics from the Dynamic Monitoring Service (DMS) functionality. Venkat wrote the original version, which I have rewritten in python and added additional functionality. It polls DMS on a specified interval and output the data to a variety of formats. It was written to aid OBIEE performance monitoring either as part of testing or longer-term use. Its features include:
- Multiple output options, including CSV, XML, and Carbon (for rendering in Graphite etc)
- Parse data as it is collected, or write to disk
- Parse data collected previously
How does it work?
obi-metrics-agent is written in Python, and uses the documented OPMN functionality to expose DMS metrics opmnctl metric op=query
. We experimented with the WLST route but found the overhead was too great. OPMN supplies the DMS data as a large XML message, which obi-metrics-agent can either store raw, or parse out into the constituent metrics. It can write these to CSV or XML files, generate INSERT statements for sending them to a database, or send them to graphite (see below).
obi-metrics-agent can also parse previously-extracted raw data, so if you want to store data in graphite but don’t have the server to hand at execution time it can be loaded retrospectively.
On which platforms does it work?
- Tested thoroughly on Oracle Linux 5 and 6
- Works on Windows 2003, should work on later versions
Which OBI metrics are collected?
All of the ones that OPMN supports. Currently, BI Server and BI Presentation Services, plus the opmn process metrics (such as CPU time of each OBI component)
To explore the DMS metrics available, you can use Enterprise Manager, or the DMS Spy servlet that is installed by default with OBIEE and available at http://<obi-server>:7001/dms/
(assuming your AdminServer is on port 7001).
I have used DMS metrics primarily when investigating OBIEE’s behaviour under stress in performance testing, but some of the higher-level metrics are useful for day-to-day monitoring too. The DMS metrics let you peer into OBIEE’s workings and deduce or hypothesise the cause of behaviour you are seeing.
- How many users does OBIEE see as logged in?
- How many active requests are there from Presentation Services to BI Server (nqserver)?
- How many active connections are there from each Connection Pool to the database? How many queued connections?
- What’s the average (careful…) response time by database?
- What’s the error rate for Presentation Services queries?
- How does the memory profile of each OBIEE component behave during testing?
- How are the BI Server’s internal thread pools coping with the load? Do they need resizing?
- How many queries per second are being run on each database?
- How is the graphing engine behaving? Is it queuing requests?
- What’s the Presentation Services and BI Server cache hit rate?
Sounds great, where do I get it?
Rittman Mead have released obi-metrics-agent as open source. You can find it on GitHub: https://github.com/RittmanMead/obi-metrics-agent.
Simply clone the repository and run the python script. You need to install the lxml library first – full details are supplied in the repository.
$ export OPMN_BIN=$FMW_HOME/instances/instance1/bin/opmnctl $ python ./obi-metrics-agent.py # =================================================================== # Developed by @rmoff / Rittman Mead (http://www.rittmanmead.com) # Absolutely no warranty, use at your own risk # Please include this notice in any copy or reuse of the script you make # =================================================================== --------------------------------------- Output format : csv raw/csv/xml/carbon/sql : False/True/False/False/False Data dir : ./data FMW instance : None OPMN BIN : /u01/app/oracle/product/fmw/instances/instance1/bin/opmnctl Sample interval (seconds) : 5 --------------------------------------- --Gather metrics-- Time of sample: Wed, 26 Mar 2014 10:38:38 +0000 (1395830318) Get metrics for coreapplication_obips1 Processed : 469 data values @ Wed, 26 Mar 2014 10:38:38 +0000 Oracle BI Presentation Server Appended CSV data to ./data/metrics.csv Get metrics for coreapplication_obis1 Processed : 230 data values @ Wed, 26 Mar 2014 10:38:38 +0000 Oracle BI Server Appended CSV data to ./data/metrics.csv Get metrics for opmn Processed : 91 data values @ Wed, 26 Mar 2014 10:38:38 +0000 opmn Appended CSV data to ./data/metrics.csv Processed: 3 Valid: 3 (100.00%) Invalid: 0 (0.00%) [...]
See the next blog post for a step-by-step on getting it set up and running using SampleApp v309R2 as an example server, including with Graphite and Collectl for visualising the data and collecting OS stats too.
Visualising the collected data – Graphite
Graphite is an open-source graphing tool that comes with a daemon called carbon that receives incoming data and stores it its own times-series database (called whisper). Graphite is a very popular tool meaning there’s lots of support out there for it and additional tools written to complement it, some of which I’ll be exploring in later blog posts. It’s also very easy to get data into graphite, and because it stores it in a time series you can then display OBIEE DMS data alongside anything else you may have – for example, OS metrics from collectl, or jmeter performance test counters.
Whilst obi-metrics-agent can be used on its own and data stored to CSV for subsequent parsing, or accessing in Oracle as an external table, the focus on this and subsequent blog posts will primarily be on using obi-metrics-agent writing data to graphite and the benefits this brings when it comes to visualising it.
“Graphite?! Haven’t we got a visualisation tool already in OBIEE?”
You can graph this data out through OBIEE, if you want. The OBI metric data can be loaded in by external table from the CSV files, or using the generated INSERT statements.
The benefit of using Graphite is twofold:
- Its primary purpose is graphing time-based metrics. Metrics in, time-based graphs out. You don’t need to build an RPD or model a time dimension. It also supports one-click rendering of wildcarded metric groups (for example, current connection count on all connection pools), as well as one-click transformations such as displaying deltas of a cumulative measure.
- It gives an alternative to a dependency on OBIEE. If we use OBIEE we’re then rendering the data on the system that we’re also monitoring, thus introducing the significant risk of just measuring the impact of our monitoring! To then set up a second OBIEE server just for rendering graphs then it opens up the question of what’s the best graphing tool for this particular job, and Graphite is a strong option here.
Graphite just works very well in this particular scenario, which is why I use it….YMMV.
Contributing
There are several obvious features that could be added so do please feel free to fork the repository and submit your own pull requests! Ideas include:
- support for scaled-out clusters
- init.d / run as a daemon
- selective metric collection
Known Issues
Prior to OBIEE 11.1.1.7, there was a bug in the opmn process which causes corrupt XML sometimes. This could sometimes be as much as 15% of samples. On corrupt samples, the datapoint is just dropped.
The FMW patch from Oracle for this issue is 13055259.
License
=================================================================== Developed by @rmoff / Rittman Mead (http://www.rittmanmead.com) Absolutely no warranty, use at your own risk Please include this notice in any copy or reuse of the script you make ===================================================================
What next?
There are several posts on this has subject to come, including :
- Installing obi-metrics-agent, graphite, and collectl.
- Exploring the graphite interface, building simple dashboards
- Alternative front-ends to graphite
Monitoring OBIEE Performance for the End User with JMeter from EM12c
This is the third article in my two-article set of posts (h/t) on extending the monitoring of OBIEE within EM12c. It comes after a brief interlude discussing Metric Extensions as an alternative using Service Tests to look at Usage Tracking data.
Moving on from the rich source of monitoring data that is Usage Tracking, we will now cast our attention to a favourite tool of mine: JMeter. I’ve written in detail before about this tool when I showed how to use it to build performance tests for OBIEE. Now I’m going to illustrate how easy it can be to take existing OBIEE JMeter scripts and incorporate them into EM12c.
Whilst JMeter can be used to build big load tests, it can also be used as a single user. Whichever way you use it the basis remains the same. It fires a bunch of web requests (HTTP POSTs and GETs) at the target server and looks at the responses. It can measure the response time alone, or it can check the data returned matches what’s expected (and doesn’t match what it shouldn’t, such as error messages).
In the context of monitoring OBIEE we can create simple JMeter scripts which do simple actions such as
- Login to OBIEE, check for errors
- Run a dashboard, check for errors
- Logout
If we choose an execution frequency (“Collection Schedule” in EM12c parlance) that is not too intensive (otherwise we risk impacting the performance/availability of OBIEE!) we can easily use the execution profile of this script as indicative of both the kind of performance that the end user is going to see, as well as a pass/fail of whether user logins and dashboard refreshes in OBIEE are working.
EM12c offers the ability to run “Custom Scripts” as data collection methods in Service Tests (which I explain in my previous post), and JMeter can be invoked “Headless” (that is, without a GUI) so lends itself well to this. In addition, we are going to look at EM12c’s Beacon functionality that enables us to test our JMeter users from multiple locations. In an OBIEE deployment in which users may be geographically separated from the servers themselves this is particularly useful to check that the response times seen from one site are consistent with those from another.
Note that what we’re building here is an alternative version to the Web Transaction Test Type that Adam Seed wrote about here, but with pretty much the same net effect – a Service Test that enables to you say whether OBIEE is up or down from an end user point of view, and what the response time is. The difference between what Adam wrote about and what I describe here is the way in which the user is simulated:
- Web Transaction (or the similar ATS Transaction) Test Types are built in to EM12c and as such can be seen as the native, supported option. However, you need to record and refine the transaction that is used, which has its own overhead.
- If you already have JMeter skills at your site, and quite possibly existing JMeter OBIEE scripts, it is very easy to make use of them within EM12c to achieve the same as the aforementioned Web Transaction but utilising a single user replay technology (i.e. JMeter rather than EM12c’s Web Transaction).
So, if you are looking for a vanilla EM12c implementation, Web/ATS transactions are probably more suitable. However, if you already use JMeter then it’s certainly worth considering making use of it within EM12c too
The JMeter test script
You can find details of building OBIEE JMeter scripts here and even a sample one to download here. In the example I am building here the script consists of three simple steps:
- Login
- Go to dashboard
- Logout
The important bit to check is the Thread Group – it needs to run a single user just once. If you leave in settings from an actual load test and start running hundreds of users in this script called from EM12c on a regular basis then the effect on your OBIEE performance will be interesting to say the least
Test the script and make sure you see a single user running and successfully returning a dashboard
Running JMeter from the command line
Before we get anywhere near EM12c, let us check that the JMeter script runs successfully from the commandline. This also gives us opportunity to refine the commandline syntax without confounding any issues with its use in EM12c.
The basic syntax for calling JMeter is:
./jmeter --nongui -t /home/oracle/obi_jmeter.jmx
With --nongui
being the flag that tells JMeter not to run the GUI (i.e. run headless), and -t
passing the absolute path to the JMX JMeter script. JMeter runs under java so you may also need to set the PATH
environment variable so that the correct JVM is used.
To run this from EM12c we need a short little script that is going to call JMeter, and will also set a return code depending on whether an error was encountered when the user script was run (for example, an assertion failed because the login page or dashboard did not load correctly). A simple way to do this is to set the View Results In Table sampler to write to file only if an error is encountered, and then parse this file post-execution to check for any error entries.
We can then do a simple grep
against the file and check for errors. In this script I’m setting the PATH
, and using a temporary file /tmp/jmeter.err
to capture and check for any errors. I also send any JMeter console output to /dev/null
.
export PATH=/u01/OracleHomes/Middleware/jdk16/jdk/bin:$PATH rm /tmp/jmeter.err /home/oracle/apache-jmeter-2.10/bin/jmeter --nongui -t /home/oracle/obi_jmeter.jmx 1>/dev/null 2>&1 grep --silent "<failure>true" /tmp/jmeter.err if [ $? -eq 0 ]; then exit 1 else exit 0 fi
Note that I am using absolute paths throughout, so that there is no ambiguity or dependency on the folder from which this is executed.
Test the above script that you’ll be running from EM12c, and check the return code that is set:
$ ./run_jmeter.sh ; echo $?
The return code should be 0 if everything’s worked (check in Usage Tracking for a corresponding entry) and 1 if there was a failure (check in nqquery.log
to confirm that there was a failure)
Making the script available to run on EM12c server
To start with we’ll be looking at getting EM12c to run this script locally. Afterwards we’ll see how it can be run on multiple servers, possible geographically separated.
So that the script can be run on EM12c, copy across your run_jmeter.sh
script, JMeter user test script, and the JMeter binary folder. Check that the script still runs after copying it across.
Building the JMeter EM12c Service Test
So now we’ve got a JMeter test script, and a little shell script harness with which to call it. We hook it into EM12c using a Service Test.
From Targets -> Services, create a new Generic Service (or if you have one already that it makes sense in which to include this, do so).
Give the service a name and associate it with the appropriate System
Set the Service’s availability as being based on a Service Test. On the Service Test screen set the Test Type to Custom Script. Give the Service Test a sensible Name and then the full path to the script that you built above. At the moment, we’re assuming it’s all local to the EM12c server. Put in the OS credentials too, and click Next
On the Beacons page, click Add and select the default EM Management Beacon. Click Next and you should be on the Performance Metrics screen. The default metric of Total Time is what we want here. The other metric we are interested in is availability, and this is defined by the Status metric which gets its value from the return code that is set by our script (anything other than zero is a failure).
Click Next through the Usage Metrics screen and then Finish on the Review screen
From the Services home page, you should see your service listed. Click on its name and then Monitoring Configuration -> Sevice Tests and Beacons. Locate your Service Test and click on Verify Service Test
Click on Perform Test and if all has gone well you should see the Status as a green arrow and a Total Time recorded. As data is recorded it will be shown on the front page of the service you have defined:
One thing to bear in mind with this test that we’ve built is that we’re measuring the total time that it takes to invoke JMeter, run the user login, run the dashboard and logout – so this is not going to be directly comparable with what a user may see in timing the execution of a dashboard alone. However, as a relative measure for performance against itself, it is still useful.
Measuring response times from additional locations
One of the very cool things that EM12c can do is run tests such as the one we’ve defined but from multiple locations. It’s one thing checking the response time of OBIEE from local to the EM12c server in London, but how realistically will this reflect what users based in the New York office see? We do this through the concept of Beacons, which are bound to existing EM12c Agents and can be set as execution points for Service Tests.
To create a Beacon, go to the Services page and click on Services Features and then Beacons:
You will see the default EM Management Beacon listed. Click on Create…, and give the Beacon a name (e.g. New York) and select the Agent with which it is associated. Hopefully it is self-evident that a Beacon called New York needs to be associated with an Agent that is physically located in New York and not Norwich…
After clicking Create you should get a confirmation message and then see your new Beacon listed:
Before we can configure the Service Test to use the Beacon we need to make sure that the JMeter test rig that we put in place on the EM12c server above is available on the server on which the new Beacon’s agent runs, with the same paths. As before, run it locally on the server of the new Beacon first to make sure the script is doing what it should.
To get the Service Test to run on the new Beacon, go back to the Services page and as before go to Monitoring Configuration -> Sevice Tests and Beacons. Under the Beacons heading, click on Add.
Select both Beacons in the list and click Select
When returned to the Service Tests and Beacons page you should now see both beacons listed. To check that the new one is working, use the Verify Service Test button and set the Beacon to New York and click on Perform Test.
To see the performance of the multiple beacons, use the Test Performance page:
Conclusion
As stated at the beginning of this article, the use of JMeter in this way and within EM12c is not necessarily the “purest” design choice. However, if you have already invested time in JMeter then this is a quick way to make use of those scripts and get up and running with some kind of visibility within EM12c of the response times that your users are seeing.
Oracle Data Integrator 12c release : Part 2
In the first part of this article, we discovered the new mappings and reusable mappings, the debugger and the support of OWB jobs in the freshly released ODI 12c. Let’s continue to cover the new features in ODI 12c with CAM, the new component KMs and some features to improve performances or ease of use.
WebLogic Management Framework
Another big change with this new version is the new WebLogic Management Framework used to manage the standalone agent in replacement of OPMN (Oracle Process Manager and Notification). Based on Weblogic Server, it brings a unified management of Fusion Middleware components and it provides a wizard for the configuration – the Fusion Middleware Configuration Wizzard.
Practically the standalone agent is now installed outside of the ODI installation, in a domain similar to WLS domains. The benefits of it, apart from the centralised management, is that it supports Weblogic Scripting Tool (WLST) and can be monitored through Enterprise Manage Fusion Middleware Control or Enterprise Manager Cloud Control. The standalone agent can be launched manually or through the Node Manager.
Component Knowledge Modules
Next to the standard KMs – now called Template KMs -, a new type of knowledge modules has been introduced in this release. Going one step further in the reusability, the Component KMs contain steps that are defined once and shared among several of them. Instead of being templates of code with some call to the Susbtitution API at runtime , these KMs are actually compiled libraries that will generate the code based on the components present in a mapping.
The Component KMs are provided out-of-the-box and are not editable. You can check the type of a KM in the physical tab of a mapping.
Knowledge Module Editor
Don’t worry, you can still import, edit and create Template KMs. You can now find them in the folder <ODI_HOME>/odi/sdk/xml-reference.
Good news, the KM Editor has been reviewed to avoid us to constantly switch from one tab to another. Now when the focus is set on a KM task, it is directly displayed in the Property Editor. In addition the command field supports syntax highlighting and auto-completion (!!). Options are also now directly managed from the KM editor and not from the Project tree.
In-Session Parallelism
With ODI 12c, it is now possible to have the extract tasks (LKMs) running in parallel. It’s actually done by default. If two sources are located in the same execution unit on the physical tab, they will run in parallel. If you want a sequential execution, you can drag and drop one of your units onto a blank area. A new execution unit will be created and ODI will choose in which order it will be loaded.
Thanks to this the execution time can be reduced, especially if your sources come from different dataservers.
In the topology, you can now define the number of threads per Session for your physical agents to avoid one session to get all the resources.
Parallel Target Table Load
With ODI 11g, if two interfaces loading data in the same datastore are executed at the same time or if the same interface is executed twice, you can face some problem. For instance a session might delete a worktable in which the other session wants to insert data.
This now belongs to the past. With 12c, a new “Use Unique Temporary Object Names” checkbox appears in the Physical tab of your mapping. If you select it, the worktables for every session will have a unique name. You are now sure that another session won’t delete it or insert other data.
I can hear a little voice : “Ok but what if the execution of the mapping fails? With these unique name, these tables won’t be deleted the next time we run the mapping.” Don’t worry, the ODI team thought about everything. A task in a KM can now be created under the section “Mapping Cleanup”. It will be executed at the end of the mapping even if it fails. That’s a nice feature that I will also use for my logging tasks!
“But eh, what if the agent crashes? These cleanup tasks won’t be executed anyway.” No worries… A new ODI Tool is now available : OdiRemoveTemporaryObjects. It runs every cleanup tasks that have not run normally and by default it is automatically called when a agent restarts. Little voice 0 – 2 ODI Team.
Session Blueprint
When doing real-time integration with ODI, you can have a lot of jobs running every few seconds or minutes. For each call of a scenario, the ODI 11g agent retrieves all the steps and tasks it should execute from the repository and write it synchronously in the logs before starting to execute anything. After the execution it deletes the logs that shouldn’t be kept when the log level is low. At the end of the day, even if you set the logs to a low level, you end up with a lot of redo logs and archive logs in your database.
Instead of retrieving the scenario for every execution, the ODI 12c agent now only retrieves a Session Blueprint once and keep it in the cache. As it is cached in the agent, there is no need to get anything from the repository if the job runs again a few minutes later. The agent also write – asynchronous – only the needed logs defined by the log level, as it only relies on the blueprint for it’s execution and not on the logs. The parameters related to blueprints are available in the definition of the physical agents in the topology.
In summary, thanks to the Session Blueprints the overhead of executing session is greatly reduced. The agent needs to retrieve less data from the repository, doesn’t insert-and-delete logs anymore and write it asynchronously. Isn’t it great?
Datastore Change Notification
This happened to me once again last week : we needed to rename a datastore column in ODI but we couldn’t because it was used as a target datastore in an interface. The solution is to remove the mapping expression for this column in every interface where the datastore is used as a target.
It will not happen anymore with 12c. Now, you can change it in the model and a change notification will be displayed in every mapping using this datastore. In the following example, I removed the LOAD_DATE column.
Wallet
Ever faced this dilemma in 11g ? Giving the Master Repository password to everyone is not safe, but saving it in the Login Credentials is not better if a developer forgets to lock his screen when he is away from keyboard.
Once again, 12c is there to help you. You can now store your Login Credentials in a Wallet, itself protected by a password. One password to rule them all… and keep them safe!
Try it !
If you want to give a try to ODI 12c and follow the Getting Started Guide, you can download a Pre-Built Virtual Machine. It’s running on Oracle Enterprise Linux with an Oracle XE Database 11.2.0. Of course, you will need Virtual Box to run it.
See you soon
Needless to say I’m very exited about this release. Combining the greatness of the ODI architecture with the new flow-based paradigm and all the new features, ODI is now a very mature tool and the best-of-breed for Data Integration regardless of the source or the target technology.
Keep watching this blog, there are more posts to come about data integration! Stewart Bryson already announced in his last post he would talk about the ODI 12c JEE agent and Michael Rainey started with GoldenGate 12c New Features. Who will shoot next?
[Update] Peter Scott will also present ODI 12c in his session at the Bulgarian Oracle Users Group Autumn Conference on 22nd November 2013. Stewart and myself will also talk about it at RMOUG Training Days in February.
Rittman Mead BI Forum Atlanta Special Guest: Cary Millsap
I feel like I’m introducing the Beatles… though I think Kellyn Pot’Vin calls them the “DBA Gods”. Today I’ll be talking about Cary Millsap, and tomorrow I’ll introduce our other special guest: Alex Gorbachev.
As many of you know, I grew up as a DBA (albeit, focusing on data warehouse environments) before transitioning to development… initially as an ETL developer and later as an OBIEE architect. I had three or four “heroes” during that time… and Cary Millsap was certainly one of them. His brilliant white paper “Why a 99%+ Database Buffer Cache Hit Ratio is Not Ok” changed my whole direction with performance tuning: it’s probably the first time I thought about tuning processes instead of systems. Many of you also know about my love of Agile Methodologies… a cause that Cary has championed of late, and is also the subject of an excellent white paper “Measure Once, Cut Twice”. This purposeful inversion of the title helps to remind us that many of the analogies we use for software design don’t compute… it’s relatively simple to modify an API after the fact, so go ahead and “cut”.
A brief bit of history on Cary. He’s an Oracle ACE Director and has been contributing to the Oracle Community since 1989. He is an entrepreneur, software technology advisor, software developer, and Oracle software performance specialist. His technical papers are quoted in many Oracle books, in Wikipedia, in blogs all over the world, and in dozens of conference presentations each month. He has presented at hundreds of public and private events around the world, and he is published in Communications of the ACM. He wrote the book “Optimizing Oracle Performance” (O’Reilly 2003), for which he and co-author Jeff Holt were named Oracle Magazine’s 2004 Authors of the Year. Though many people (Kellyn included) think of Cary as a DBA… Cary considers himself to be software developer first, but explains what he believes to be the reason for this misconception:
“I think it’s fair to say that I’ve dedicated my entire professional career (27 years so far) to the software performance specialization. Most people who know me in an Oracle context probably think I’m a DBA, because I’ve spent so much time working with DBAs (…It’s still bizarre to me that performance is considered primarily a post-implementation operational topic in the Oracle universe). But my background is software development, and that’s where my heart is. I built the business I own so that I can hang out with extraordinarily talented software researchers and designers and developers and write software that helps people solve difficult problems.”
Cary’s presentation is called “Thinking Clearly about Performance” and will be given at the end of the day on Thursday before we head over to 4th and Swift for the Gala dinner. His message for the BI developers in the audience is an encouraging one:
“My message at the Rittman Mead BI Forum is that, though it’s often counterintuitive, software performance actually makes sense. When you can think clearly about it, you generally make progress more quickly and more permanently than if you just stab at different possible solutions without really understanding what’s going on. That’s what this presentation called “Thinking Clearly about Performance” is all about. It’s the result of more than 25 years of helping people understand performance in just about every imaginable context, from the tiny little law office in east Texas to the Large Hadron Collider at CERN near Geneva. It’s the result of seeing the same kinds of wasteful mistakes: buying hardware upgrades in hopes of reducing response times without understanding what a bottleneck is, adding load to overloaded systems in hopes of increasing throughput.”
My experience is that BI and DW systems suffer more from the “fast=true” disease than do OLTP systems, but that could simply be perspective. I’m excited that a group of BI developers, the majority of which are reporting against a database of some kind, will get an opportunity to see Cary’s approach to problem solving and performance tuning. As Cary tells us:
“The fundamental problems in an OBIEE implementation are just that: fundamental. The solution begins with understanding what’s really going on, which means engaging in a discussion of what we should be measuring and how (of course, in the OBIEE world, Robin Moffatt’s blog posts come in handy), and it continues through the standard subjects of profiles, and skew, and efficiency, and load, and queueing, and so on.”
If you are interested in seeing Cary and all the other great speakers at this year’s BI Forum, you can go over to the main page to get more information about the Atlanta event, or go directly to the registration page so we can see you in Atlanta in May.
Performance and OBIEE – Summary and FAQ
This article is the final one in a series about OBIEE and performance. You can find the previous posts here:
Summary
- The key to long term, repeatable, successful performance optimisation of a system is KNOWING where problems lie and then resolving them accordingly. This means investing time up front to develop an understanding of the system and its instrumentations and metrics. Once you have this, you can apply the same method over and over to consistently and successfully resolve performance problems.
- GUESSING what the problem is and throwing best practice checklists at it will only get you so far. If you hit a lucky streak it may get you far enough to convince yourself that it is enough alone. But sooner or later you will hit a dead-end with your guesswork and have to instead start truly diagnosing problems. When this happens, you are starting from scratch in learning and developing your method for doing this.
Getting the best performance out of OBIEE is all about good design and empirical validation.
To be able to improve performance you must first identify the cause of problem. If you start ‘tuning’ without identifying the actual cause you risk making things much worse.
The slightly acerbic tone at times of these articles may betray my frustration with the incorrect approach that I see people take all too often. A methodical approach is the correct one, and I am somewhat passionate about this, because:
- It works! You gather the data to diagnose the issue, and then you fix the issue. There is no guessing and there is no luck
- By approaching it methodically, you learn so much more about how the OBIEE stack works, which aside from being useful in itself means that you will design better OBIEE systems and troubleshoot performance more easily. You actually progress in your understanding, rather than remaining in the dark about how OBIEE works and throwing fixes at it to hope one works.
FAQ
Q: How do I improve the performance of my OBIEE dashboards/reports?
A: Start here. Use the method described to help understand where your performance is slow, and why. Then you set to resolving it as described in this series of blog articles.
Q: No seriously, I don’t have time to read that stuff… how do I fix the performance? My boss is mad and I must fix it urgently!
A: You can either randomly try changing things, in which case Google will turn up several lists of settings to play with, or you can diagnose the real cause of the performance problem. If you’ve run a test then see here for how to analyse the performance and understand where the problem lies
Q: Why are you being such a bore? Can’t you just tell me the setting to change?
A: I’m the guy putting £0.50 bets on horses because I don’t want to really risk my money with big bets. In my view, changing settings to fix performance without knowing which setting actually needs changing is a gamble. Sometimes the gamble pays off, but in the end the house always wins.
Q: Performance is bad. When I run the report SQL against the database manually it is fast. Why is OBIEE slow?
A1: You can see from nqquery.log
on the BI Server how long the SQL takes on the database, so you don’t necessarily need to run it manually. Bear in mind the network between your BI Server and the database, and also the user ID that the query is being executed as. Finally, the database may have cached the query so could be giving a better impression of the speed.
A2: If the query really does run faster manually against the database then look at nqquery.log
to see where the rest of the query time is being spent. It could be the BI Server is having to do additional work on the data before it is returned up to the user. For more on this, see response time profiling.
Q: This all sounds like overkill to me. In my day we just created indexes and were done.
A: I’ve tried to make my method as comprehensive as possible, and usable in both large-scale performance tests but also isolated performance issues. If a particular report is slow for one use, then the test define, design and build is pretty much done already – you know the report, and to start with running it manually is probably fine. Then you analyse why the performance isn’t as good as you want and based on that, you optimise it.
Q: Should I disable query logging for performance? I have read it is a best practice.
A: Query logging is a good thing and shouldn’t be disabled, although shouldn’t be set too detailed either.
Reading & References
The bulk of my inspiration and passion for trying to understand more about how to ‘do’ performance properly has come from three key places:
- Cary Millsap’s blog and particularly his paper Thinking Clearly about Performance
- Greg Rahn’s blog structureddata.org
- Zed Shaw’s essay Programmers Need To Learn Statistics Or I Will Kill Them All
plus the OBIEE gurus at rittmanmead including markrittman and krisvenkat, along with lots of random twitter conversations and stalking of neilkod, martinberx, alexgorbachev, kevinclosson, nialllitchfield, orcldoug, martindba, jamesmorle, and more.
Comments
I’d love your feedback on this. Do you agree with this method, or is it a waste of time? What have I overlooked or overemphasised? Am I flogging a dead horse?
I’ve enabled comments on this article in the series only, to keep the discussion in one place. You can tell me what you think on twitter too, @rmoff