
OBIEE 11.1.1.9 Now Supports HiveServer2 and Cloudera Impala

As you all probably know, I’m a big fan of Oracle’s BI and Big Data products, but something I’ve been critical of is OBIEE11g’s lack of support for HiveServer2 connections to Hadoop clusters. OBIEE 11.1.1.7 supported Hive connections using the older HiveServer1 protocol, but recent versions of Cloudera CDH4 and CDH5 use the HiveServer2 protocol by default, and OBIEE 11.1.1.7 wouldn’t connect to them unless you switched to the Windows version of OBIEE and used the Cloudera ODBC drivers instead, which worked but weren’t supported by Oracle.

OBIEE 11.1.1.9 addresses this issue by shipping more recent DataDirect ODBC drivers for Hive that are compatible with the HiveServer2 protocol used by CDH4 and CDH5 (check out this other article by Robin on general new features in 11.1.1.9). Oracle only really support Hive connectivity for Linux installs of OBIEE, and the Linux version of OBIEE 11.1.1.9 comes with the DataDirect ODBC drivers already installed and configured for use. All you then have to do is set up the ODBC connection in the odbc.ini file on Linux, and install the Cloudera Hive ODBC drivers on your Windows workstation for the Admin tool (the Hive ODBC drivers that Oracle supply on MOS still look like the old HiveServer1 version, though I could be wrong). To check that it all worked on this new 11.1.1.9 version of OBIEE11g, I downloaded and installed the Windows Cloudera Hive ODBC drivers and set up the System DSN on my Windows workstation.


I then set up a corresponding entry in the odbc.ini file on the Linux OBIEE 11.1.1.9 server.

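Something along the lines of the following gives the general shape of that entry. This is an illustrative sketch rather than my exact file; the DSN name, hostname, and in particular the DataDirect driver library path and filename are placeholders that you should check against your own 11.1.1.9 install:

[ODBC Data Sources]
Hive_CDH = DataDirect Apache Hive Wire Protocol

[Hive_CDH]
# Placeholder path - point this at the DataDirect Hive driver shipped with OBIEE 11.1.1.9
Driver      = /u01/app/oracle/biee/Oracle_BI1/common/ODBC/Merant/7.1.4/lib/<datadirect_hive_driver>.so
Description = DataDirect Apache Hive Wire Protocol (HiveServer2)
HostName    = cdh-hiveserver2.example.com
PortNumber  = 10000
Database    = default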

The key thing is to make sure you have matching DSN names on both the Windows workstation (for the Admin tool initial datasource setup and table metadata import) and the Linux server (for the actual online connection to Hive from the BI Server, and subsequent data retrieval). One thing I did notice was that whilst I could connect to the Hive database server and set up the connection in the Admin tool, I couldn’t view any Hive tables and had to manually create them myself in the RPD Physical Layer – this could just be a quirk on my workstation install though, so I wouldn’t read too much into it. Checking connectivity in the Admin tool then showed it connecting properly and retrieving data from Hive on the Hadoop cluster. I didn’t test Kerberos-authenticated connections but I’m assuming they’d work, as the previous version of OBIEE 11.1.1.7 on Linux just failed at this point anyway. The docs are here if you’d like to look into any more of the details, or check the full set of setup steps.


For Cloudera Impala connections, you’re directed in the docs to download the Windows Cloudera Impala ODBC drivers, as Oracle don’t even ship them on MOS, but again the Linux install of OBIEE 11.1.1.9 comes with DataDirect Impala drivers that are already set up and ready for use (note that if you upgrade from 11.1.1.7 to 11.1.1.9 rather than do the fresh install that I did for testing purposes, you’ll need to edit the opmn.xml file to register these updated DataDirect drivers). Then it’s a case of setting up the Windows System DSN for the initial metadata import.


You then create a corresponding entry in the Linux server’s odbc.ini file, just as for the Hive connection above.


Note that the docs do mention the issue with earlier versions of Impala where the Impala server expects a LIMIT clause whenever ORDER BY is used in Impala SQL queries, and they give a couple of workarounds to fix the issue and stop Impala expecting this clause; for more recent (CDH5+) versions of Impala this requirement is in fact lifted, and you can connect to and use Impala without needing to make the configuration change mentioned in the doc (or use the workaround I mentioned in this earlier blog post). Checking connectivity in the Admin tool then shows the connection making its way through OK, from the Windows environment to the Linux server’s ODBC connection.


Creating a quick report then shows data returned as expected, and considerably quicker than with Hive.


As I said, I’ve not really tested either of these two connections using Kerberos or any edge-case setups, but connectivity seems to be working and we’re now in a position where OBIEE11g can properly connect to both Hive and Impala on recent CDH installs, and of course the Oracle Big Data Appliance. Good stuff; now what about Spark SQL or Elasticsearch?

What’s New in OBIEE 11.1.1.9 for Systems Administrators and Developers

Over two years after the last major release of OBIEE, Oracle released version 11.1.1.9 in May 2015. You can find the installers here and documentation here. 11.1.1.9 is termed the “terminal release” of the 11g line, and the 12c version is already out in closed beta. We’d expect to see patchsets for 11g continue for some time, covering bugs and any security issues, but for new functionality in 11g I would hazard a guess that this is pretty much it, as Oracle concentrate their development efforts on OBIEE 12c and BICS, particularly Visual Analyser.

For both the end user and backend administrator/developer, OBIEE 11.1.1.9 has brought with it some nice little touches, none of which are going to revolutionise the OBIEE world but many of which are going to make life with the tool just that little bit smoother. In this article we take a look at what 11.1.1.9 brings for the sysadmin & developer.

BI Server Query Instrumentation and Usage Tracking

There are some notable developments here:

  1. Millisecond precision when logging events from the BI Server
  2. Usage Tracking now includes the physical query hash, which is what is also visible in the database, enabling end-to-end tracing
  3. User sessions can be tracked and summarised more precisely because session ID is now included in Usage Tracking.
  4. The execution of initialisation blocks is now also recorded, in a new Usage Tracking table called S_NQ_INITBLOCK.

Millisecond precision in BI Server logs

OBIEE 11.1.1.9 writes nqquery.log with millisecond precision, both for the timestamp of each entry and for the summary timings of a query execution (at last!). It also explicitly calls out “Total time in BI Server”, which is a welcome addition from a time profiling/performance analysis point of view:

[2016-07-31T02:11:48.231-04:00 [...] Sending query to database named X0 - Airlines Demo Dbs (ORCL) (id: <<221516>>), connection pool named Aggr Connection, logical request hash 544131ec, physical request hash 5018e5db: [[  
[2016-07-31T02:12:04.31-04:00 [...] Query Status: Successful Completion  
[2016-07-31T02:12:04.31-04:00 [...] Rows 2, bytes 32 retrieved from database query id: <<221516>>  
[2016-07-31T02:12:04.31-04:00 [...] Physical query response time 2.394 (seconds), id <<221516>>  
[2016-07-31T02:12:04.31-04:00 [...] Physical Query Summary Stats: Number of physical queries 1, Cumulative time 2.394, DB-connect time 0.002 (seconds)  
[2016-07-31T02:12:04.31-04:00 [...] Rows returned to Client 2  
[2016-07-31T02:12:04.31-04:00 [...] Logical Query Summary Stats: Elapsed time 16.564, Total time in BI Server 16.555, Response time 16.564, Compilation time 0.768 (seconds), Logical hash 544131ec

One thing to notice here is that the subsecond timestamp precision seems to vary between 2 and 3 digits, which may or may not be a bug.

Being able to see this additional level of precision is really important. Previously OBIEE recorded information by the second, which was fine if you were looking at query executions taking dozens of seconds or minutes – but hopefully our aspirations for system performance are actually closer to the realm of seconds, or even subseconds. At this scale the level of precision in the timings really matters. On the assumption that OBIEE was rounding values to the nearest whole number, you’d see “0 seconds” for a Logical SQL compile (for example) that actually took, say, 0.499 seconds. Per query this is not so significant, but if those queries run frequently then cumulatively that time stacks up, and it would be useful to be properly aware of it and able to target it with optimisation if needed.

Usage Tracking changes

Usage Tracking has five new columns for each logical query recorded in S_NQ_ACCT:

  • ECID
  • TENANT_ID
  • SERVICE_NAME
  • SESSION_ID
  • HASH_ID

The presence of SESSION_ID is very useful, because it means that user behaviour can be more accurately analysed. For example, within a session, how many reports does a user run? What is the median duration of a session? Note that the session here is the session as seen by the BI Server, rather than Presentation Services.
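
A sketch of the kind of thing that now becomes easy to answer (using the same BIPLATFORM schema prefix as the queries later in this post; adjust for your own install):

-- Queries per BI Server session, and the span of each session,
-- using the new SESSION_ID column
SELECT   session_id,
         user_name,
         COUNT(*)      AS queries_run,
         MIN(start_dt) AS first_query,
         MAX(end_dt)   AS last_query
FROM     biee_biplatform.s_nq_acct
GROUP BY session_id, user_name
ORDER BY queries_run DESC;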


ECID is also very useful for being able to link data in Usage Tracking back to more detailed entries in nqquery.log. Note that an ECID is multipart, concatenated with the RID, so you won’t necessarily get a direct hit on the ECID you find in Usage Tracking against that in nqquery.log, but rather a substring of it. In the example here the root ECID is 11d1def534ea1be0:20f8da5c:14d4441f7e9:-8000-0000000000001891,0:1:103 and the varying component of the relationship (RID) is 1 and 3 respectively:

Usage Tracking:

select ecid, session_id, start_dt, start_hour_min, saw_src_path from biee_biplatform.s_nq_acct


nqquery.log:

[2015-05-12T08:58:38.704-04:00] [...] [ecid: 11d1def534ea1be0:20f8da5c:14d4441f7e9:-8000-0000000000001891,0:1:103:3] [...]  
-------------------- SQL Request, logical request hash:  
3fabea2b  
SET VARIABLE QUERY_SRC_CD='Report',SAW_DASHBOARD='/shared/02. Visualizations/_portal/2.11 Table Designs',SAW_DASHBOARD_PG='Conditional Format',SAW_SRC_PATH='/shared/02. Visualizations/Configured Visuals/Conditional Formats/CF based on a hidden column',PREFERRED_CURRENCY='USD';SELECT^M  
   0 s_0,^M  
[...]

In the above example note how the absence of a timezone in the Usage Tracking data is an impediment to accurate interpretation of the results, compared to nqquery.log which has a fully qualified timezone offset.
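
If you want to do the ECID matching programmatically, the simplest approach is to treat the Usage Tracking ECID as a prefix and search nqquery.log for it; a minimal sketch from the shell (the log path shown is the standard location for an 11g install and will vary by environment):

# Prefix-match the Usage Tracking ECID against the BI Server query log
grep -F "11d1def534ea1be0:20f8da5c:14d4441f7e9:-8000-0000000000001891,0:1:103" \
  $FMW_HOME/instances/instance1/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqquery.log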

Usage Tracking changes – Physical Hash ID

As well as additions to the logical query table, there are two new columns for each physical query logged in S_NQ_DB_ACCT:

  • HASH_ID
  • PHYSICAL_HASH_ID

The implications of this are important – there is now native support in OBIEE for tracing OBIEE workloads directly down to the database (as discussed for OBIEE < 11.1.1.9 here), because the PHYSICAL_HASH_ID is what OBIEE sets as the ACTION field when it connects to the database, and it is available in Oracle through AWR, the V$ views, and DBMS_MONITOR. For example, in V$SESSION the ACTION field is set to the physical hash:

SQL> select username,program,action 
  from v$session where lower(program) like 'nqs%';

USERNAME PROGRAM                                          ACTION  
-------- ------------------------------------------------ ---------  
BISAMPLE nqsserver@demo.us.oracle.com (TNS V1-V3)         5065e891  
BISAMPLE nqsserver@demo.us.oracle.com (TNS V1-V3)         2b6148b2  
BISAMPLE nqsserver@demo.us.oracle.com (TNS V1-V3)  
BISAMPLE nqsserver@demo.us.oracle.com (TNS V1-V3)         8802f14e  
BISAMPLE nqsserver@demo.us.oracle.com (TNS V1-V3)         206c8d54  
BISAMPLE nqsserver@demo.us.oracle.com (TNS V1-V3)         c1c121a7
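
As a sketch of what this opens up, you can use the physical hash to scope database-side tracing to just that OBIEE request via DBMS_MONITOR; the service and module values below are illustrative and should be taken from your own V$SESSION data:

-- Enable SQL trace only for work tagged with a given OBIEE physical query hash
BEGIN
  DBMS_MONITOR.serv_mod_act_trace_enable(
    service_name => 'orcl',                                      -- from V$SESSION.SERVICE_NAME
    module_name  => 'nqsserver@demo.us.oracle.com (TNS V1-V3)',  -- from V$SESSION.MODULE
    action_name  => '5065e891',                                  -- the OBIEE physical hash
    waits        => TRUE,
    binds        => FALSE);
END;
/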

The ACTION is also visible in many of the Enterprise Manager database performance screens.

Now with OBIEE 11.1.1.9 the physical hash – which was previously only available in the nqquery.log file – is available in S_NQ_DB_ACCT which can in turn be joined to S_NQ_ACCT to find out the logical request related to the physical query seen on the database. Cool huh!

SELECT PHYSICAL_HASH_ID,  
       USER_NAME,  
       SAW_SRC_PATH,  
       SAW_DASHBOARD,  
       SAW_DASHBOARD_PG  
FROM   BIEE_BIPLATFORM.S_NQ_DB_ACCT PHYS  
       INNER JOIN BIEE_BIPLATFORM.S_NQ_ACCT LOGL  
               ON LOGL.ID = PHYS.LOGICAL_QUERY_ID  
WHERE  PHYS.PHYSICAL_HASH_ID = '5065e891'


This can be extended even further to associate AWR workload reports with specific OBIEE requests.

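The same ACTION attribute is recorded in the ASH/AWR history views, so as a rough sketch (Diagnostics Pack licensing applies) you can pull out the database SQL associated with a given OBIEE physical hash directly:

-- Which database SQL_IDs did this OBIEE physical query spend its time on?
SELECT   sql_id,
         COUNT(*) AS ash_samples
FROM     dba_hist_active_sess_history
WHERE    action = '5065e891'
GROUP BY sql_id
ORDER BY ash_samples DESC;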

One little grumble (no pleasing some people…) – it would have been nice if Usage Tracking also stored:

  • Timings at millisecond precision as well
  • The number of bytes (rather than just row count)
  • A proper TIMESTAMP WITH TIME ZONE (rather than the weird triplet of TS/DT/HOUR_MIN)
  • “Total time in BI Server”

Who knows, maybe in 12c?…

Footnote – START_TS in Usage Tracking in 11.1.1.9

As a note for others who may hit this issue, my testing has shown that Usage Tracking in 11.1.1.9 appears to have introduced a bug with START_TS (on both S_NQ_ACCT and S_NQ_DB_ACCT), in that it stores only the date, not date + time as it did in previous versions. For example:

  • 11.1.1.7:
    SELECT TO_CHAR(START_TS, 'YYYY-MM-DD HH24:MI:SS') AS START_TS, 
           TO_CHAR(START_DT, 'YYYY-MM-DD HH24:MI:SS') AS START_DT, 
           START_HOUR_MIN 
    FROM   S_NQ_ACCT 
    WHERE  ROWNUM < 2 
    
    START_TS            START_DT            START_HOUR_MIN   
    ------------------- ------------------- -----  
    2015-03-19 15:32:23 2015-03-19 00:00:00 15:32
  • 11.1.1.9:
    SELECT TO_CHAR(START_TS, 'YYYY-MM-DD HH24:MI:SS') AS START_TS, 
           TO_CHAR(START_DT, 'YYYY-MM-DD HH24:MI:SS') AS START_DT, 
           START_HOUR_MIN 
    FROM   S_NQ_ACCT 
    WHERE  ROWNUM < 2 
    
    START_TS            START_DT            START_HOUR_MIN   
    ------------------- ------------------- -----  
    2015-01-27 00:00:00 2015-01-27 00:00:00 10:41
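
Until this is fixed, one workaround (just a sketch, and it loses the seconds since START_HOUR_MIN is only minute-grained) is to rebuild an approximate start time from the two columns that do survive:

-- Reconstruct an approximate start time from START_DT + START_HOUR_MIN
SELECT TO_DATE(TO_CHAR(start_dt, 'YYYY-MM-DD') || ' ' || start_hour_min,
               'YYYY-MM-DD HH24:MI') AS approx_start_time,
       saw_src_path
FROM   biee_biplatform.s_nq_acct;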

Initialisation Block information in Usage Tracking

A new table, S_NQ_INITBLOCK, has been added to BIPLATFORM and holds details of when an init block ran, for which user, and importantly, how long it took. From a performance analysis point of view this is really valuable data, and it’s good to see it being added to the diagnostic data captured to the database with Usage Tracking.

From a glance at the data it looks like there’s a bit of bonus logging going on, with user sign-in/sign-out also recorded (“SIGNNING ON/SIGNED ON/SIGNED OFF”).

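Once it’s populated you can start asking the obvious questions; a sketch of the sort of query I mean is below, but note that the column names are my assumption and should be checked against the actual S_NQ_INITBLOCK definition in your BIPLATFORM schema:

-- Which init blocks are costing the most time? (column names are assumptions)
SELECT   block_name,
         COUNT(*)      AS executions,
         AVG(duration) AS avg_duration,
         MAX(duration) AS max_duration
FROM     biee_biplatform.s_nq_initblock
GROUP BY block_name
ORDER BY avg_duration DESC;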

Note that there is no MBean for Init Block Usage Tracking, so regardless of how you configure the rest of Usage Tracking, you need to go to NQSConfig.ini to enable this one.

Presentation Services Cursor Cache

Oracle have added some additional Administration functionality for viewing and managing sessions and the cursor cache in Presentation Services. These let you track and trace user sessions more precisely.

From the Administration Page in OBIEE the new options are:


  1. Set dynamic log level per session from manage sessions
  2. Filter cursor cache based on specific user sessions
  3. Change sort order of cursor cache
  4. Show Presentation Services diagnostics per cursor
  5. Download cursor cache list as CSV

Some of these are somewhat low-level and will not be used day-to-day, but the general move towards a more open diagnostics interface with OBIEE is really positive and I hope we see more of it in 12c… :-)

Command Line Aggregate Advisor

Only for use by those with an Exalytics licence, the Summary Advisor was previously available only in the Windows Administration Tool, but can now be run from the command line:

[oracle@demo setup]$ nqaggradvisor -h

Usage:  
    nQAggrAdvisor -d <dataSource> -u <userName> -o <outputFile> -c <tupleInQuotes>  
                  [-p <password>] [-F <factFilter>] [-z <maxSizeAggr>] [-g <gainThreshold>]  
                  [-l <minQueryTime>] [-t <timeoutMinutes>] [-s <startDate>]  
                  [-e <endDate>] [-C <on/off>] [-M <on/off>] [-K <on/off>]

Options:  
    -d      : Data source name  
    -u      : User name  
    -o      : Output aggregate persistence script file name  
    -c      : Aggregate persistence target - tuple in quotes: Fully qualified Connection pool, fully qualified schema name, capacity in MB  
    -p      : Password  
    -F      : Fact filter file name  
    -z      : Max size of any single aggregate in MB  
    -g      : Summary advisor will run until performance improvement for new aggregates drops below this value, default = 1  
    -l      : The minimum amount of query time accumulated per LTS in seconds, before it is included for analysis, default = 0  
    -t      : Max run time in minutes - 0 for unlimited, default = 0  
    -s      : Statistics start date  
    -e      : Statistics end date  
    -C      : Prefer optimizer estimates - on/off, default = off  
    -M      : Only include measures used in queries - on/off, default = off  
    -K      : Use surrogate keys - on/off, default = on

Examples:  
    nQAggrAdvisor -d "AnalyticsWeb" -u "Administrator" -p "ADMIN" -o "C:\temp\aggr_advisor.out.txt"  
        -c "DW_Aggr"."Connection Pool","DW_Aggr".."AGGR",1000

    nQAggrAdvisor -d "AnalyticsWeb" -u "Administrator" -p "ADMIN" -o "C:\temp\aggr_advisor.out.txt" -F "C:\temp\fact_filter.txt" -g 10  
        -c "TimesTen_instance1"."Connection Pool","dbo",2000 -s "2011-05-02 08:00:00" -e "2011-05-07 18:30:00"  -C on -M on -K off

Note that in the BIPLATFORM schema S_NQ_SUMMARY_STATISTICS is now called S_NQ_SUMMARY_ADVISOR.

HTML5 images

In previous versions of OBIEE graph images were rendered in Flash by default, and as PNG on mobile devices. You could force it to use PNG for all images, but you would lose the interactivity (tooltips etc). Now in OBIEE 11.1.1.9 you can change the default from Flash to HTML5. This removes the need for a Flash plugin and is generally the way that a lot of visualisations are done on the web nowadays. To my eye there’s no difference in appearance.


To use HTML5 graphs by default, edit instanceconfig.xml and under the <Charts> section add:

<DefaultWebImageType>html5</DefaultWebImageType>

Note that html5 is case-sensitive. The config file should look something like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>  
<WebConfig xmlns="oracle.bi.presentation.services/config/v1.1">  
   <ServerInstance>  
   [...]  
        <Views>  
        [...]  
            <Charts>  
                <DefaultWebImageType>html5</DefaultWebImageType>  
            [...]  
            </Charts>  
        [...]  
        </Views>  
    [...]  
   </ServerInstance>  
</WebConfig>

If Presentation Services doesn’t come back up when you restart it after making this change then check the stdout logfile console~coreapplication_obips1~1.log as well as the standard sawlog.log file, both of which you’ll find in $FMW_HOME/instances/instance1/diagnostics/logs/OracleBIPresentationServicesComponent/. The reason to check the console log file is that Presentation Services will refuse to start if the configuration supplied is invalid, and you’ll see an error message stating this here.

NQS ODBC functions

One for the Neos amongst you, a quick call of NQSGetSQLProcedures (as seen in SampleApp dashboard 7.90 NQS ODBC Procedures) and comparison with 11.1.1.7.150120 shows the following new & changed NQS ODBC calls. If this means nothing to you then it probably doesn’t need to, but if you’re interested in exploiting OBIEE functionality from all angles, documented or not, then these might be of interest. It goes without saying, these are entirely undocumented and unsupported, completely liable to change or be removed at any time by Oracle.

  • Added
    • NQSGetUserDefinedFunctions
    • NQSPAFIntegration
    • NQSSearchPresentationObjects
    • NQS_GetAllCacheEntries
    • NQS_GetOverallCacheInfo
    • NQS_GetRepositories
    • NQS_LoadNewBaseRP
    • NQS_LoadNewRPVersion
    • NQS_LockSessionAgainstAutoRPSwitchOver
    • NQS_SetRPDReadOnlyMode
    • NQS_SwitchOverThisSessionToNewRP
    • SAPurgeCacheBySubjectArea
    • SAPurgeCacheEntryByIDVector
    • SAPurgeXSACache
    • SASeedXSACache
  • Modified
    • NQSGetQueryLogExcerpt (additional parameter)
    • SAPurgeInternalCache (additional enum)
  • Removed
    • NQSChangeSelfPassword
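
To give a flavour of how these get invoked, ODBC procedures are just fired at the BI Server like any other statement, so from nqcmd something like the following works. I’m showing the long-standing SAPurgeAllCache call here; the new entries above presumably follow the same Call pattern, but as undocumented calls, test them before you rely on them:

# Purge the BI Server cache by sending an ODBC procedure call through nqcmd
echo "Call SAPurgeAllCache();" > /tmp/purge.lsql
nqcmd -d AnalyticsWeb -u weblogic -p Admin123 -s /tmp/purge.lsql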

Web Services

Web Services are one of the best ways to integrate with OBIEE programmatically. You don’t need to be building heavy Java apps just to use them – you can create and send the necessary SOAP messages from Python, or even just send them from bash with curl.

There are 2.5 new WSDLs – two new ones (v9, v10) plus v8 which has changed. The new services are:

  • KPIAssessmentService
  • ScorecardAssessmentService
  • ScorecardMetadataService
  • UserPersonalizationService

You’ll find documentation for the Web Services in the Integrator’s Guide.
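
Just to show how low the barrier to entry is, here’s a sketch of a session logon call sent from bash with curl. The host, port, credentials, endpoint and namespace version here are all assumptions on my part, so generate the real message shapes from the WSDL for your own environment before using anything like this:

# Sketch: log on to the OBIEE session web service with a hand-rolled SOAP message
# (endpoint URL and namespace version are assumptions - check your WSDL)
cat > /tmp/logon.xml <<'EOF'
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:sess="urn://oracle.bi.webservices/v8">
  <soapenv:Body>
    <sess:logon>
      <sess:name>weblogic</sess:name>
      <sess:password>Admin123</sess:password>
    </sess:logon>
  </soapenv:Body>
</soapenv:Envelope>
EOF
curl --silent --header "Content-Type: text/xml" --data-binary @/tmp/logon.xml \
  "http://demo:9704/analytics-ws/saw.dll?SoapImpl=nQSessionService"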

User Image Upload

Users can now upload their own images for use in Title views, conditional formats, etc. From an administration point of view this means you’ll want to keep an eye on /root/shared/custom/images/ in the Presentation Catalog, either on disk and/or through the OBIEE Catalog view (switch to Admin and enable “Show Hidden Items”).

QUERY_LIMIT_WARNING_INSTEAD_OF_ERROR

This new setting in NQSConfig.ini will warn users when they’re breaching defined query limits, but it won’t abort the query.
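
A sketch of how it looks in NQSConfig.ini; I’d double-check the 11.1.1.9 template file for exactly which section Oracle put it in, as the placement and value syntax below are assumptions:

# Warn the user when a defined query limit is breached, rather than failing the query
QUERY_LIMIT_WARNING_INSTEAD_OF_ERROR = YES;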

Pointless hacks

If you’re a geek like me, part of the fun of a new tool is simply poking around and seeing what’s new – not necessarily what’s useful. There’s plenty of great new stuff in 11.1.1.9, but let’s take a look under the hood, just Because Geek.

It was John Minkjan who first blogged, several years ago, about the xsd configuration schema files, and it is from these that we can find all the things that Presentation Services might be able to do – not just what it definitely can do, and not just what Oracle have documented that it can do. I wrote about some of these options a while back, and there are a few new ones in 11.1.1.9.

ALL OF THESE ARE COMPLETELY UNDOCUMENTED AND UNSUPPORTED. DO NOT USE THEM.

  • EnableCloudBIEEHome sets the home page of OBIEE to be as it would be on BI Cloud Service (BICS). This is completely pointless since all the interesting stuff (Load Data, Model, Manage) is non-existent, even if it does give us a clue which application deployments are going to be supplying them (bimodeler and biserviceadministration respectively)

  • GridViews/ShowDataModels outputs a bunch of debug data in Answers Table or Pivot Views.

  • VirusScannerConfiguration – When a user uploads a custom image, this command will be called with it. For example, this simple script writes to a file the time and name of the file passed to it:

    #!/bin/bash
    echo '---' >> /tmp/log.txt
    date >> /tmp/log.txt
    echo "$1" >> /tmp/log.txt

    If I save this as /tmp/test-script.sh and add it to instanceconfig.xml:

    <VirusScannerConfiguration>  
       <ScannerInvocationCommandLine>/tmp/test-script.sh</ScannerInvocationCommandLine>  
    </VirusScannerConfiguration>

    When I upload an image I get a row written to my log file. That in itself isn’t useful, but it could be a handy hook maybe from an auditing point of view, or indeed, virus scanning:

    [oracle@demo tmp]$ cat /tmp/log.txt  
    ---  
    Wed May 20 16:01:47 EDT 2015  
    /app/oracle/biee/instances/instance1/tmp/OracleBIPresentationServicesComponent/coreapplication_obips1/defaultpool/sawserver_8673_5553759a_2-1.tmp

BI Forum 2015 Preview — OBIEE Regression Testing, and Data Discovery with the ELK stack

I’m pleased to be presenting at both of the Rittman Mead BI Forums this year; in Brighton it’ll be my fourth time, whilst Atlanta will be my first, and my first trip to the city too. I’ve heard great things about the food, and I’m sure the forum content is going to be awesome too (Ed: get your priorities right).

OBIEE Regression Testing

In Atlanta I’ll be talking about Smarter Regression testing for OBIEE. The topic of Regression Testing in OBIEE is one that is – at last – starting to gain some real momentum. One of the drivers of this is the recognition in the industry that a more Agile approach to delivering BI projects is important, and to do this you need to have a good way of rapidly testing changes made. The other driver that I see is OBIEE 12c and the Baseline Validation Tool that Oracle announced at Oracle OpenWorld last year. Understanding how OBIEE works, and therefore how changes made can be tested most effectively, is key to a successful and efficient testing process.

In this presentation I’ll be diving into the OBIEE stack and explaining where it can be tested and how. I’ll discuss the common approaches and the relative strengths of each.

If you’ve not registered for the Atlanta BI Forum then do so now as places are limited and selling out fast. It runs May 14–15 with an optional masterclass on Wednesday 13th May from Mark Rittman and Jordan Meyer.

Data Discovery with the ELK Stack

My second presentation is at the Brighton forum the week before Atlanta, and I’ll be talking about Data Discovery and Systems Diagnostics with the ELK stack. The ELK stack is a set of tools from a company called Elastic, comprising Elasticsearch, Logstash and Kibana (E – L – K!). Data Discovery is a crucial part of the life cycle of acquiring, understanding, and exploiting data (one could even say, leverage the data). Before you can operationalise your reporting, you need to understand what data you have, how it relates, and what insights it can give you. This idea of a “Discovery Lab” is one of the key components of the Information Management and Big Data Reference Architecture that Oracle and Rittman Mead produced last year:

ELK gives you great flexibility to ingest data with loose data structures and rapidly visualise and analyse it. I wrote about it last year with an example of analysing data from our blog and associated tweets with data originating in Hadoop, and more recently have been analysing twitter activity using it. The great power of Kibana (the “K” of ELK) is the ability to rapidly filter and aggregate data, as well as see a summary of values within a data field:

The second aspect of my presentation is still on data discovery, but “discovering data” within the logfiles of an application stack such as OBIEE. ELK is perfectly suited to in-depth diagnostics against dense volumes of log data that you simply could not handle within simple log viewers or Enterprise Manager, such as the individual HTTP requests and types of value passed within the interactions of a single user session:

By its nature of log streaming and full text search, ELK also lends itself well to near real time system monitoring dashboards reporting the status of systems including OBIEE and ODI, and I’ll be discussing this in more detail during my talk.

The Brighton BI Forum is on 7–8 May, with an optional masterclass on Wednesday 6th May from Mark Rittman and Jordan Meyer. If you’ve not registered for the Brighton BI Forum then do so now as places are very limited!


Don’t forget, we’re running a Data Visualisation Challenge at each of the forums, and if you need to convince your boss to let you go you can find a pre-written ‘justification’ letter here.


OBIEE nqcmd Tidbits

nqcmd is the ODBC command line tool that always has shipped, and hopefully always will ship, with OBIEE. It enables you to manually fire queries directly at the BI Server, rather than going through the usual route of Presentation Services generating Logical SQL and sending it to the BI Server. This can be useful in several cases:

  1. Automated cache purging, by sending one of the SAPurge[…] ODBC commands to the BI Server, usually done as part of a script
  2. Automated execution of Logical SQL, often done to support testing scenarios
  3. Load Testing the BI Server (via a magic undocumented switch, SA_NQCMD_ADVANCED)
  4. Manual interrogation of the BI Server – if you want to poke and prod nqsserver without launching a web browser, nqcmd is your friend :)

In using nqcmd there are a couple of things I want to demonstrate here that I find useful but haven’t seen discussed [in detail] elsewhere.

Query Log via nqcmd

All BI Server queries run with a LOGLEVEL>=1 will write some log details to nqquery.log. The usual route to view this is either directly on the server itself, by transferring the file off with a tool such as WinSCP, or through the Administration page of OBIEE. Another option is available from nqcmd itself. You need to do two things:

  1. Set the environment variable SA_NQCMD_ADVANCED to Yes
  2. Include the command line arguments -ShowQueryLog -H when you invoke nqcmd. I don’t know what -H does – it’s just specified as being required for this to work.

Here’s a simple example in action:

[oracle@demo ~]$ export SA_NQCMD_ADVANCED=Yes
[oracle@demo ~]$ nqcmd -d AnalyticsWeb -u prodney -p Admin123 -ShowQueryLog -H

-------------------------------------------------------------------------------
          Oracle BI ODBC Client
          Copyright (c) 1997-2013 Oracle Corporation, All rights reserved
-------------------------------------------------------------------------------



Connection open with info:
[0][State: 01000] [DataDirect][ODBC lib] Application's WCHAR type must be UTF16, because odbc driver's unicode type is UTF16

        [T]able info
        [C]olumn info
        [D]ata type info
        [F]oreign keys info
        [P]rimary key info
        [K]ey statistics info
        [S]pecial columns info
        [Q]uery statement
Select Option: Q

Give SQL Statement: SET VARIABLE LOGLEVEL=1:SELECT "A - Sample Sales"."Base Facts"."1- Revenue" s_1 FROM "A - Sample Sales"
SET VARIABLE LOGLEVEL=1:SELECT "A - Sample Sales"."Base Facts"."1- Revenue" s_1 FROM "A - Sample Sales"
-----------------------
s_1
-----------------------
70000000.00
-----------------------
Row count: 1
-----------------------
[2015-03-21T16:36:31.000+00:00] [OracleBIServerComponent] [TRACE:1] [USER-0] [] [ecid: 0054Sw944KmFw000jzwkno0003ac0000rl,0] [tid: 56660700] [requestid: 201f0002] [sessionid: 201f0000] [username: prodney] ###
########################################### [[
-------------------- SQL Request, logical request hash:
d2294415
SET VARIABLE LOGLEVEL=1:SELECT "A - Sample Sales"."Base Facts"."1- Revenue" s_1 FROM "A - Sample Sales"

]]
[2015-03-21T16:36:31.000+00:00] [OracleBIServerComponent] [TRACE:1] [USER-34] [] [ecid: 0054Sw94mRzFw000jzwkno0003ac0000ro,0] [tid: 56660700] [requestid: 201f0002] [sessionid: 201f0000] [username: prodney] -------------------- Query Status: Successful Completion [[

]]
[2015-03-21T16:36:31.000+00:00] [OracleBIServerComponent] [TRACE:1] [USER-28] [] [ecid: 0054Sw94mRzFw000jzwkno0003ac0000ro,0] [tid: 56660700] [requestid: 201f0002] [sessionid: 201f0000] [username: prodney] -------------------- Physical query response time 0 (seconds), id <<333971>> [[

]]

]]
[2015-03-21T16:36:31.000+00:00] [OracleBIServerComponent] [TRACE:1] [USER-29] [] [ecid: 0054Sw94mRzFw000jzwkno0003ac0000ro,0] [tid: 56660700] [requestid: 201f0002] [sessionid: 201f0000] [username: prodney] -------------------- Physical Query Summary Stats: Number of physical queries 1, Cumulative time 0, DB-connect time 0 (seconds) [[

]]
[2015-03-21T16:36:31.000+00:00] [OracleBIServerComponent] [TRACE:1] [USER-33] [] [ecid: 0054Sw94mRzFw000jzwkno0003ac0000ro,0] [tid: 56660700] [requestid: 201f0002] [sessionid: 201f0000] [username: prodney] -------------------- Logical Query Summary Stats: Elapsed time 0, Response time 0, Compilation time 0 (seconds) [[

]]

Neat! But so what? Well, I see two uses straight away:

  1. In some situations you may not have access to the filesystem of the server on which the BI Server is running. For example, as a consultant I’ve been to clients where I’m given the Administration Tool client installation only. If I want to debug an RPD that I’m developing I’ll usually want to poke around in nqquery.log to see quite what physical SQL is being generated – and now I can.
  2. There was a discussion on the EMG mailing list recently about generating Physical SQL without executing it on the database. I’m going to discuss this in the next section of this article, and to do the analysis for this rapidly I’m using the inline query log.

Generating Physical SQL for OBIEE without Executing it – SKIP_PHYSICAL_QUERY_EXEC

OBIEE generates the Physical SQL that it runs against the database dynamically, at runtime. It takes the Logical request (“Logical SQL”), runs it through the RPD and generates one or more “Physical SQL” statements to be executed on the database as required to pull back the necessary data. A question arose recently on the EMG mailing list as to whether it is possible to get the Physical SQL – without executing it. You can imagine the benefits of this (namely, regression testing), since executing the database query each time is typically going to be expensive in machine resources and time consuming.

In SampleApp v406 there is a script, /home/oracle/scripts/PhysicalSQLGenerator, which does two things. First off it generates the Logical SQL for a given analysis, presumably using the generateReportSQL web service. It then takes that and runs it through nqcmd, scraping the nqquery.log for the resulting Physical SQL. In all of this no database queries get run. Very cool. But what’s the “secret sauce” at play here – can we distill it down in order to use it ourselves?

First, let’s look at how the SampleApp script does it. It sets some additional request variables in the Logical SQL:

[oracle@demo PhysicalSQLGenerator]$ cat lsql-out-dir/q1.lsql
SET VARIABLE SKIP_PHYSICAL_QUERY_EXEC=1, LOGLEVEL=2, DISABLE_CACHE_HIT=1, DISABLE_CACHE_SEED=1, QUERY_SRC_CD='SampleApp-PSQLGEN', SAW_SRC_PATH='/users/prodney/folder/request variable example':SELECT
   0 s_0,
   "A - Sample Sales"."Base Facts"."1- Revenue" s_1
FROM "A - Sample Sales"
ORDER BY 1
FETCH FIRST 5000001 ROWS ONLY
;

And if we extract the relevant part out of the bash script we can see that it also uses a couple of extra command line arguments (-q -NoFetch) when invoking nqcmd:

nqcmd -q -NoFetch -d AnalyticsWeb -u weblogic -p Admin123 -s lsql-out-dir/q1.lsql

When it’s run we check nqquery.log and lo-and-behold we get this: (edited for brevity)

------------------- Sending query to database named 01 - Sample App Data (ORCL) (id: <<69923>>), connection pool named Sample Relational Connection, logical request hash dd4fb54f, physical request hash 8d6f363d: [[
WITH
SAWITH0 AS (select sum(T42442.Revenue) as c1
from
     BISAMPLE.SAMP_REVENUE_FA2 T42442 /* F21 Rev. (Aggregate 2) */ )
select D1.c1 as c1, D1.c2 as c2 from ( select distinct 0 as c1,
     D1.c1 as c2
from
     SAWITH0 D1 ) D1 where rownum <= 5000001

]]

Query Status: Successful Completion [[

Rows 0, bytes 24 retrieved from database query id: <<69923>> Simulation Gateway 

Physical query response time 0 (seconds), id <<69923>> Simulation Gateway

Whilst the log says it is “Sending query to database” it does no such thing, and the “Simulation Gateway” is the giveaway clue. Proof that it doesn’t connect to the database? I shut the database down, and it still worked just fine. Crude, yes, but effective.

I’ll intersperse here the little trick that I mentioned in the first part of this article: -ShowQueryLog. It’s tedious switching back and forth between nqcmd and nqquery.log when doing this kind of testing, so let’s do it all as one:

export SA_NQCMD_ADVANCED=Yes
nqcmd -H -ShowQueryLog -q -NoFetch -d AnalyticsWeb -u weblogic -p Admin123 -s lsql-out-dir/q1.lsql

Unfortunately it looks like -ShowQueryLog is mutually exclusive with -q and -NoFetch, since it doesn’t return anything, even though nqquery.log did get additional entries. But that’s fine, since by removing these two flags in order to get -ShowQueryLog to work we’re whittling down what is actually needed to generate the physical SQL on its own, without database execution. Here’s the nqcmd output, showing the query log inline and still showing the “Simulation Gateway” entries indicative of no physical query execution:

[oracle@demo PhysicalSQLGenerator]$ export SA_NQCMD_ADVANCED=Yes
[oracle@demo PhysicalSQLGenerator]$ nqcmd -H -ShowQueryLog -d AnalyticsWeb -u weblogic -p Admin123 -s lsql-out-dir/q1.lsql

-------------------------------------------------------------------------------
          Oracle BI ODBC Client
          Copyright (c) 1997-2013 Oracle Corporation, All rights reserved
-------------------------------------------------------------------------------

[...]

------------------------------------
s_0          s_1
------------------------------------
------------------------------------
Row count: 0
------------------------------------
[2015-03-23T05:52:57.000+00:00] [OracleBIServerComponent] [TRACE:2] [USER-0] [] [ecid: 0054Ut7AJ33Fw000jzwkno0005UZ00005Q,0] [tid: 8f194700] [requestid: 8a1e0002] [sessionid: 8a1e0000] [username: weblogic] ############################################## [[
-------------------- SQL Request, logical request hash:
dd4fb54f
SET VARIABLE SKIP_PHYSICAL_QUERY_EXEC=1, LOGLEVEL=2, DISABLE_CACHE_HIT=1, DISABLE_CACHE_SEED=1, QUERY_SRC_CD='SampleApp-PSQLGEN', SAW_SRC_PATH='/users/prodney/folder/request variable example':SELECT
   0 s_0,
   "A - Sample Sales"."Base Facts"."1- Revenue" s_1
FROM "A - Sample Sales"
ORDER BY 1
FETCH FIRST 5000001 ROWS ONLY



[...]

[2015-03-23T05:52:57.000+00:00] [OracleBIServerComponent] [TRACE:2] [USER-18] [] [ecid: 0054Ut7AK5DFw000jzwkno0005UZ00005S,0] [tid: 8f194700] [requestid: 8a1e0002] [sessionid: 8a1e0000] [username: weblogic] -------------------- Sending query to database named 01 - Sample App Data (ORCL) (id: <<70983>>), connection pool named Sample Relational Connection, logical request hash dd4fb54f, physical request hash 8d6f363d: [[
WITH
SAWITH0 AS (select sum(T42442.Revenue) as c1
from
     BISAMPLE.SAMP_REVENUE_FA2 T42442 /* F21 Rev. (Aggregate 2) */ )
select D1.c1 as c1, D1.c2 as c2 from ( select distinct 0 as c1,
     D1.c1 as c2
from
     SAWITH0 D1 ) D1 where rownum <= 5000001

]]
[2015-03-23T05:52:57.000+00:00] [OracleBIServerComponent] [TRACE:2] [USER-34] [] [ecid: 0054Ut7AYi0Fw000jzwkno0005UZ00005T,0] [tid: 8f194700] [requestid: 8a1e0002] [sessionid: 8a1e0000] [username: weblogic] -------------------- Query Status: Successful Completion [[

]]
[2015-03-23T05:52:57.000+00:00] [OracleBIServerComponent] [TRACE:2] [USER-26] [] [ecid: 0054Ut7AYi0Fw000jzwkno0005UZ00005T,0] [tid: 8f194700] [requestid: 8a1e0002] [sessionid: 8a1e0000] [username: weblogic] -------------------- Rows 0, bytes 24 retrieved from database query id: <<70983>> Simulation Gateway [[

]]
[2015-03-23T05:52:57.000+00:00] [OracleBIServerComponent] [TRACE:2] [USER-28] [] [ecid: 0054Ut7AYi0Fw000jzwkno0005UZ00005T,0] [tid: 8f194700] [requestid: 8a1e0002] [sessionid: 8a1e0000] [username: weblogic] -------------------- Physical query response time 0 (seconds), id <<70983>> Simulation Gateway [[

[...]

It’s clear that the “-q -NoFetch” parameters used in nqcmd don’t have an effect on whether the physical query is executed (they’re to do with whether nqcmd, as an ODBC client, pulls back and displays the data you ask for). It’s actually just a single request variable that does the job, and it goes under the rather obvious name of SKIP_PHYSICAL_QUERY_EXEC. When set to 1 it generates all the necessary physical SQL but doesn’t execute it, and the presence of “Simulation Gateway” in the log signals this.
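
So, distilled right down, the whole recipe is just this (the DSN, credentials and Logical SQL are the same SampleApp ones used above; substitute your own):

# Generate the Physical SQL for a Logical SQL statement without executing it on the database
export SA_NQCMD_ADVANCED=Yes

cat > /tmp/skip_exec.lsql <<'EOF'
SET VARIABLE SKIP_PHYSICAL_QUERY_EXEC=1, LOGLEVEL=2:SELECT "A - Sample Sales"."Base Facts"."1- Revenue" s_1 FROM "A - Sample Sales";
EOF

# -ShowQueryLog prints the query log (and therefore the generated Physical SQL) inline
nqcmd -H -ShowQueryLog -d AnalyticsWeb -u weblogic -p Admin123 -s /tmp/skip_exec.lsql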