Tag Archives: Obiee

ETL Offload with Spark and Amazon EMR – Part 1

We recently undertook a two-week Proof of Concept exercise for a client, evaluating whether their existing ETL processing could be done faster and more cheaply using Spark. They were also interested in whether something like Redshift would provide a suitable data warehouse platform for them. In this series of blog articles I will look at how we did this, and what we found.

Background

The client has an existing analytics architecture based primarily around Oracle database, Oracle Data Integrator (ODI), Oracle GoldenGate, and Oracle Business Intelligence Enterprise Edition (OBIEE), all running on Amazon EC2. The larger architecture in the organisation is all AWS based too.

Existing ETL processing for the system in question is done using ODI, loading data daily into a partitioned Oracle table, with OBIEE providing the reporting interface.

There were two aspects to the investigation that we did:

  • Primarily, what would an alternative platform for the ETL look like? With lots of coverage recently of the concept of "ETL offloading" and "Apache-based ETL", the client was keen to understand how they might take advantage of this

    Within this, key considerations were:

    • Cost
    • Scalability
    • Maintenance
    • Fit with existing and future architectures
  • The second aspect was to investigate whether the performance of the existing reporting could be improved. Despite having data for multiple years in Oracle, queries were too slow to provide information other than within a period of a few days.

Oracle licenses were a sensitive point for the client, who were keen to reduce - or at least, avoid increased - costs. ODI for Big Data requires additional licence, and so was not in scope for the initial investigation.

Data and Processing

The client uses their data to report on the level of requests for different products, including questions such as:

  • How many requests were there per day?
  • How many requests per product type in a given period?
  • For a given product, how many requests were there, from which country?

Data volumes were approximately 50MB, arriving in batch files every hour. Reporting requirements were previous day and before only. Being able to see data intra-day would be a bonus but was not a requirement.

High Level Approach

Since the client already uses Amazon Web Services (AWS) for all its infrastructure, it made sense to remain in the AWS realm for the first round of investigation. We broke the overall requirement down into pieces, so as to understand (a) the most appropriate tool at each point and (b) the toolset with best overall fit. A very useful reference for an Amazon-based big data design is the presentation Big Data Architectural Patterns and Best Practices on AWS. Even if you're not running on AWS, the presentation has some useful pointers for things like where to be storing your data based on volumes, frequency of access, etc.

Data Ingest

The starting point for the data was Amazon's storage service - S3, in which the data files in CSV format are landed by an external process every hour.

Processing (Compute)

Currently the processing is done by loading the external data into a partitioned Oracle table, and resolving dimension joins and de-duplication at query time.

Taking away any assumptions, other than a focus on 'new' technologies (and a bias towards AWS where appropriate), we considered:

  • Switch out Oracle for Redshift, and resolve the joins and de-duplication there
    • Loading the data to Redshift would be easy, but would be switching one RDBMS-based solution for another. Part of the aim of the exercise was to review a broader solution landscape than this.
  • Use Hadoop-based processing, running on Elastic Map Reduce (EMR):

    • Hive QL to process the data on S3 (or HDFS on EMR)
      • Not investigated, because provides none of the error handling etc that Spark would, and Spark has SparkSQL for any work that needs doing in SQL.
    • Pig
      • Still used, but 'old' technology, somewhat esoteric language, and superseded by Spark
    • Spark
      • Support for several languages including commonly-used ones such as Python
      • Gaining increasing levels of adoption in the industry
      • Opens up rich eco-system of processing possibilities with related projects such as Machine Learning, and Graph.

We opted to use Spark to process the files, joining them to the reference data, and carrying out de-duplication. For a great background and discussion on Spark and its current place in data architectures, have a listen to this podcast.

Storage

The output from Spark was written back to S3.

Analytics

With the processed data in S3, we evaluated two options here:

  • Load it to Redshift for query
  • Query in-place with a SQL-on-Hadoop engine such as Presto or Impala
    • With the data at rest on S3, Amazon's Athena is also of interest here, but was released after we carried out this particular investigation.

The presumption was that OBIEE would continue to provide the front-end to the analytics. Oracle's Data Visualization Desktop tool was also of interest.

In the next post we'll see the development environment that we used for prototyping. Stay tuned!

Source Control and Automated Code Deployment Options for OBIEE

It's Monday morning. I've arrived at a customer site to help them - ironically enough - with automating their OBIEE code management. But, on arrival, I'm told that the OBIEE team can't meet with me because someone did a release on the previous Friday, had now gone on holiday - and the wrong code was released but they didn't know which version. All hands-on-deck, panic-stations!

This actually happened to me, and in recent months too. In this kind of situation hindsight gives us 20:20 vision, and of course there shouldn't be a single point of failure, of course code should be under version control, of course it should be automated to reduce the risk of problems during deployments. But in practice, these things often don't get done - and it's understandable why. In the very early days of a project, it will be a manual process because that's what is necessary as people get used to the tools and technology. As time goes by, project deadlines come up, and tasks like this are seen as "zero sum" - sure we can automate it, but we can also continue doing it manually and things will still get done, code will still get released. After a while, it's just accepted as how things are done. In effect, it is technical debt - and this is your reminder that debt has to be paid, sooner or later :)

I'll not pretend that managing OBIEE code in source control, and automating code deployments, is straightforward. But, it is necessary, so in this post I'll walk through why you should be doing it, and then importantly how.

Why Source Control?

Do we really need source control for OBIEE? After all, what's wrong with the tried-and-tested method of sticking it all in a folder like this?

sdlc01.png

What's wrong with this? What's right with this? Oh lack of source control, let me count the number of ways that I doth hate thee:

  1. No audit trail of who changed something
  2. No audit of what was changed, and when
  3. No enforceable naming standards for versions
  4. No secure way of identifying deployment candidates
  5. No distributed method for sharing code (don't tell me that a network share counts!)
  6. No way of reliably identifying the latest version of code

These range from the immediately practical through to the slightly more abstract but necessary in a mature deployment.

Of immediate impact is the simply ability to identify the latest version of code on which to make new changes. Download the copy from the live server? Really? No. If you're tracking your versions accurately and reliably then you simply pull the latest version of code from there, in the knowledge that it is the version that is live. No monkeying around trying to figure out if it really is (just because it's called "PROD-091216.rpd" how do you know that's actually what got released to Production? And was that on 12th December or 9th September? Who knows!).

Longer term, having a secure and auditable code line simply makes it easier and less risky to manage. It also gives you the ability to work with it in a much more flexible manner, such as genuine concurrent development by multiple developers against the RPD. You can read more about this in my presentation here.

Which Source Control?

I don't care. Not really. So long as you are using source control, I am happy.

For preference, I always advocate using git. It is a modern platform, with strong support from many desktop clients (SourceTree is my favourite, along with the commandline too, natch). Git is decentralised, meaning that you can commit and branch code locally on your own machine without having to be connected to a server. It supports a powerful fork and pull process too, which is part of the reason it has almost universal usage within the open source world. The most well known of git platforms is github, which in effect provides git as a Platform-as-a-service (PaaS), in a similar fashion to Bitbucket too. You can also run git on its own locally, or more pragmatically, with gitlab.

But if you're using Subversion (SVN), Perforce, or whatever - that's fine. The key thing is that you understand how to use it, and that it is supported within your organisation. For simple source control, pretty much all the common platforms work just fine. If you get onto more advanced use, such as feature-branches and concurrent development, you may find it worth ensuring that your chosen platform supports the workflow that you adopt. Even then, whilst I'd chose git for preference, at Rittman Mead we've helped clients develop very powerful concurrent development processes with Subversion providing the underlying source control.

What Goes into Source Control? Part 1

So you've drunk the Source Control koolaid, and accepted that really there is no excuse not to use it. So what do you put into it? The RPD? The OBIEE 12c BAR file? What if you're still on OBIEE 11g? The answer here depends partially on how you are planning to manage code deployment in your environment. For a fully automated solution, you may opt to store code in a more granular fashion than if you are simply migrating full BAR files each time. So, read on to understand about code deployment, and then we'll revisit this question again after that.

How Do You Deploy Code Changes in OBIEE?

The core code artefacts are the same between OBIEE 11g and OBIEE 12c, so I'll cover both in this article, pointing out as we go any differences.

The biggest difference with OBIEE 12c is the concept of the "Service Instance", in which the pieces for the "analytical application" are clearly defined and made portable. These components are:

  • Metadata model (RPD)
  • Presentation Catalog ("WebCat"), holding all analysis and dashboard definitions
  • Security - Application Roles and Policy grants, as well as OBIEE front-end privilege grants

Part of this is laying the foundations for what has been termed "Pluggable BI", in which 'applications' can be deployed with customisations layered on top of them. In the current (December 2016) version of OBIEE 12c we have just the Single Service Instance (ssi). Service Instances can be exported and imported to BI Archive files, known as BAR files.

The documentation for OBIEE environment migrations (known as "T2P" - Test to Production) in 12c is here. Hopefully I won't be thought too rude for saying that there is scope for expanding on it, clarifying a few points - and perhaps making more of the somewhat innocuous remark partway down the page:

PROD Service Instance metadata will be replaced with TEST metadata.

Hands up who reads the manual fully before using a product? Hands up who is going to get a shock when they destroy their Production presentation catalog after importing a service instance?...

Let's take walk through the three main code artefacts, and how to manage each one, starting with the RPD.

The RPD

The complication of deployments of the RPD is that the RPD differs between environments because of different connection pool details, and occassionally repository variable values too.

If you are not changing connection pool passwords between environments, or if you are changing anything else in your RPD (e.g. making actual model changes) between environments, then you probably shouldn't be. It's a security risk to not have different passwords, and it's bad software development practice to make code changes other than in your development environment. Perhaps you've valid reasons for doing it... perhaps not. But bear in mind that many test processes and validations are based on the premise that code will not change after being moved out of dev.

With OBIEE 12c, there are two options for managing deployment of the RPD:

  1. BAR file deploy and then connection pool update
  2. Offline RPD patch with connection pool updates, and then deploy
    • This approach is valid for OBIEE 11g too

RPD Deployment in OBIEE 12c - Option 1

This is based on the service instance / BAR concept. It is therefore only valid for OBIEE 12c.

  1. One-off setup : Using listconnectionpool to create a JSON connection pool configuration file per target environment. Store each of these files in source control.
  2. Once code is ready for promotion from Development, run exportServiceInstance to create a BAR file. Commit this BAR file to source control

    /app/oracle/biee/oracle_common/common/bin/wlst.sh <<EOF
    exportServiceInstance('/app/oracle/biee/user_projects/domains/bi/','ssi','/home/oracle','/home/oracle')
    EOF
    

  3. To deploy the updated code to the target environment:

    1. Checkout the BAR from source control
    2. Deploy it with importServiceInstance, ensuring that the importRPD flag is set.

      /app/oracle/biee/oracle_common/common/bin/wlst.sh <<EOF
      importServiceInstance('/app/oracle/biee/user_projects/domains/bi','ssi','/home/oracle/ssi.bar',true,false,false)
      EOF
      
    3. Run updateConnectionPool using the configuration file from source control for the target environment to set the connection pool credentials

      /app/oracle/biee/user_projects/domains/bi/bitools/bin/datamodel.sh updateconnectionpool -C ~/prod_cp.json -U weblogic -P Admin123 -SI ssi
      

      Note that your OBIEE system will not be able to connect to source databases to retrieve data until you update the connection pools.

    4. The BI Server should pick up the new RPD after a few minutes. You can force this by restarting the BI Server, or using "Reload Metadata" from OBIEE front end.

Whilst you can also create the BAR file with includeCredentials, you wouldn't use this for migration of code between environments - because you don't have the same connection pool database passwords in each environment. If you do have the same passwords then change it now - this is a big security risk.

The above BAR approach works fine, but be aware that if the deployed RPD is activated on the BI Server before you have updated the connection pools (step 3 above) then the BI Server will not be able to connect to the data sources and your end users will see an error. This approach is also based on storing the BAR file as whole in source control, when for preference we'd store the RPD as a standalone binary if we want to be able to do concurrent development with it.

RPD Deployment in OBIEE 12c - Option 2 (also valid for OBIEE 11g)

This approach takes the RPD on its own, and takes advantage of OBIEE's patching capabilities to prepare RPDs for the target environment prior to deployment.

  1. One-off setup: create a XUDML patch file for each target environment.

    Do this by:

    1. Take your development RPD (e.g. "DEV.rpd"), and clone it (e.g. "PROD.rpd")
    2. Open the cloned RPD (e.g. "PROD.rpd") offline in the Administration Tool. Update it only for the target environment - nothing else. This should be all connection pool passwords, and could also include connection pool DSNs and/or users, depending on how your data sources are configured. Save the RPD.
    3. Using comparerpd, create a XUDML patch file for your target environment:

      /app/oracle/biee/user_projects/domains/bi/bitools/bin/comparerpd.sh \
      -P Admin123 \
      -W Admin123 \
      -G ~/DEV.rpd \
      -C ~/PROD.rpd \
      -D ~/prod_cp.xudml
      
    4. Repeat the above process for each target environment

  2. Once code is ready for promotion from Development:

    1. Extract the RPD

      • In OBIEE 12c use downloadrpd to obtain the RPD file

        /app/oracle/biee/user_projects/domains/bi/bitools/bin/datamodel.sh \
        downloadrpd \
        -O /home/oracle/obiee.rpd \
        -W Admin123 \
        -U weblogic \
        -P Admin123 \
        -SI ssi
        
      • In OBIEE 11g copy the file from the server filesystem

    2. Commit the RPD to source control

  3. To deploy the updated code to the target environment:

    1. Checkout the RPD from source control
    2. Prepare it for the target environment by applying the patch created above

      1. Check out the XUDML patch file for the appropriate environment from source control
      2. Apply the patch file using biserverxmlexec:

        /app/oracle/biee/user_projects/domains/bi/bitools/bin/biserverxmlexec.sh \
        -P Admin123 \
        -S Admin123 \
        -I prod_cp.xudml \
        -B obiee.rpd \
        -O /tmp/prod.rpd
        
    3. Deploy the patched RPD file

      • In OBIEE 12c use uploadrpd

        /app/oracle/biee/user_projects/domains/bi/bitools/bin/datamodel.sh \
        uploadrpd \
        -I /tmp/prod.rpd \
        -W Admin123 \
        -U weblogic \
        -P Admin123 \
        -SI ssi \
        -D
        

        The RPD is available straightaway. No BI Server restart is needed.

      • In OBIEE 11g use WLST's uploadRepository to programatically do this, or manually from EM.

        After deploying the RPD in OBIEE 11g, you need to restart the BI Server.

This approach is the best (only) option for OBIEE 11g. For OBIEE 12c I also prefer it as it is 'lighter' than a full BAR, more solid in terms of connection pools (since they're set prior to deployment, not after), and it enables greater flexibility in terms of RPD changes during migration since any RPD change can be encompassed in the patch file.

Note that the OBIEE 12c product manual states that uploadrpd/downloadrpd are for:

"...repository diagnostic and development purposes such as testing, only ... all other repository development and maintenance situations, you should use BAR to utilize BAR's repository upgrade and patching capabilities and benefits.".

Maybe in the future the BAR capabilites will extend beyond what they currently do - but as of now, I've yet to see a definitive reason to use them and not uploadrpd/downloadrpd.

The Presentation Catalog ("WebCat")

The Presentation Catalog stores the definition of all analyses and dashboards in OBIEE, along with supporting objects including Filters, Conditions, and Agents. It differs significantly from the RPD when it comes to environment migrations. The RPD can be seen in more traditional software development lifecycle terms, sine it is built and developed in Development, and when deployed in subsequent environment overwrites in entirety what is currently there. However, the Presentation Catalog is not so simple.

Commonly, content in the Presentation Catalog is created by developers as part of 'pre-canned' reporting and dashboard packs, to be released along with the RPD to end-users. Where things get difficult is that the Presentation Catalog is also written to in Production. This can include:

  • User-developed content saved in one (or both) of:
    • My Folders
    • Shared, e.g. special subfolders per department for sharing common reports outside of "gold standard" ones
  • User's profile data, including timezone and language settings, saved dashboard customisations, preferred delivery devices, and more
  • System configuration data, such as default formatting for specific columns, bookmarks, etc

In your environment you maybe don't permit some of these (for example, disabling access to My Folders is not uncommon). But almost certainly, you'll want your users to be able to persist their environment settings between sessions.

The impact of this is that the Presentation Catalog becomes complex to manage. We can't just overwrite the whole catalog when we come to deployment in Production, because if we do so all of the above listed content will get deleted. And that won't make us popular with users, at all.

So how do we bring any kind of mature software development practice to the Presentation Catalog, assuming that we have report development being done in non-Production environments?

We have two possible approaches:

  1. Deploy the full catalog into Production each time, but backup first existing content that we don't want to lose, and restore it after the deploy
    • Fiddly, but means that we don't have to worry about which bits of the catalog go in source control - all of it does. This has consequences for if we want to do branch-based development with source control, in that we can't. This is because the catalog will exist as a single binary (whether BAR or 7ZIP), so there'll be no merging with the source control tool possible.
    • Risky, if we forget to backup the user content first, or something goes wrong in the process
    • A 'heavier' operation involving the whole catalog and therefore almost certainly requiring the catalog to be in maintenance-mode (read only).
  2. Deploy the whole catalog once, and then do all subsequent deploys as deltas (i.e. only what has changed in the source environment)
    • Less risky, since not overwriting whole target environment catalog
    • More flexible, and more granular so easier to track in source control (and optionally do branch-based development).
    • Requires more complex automated deployment process.

Both methods can be used with OBIEE 11g and 12c.

Presentation Catalog Migration in OBIEE - Option 1

In this option, the entire Catalog is deployed, but content that we want to retain backed up first, and then re-instated after the full catalog deploy.

First we take the entire catalog from the source environment and store it in source control. With OBIEE 12c this is done using the exportServiceInstance WLST command (see the example with the RPD above) to create a BAR file. With OBIEE 11g, you would create an archive of the catalog at its root using 7-zip/tar/gzip (but not winzip).

When ready to deploy to the target environment, we first backup the folders that we want to preserve. Which folders might we want to preserve?

  1. /users - this holds both objects that users have created and saved in My Folders, as well as user profile information (including timezone preferences, delivery profiles, dashboard customisations, and more)
  2. /system - this hold system internal settings, which include things such as authorisations for the OBIEE front end (/system/privs), as well as column formatting defaults (/system/metadata), global variables (/system/globalvariables), and bookmarks (/system/bookmarks).
    • See note below regarding the /system/privs folder
  3. /shared/<…>/<…> - if users are permitted to create content directly in the Shared area of the catalog you will want to preserve this. A valid use of this is for teams to share content developed internally, instead of (or prior to) it being released to the wider user community through a more formal process (the latter being often called 'gold standard' reports).

Regardless of whether we are using OBIEE 11g or 12c we create a backup of the folders identified by using the Archive functionality of OBIEE. This is NOT just creating a .zip file of the file system folders - which is completely unsupported and a bad idea for catalog management, except at the very root level. Instead, the Archive functionality creates a .catalog file which can be stored in source control, and unarchived back into OBIEE to restore content.

You can create OBIEE catalog archives in one of four ways, which are also valid for importing the content back into OBIEE too:

  1. Manually, through OBIEE front-end
  2. Manually, through Catalog Manager GUI
  3. Automatically, through Catalog Manager CLI (runcat.sh)

    • Archive:

      runcat.sh \
      -cmd archive  \
      -online http://demo.us.oracle.com:7780/analytics/saw.dll \
      -credentials /tmp/creds.txt \
      -folder "/shared/HR" \
      -outputFile /home/oracle/hr.catalog
      
    • Unarchive:

      runcat.sh \
      -cmd unarchive \
      -inputFile hr.catalog \
      -folder /shared \
      -online http://demo.us.oracle.com:7780/analytics/saw.dll  \
      -credentials /tmp/creds.txt \
      -overwrite all
      
  4. Automatically, using the WebCatalogService API (copyItem2 / pasteItem2).

Having taken a copy of the necessary folders, we then deploy the entire catalog (with the changes from the development in) taken from source control. Deployment is done in OBIEE 12c using importServiceInstance. In OBIEE 11g it's done by taking the server offline, and replacing the catalog with the filesystem archive to 7zip of the entire catalog.

Finally, we then restore the folders previously saved, using the Unarchive function to import the .catalog files:

Presentation Catalog Migration in OBIEE - Option 2

In this option we take a more granular approach to catalog migration. The entire catalog from development is only deployed once, and after that only .catalog files from development are put into source control and then deployed to the target environment.

As before, the entire catalog is initially taken from the development environment, and stored in source control. With OBIEE 12c this is done using the exportServiceInstance WLST command (see the example with the RPD above) to create a BAR file. With OBIEE 11g, you would create an archive of the catalog at its root using 7zip.

Note that this is only done once, as the initial 'baseline'.

The first time an environment is commissioned, the baseline is used to populate the catalog, using the same process as in option 1 above (in 12c, importServiceInstance/ in 11g unzip of full catalog filesystem copy).

After this, any work that is done in the catalog in the development environment is migrated through by using OBIEE's archive function against just the necessary /shared subfolder to a .catalog file, storing this in source control

This is then imported to target environment with unarchive capability. See above in option 1 for details of using archive/unarchive - just remember that this is archiving with OBIEE, not using 7zip!

You will need to determine at what level you take this folder: -

  • If you archive the whole of /shared each time you'll never be able to do branch-based development with the catalog in which you want to merge branches (because the .catalog file is binary).
  • If you instead work at, say, department level (/shared/HR, /shared/sales, etc) then the highest grain for concurrent catalog development would be the department. The lower down the tree you go the greater the scope for independent concurrent development, but the greater the complexity to manage. This is because you want to be automating the unarchival of these .catalog files to the target environment, so having to deal with multiple levels of folder hierarchy gets hard work.

It's a trade off between the number of developers, breadth of development scope, and how simple you want to make the release process.

The benefit of this approach is that content created in Production remains completely untouched. Users can continue to create their content, save their profile settings, and so on.

Presentation Catalog Migration - OBIEE Privilege Grants

Permissions set in the OBIEE front end are stored in the Presentation Catalog's /system/privs folder.

Therefore, how this folder is treated during migration dictates where you must apply your security grants (or conversely, where you set your security grants dictates how you should treat the folder in migrations). For me the "correct" approach would be to define the full set of privileges in the development environment and the migrate these through along with pre-built objects in /shared through to Production. If you have a less formal approach to environments, or for whatever reason permissions are granted directly in Production, you will need to ensure that the /system/privs folder isn't overwritten during catalog deployments.

When you create a BAR file in OBIEE 12c, it does include /system/privs (and /system/metadata). Therefore, if you are happy for these to be overwritten from the source environment, you would not need to backup/restore these folders. If you set includeCatalogRuntimeInfo in the OBIEE 12c export to BAR, it will also include the complete /system folder as well as /users.

Agents

Regardless of how you move Catalog content between environments, if you have Agents you need to look after them too. When you move Agents between environment, they are not automatically registered with the BI Scheduler in the target environment. You either have to do this manually, or with the web service API : WebCatalogService.readObjects to get the XML for the agent, and then submit it to iBotService.writeIBot which will register it with the BI Scheduler.

Security

  • In terms of the Policy store (Application Roles and Policy grants), these are managed by the Security element of the BAR and migration through the environments is simple. You can deploy the policy store alone in OBIEE 12c using the importJazn flag of importServiceInstance. In OBIEE 11g it's not so simple - you have to use the migrateSecurityStore WLST command.
  • Data/Object security defined in the RPD gets migrated automatically through the RPD, by definition
  • See above for a discussion of OBIEE front-end privilege grants.

What Goes into Source Control? Part 2

So, suddenly this question looks a bit less simple than when orginally posed at the beginning of this article. In essence, you need to store:

  1. RPD
    1. BAR + JSON configuration for each environment's connection pools -- 12c only, simpler, but less flexible and won't support concurrent development easily
    2. RPD (.rpd) + XUDML patch file for each environment's connection pools -- works in 11g too, supports concurrent development
  2. Presentation Catalog
    1. Entire catalog (BAR in 12c / 7zip in 11g) -- simpler, but impossible to manage branch-based concurrent development
    2. Catalog baseline (BAR in 12c / 7zip in 11g) plus delta .catalog files -- More complex, but more flexible, and support concurrent development
  3. Security
    1. BAR file (OBIEE 12c)
    2. system-jazn-data.xml (OBIEE 11g)
  4. Any other files that are changed for your deployment.

    It's important that when you provision a new environment you can set it up the same as the others. It is also invaluable to have previous versions of these files so as to be able to rollback changes if needed, and to track what settings have changed over time.

    This could include:

    • Configuration files (nqsconfig.ini, instanceconfig.xml, tnsnames.ora, etc)
    • Custom skins & styles
    • writeback templates
    • etc

Summary

I never said it was simple ;-)

OBIEE is an extremely powerful product, and just as you have to take care to build your data models correctly, you also need to take care to understand why and how to manage your code correctly. What I've tried to do here is pull together the different options available, and lay them out with their respectively pros and cons. Let me know in the comments below what you think and how you manage OBIEE code at your site.

One of the key messages that it's important to get across is this: there are varying degrees of complexity with which you can embrace source control. All are valid, and in fact an incremental adoption of them rather than big-bang can sometimes be a better idea:

  • At one end of the scale, you simply use source control to hold copies of all your code, and continue to deploy manually
  • Getting a bit smarter, automating code deployments from source control. Code development is still done serially though.
  • At the other end of the scale, you use source control with branch-based feature-driven concurrent development. Completed features are merged automatically with RPD conflicts managed by the OBIEE tooling from the command line. Testing and deployment are both automated.

If you'd like assistance with your OBIEE development and deployment practices, including fully automated source-control driven concurrent development management, please get in touch with us here at Rittman Mead. We would be delighted to use our extensive experience in this field to produce a flexible and customised process for your particular environment and requirements.


You can find the companion slide deck to this article, with further discussion on concurrent development, here.

The Visual Plugin Pack for OBIEE

Last week we announced the Rittman Mead Open Source project, and released into open source:

  • the excellent Insights project, a javascript API/framework for building a new frontend for OBIEE, written by Minesh Patel
  • Enhanced usage tracking for OBIEE, to track click-by-click how your users interact with the application

Today it is the turn of the Visual Plugin Pack.....

What is the Visual Plugin Pack for OBIEE ?

Visual Plugin Pack (VPP) is a means by which users of OBIEE Answers can use custom JavaScript visualisations without having to write any javascript!

It is a framework that enables developers to build Javascript visualisation plugins, that report builders can then utilise and configure through native OBIEE user interface.

I want to point this out from the very start, that despite its name, the Visual Plugin Pack is not a pack of all-singing, all-dancing, super-duper visualisations for OBIEE.

Instead, VPP should be thought of as a framework that allows you to quickly develop and integrate all-singing, all-dancing, super-duper visualisations that will work natively within OBIEE.

Subtle difference, but an important one.

So what does it do ?

Essentially, VPP allows you to accelerate the development and deployment of custom, configurable and reusable OBIEE JavaScript visualisations.

Back in 2013 I wrote this post describing how to embed a D3 Visualisation within OBIEE. The same method will still work today, but it's a cumbersome process and requires heavy use of the narrative form, which let's be honest, is a painful experience when it comes to JavaScript development.

Some drawbacks with this method:

  • Code editing in the Narrative view is not productive.
  • Reusing visualisations in other analyses requires the copying and pasting of code.
  • Basic Visualisation configuration changes, for example, width, height, colours, font etc requires code changes.
  • Remapping Column bindings requires code changes.
  • JavaScript library dependencies and load order can be tricky to manage.

The Visual Plugin Pack attempts to address these issues by abstracting away the complexities of the Narrative form and allowing developers to focus on visualisation code, not OBIEE integration code.
If you choose to use VPP for your visualisations then you will never have to touch the narrative form, all visualisation development can take place outside of OBIEE in your favourite code editor and deployed to Weblogic when you are done.

VPP also allows you to define design-time controls that affect column bindings and visualisation behaviour. The example visualisation below has been written to accept 5 column bindings and 1 configuration component, which controls the visualisation size. You can create as many column bindings and configuration components as you need

scatterplot matrix

How do I get started ?

You can download or fork the repository from here.

Installation and developer guides can be found on the wiki:-

There are several visualisations that come bundled with VPP, some more polished than others, but they should serve as good examples that can be enhanced further.

Summary

If you've got some in-house JavaScript skills and are looking to develop and integrate custom visualisations into OBIEE, then VPP can help alleviate a lot of the frustrations associated with the traditional method. Once you're up and running you'll be able to develop faster, integrate quickly and share your visualisations with all OBIEE report writers.

If you'd like to discuss how Rittman Mead can help with deployment or assist with custom visualisation development feel free to contact us.

Enhanced Usage Tracking for OBIEE – Now Available as Open Source!

Introduction

OBIEE provides Usage Tracking as part of the core product functionality. It writes directly to a database table every Logical Query that hits the BI Server, including details of who ran it, when, and information about how it executed including for how long, how many rows, and so on. This in itself is a veritable goldmine of information about your OBIEE system. All OBIEE deployments should have Usage Tracking enabled, for supporting performance analysis, capacity planning, catalog rationalisation, and more.

What Usage Tracking doesn't track is interactions between the end user and the Presentation Services component. Presentation Services sits between the end user and the BI Server from where the actual queries get executed. This means that until a user executes an analysis, there's no record of their actions in Usage Tracking. There is this audit data available, but you have to manually enable and collect it, which can be tricky. This is where Enhanced Usage Tracking comes in. It enables the collection and parsing of every click a user makes in OBIEE. For an overview of the potential of this data, see the article here and here.

Today we're pleased to announce the release into open-source of Enhanced Usage Tracking! You can find the github repository here: https://github.com/RittmanMead/obi-enhanced-usage-tracking.

Highlights of the data that Enhanced Usage Tracking provides includes:

  • Which web browsers do people use? Who is accessing OBIEE with a mobile device?

  • Who deleted a catalog object? Who moved it?

  • What dashboards get exported to Excel most frequently, and by whom?

The above visualisations are from both Kibana, and OBIEE. The data from Enhanced Usage Tracking can be loaded into Elasticsearch, and is also available from Oracle tables too, hence you can put OBIEE itself on top of it, or DV:

eut108.png

How to use Enhanced Usage Tracking

See the github repository for full detail on how to install and run the code.

TODO

What's left TODO? Here are a few ideas if you'd like to help build on this tool. I've linked each title to the relevant github issue.

TODO 01

The sawlog is a rich source of lots of data, but the Logstash script has to know how to parse it. It's all down to the grok statement which identifies fields to extract and defined their deliniators. Use grokdebug.herokuapp.com to help master your syntax. From there, the data can be emitted to CSV and loaded into Oracle.

Here's an example of something yet to build - when items are moved and deleted in the Catalog, it is all logged. What, who, and when. The Logstash grok currently scrapes this, but the data isn't included in the CSV output, nor loaded into Oracle.

eut105.png

Don't forget to submit a pull request for any changes to the code that would benefit others in the community!

You'll also find loading the data directly into Elasticsearch easier than redefining the Oracle table DDL and load script each time, since in Elasticsearch the 'schema' can evolve based simply on the data that Logstash sends to it.

TODO 02

Version 5 of the Elastic stack was released in late 2016, and it would be good to test this code with it and update the README section above to indicate if it works - or submit the required changes needed for it to do so.

TODO 03

There's lots of possibilities for this data. Auditing who did what, when, is useful (e.g. who deleted a report?). Taking it a step further, are there patterns in user behaviour? Certain patterns of clicks that could be identified to highlight users who are struggling to find the data that they want? For example, opening lots of presentation folders in the Answers editor before adding columns to the analysis? Can we extend that to identify users who are struggling to use the tool and are going to "churn" (stop using it) and thus contact them before they do so to help resolve any issues they have?

TODO 04

At the moment the scripts are manual to invoke and run. It would be neat to package this up into a service (or set of services) that could run automagically at server boot.

Until then, using GNU screen is a handy hack for setting scripts running and being able to disconnect from the server without terminating them. It's like using nohup ... &, except you can reconnect to the session itself as and when you want to.

TODO 05

Click events have defined 'Request' types, and these I have roughly grouped together into 'Request Groups' to help describe what the user was doing (e.g. Logon / Edit Report / Run Report). Not all requests have been assigned to request groups. It would be useful to identify all request types, and refine further the groupings.

TODO 06

At the moment only clicks in Presentation Services are captured and analysed. I bet the same can be done for Data Visualization/Visual Analyzer too ...

Problems?

Please raise any issues on the github issue tracker. This is open source, so bear in mind that it's no-one's "job" to maintain the code - it's open to the community to use, benefit from, and maintain.

If you'd like specific help with an implementation, Rittman Mead would be delighted to assist - please do get in touch to discuss our rates.

Insights – An Open-Source Visualisation Platform for OBIEE

On and off over the last year, I have spent some time developing a customisable framework for building visualisations and dashboards, using OBIEE as the back-end. The result is Insights, a JavaScript web application that offers a modern alternative to OBIEE Answers. As of today, we have officially open sourced the project, so you are free to download, install, hack and contribute as you please.

The primary motive for building this application was to meet some very bespoke reporting requirements for a client, which I mention in my previous blog describing the prototype. During this piece of work I wrote an object orientated interface for the OBIEE web services. The icing on the cake was tying it into Tom Underhill's Visual Plugin Pack.

You can see more information about Insights in a presentation that I did at the recent UKOUG conference here: Bridging the Gap: Enhancing OBIEE with a Custom Visualisation Platform

Since then a lot of the work has been put in to make it developer friendly, visually appealing and hopefully easier to use. I'll be the first to admit that it's far from perfect, but it should be a decent starting point.

Getting Started

In order to use Insights you will need OBIEE 11.1.1.9 or above. Additionally, the application has only been tested using IE11 or Chrome browsers and so compatibility with other browsers cannot be guaranteed.

First, download the application or fork the Git repository.

There is an installation guide in the project at docs/installation.html. Follow this guide to deploy the application on your OBIEE server.

Demo

This is a quick step-by-step demonstration creating a basic dashboard, showing off some of the features in the application (apologies if the GIFs take a while to load).

First you log in, using your usual OBIEE credentials. The homepage shows some pre-configured dashboards here, but we're going to click the pencil to create a new one.

Logging in

Next I've dragged in some columns from my subject area, Sample App and run the query, displaying the default table plugin.

Add Columns

In this step, I've gone to the configuration tab, and changed the colour of my table.

Configuration

Now I change the plugin type using the drop down menu at the top. Notice that my previous table visualisation gets stored on the right. By clicking the Store button manually, it also adds my new pie chart. Then we can flick between them easily.

Store Pie Chart

Filters can be added by clicking the icon next to the column on the subject area panel.

Filters

Adding in a sunburst chart, and playing with some of the colours here.

Sunburst

Now we have our visualisations, we can begin constructing our dashboard. You can freely move around and resize the visualisations as you choose. I recommend hiding the panels for this, as the full screen is much closer to what users will see when viewing the dashboard.

Dashboard

The next GIF shows the interaction framework, which can be used to implement UI features where the user interacts with one visualisation and another visualisation on the page reacts to it. In its most basic form, each plugin type can be filtered - where OBIEE runs the query again. Although more complex reactions that are specific to a certain chart type can also be configured, as seen below with the sunburst chart.

Interactivity

Dashboard prompts can be added by clicking the filter icon next to one of the RPD columns. Any visualisations using this subject area will respond to the prompt. The prompt box can be freely placed on the canvas like any other object.

Dashboard Prompt

Finally, we can save the object to the web catalogue. This saves as a hidden analysis object in the OBIEE web catalogue and contains all of the information to recreate the dashboard when loading. All OBIEE security features are preserved, so users will only be able to access folders and reports they have permissions for.

Save to Web Catalogue

Finished dashboards can be viewed in the application once they have been saved. The dashboard viewer will show all dashboard objects in that folder as different pages, available from the left pane. Images can be exported to PNG and PDF as well as data from the visualisations exporting to Excel and CSV.

Viewing Dashboards

So How Do I Learn More?

The slides that I did at UKOUG describing Insights give a comprehensive overview of the design behind the tool. You can find them here.

Summary

In a nutshell, those are the main features of the application. Feel free to try it out and have a read through the documentation (available through the application itself or offline as HTML files in the docs directory).

As an open source application there is no official support, however if you experience any bugs or have any requests for enhancements, please post them on the issue tracker.

We hope you enjoy using the app and if you would like to enlist our expertise to help you deploy and develop using this platform, feel free to contact us to discuss it further.