Category Archives: Rittman Mead

Using Python to ‘Wrangle’ Your Messy Data

 or How to Organize Your Comic Book Collection Based on Issue Popularity

In addition to being a product manager at Rittman Mead, I consider myself to be a nerd of the highest order. My love of comic books, fantasy, sci-fi and general geekery began long before the word ‘Stark’ had anything to do with Robert Downey Jr or memes about the impending winter. As such, any chance to incorporate my hobbies into my work is a welcome opportunity. For my next few blog entries, I’ve decided to construct a predictive classification model using comic book sales data, with the eventual goal of accurately predicting whether a comic will rocket off the shelves or be a sales dud. This first blog of the series shows some of the pitfalls that can come up when preparing your data for analysis. Data preparation, or data wrangling as it has come to be known, is an imperfect process that usually takes multiple iterations of transformation, evaluation and refactoring before the data is “clean” enough for analysis.

While the steps involved in data wrangling vary based on the state and availability of the raw data, for this blog I have chosen to focus on gathering data from disparate sources, enriching that data by merging their attributes and restructuring it to facilitate analysis. Comic book sales data is readily available on the interwebs; however, finding that data in a usable format proved to be a more difficult task. In the end, I had to resort to the dreaded process of screen scraping the data from a comic research site. For those of you who are lucky enough to be unfamiliar with it, screen scraping is the process of programmatically downloading HTML data and stripping away the formatting to make it suitable for use. This is generally used as a last resort because web sites are volatile creatures that are prone to change their appearance as often as my teenage kids do when preparing to leave the house. However, for the purposes of this series, as my friend Melvin the handyman would often say, “We works with what we gots.”

This leads us to the first issue you may run into while wrangling your data: you have access to lots of data, but it’s not pretty. So make it pretty. Working with raw data is like being a sculptor working with wood. Your job is not to change the core composition of the data to suit your purposes but to chip away at the excess to reveal what was there all along, a beautiful horse… er, I mean insight. Sorry, I got lost in my analogy. Actually, to expand on this analogy a bit, the first tool I pulled out of my toolbox for this project was Python, the Leatherman of programming languages. Python is fast, plays well with other technologies and, most importantly in this case, Python is ubiquitous. Used for tasks ranging from process automation and ETL to gaming and academic pursuits, Python is truly a multipurpose tool. As such, if you have a functional need, chances are there is a native module or someone has already written a public library to perform that function. In my case, I needed some scripts to “scrape” HTML tables containing comic sales data and combine that data with other comic data that I retrieved elsewhere. The “other” data is metadata about each of the issues. Metadata is just data about data: in this case, information about who authored it, how it was sold, when it was published, etc. More on that later.

Luckily for me, the format of the data I was scraping was tabular, so extracting the data and transforming it into Python objects was a relatively simple matter of iterating through the table rows and binding each table column to the designated Python object field. There was still a lot of unnecessary content on the page that needed to be ignored, like the titles and all of the other structural tags, but once I found the specific table holding the data, I was able to isolate it. At that point, I wrote the objects to a CSV file to make the data easy to transport and to facilitate usability by other languages and/or processes.

The heavy lifting in this process was performed by three different Python modules: urllib2, bs4 and csv. Urllib2, as the name implies, provides functions to open URLs. In this case, I found a site that hosted a page containing the estimated issue sales for every month going back to the early 1990s. To extract each month without manually updating a hardcoded URL over and over, I created a script, month_sales_scraper.py, that accepted MONTH and YEAR as arguments.
[Screenshot: the get_sales code from month_sales_scraper.py]
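In case it helps to see the shape of it, here is a minimal sketch of that fetch step (Python 2 style, to match urllib2; the URL pattern and the argument handling are placeholders rather than the actual site’s details):

# month_sales_scraper.py -- minimal sketch; the URL below is a placeholder,
# not the actual comic research site
import sys
import urllib2

def get_sales_page(month, year):
    # Build the month-specific URL from the script arguments
    url = 'http://www.example.com/comic-sales/%s/%s' % (year, month)
    # urlopen() returns a file-like object; read() gives us the raw HTML
    return urllib2.urlopen(url).read()

if __name__ == '__main__':
    month, year = sys.argv[1], sys.argv[2]
    html = get_sales_page(month, year)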

The response from the urlopen(url) function call was the full HTML code that is typically rendered by a web browser. In that format it did me very little good, so I needed to employ a parser to extract the data from the HTML. In this context, a parser is a program that reads in a specific document format, breaks it down into its constituent parts while preserving the established relationships between those parts, and then provides a means to selectively access said parts. So an HTML parser would allow me to easily access all the <TD> column tags for a specific table within an HTML document. For my purposes, I chose BeautifulSoup, or bs4.

BeautifulSoup provided search functions that I used to find the specific HTML table containing the sales data and loop through each row, while using the column values to populate a Python object.

[Screenshot: the BeautifulSoup parsing loop]
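A stripped-down sketch of that loop might look something like this (the table id, the column positions and the field names are assumptions for illustration; the real page structure will differ):

from bs4 import BeautifulSoup

def parse_sales_table(html, month, year):
    soup = BeautifulSoup(html, 'html.parser')
    # Isolate the specific table holding the sales data (the id is a placeholder)
    table = soup.find('table', {'id': 'salestable'})
    issues = []
    for row in table.find_all('tr')[1:]:          # skip the header row
        cols = [td.get_text().strip() for td in row.find_all('td')]
        # Bind each column to a named field; positions are illustrative
        data = {'year': year,
                'month': month,
                'rank': cols[0],
                'title': cols[1],
                'price': cols[2],
                'est_sales': cols[3]}
        issues.append(data)
    return issues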

This Python object, named data, contains fields populated with data from different sources. The year and month are populated using the arguments passed to the module. The format field is dynamically set based on logic related to the rankings, and the remaining fields are set based on their source’s position in the HTML table. As you can see, there is a lot of hard-coded logic that would need to be updated should the scraped site change its format. However, for now this logic gets it done.

The final step of this task was to write those Python objects to a CSV file. The Python csv module provides a writer object whose writerow() function accepts a sequence as a parameter and writes each of its elements as a column in the CSV.
[Screenshot: writing the objects out with csv.writer]
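In outline, the write step looked something along these lines (the field order mirrors the parsing sketch above and is illustrative):

import csv

def write_month_csv(issues, month, year):
    filename = '%s_%s.csv' % (month, year)
    with open(filename, 'wb') as f:               # 'wb' keeps Python 2's csv module happy
        writer = csv.writer(f)
        for data in issues:
            # writerow() takes a sequence and writes each element as a column
            writer.writerow([data['year'], data['month'], data['rank'],
                             data['title'], data['price'], data['est_sales']])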

My first pass raised an exception because the title field contained unicode characters that the CSV writer could not handle.
[Screenshot: the unicode error raised by the CSV writer]

To rectify this, I had to add a check for unicode and encode the content as UTF-8. Unicode is a character standard and UTF-8 is one of its encodings; together they provide the map computers use to identify characters. This includes alphabets and logographic symbols from different languages as well as common symbols like ®.

[Screenshot: checking for unicode and encoding as UTF-8]
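The fix boiled down to a small helper applied to every value before it hit writerow(); something like this (Python 2, where str and unicode are distinct types):

def encode_for_csv(value):
    # The csv writer can't handle unicode, so encode unicode values as UTF-8 bytes
    if isinstance(value, unicode):
        return value.encode('utf-8')
    return value

def write_row(writer, values):
    writer.writerow([encode_for_csv(v) for v in values])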

Additionally, there was the matter of reformatting the values of some of the numeric fields to allow math to be performed on them later (i.e. stripping ‘$’ and commas). Other than that, the data load went pretty smoothly. A file named (MONTH)_(YEAR).CSV was generated for each month. Each file turned out like so:

[Screenshot: a sample of the generated monthly sales CSV]

While this generated tens of thousands of rows of comic sales data, it was not enough. Rather, it had the volume but not the breadth of information I needed. In order to make an accurate prediction, I needed to feed more variables to the model than just the comic’s title, issue number, and price. The publisher was not relevant, as I decided to limit this exercise to only Marvel comics, and passing in the estimated sales would be cheating, as rank is based on sales. So to enhance my dataset, I pulled metadata about each of the issues down from “the Cloud” using Marvel’s Developer API. Thankfully, since the API is a web service, there was no need for screen scraping.

Retrieving and joining this data was not as easy as one might think. My biggest problem was that the issue titles from the scraped source were not an exact match to the titles stored in the Marvel database. For example, the scraped dataset lists one title as “All New All Different Avengers”. Using their API to search the Marvel database with that title retrieved no results. Eventually, I was able to manually find it in their database listed as “All-New All-Different Avengers”. In other cases, there were extra words, like “The Superior Foes of Spider-Man” vs “Superior Foes of Spider-Man”. So in order to perform a lookup by name, I needed to know the titles as Marvel expected them. To do this I decided to pull a list of all the series titles whose metadata was modified during the timeframes for which I had sales data. Again, I ran into a roadblock: the Marvel API only allows you to retrieve up to 100 results per request, and Marvel has published thousands of titles. To get around this I had to perform incremental pulls, segmented alphabetically.

[Screenshot: the incremental, alphabetically segmented API pulls]
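Sketched out, the alphabetical pulls looked roughly like this; the series endpoint, the titleStartsWith and limit parameters and the ts/apikey/hash authentication scheme are how I understand Marvel’s API to work, but treat the details as illustrative and check their documentation:

import hashlib
import json
import string
import time
import urllib2

PUBLIC_KEY = 'your-public-key'      # placeholders -- use your own Marvel API keys
PRIVATE_KEY = 'your-private-key'
BASE_URL = 'https://gateway.marvel.com/v1/public/series'

def auth_params():
    # Each request is signed with a timestamp plus an md5 of ts + private key + public key
    ts = str(int(time.time()))
    digest = hashlib.md5(ts + PRIVATE_KEY + PUBLIC_KEY).hexdigest()
    return 'ts=%s&apikey=%s&hash=%s' % (ts, PUBLIC_KEY, digest)

def pull_series_for_letter(letter):
    # Pull up to the 100-result cap for series titles starting with this letter
    url = '%s?titleStartsWith=%s&limit=100&%s' % (BASE_URL, letter, auth_params())
    return json.loads(urllib2.urlopen(url).read())['data']['results']

all_series = []
for letter in string.ascii_uppercase:
    all_series.extend(pull_series_for_letter(letter))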

Even then there were a few minor issues, as some letters, like ‘S’, had more than 100 titles. To get around that I had to pull the list of ‘S’ titles sorted ascending and then descending, and combine the results, making sure to remove duplicates. So my advice on this one is to be sure to read up on the limitations of any API you are using. It may enforce limits, but you may be able to work around them with creative querying.


At this point I have my list of Marvel series titles, stored in some CSV files that I eventually combined into a single file, MarvelSeriesList.csv, for ease of use. Actually, I have more than that. While retrieving the series titles, I also pulled down the ID for each series and an appropriateness rating. Searching the API by ID will be much more accurate than searching by name, and the appropriateness rating may be useful when building out the prediction model. The next step was to iterate through each row of the CSVs we created from the sales data, find the matching ID from MarvelSeriesList.csv and use that ID to retrieve its metadata using the API.

If you remember, the point of doing that last step was that the titles stored in the sales data files don’t match the titles in the API, so I needed to find a way to join the two sources. Rather than writing cases to handle each of the scenarios (e.g. mismatched punctuation, extra filler words), I looked for a Python library to perform some fuzzy matching. What I found was an extremely useful library called FuzzyWuzzy. FuzzyWuzzy provides a function called extractOne() that allows you to pass in a term and compare it with an array of values. The extractOne() function will then return the term in the array that has the highest match percentage. Additionally, you can specify a lower bound for acceptable matches (i.e. only return a result where the match is >= 90%).
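Using it is pleasantly terse; a minimal example (the titles and the score shown are illustrative):

from fuzzywuzzy import process

def match_title(sales_title, marvel_titles):
    # Returns the (title, score) pair with the best match, or None if nothing scores >= 90
    return process.extractOne(sales_title, marvel_titles, score_cutoff=90)

# e.g. match_title('All New All Different Avengers', marvel_titles)
#      might return ('All-New All-Different Avengers', 96)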

Again, it took a few passes to get the configuration to work effectively. The first time through, only about 65% of the titles in the sales file found a match. That was throwing away too much data for my liking, so I had to look at the exceptions and figure out why the matches were falling through. One issue I found was that titles which tacked on the publication year in the Marvel database, like “All-New X-Men (2012)”, had a match score in the 80s when matched against a sales title like “All New X-Men”. This was a pretty consistent issue, so rather than lowering the match percentage, which could introduce some legitimate mismatches, I decided to strip the year, if present, on mismatches and run them through the matching process again. This got me almost there. The only other issue I ran into was that FuzzyWuzzy had trouble matching acronyms, so ‘S.H.I.E.L.D.’ had a match score in the 50s when matched against ‘SHIELD’. That’s because half the characters (the periods) were missing. Since there were only two titles affected, I built a lookup dictionary of special cases that needed to be translated. For the purposes of this exercise I would still have had enough matches to skip that step, but doing it brought us up to 100% matching between the two sources. Once the matching function was working, I pulled out urllib2 and retrieved all the metadata I could for each of the issues.
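Stripped down, the retry logic looked something along these lines (the regex for trailing years and the special-case entries are illustrative):

import re
from fuzzywuzzy import process

# Lookup for the handful of acronym titles that fuzzy matching can't handle
SPECIAL_CASES = {'SHIELD': 'S.H.I.E.L.D.'}      # illustrative entry

def match_with_retries(sales_title, marvel_titles):
    lookup = SPECIAL_CASES.get(sales_title, sales_title)
    match = process.extractOne(lookup, marvel_titles, score_cutoff=90)
    if match:
        return match
    # On a miss, retry against the Marvel titles with any trailing "(2012)"-style year removed
    stripped = dict((re.sub(r'\s*\(\d{4}\)$', '', t), t) for t in marvel_titles)
    retry = process.extractOne(lookup, list(stripped.keys()), score_cutoff=90)
    if retry:
        return (stripped[retry[0]], retry[1])
    return None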

The resulting files contained not only sales data (title, issue number, month rank, estimated sales), but information about the creative team, issue descriptions, characters, release dates and  associated story arcs.  This would be enough to get us started with building our predictive classification model.
That being said, there was still a lot of structural rearranging required to make it ready for the type of analysis I wanted to do, but we will deal with that in the next blog. Hopefully, you picked up some useful tips on how to combine data from different sources, or at the very least found solace in knowing that while you may not be the coolest person in the world, somewhere out there is a grown man who still likes to read comic books enough to write a blog about it. Be sure to tune in next week,
True Believers, as we delve into The Mysteries of R!


Becky’s BI Apps Corner: OBIA install Perl Script Patching and troubleshooting when they fail.

During a recent project installing Oracle BI Applications, I became much better acquainted with OPatch, Oracle’s standard tool for managing application patches. By acquainted, I mean I learned how to troubleshoot when OPatch patching fails. Since, at last count, there are around 50 patches total for Oracle BI Applications 11.1.1.9.2, the first patching attempt may not apply all patches successfully. There are any number of reasons for a failure: an extra slash at the end of a path, a misspelled word, WebLogic or NodeManager still running, and so on. We will take a look at the logs for each section, learn where additional logs can be found, and learn how to turn on OPatch debugging to better understand the issue. Then, following the ideas from a previous OPatch post by Robin, I’ll describe how to manually apply the patches with OPatch at the command line for any patches that weren’t applied successfully using the provided perl script.

*Disclaimers – Please read the readme files for patches and follow all Oracle recommendations. Patch numbers are subject to change depending on OS and OBIA versions. Commands and paths here are of the linux/unix variety, but there are similar commands available for Windows OS.

Perl Script patching

Unzip the patch files to a patch folder. I have included the OBIEE patch as well.

unzip pb4biapps_11.1.1.9.2_.zip -d patches/
unzip pb4biapps_11.1.1.9.2_generic_1of2.zip -d patches/
unzip pb4biapps_11.1.1.9.2_generic_2of2.zip -d patches/
unzip p20124371_111170_.zip -d patches/

While installing Oracle BI Applications versions 11.1.1.7 and up, patches get applied with a perl script called APPLY_PATCHES.pl. Following Oracle’s install documentation for the 11.1.1.9 version of Oracle BI Applications here, there is a text file to modify and pass to the perl script. Both the perl script and the text file reside in the following directory: $ORACLE_HOME/biapps/tools/bin. In the text file, called apply_patches_import.txt, parameters are set with the paths to the following directories:

JAVA_HOME
INVENTORY_LOC
ORACLE_HOME
MW_HOME
COMMON_ORACLE_HOME
WL_HOME
ODI_HOME
WORKDIR
PATCH_ROOT_DIR
WINDOWS_UNZIP_TOOL_EXE (only needed if running on Windows platforms)

Some pro tips for modifying this text file:
1. Oracle recommends you use the JDK in the ORACLE_BI1 directory.
2. Use ORACLE_BI1 as the ORACLE_HOME.
3. Ensure WORKDIR and PATCH_ROOT_DIR are writeable directories.
4. Don’t add a path separator at the end of the path.
5. Commented lines are safe to remove.

Then you run APPLY_PATCHES.pl, passing in apply_patches_import.txt. If everything goes well, at the end of the perl script the results will look similar to the following:

If this is the case, CONGRATULATIONS!!, you can move on to the next step in the install documentation. Thanks for stopping by and come back soon! However, if any patch or group of patches failed, the rest of this post is for you.

Log file location

First, the patching report above does not tell you where to find the logs, regardless of success or failure. If you remember, though, you set a WORKDIR path in the text file earlier. That directory is where you will find the following log files:

  1. final_patching_report.log
  2. biappshiphome_generic_patches.log
  3. odi_generic_patches.log
  4. oracle_common_generic_patches.log
  5. weblogic_patching.log

Open the final_patching_report.log first to determine whether all patches were applied and to identify any that were not successful. For example, looking at this log may show that the Oracle Common patches failed.

cd $WORKDIR
vi final_patching_report.log

However, this doesn’t tell you what caused the failure. Next we will want to look into the oracle_common_generic_patches.log to gather more information.

From the $WORKDIR:

vi oracle_common_generic_patches.log

Here you will see the error: a component is missing. “Patch ######## requires component(s) that are not installed in OracleHome. These not-installed components are oracle.jrf.thirdparty.jee:11.1.1.7.0.” Notice also that in this log there is a path to another log file location. The path is in the $COMMON_ORACLE_HOME/cfgtoollogs/opatch/ directory. This directory has more detailed logs specific to the patches applied to oracle_common. Additionally, there are logs under $ORACLE_HOME/cfgtoollogs/opatch/, $WL_HOME/cfgtoollogs/opatch/, and $ODI_HOME/cfgtoollogs/opatch/. These locations are very helpful to know, so you can find the logs for each group of patches in the same relative path.

Going back to the above error, we are going to open the most recent log file listed in the $COMMON_ORACLE_HOME/cfgtoollogs/opatch/ directory.

cd $COMMON_ORACLE_HOME/cfgtoollogs/opatch/
vi opatch2015-08-08_09-20-58AM_1.log

The beginning of this log file has two very interesting pieces of information to take note of for later use: the actual OPatch command used, and a path to a patch history file. Looks like we will have to page down in the file to find the error message.

Now we see our missing component error. Once the error occurs, the java program starts backing out and then begins cleanup by deleting the files extracted earlier in the process. This log does have more detail, but still doesn’t say much about the missing component. After some digging around on the internet, I found a way to get more detailed information out of OPatch by setting export OPATCH_DEBUG=TRUE. After turning OPatch debugging on, run the OPatch command we found earlier at the top of the log. A new log file will be generated, and we want to open this most recent log file.

Finally, the results now get me detailed information about the component and the failure.

Side Note: If you are getting this specific error, I’ll refer you back to a previous blog post that happened to mention making sure to grab the correct version of OBIEE and ODI. If you have a wrong version of OBIEE or ODI for the Oracle BI Apps version you are installing, unfortunately you won’t start seeing errors until you get to this point.

Manually running Oracle BI Application patches

Normally, the error or reason behind a patch or group of patches failing doesn’t take that level of investigation, and the issue will be identified in the first one or two logs. Once the issue is corrected, there are a couple of options available. Rerunning the perl script is one option, but it will cycle through all of the patches again, even the ones already applied. There is no harm in this, but it does take longer than running the individual patches. The other option is to run the OPatch command at the command line. To do that, first I would recommend setting the variables from the text file. I also added the Oracle_BI1/OPatch directory to the PATH variable.

export JAVA_HOME=$ORACLE_HOME/jdk
export INVENTORY_LOC=
export COMMON_ORACLE_HOME=$MW_HOME/oracle_common
export WL_HOME=$MW_HOME/wlserver_10.3
export SOA_HOME=$MW_HOME/Oracle_SOA1
export ODI_HOME=$MW_HOME/Oracle_ODI1
export WORKDIR=
export PATCH_FOLDER=/patches
export PATH=$ORACLE_HOME/OPatch:$JAVA_HOME/bin:$PATH

Next, unzip the patches in the required directory. For example, the $PATCH_FOLDER/oracle_common/generic might look like this after unzipping files:

Below are the commands for each group of patches:

Oracle Common Patches:

cd $PATCH_FOLDER/oracle_common/generic
unzip "*.zip"

$COMMON_ORACLE_HOME/OPatch/opatch napply $PATCH_FOLDER/oracle_common/generic -silent -oh $COMMON_ORACLE_HOME -id 16080773,16830801,17353546,17440204,18202495,18247368,18352699,18601422,18753914,18818086,18847054,18848533,18877308,18914089,19915810

BIApps Patches:

cd $PATCH_FOLDER/biappsshiphome/generic
unzip "*.zip"

opatch napply $PATCH_FOLDER/biappsshiphome/generic -silent -id 16913445,16997936,19452953,19526754,19526760,19822893,19823874,20022695,20257578

ODI Patches:

cd $PATCH_FOLDER/odi/generic
unzip "*.zip"

$ODI_HOME/OPatch/opatch napply $PATCH_FOLDER/odi/generic -silent -oh $ODI_HOME -id 18091795,18204886

Operating Specific Patches:

cd $PATCH_FOLDER/
unzip "*.zip"

opatch napply $PATCH_FOLDER/ -silent -id ,,

Weblogic Patches:

cd $PATCH_FOLDER/suwrapper/generic
unzip "*.zip"

cd $PATCH_FOLDER/weblogic/generic

$JAVA_HOME/bin/java -jar $PATCH_FOLDER/suwrapper/generic/bsu-wrapper.jar -prod_dir=$WL_HOME -install -patchlist=JEJW,LJVB,EAS7,TN4A,KPFJ,RJNF,2GH7,W3Q6,FKGW,6AEJ,IHFB -bsu_home=$MW_HOME/utils/bsu -meta=$PATCH_FOLDER/suwrapper/generic/suw_metadata.txt -verbose > $PATCH_FOLDER/weblogic_patching.log

Even though this example covers a very specific error, understanding the logs and having the breakdown of all of the patches will help with any number of patch errors at this step of the Oracle BI Applications installation. I would love to hear your thoughts if you found this helpful or if any part was confusing. Keep an eye out for the next Becky’s BI Apps Corner, where I move on from installs and start digging into incremental logic and Knowledge Modules.


Corporate Social Responsibility (Where Can We Serve?)

At Rittman Mead, we believe that people are more important than profit.
This manifests itself in two ways. First, we want to impact the world beyond data and analytics; second, we want our employees to be able to contribute to organizations they believe are doing impactful work.

This year, we’ve put a Community Service requirement in place for all of our full-time employees.

We’ll each spend 40 hours this year serving with various nonprofits. Most of our team are already involved with some amazing organizations, and this “requirement” allows us to not only be involved after hours and on the weekends, but even during normal business hours.

We want to highlight a few team members and show how they’ve been using their Community Service hours for good.

Beth deSousa
Beth is our Finance Manager and she has been serving with Sawnee Women’s Club. Most of her work has been around getting sponsorship and donations for their annual silent auction. She’s also helped with upgrading a garden at the local high school, collecting toys and gift wrap for their Holiday House, and collecting prom dresses and accessories for girls in need.

Charles Elliott
Charles is the Managing Director of North America. He recently ran in the Dopey Challenge down at Disney World which means he ran a 5k, 10k, half marathon, and full marathon in 4 days. He did the run to raise funds for Autism Speaks. Charles was recognized as the third largest fundraiser for Autism Speaks at the Dopey Challenge!

David Huey
David is our U.S. Business Development rep. He recently served with the nonprofit Hungry For A Day for their Thanksgiving Outreach. He flew up to Detroit the week of Thanksgiving and helped serve over 8,000 Thanksgiving dinners to the homeless and needy in inner city Detroit.

Andy Rocha

Andy is our Consulting Manager. Andy is a regular volunteer and instructor with Vine City Code Crew. VC3 works with inner city youth in Atlanta to teach them about electronics and coding.

Pete Tamisin

Pete is a Principal Consultant. He is also involved as a volunteer and instructor with the aforementioned Code Crew. Pete has taught a course using Makey Makey electronic kits for VC3.

This is just a sample of what our team has done, but engaging in our local communities is something that Rittman Mead is striving to make an integral piece of our corporate DNA.
We can’t wait to show you how we’ve left our communities better in 2016!


The best OBIEE 12c feature that you’re probably not using.

With the release of OBIEE 12c we got a number of interesting new features on the front-end.  We’re all talking about the cleaner look-and-feel, Visual Analyzer, and the ability to create data-mashups, etc.

While all this is incredibly useful, it’s one of the small changes you don’t hear about that’s truly got me excited.  I can’t tell you how thrilled I am that we can finally save a column back to the web catalog as an independent object (to be absolutely fair, this actually first shipped with 11.1.1.9).

For the most part, calculations should be pushed back to the RPD. This reduces the complexity of the reports on the front-end, simplifies maintenance of these calculations, and ultimately assures that the same logic is used across the board in all dashboards and reports… all the logic should be in the RPD. I agree with that 110%… at least in theory. In reality, this isn’t always practical. When it comes down to it, there’s always some insane deadline, or there’s that pushy team (ahem, accounting…) riding you to get their dashboard updated and migrated in time for year end, or whatever. Sometimes it’s quite simply easier to code the calculation in the analysis. So, rather than take the time to modify the RPD, you fat finger the calculation in the column formula. We’ve all done it. But if you spend enough time developing OBIEE reports and dashboards, sooner or later you’ll find that this is gonna come back to bite you.

Six months or a year from now, you’ll have completely forgotten about that calculation. But there will be an org change, or a hierarchy will be updated… something will change the logic of that calculation, and you’ll need to make a change. Only now, you no longer remember the specifics of the logic you coded, and even worse, you don’t remember if you included that same calculation in any of the other analyses you were working on at the time. Sound familiar? Now, a change that should have been rather straightforward and could have been completed in an afternoon takes two to three times longer as you dig through all your old reports trying to make sense of things. (If only you’d documented your development notes somewhere…)

Saving columns to the web catalog is the middle ground that gives us the best of both worlds… the convenience of quickly coding the logic on the front-end with the peace of mind of knowing that the logic is all in one place to ensure consistency and ease maintenance.

After you update your column formula, click OK.

From the column dropdown, select the Save Column As option.

Save the column to the web catalog.  Also, be sure to use the description field.  The description is a super convenient place to store a few lines of text that your future self or others can use to understand the purpose of this column.

As an added bonus, this description field is also used when searching the web catalog.  So, if you don’t happen to remember what name you gave a column but included a little blurb about the calculation, all is not lost.

Saved columns can be added from the web catalog.

Add will continue to reference the original saved column, so that changes made to the saved column will be reflected in your report. Add Copy will add the column to your report, but future changes to the saved column will not be reflected.

One thing to note, when you add a saved column to a report it can no longer be edited from within the report.  When you click on Edit Formula you will still be able to see the logic, but you will need to open and edit that saved column directly to make any changes to the formula.

Try out the saved columns, you’ll find it’s invaluable and will greatly reduce the time it takes to update reports.  And with all that free time, maybe you’ll finally get to play around with the Visual Analyzer!


OBIEE12c – Three Months In, What’s the Verdict?

I’m over in Redwood Shores, California this week for the BIWA Summit 2016 conference, where I’ll be speaking about BI and analytics development on Oracle’s database and Hadoop platforms. As it’s around three months now since OBIEE12c came out and we were here for Openworld, I thought it’d be a good opportunity to reflect on how OBIEE12c has been received by ourselves, the partner community and of course by customers. Given OBIEE11g was with us for around five years, it’s still early days in the overall lifecycle of this release, but it’s also interesting to compare back to where we were around the same time with 11g and see if we can spot any similarities and differences in take-up.

Starting with the positives: Visual Analyzer (note – not the Data Visualization option, I’ll get to that later) has gone down well with customers, at least over in the UK. The major selling point seems to be “Tableau with a shared, governed data model, integrated with the rest of our BI platform” (see Oracle’s slide from Oracle Openworld 2015 below), and given that the DV option’s price point per named user seems to be comparable with Tableau Server, the cost savings in terms of not having to learn and support a new platform mean that customers seem pleased this new feature is now available.

[Oracle slide from Openworld 2015]

Given that VA is an extra-cost option what I’m seeing is customers planning to upgrade their base OBIEE platform from 11g to 12c as part of their regular platform refresh schedule, and then postponing the VA part until after the upgrade and as part of a separate cost/benefit exercise. But VA seems to be the trigger for customers to start considering an upgrade now, with the business typically now holding the budget for BI and Visual Analyzer (like Mobile with 11g) being the new capability that unlocks the upgrade spend.

On the negative side, Oracle charging for VA hasn’t gone down well, either from the customer side, who ask what it is they actually get for their 22% upgrade and maintenance fee if they have to pay for anything new that comes with the upgrade; or from partners, who now see little in the 12c release to incentivise customers to upgrade that’s not an additional-cost option. My response is usually to point to previous releases – 11g with Scorecards and Mobile, the database with In-Memory, RAC and so on – and say that it’s always the case that anything net-new comes at extra cost, whereas the upgrade is something you should do anyway to keep the platform up-to-date and be prepared to take up new features. My observation over the past month or so is that this objection seems to be going away as people get used to the fact that VA costs extra; the other push-back I get a lot is from IT departments who don’t want to introduce data mashups into their environment, partly, I guess, out of fear of the unknown but also partly because of concerns around governance, how well it’ll work in the real world, and so on. I’d say overall VA has gone down well, at least once we got past the “shock” of it costing extra. I’d expect there’ll be some bundle along the lines of BI Foundation Suite (BI Foundation Suite Plus?) in the next year or so that’ll bundle BIF with the DV option, and maybe include some upcoming features in 12c that aren’t exposed yet but might round out the release. We’ll see.

The other area where OBIEE12c has gone down well, surprisingly well, is with the IT department for the new back-end features. I’ve been telling people that whilst everyone thinks 12c is about the new front-end features (VA, new look-and-feel etc) it’s actually the back-end that has the most changes, and will lead to the most financial and cost-saving benefits to customers – again note the slide below from last year’s Openworld summarising these improvements.

[Oracle slide from Openworld 2015 summarising the back-end improvements]

Simplifying install, cloning, dev-to-test and so on will make BI provisioning considerably faster and cheaper to do, whilst the introduction of new concepts such as BI Modules, Service Instances, layered RPD customizations and BAR files paves the way for private cloud-style hosting of multiple BI applications on a single OBIEE12c domain, hybrid cloud deployments and mass customisation of hosted BI environments similar to what we’ve seen with Oracle Database over the past few years.

What’s interesting with 12c at this point, though, is that these back-end features are only half-deployed within the platform; the lack of a proper RPD upload tool, and BI Modules and Service Instances only being available in the singular, point to a future release where all this functionality gets rounded off and fully realised in the platform. So where we are now is that 12c seems oddly half-finished and over-complicated for what it is, but it’s what’s coming over the rest of the lifecycle that will make this part of the product most interesting – see the slide below from Openworld 2014 where this full vision was set out, but which at Openworld this year was presumably left out of the launch slides as the initial release only included the foundation and not the full capability.

[Oracle slide from Openworld 2014 setting out the full BI Modules / Service Instances vision]

Compared back to where we were with OBIEE11g (11.1.1.3, at the start of the product cycle), which was largely feature-complete but had significant product quality issues, with 12c we’ve got less of the platform built out but (with a couple of notable exceptions) generally good product quality; however, this half-completed nature of the back-end must confuse some customers and partners who aren’t really aware of the full vision for the platform.

And finally, cloud; BICS had an update some while ago where it gained Visual Analyzer and data mashups earlier than the on-premise release, and as I covered in my recent UKOUG Tech’15 conference presentation it’s now possible to upload an on-premise RPD (but not the accompanying catalog, yet) and run it in BICS, giving you the benefit of immediate availability of VA and data mashups without having to do a full platform upgrade to 12c.

[Slide: uploading an on-premise RPD to BICS]

In practice there are still some significant blockers for customers looking to move their BI platform wholesale into Oracle Cloud; there’s no ability yet to link your on-premise Active Directory or other security setup to BICS, meaning that you need to recreate all your users as Oracle Cloud users, and there’s very limited support for multiple subject areas, access to on-premise data sources and other more “enterprise” characteristics of an Oracle BI platform. And Data Visualisation Cloud Service (DVCS) has so far been a non-event; for partners the question is why we would get involved and sell this given the very low cost and the lack of any area where we can really add value, while for customers it’s perceived as interesting but too limited to be of any real practical use. Of course, over the long term this is the future – I expect on-premise installs of OBIEE will be the exception rather than the rule in 5 or 10 years’ time – but for now Cloud is more “one to monitor for the future” rather than something to plan for now, as we’re doing with 12c upgrades and new implementations.

So, in summary, I’d say with OBIEE12c we were pleased and surprised to see it out so early, and VA in particular has driven a lot of interest and awareness from customers that has manifested itself in enquiries around upgrades and new-features presentations. The back-end for me is the most interesting new part of the release, promising significant time-saving and quality-improving benefits for the IT department, but at present these benefits are more theoretical than realisable, until such time as the full BI Modules/multiple Service Instances feature is rolled out later this year or next. Cloud is still “one for the future”, but there’s significant interest from customers in moving either part or all of their BI platform to the cloud; given the enterprise nature of OBIEE, it’s likely BI will follow after a wider adoption of Oracle Cloud by the customer rather than being the trailblazer, given the need to integrate with cloud security and data sources and the need to wait for some other product enhancements to match on-premise functionality.
