Rittman Mead at Collaborate 16: Data Integration Focus
It’s that time of year again when Oracle technologists from around the world gather in Las Vegas, NV to teach, learn, and, of course, network with their peers. The Collaborate conference, running for 10 years now, has been a collaboration, if you will, between the Independent Oracle Users Group (IOUG), Oracle Applications Users Group (OAUG) and Quest International Users Group (Quest), making it one of the largest user group conferences in the world. Rittman Mead will once again be in attendance, with two data integration-focused presentations by me over the course of the week.
My first session, “A Walk Through the Kimball ETL Subsystems with Oracle Data Integration”, scheduled for Monday, April 11 at 10:30am, will focus on how we can implement the ETL Subsystems using Oracle Data Integration solutions. As you know, Big Data integration has been the hot topic over the past few years, and it’s an excellent feature of the Oracle Data Integration product suite (Oracle Data Integrator, GoldenGate, & Enterprise Data Quality). But not all analytics, such as labor cost, revenue, or expense reporting, require big data technologies. Ralph Kimball, dimensional modeling and data warehousing expert and founder of The Kimball Group, spent much of his career working to build an enterprise data warehouse methodology that can meet these reporting needs. His book, “The Data Warehouse ETL Toolkit”, is a guide for many ETL developers. This session will walk you through his ETL Subsystem categories: Extracting, Cleaning & Conforming, Delivering, and Managing, describing how the Oracle Data Integration products are perfectly suited for the Kimball approach.
I go into further detail on one of the ETL Subsystems in an upcoming IOUG Select Journal article, titled “Implement an Error Event Schema with Oracle Data Integrator”. The Select Journal is a technical magazine published quarterly and available exclusively to IOUG members. My recent post “Data Integration Tips: ODI 12c Repository Query – Find the Mapping Target Table” shows a bit of the detail behind the research performed for the article.
If you’re not familiar with the Kimball approach to data warehousing, I would definitely recommend reading one (or more) of their published books on the subject. I would also recommend attending one of their training courses, but, unfortunately for the data warehousing community, the Kimball Group closed shop in December 2015. But hey, the good news is that two of the former Kimball team members have joined forces at Decision Works, and they offer the exact same training they used to deliver under The Kimball Group name.
On Thursday, April 14 at 11am, I will dive into the recently released Oracle GoldenGate for Big Data 12.2 in a session titled “Oracle GoldenGate and Apache Kafka: A Deep Dive into Real-Time Data Streaming”. The challenge for us as data integration professionals is to combine relational data with non-structured, high-volume and rapidly changing datasets, known in the industry as Big Data, and transform it all into something useful. Not only that, but we must do it in near real-time, delivering to a big data target system such as Hadoop. The topic of this session, real-time data streaming, provides a great solution for that challenging task. By combining GoldenGate, Oracle’s premier data replication technology, and Apache Kafka, the open-source streaming and messaging system for big data, we can implement a fast, durable, and scalable solution.
If you plan to be at Collaborate next week, feel free to drop me a line in the comments, via email at michael.rainey@rittmanmead.com, or on Twitter @mRainey. I’d love to meet up and have a discussion around my presentation topics, data integration, or really anything we’re doing at Rittman Mead. Hope to see you all there!
New OTN Article – OBIEE Performance Analytics: Analysing the Impact of Suboptimal Design
I’m pleased to have recently had my first article published on the Oracle Technology Network (OTN). You can read it in its full splendour and glory (!) over there, but I thought I’d give a bit of background to it here and the tools demonstrated within.
OBIEE Performance Analytics Dashboards
One of the things that we frequently help our clients with is reviewing and optimising the performance of their OBIEE systems. As part of this we’ve built up a wealth of experience in the kinds of suboptimal design patterns that can cause performance issues, as well as how to go about identifying them empirically. Getting a full-stack view of OBIEE performance behaviour is key to demonstrating where an issue lies, prior to being able to resolve it and prove it fixed, and for this we use the Rittman Mead OBIEE Performance Analytics Dashboards.
A common performance issue that we see is analyses and/or RPDs built in such a way that the BI Server inadvertently returns many gigabytes of data from the database and in doing so often has to dump out to disk whilst processing it. This can create large NQS_tmp files, impacting the disk space available (sometimes critically), and the disk I/O subsystem. This is the basis of the OTN article that I wrote, and you can read the full article on OTN to find out more about how this can be a problem and how to go about resolving it.
OBIEE implementations that cause heavy use of temporary files on disk by the BI Server can result in performance problems. Until recently this was really difficult to track in OBIEE because of the transitory nature of the files. By the time the problem had been observed (for example, disk full messages), the query responsible had moved on and the temporary files had been deleted. At Rittman Mead we have developed lightweight diagnostic tools that collect, amongst other things, the amount of temporary disk space used by each of the OBIEE components.
This can then be displayed as part of our Performance Analytics Dashboards, and analysed alongside other performance data on the system, such as which queries were running, disk I/O rates, and more.
Because the Performance Analytics Dashboards are built in a modular fashion, it is easy to customise them to suit specific analysis requirements. In this next example you can see performance data from the Oracle database being analysed by OBIEE dashboard page in order to identify the cause of poorly-performing reports.
We’ve put online a set of videos here demonstrating the Performance Analytics Dashboards, and explaining in each case how they can help you quickly and accurately diagnose OBIEE performance problems.
You can read more about our Performance Analytics offering here, or get in touch to find out more!
The Importance of BI Documentation
Why Is BI Documentation Important?
Business intelligence systems come with a lot of extra information. Even beautifully constructed analyses have piles of background information and histories. Administrators might often have memos and updates that they’d like to share with analysts. Sales figures might have anomalies that need further explanation. But OBIEE does not currently have any options for BI Documentation inside the dashboard.
Let’s say a BI user for a cell phone distribution company is viewing a report comparing the yearly sales figures for several different cell phones. If the analyst notices that one specific cell phone is outperforming the others, but doesn’t know what makes that specific model unique, then they have to go searching for that information.
But what if the individual phone model specifications and advertising and marketing histories were already included as reports inside the dashboard? What if the analyst, with only a couple of clicks, discovered that the reason one cell phone was outperforming the others was due to its next-gen screen, camera, and chip upgrades, which proved popular with consumers? Or what if the analyst discovered that the popular phone, while containing outdated peripherals, was selling so well because of a Q3 advertising push for that model alone? All of this information might not be contained in the dashboard’s visuals, but it greatly affects the analyst’s understanding of the reports.
Current Options for OBIEE Documentation
Some information can be displayed as visuals, but many times this isn’t a practical solution. Besides making dashboards too cluttered, memos, product descriptions, company directories, etc., are not practical as charts and graphs. Right now, important documentation can be stored in a wide range of places outside of the BI dashboard, but the operating reality at most organizations means that this information is spread across several locations and not always accessible to the people who need it.
Workarounds are inefficient, cost time, cause BI users to leave the BI environment (potentially reducing usage), and increase frustration. If an analyst has to email several different people to locate the information she wants, that complicates her workflow and produces extraneous communications (who likes answering emails?). Before now, there wasn’t an easy solution to these problems.
ChitChat’s BI Documentation Features
With ChitChat, it’s now possible to store critical documentation where it belongs—at the source of the conversation. Keep phone directories, memos from administrators (or requests from analysts to administrators), product descriptions, analytical histories—really, the possibilities are endless—inside the dashboard where they are accessible to the people who need them. Shorten workflows and make life easier for your BI users.
ChitChat’s easy-to-use functionality allows BI users to copy and paste or write important information (using ChitChat’s built-in WYSIWYG text editor) inside the BI dashboard, creating a quicker path to insightful and actionable analytics. And isn’t that the goal in the end?
To learn more about ChitChat’s many commentary features, or to request a demo, click here.
ASO Slice Clears – How Many Members?
Since version 11.1.1, Essbase developers have been able to (comparatively) easily clear portions of an ASO cube, getting away from fiddly methods that involved manually contra-ing existing data via reports and rules files, and making incremental loads substantially easier.
Along with the official documentation in the TechRef and DBAG, there are a number of excellent posts already out there that explain this process and how to effect “slice clears” in detail (here and here are just two I’ve come across that I think are clear and helpful). However, I had a requirement recently where the incremental load was a bit more complex than this. I am sure people must have fulfilled similar requirements in the same or a very similar way, but I could not find any documentation or articles relating to it, so I thought it might be worth recording.
For the most part, the requirements I’ve had in this area have been relatively straightforward: (mostly) financial systems where the volatile/incremental slice is typically a month’s worth (or quarter’s worth) of data. The load script will follow this sort of sequence:
- [prepare source data, if required]
- Perform a logical clear
- Load data to buffer(s)
- Load buffer(s) to new database slice(s)
- [Merge slices]
The last stage is run here if processing time allows (this operation precludes access to the cube), or in a separate routine “out of hours” if not.
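The merge itself is a one-line MaxL statement; a minimal sketch, using the same placeholder application and database names as the clear examples below, would be:

alter database 'Appname'.'DBName' merge all data;

(or “merge incremental data” to consolidate the incremental slices into a single slice while leaving the main slice untouched).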
The “logical clear” element of the script will comprise a line like (note: the lack of a “clear mode” argument means a logical clear; only a physical clear needs to be specified explicitly):
alter database 'Appname'.'DBName' clear data in region '{[Jan16]}';
or more probably
alter database 'Appname'.'DBName' clear data in region '{[&CurrMonth]}';
i.e., using a variable to get away from actually hard coding the member values to clear. For separate year/period dimensions, the slice would need to be referenced with a CrossJoin:
alter database 'Appname'.'DBName' clear data in region 'CrossJoin({[Jan]},{[FY16]})';

or, again using variables:

alter database '${Appname}'.'${DBName}' clear data in region 'CrossJoin({[&CurrMonth]},{[&CurrYear]})';
which would, of course, fully nullify all data in that slice prior to the load. Most load scripts will already use variables to represent the current period, potentially to scope the source data (or, in a BSO context, to provide a FIX for post-load calculations), so using the same variables to control the clear is an easy addition.
Taking this forward a step, I’ve had other systems whereby the load could comprise any number of (monthly) periods from the current year. A little bit more fiddly, but achievable: as part of the prepare source data stage above, it is relatively straightforward to run a select distinct period query on the source data, spool the results to a file, and then use this file to construct that portion of the clear command (or, for a relatively small number, prepare a sequence of clear commands).
The requirement I had recently falls into the latter category, in that the volatile dimension (where “Period” would be the volatile dimension in the examples above) was a “product” dimension of sorts, and contained a lot of changed values each load. Several thousand, in fact. Far too many to loop around and build into a single command, and far too many to run as individual commands: whilst on test the “clears” themselves ran satisfyingly quickly, doing so obviously generated an undesirably large number of slices.
So the problem was this: how to identify and clear data associated with several thousand members of a volatile dimension, the values of which could change totally from load to load.
In short, the answer I arrived at was to use a UDA.
The TechRef does not explicitly say so or give examples, but because the Uda function can be used within a CrossJoin reference, it can be used to effect a clear: assume the Product dimension had a UDA of CLEAR against certain members…
alter database 'Appname'.'DBName' clear data in region 'CrossJoin({Uda([Product], "CLEAR")})';
…would then clear all data for all of those members. If data for, say, just the ACTUAL scenario is to be cleared, this can be added to the CrossJoin:
alter database 'Appname'.'DBName' clear data in region 'CrossJoin({Uda([Product], "CLEAR")}, {[ACTUAL]})';
But we first need to set this UDA in order to take advantage of it. In the load script steps above, the first step is “prepare source data, if required”. At this point, a SQLplus call was inserted to a new procedure that:
- examines the source load table for distinct occurrences of the “volatile” dimension
- populates a table (after initially truncating it) with a list of these members (and parents), and a third column containing the text “CLEAR”.
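As an illustration only (the actual implementation was a stored procedure called from SQLplus; the table and column names below are hypothetical), the populate step might look something like this in SQL:

-- Hypothetical sketch: rebuild the list of volatile members to be cleared.
-- stg_fact_load, dim_product and uda_clear_list are illustrative names only.
truncate table uda_clear_list;

insert into uda_clear_list (member_name, parent_name, uda_value, udaclear)
select distinct s.product,
                p.parent_product,
                'CLEAR',
                null   -- empty fourth column, used later to blank the UDA
from   stg_fact_load s
join   dim_product   p on p.product = s.product;

commit;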
A “rules” file then needs to be built to load the attribute. Because the outline has already been maintained, this is simply a case of loading the UDA itself.
In the “Essbase Client” portion of the load script, prior to running the “clear” command, the temporary UDA table needs to be loaded using the rules file to populate the UDA for those members of the volatile dimension to be cleared:
import database 'AppName'.'DBName' dimensions connect as 'SQLUsername' identified by 'SQLPassword' using server rules_file 'PrSetUDA' on error write to 'LogPath/ASOCurrDataLoad_SetAttr.err';
With the relevant slices cleared, the load can proceed as normal.
After the actual data load has run, the UDA settings need to be cleared. Note that the prepared table above also contains an empty column, UDACLEAR. A second rules file, PrClrUDA, was prepared that loads this (4th) column as the UDA value—loading a blank value to a UDA has the same effect as clearing it.
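That clear-down is just a re-run of the dimension build with the second rules file, along the lines of the following (the error file name here is illustrative):

import database 'AppName'.'DBName' dimensions connect as 'SQLUsername' identified by 'SQLPassword' using server rules_file 'PrClrUDA' on error write to 'LogPath/ASOCurrDataLoad_ClrAttr.err';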
The broad steps of the load script therefore become these:
- [prepare source data, if required]
- ascertain members of volatile dimension to clear from load source
- update table containing current load members / CLEAR attribute
- Load CLEAR attribute table
- Perform a logical clear
- Load data to buffers
- Load buffer(s) to new database slice(s)
- [Merge slices]
- Remove CLEAR attributes
So, not without limitations: if the data was volatile over two dimensions (e.g., Product A for Period 1, Product B for Period 2, etc.) the approach would not work (at least, not exactly as described, although in this instance you could possibly iterate around the smaller Period dimension). But overall, I think it’s a reasonable and flexible solution.
Clear / Load Order
While not strictly part of this solution, another little wrinkle to bear in mind here is the resource taken up by the logical clear. When initializing a buffer prior to loading data into it, you have the ability to determine how much of the total available resource is used for that particular buffer: from a total of 1.0, you can allocate (e.g.) 0.25 to each of 4 buffers that can then be used for a parallel load operation, with each loaded buffer subsequently writing to a new database slice. Importing a loaded buffer to the database then releases the “share” of the utilization afforded to that buffer.
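For example, a parallel load across two buffers might initialize them like this (a sketch; the buffer ids and the 0.25 allocations are arbitrary):

alter database 'Appname'.'DBName' initialize load_buffer with buffer_id 1 resource_usage 0.25;
alter database 'Appname'.'DBName' initialize load_buffer with buffer_id 2 resource_usage 0.25;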
Although not a “buffer initialization” activity per se, a (slice-generating) logical clear seems to occupy all of this resource: if you have any uncommitted buffers created, even with the lowest possible resource utilization of 0.01 assigned, the logical clear will fail.
The Essbase Technical Reference states, under “Loading Data Using Buffers”:
While the data load buffer exists in memory, you cannot build aggregations or merge slices, as these operations are resource-intensive.
It could perhaps be argued that as we are creating a “clear slice,” not merging slices (nor building an aggregation), the logical clear falls outside of this definition, but a similar restriction certainly appears to apply here too.
This is significant as, arguably, the ideal incremental load would be along the lines of:
- Initialize buffer(s)
- Load buffer(s) with data
- Effect partial logical clear (to new database slice)
- Load buffers to new database slices
- Merge slices into database
This would both minimize the time that the cube is inaccessible (during the merge), and avoid presenting zeroes in the current load area. However, as noted above, this does not seem to be possible: there does not seem to be a way to change the resource usage (RNUM) of the “clear”, meaning that this sequence has to be followed:
- Effect partial logical clear (to new database slice)
- Initialize buffer(s)
- Load buffer(s) with data
- Load buffers to new database slices
- Merge slices into database
I.e., the ‘clear’ has to be fully effected before the initialization of the buffers. This works as you would expect, but there is a brief period—after the completion of the “clear” but before the load buffer(s) have been committed to new slices—where the cube is accessible and the load slice will show as “0” in the cube.
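Putting it all together, a minimal MaxL sketch of the workable sequence might look like the following (the data rules file name and error log path are hypothetical, and the region spec reuses the UDA approach from above):

/* 1. Logical clear: must fully complete before any load buffer is initialized */
alter database 'Appname'.'DBName' clear data in region 'CrossJoin({Uda([Product], "CLEAR")}, {[ACTUAL]})';

/* 2. Initialize the buffer and load the incremental data into it */
alter database 'Appname'.'DBName' initialize load_buffer with buffer_id 1;
import database 'Appname'.'DBName' data connect as 'SQLUsername' identified by 'SQLPassword' using server rules_file 'PrDatLoad' to load_buffer with buffer_id 1 on error write to 'LogPath/ASOCurrDataLoad_Data.err';

/* 3. Commit the buffer to a new slice, then merge when processing time allows */
import database 'Appname'.'DBName' data from load_buffer with buffer_id 1 add values create slice;
alter database 'Appname'.'DBName' merge all data;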
ChitChat: The Importance of BI Integrations
A user’s workflow shouldn’t change to accommodate a new tool. A new tool should fill a gap in the current workflow and help streamline the user’s process. An application without a clearly defined scope eventually overlaps with existing solutions, creating confusion and distress among users. It takes both time and effort to clarify the appropriate situations to use the application, reconcile different use cases and approaches, and resolve incorrect uses. We designed ChitChat with appropriate scopes in mind, implementing key integrations, to fit seamlessly into existing workflows.
What exactly do we mean by “scope”?
Let’s look at an example with JIRA. JIRA owns the complete ticketing process, meaning tickets are stored and maintained by the tool. Using a competing ticket solution, such as Trello, for the same purpose within the organization will cause havoc among users. However, JIRA tickets are still extremely useful outside of the JIRA application. They can be linked to and displayed inside other applications, but they are still maintained by JIRA itself.
If you can recognize that the ticketing management should be handled solely by JIRA, but exposure of those tickets outside of the tool is also important, then you understand the correct scope of the application. The scope of the application does not determine where the context of an application is useful. It only describes what section of a workflow the application has absolute control over. The question isn’t “Where should we be able to view the information?” The question is “Where should the content be maintained?”
ChitChat respects the appropriate scopes of neighboring applications and allows the flexibility to continue maintaining the scopes of these applications. With integrations to Atlassian JIRA and Confluence and Salesforce Chatter, the information you need is available where you need it, without infringing on your existing workflow.
Examples of Integrations
Let’s look at some examples. As we use a BI dashboard, we stumble upon an issue. Using ChitChat, the issue can be identified and a conversation can be started about temporarily working around the problem. However, the IT team uses JIRA to accept issues and resolve them as appropriate. We obviously want the IT team to know of this issue, so we must create a ticket in JIRA as well. Rather than going to JIRA and creating a ticket manually, we can simply export the initial annotation to JIRA. The workflow remains generally identical, but now requires less time and effort. And this comes with the added benefit of the ticket pointing directly to the location of the issue on the dashboard.
In another instance, let’s say our dashboard has some confusing calculations on it, some of which are not immediately recognizable. The formulas used, and the reasons to use such formulas, are available in Atlassian Confluence for us to view. However, not all users have a Confluence account, and even fewer have access to the document. We could copy and paste the calculations as a document using ChitChat, but now we have two separate instances of the same information. If the calculations are changed, we must ensure both locations are accurate. Alternatively, ChitChat can sync directly with Confluence and pull a page into the application. The page guarantees accuracy by consistently pulling new updates from Confluence, as well as pushing updates to Confluence if the content is changed in ChitChat.
These approaches allow the JIRA ticket and Confluence document to be maintained in the appropriate location, while also being available in a useful context. ChitChat does not impinge on the purposes of other applications. It offers integrations that seamlessly enhance your workflow without making it convoluted. Our tool is designed specifically to fill the missing pieces in your BI workflow, allowing for a seamless transition between analysis and communication.
To learn more about ChitChat’s many commentary features, or to request a demo, click here.