ChitChat for OBIEE – Now Available as Open Source!
ChitChat is the Rittman Mead commentary tool for OBIEE. ChitChat enhances the BI experience by bringing conversational capabilities into the BI dashboard, increasing ease of use and fitting seamlessly into existing workflows. From tracking the history behind analytical results to commenting on specific reports, ChitChat provides a multi-tiered platform built into the BI dashboard, creating a more collaborative and dynamic environment for discussion.
Today we're pleased to announce the open-source release of ChitChat! You can find the GitHub repository here: https://github.com/RittmanMead/ChitChat
Highlights of the features that ChitChat provides include:
- Annotate - ChitChat's multi-tiered annotation capabilities allow BI users to leave comments where they belong: at the source of the conversation, inside the BI ecosystem.
- Document - ChitChat introduces the ability to include documentation inside your BI environment for when you need more than a comment. Keeping key materials inside the dashboard gives the right people access to key information without searching.
- Share - ChitChat allows you to bring attention to important information on the dashboard using the channel or workflow manager you prefer.
- Verified Compatibility - ChitChat has been tested against popular browsers, operating systems, and database platforms for maximum compatibility.
Getting Started
In order to use ChitChat you will need OBIEE 11.1.1.7.x, 11.1.1.9.x or 12.2.1.x.
First, download the application and unzip it to a convenient location on the OBIEE server, such as a home directory or the desktop.
See the Installation Guide for full detail on how to install ChitChat.
Database Setup
Build the required database tables using the installer:
cd /home/federico/ChitChatInstaller
java -jar SocializeInstaller.jar -Method:BuildDatabase -DatabasePath:/app/oracle/oradata/ORCLDB/ORCLPDB1/ -JDBC:"jdbc:oracle:thin:@192.168.0.2:1521/ORCLPDB1" -DatabaseUser:"sys as sysdba" -DatabasePassword:password -NewDBUserPassword:password1
The installer will create a new user (RMREP) and the tables required for the application to operate correctly. The -DatabasePath flag tells the installer where to place the datafiles for ChitChat on your database server. -JDBC indicates which JDBC driver to use, followed by a colon and the JDBC string to connect to your database. -DatabaseUser specifies the user to access the database with. -DatabasePassword specifies the password for the user previously given. -NewDBUserPassword indicates the password for the new user (RMREP) being created.
WebLogic Data Source Setup
Add a Data Source object to WebLogic using WLST:
cd /home/federico/ChitChatInstaller/jndiInstaller
$ORACLE_HOME/oracle_common/common/bin/wlst.sh ./create-ds.py
To use this script, modify the ds.properties file using the method of your choice. The following parameters must be updated to reflect your installation: domain.name, admin.url, admin.userName, admin.password, datasource.target, datasource.url and datasource.password.
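Purely as an illustration (these values are placeholders, not from a real installation), a populated ds.properties might look like this:
domain.name=bi_domain
admin.url=t3://obieeserver:7001
admin.userName=weblogic
admin.password=welcome1
datasource.target=bi_server1
datasource.url=jdbc:oracle:thin:@192.168.0.2:1521/ORCLPDB1
datasource.password=password1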
Deploying the Application on WebLogic
Deploy the application to WebLogic using WLST:
cd /home/federico/ChitChatInstaller
$ORACLE_HOME/oracle_common/common/bin/wlst.sh ./deploySocialize.py
To use this script, modify the deploySocialize.py file using the method of your choice. The first line must be updated with the username, password and URL to connect to your WebLogic Server instance. The second parameter of the deploy command must be updated to reflect your ChitChat access location.
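For illustration only, those lines of deploySocialize.py would look something like the following (WLST scripts are Python; the credentials, URL, application name, path and target shown here are placeholder assumptions, not values from the installer):
connect('weblogic', 'welcome1', 't3://obieeserver:7001')  # first line: username, password and admin URL
deploy('Socialize', '/home/federico/ChitChatInstaller/Socialize.war', targets='bi_server1')  # second parameter: your ChitChat access location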
Configuring the Application
ChitChat requires several configuration parameters to allow the application to operate successfully. To change the configuration, you must log in to the database schema as the RMREP user and update the values manually in the APPLICATION_CONSTANT table.
See the Installation Guide for full detail on the available configuration and integration options.
Enabling the Application
To use ChitChat, you must add a small block of code to any dashboard (in a new column on the right-hand side of the dashboard) where you want the application enabled:
<rm id="socializePageParams"
user="@{biServer.variables['NQ_SESSION.USER']}"
tab="@{dashboard.currentPage.name}"
page="@{dashboard.name}">
</rm>
<script src="/Socialize/js/dashboard.js"></script>
Congratulations! You have successfully installed the Rittman Mead commentary tool. To use the application to its fullest capabilities, please refer to the User Guide.
Problems?
Please raise any issues on the GitHub issue tracker. This is open source, so bear in mind that it's no-one's "job" to maintain the code - it's open to the community to use, benefit from, and maintain.
If you'd like specific help with an implementation, Rittman Mead would be delighted to assist - please do get in touch with Jon Mead or DM us on Twitter @rittmanmead to get access to our Slack channel for support about ChitChat.
Please contact us on the same channels to request a demo.
Real-time Sailing Yacht Performance – Kafka (Part 2)
In the last two blogs, Getting Started (Part 1) and Stepping back a bit (Part 1.1), I looked at what data I could source from the boat's instrumentation and introduced some new hardware to the boat to support the analysis.
Just to recap, I am looking to create the yacht's Polars with a view to improving our knowledge of her abilities (whether we can use this to improve our race performance is another matter).
Polars give us a plot of the boat's speed given a true wind speed and angle. This, in turn, informs us of the optimal speed the boat could achieve at any particular angle to wind and wind speed.
In the first blog I wrote a reader in Python that takes messages from a TCP/IP feed and writes the data to a file. The reader validates each message using a hash key (see Getting Started (Part 1)). I'm also converting valid messages into JSON format so that I can push meaningful, structured data downstream. In this blog, I'll cover the architecture and considerations around the setup of Kafka for this use case. I will not cover the installation of each component, as a lot has already been written in this area (we have some internal IP to help with configuration). I'll then discuss the process I went through to get the data displayed in real time in a Grafana dashboard.
Introducing Kafka
I have introduced Kafka into the architecture as a next step.
Why Kafka?
I would like to be able to stream this data in real time and don't want to build my own batch mechanism or create a publish/subscribe model. With Kafka I don't need to check that messages have been successfully received, and if there is a failure while consuming messages the consumers will keep track of what has been consumed. If a consumer fails it can be restarted and it will pick up where it left off (the consumer offset is stored in Kafka as a topic). In the future, I could scale out the platform and introduce some resilience through clustering and replication (this shouldn't be required for a while). Kafka therefore saves me a lot of manual engineering and will support future growth (should I come into money and be able to afford more sensors for the boat).
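As a rough sketch of that offset behaviour (generic kafka-python code for illustration; the group id is made up and the topic is one I create later in this post), a consumer in a group resumes from its last committed offset after a restart:
from kafka import KafkaConsumer
import json

# Offsets are committed per consumer group, so a restarted consumer resumes
# from its last committed position instead of re-reading the whole topic.
consumer = KafkaConsumer('wind-json',
                         bootstrap_servers='localhost:9092',
                         group_id='performance-readers',
                         enable_auto_commit=True,
                         auto_offset_reset='earliest',
                         value_deserializer=lambda v: json.loads(v.decode('utf-8')))
for message in consumer:
    print(message.offset, message.value)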
High level architecture
Let's look at the high-level components and how they fit together. Firstly, the instruments transmit over wireless TCP/IP, and these messages are read using the Python reader I wrote earlier in the year.
I have enhanced that Python code to read and translate the messages and, instead of writing to a file, stream the JSON messages to a topic in Kafka.
Once the messages are in Kafka I use Kafka Connect to stream the data into InfluxDB. The messages are written to topic-specific measurements (tables in InfluxDB).
Grafana is used to display incoming messages in real time.
Kafka components
I am running the application on a MacBook Pro: basically a single-node instance with ZooKeeper, a Kafka broker and a Kafka Connect worker. This is the minimum setup, with very little resilience.
In summary
ZooKeeper is an open-source server that enables distributed coordination of configuration information. In the Kafka architecture ZooKeeper stores metadata about brokers, topics, partitions and their locations.
ZooKeeper is configured in zookeeper.properties.
A Kafka broker is a single Kafka server.
"The broker receives messages from producers, assigns offsets to them, and commits the messages to storage on disk. It also services consumers, responding to fetch requests for partitions and responding with the messages that have been committed to disk." [1]
The broker is configured in server.properties. In this setup I have set auto.create.topics.enable=false. Setting this to false gives me control over the environment: as the name suggests, it disables the auto-creation of topics, which could otherwise lead to confusion.
The Kafka Connect worker allows us to take advantage of predefined connectors that enable the writing of messages from Kafka to known external datastores. The worker is a wrapper around a Kafka consumer. A consumer is able to read messages from a topic partition using offsets; offsets keep track of what has been read by a particular consumer or consumer group. (Kafka Connect workers can also write to Kafka from datastores, but I am not using this functionality in this instance.) The Connect worker is configured in connect-distributed.properties. I have defined the location of the plugins in this configuration file. Connector definitions are used to determine how to write to an external data source.
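By way of illustration (the values are examples only, not my exact configuration), the worker configuration includes the broker connection, the converters and the plugin location along these lines:
bootstrap.servers=localhost:9092
group.id=connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
plugin.path=/usr/local/share/kafka/plugins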
Producer to InfluxDB
I use kafka-python to stream the messages into Kafka. Within kafka-python there is a KafkaProducer that is intended to work in a similar way to the official Java client.
I have created a producer for each message type (parameterised code). Although each producer reads the entire stream from the TCP/IP port, it only processes its assigned message type (wind or speed), increasing parallelism and therefore throughput.
from kafka import KafkaProducer
import json
producer = KafkaProducer(bootstrap_servers='localhost:9092', value_serializer=lambda v: json.dumps(v).encode('utf-8'))
producer.send(topic, json_str)
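A simplified sketch of that parameterised producer is below; the host, port, message matching and the to_json_dict helper are placeholders standing in for the reader and validation code from Part 1:
import json
import socket
from kafka import KafkaProducer

def run_producer(message_type, topic, host='192.168.1.1', port=10110):
    # One instance runs per message type, each writing to its own topic.
    producer = KafkaProducer(bootstrap_servers='localhost:9092',
                             value_serializer=lambda v: json.dumps(v).encode('utf-8'))
    sock = socket.create_connection((host, port))
    buffer = b''
    while True:
        buffer += sock.recv(4096)
        while b'\n' in buffer:
            line, buffer = buffer.split(b'\n', 1)
            message = line.decode('ascii', errors='ignore').strip()
            if message_type in message:                      # only process this producer's assigned type
                producer.send(topic, to_json_dict(message))  # to_json_dict: hypothetical parse/validate helper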
I have created a topic per message type with a single partition. Using a single partition per topic guarantees I will consume messages in the order they arrive. There are other ways to increase the number of partitions and still maintain the read order but for this use case a topic per message type seemed to make sense. I basically have optimised throughput (well enough for the number of messages I am trying to process).
kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic wind-json
kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic speed-json
kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic gps-json
When defining a topic you specify the replication-factor and the number of partitions.
The topic-level configuration is replication.factor. At the broker level, you control the default.replication.factor for automatically created topics. [1] (I have turned off the automatic creation of topics).
The messages are consumed using Stream Reactor, which has an InfluxDB sink mechanism and writes directly to the measurements within a performance database I have created. The following parameters, showing the topics and insert mechanism, are configured in performance.influxdb-sink.properties.
topics=wind-json,speed-json,gps-json
connect.influx.kcql=INSERT INTO wind SELECT * FROM wind-json WITHTIMESTAMP sys_time();INSERT INTO speed SELECT * FROM speed-json WITHTIMESTAMP sys_time();INSERT INTO gps SELECT * FROM gps-json WITHTIMESTAMP sys_time()
The following diagram shows the detail from producer to InfluxDB.
If we now run the producers we get data streaming through the platform.
Producer Python log showing JSON formatted messages:
The status of the consumers shows minor lag reading from two topics; the describe output also shows the current offsets for each consumer task and the partitions being consumed (if we had a cluster it would show multiple hosts):
Inspecting the InfluxDB measurements:
When inserting into a measurement in InfluxDB, if the measurement does not exist it gets created automatically. The datatypes of the fields are determined from the JSON object being inserted. I needed to adjust the creation of the JSON message to cast the values to floats, otherwise I ended up with the wrong types, which caused reporting issues in Grafana. This would be a good case for using Avro and a Schema Registry to handle these definitions.
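For example, the message construction now casts the numeric values explicitly (the field names here are illustrative, not the exact ones in my messages):
def build_wind_message(raw_angle, raw_speed):
    # Casting to float ensures InfluxDB infers float field types for the measurement
    return {'wind_angle': float(raw_angle),
            'wind_speed': float(raw_speed)}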
The following gif shows Grafana displaying some of the wind and speed measurements using a D3 Gauge plugin with the producers running to the right of the dials.
Next Steps
I'm now ready to do some real-life testing on our next sailing passage.
In the next blog, I will look at making the setup more resilient to failure and at how to monitor and automatically recover from some of these failures. I will also introduce the WorldMap panel to Grafana so I can plot the locations where the readings were taken and overlay tidal data.
References
OAC – Thoughts on Moving to the Cloud
Last week, I spent a couple of days with Oracle at Thames Valley Park and this presented me with a perfect opportunity to sit down and get to grips with the full extent of the Oracle Analytics Cloud (OAC) suite...without having to worry about client requirements or project deadlines!
As a company, Rittman Mead already has solid experience of OAC, but my personal exposure has been limited to presentations, product demonstrations, reading the various postings in the blog community and my existing experiences of Data Visualisation and BI cloud services (DVCS and BICS respectively). You’ll find Francesco’s post a good starting place if you need an overview of OAC and how it differs (or aligns) to Data Visualisation and BI Cloud Services.
So, having spent some time looking at the overall suite and, more importantly, trying to interpret what it could mean for organisations thinking about making a move to the cloud, here are my top three takeaways:
Clouds Come In Different Shapes and Flavours
Two of the main benefits that a move to the cloud offers are simplification in platform provisioning and an increase in flexibility, being able to ramp up or scale down resources at will. Both of these come with a potential cost benefit, depending on your given scenario and requirements. The first step is understanding the different options in the OAC licensing and feature matrix.
First, we need to draw a distinction between Analytics Cloud and the Autonomous Analytics Cloud (interestingly, both options point to the same page on cloud.oracle.com, which makes things immediately confusing!). In a nutshell though, the distinction comes down to who takes responsibility for the service management: Autonomous Analytics Cloud is managed by Oracle, whilst Analytics Cloud is managed by yourself. It’s interesting to note that the Autonomous offering is marginally cheaper.
Next, Oracle have chosen to extend their BYOL (Bring Your Own License) option from their IaaS services to now incorporate PaaS services. This means that if you have existing licenses for the on-premise software, then you are able to take advantage of what appears to be a significantly discounted cost. Clearly, this is targeted to incentivise existing Oracle customers to make the leap into the Cloud, and should be considered against your ongoing annual support fees.
Since the start of the year, Analytics Cloud now comes in three different versions, with the Standard and Enterprise editions now being separated by the new Data Lake edition. The important things to note are that (possibly confusingly) Essbase is now incorporated into the Data Lake edition of the Autonomous Analytics Cloud and that for the full enterprise capability you have with OBIEE, you will need the Enterprise edition. Each version inherits the functionality of its preceding version: Enterprise edition gives you everything in the Data Lake edition; Data Lake edition incorporates everything in the Standard edition.
Finally, it’s worth noting that OAC aligns to the Universal Credit consumption model, whereby the cost is determined based on the size and shape of the cloud that you need. Services can be purchased as Pay as You Go or Monthly Flex options (with differential costing to match). The PAYG model is based on hourly consumption and is paid for in arrears, making it the obvious choice for short term prototyping or POC activities. Conversely, the Monthly Flex model is paid in advance and requires a minimum 12 month investment and therefore makes sense for full scale implementations. Then, the final piece of the jigsaw comes with the shape of the service you consume. This is measured in OCPUs (Oracle Compute Units) and the larger your memory requirements, the more OCPUs you consume.
Where You Put Your Data Will Always Matter
Moving your analytics platform into the cloud may make a lot of sense and could therefore be a relatively simple decision to make. However, the question of where your data resides is a more challenging subject, given the sensitivities and increasing legislative constraints that exist around where your data can or should be stored. The answer to that question will influence the performance and data latency you can expect from your analytics platform.
OAC is architected to be flexible when it comes to its data sources and consequently the options available for data access are pretty broad. At a high level, your choices are similar to those you would have when implementing on-premise, namely:
- perform ELT processing to transform and move the data (into the cloud);
- replicate data from source to target (in the cloud) or;
- query data sources via direct access.
These are supplemented by a fourth option to use the inbuilt Data Connectors available in OAC to connect to cloud or on-premise databases, other proprietary platforms or any other source accessible via JDBC. This is probably a decent path for exploratory data usage within DV, but I’m not sure it would always make the best long term option.
Unsurprisingly, with the breadth of options comes a spectrum of tooling that can be used for shifting your data around and it is important to note that depending on your approach, additional cloud services may or may not be required.
For accessing data directly at its source, the preferred route seems to be to use RDC (Remote Data Connector), although it is worth noting that support is limited to Oracle (including OLAP), SQL Server, Teradata or DB2 databases. Also, be aware that RDC operates within WebLogic Server and so this will be needed within the on-premise network.
Data replication is typically achieved using Data Sync (the reincarnation of the DAC, which OBIA implementers will already be familiar with), although it is worth mentioning that there are other routes that could be taken, such as APEX or SQL Developer, depending on the data volumes and latency you have to play with.
Classic ELT processing can be achieved via Oracle Data Integrator (either the Cloud Service, a traditional on-premise implementation or a hybrid-model).
Ultimately, due care and attention needs to be taken when deciding on your data architecture as this will have a fundamental effect on the simplicity with which data can be accessed and interpreted, the query performance achieved and the data latency built into your analytics.
Data Flows Make For Modern Analytics Simplification
A while back, I wrote a post titled Enabling a Modern Analytics Platform in which I attempted to describe ways that Mode 1 (departmental) and Mode 2 (enterprise) analytics could be built out to support each other, as opposed to undermining one another. One of the key messages I made was the importance of having an effective mechanism for transitioning your Mode 1 outputs back into Mode 2 as seamlessly as possible. (The same is true in reverse for making enterprise data available as a Mode 1 input.)
One of the great things about OAC is how it serves to simplify this transition. Users are able to create analytic content based on data sourced from a broad range of locations: at the simplest level, Data Sets can be built from flat files or via one of the available Data Connectors to relational, NoSQL, proprietary database or Essbase sources. Moreover, enterprise curated metadata (via RPD lift-and-shift from an on-premise implementation) or analyst developed Subject Areas can be exposed. These sources can be ‘mashed’ together directly in a DV project or, for more complex or repeatable actions, Data Flows can be created to build Data Sets. Data Flows are pretty powerful, not only allowing users to join disparate data but also perform some useful data preparation activities, ranging from basic filtering, aggregation and data manipulation actions to more complex sentiment analysis, forecasting and even some machine learning modelling features. Importantly, Data Flows can be set to output their results to disk, either written to a Data Set or even to a database table and they can be scheduled for repetitive refresh.
For me, one of the most important things about the Data Flows feature is that it provides a clear and understandable interface which shows the sequencing of each of the data preparation stages, providing valuable information for any subsequent reverse engineering of the processing back into the enterprise data architecture.
In summary, there are plenty of exciting and innovative things happening with Oracle Analytics in the cloud and as time marches on, the case for moving to the cloud in one shape or form will probably get more and more compelling. However, beyond a strategic decision to ‘Go Cloud’, there are many options and complexities that need to be addressed in order to make a successful start to your journey - some technical, some procedural and some organisational. Whilst a level of planning and research will undoubtedly smooth the path, the great thing about the cloud services is that they are comparatively cheap and easy to initiate, so getting on and building a prototype is always going to be a good, exploratory starting point.
Why DevOps Matters for Enterprise BI
Why are people frustrated with their existing enterprise BI tools such as OBIEE? My view is because it costs too much to produce relevant content. I think some of this is down to the tools themselves, and some of it is down to process.
Starting with the tools, they are not “bad” tools; the traditional licensing model can be expensive in today’s market, and traditional development methods are time-consuming and hence expensive. The vendor’s response is to move to the cloud and to highlight cost savings that can be made by having a managed platform. Oracle Analytics Cloud (OAC) is essentially OBIEE installed on Oracle’s servers in Oracle’s data centres with Oracle providing your system administration, coupled with the ability to flex your licensing on a monthly or annual basis.
Cloud does give organisations the potential for more agility. Provisioning servers can no longer hold up the start of a project, and if a system needs to increase capacity, then more CPUs or nodes can be added. This latter case is a bit murky due to the cost implications and the option to try and resolve performance issues through query efficiency on the database.
I don’t think this solves the problem. Tools that provide reports and dashboards are becoming more commoditised; up-and-coming vendors and platform providers are offering the service for a fraction of the cost of the traditional vendors. They may lack some of the enterprise features like open security models; however, this is an area that platform providers are continually improving. Over the last 10 years, Oracle's focus for OBIEE has been more on integration than innovation. Oracle DV was a significant change; however, there is a danger that Oracle lost the first-mover advantage to tools such as Tableau and QlikView. Additionally, some critical features like lineage, software development lifecycle support, versioning and process automation are not built into OBIEE and, worse still, the legacy design and architecture of the product often hinder these.
So this brings me back round to process. Defining “good” processes and having tools to support them is one of the best ways you can keep your BI tools relevant to the business by reducing the friction in generating content.
What is a “good” process? Put simply, one that reduces the time between the identification of a business need and realising it, with zero impact on existing components of the system. A “good” process should also provide visibility of any design, development and testing, plus documentation of changes, typically including lineage in a modern BI system. Continuous integration is the Holy Grail.
This is why DevOps matters. Automated migration across environments, regression tests, automatically generated documentation in the form of lineage, native support for version control systems, supported merge processes and, ideally, a scripting interface or API to automate repetitive tasks (such as changing the data type of a group of fields system-wide) can dramatically reduce the gap from idea to realisation.
So, I would recommend that when looking at your enterprise BI system, you not only consider the vendor, location and features but also focus on the potential for process optimisation and automation. Automation could be something that the vendor builds into the tool, or you may need to use accelerators or software provided by a third party. Over the next few weeks, we will be publishing some examples and case studies of how our BI and DI Developer Toolkits have helped clients and enabled them to automate some or all of the BI software development cycle, reducing the time to release new features and increasing the confidence and robustness of the system.