End-to-End ODI12c ETL on Oracle Big Data Appliance Pt.5 : Bulk Unload to Oracle

All week I’ve been looking at what’s involved in moving data around Hadoop on the Oracle Big Data Appliance, using ODI12c to orchestrate the end-to-end process. Using web log data from the Rittman Mead website, I first landed the log data on HDFS using Apache Flume, then transformed the data in several stages using Hive transformations built using ODI mappings and knowledge modules. The links below give the previous articles in this series, where you can read how I transformed data using Hive, Sqoop and Python streaming:

At the end of all these ODI12c transformations, I’m left with a Hive table containing summarised web page accesses, enhanced with reference data on the page title and author, and with the entries geocoded with the name of the country associated with the page access. In many cases, this is where I’d leave the data, but quite often it’d be useful to then export the data out of Hadoop and into an Oracle database, so that I could load it into a data warehouse, or analyse it using a richer SQL dialect than I can get with HiveQL. I could use Sqoop to get the data out of Hive and into Oracle, but for larger exports in particular I’d get better unload performance by using Oracle Loader for Hadoop, one of the Oracle Big Data Connectors that typically come with Oracle Big Data Appliance.

There are actually two Oracle Big Data Connectors that deal with getting data out of Hadoop and into Oracle. The first, Oracle Direct Connector for HDFS (ODCH), gives you the ability to define an Oracle Database external table over an HDFS file or Hive table, and is useful if you just want to access the data in Hadoop without actually loading it into an Oracle database.


The way ODCH works is that it adds a special type of adapter to the external table feature in Oracle Database, giving the database the ability to stream HDFS files in much the same way that external tables stream regular file data into the database. This makes ODCH very convenient if you just want access to HDFS or Hive data from an Oracle database, but it suffers from the same performance penalty as regular external flat-file access, in that you can’t index the table and so on; so if you want high-performance access to data held in a Hadoop system, and you need this access more than just occasionally, you’re better off copying the data right into the Oracle database, and this is where Oracle Loader for Hadoop comes in.

Oracle Loader for Hadoop (OLH) is a bulk unloader for Hadoop that uses MapReduce to sort, partition and organise the data extract at the Hadoop end, and then loads it into the Oracle database, typically much faster than a Sqoop data transfer.
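
Although in this series OLH is driven entirely from ODI, it may help to see roughly what a standalone OLH job submission looks like. A minimal sketch is below – the job configuration file name is an assumption, and in practice that file carries the target table, connection details and input format settings:

# submit an OLH load job, using settings held in the (hypothetical) configuration file
hadoop jar $OLH_HOME/jlib/oraloader.jar oracle.hadoop.loader.OraLoader \
  -conf /tmp/olh_job_conf.xml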


OLH is, however, a bit fiddly to set up, so what’s useful is that ODI has a knowledge module, IKM File/Hive to Oracle, that can set up both OLH and ODCH jobs. This knowledge module has its own setup process that is itself a bit involved – mostly around environment variables on your Linux system – but once it’s done, each OLH data transfer is very easy to set up, and as long as you don’t then go on to do any more transformation on the Oracle data post-unload, it’s covered by the ODI limited-use license you get when you buy ODI in conjunction with the BDA.

So the key things to get OLH working with ODI are, firstly, to install OLH on the BDA node that’s running the ODI12c agent (or, in my case, the full ODI Studio), and secondly to set up the oracle user’s environment variables so that OLH works and ODI knows where to find any additional Hadoop / Hive libraries and configuration files it needs. In my case, this meant adding the following entries to my .bashrc file:

export HIVE_HOME=/usr/lib/hive
export HADOOP_CLASSPATH=/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/*:/etc/hive/conf:$HIVE_HOME/lib/hive-metastore-0.12.0-cdh5.0.1.jar:$HIVE_HOME/lib/libthrift.jar:$HIVE_HOME/lib/libfb303-0.9.0.jar:$HIVE_HOME/lib/hive-common-0.12.0-cdh5.0.1.jar:$HIVE_HOME/lib/hive-exec-0.12.0-cdh5.0.1.jar
export OLH_HOME=/home/oracle/oracle/product/oraloader-3.0.0-h2
export HADOOP_HOME=/usr/lib/hadoop
export JAVA_HOME=/usr/java/jdk1.7.0_60
export ODI_HIVE_SESSION_JARS=/usr/lib/hive/lib/hive-contrib.jar
export ODI_OLH_JARS=/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/ojdbc6.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-utility.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-mapping.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-collation.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/oraclepki.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/osdt_cert.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/osdt_core.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/commons-math-2.2.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/jackson-core-asl-1.8.8.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/jackson-mapper-asl-1.8.8.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/avro-1.7.3.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/avro-mapred-1.7.3-hadoop2.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/oraloader.jar,/usr/lib/hive/lib/hive-metastore.jar,/usr/lib/hive/lib/libthrift-0.9.0.cloudera.2.jar,/usr/lib/hive/lib/libfb303-0.9.0.jar,/usr/lib/hive/lib/hive-common-0.12.0-cdh5.0.1.jar,/usr/lib/hive/lib/hive-exec.jar

where OLH was installed to /home/oracle/oracle/product/oraloader-3.0.0-h2, Hadoop to /usr/lib/hadoop and Hive to /usr/lib/hive. ODI_HIVE_SESSION_JARS is optional, but if you do anything fancy with SerDes in your mappings you’re best adding this entry, pointing to the hive-contrib.jar file (or, technically, the symbolic link to it). Getting this right was the hardest part of setting this KM up, but once it’s done, it’s done.

Then, to set the ODI parts up, I first create a table in my Oracle 11.2.0.3 database with matching datatypes for the Hive table I’m going to unload – VARCHAR2 for string, NUMBER for int, and so on – and then create entries for it in the ODI topology and the Designer navigator. Next, I create a new mapping in ODI12c, where I copy the columns from my Hive table into the Oracle table (note that I could map from an HDFS file instead, if I wanted to).
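
As a rough illustration, the Oracle target for the final Hive table in this series might be created with something like the DDL below – the column names mirror the Hive table described in the part 4 post further down, but the lengths are just assumptions:

CREATE TABLE access_per_post_full (
  hostname      VARCHAR2(100),
  request_date  VARCHAR2(50),
  post_id       VARCHAR2(20),
  title         VARCHAR2(250),
  author        VARCHAR2(100),
  country       VARCHAR2(100),
  category      VARCHAR2(100)
);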


Now it’s a case of configuring the physical part of the mapping. There are two steps to this: first, you have to change the default LKM setting for the access table in the physical mapping – the ACCESS_F object in the TARGET_GROUP.


By default this sets itself to LKM SQL to Oracle (as you’ve got an Oracle target), but we need to change it to a special one, LKM SQL Multi-Connect, which delegates the multi-connect capabilities to the downstream node so that you can use a multi-connect IKM such as IKM File/Hive to Oracle.


Then, once (and only once) you’ve selected this LKM, you can select IKM File/Hive to Oracle as the KM for the target table.


There’s not much you need to configure in this KM to at least get started. Set OLH_OUTPUT_MODE to JDBC (you can switch to OCI later on instead, for potentially faster loads) and set MAPRED_OUTPUT_BASE_DIR to an HDFS directory that the agent’s OS user (typically “oracle”) can read and write to – in my case, I created one specifically for this KM, using Hue’s File System browser tool. Then, if everything is set up correctly, you should be able to execute the mapping and view the results in the Operator navigator – note that in my case, it took several attempts before I finally got all of the environment variable settings right. Once done, you should then be able to view the Hive table’s data in your Oracle table, using SQL Developer.
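
For reference, the two KM options I changed ended up looking something like this – treat the HDFS path as a placeholder, since it’s just the directory I happened to create:

OLH_OUTPUT_MODE        = JDBC
MAPRED_OUTPUT_BASE_DIR = /user/oracle/olh_staging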


So now we’ve got each step of our end-to-end process. The only thing left to do is to pull all of these steps, including the ODI procedure that runs the Sqoop transfer, into an ODI package or load plan, so that we can run them in sequence.


Then, finally, I can run the whole process as a single execution, moving data from Flume capture through Hadoop ETL steps, to the unload into an Oracle table, all from within ODI12c.


So there you have it – Hadoop-native ETL processing, with the metadata-driven design, error handling and heterogeneous connectivity provided by ODI12c.

End-to-End ODI12c ETL on Oracle Big Data Appliance Pt.4 : Transforming Data using Python & Hive Streaming

This week I’m taking an in-depth look at ETL on the Oracle Big Data Appliance, using Oracle Data Integrator 12c to call the various bits of Hadoop functionality and orchestrate the whole process. So far, I’ve landed web log data into the Hadoop cluster using Flume, created a Hive table over the log data using Hive and the RegEx SerDe, then used further Hive transformations to join this data to other reference data, some of which came from an Oracle database via Sqoop. Here’s a complete listing of the posts so far, and the ones to come:

Our next transformation is a bit trickier though. The data we’re bringing in from our webserver has IP addresses recorded for each HTTP access, and what we want to do is use that IP address to identify the country that the website visitor was located in. Geocoding, as it’s called, is fairly easy to do if your data is stored in a database such as Oracle Database or you’re using a language like Python, with two steps needed to retrieve the IP address’s country:

1. First convert the IP address into a single integer, like this:

address = '174.36.207.186'
( o1, o2, o3, o4 ) = address.split('.')
integer_ip = ( 16777216 * int(o1) ) \
           + (    65536 * int(o2) ) \
           + (      256 * int(o3) ) \
           +              int(o4)

2. Then, using a freely-downloadable database from MaxMind, look up which range of numbers your converted IP address falls into (2921648058 is the integer form of 174.36.207.186 from step 1):

SELECT ip_country
FROM geoip
WHERE 2921648058 BETWEEN begin_ip_num AND end_ip_num
LIMIT 1

The only problem is that Hive doesn’t support anything other than equi-joins. In a previous blog post I got around this issue by using Impala rather than Hive, and in another I used Pig, and a custom Python UDF, to do the geocoding outside of the main Pig data flow. I could use Pig and a Python UDF in this situation too, creating an ODI procedure to call Pig from the command line, but instead I’ll use another ODI KM, IKM Hive Transform, to call a Python script in a way that’s controlled entirely from within ODI.

To set this geocoding process up, I first create a new ODI mapping, with my current Hive table as the source and another Hive table – this time with an additional column for the IP address’s country – as the target. I then map what columns I can into the target, leaving just the country column without a mapping. Notice I’ve put the source table within an ODI12c dataset – this isn’t mandatory, it’s just something I decided to do at the time – and it doesn’t affect the rest of the process.


Now, in a similar way to how I created the Python UDF to do the country lookup for Pig, I’ll create a variation of that script that expects columns of data as input, outputs other columns of data, and does the geocoding within the script by calling the Python GeoIP API provided by MaxMind. Most importantly, though, I’ll use it in conjunction with IKM Hive Transform’s ability to stream its Hive data through an arbitrary script, making use of the Hive TRANSFORM function and a feature called “Hive streaming”. To keep things simple, let’s first set it up outside of the KM, see how it works, and then configure the KM to use this Python script as its transformation function.

Let’s start by looking at the source Hive table I’ll be starting with:

hive> describe access_per_post_categories;
OK
hostname           string             None                
request_date       string             None                
post_id            string             None                
title              string             None                
author             string             None                
category           string             None                
Time taken: 0.112 seconds, Fetched: 6 row(s)
hive> select * from access_per_post_categories limit 5;
OK
137.254.4.12[02/Jun/2014:21:20:41 +0000]2093Hyperion Planning : Installing the Sample Planning Application in EPM 11.1Mark RittmanHyperion
137.254.4.12[02/Jun/2014:21:20:42 +0000]2093Hyperion Planning : Installing the Sample Planning Application in EPM 11.1Mark RittmanHyperion
137.254.4.12[02/Jun/2014:21:20:43 +0000]2093Hyperion Planning : Installing the Sample Planning Application in EPM 11.1Mark RittmanHyperion
137.254.4.12[02/Jun/2014:21:20:45 +0000]2093Hyperion Planning : Installing the Sample Planning Application in EPM 11.1Mark RittmanHyperion
137.254.4.12[02/Jun/2014:21:21:23 +0000]2093Hyperion Planning : Installing the Sample Planning Application in EPM 11.1Mark RittmanHyperion
Time taken: 0.976 seconds, Fetched: 5 row(s)

The target table is exactly the same, except for the additional column for country:

hive> describe access_per_post_full;
OK
hostname           string             None                
request_date       string             None                
post_id            string             None                
title              string             None                
author             string             None                
country            string             None                
category           string             None                
Time taken: 0.098 seconds, Fetched: 7 row(s)

The way IKM Hive Transform works is by streaming each incoming row of Hive data to the script registered with it, with each row’s columns tab-separated. In turn, it expects the script to output similar rows of tab-separated values, each line terminated with a newline. My script, therefore, needs to accept an arbitrary number of incoming rows, parse each row into its constituent columns, call the MaxMind API to do the geocoding, and then output a tab-separated set of columns for each row it processes (a bit like pipelined table functions in Oracle PL/SQL).

So here’s the Python script that does this:

[root@bdanode1 tmp]# cat add_countries.py
#!/usr/bin/python
import sys
sys.path.append('/usr/lib/python2.6/site-packages/')
import pygeoip
gi = pygeoip.GeoIP('/tmp/GeoIP.dat')
for line in sys.stdin:
  line = line.rstrip()
  hostname,request_date,post_id,title,author,category = line.split('\t')
  country = gi.country_name_by_addr(hostname)
  print hostname+'\t'+request_date+'\t'+post_id+'\t'+title+'\t'+author+'\t'+country+'\t'+category

A few things to note about the script:

1. “import sys” brings in the Python module that lets me read the input data via sys.stdin.
2. I’ve previously installed the MaxMind GeoIP API, database and libraries, with the GeoIP.dat database stored alongside the script in /tmp. “import pygeoip” brings in the Python library used to call the API.
3. The bit that does the geocoding is “gi.country_name_by_addr”, which calls the MaxMind API. It’s this bit that replaces the need for a join with a BETWEEN clause in HiveQL.
4. To output data back to the calling Hive transformation, I just “print” it, using “\t” for tab.
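
Because the script simply reads tab-separated lines from stdin and writes tab-separated lines to stdout, you can smoke-test it from the shell before involving Hive at all – a quick sketch, using the paths above and a made-up input row:

echo -e '174.36.207.186\t[02/Jun/2014:21:20:41 +0000]\t2093\tSome post title\tMark Rittman\tHyperion' | python /tmp/add_countries.py

If everything is installed correctly, the same row should come back with a country name inserted before the category column.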

One thing we’ll need to do though, similar to how we had to copy the JAR file used by Hive for the RegEx SerDe to all the nodes in the cluster, is copy the GeoIP.dat file used by the MaxMind geocoding API to the same place on every cluster node. We’ll also have to set the permissions on this file so it’s readable and writeable by the Hive/YARN processes on each Hadoop node:

officeimac:~ markrittman$ ssh root@bdanode2.rittmandev.com
The authenticity of host 'bdanode2.rittmandev.com (192.168.2.231)' can't be established.
RSA key fingerprint is 1b:e7:ec:01:57:0c:09:4e:6d:19:08:a4:0d:df:00:e0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'bdanode2.rittmandev.com' (RSA) to the list of known hosts.
root@bdanode2.rittmandev.com's password: 
Last login: Fri Jun  6 16:55:44 2014 from 192.168.2.201
[root@bdanode2 ~]# ls /tmp/Geo* -l
-rwxrwxrwx 1 oracle oracle 687502 Jun  7 12:31 /tmp/GeoIP.dat
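
Rather than doing this node-by-node, a small loop run from the first node can push the file out and fix the permissions in one go – a sketch only, as the hostnames beyond bdanode2 are assumptions about the rest of the cluster:

for host in bdanode2 bdanode3 bdanode4 bdanode5 bdanode6; do
  # copy the MaxMind database to the same path on each node...
  scp /tmp/GeoIP.dat root@${host}.rittmandev.com:/tmp/
  # ...and make it readable and writeable by the Hive/YARN processes
  ssh root@${host}.rittmandev.com "chmod 666 /tmp/GeoIP.dat"
done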

I also need to make sure the MaxMind geocoding Python API is installed on each node too:

wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
pip install pygeoip

Now I can run a HiveQL command and test out the script. I’ll start up a Hive shell session, and the first thing I need to do is register the script with Hive, which distributes it across the cluster and puts it in the distributed cache. Then I can select against the source table, using the TRANSFORM ... USING clause:

[oracle@bdanode1 ~]$ hive
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/06/09 20:48:00 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
14/06/09 20:48:01 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
 
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.12.0-cdh5.0.1.jar!/hive-log4j.properties
hive> add file file:///tmp/add_countries.py;
Added resource: file:///tmp/add_countries.py
hive> select transform (hostname,request_date,post_id,title,author,category)
    > using 'add_countries.py'                                              
    > as (hostname,request_date,post_id,title,author,category,country)      
    > from access_per_post_categories; 
...
63.73.199.69[02/Jun/2014:21:19:28 +0000] 14726 SmartView as the Replacement for BI Office with OBIEE 11.1.1.7 Mark Rittman United States Oracle EPM
77.125.81.239[02/Jun/2014:21:18:58 +0000] 14476 Upgrading OBIEE to 11.1.1.7 Robin Moffatt Israel Oracle BI Suite EE
77.125.81.239[02/Jun/2014:21:18:59 +0000] 14476 Upgrading OBIEE to 11.1.1.7 Robin Moffatt Israel Oracle BI Suite EE
Time taken: 27.645 seconds, Fetched: 176 row(s)

So it looks like it’s working. Let’s move back to ODI now and reference the script within the IKM Hive Transform KM settings. Note that I enter the name and filesystem location of the script in the KM settings, but I don’t use the TRANSFORM_SCRIPT option to key the script contents directly into the KM; this should work – if you do so, the KM writes the script contents out to the agent’s host filesystem at the start of the process – but the KM in this ODI12c release has an issue parsing these keyed-in scripts, so you’re best leaving this option empty and letting the KM use the file that’s already on the filesystem.


After running the mapping, I then go over to the Operator navigator and see that it’s run successfully. Looking at the code the KM generates, I can see it referencing our script in the HiveQL statement, with the script outputting the extra country column that’s then loaded into the target table along with the other columns.


Then finally, checking the target table, I can see that each row in my final Hive table has the country name alongside each log file entry, along with the IP address and other details.

So – now we’re at the stage where we’ve finished the processing with the Hadoop cluster, and I’d like to copy the final set of data into an Oracle database, so that I can analyse it using Oracle SQL and other tools. In the final installment in this series we’ll do just that, using IKM File/Hive to Oracle (OLH/ODCH) and Oracle Loader for Hadoop, one of the Oracle Big Data Connectors.

End-to-End ODI12c ETL on Oracle Big Data Appliance Pt.3 : Enhance with Oracle Reference Data via Sqoop, and CKMs

In the first two posts in this series, I used the software on the Oracle Big Data Appliance 3.0 to ingest web log data from the Rittman Mead blog server, parse and load that data into a Hive table, and then join that table to another to add details on the post and author. Links to the posts in this series are below, and I’ll complete them as the series is posted this week:

In this next step, I’ve got some data sitting in an Oracle database that I’d like to use to enhance the data in the Hadoop cluster; as a reminder of how I’m looking to move data through the system, we’re currently at step number three of the process.


The data in the Oracle tables contains categories that I can join to the page entries that have come through the ETL process, and the join is pretty simple: POST_ID = POST_ID. But how do I get access to the Oracle data? I need to register the Oracle source in the ODI topology and create a model to represent the table, but can I then just join that table to the Hive table and load the results into another Hive table? Let’s try.

The mapping that would use this approach has the two Oracle tables joining to the Hive table and another Hive table as the target, with the join condition defined across both sources.


The Execution view shows how ODI intends to run the transformation, with LKM SQL to SQL KMs used to load the Oracle tables and stage their data in Hive, and then IKM Hive Control Append used to load the joined dataset into another Hive table.


If you try and execute the mapping though, it fails – because the LKM SQL to SQL KM can’t work with Hive tables yet.


In fact, the various IKMs that come with the Oracle Data Integrator Application Adapter for Hadoop (IKM File to Hive, IKM Hive Control Append and so on) are a bit of a mix of IKMs and LKMs, in that they contain both extraction and integration code in the same KM, and none of the regular LKMs and IKMs will otherwise work with Hadoop sources. What we need to do with ODI at this point is land the Oracle data in Hive first, then do the join – which raises the second question: how do we do that?

Currently, the only way ODI can get Oracle data into Hadoop is via IKM File to Hive (LOAD DATA), which involves the unnecessary extra steps of exporting the Oracle data to a file and then loading that file into HDFS and Hive. What we can do instead, though, is use Sqoop, a Hadoop tool for extracting data from, and loading data into, relational databases, something I covered on the blog a few weeks ago. Sqoop creates data loading and unloading jobs that run in parallel on the Hadoop cluster, and can use native JDBC drivers or additional plugins such as OraOop to make the process run faster (though Oracle Loader for Hadoop is considered the fastest way to unload to Oracle, if you’ve licensed the Big Data Connectors).

The only problem is that there’s no official support for Sqoop in ODI, so no KMs that make use of it. What you can do, though, is create a command-line script to run Sqoop and include that in an ODI procedure, which is what I’ll now do to bring my Oracle data into Hive. To do this, I create an ODI procedure and add a single task, using the Operating System (command-shell) technology type, and use it to tell Sqoop to create a Hive table based on a SQL SELECT statement against my Oracle database:


(Note that, in practice, I had to format the sqoop command as a single line to get it to run; a sketch of the kind of command involved is below.)
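
The exact command was specific to my environment, but a Sqoop free-form query import along these lines gives the general shape – the connection string, credentials, and table and column names here are all hypothetical:

sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username blog_refdata --password welcome1 \
  --query 'SELECT p.post_id, c.category_name FROM posts p JOIN post_categories c ON p.post_id = c.post_id WHERE $CONDITIONS' \
  --target-dir /user/oracle/post_categories \
  --split-by p.post_id \
  --hive-import --hive-table post_categories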

Executing this procedure works OK, and thereafter I’ve got a single Hive table containing the joined dataset from Oracle.


Joining this new Hive table to the previous one containing the distinct set of page views is then fairly straightforward, but something I’d also like to do is stop any entries going into the rest of the ETL process where the calling IP address is a test one we use, “63.73.199.69”. The way I can do this is to use the CKM Hive knowledge module and put a constraint on the hostname column in the table I’ll be loading from the join, so that I can then use ODI’s flow control feature to divert those rows to an error table.


I also need to define a primary key for this table, something that’s mandatory when flow control is used. So let’s put the mapping together, joining the table I just brought in from Sqoop with the latest version of the weblog entries Hive table, loading into the Hive table I’ve just enabled the constraint for:


and then I enable flow control, and the CKM Hive check knowledge module, in the Physical mapping settings.


This is of course one of the benefits of using ODI to do your Hadoop data loading – you’ve got access to the data quality and error handling features that come with the tool. Then, when I execute the mapping and check with the Operator navigator, I can see the error handling process running, and afterwards in Hue I can see the contents of the new error table, which now contains those log entries where my test IP address was used, removing them from the target Hive table where they’d ordinarily have gone.


So that’s the third step in the ODI BDA ETL process complete. The next one’s a bit trickier though – I need to geocode the entries in the log table, assigning country names to each row based on where the IP address is located. More tomorrow.

The BI Survey 14 – Have Your Voice Heard!

Long-term readers of this blog will know that we’ve supported for many years the BI Survey, an independent survey of BI tool customers and implementors. Rittman Mead have no financial or other interest in the BI Survey or its organisers, but we like the way it gathers detailed data on which tools work best and when, and it’s been a useful set of data for companies such as Oracle when they prioritise their investment in tools such as OBIEE, Essbase and the BI Applications.

Here’s the invite text and link to the survey:

“We would like to invite you to participate in The BI Survey 14, the world’s largest annual survey of business intelligence (BI) users.

To take part in this year’s survey, visit: https://digiumenterprise.com/answer/?link=1906-PHB5RT7V

As a participant, you will:

  • Receive a summary of the results from the full survey
  • Be entered into a draw to win one of ten $50 Amazon vouchers
  • Ensure that your experiences are included in the final analyses

BARC’s annual survey gathers input from thousands of organizations to analyze their buying decisions, implementation cycles and the benefits they achieve from using BI software.

The BI Survey 14 is strictly vendor-independent: It is not sponsored by any vendor and the results are analyzed and published independently.

You will be asked to answer questions on your usage of a BI product from any vendor. Your answers will be used anonymously and your personal details will not be passed on to software vendors or other third parties.

Business and technical users, as well as vendors and consultants, are all encouraged to participate.

The BI Survey 14 should take about 20 minutes to complete. For further information, please contact Jevgeni Vitsenko at BARC (jvitsenko@barc.de). 

Click below to take part in The BI Survey 14: https://digiumenterprise.com/answer/?link=1906-PHB5RT7V

Rittman Mead Featured in Oracle In-Memory Option Launch

Today saw the official launch of the Oracle Database In-Memory Option, with Larry Ellison going through the product features and then reading out quotes and testimonials from beta testers. Rittman Mead were part of the beta testing program, with several of our team testing out various scenarios where we ETL’d into it, used it with OBIEE, and worked out what would be involved in “in-memory-enabling” some of our customers’ BI systems.

In fact, as we said in our quote for the launch, enabling Oracle Database for in-memory analysis was almost “boringly simple” – just enable the option, choose your tables, drop any OLTP indexes and you’re ready to go.
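
To give a flavour of what “enable the option and choose your tables” looks like in practice, here’s a minimal sketch – the memory sizing and table name are purely illustrative, not taken from our beta work:

-- reserve space for the in-memory column store (a static parameter, so a restart is needed)
ALTER SYSTEM SET inmemory_size = 4G SCOPE=SPFILE;

-- mark a table for population into the column store...
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY HIGH;

-- ...or take it back out again
ALTER TABLE sales NO INMEMORY;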


Of course, in practice you’ll need to think about which tables you put into memory if RAM is limited, in some scenarios TimesTen might still be a better option, and you’ll need to test your particular system and carefully consider whether to keep particular indexes or materialised views; but we’re really excited about the In-Memory Option for Oracle Database, as it’s got the potential to significantly improve query response times for users – and from what we’ve seen so far, it “just works”.

We’re still in the NDA period whilst beta testing goes on, but you can read more about the In-Memory Option on the Oracle website, and in the blog post I wrote when the feature was announced at last year’s Openworld. Once it goes GA, look out for some in-depth articles on the blog about how it works, and details of how we’ll be able to help customers take advantage of this significant new Oracle Database feature.