
Using the Go-URL in OAS

The OBIEE Go-URL functionality is widely used, and its deprecation in OAS is causing issues for companies wishing to upgrade to OAS.

This is a feature of OBIEE that allows a report to be accessed directly from a URL, enabling links from other applications - often with optional parameters passed through to filter the report - and usually bypassing the login screen by including user credentials within the URL parameters.  

Typically a single user-id is used for Go-URL access to OBIEE reports, avoiding the need either to enter or detect user details, or to provide login credentials for users who would not log in to OBIEE directly.

With the first release of OAS, Oracle have announced that the Go-URL functionality is being deprecated and may be removed in a future release. Also, since OBIEE 12.2.1.3, the advent of the combined BI security login screen for both Analytics and DV has meant the NQUser and NQPassword parameters no longer work, so it is not possible to specify a user for the Go-URL and hence it cannot log in.

There are currently two scenarios under which the Go-URL does work in OAS:

  1. If full Single Sign-on to OAS has been implemented - via Oracle Access Manager or Apache, currently the only recognised methods for SSO in OAS - then the Go-URL will work under the user-id that SSO signs in under. In other words, the user credential parameters in the URL are ignored, the user logs in automatically under their credentials and the requested report displays as normal.

    However, this is often not appropriate - if users are clicking on Go-URL links outside the scope of the SSO environment, for instance connecting via the internet or from other networks within the company LAN where SSO cannot operate. Also, if reports are designed to work under a specific user, perhaps with special security rules, a user's own security levels may result in different data appearing in a report.

  2. Disable the lightweight SSO option in OAS. This separates the security for Analytics and DV. For Analytics this restores the classic login screen, and the NQUser and NQPassword parameters on the Go-URL work correctly, but it causes issues for DV, making it virtually unusable. This is only really an option if you do not want to use DV in your implementation of OAS (a sketch of the WLST command to disable lightweight SSO follows below).
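
For reference, the lightweight SSO toggle is a WLST command; a sketch of disabling it, assuming the domain path used later in this post (stop OAS first, and adjust the path to your environment):

cd /u01/oas55/product/oracle_common/common/bin
./wlst.sh
wls:/offline> disableBISingleSignOn('/u01/oas55/config/domains/bi');
wls:/offline> exit();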

There is however a third option we have discovered which uses the Single Sign-On approach, but without the need to actually detect user credentials via an SSO service. Instead a specific user-id can be provided, just like the NQUser parameter in the Go URL.

Instead of the Go URL passing the NQUser and NQPassword parameters, Apache can be used to re-direct access to OAS, providing the user-id to login with during the re-direct.  The effect is that the Go URL will still display the same report output whilst logging in with the same user-id, albeit supplied by Apache rather than the NQUser parameter.

This works by using the Single-Sign-On configuration between Apache and OAS.

In a normal SSO configuration, Apache would authenticate users via a standard SSO service, for example Kerberos, and once the user has passed authentication, redirect the URL to OAS, providing the user credentials as request headers so that OAS can automatically log the user in, bypassing the login screen.

The Oracle document ID 2707401.1 explains this process in detail.

We can however ignore the SSO authentication within Apache and instead hard-code the user-id we want to log in to OAS with. Apache will still send this user-id in the request header to OAS, which in turn will recognise it as a pre-authenticated user and log the session in. In the case of a Go-URL, the rest of the URL, which points to a report or dashboard, will continue to process as normal, displaying it to the user.

The user is totally unaware of the difference - they click on a link and see a report in OAS just as they did in OBIEE.

Just as with the Go-URL, there are security implications to this. The user-id and password are no longer sent in plain text in the URL, which is an improvement, but calling the URL will automatically log you into OAS. If you just use the core URL of http://oas-server:port/analytics, you will be logged in to OAS and end up at the home screen, without any user/password challenge.

It is important therefore to restrict the permissions and privileges of this hard-coded user so it can only access the reports/data it needs to and cannot create new content, access other subject areas etc.  Using the Apache configuration, OAS can be tied down so that just the /analytics URL can be opened for this, so no access can be given to /dv, /xmlpserver or any other area of OAS unless explicitly required.

By using Apache to control access to OAS you create a separate port to call OAS on for the Go-URL. This means the port you installed OAS with (e.g. 9502/9503) remains the main port for logging into OAS via the login screen - users manually connecting to OAS for Analytics, DV and all the other functionality should continue to use this port. These connections will not go via Apache.

The Go-URL will use the port provided by Apache - in the instructions below I’ve set this up as port 9603 - and Apache will re-direct the user seamlessly to OAS’s main port.   Apache can also be configured to use SSL certificates, and if installed on the same server as OAS, the same certificates as OAS.  In this example I’m not using SSL, but the instructions for doing so are included.

Assuming you already have OAS installed, below are the steps to install and configure Apache to act as the “SSO” front-end for the Go-URL. The instructions were carried out on an Oracle Enterprise Linux 7.8 server - other versions of Linux, or Windows, will be slightly different.

Start by installing Apache HTTP Server

As the root user, use yum to download and install Apache and the required SSL and SSO plug-ins:

yum -y install httpd mod_ssl mod_auth_kerb mod_auth_token

Then enable and start Apache:

systemctl enable httpd.service
systemctl start httpd.service
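
To confirm the service started cleanly, you can check its status:

systemctl status httpd.service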

Next, enable Apache to connect to the network and allow access on port 9603. In this case I’ve installed policycoreutils-python to get the semanage command, as I have SELinux enabled.

yum -y install policycoreutils-python 
/usr/sbin/setsebool -P httpd_can_network_connect 1 
semanage port -a -t http_port_t -p tcp 9603 
semanage port -l | grep http

The final command above confirms port 9603 is now available:
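
The relevant line of the output should look something like this (the exact port list varies by system - the important thing is that 9603 now appears against http_port_t):

http_port_t                    tcp      9603, 80, 81, 443, 488, 8008, 8009, 8443, 9000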

Next open the firewall port for 9603:

firewall-cmd --zone=public --add-port=9603/tcp --permanent
service firewalld stop
service firewalld start
systemctl restart httpd.service

Now Apache is installed, we can configure it. Edit the file /etc/httpd/conf/httpd.conf

Set the Listen port and, if SSL is required, add the SSL virtual host properties with the correct server, port and SSL certificate file values (I’ve included the SSL virtual host in httpd.conf, but it could reside in its usual place in ssl.conf instead):

Listen 9603 
<VirtualHost oasvm3.local.com:9603> 
ErrorLog logs/ssl_error_log 
TransferLog logs/ssl_access_log 
ServerName oasvm3.local.com 
SSLEngine on 
SSLProxyEngine on 
SSLCertificateFile /u01/oas55/ssl/certificate.crt 
SSLCertificateKeyFile /u01/oas55/ssl/certificate.key 
SSLCertificateChainFile /u01/oas55/ssl/certificate_chain.pem 
SSLCACertificateFile /u01/oas55/ssl/ca_certificate.crt
SSLProxyCACertificateFile /u01/oas55/ssl/ca_certificate.crt 
RequestHeader set WL-Proxy-SSL "true" 
RequestHeader set IS_SSL "ssl" 
RewriteEngine On 
RewriteOptions Inherit 
ProxyPreserveHost On 
</VirtualHost>

Save and close, then restart Apache:

systemctl restart httpd.service

Then try the Apache URL in a browser: http://oasvm3.local.com:9603 (or https:// if using SSL).

If using SSL, check for the padlock icon in the address bar.
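
A quick way to check from the command line is curl (using the example hostname from this post; add -k to skip certificate validation if using a self-signed certificate):

curl -I http://oasvm3.local.com:9603/

Any HTTP response code - typically 403 for the default Apache welcome page - confirms Apache is listening on the new port.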

Configure Apache for OAS

Re-open the httpd.conf file and locate the line:

#ServerName www.example.com:80

Below this, add the following lines - these are typical recommended settings for OAS:

###### Various default settings ###### 
# 
# Timeout: The number of seconds before receives and sends time out. 
# 
Timeout 6000 
# 
# KeepAlive: Whether or not to allow persistent connections (more than 
# one request per connection). Set to "Off" to deactivate. 
# 
KeepAlive On 
# 
# MaxKeepAliveRequests: The maximum number of requests to allow 
# during a persistent connection. Set to 0 to allow an unlimited amount. 
# We recommend you leave this number high, for maximum performance. 
# 
MaxKeepAliveRequests 0 
# 
# KeepAliveTimeout: Number of seconds to wait for the next request from
# the same client on the same connection. 
# 
KeepAliveTimeout 60 
#
#####################################

Now add the following lines to the end of the file, altering the OAS URLs appropriately. The ProxyPass URLs must be the original SSL or Non-SSL OAS URLs.

Define GoUser reportuser 

RewriteEngine On
ProxyPreserveHost On 

# Protected Resources 
<Location "/analytics"> 
ProxyPass "http://oasvm3.local.com:9502/analytics" 
ProxyPassReverse "/analytics" 
#SSLRequireSSL 
RequestHeader unset Authorization 
RequestHeader set OAM_REMOTE_USER ${GoUser} 
RequestHeader set iv-user ${GoUser} 
RequestHeader set SM_USER ${GoUser} 
</Location> 

ProxyErrorOverride Off 

# Unprotected Resources 
<Location "/bi-security-login"> 
ProxyPass "https//oasvm3.local.com:9502/bi-security-login"
ProxyPassReverse "/bi-security-login" 
</Location> 

The first line defines the user which will login to OAS, in this case I’ve created a user called ‘reportuser’. This is then substituted into the request headers below.

Ideally this user should exist in the provider used for authentication in Weblogic, so that any security groups can be picked up to control this user's access within OAS.

Note that SSLRequireSSL is commented out; the hash must be removed if SSL is required.

Save and close, then restart Apache:

systemctl restart httpd.service

Configure Weblogic to accept SSO calls from Apache

To enable Weblogic to recognise SSO tokens from Apache we use the OAM Identity Asserter.

Login to the Weblogic admin console and navigate to Security Realms → myrealm → Providers, then click Lock & Edit.

Click New. Enter a name and choose OAMIdentityAsserter as the type.

Click OK to add the provider.

Click on the new OAM Asserter provider to edit it. Change the Control Flag to REQUIRED & click Save. The remaining default settings are fine.

Return to the providers screen, click Reorder and move the OAM Asserter so it is second in the list, below the main user directory - in this case PaulsAD is the main Active Directory authenticator.

This next step isn’t strictly necessary - the Weblogic Plug-in is only required to allow access to Data Visualizer, so if you have no intention of allowing access to DV via Apache, this step can be skipped.

The Weblogic Plug-in needs enabling in three locations.

From the Domain Structure window click on bi

Go to the Web Applications tab

Scroll down to the WebLogic Plugin Enabled option and check it, then click Save.

From the Domain Structure window expand Environment and click on Servers. Click on bi_server1 in the Configuration tab.

Scroll down in the General tab, click on the Advanced link, then locate the WebLogic Plug-In Enabled option and set it to ‘yes’. Click Save.

From the Domain Structure window click on Clusters, then click on bi_cluster.

Again, in the General tab click on Advanced, then set the WebLogic Plug-In Enabled option to ‘yes’. Click Save.

All changes are now complete. Click Activate Changes.

A full stop and restart of Weblogic & OAS is required under the oracle user:

/u01/oas55/config/domains/bi/bitools/bin/stop.sh    
/u01/oas55/config/domains/bi/bitools/bin/start.sh

Enable Lightweight SSO

For this method of accessing OAS to work, the internal lightweight SSO must be enabled. It will be enabled by default, but if it has been disabled, for example to make the Go-URL work, then re-enable it:

Stop OAS, open the WLST command line:

cd /u01/oas55/product/oracle_common/common/bin
./wlst.sh

Re-enable SSO (alter the domain path to suit your environment)

wls:/offline> enableBISingleSignOn('/u01/oas55/config/domains/bi','/bi-security-login/logout?redirect=/dv'); 
wls:/offline> exit();

Then restart OAS to reflect the changes.

The Apache configuration is now complete and you should be able to login to OAS on the Apache URL, e.g. http://oasvm3.local.com:9603/analytics

The SSO configuration should automatically log you in as the user defined in the Apache configuration above.

The original OAS URL remains on port 9503 and will take you to the normal login screen.

Note that when you login via Apache you can’t sign out! You can click the sign-out link, but the SSO process will simply log you back in again rather than displaying the login screen. To login normally, close the browser and reopen it with the standard OAS URL.

Apache Log files

If you have any issues with the above, there are five log files for Apache you can use to investigate what's going on.

/var/log/audit/audit.log Contains SELinux issues
/var/log/messages Contains general activity messages
/var/log/httpd/error_log Contains errors generated by Apache, including some SSL and Kerberos messages
/var/log/httpd/ssl_error_log Contains SSL errors generated by Apache
/var/log/httpd/ssl_access_log Contains messages related to users connecting over SSL

The logging level in the error_log file is governed by the LogLevel setting in /etc/httpd/conf/httpd.conf. This is normally set to ‘warn’, but can be set to ‘debug’ to display detailed messages.
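
For example, to capture detailed messages while troubleshooting, set:

LogLevel debug

then restart Apache with systemctl restart httpd.service (and remember to set it back to ‘warn’ afterwards, as debug logging is very verbose).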

Testing the Go URL

With Apache/SSO now enabled, the Go URL can be used:

http://oasvm3.local.com:9603/analytics/saw.dll?Go&Path=%2Fshared%2FPipeline QuickAnalytics%2FPipeline%2FSales Pipeline-Charts&locale=en-en&lang=en

Note that if the Go URL includes the NQUser and NQPassword parameters, they will be ignored.

Alternatively the Dashboard parameter can be used to take a user fully into OAS to view a dashboard:

http://oasvm3.local.com:9603/analytics/saw.dll?dashboard&PortalPath=%2Fshared%2FPipeline QuickAnalytics%2F_portal%2FSales Effectiveness

An important point here is that the user is now logged into OAS and can access other content. As mentioned earlier, the permissions and privileges of the user used for Go-URL access should be tightly controlled, to limit access to just what is required.

Accessing other areas of OAS

The instructions above only provide access to the core /analytics functionality via the Apache SSO URL. Other areas of OAS, such as /dv or /xmlpserver, are not enabled.

If these are required the location entries must be added to the httpd.conf file and Apache restarted.

The full list of locations can be found in the Oracle document mentioned earlier, and includes protected and unprotected locations (as some must be visible prior to logging into OAS, e.g. the bi-security-login URL). As an example, here is the location required to enable DV:

<Location "/dv"> 
ProxyPass "http://oasvm3.local.com:9502/dv" 
ProxyPassReverse "/dv" 
#SSLRequireSSL 
RequestHeader unset Authorization 
RequestHeader set OAM_REMOTE_USER ${GoUser} 
RequestHeader set iv-user ${GoUser} 
RequestHeader set SM_USER ${GoUser} 
</Location>

That's it - you're now ready to use the Go-URL via the Apache URL, without including the NQUser/NQPassword parameters.

This solution is not ideal and does have security implications you must take care of, although it does avoid the user/password appearing in the URL, which is a definite positive.

One downside of this approach is that you can’t use multiple user-ids - all the Go-URLs will connect with the same user. It is not possible (well, we’ve not found a way yet) to allow different Go-URLs to connect with different user-ids, as Apache will only allow one redirect to /analytics on the main listen port.

Enjoy!

Analysing Social Media Activity with ADW and OAC


Yesterday I wrote a series of tweets talking about my Twitter activity analysis with Oracle's Autonomous Database and Oracle Analytics Cloud. Since the tweets became popular and I didn't share the steps, I thought a blog post should follow, so... here we are!

Getting the Data from Twitter

There are many ways that you can get data from Twitter. For the sake of my exercise I requested a dump of my data from the Twitter website. You can request the same by following these instructions.

Once the data is ready, you'll receive an email or a notification and you'll be able to download the related zip file, named twitter-YYYY-MM-DD-hashed_string.zip.

Once unzipped you'll see two folders:

  • assets containing files you attached in your tweets (mostly images)
  • data, where the actually interesting information is.

Within the data folder I concentrated on the tweet.js file which contains the tweet information. There are other interesting files such as followers.js or like.js but I didn't include those datasets in my analysis as of now.

Warning: the tweet.js dataset contains only the tweets written by you. It doesn't contain tweets from people tagging you, or replies to your tweets.

The Your archive.html file allows you to browse the content of the folder from a web browser.


Importing Data into ADW with SQL Developer Web

The next step in my analysis is to import the data into the Autonomous Data Warehouse. For this task I used SQL Developer Web, available by default in ADW, which has a powerful utility to import JSON documents as rows in a relational table. Jeff Smith's post covers the process in detail.

Unfortunately, when trying to import the tweet.js file into ADW I encountered an error, due to the fact that the file itself is not pure JSON - pretty clear if you check the file itself:

window.YTD.tweet.part0 = [ {
  "tweet" : {
    "retweeted" : false,
    "source" : "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
    .....
    "lang" : "en"
  }
},
....
{
  "tweet" : {
    "retweeted" : false,
    "source" : "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
    .....
    "lang" : "en"
  }
 }
 ]

The first item to remove from our file is the window.YTD.tweet.part0 = prefix. I believe this is due to the pagination of the results, but it clearly screws up the JSON formatting.
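
As an illustration, the prefix can be stripped with a one-liner instead of a manual edit; a sketch using sed, which keeps a backup of the original file (the prefix sits on the first line only):

sed -i.bak 's/^window\.YTD\.tweet\.part0 = //' tweet.js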

Once removed, I could parse the tweet.js file with SQL Developer Web, but the proposed table definition had only one column, containing the whole TWEET JSON document.


At this point, I could either accept this and do further processing using Oracle's SQL JSON parsing functionality, or slightly change the shape of the file so it would be ingested correctly; I opted for the second.

When importing JSON documents into rows, SQL Developer Web analyses only the first level of attributes in the document itself, which in the case of our tweet.js file was something like:

{
  "tweet" : {
    ...
    }
},
{
  "tweet" : {
    ...
    }
},

The first level parsed by SQL Developer Web was correctly extracting only the tweet element and proposing a CLOB (JSON) column to store it. But I wanted the content of the tweet to be parsed, so I ended up removing the first layer by substituting any occurrence of },{ "tweet" : { in the file with a simple comma and removing the initial and final brackets.
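
For illustration, the same reshaping can be scripted; a rough sketch using perl, whose -0 switch slurps the whole file so the patterns can match across line breaks (the whitespace in the patterns may need adjusting to your particular export):

perl -0pe '
  s/\A\[\s*\{\s*"tweet"\s*:\s*\{/[ {/;        # drop the first "tweet" wrapper after the opening bracket
  s/\}\s*\},\s*\{\s*"tweet"\s*:\s*\{/} ,{/g;  # collapse the wrapper between consecutive tweets into a comma
  s/\}\s*\}\s*\]\s*\z/} ]/;                   # drop the final wrapper brace before the closing bracket
' tweet.js > tweet_flat.js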

The file now looks like the following

[ {
    "id" : "279908827007180800",
    "created_at" : "Sat Dec 15 11:20:39 +0000 2012",
    "full_text" : "Finally at home after #christmasparty... Looooong travel!",
    "lang" : "en"
    ...
  }
,{
    "id" : "276794944394498048",
    "created_at" : "Thu Dec 06 21:07:12 +0000 2012",
    "full_text" : "@mRainey will you be there next week too? Enjoy uk and visit #italy if you can!",
    "lang" : "en",
    ...
  }
 ...
 }]

We can now parse the file with SQL Developer Web, and the output correctly identifies all the first level entities in the JSON document.


A few more clicks and we have our table FRANCESCO.TWEET_DATA_SQLDEV populated automagically!


Note: The whole process above could be implemented and automated in several different ways; the aim of this blog post is only to demonstrate the feasibility of the analysis.

Text Tokenization with Oracle Text

The tokenization of the Tweet's FULL_TEXT column is done in the Oracle Database; you can see the full process described in the following video produced by the Oracle Analytics PM team.

If you prefer a short version of it, here it is: I basically created the following index

CREATE INDEX "FRANCESCO"."TWITTER_TWEET_IDX" ON "FRANCESCO"."TWEET_DATA_SQLDEV" ("FULL_TEXT") INDEXTYPE IS "CTXSYS"."CONTEXT";

The INDEXTYPE IS "CTXSYS"."CONTEXT" creates an index using Oracle Text. A more accurate description of the procedure can be found in OracleBase's post.

Once the index is created, we can see some new tables appearing with names of the form DR$INDEX_NAME$LETTER, with

  • DR$ being a fixed prefix
  • INDEX_NAME being the name of the index
  • LETTER being a single letter among I, K, N, U and R, whose meanings can be found in this Document

For the purpose of our analysis, we'll focus on the DR$TWITTER_TWEET_IDX$I table which contains the tokens of our FULL_TEXT column.
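
For a quick sanity check, the $I table can be queried directly - TOKEN_TEXT and TOKEN_COUNT are standard columns of this table - for example, to list the ten most frequent tokens:

SELECT token_text, token_count
FROM   dr$twitter_tweet_idx$i
ORDER  BY token_count DESC
FETCH FIRST 10 ROWS ONLY;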

But a token by itself is not very useful; we need to match the token with the Tweet's ID to be able to provide meaningful analysis. Again, this is covered nicely by another video created by the Oracle Analytics PM team.

In order to associate the token with the original Tweet, we can again use the power of Oracle Text and the index created above, with the following query:

SELECT
    full_text,
    score(1) AS text_score,
    token_text,
    a.id
FROM
    tweet_data_sqldev a,
    (
        SELECT DISTINCT
            token_text
        FROM
            dr$twitter_tweet_idx$i
    )
WHERE
    contains(a.FULL_TEXT, '/' || token_text, 1) > 0

Again for more info about Oracle Text's CONTAINS function please refer to the relevant documentation.

I physicalized the output of the above query in a table (TWEET_TOKENS) which contains the TOKEN_TEXT together with the Tweet's ID, so we can now join this table with the original one containing the list of Tweets in Oracle Analytics Cloud.
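
The physicalization itself is just the query above wrapped in a CREATE TABLE ... AS SELECT; a sketch (the column list is my choice for this analysis - adjust as needed):

CREATE TABLE tweet_tokens AS
SELECT
    a.id,
    token_text,
    score(1) AS text_score
FROM
    tweet_data_sqldev a,
    (
        SELECT DISTINCT
            token_text
        FROM
            dr$twitter_tweet_idx$i
    )
WHERE
    contains(a.full_text, '/' || token_text, 1) > 0;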

Note: One of the next versions of Oracle Analytics Cloud will provide tokenization as a step of a Data Flow within the Database Analytics options! You'll be able to tokenize your strings without leaving OAC.


Analysis of the Data in Oracle Analytics Cloud

If you're a frequent blog reader, this is probably the easiest part. I just had to:

  • Create a connection to my Autonomous DataWarehouse by using the wallet file.
  • Create the two datasources: one for TWEET_DATA_SQLDEV and another for TWEET_TOKENS
  • Create a project and include both Datasources

Once in the project, the first visualization is about the most "liked" tweet... no surprise, it's when I became an Oracle Ace Director, back in 2019, during the KScope keynote!


This first visualization is OK, but it isn't really using any of the Oracle Text capabilities exposed in the tokens... so my next analysis was...

Which words do I use more often when tweeting?

Easy... with the two datasources created above!

How does it change when I reply to people?

Well, you can see that words about OUG sessions and Oracle are still there, but there is the emerging topic of Food!

Who do I interact most with?

I plotted the # of Replies to specific Users…
No wonder (at least to me) that I get the most interactions with people that tag me with Italian food abominations…

And... how do I reply to them?

Again, very easy with Oracle Analytics Cloud Trellis Options.
You can spot that food is the major discussion topic with HeliFromFinland and dw_pete, while my chats with @connor_mc_d and @stewartbryson cover a wider range of topics.

What about Nephentur?

A final mention goes to my interactions with Nephentur: it's clear he loves OUGs, airports, flights and hotels… all driven by #TheBergEffect


Hope you liked the story - it's just an example of what you can do with a dataset you own and the tools available in the Oracle Cloud!

OA Summit 2020: OA Roadmap Summary


If you are in the Oracle Analytics space, the OA Summit is a great source of content: from keynotes and customer stories to partner deep dives, the website is a collection of stories about Oracle Analytics. We were part of the initial episode, talking about how you can start your journey to Data Science with Oracle Analytics!

In Tuesday's session, Gabby Rubin, VP of Product Management, and Jacques Vigeant, Senior Director of Product Strategy, shared a series of insights about the product roadmap, which we'll cover in this post.

Public Roadmap & IdeaLab

This is quite big news in the Oracle space: there is now a clear vision of what's coming in the product, accessible to everyone on a public website.

OA Roadmap and IdeaLab

The public roadmap also works well in conjunction with IdeaLab: a place where everyone in the OA community can suggest new product features and upvote, downvote or comment on other people's ideas. These ideas are reviewed by Product Managers and, if considered valid, included in future releases of the product!

Converged Analytics

The state of the art in the Analytics space offers a neat separation between self-service tools and centralized IT-governed ones. Even for Oracle Analytics, we have two separate approaches as of now: self-service data preparation via Data Visualization vs enterprise IT-driven via RPD.


What was announced at the OA Summit is that the two approaches will converge: there will be the option to self-create a multi-table, multi-source federated dataset which can be shared. This approach empowers the end-user and works on top of the usual self-service data-source definition workflow, enabling, for each entity in the diagram, ML-based transformations and enrichment recommendations as well as caching settings.

Empowering the end-user also means enabling best software development practices like versioning, certification and promotion capabilities on top of the asset created. The multi-table federated dataset created by end-users will seamlessly transition into enterprise-level semantic models without IT intervention.


From the enterprise side, as announced during OOW19, we'll see a new web-based semantic modeler which will replace the good old Admin Tool. Unlike the current "Lite Data Modeler", the new product will enable the same level of modelling complexity we can find in today's RPDs, and thus will be compatible with every OAC/OAS prebuilt repository. The new web-based semantic modeler is not just a replacement for the Windows-based Admin Tool: it also offers native source control integration with Git and built-in options for data lineage exploration.


As mentioned, the new tool is web-based; if you're an expert RPD developer worried about a different development methodology slowing down the build process, well, there is big news for you! You will be able to edit repositories via the new JSON-based Semantic Modeling Markup Language! Yep, you'll be able to define all the layers of an RPD in JSON syntax, even outside the web-based semantic modeler itself. This opens a huge variety of opportunities for source control, CI/CD pipelines, and automatic (coded) builds of RPD artifacts.

Oracle Analytics as Enterprise Keystone

As briefly described in my Data Virtualization blog post, Oracle Analytics can (and should) be considered the analytics keystone in the enterprise: the convergence of IT-driven and self-service metadata models can already be exposed via ODBC in OAS, enabling downstream applications to access the data while inheriting the security settings defined in the repository. The OA team is working to offer the equivalent in JDBC format for all Oracle Analytics products!


Oracle Analytics is also increasing the number of source-system connectors available: we'll soon have the option to connect directly to Oracle EPM Cloud from Data Visualization, and similar connectors are arriving for JDBC sources and Google BigQuery.


Machine Learning in OA

Augmented Analytics and Machine Learning capabilities have existed for a long time in Oracle Analytics. An important new improvement in this area will enable Data Flows to use Machine Learning models created outside OA via the usual GUI.


Data Scientists will be able to use their favourite environment and toolset to analyse the data and build models. Models built in external systems like Oracle Database, Oracle Data Science Services or 3rd-party services can then be registered within OA and used to score data in Data Flows, making collaboration between Data Scientists and Data Analysts much easier and increasing the ubiquity of ML in enterprises.

Prediction explainability will also be possible directly in OA, with the details of each model exposed appropriately depending on the model type.


In addition to the above, OA will also allow the use of advanced database analytics like sampling, un-pivoting, clustering, anomaly detection or tokenization, with more to come. This allows the use of existing functions (avoiding reinventing the wheel) that can run securely on massive amounts of data in the Oracle Database.

Data Visualization New Features

There are also features which have already shipped in the latest release or will be available soon. Some examples are:

  • Adaptive content: we can enable content scrolling in Data Visualization projects, allowing proper spacing of otherwise crowded visualizations
  • Canvas Filter control: a prompt that can filter only a subset of analysis registered to it (similar to the Master-details concept in Analysis)
  • OAC Embedding Framework: allows OA capabilities to be integrated into external web applications
  • Digitalized Map Layers: create an infographic on top of any image, all directly in OAC

OA on Mobile

What about consuming OA content on mobile devices? Data Visualization content is adaptive - the visualization size and appearance change depending on screen type - so all created content can simply be accessed via a mobile browser. Oracle is also investing in Day by Day, which acts as a personal proactive data assistant and now enables threshold-based notifications, with more visualization types coming later in the year.


The new announcement regarding mobile is the new Oracle Analytics Mobile app, which replaces Oracle BI Mobile HD and will provide a way to use, search and collaborate on curated content exposed by OA, with an experience in line with modern apps.


What's more, all the Oracle Analytics apps will enable collaboration: several people will be able to access and comment on visualizations and data, avoiding the need to switch to other tools like email.

A whole new wave of capabilities is in the Oracle Analytics roadmap - if you have any doubts or questions, feel free to reach out to us!

Data Virtualization: What is it About?


The fast growth of companies' data, data sources and data formats is driving an increasing interest in the field of Data Virtualization; what is it about? And what tools can provide this functionality?

Data Virtualization defines an approach to expose data coming from disparate data sources via a common interface to downstream applications.


Do you have your sales data in an Oracle Database and client-specific information in some cloud app? Data Virtualization will show them as a single data source while, in the backend, firing the proper queries to the source systems, retrieving the data and applying the correct joining conditions. Data Virtualization abstracts the technical aspects of the datasources from the consumer. Unlike the ETL approach, there is no data copy or movement: source systems are accessed in real-time when the query is executed (also called Query in Place (QIP)).

To be clear: Data Virtualization is NOT a replacement for ETL; both paradigms are valid in specific use-cases. Data Virtualization is perfect for scenarios where limited amounts of data coming from various systems need to be joined and exposed. On the other hand, when massive amounts of data need to be parsed, transformed and joined, and data-retrieval speed is key, then an ETL (or ELT) approach is still the way to go.

Data Virtualization Components?

So far we have looked at the theory and goals of Data Virtualization, but what are the main components of such a tool? We can summarize them in four main points on our source-to-user path: Data Access Optimization, Business Logic, Data Exposure and Security.


What should each layer do? Let's see them in detail.

Data Access Optimization


In this layer, the connections to data-sources are defined, tuning parameters set and the data-points of interest determined. This layer should be responsible for optimizing the query pushdown to source systems in order to retrieve the minimal row-set that needs to be displayed or joined with other datasources.

The Data Access Optimization layer should also be able to handle different datasources with different capabilities (e.g. aggregation functions) and, in case pushdown is not possible, perform transformations after data load. Finally, to ease the stress on the source system, caching options should be available in this layer.

This layer should also dictate the methodology of datapoints access, defining (with the help of the Security layer) if a certain column can be read-only or also written.

Business Logic


The Business Logic Layer should be responsible for translating data-points into company-defined metrics or attributes. Datapoints can come from various data-sources, so it should also contain the definitions of joining conditions and data federations.

Metrics and attributes need to have aggregations and hierarchies defined in order to be used by downstream applications. Hierarchies can also help with vertical data federation, defining exactly the granularity of each datapoint, thus enabling query optimization when more aggregated datasources are available.

The Business Logic Layer is also responsible for decoupling business logic from datasources: when a change in a datasource happens (e.g. the creation of a datamart or a database vendor change) the metrics and attributes layout will not be altered; all the changes happen in the Data Access Optimization layer and are hidden behind the Business Logic Layer definitions.

Data Exposure


The Data Exposure Layer is the one facing downstream applications or users; in this layer, metrics and attributes should be organized in business-driven structures and made accessible via various methods like web interfaces, ODBC, SOAP or REST.

Datapoints defined at this layer should carry business descriptions and be searchable via a data catalog, allowing the re-use of pre-built content.

Security


Exposing data to downstream applications needs to happen securely, thus datapoint access rules need to be defined. Data Virtualization systems should integrate with the company's Identity management tools to identify and dictate who can access any particular datapoint.

Security should not only work as ON/OFF, but also allow access to subsets of data based on privileges: e.g. a country manager should only see data from his/her own country.

Security is not only about defining boundaries but also about auditing the correct access to data (GDPR?). Data virtualization systems should allow security evaluations or provide pre-built security checks.

Data Virtualization in Oracle

As you might have gathered, Data Virtualization is a new, hot and growing topic; so what does Oracle have to offer, given there isn't an official Oracle Data Virtualization tool? Well, check the picture below:


This is an Oracle Analytics repository (RPD)! A tool created in 1998, acquired by Oracle in 2007 and with a long history of successful implementations! The reality is that Data Virtualization is not a new topic (we talked about it in 2008), but simply an old capability, with a new name!

Oracle Analytics RPD's three-layer concept matches exactly the Data Access Optimization, Business Logic and Data Exposure layers mentioned above, providing the same functionality. Security is a key component across the whole Oracle Analytics platform and successfully integrates with all major identity providers.

Oracle Analytics for Data Virtualization

Oracle Analytics is a tool born for data analytics, but it successfully solves the problem of Data Virtualization: data from different source systems can be queried, joined and exposed.

OA is capable of firing queries based on fields selected by the downstream application and optimized for the datasource selected. Query pushdown is enabled by default and specific datasource features can be turned ON/OFF via configuration files. Specific transformations can happen in OA's memory if the datasource doesn't allow the pushdown. Results of queries can be cached and served for faster response.


In scenarios where massive amounts of data need to be sourced, Oracle Big Data SQL or Cloud SQL could be used, pushing part of the virtualization down to the database level.

Vertical and horizontal Data Federation can be defined, so data can span across various tables and aggregated datasources can be used for faster response. Metrics, attributes, hierarchies and joins are defined in the model thus related complexity is hidden from downstream applications.

Data exposed can be accessed via a web browser using the traditional Answers and Dashboard or the innovative Data Visualization. Data can also be extracted via SOAP APIs and via ODBC when the related component is exposed. Data Sources defined in Data Visualization and BI Publisher can also be extracted via REST APIs. There is a plan to extend JDBC access also to RPD defined Subject Areas.

REST APIs

A utility, called Metadata Dictionary, automatically generates a Data Catalog which can be used to expose and share what datapoints are available. In the future, Oracle Analytics Cloud datapoints will also be available via Oracle Data Catalog, a specific offering around this area.

The security included with the tool allows datapoint access definition to specific roles as well as limits on data exports size or access times. Security settings can be exported and audited via external tools. The platform usage can also easily be monitored for performance and security reasons.

Self-service Data Virtualization


Everything we have discussed so far describes IT-driven Data Virtualization, where access to data is managed by a restricted group of people (usually the IT department). Oracle Analytics enables this alongside self-service Data Virtualization, where new datasources, joins and security layers can be defined directly by business users, enabling a faster yet secure data-sharing process which can still be audited and controlled.

Do you want to expose the business unit's forecast data from your Excel file alongside the revenue coming from your datamart? With Oracle Analytics this is only a few clicks away, including the security options controlling the audience.

Oracle Analytics offers both top-down and bottom-up approaches to Data Virtualization at the same time! IT-driven, highly secured and controlled data-sources can coexist with user-defined ones. Joins and metadata models can be built from both sides and promoted to be visible to a wider audience, all within one environment that can be tightly secured, controlled and audited.

Conclusion

Data Virtualization is a hot topic indeed: with the ever-increasing number of different datasources in a company's portfolio, we'll see the rise of data abstraction layers. But don't look to shiny new tools to solve an old and common problem!

As described above, Oracle Analytics (Cloud or Server) can successfully be used in Data Virtualization contexts. Deployed in conjunction with Oracle Big Data SQL or Cloud SQL and Oracle Data Catalog, it offers an extremely compelling solution, with a set of tools and knowledge already available in the majority of companies.

Getting Smart View to work with OAC

I wanted to demonstrate what I thought would be a simple task to a client today; however, it turned out to be a little more complex than I first anticipated, so I thought I would publish the steps here. All I needed to do was get Smart View connecting to OAC.

Prerequisites

  1. Excel installed
  2. OAC up, running and available

Approach

  1. Download Smart View (see here) - for reference I downloaded version 11.1.2.5.910
  2. Double click to install and accept the defaults

Establishing a Connection

When I last did this (I'm not sure how long ago that was), I think the OBIEE extension was installed by default, so I put in the OAC connection string and then couldn't work out why I kept getting an error. Fortunately a colleague of mine pointed out that the OBIEE extension isn't actually installed by default any more, so that's what I needed to do next.

Install the Extension

  • Open Excel
  • Create a Blank Workbook
  • Select the Smart View menu item
  • Select Options

Go to Extensions

The OBIEE extension is no longer installed out of the box

You need to check Check for updates when Microsoft Office starts; then you should get a link above saying Check for Updates, New Installs and Uninstalls.

Select the OBIEE extension to install; once the installation is successful you should see the extension listed.

Now you can create a connection to OAC

Create a Connection

  • Select the Panel

  • Choose Private Connection, then Create new connection from the bottom of the screen

Choose OBIEE for the connection type

Enter the URL in the following format:

https://<your-oac-host>/analytics/jbips

My example was

https://dv1dfree-xyzabc-ld.analytics.ocp.oraclecloud.com/analytics/jbips

You'll be asked to log on.

You can then give the connection a name to save it as

Then you should see the Catalog displayed on the right

If you select a View you will get a number of options for how to display it; the simplest is Insert.

Once selected, this will load the data into the sheet.