Liberate your data

Intelligence is all about knowledge. This website is dedicated sharing expertise on Oracle BI. More »

 

Machine Learning and Spatial for FREE in the Oracle Database

Machine Learning and Spatial for FREE in the Oracle Database

Last week at UKOUG Techfest19 I spoke a lot about Machine Learning both with Oracle Analytics Cloud and more in depth in the Database with Oracle Machine Learning together with Charlie Berger, Oracle Senior Director of Product Management.

Machine Learning and Spatial for FREE in the Oracle Database

As mentioned several times in my previous blog posts, Oracle Analytics Cloud provides a set of tools helping Data Analysts start their path to Data Science. If, on the other hand, we're dealing with experienced Data Scientists and huge datasets, Oracle's proposal is to move Machine Learning where the data resides with Oracle Machine Learning. OML is an ecosystem of various options to perform ML with dedicated integration with Oracle Databases or Big Data appliances.

Machine Learning and Spatial for FREE in the Oracle Database

One of the most known branches is OML4SQL which provides the ability of doing proper data science directly in the database with PL/SQL calls! During the UKOUG TechFest19 talk Charlie Berger demoed it using a collaborative Notebook on top of an Autonomous Data Warehouse Cloud.

Machine Learning and Spatial for FREE in the Oracle Database

Both Oracle ADW and ATP include OML by default at no extra cost. This wasn't true for all the other database offerings in cloud or on-premises which required an additional option to be purchased (the Advanced Analytics one for on-premises deals). The separate license requirement was obviously something that limited the spread of this functionality, but, I'm happy to say that it's going away!

Oracle's blog post yesterday announced that:

As of December 5, 2019, the Machine Learning (formerly known as Advanced Analytics), Spatial and Graph features of Oracle Database may be used for development and deployment purposes with all on-prem editions and Oracle Cloud Database Services. See the Oracle Database Licensing Information Manual (pdf) for more details.

What this means is that both features are included for FREE within the Oracle Database License! Great news for both Machine Learning as well as Graph Databases fans! The following tweet from Dominic Giles (Master Product Manager for the Oracle DB) provides a nice summary of the licenses including the two options for the Oracle DB 19c.

But hey, this license change effects also older versions starting from the 12.2, the older one still in general support! So, no more excuses, perform Machine Learning where your data is: in the database with Oracle Machine Learning!

Rittman Mead at UKOUG TechFest 19

Rittman Mead at UKOUG TechFest 19

Like every year, December for the Oracle community means UKOUG! This time is special since the event, named TechFest19, will be hosted at The Grand Brighton Hotel, near our base office!

Rittman Mead at UKOUG TechFest 19

Let's talk about the important stuff first: we are organising a Techfest party featuring "The Chaps" on Monday 2nd Dec between 7 and 10:30 in our office at Platf9rm, Hove Town Hall. We'll be having few drinks and while enjoying the live music, if you are interested register yourself here!

Rittman Mead at UKOUG TechFest 19

Now on the main topic, the Techfest! Rittman Mead will be well represented this time with four talks:

Data the Missing Ingredient

Monday 2nd at 14:15, Location: Stage 4

Jon Mead, our CEO, will introduce you to the concepts of Data Management, Engineering and Governance, how they should be addressed across a wide range of projects and which Oracle tools are there to help you.

Become an Equilibrista: Find the Right Balance in the Analytics Tech Ecosystem

Tuesday 3rd at 11:00, Location: Stage 2

Rittman Mead at UKOUG TechFest 19

Do you want to understand how to avoid the "Excel Chaos" in your organization? Let me show you how you can balance Centralized and Self-service analytics, taking the best of both worlds but still using a unique tool: Oracle Analytics Cloud!

Is it Corked? Wine Machine Learning Predictions with OAC

Tuesday 3rd at 15:45, Location: Stage 2

Do you love Wine? And maybe you don't dislike Analytics? Then join me in understanding how Oracle Analytics Cloud can be used for data-science! We'll be analysing a Wine dataset and using OAC to create a predictive model scoring wine quality! If you are a business analyst looking to start your path into Machine Learning, this session is a kickstarter!

Rittman Mead at UKOUG TechFest 19

Picking a Good Wine for <$20 Using Oracle Autonomous, Machine Learning and Analytics Cloud

Monday 2nd at 09:00, Location: Stage 2

Using the same dataset as the talk above, Charlie Berger, Sr. Director of Product Management, will show you how to build a predictive model by performing Machine Learning directly within the Oracle Autonomous Datawarehouse, all accessed by a notebook interface. I'll then show you how we can integrate such model within OAC and show the main drivers as well as the model outcomes!

During the event, few of the Rittman Mead folks will be around. If you see us in sessions, around the conference or during our talks, we'd be pleased to speak with you about your projects and answer any questions you might have.

Timestamp Functions and Presentation Variables in Oracle Cloud Analytics

One of the most popular Rittman Mead blog posts over the last 10 years is Timestamps and Presentation Variables. As we are seeing more and more migrations to OAC, we decided to review and revise this post for the latest version of Oracle Cloud Analytics (OAC), 105.4.0-140 as of October 2019. Read more about the latest updates here.

--

One could say that creating a chart is not the most complex task in the world of Business Intelligence but we would argue that creating a meaningful report that perfectly illustrates the message hidden in data and therefore adds value to the management is nowhere close to being easy!    A good way to make a report as informative as possible is to use trends and comparison. And to do so, a perfect tool would be the time analysis functions. For example comparing sales in a period of time this year to the same period of time the year before. Or measure the similarity or dissimilarity of sales in different months of the year.

Demo Platform

I have used a free trial instance of OAC for this demo. If you haven’t done yet, sign up for a free 30-day trial Oracle Cloud account (different to an Oracle account). Use the account to access the Oracle Cloud Infrastructure (OCI) console which is the latest Oracle movement towards having one integrated cloud platform to manage all your Oracle cloud applications, platforms, and infrastructure in one place.
From the OCI console it is 5 to 10 minutes before your free trial instance of OAC is up and running. For the detailed step by step of creating a new instance read here.

Demo Goals

In this blog post I intend to show you how to combine the power of timestamp functions and presentation variables to create robust, repeatable reports. We will create a report that displays a year over year analysis for any rolling number of periods, by week or month, from any date in time, all determined by the user. This entire demo will only use values from a date and a revenue field.


TIMESTAMP Functions

TIMESTAMPADD() manipulates data of the data types DATE and DATETIME based on a calendar year.

Syntax: TIMESTAMPADD(interval, expr, timestamp)
Example: TIMESTAMPADD(SQL_TSI_MONTH, 12,Time."Order Date")
Description: Adds a specified number of intervals to a timestamp, and returns a single timestamp.
Timestamp Interval (TSI) Options: SQL_TSI_SECOND, SQL_TSI_MINUTE, SQL_TSI_HOUR, SQL_TSI_DAY, SQL_TSI_WEEK, SQL_TSI_MONTH, SQL_TSI_QUARTER, SQL_TSI_YEAR

Read more about other calendar functions.

Building Filters

Starting to build our demo, the filter below returns all dates greater than or equal to 7 days ago including the current date.

In other words we have now a functional filter to select all the rows where Date >= a week ago.

As a good practice, always include a second filter giving an upper limit to the time filter. For example "Periods"."Day Date" < CURRENT_DATE would confirm that there won’t be any records that you don’t want in the mix and therefore no unnecessary strain on the system.

Let’s go one step further, instead of going 7 days back, we could try and include all the previous days in the current month or in other words dates >= the first day of the month. In this scenario, we can use the DAYOFMONTH() function to get the calendar day of any date. From here it will be easy to calculate the number of days in the month so far. Our new filter would look like this:

For example, if today is October 16th, DAYOFMONTH(CURRENT_DATE) would equal 16. Thus, we would subtract 16 days from CURRENT_DATE to go back to September 30th, and adding one will give us October 1st.

Presentation Variables

A presentation variable is a variable that can be created from the front end, the Analytics as part of one of the following types of dashboard prompts:

  • Column prompt, Associated with a column and the values that it can take come from the column values. For information on working with column prompts, see Creating a Column Prompt.
  • Variable prompt, Not associated with any column, and you define the values that it can take. For information on working with variable prompts, see Creating a Variable Prompt.

Each time a user selects a value in the column or variable prompt, the value of the presentation variable is set to the value that the user selects and will then be sent to any references of that filter throughout the dashboard page. This could be filters, formulas and even text boxes.

The first presentation variable we could introduce is to replace the CURRENT_DATE with a prompted value. Let’s call this presentation variable pv_Date,

  • Use the syntax @{pv_Date} to call this variable in the reports.
  • For variables of type string, surround the name in single quotes: ‘@{pv_String]’
  • It is good practice to assign a default value to the presentation variables so that you can work with your report before publishing it to a dashboard. For example the default value for the pv_Date is CURRENT_DATE so the new syntax would be @{pv_Date}{CURRENT_DATE}

Demo Time!

Our updated filter after replacing the CURRENT_DATE looks like below. Will will refer to this filter later as Filter 1 (F1).

The filter is starting to take shape. Now let's say we are going to always be looking at a date range of six months before the selected date. All we would need to do is create a nested TIMESTAMP function. To do this, we will “wrap” our current TIMESTAMP with another that will subtract six months:

Now we have a filter to select dates that are greater than or equal to the first day of the month of any given date and all the six months prior to that.


To take this one step further, we can create another presentation variable called  pv_n to allow the users to determine the amount of months to include in this analysis from a dashboard prompt.


Here is the updated version of our filter using the number of periods presentation variable and a default value of 6, @{pv_n}{6}. We will refer to the following filter as Filter 2 (F2).

Our TIMESTAMPADD function is now fairly robust and will give us any date greater than or equal to the first day of the month from n months ago from any given date. Now we will see what we just created in action by creating date ranges to allow for a Year over Year analysis for any number of months. Consider the following filter set:

This may appear to be pretty intimidating at first but if we break it into parts we can start to understand its purpose. Notice we are using the exact same filters from before; Filter 1 and Filter 2.   What we have done here is filtered on two time periods, separated by the OR statement.

  • The first date range defines the period as being the most recent completed n months from any given prompted date value, using a presentation variable with a default of today. Dates in the current month have been removed from the set by Filter 1.
  • The second time period, after the OR statement, is the exact same as the first only it has been wrapped in another TIMESTAMP function subtracting a year, giving you the exact same time frame for the year prior.

This allows us to create a report that can run a year over year analysis for a rolling n month time frame determined by the user.

A note on nested TIMESTAMPS: you will always want to create nested TIMESTAMPS with the smallest interval first. Then you will wrap intervals as necessary. In this case our smallest increment is day, wrapped by month, wrapped by year.


Let’s Go Crazy

A more advanced trick, If you use real time or near real time reporting: using CURRENT_DATE may be how you want to proceed. Otherwise, instead of using today as your default date value, use yesterday’s date since most data are only as current as yesterday.  Using yesterday will be valuable especially when pulling reports on the first day of the month or year - you generally want the entire previous time period rather than the empty beginning of a new one.  So, to implement, wherever you have @{pDate}{CURRENT_DATE} replace it with @{pDate}{TIMESTAMPADD(SQL_TSI_DAY,-1,CURRENT_DATE)}

One more change on our filter to make it extra-flexible here is to use a new presentation variable to determine if you want to display year over year values, or by month, or by week. This can be done by inserting a variable into your SQL_TSI_MONTH and DAYOFMONTH statements; changing MONTH to SQL_TSI_@{pv_INT}{MONTH} and DAYOF@{pv_INT}{MONTH}, where pv_INT is the name of our variable.

Start by creating a dummy variable in your prompt to allow users to select either MONTH or WEEK.  You can try something like this: CASE MOD(DAY("Time"."Date"),2) WHEN 0 'WEEK' WHEN 1 THEN 'MONTH' END

The updated filter now look like this:

In order for our interaction between Month and Week to run smoothly we have to factor in one last consideration: if we are to take the date December 1st, 2019 and subtract one year we get December 1st, 2018.  However, if we take the first day of this week, Sunday December 15, 2019 and subtract one year we get Saturday December 15, 2014.  In our analysis this will cause an extra partial week to show up for prior years.  To get around this we will add a case statement determining if '@{pv_INT}{MONTH}' = 'Week' THEN subtract 52 weeks from the first of the week ELSE subtract 1 year from the first of the month. With this, our final filter set will look like this:


With the use of these filters and some creative dashboarding, you can construct a report that easily allows you to view a year over year analysis from any date in time for any number of periods either by month or by week.

Filtered by Week intervals;

The formula below will give you the value of period rolling to use in the analysis;

In this post, we created a cloud version of the amazing demo previously described by Brian Hall.  As demonstrated, Timestamp functions and their power have been within the interesting topics of the visualisation and reporting for as long as we at Rittman Mead remember and can still be used in the realm of the Oracle Cloud Services in a very similar way as the past.

Feel free to get in touch, let us know your reviews and comments.

OAC v105.4: Understanding Map Data Quality

OAC v105.4: Understanding Map Data Quality

Last week Oracle Analytics Cloud v105.4 was announced. One of the features particularly interested me since it reminded the story of an Italian couple willing to spend their honeymoon in the Australian Sydney and ending up in the same Sydney city but in Nova Scotia for a travel agency error. For the funny people out there: don't worry, it wasn't me!

The feature is "Maps ambiguous location matches" and I wanted to write a bit about it.

Btw OAC 105.4  includes a good set of new features like a unified Home page, the possibility to customize any DV font and more options for security and on-premises connections amongst others. For a full list of new features check out the related Oracle blog or videos.

Maps: a bit of History

Let's start with a bit of history. Maps have been around in OBIEE first and OAC later since a long time, in the earlier stages of my career I spent quite a lot of time writing HTML and Javascript to include map visualizations within OBIEE 10g. The basic tool was called Mapviewer and the knowledge & time required to create a custom clickable or drillable map was....huge!

With the raise of OBIEE 11g and 12c the Mapping capability became easier, a new "Map" visualization type was included in the Answers and all we had to do was to match the geographical reference coming from one of our Subject Areas (e.g. Country Name) with the related column containing the shape information (e.g. the Country Shape).

OAC v105.4: Understanding Map Data Quality

After doing so, we were able to plot our geographical information properly: adding multiple layers, drilling capabilities and tooltips was just a matter of few clicks.

OAC v105.4: Understanding Map Data Quality

The Secret Source: Good Maps and Data Quality

Perfect, you might think, we can easily use maps everywhere as soon as we have any type of geo-location data available in our dataset! Well, the reality in the old days wasn't like that, Oracle at the time provided some sample maps with a certain level of granularity and covering only some countries in detail. What if we wanted to display all the suburbs of Verona? Sadly that wasn't included so we were forced to either find a free source online or to purchase it from a Vendor.

The source of map shapes was only half of the problem to solve: we always need to create a join with a column coming from our Subject Area! Should we use the Zip Code? What about the Address? Is City name enough? The deeper we were going into the mapping details the more problems were arising.

A common problem (as we saw before about Sydney) was using the City name. How many cities are called the same? How many regions? Is the street name correct? Data quality was and still is crucial to provide accurate data and not only a nice but useless map view.

OAC and the Automatic Mapping Capability

Within OAC, DV offers the Automatic Mapping Capability, we only need to include in a Project a column containing a geographical reference (lat/long, country name etc), select "Map" as visualization type and the tool will choose the most appropriate mapping granularity that matches our dataset.

OAC v105.4: Understanding Map Data Quality

Great! This solves all our issues! Well... not all of them! The Automatic Mapping capability doesn't have all the possible maps in it, but we can always include new custom maps using the OAC Console if we need them.

OAC v105.4: Understanding Map Data Quality

So What's New in 105.4?

All the above was available way before the latest OAC release. The 105.4 adds the "Maps ambiguous location matches" feature, which means that every time we create a Map View, OAC will provide us with a Location Matches option

OAC v105.4: Understanding Map Data Quality

If we click this option OAC will provide as a simple window where we can see:

  • How many locations matched
  • How many locations have issues
  • What's the type of Issue?
OAC v105.4: Understanding Map Data Quality

The type of issue can be one between:

  • No Match in case OAC doesn't find any comparable geographical value
  • Multiple Matches  when there are multiple possible associations
  • Partial Matches when there is a match only to part of the content
OAC v105.4: Understanding Map Data Quality

We can then take this useful information and start a process of data cleaning to raise the quality of our data visualization.

Conclusion

Maps were and are a really important visualization available in OAC. The Maps ambiguous location matches feature provides a way to understand if our visualization is representative of our dataset. So, if you want to avoid spending your honeymoon in the wrong Sydney or if you just want to provide accurate maps on top of your dataset, use this feature available in OAC!

OOW19 Review: Oracle Analytics Deep Dive

OOW19 Review: Oracle Analytics Deep Dive

In my previous blog I outlined the global news regarding Oracle like the Always Free Tier, the new datacenter plan and the set of new tools for Data Science. Today's blog is dedicated to all the news announced regarding Oracle Analytics in any of the versions: Cloud, Server or Applications.

OOW19 Review: Oracle Analytics Deep Dive

Oracle Analytics Server

OAS is the long awaited replacement of OBIEE 12c on-premises and promises  functional parity with OAC. Current official ETA is Fiscal Year 2020 and it will be available to customers as a free upgrade. With OAS all customers still on-premises will experience the following benefits:

  • Almost 1-1 feature with OAC
  • Complete compatibility with OAC
  • Simplified cloud migration and better support for hybrid deployments

A related announcement for on-premises customers regards licensing: there is only a single license to purchase OAS which includes all features within it, no separate option for Mobile or Self-Service Data Visualization needed!

Oracle Analytics for Application

This represents the new incarnation of BIApps, completely redesigned specifically for Fusion Apps. As his predecessor, OAX (this is the acronym) it's a packaged, ready-to-use solution with pre-built ETLs and Analytics content like RPD, dashboards, analysis, KPIs. Under the covers uses Oracle Autonomous Data Warehouse and Oracle Data Integrator Cloud. OAX is also extendible, by bringing additional datasets in ADW and extending the semantic model and catalog.

Oracle Analytics Cloud

Several enhancements were announced, especially during Gabby Rubin's (VP of Oracle Analytics Product Management) Strategy & Roadmap Session. New features will be available in most of the areas of the tool, including the core of the centralized reporting: the RPD.

OOW19 Review: Oracle Analytics Deep Dive

Data Preparation

New options will be available in the Data Preparation/Enrichment phase such as:

  • Custom Enrichments based on pre-existing set of values. E.g. enriching PRODUCT_ID with fields coming from a standard Product dimension. This is an interesting idea to enable standardization of dimensions across reporting without forcing people to write SQL or to know where the standard information is coming from.
  • Force Enrichments/Masking: as Administrators, we could enforce some transformations like the credit card obfuscation of fields that may contain sensitive data.

Natural Language Generation

The Natural Language view is already present in the current version of OAC, there is a plan to enhance this visualization by adding more options in the settings panel for grouping and trending analysis.

OOW19 Review: Oracle Analytics Deep Dive

Spatial Analytics in OAC

A few weeks ago I wrote about Oracle Spatial Studio, a tool designed to provide advanced visual Spatial Analytics. This tool will remain and progress over time, OAC will not cover all the specific use-cases of Spatial Studio. However OAC will enhance its spatial capabilities, like:

  • Providing accurate information about row geo-location: e.g. how many rows were correctly located, how may errors and menus to fix value to location association.
  • Provide spatial functions in the front-end: an end-user will be easily able to calculate the distance between points in a map by writing a simple Logical SQL statement. This option will probably appear on the RPD first (check the twitter thread below)

As you can see, calculating the distance will be just a matter of having the correct dataset and writing a GeometryDistance function.

Connectivity and Security

One of OAC's missions is to become the Analytics Platform on top of any type of datasource. The plan in the future is to expand the list of connectors and security/configuration options like SSL or Kerberos. There is also a roadmap to extend the Data Gateway capabilities to query non-oracle databases.

Modeling capabilities

In OAC we were used to either the classic RPD approach or the self-service Data-Sets. The future reserves some news in both approaches:

  • A new cloud web-based Modeler with the objective of functional parity with the Admintool, so capable of handling more complex designs that the current light data-modeler. I believe this will be also an effort to adapt the RPD development process to the current standards of concurrent development, versioning and storage format.
  • A new Self Service Data Model solution to build light self service models allowing end-users to evolve datasets into proper models sharable and optimized for reporting.

I like the idea of allowing both top-down (centralized) as well as bottom-up (self-service) approach to data modeling. This provides clients the flexibility on the analytical approach while still allowing to enforce centralized rules  (e.g. unique source of truth) when needed.

Unified User Experience and Layout Customizations

As of now the old "Answers and Dashboards" and the new "Data Visualization Projects" were almost completely separated products with each one having its own home page and layout. In the next releases we'll see that the two worlds will start combining, with a unique home and a similar look and feel.

In other news, highly requested by end-users is the possibility of customize almost any option of the layout: from font type and size to colors of any object visible in a project.

Machine Learning Integration

As discussed in the previous OOW review post in the future OAC will be able to use models built in other tools like Oracle Machine Learning in the Autonomous Data Warehouse or Oracle Data Science. This provides an end-to-end Data Science story from Data Analyst to Data Scientist all with a simple, secure, highly configurable and performant toolset.

OOW19 Review: Oracle Analytics Deep Dive

As you can see a lot of news coming in various aspects of the tool, from on-premise functional parity, a new packaged solution for Fusion Apps and a lot of features enhancing OAC functionality and customization options.

What do you think? Is this the right direction? Do you feel there is something missing?