Rittman Mead Consulting

Rittman Mead consults, trains, and innovates within the world of Oracle Business Intelligence, data integration, and analytics.

Oracle Support Rewards

Thu, 2021-06-24 14:44

First of all, what is it?

Essentially, it’s a cashback scheme. For every dollar, pound, yen or euro you spend moving to OCI (Oracle Cloud Infrastructure), Oracle will give you 25% of that spend back as a reduction on your annual support bill.

It’s a great incentive for cloud adoption and has the potential to wipe out your support bill completely.

How does it work?

Here’s an example: you decide to move your on-premise data warehouse, ETL, and OBIEE systems to Oracle Cloud.

Your total annual support for all your remaining Oracle on-premise software is £100,000. Your annual OCI Universal Credits spend is £100,000. Oracle will give you 25% of that spend as Support Rewards, equating to £25,000.

You’ve just taken a quarter off your annual support bill. You’re a hero.

But wait!

You’re not finished yet: your E-Business Suite system is looking a bit dusty and ERP Cloud is the way forward, but getting the budget has been harder than expected.

Why not try again? The more you move, the higher the reward, right…

You could offset some of the cost of the move to ERP Cloud with the savings on your on-premise support costs.

Now you’ve wiped out the annual support bill completely. Legend territory!

Total Savings

The possibility of reducing your annual support spend by 100% is stunning, but in the drive for cloud adoption Oracle have gone further: Support Rewards run alongside the current Volume Discount offer, giving the potential of a 50% total saving.

Support Rewards with OCI Volume Discounts

Now, if, like me, this is causing you to get rather excited, feel free to get in contact. We can share our excitement and, if applicable, discuss some of the ways Rittman Mead can help you build your business case to modernise your product stack.

Categories: BI & Warehousing

Sql2Odi, Part Two: Translate a complex SELECT statement into an ODI Mapping

Wed, 2021-04-07 03:58

What we call a complex SQL SELECT statement really depends on the context. When talking about translating SQL queries into ODI Mappings, pretty much anything that goes beyond a trivial SELECT * FROM <a_single_source_table> can be called complex.

SQL statements are meant to be written by humans and to be understood and executed by RDBMS servers like Oracle Database. RDBMS servers benefit from a wealth of knowledge about the database we are querying and are willing to give us a lot of leeway in how we write those queries, to make it as easy as possible for us. Let me show you what I mean:

SELECT
    FIRST_NAME,
    AGE - '5' LIE_ABOUT_YOUR_AGE,
    REGION.*
FROM
    CUSTOMER
    INNER JOIN REGION ON "CUSTOMER_REGION_ID" = REGION.REGION_ID

We are selecting from two source tables, yet we have not bothered to specify source tables for columns (apart from one instance in the join condition). That is fine - the RDBMS server can fill that detail in for us by looking through all source tables, whilst also checking for column name duplicates. We can use numeric strings like '5' instead of proper numbers in our expressions, relying on the server to perform implicit conversion. And the * will always be substituted with a full list of columns from the source table(s).

All that makes it really convenient for us to write queries. But when it comes to parsing them, the convenience becomes a burden. However, despite lacking the knowledge that the RDBMS server possesses, we can still successfully parse and then generate an ODI Mapping for quite complex SELECT statements. Let us have a look at our Sql2Odi translator handling various challenges.

Rittman Mead's Sql2Odi Translator in Action

Let us start with the simplest of queries:

SELECT
  ORDER_ID,
  STATUS,
  ORDER_DATE
FROM
  ODI_DEMO.SRC_ORDERS

The result in ODI looks like this:

Sql2Odi has created an Expression, in which we have the list of selected columns. The columns are mapped to the target table by name (alternatively, they could be mapped by position). The target table is provided in the Sql2Odi metadata table along with the SELECT statement and other Mapping generation related configuration.

Can we replace the list of columns in the SELECT list with a *?

SELECT * FROM ODI_DEMO.SRC_ORDERS

The only difference from the previously generated Mapping is that the Expression now has a full list of source table columns. We could not get the list of those columns while parsing the statement but we can look them up from the source ODI Datastore when generating the mapping. Groovy!

Let us increase the complexity by adding a JOIN, a WHERE filter and an ORDER BY clause to the mix:

SELECT
  SRC_ORDERS.*
FROM
  ODI_DEMO.SRC_ORDERS
  LEFT JOIN ODI_DEMO.SRC_CUSTOMER CUST ON 
    SRC_ORDERS.CUST_ID = CUST.CUSTID
WHERE 
  CUST.AGE BETWEEN 20 AND 50
ORDER BY CUST.AGE

The Mapping looks more crowded now. Notice that we are selecting * from one source table only - again, that is not something that the parser alone can handle.

We are not using ODI Mapping Datasets - a design decision was made not to use them because of the way Sql2Odi handles subqueries.

Speaking of subqueries, let us give them a try - in the FROM clause you can source your data not only from tables but also from sub-SELECT statements or subqueries.

SELECT 
  LAST_NAME,
  FIRST_NAME,
  LAST_NAME || ' ' || FIRST_NAME AS FULL_NAME,
  AGE,
  COALESCE(LARGE_CITY.CITY, ALL_CITY.CITY) CITY,
  LARGE_CITY.POPULATION
FROM 
  ODI_DEMO.SRC_CUSTOMER CST
  INNER JOIN ODI_DEMO.SRC_CITY ALL_CITY ON ALL_CITY.CITY_ID = CST.CITY_ID  
  LEFT JOIN (
    SELECT 
      CITY_ID,
      UPPER(CITY) CITY,
      POPULATION
    FROM ODI_DEMO.SRC_CITY
    WHERE POPULATION > 750000
  ) LARGE_CITY ON LARGE_CITY.CITY_ID = CST.CITY_ID  
WHERE AGE BETWEEN 25 AND 45

As we can see, a sub-SELECT statement is handled the same way as a source table, the only difference being that we also get a WHERE Filter and an Expression that together give us the data set of the subquery. All Components representing the subquery are suffixed with a 3 or _3_1 in the Mapping.

Now let us try Aggregates.

SELECT 
  REGION,
  SUM(POPULATION) TOTAL_POPULATION,
  ROUND(MAX(SRC_CITY.POPULATION) / 1000000) BIGGEST_CITY_POPULATION_K,
  ROUND(MIN(SRC_CITY.POPULATION) / 1000000) SMALLEST_CITY_POPULATION_K
FROM 
  ODI_DEMO.SRC_CITY
  INNER JOIN ODI_DEMO.SRC_REGION ON SRC_CITY.REGION_ID = SRC_REGION.REGION_ID
WHERE
  CITY_ID > 20 AND 
  "SRC_CITY"."CITY_ID" < 1000 AND 
  ODI_DEMO.SRC_CITY.CITY_ID != 999 AND
  COUNTRY IN ('USA', 'France', 'Germany', 'Great Britain', 'Japan')
GROUP BY
  REGION
HAVING 
  SUM(POPULATION) > 10000 AND
  MIN(SRC_CITY.POPULATION) > 100 AND
  MAX("POPULATION") > 1000 AND
  AVG(ODI_DEMO.SRC_CITY.POPULATION) >= 5

This time, instead of an Expression we have an Aggregate. The parser has no problem handling the many different "styles" of column references provided in the HAVING clause - all of them are rewritten to be understood by ODI.

Now let us throw different Expressions at it, to see how well they are handled.

SELECT
  REG_COUNTRY.COUNTRY,
  REG_COUNTRY.LOC,
  REG_COUNTRY.NR_OF_EURO_REG,
 
  LAST_NAME,
  LAST_NAME AS SURNAME,
  FIRST_NAME,
  FIRST_NAME || ' ' || LAST_NAME FULL_NAME,
  'String concatenation' || ' ' || FIRST_NAME || ' demo.' CONCAT_STRING,
  UPPER(LAST_NAME) || ' in UpperCase' AS LAST_NAME_UPPER,
  SUBSTR(TRIM(UPPER('     Name: ' || LAST_NAME || ' ' || FIRST_NAME || '    ')), 2, 10) TEXT_FUNC,
  TRANSLATE(UPPER(LAST_NAME), 'AEIOU', 'XXXXX') X_LAST_NAME,
  LENGTH(FIRST_NAME || ' ' || LAST_NAME) FULL_NAME_LEN,
  10 +  LENGTH(FIRST_NAME || ' ' || LAST_NAME) FULL_NAME_LEN_10P,
  10 *  LENGTH(FIRST_NAME || ' ' || LAST_NAME) FULL_NAME_LEN_10T,
  INSTR(UPPER(LAST_NAME), 'MC') MC_IN_LAST,
  
  1 + 2 + 3 + 4 +5+6+7 SIMP_SUM,
  1+2-3*4/5+(6*7+8-9)/(1+2+3) SUM2,
  ROUND(1+2-3*4/5+(6*7+8-9)/(1+2+3)) SUM2_ROUND1,
  ROUND(1+2-3*4/5+(6*7+8-9)/(1+2+3), 2) SUM2_ROUND2,
  FLOOR(ROUND(1+2-3*4/5+(6*7+8-9), 2) / ROUND((1+2+3), 2)) SUM2_ROUND3, 
  
  SYSDATE DATE_NOW,
  SYSDATE AS CUR_DAT,
  1 + SYSDATE AS CURD_1,
  SYSDATE + 4 AS CURD_4,
  CURRENT_DATE AS CUR_ALT,
  ADD_MONTHS(SYSDATE, 1) CURD_1M,
  CURRENT_TIMESTAMP STAMP_NOW,
  LAST_DAY(SYSDATE) LAST_CURD,
  NEXT_DAY(LAST_DAY(SYSDATE), 2) LAST_NEXT_CURD,  
  TO_CHAR(LAST_DAY(ADD_MONTHS(SYSDATE, 1)), 'DD/MM/YYYY') CHAR_CURT,
  
  CASE
    WHEN REG_COUNTRY.COUNTRY = 'USA' THEN NULL
    WHEN REG_COUNTRY.COUNTRY = 'France' THEN ROUND(1+2-3*4/5+(6*7+8-9)/(1+2+3), 2)
    WHEN REG_COUNTRY.COUNTRY = 'Great Britain' THEN FLOOR(345.56)
    WHEN REG_COUNTRY.COUNTRY = 'Germany' THEN MONTHS_BETWEEN(SYSDATE, SYSDATE+1000)
    ELSE NULL
  END SIM_CASE_NUM,
  
  CASE
    WHEN REG_COUNTRY.COUNTRY = 'USA' THEN NULL
    WHEN REG_COUNTRY.COUNTRY = 'France' THEN NEXT_DAY(LAST_DAY(SYSDATE+5), 2)
    WHEN REG_COUNTRY.COUNTRY = 'Great Britain' THEN NEXT_DAY(LAST_DAY(SYSDATE+40), 2)
    ELSE NULL
  END SIM_CASE_DATE,  
  
  CASE
    WHEN REG_COUNTRY.COUNTRY = 'USA' THEN NULL
    WHEN REG_COUNTRY.COUNTRY = 'France' THEN UPPER(FIRST_NAME || ' ' || LAST_NAME)
    WHEN REG_COUNTRY.COUNTRY = 'Great Britain' THEN AGE || ' years of pain'
    ELSE NULL
  END SIM_CASE_CHAR,  
  
  CASE
    WHEN REG_COUNTRY.COUNTRY = 'USA' THEN NULL
    WHEN REG_COUNTRY.COUNTRY = 'France' THEN FIRST_NAME
    WHEN REG_COUNTRY.COUNTRY = 'Great Britain' THEN LAST_NAME
    ELSE NULL
  END SIM_CASE_CHARCOL,
  
  CASE
    WHEN REG_COUNTRY.COUNTRY = 'USA' THEN NULL
    WHEN REG_COUNTRY.COUNTRY = 'France' THEN AGE
    WHEN REG_COUNTRY.COUNTRY = 'Great Britain' THEN AGE
    ELSE NULL
  END SIM_CASE_NUMCOL,
  
  '123' * 10 IMPI_NUM1,
  123 * '10' IMPI_NUM2
  
FROM 
  ODI_DEMO.SRC_CUSTOMER
  INNER JOIN ODI_DEMO.SRC_CITY ON SRC_CITY.CITY_ID = SRC_CUSTOMER.CITY_ID
  INNER JOIN ODI_DEMO.SRC_REGION ON SRC_CITY.REGION_ID = SRC_REGION.REGION_ID
  INNER JOIN (  
      SELECT COUNTRY_ID, COUNTRY, 'Europe' LOC, COUNT(DISTINCT REGION_ID) NR_OF_EURO_REG FROM ODI_DEMO.SRC_REGION WHERE COUNTRY IN ('France','Great Britain','Germany') GROUP BY COUNTRY_ID, COUNTRY
      UNION
      SELECT DISTINCT COUNTRY_ID, COUNTRY, 'Non-Europe' LOC, 0 NR_OF_EURO_REG FROM ODI_DEMO.SRC_REGION WHERE COUNTRY IN ('USA','Australia','Japan')    
      ORDER BY NR_OF_EURO_REG
    ) REG_COUNTRY ON SRC_REGION.COUNTRY_ID = REG_COUNTRY.COUNTRY_ID
WHERE
  REG_COUNTRY.COUNTRY IN ('USA', 'France', 'Great Britain', 'Germany', 'Australia')
ORDER BY
  LOC, COUNTRY

Notice that, apart from parsing the different Expressions, Sql2Odi also resolves data types:

  • 1 + SYSDATE is correctly resolved as a DATE value whereas TO_CHAR(LAST_DAY(ADD_MONTHS(SYSDATE, 1)), 'DD/MM/YYYY') CHAR_CURT is recognised as a VARCHAR value - because of the TO_CHAR function;
  • LAST_NAME and FIRST_NAME are resolved as VARCHAR values because that is their type in the source table;
  • AGE || ' years of pain' is resolved as a VARCHAR despite AGE being a numeric value - because of the concatenation operator;
  • More challenging is data type resolution for CASE statements, but those are handled based on the datatypes we encounter in the THEN and ELSE parts of the statement.

Also notice that we have a UNION joiner for the two subqueries - that is translated into an ODI Set Component.

As we can see, Sql2Odi is capable of handling quite complex SELECT statements. Alas, that does not mean it can handle 100% of them - Oracle hierarchical queries, anything involving PIVOTs, the old Oracle (+) notation, the WITH statement - those are a few examples of constructs Sql2Odi, as of this writing, cannot yet handle.

Sql2Odi - what is under the hood?

Scala's Combinator Parsing library was used for lexical and syntactic analysis. We went with a context-free grammar definition for the SELECT statement, because our goal was never to establish if a SELECT statement is 100% valid - only the RDBMS server can do that. Hence we start with the assumption that the SELECT statement is valid. An invalid SELECT statement, depending on the nature of the error, may or may not result in a parsing error.

For example, the Expression ADD_MONTHS(CUSTOMER.FIRST_NAME, 3) is obviously wrong but our parser assumes that the FIRST_NAME column is a DATE value.

Data type recognition is also part of the parsing-translation process. In the example above, the parser recognises that the function being used returns a datetime value. Therefore it concludes that the whole expression, regardless of what the input to that function is - a column, a constant or another complex Expression - will always be a DATE value.
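To give a flavour of how this looks, below is a minimal, purely illustrative sketch using the scala-parser-combinators library. It is not the actual Sql2Odi grammar - the names, the tiny grammar subset and the type rules are my own assumptions - but it shows how a keyword grammar and a simple data type rule can be expressed as combinators:

import scala.util.parsing.combinator.JavaTokenParsers

// Purely illustrative - a tiny subset of a SELECT grammar, not the actual Sql2Odi grammar
sealed trait SqlType
case object NumType  extends SqlType
case object DateType extends SqlType
case object CharType extends SqlType

case class SelectItem(expression: String, alias: Option[String], dataType: SqlType)

object MiniSelectParser extends JavaTokenParsers {

  // Case-insensitive keyword parser
  private def kw(s: String): Parser[String] = ("(?i)" + java.util.regex.Pattern.quote(s)).r

  // A possibly qualified column reference, e.g. REGION.REGION_ID; its real data type
  // is unknown to the parser and has to be looked up from metadata later
  def columnRef: Parser[SelectItem] = rep1sep(ident, ".") ^^ { parts =>
    SelectItem(parts.mkString("."), None, CharType)
  }

  // SYSDATE [+/- n] always yields a DATE, regardless of the operands
  def dateExpr: Parser[SelectItem] = kw("SYSDATE") ~ opt(("+" | "-") ~ wholeNumber) ^^ { _ =>
    SelectItem("SYSDATE_EXPR", None, DateType)
  }

  // Optional alias, guarded so that the FROM keyword is not swallowed as an alias
  def alias: Parser[String] = opt(kw("AS")) ~> not(kw("FROM")) ~> ident

  def selectItem: Parser[SelectItem] = (dateExpr | columnRef) ~ opt(alias) ^^ {
    case item ~ a => item.copy(alias = a)
  }

  def select: Parser[List[SelectItem]] =
    kw("SELECT") ~> rep1sep(selectItem, ",") <~ kw("FROM") <~ rep1sep(ident, ".")

  def parse(sql: String) = parseAll(select, sql)
}

Calling MiniSelectParser.parse("SELECT SYSDATE + 4 CURD_4, REGION FROM ODI_DEMO.SRC_REGION") yields a SelectItem typed as a DATE for the SYSDATE expression and one whose type must come from metadata for REGION - the same principle, on a much smaller scale, as the type resolution described above.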

The output of the Translator is a structured data value containing definitions for ODI Mapping Components and their joins. I chose JSON format but XML would have done the trick as well.
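The actual schema is internal to Sql2Odi, but purely as an illustration (every name below is hypothetical), the output for a trivial mapping could look something along these lines:

{
  "mapping": "SRC_ORDERS_TO_STG_ORDERS",
  "components": [
    { "type": "DATASTORE",  "name": "SRC_ORDERS",   "datastore": "ODI_DEMO.SRC_ORDERS" },
    { "type": "EXPRESSION", "name": "EXPRESSION_1",
      "columns": [ { "name": "ORDER_ID", "expression": "SRC_ORDERS.ORDER_ID", "dataType": "NUMERIC" } ] },
    { "type": "DATASTORE",  "name": "STG_ORDERS",   "datastore": "ODI_DEMO.STG_ORDERS" }
  ],
  "connections": [
    { "from": "SRC_ORDERS",   "to": "EXPRESSION_1" },
    { "from": "EXPRESSION_1", "to": "STG_ORDERS", "mapColumnsBy": "NAME" }
  ]
}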

The ODI Mapping definitions are then read by a Groovy script from within ODI Studio and Mappings are generated one by one.

Mapping generation takes much longer than parsing. Parsing for a mapping is done in a split second whereas generating an ODI Mapping, depending on its size, can take a couple of seconds.

Conclusion

It is possible to convert SQL SELECT statements to ODI Mappings, even quite complex ones. This can make migrations from SQL-based legacy ETL tools to ODI much quicker and allows you to refactor an SQL-based ETL prototype to ODI without having to implement the same data extraction and transformation logic twice.

Categories: BI & Warehousing

Sql2Odi, Part One: Translate a simple SELECT statement into an ODI Mapping

Thu, 2021-03-25 06:30

We have developed a tool that translates complex SQL SELECT statements into ODI Mappings.

It was implemented using the Scala Parser Combinators library. We started by combining its Parser Generators into a context-free grammar to describe the Oracle SQL SELECT statement and… hang on, this blog post is not about impressing anyone with terminology. Let us start from the beginning, with the simple.

SQL-to-ODI translation - why bother?

ETL is most often prototyped with SQL statements. But once we have shown that the data transformations we are after work in SQL, we have to discard those SELECT statements and re-implement the same logic in ODI. Would it not be nice to refactor our SQL-based prototype into a first draft of ODI Mappings?

When migrating ETL logic from a legacy ETL system to ODI, it is likely that much of the legacy ETL logic will be presented in the form of a SELECT statement. If the number of those old mappings is in the tens or even hundreds, would we not welcome any opportunity to accelerate the migration process?

Bigger ETL implementations will typically have large sets of simple mappings following the same design template. If we want to replicate all tables in an OLTP Source schema to our Staging database, we could probably generate SELECT * FROM <source_table> statements for 100 source tables in an Excel spreadsheet in a minute. Creating that many ODI Mappings, no matter how simple, will take much longer than that. Do we not want to spend precious developer time on something more creative and challenging?

However, we do not always need a full-blown SQL parser and ODI Content generator for that. The attention-seeking introduction of this blog post is actually an extreme, complex example of ODI development acceleration. Simple ODI content generation can be done by simple means. There is no need for any form of SQL parsing if your mapping is based on a simple SELECT * FROM <source_table> statement. Now, for a moment let us forget about parsing and take a quick detour into the world of Groovy scripting, which is the first step towards generating ODI content.

Need to accelerate ODI development? Write a Groovy script!

I have been asked a couple of times about my favourite ODI feature. Without fail, I have always given the same reply - Groovy scripting!

In ODI Studio, from the main menu you navigate to Tools → Groovy, then choose New Script and write as much Groovy script as you please. It is that simple… sort of.

When scripting in Groovy, essentially the whole ODI SDK that ODI Studio itself is based on is at your disposal. So, in theory, everything you can do in ODI Studio, you can also script. You can design a Mapping, set up its Physical architecture (including IKM and LKM configuration), validate it and then create a scenario for it - all with a Groovy script. It is a great tool to accelerate simple, repetitive build and migration tasks. On the flip side, the ODI SDK public API documentation is not very developer (especially beginner developer) friendly. Fortunately, Google is quite good at helping you with the simpler questions you may have - Stack Overflow has quite a few ODI-related Groovy questions asked and answered. Also, if you like Rittman Mead as much as I do, ask them for the 2-day Advanced ODI 12c Bootcamp - you will not become an ODI scripting expert in two days but it will get you started.

It will never be quicker to script the creation of a mapping in Groovy than to create the same mapping in ODI Studio. However, if we are talking about many Mappings that are simple, hence easy to script, we can save a lot of time by involving Groovy.

We can create a script that generates mappings for all tables in a source schema to replicate them to our Staging schema. We can even add ETL Batch ID or Replication Date if required. But if we need more than that, we will either need to provide heaps of metadata to base the Mapping generation on, or need to better understand the SELECT statement. The SELECT statement will usually be the preferred option, because you can easily test it on the Source database.

Now, let us return to the original subject of SQL-to-ODI translation.

SQL-to-ODI translation - can it be simple?

Yes. If all your source extract SELECT statements come in the form of SELECT * FROM <source_table>, you could write a Regex expression to extract the table name, assuming that everything that comes after the FROM keyword is a single table name (i.e. no joins, no multiple tables). Then, if you can find a Data Store with the same name in an ODI Model, your script can add it into your Mapping and map the source columns to a Target table (mapping the columns by name or position). All that can be done with relative ease.

What if we add a filter to the SELECT statement? We can still parse it with Regex - everything between the FROM and WHERE keywords is a table name, everything after the WHERE keyword is a filter condition:

SELECT *
FROM <this is all table name>
WHERE <filter from here onwards>

That sounds simple enough and it is - sort of. You need to be careful when pinpointing the FROM and WHERE keywords - a simple text search will not do the trick. The two SELECT statements below are 100% valid, but:

  • in the first one the FROM is neither space- nor carriage return-separated from the SELECT expression (123),
  • the second one has two FROM keywords in it, where the first one is part of the column alias WHERE_FROM and the second one starts the FROM clause.
SELECT (123)FROM DUAL;
SELECT 'Milton Keynes' WHERE_FROM FROM DUAL;

Still, pinpointing the FROM and WHERE blocks is quite doable with a well-written Regex expression (assuming no unions or subqueries).
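As a rough illustration, here is a minimal sketch in Scala (the language Sql2Odi itself is written in) of such a regex, assuming exactly one source table, no joins, no unions and no subqueries:

object NaiveSelectParser {
  // Very naive: expects SELECT * FROM <table> [WHERE <filter>] and nothing else
  private val pattern =
    """(?is)\s*select\s+\*\s+from\s+([\w."]+)(?:\s+where\s+(.+?))?\s*;?\s*""".r

  def parse(sql: String): Option[(String, Option[String])] = sql match {
    case pattern(table, filter) => Some((table, Option(filter)))
    case _                      => None
  }
}

Running it against SELECT * FROM CUSTOMER WHERE AGE BETWEEN 25 AND 85 returns the table name CUSTOMER and the filter string AGE BETWEEN 25 AND 85; anything more exotic simply fails to match, which hints at the limits discussed next.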

Hopefully we do not need to pick the filter statement apart, because the ODI Filter Component accepts a single condition string.

Let us look at an example:

SELECT * FROM CUSTOMER
WHERE NAME IN ('John','Paul','Peter') AND SURNAME LIKE 'S%' AND AGE BETWEEN 25 AND 85

In the WHERE clause above, we have a compound filter condition, consisting of three basic conditions joined with the AND keyword. If we used the above in ODI, it would not be recognised as a valid ODI filter, because ODI requires column references to be in the form of TABLE_NAME.COLUMN_NAME (we could have an Expression, an Aggregate or another Mapping Component instead of a table name as well). Whereas the SQL statement above is perfectly valid, the filter requires adjustment before we can use it in ODI. Perhaps a text search/replace could do the trick, but how do we know what to search for? And if we search for NAME and replace it with CUSTOMER.NAME, we will corrupt SURNAME.

You could write a Regex parser for this very particular filter condition to extract the column names from it. But if the filter changes to … AND AGE >= 25 AND AGE <= 85, we will have to change or extend the Regex string accordingly. No matter how good with Regex you are, you may have to give up here.

We could ask for the SELECT statement to be rewritten in an ODI-friendly form instead of trying to transform the filter condition in our script.

In addition to the above, if we have expressions (column references, constants, functions, CASE statements, etc.) in the SELECT part of the query, can we extract those with a script so we can add them to an ODI Mapping's Expression Component?

SELECT
    123,
    "string, with, commas,,,",
    COALESCE(column1, ROUND(column2,2), column3, TRUNC(column4), column5),
    ...

When armed with just Regex, it will be very difficult. We cannot assume that any comma separates two expressions - many SQL functions have commas as part of their expression.

Transformation logic in the SELECT part of a query (i.e. anything beyond simple column references), joins, subqueries, unions - I would not try to attack that armed with Regex alone. In the context of SQL-to-ODI translation, I would deem such SELECT statements complex or non-trivial. That is where Combinator Parsing and more serious software engineering come into play. Let us discuss that in Part Two of this blog post series.

Categories: BI & Warehousing

Extending OAC with Map Backgrounds

Wed, 2021-02-03 08:50

One of the really powerful things in OAC is the simplicity with which you can create geographical insights using the Map visualisation. I say this as someone who has implemented maps in OBIEE several times and so know the challenges of MapBuilder and MapViewer all too well. It really is a case of chalk and cheese when you compare this to how easy things are in OAC! Well, things just got even better in OAC 5.9. With the introduction of Web Map Service (WMS) and Tiled Web Maps, we can now integrate custom backgrounds to overlay our layers onto. In this post, I will walk through the process of configuring WMS backgrounds and show how this may prove useful to OAC users with a use case based on openly available data sources.

Maps Background

Before we look into these new Map Backgrounds...it's worth taking a moment to offer some background on Maps in OAC. For those of you who know how things work, please skip ahead. For those who are new to mapping, or who have used Map visualisations in OAC but never stopped to consider how the data magically appears in your canvas, here comes a brief guide:

For your data to be rendered on a map, you need three things:

  • A Map Layer, which defines the geometry of your data. The map layer must be presented in geoJSON format, which defines a series of points or polygon boundaries (a minimal example follows this list). The map layer must also include a key, which needs to match the key values in your data set...this is how OAC is able to interpret your data as relating to a specific point or space on the map layer.
  • A Background Map, which is basically an image that sits behind the map layer and provides context to it. This would typically be an outline map of the world or it could have greater definition if that is helpful.
  • A Data Set that includes a key attribute which relates to its geography. This may be as simple as a country code (or name) or it could be some form of surrogate or business key. When you use this data set to create a Map visualisation, you can allow OAC to determine which map layer to use (based on a comparison of the key in the data set with the configured keys in the available map layers) or you can specify the default layer to be used by setting the Location Details when preparing the data set.
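As a rough illustration of the first point, a map layer geoJSON file is a FeatureCollection in which each feature carries the key in its properties and the point or polygon in its geometry (the key, name and coordinates below are made up):

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": { "CODE": "E06000001", "NAME": "Hartlepool" },
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-1.27, 54.72], [-1.18, 54.72], [-1.18, 54.65], [-1.27, 54.65], [-1.27, 54.72]]]
      }
    }
  ]
}

The CODE property here is what OAC matches against the key attribute in your data set.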

Both Background Maps and Map Layers are configured within the OAC Console. Out of the box, you get a set of useful backgrounds and layers provided, but you are also able to configure your own custom layers and backgrounds. Once set up, they become available for users to select when they create a Map in their DV Canvas.

There's quite a lot to unpack here, including how you can source your geoJSON map layers (either from open sources or via SQL if the geometry data resides in your database), how to add custom layers, how to assign default layers and how to manage the join quality to ensure your visualisation is presenting the right data. Too much to get into here, but I may return to the subject in future - let me know if this would be useful to any of you!

Back to the Map Backgrounds...

So, OAC provides you with three generic Background Maps out of the box:

  • Oracle BI - a clean, silhouetted but coarse grained background map which provides differentiation of country boundaries
  • Oracle Maps - a more detailed map which includes road networks, watercourses, national parks, place names etc.
  • OpenStreetMaps - an open source background with similar detail to the Oracle Maps background, but a slightly more neutral style

The key thing for me, here, is that the options range from coarse grained to extreme detail, with not much in between. Plus, plotting data against a map that only shows physical infrastructure elements may not be relevant for every use case as it can easily complicate the picture and detract from the user experience.

What if I have a data set as per the one shown above (which gives me population and population growth for major towns and cities in England and Wales) and I want to present it on a map, but in a way that will help me understand more than simply the geographical location? Well, the good news is that there are a growing number of open data map service resources that offer us a rich source for background maps. The better news is that from 5.9 onwards, we are now able to integrate these into our OAC instance and make them available as background maps. Specifically, we are able to integrate Web Map Services (WMS) and Tiled Web Maps.

WMS is a universal protocol introduced by the Open Geospatial Consortium and serves as a standard through which providers can make map details available and end user applications can request and consume the maps. The service consists of a number of different request types, among which the GetCapabilities request provides important details needed when integrating into OAC.  

Taking my scenario as an example, and using the openly available UK Air Information Resource, we will look at how we can use this new feature to help us better understand the effects of population growth on air quality.  

Understanding the GetCapabilities Request

As mentioned before, the Background Maps are configured in the OAC Console, under the Maps option. Navigate to the Backgrounds tab and select the Add Background option. As you can see, we have two new options available to us (Web Map Service and Tiled Web Map).

Selecting the Web Map Service option presents us with the Add Background dialog box, where we can configure the map integration. You can see below that we need to provide a name for the background (this is the name that end users will identify the background by in DV) and an optional description. We also have some options that we need to derive from the map service, namely:

  • URL - this is the URL for the server hosting the map service
  • Version - this specifies the WMS version in use
  • Coordinate Reference System - this is a standard value and cannot be modified
  • Layers - the WMS service may support multiple layers, and here we can specify the layer(s) that we want to use, based on the layer name.
  • Format - the type of image that will be rendered when using the background

All of this information can be derived using the map service's GetCapabilities request, which returns an XML document describing the definition of the service. Hopefully, your WMS provider will make the GetCapabilities document easily available, but if not (and you know the URL for the host server) then you can pass the request=GetCapabilities parameter for the same result.
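For example, against a hypothetical WMS host (the host and path below are placeholders; service, version and request are standard WMS query parameters), the request looks something like this:

https://<wms-host>/<wms-path>?service=WMS&version=1.3.0&request=GetCapabilities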

Once you have provided the required values, you must finally accept the trust statement (which will automatically add the host site to the Safe Domains list). After saving and a quick refresh, you should be able to inspect the new background and get a Preview. I've found it useful to do this up front, just to validate that the integration is working correctly.

In my scenario, the WMS provides discrete layers for every year from 2001 to 2018. As my data set shows population growth between 2009 and 2019 and I want to see if there is any relationship between population growth and the changes in air quality over time, I have created three separate background maps showing snapshots at 2008, 2013 and 2018.

Now...once we've completed this set up (and assuming we have already created any custom Map Layers and have our Data Set to hand), we have everything we need to begin building some insights.

Using the Map Backgrounds in DV

My hypothesis is that population increase will result in greater use of the surrounding road networks and there should, therefore, be an increase in recorded roadside particulate levels. Obviously, there are other offsetting variables, such as the change in vehicles (cleaner technology, electric cars etc.) and societal changes towards a greener environment. I'm interested to learn something about the balance of these factors.

To start with, I will use the data set to identify the range of population growth of each town/city by way of a simple scatter graph. I have assigned colour based on a growth classification and, as well as plotting the growth %, I am using size to represent the base population. To save space, I am plotting data on the X Axis rather than the Y Axis. The end result looks like this:

Next, I create a map using the key element (Code), the Name of the town/city and the population growth measure. The appropriate map layer is selected (as I have already specified the default Location Details for this data set) and it is displayed on the Oracle BI background.

Now, instead of using the Oracle BI background, I am going to switch this to use the WMS based background I have just created...a simple case of navigating to the Map properties tab and selecting the appropriate background from the list.

This doesn't look great when viewing the entire data set, but I am more interested in looking at individual locations. I also want to see how the situation has changed over time. I therefore create two duplicates of the map and change the background to reference the 2013 and 2018 snapshot maps respectively (which, remember, simply pick up different layers within the hosted WMS). I increase the transparency of the Map Layer to 80% (so that I can better focus on the background map) and I switch the Use as Filter setting on my Scatter graph (so that I can use that to focus in on specific locations of interest). The resulting canvas looks something like this:

But...What Does It All Mean?

I can definitely see a change in the roadside emissions, most markedly between 2013 and 2018, but I am missing something important. I have no idea what this change means. Without a legend, I am unable to say whether the analysis supports or contradicts my hypothesis...pretty but also pretty useless!

Luckily, the solution, once again, lies in the WMS service. Returning to the GetCapabilities definition, I can see that each layer includes a <LegendURL> tag which gives me the destination for the legend to the data. A quick check to make sure all layers use the same legend (thankfully, they do) and one extra visualisation later (this time, an image that references the URL specified in the <LegendURL> tag), and I get closer to understanding my data:
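For reference, the relevant fragment of a GetCapabilities document looks roughly like this (the layer name and URL below are placeholders; the element names follow the WMS standard):

<Layer>
  <Name>particulates_2018</Name>
  <Title>Roadside particulate levels 2018</Title>
  <Style>
    <Name>default</Name>
    <LegendURL width="110" height="264">
      <Format>image/png</Format>
      <OnlineResource xlink:href="https://wms-host/legends/particulates.png"/>
    </LegendURL>
  </Style>
</Layer>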

Quite pleasingly, it appears that, regardless of the rate of population growth or the location, there is a consistent reduction in recorded roadside particulate levels.

In Summary

At this stage, it is probably worth saying that technically, it would have been possible to perform this same type of analysis in previous versions of OAC, as it has always been possible to create and overlay multiple layers within your Map visualisation. However, in practice, this particular use case would be impossible to achieve. It would involve carrying the data and the geometries for every single point of every single road appearing in the background map. That would be a lot of data, and even assuming we had it all available to us, it would be difficult to manage and definitely much, much slower to process than calling out to the WMS service and rendering the response images. Not to mention that the geoJSON file for the map layer would likely breach the size restrictions! Using the WMS integration saves us all this pain and opens up many more opportunities.

In fact, I can see some really powerful use cases for this new feature, as it takes the map background away from being a passive component of the Map visualisation and creates opportunities for it to become an active part of the insight process, enriching and adding relevant context to your existing data sets. Hopefully the use case I've presented here sparks your imagination and opens up some new ways of viewing data spatially within OAC.

Footnote: The examples shown above contain public sector information licensed under the Open Government Licence v2.0

Categories: BI & Warehousing

Rittman Mead sponsoring RMOUG

Mon, 2021-02-01 05:58

After enjoying a successful 2020 in the North American data and analytics market, we’re excited to announce that we’re sponsoring Rocky Mountain Oracle Users Group (RMOUG) this year!

Running from 8th to 11th February, the agenda is jam-packed with content, including interactive workshops and technical sessions, all presented by local and international Oracle product managers, partners and end users. There’ll be plenty of content covering everything from the newest developments in Oracle Analytics to database migrations. You can catch our CEO, Jon Mead, at 11.30am (MST) on the 11th, where he’ll be exploring how metadata can help you to understand and enhance user engagement. Don’t miss it!


We’ll be hosting a couple of break-out sessions (bring your own coffee) so if you want to chat Oracle Analytics, AI, ML, databases or anything in between then please feel free to virtually drop by and have a chat with one of the team.

Register here: https://events.bizzabo.com/TD2021/page/1611456/registration

Categories: BI & Warehousing

OAC 5.9 - A Tour of the UX Enhancements

Tue, 2021-01-26 09:29

Now that we are nicely settled into 2021 (Happy New Year, by the way!), it's time to roll our sleeves up and take a look at the latest release from the Oracle Analytics team: Oracle Analytics Cloud 5.9. Whenever a new release arrives, there is often a clamour for the new functionality that it brings and whilst 5.9 introduces some interesting features that open up new opportunities for analytics users, this is not what caught my immediate attention.

I will take a look at the new features in due course, but in unpacking 5.9, I was first struck by the enhancements in user experience. UX - a tricky-to-pinpoint combination of efficiency, convenience, intuitiveness, comfort, productivity and emotion - is often the biggest driver of how well a system is adopted. So it is essential that these things don't get overlooked, and it's good to see that they have not been in 5.9. Even though they might seem trivial when considered in isolation, sometimes...

...it's the small things that make a big difference

If, like me, you've spent some time working with DV, you've probably spent a portion of this time watching the blue progress bar jump to 90% and then waiting for the processing to complete and the visualisation to render. You've also probably been caught out once or twice not noticing the blue bar and thinking that the visualisation in focus is displaying the refreshed results. This can be frustrating, so I was pleased to see that this has been updated in 5.9. The blue bar has been consigned to the dustbin and replaced by a spinning wheel which appears when you roll your mouse over a particular visualisation. More importantly, during refresh, the stale visualisation is no longer shown and the visualisation is only displayed when the refresh has completed. To me, this is a more elegant approach for a couple of reasons: firstly, it removes the trap described above - you can be confident that when you see a visualisation displayed, you are looking at the latest, freshest output. Secondly, if you have multiple visualisations on your canvas, your attention is actively drawn to each visualisation on completion, allowing you to focus in on what is immediately available, rather than waiting until all visualisations are refreshed. Confusion reduced, efficiency increased.

It's always important that the interface is as clean and simple as possible and, when space is at a premium, maximising the area for building and working with your visualisations (aka the canvas) is essential. This has been addressed in 5.9 by a really simple change: in the left-hand panel, the Data, Visualizations & Analytics tabs have been switched from a vertical layout to a horizontal layout, now appearing at the top, above the search bar. Add this to the simplified header (without the Oracle Analytics header and the Project header minimised) and the result is quite noticeable. Again, nothing mind-blowing, but the impact is significant, creating a much cleaner interface with much more of the real estate dedicated to the important stuff.

Elsewhere, there are two interesting enhancements as to how we can interact with the Canvas. The first involves sorting data within a visualisation. There is a minor cosmetic change to the way the sort options are presented, with the measures and attributes now being displayed in a nested menu structure rather than a single list. We are still able to perform nested attribute sort ordering or sorting on measure values, but the new menu structure is cleaner and more intuitive, as it also depicts the grammar assignment (colour, size, shape etc.) for each data element. More importantly, as you can see in the screenshot below, we are also now able to explicitly remove sort ordering by selecting None. This removes any default sorting applied to the data when it is added to the canvas.

The second revolves around filtering. You will be familiar with the way that canvas level filters are assigned at the top of the Canvas (via the filter bar), whilst it is also possible to apply filters discretely within any given visualisation (within the Filters assignment). These visualisation level filters get applied in addition to any canvas level filters. However, up until now, the filters needed to be created independently of each other, and if you wanted to switch from a canvas level filter to a visualisation level filter (or vice versa), frustratingly, you had to remove and recreate the filters. This has been addressed in 5.9, with filters now being interchangeable, simply by dragging and dropping a created filter from one area to the other. Another nice efficiency saving!

There's not too much to report in terms of new visualisations in 5.9, but we do see the addition of a new area chart option. Previously, we only had two options: Area (pictured below) and 100% Area. Now...the Area option was always confusing to me as it actually stacked measure values, making the Y axis a cumulative scale and not an absolute one. I always found this made it difficult to compare measure values and it was relatively easy to misinterpret the scale. This has been addressed through the addition of the new option (named Area). This allows us to plot our data on a single value scale (Y axis), showing the absolute values as opposed to the cumulative values stacked on top of each other. Conveniently, the areas are displayed with some transparency, as they are overlaid on top of each other and, as normal, these settings can be manipulated through the Properties menu. The old Area visualisation remains, but has been (appropriately) renamed as Stacked Area and the 100% Area option also remains unaffected.

One of the major strengths within DV is the ease with which data can be acquired and loaded, with preparation actions being defined and applied to the data on its journey into the system. The ability to clean source data has been made a little simpler in 5.9 with the addition of a TRIM function which allows you to eliminate any leading or trailing spaces from source data elements. Interestingly, I hit exactly this issue on my current project and, whilst it was simple enough for me to create a custom transformation, I can see how achieving the same result with a single mouse click will be very useful to many self-service users.

Now, as much as we rail against the thought of users moving data and insight outside of the platform, it is an inevitability that we've all come to accept. It's therefore interesting to see that the classic data download limits (i.e. Enterprise Analysis & Dashboards) have been increased for raw downloads (2m rows for 2-12 OCPUs or 4m rows for 16-52 OCPUs) and for formatted downloads (200k rows for 2-12 OCPUs or 400k rows for 16-52 OCPUs). There are also some nice improvements to the share/export feature in 5.9, with the outputs to Powerpoint, PDF and Image now being much cleaner. Where previously the formatting of the exported outputs was a little clunky and often required manual adjustment, in 5.9 the rendered output is much more closely aligned to what you see on your screen when working in DV. Aspect ratios are respected and the end result is all the better for it!

Finally (for now, at least), 5.9 brings us greatly increased flexibility over the ways we can present geographical insights, with the inclusion of Web Map Services and XYZ Tiled Web Maps as sources for your Background Maps. This allows you to configure map backgrounds based on integrations with openly available services. Configuration is a relatively simple process and, once configured, the new backgrounds are immediately available for your self-service user community to grab and create map layers over.

In summary, I can definitely see how the enhancements described above will make OAC 5.9 a cleaner, more intuitive platform allowing users to be that little bit more efficient in their work. When aggregated, I think the cumulative effect becomes quite measurable.

Next time, I'll look into these new background map integrations in more detail and also explore the other new features that arrive in 5.9, including some new data preparation actions, assistance in understanding machine learning models and more.

Categories: BI & Warehousing

Using the Go-URL in OAS

Tue, 2020-11-10 05:18

The OBIEE Go-URL functionality is widely used and its deprecation in OAS is causing issues for companies wishing to upgrade to OAS.

This is a feature of OBIEE that allows a report to be accessed directly from a URL, enabling links from other applications - often with optional parameters passed through to filter the report - and usually bypassing the login screen by including user credentials within the URL parameters.  

Typically a single user-id is used for Go-URL access to OBIEE reports thus avoiding the need to either enter/detect user details or to provide login credentials for users who would not login to OBIEE directly.

With the first release of OAS, Oracle have announced that the Go URL functionality is being deprecated and may be removed in a future release. Also, since OBIEE 12.2.1.3, the advent of the combined bi-security-login screen for both Analytics and DV has meant the NQUser and NQPassword parameters no longer work, so it is not possible to specify a user for the Go URL and hence it cannot log in.

There are currently two scenarios under which the Go-URL does work in OAS:

  1. If full Single Sign-on to OAS has been implemented - via Oracle Access Manager or Apache, currently the only recognised methods for SSO in OAS - then the Go-URL will work under the user-id that SSO signs in under. In other words, the user credential parameters in the URL are ignored, the user logs in automatically under their credentials and the requested report displays as normal.

    However, this is often not appropriate - if users are clicking on Go-URL links outside the scope of the SSO environment, for instance connecting via the internet or from other networks within the company LAN where SSO could not operate. Also, if reports are designed to work under a specific user, perhaps with special security rules, a user's own security levels may result in different data appearing in a report.

  2. Disable the lightweight SSO option in OAS. This separates the security for Analytics and DV. For Analytics this returns to the classic login screen and the NQUser and NQPassword parameters on the Go-URL work correctly, but this causes issues for DV making it virtually unusable. This is only really an option if you do not want to use DV in your implementation of OAS.

There is however a third option we have discovered which uses the Single Sign-On approach, but without the need to actually detect user credentials via an SSO service. Instead a specific user-id can be provided, just like the NQUser parameter in the Go URL.

Instead of the Go URL passing the NQUser and NQPassword parameters, Apache can be used to redirect access to OAS, providing the user-id to log in with during the redirect. The effect is that the Go URL will still display the same report output whilst logging in with the same user-id, albeit supplied by Apache rather than the NQUser parameter.

This works by using the Single-Sign-On configuration between Apache and OAS.

In a normal SSO configuration, Apache would authenticate users via a standard SSO service, for example, Kerberos, and once the user has passed authentication, redirect the URL to OAS, providing the user credentials as request-headers, so that OAS can automatically login the user in, bypassing the login screen.

The Oracle document ID 2707401.1 explains this process in detail.

We can, however, ignore the SSO authentication within Apache and instead hard-code the user-id we want to log in to OAS with. Apache will still send this user-id in the request-header to OAS, which in turn will recognise it as a pre-authenticated user and log the session in. In the case of a Go URL, the rest of the URL, which points to a report or dashboard, will continue to be processed as normal, displaying it to the user.

The user is totally unaware of the difference - they click on a link and see a report in OAS just as they did in OBIEE.

Just as with the Go URL, there are security implications of this. The user-id and password are no longer sent in plain text in the URL, which is an improvement, but calling the URL will automatically log you into OAS. If you just use the core URL of http://oas-server:port/analytics, you will be logged in to OAS and end up at the home screen, without any user/password challenge.

It is important therefore to restrict the permissions and privileges of this hard-coded user so it can only access the reports/data it needs to and cannot create new content, access other subject areas etc.  Using the Apache configuration, OAS can be tied down so that just the /analytics URL can be opened for this, so no access can be given to /dv, /xmlpserver or any other area of OAS unless explicitly required.

By using Apache to control access to OAS, you create a separate port on which to call OAS for the Go-URL. This means the port you installed OAS with (e.g. 9502/9503) will remain as the main port for logging into OAS via the login screen - users manually connecting to OAS for analytics, DV and all the other functionality should continue to use this port. These connections will not go via Apache.

The Go-URL will use the port provided by Apache - in the instructions below I’ve set this up as port 9603 - and Apache will re-direct the user seamlessly to OAS’s main port.   Apache can also be configured to use SSL certificates, and if installed on the same server as OAS, the same certificates as OAS.  In this example I’m not using SSL, but the instructions for doing so are included.

Assuming you already have OAS installed, below are the steps to install and configure Apache to act as the “SSO” front-end for the Go-URL. The instructions below were carried out on an Oracle Enterprise Linux 7.8 server - other versions of Linux or Windows will be slightly different.

Start by installing Apache HTTP Server

As the root user, use yum to download and install Apache and the required SSL and SSO plug-ins:

yum -y install httpd mod_ssl mod_auth_kerb mod_auth_token

Then enable and start Apache:

systemctl enable httpd.service
systemctl start httpd.service

Next enable Apache to connect to the network and allow access on port 9603. In this case I’ve installed  policycoreutils to get the semanage command as I have SELinux enabled.

yum -y install policycoreutils-python 
/usr/sbin/setsebool -P httpd_can_network_connect 1 
semanage port -a -t http_port_t -p tcp 9603 
semanage port -l | grep http

The final command above confirms port 9603 is now available:

Next open the firewall port for 9603:

firewall-cmd --zone=public --add-port=9603/tcp --permanent
service firewalld stop
service firewalld start
systemctl restart httpd.service

Now that Apache is installed, we can configure it. Edit the file /etc/httpd/conf/httpd.conf

Set the Listen port and, if SSL is required, add the SSL virtual host properties with the correct server, port and SSL certificate file values (I’ve included the SSL virtual host in httpd.conf, but it could reside in its usual place in ssl.conf instead).

Listen 9603 
<VirtualHost oasvm3.local.com:9603> 
ErrorLog logs/ssl_error_log 
TransferLog logs/ssl_access_log 
ServerName oasvm3.local.com 
SSLEngine on 
SSLProxyEngine on 
SSLCertificateFile /u01/oas55/ssl/certificate.crt 
SSLCertificateKeyFile /u01/oas55/ssl/certificate.key 
SSLCertificateChainFile /u01/oas55/ssl/certificate_chain.pem 
SSLCACertificateFile /u01/oas55/ssl/ca_certificate.crt
SSLProxyCACertificateFile /u01/oas55/ssl/ca_certificate.crt 
RequestHeader set WL-Proxy-SSL "true" 
RequestHeader set IS_SSL "ssl" 
RewriteEngine On 
RewriteOptions Inherit 
ProxyPreserveHost On 
</VirtualHost>

Save and close, then restart Apache:

systemctl restart httpd.service

Then try the Apache URL in a browser: http://oasvm3.local.com:9603 (or https for SSL).

If using SSL, check for the padlock icon in the address bar.

Configure Apache for OAS

Re-open the httpd.conf file and locate the line:

#ServerName www.example.com:80

Below this, add the following lines - these are typical recommended settings for OAS:

###### Various default settings ###### 
# 
# Timeout: The number of seconds before receives and sends time out. 
# 
Timeout 6000 
# 
# KeepAlive: Whether or not to allow persistent connections (more than 
# one request per connection). Set to "Off" to deactivate. 
# 
KeepAlive On 
# 
# MaxKeepAliveRequests: The maximum number of requests to allow 
# during a persistent connection. Set to 0 to allow an unlimited amount. 
# We recommend you leave this number high, for maximum performance. 
# 
MaxKeepAliveRequests 0 
# 
# KeepAliveTimeout: Number of seconds to wait for the next request from
# the same client on the same connection. 
# 
KeepAliveTimeout 60 
#
#####################################

Now add the following lines to the end of the file, altering the OAS URLs appropriately. The ProxyPass URLs must be the original SSL or Non-SSL OAS URLs.

Define GoUser reportuser 

RewriteEngine On 
ProxyPreserveHost On 

# Protected Resources 
<Location "/analytics"> 
ProxyPass "http://oasvm3.local.com:9502/analytics" 
ProxyPassReverse "/analytics" 
#SSLRequireSSL 
RequestHeader unset Authorization 
RequestHeader set OAM_REMOTE_USER ${GoUser} 
RequestHeader set iv-user ${GoUser} 
RequestHeader set SM_USER ${GoUser} 
</Location> 

ProxyErrorOverride Off 

# Unprotected Resources 
<Location "/bi-security-login"> 
ProxyPass "http://oasvm3.local.com:9502/bi-security-login"
ProxyPassReverse "/bi-security-login" 
</Location> 

The first line defines the user which will log in to OAS, in this case I’ve created a user called ‘reportuser’. This is then substituted into the request headers below.

Ideally this user should exist in the provider used for authentication in Weblogic, so that any security groups can be picked up to control this user's access within OAS.

Note that SSLRequireSSL is hashed out; the hash must be removed if SSL is required.

Save and close, then restart Apache:

systemctl restart httpd.service

Configure Weblogic to accept SSO calls from Apache

To enable Weblogic to recognise SSO tokens from Apache, we use the OAM Identity Asserter.

Login to the Weblogic admin console and navigate to Security Realms → myrealm → Providers, then click Lock & Edit.

Click New, enter a name and choose OAMIdentityAsserter as the type.

Click OK to add the provider:

Click on the new OAM Asserter provider to edit it. Change the Control Flag to REQUIRED & click Save. The remaining default settings are fine.

Return to the providers screen, click Reorder and move the OAM Asserter so it is second in the list, below the main user directory - in this case PaulsAD is the main Active Directory authenticator.

This next step isn’t strictly necessary - the Weblogic Plug-in is required to allow access to Data Visualizer and if you have no intention of allowing access to DV via Apache, this step can be skipped.

The Weblogic Plug-in needs enabling in three locations.

From the Domain Structure window click on bi

Go to the Web Applications tab

Scroll down to the WebLogic Plugin Enabled option and check it, then click Save.

From the Domain Structure window expand Environment and click on Servers. Click on bi_server1 in the Configuration tab.

Scroll down the General tab, click on the Advanced link, then locate the WebLogic Plug-In Enabled option and set it to ‘yes’. Click Save.

From the Domain Structure window click on Clusters. Click on bi_cluster

Again, in the General tab, click on Advanced, then set the WebLogic Plug-In Enabled option to ‘yes’. Click Save.

All changes are now complete. Click Activate Changes.

A full stop and restart of Weblogic & OAS is required under the oracle user:

/u01/oas55/config/domains/bi/bitools/bin/stop.sh    
/u01/oas55/config/domains/bi/bitools/bin/start.sh

Enable Lightweight SSO

For this method of accessing OAS to work, the internal lightweight SSO must be enabled. It will be enabled by default, but if it has been disabled, for example to make the Go-URL work, then re-enable it:

Stop OAS, open the WLST command line:

cd /u01/oas55/product/oracle_common/common/bin
./wlst.sh

Re-enable SSO (alter the domain path to suit your environment)

wls:/offline> enableBISingleSignOn('/u01/oas55/config/domains/bi','/bi-security-login/logout?redirect=/dv'); 
wls:/offline> exit();

Then restart OAS to reflect the changes.

The Apache configuration is now complete and you should be able to login to OAS on the Apache URL, e.g. http://oasvm3.local.com:9603/analytics

The SSO configuration should automatically log you in under the user defined in the apache configuration above:

The Original OAS URL remains on port 9503 and will take you to the normal login screen:

Note that when you log in via Apache you can’t sign out! You can click the sign-out link, but the SSO process will simply log you back in again rather than displaying the login screen. To log in normally, close the browser and reopen it with the standard OAS URL.

Apache Log files

If you have any issues with the above, there are five log files for Apache you can use to investigate what's going on:

  • /var/log/audit/audit.log - contains SELinux issues
  • /var/log/messages - contains general activity messages
  • /var/log/httpd/error_log - contains errors generated by Apache, including some SSL and Kerberos messages
  • /var/log/httpd/ssl_error_log - contains SSL errors generated by Apache
  • /var/log/httpd/ssl_access_log - contains messages related to users connecting over SSL

The logging level in the error_log file is governed by the LogLevel setting in /etc/httpd/conf/httpd.conf. This is normally set to ‘warn’, but can be set to ‘debug’ to display detailed messages.
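For example, to switch on verbose logging while troubleshooting, set the directive below in httpd.conf and then restart Apache with systemctl restart httpd.service:

LogLevel debug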

Testing the Go URL

With Apache/SSO now enabled, the Go URL can be used:

http://oasvm3.local.com:9603/analytics/saw.dll?Go&Path=%2Fshared%2FPipeline QuickAnalytics%2FPipeline%2FSales Pipeline-Charts&locale=en-en&lang=en

Note that if the Go URL includes the NQUser and NQPassword parameters, they will be ignored.

Alternatively the Dashboard parameter can be used to take a user fully into OAS to view a dashboard:

http://oasvm3.local.com:9603/analytics/saw.dll?dashboard&PortalPath=%2Fshared%2FPipeline QuickAnalytics%2F_portal%2FSales Effectiveness

An important point here is that the user is now logged into OAS and can access other content. As mentioned earlier, the user used for Go URL access should have their permissions and privileges tightly controlled to limit access to just what is required.

Accessing other areas of OAS

The instructions above only provide access to the core /analytics functionality via the Apache SSO URL. Other areas of OAS, such as /dv or /xmlpserver are not enabled:

If these are required the location entries must be added to the httpd.conf file and Apache restarted.

The full list of locations can be found in the Oracle doc mentioned earlier, and includes protected and unprotected locations (as some must be visible prior to logging into OAS, e.g. the bi-security-login URL). As an example, here is the location required to enable DV:

<Location "/dv">
ProxyPass "http://oasvm3.local.com:9502/dv"
ProxyPassReverse "/dv"
#SSLRequireSSL
RequestHeader unset Authorization
RequestHeader set OAM_REMOTE_USER ${GoUser}
RequestHeader set iv-user ${GoUser}
RequestHeader set SM_USER ${GoUser}
</Location>

That's it, you're now ready to use the Go URL with the Apache URL without including the NQUser/NQPassword parameters.

This solution is not ideal and does have security implications you must take care of, although it does avoid the user/password appearing in the URL, which is a definite positive.

One downside of this is that you can’t use multiple user IDs: all the Go URLs will connect with the same user, and it is not possible (well, we’ve not found a way yet) to allow different Go URLs to connect with different user IDs. Apache will only allow one redirect to /analytics on the main listen port.

Enjoy!

Categories: BI & Warehousing

Analysing Social Media Activity with ADW and OAC

Wed, 2020-10-28 08:11
Analysing Social Media Activity with ADW and OAC

Yesterday I wrote a series of tweets talking about my Twitter activity analysis with Oracle's Autonomous Database and Oracle Analytics Cloud. Since the tweet became popular and I didn't share the steps, I thought a blog post should follow, so... here we are!

[1/N]
I wanted to analyse my Social Media data.
I exported my @Twitter data and analysed with the #OracleText in the #OracleAutonomousDatabase @OracleDatabase and @OracleAnalytics Cloud

Starting from the easy… which is the tweet with most likes?
hint: Becoming @oracleace #ACED! pic.twitter.com/gUQR0Ur7k7

— Francesco Tisiot (@FTisiot) October 27, 2020

Getting the Data from Twitter

There are many ways that you can get data from Twitter. For the sake of my exercise I requested a dump of my data from the Twitter website. You can request the same by following these instructions.

Once the data is ready, you'll receive a mail or a notification and you'll be able to download the related zip file named twitter-YYYY-MM-DD-hashed_string.zip.

Once unzipped you'll see two folders:

  • assets containing files you attached in your tweets (mostly images)
  • data where the actual interesting information is.
Analysing Social Media Activity with ADW and OAC

Within the data folder I concentrated on the tweet.js file which contains the tweet information. There are other interesting files such as followers.js or like.js but I didn't include those datasets in my analysis as of now.

Warning: the tweet.js dataset contains only the tweets written by you. It doesn't contain tweets from people tagging you or the replies to your tweets.

The Your archive.html file allows you to browse the content of the folder from a web browser.

Importing Data into ADW with SQL Developer Web

The next step in my analysis is to import the data into the Autonomous Data Warehouse. For this task, I used SQLDeveloper Web, available by default in ADW, which has a powerful utility to import JSON documents as rows in a relational table. Jeff Smith's post covers the process in detail.

Unfortunately, when trying to import the tweet.js file into ADW I encountered an error due to the fact that the file itself is not pure JSON, which is pretty clear if you check the file itself:

window.YTD.tweet.part0 = [ {
  "tweet" : {
    "retweeted" : false,
    "source" : "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
    .....
    "lang" : "en"
  }
},
....
{
  "tweet" : {
    "retweeted" : false,
    "source" : "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
    .....
    "lang" : "en"
  }
 }
 ]

The first item to remove from our file is the window.YTD.tweet.part0 = prefix. I believe this is due to the pagination of the results, but it clearly screws up the JSON formatting.

Once removed I could parse the tweet.js file with SQL Developer Web, but the table definition proposed had only one column containing the whole TWEET JSON document.

Analysing Social Media Activity with ADW and OAC

At this point, I could either accept this and do further processing using Oracle's SQL JSON parsing functionality, or slightly change the shape of the file so it would be ingested correctly; I opted for the second.

When importing JSON documents into rows, SQL Developer Web analyses only the first level of attributes in the document itself. In the case of our tweet.js file this was something like

{
  "tweet" : {
    ...
    }
},
{
  "tweet" : {
    ...
    }
},

The first level parsed by SQL Developer Web was correctly extracting only the tweet element and proposing a CLOB (JSON) column to store it. But I wanted the content of the tweet to be parsed. I ended up removing the first layer by substituting in the file any occurrence of },{ "tweet" : { with a simple comma and removing the initial and final wrappers.
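
If you prefer to script this cleanup rather than editing the file by hand, a minimal Python sketch along these lines should produce an equivalent result (file names are just examples, and it assumes the export fits comfortably in memory):

import json

with open("tweet.js", encoding="utf-8") as f:
    content = f.read()

# Drop the "window.YTD.tweet.part0 = " assignment prefix so the remainder is valid JSON
content = content.split("=", 1)[1]

# Unwrap the outer {"tweet": {...}} layer of every entry
tweets = [entry["tweet"] for entry in json.loads(content)]

# Write the flattened list back out, ready for SQL Developer Web
with open("tweet_clean.json", "w", encoding="utf-8") as f:
    json.dump(tweets, f, ensure_ascii=False, indent=2)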

The file now looks like the following

[ {
    "id" : "279908827007180800",
    "created_at" : "Sat Dec 15 11:20:39 +0000 2012",
    "full_text" : "Finally at home after #christmasparty... Looooong travel!",
    "lang" : "en"
    ...
  }
,{
    "id" : "276794944394498048",
    "created_at" : "Thu Dec 06 21:07:12 +0000 2012",
    "full_text" : "@mRainey will you be there next week too? Enjoy uk and visit #italy if you can!",
    "lang" : "en",
    ...
  }
 ...
 }]

We can now parse the file with SQL Developer Web, and the output correctly identifies all the first level entities in the JSON document.

Analysing Social Media Activity with ADW and OAC

A few more clicks and we have our table FRANCESCO.TWEET_DATA_SQLDEV populated automagically!

Analysing Social Media Activity with ADW and OAC

Note: The whole process above could be implemented and automated in several different ways, the aim of the blog post is only to demonstrate the feasibility of the analysis.

Text Tokenization with Oracle Text

The tokenization of the tweets' FULL_TEXT column is done in the Oracle Database; you can see the full process described in the following video produced by the Oracle Analytics PM team.

If you prefer a short version of it, here it is: I basically created the following index

CREATE INDEX "FRANCESCO"."TWITTER_TWEET_IDX" ON "FRANCESCO"."TWEET_DATA_SQLDEV" ("FULL_TEXT") INDEXTYPE IS "CTXSYS"."CONTEXT";

The INDEXTYPE IS "CTXSYS"."CONTEXT" creates an index using Oracle Text. A more accurate description of the procedure can be found in OracleBase's post.

Once the index is created, we can see some new tables appearing with the name DR$INDEX_NAME$LETTER, where:

  • DR$ is a fixed prefix
  • INDEX_NAME is the name of the index
  • LETTER is a single letter among I, K, N, U and R, whose meaning can be found in this Document

For the purpose of our analysis, we'll focus on the DR$TWITTER_TWEET_IDX$I table which contains the tokens of our FULL_TEXT column.
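
As a quick sanity check, we can peek at the most frequent tokens straight from this table (a simple sketch based on the index created above):

SELECT token_text, token_count
FROM dr$twitter_tweet_idx$i
ORDER BY token_count DESC
FETCH FIRST 10 ROWS ONLY;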

But the token by itself is not very useful; we need to match the token with the tweet's ID to be able to provide meaningful analysis. Again, this is covered nicely by another video created by the Oracle Analytics PM team.

In order to associate the Token with the original Tweet we can use again the power of Oracle Text and the Index created above with the following query

SELECT
    full_text,
    score(1) AS text_score,
    token_text,
    a.id
FROM
    tweet_data_sqldev a,
    (
        SELECT DISTINCT
            token_text
        FROM
            dr$twitter_tweet_idx$i
    )
WHERE
    contains(a.FULL_TEXT, '/' || token_text, 1) > 0

Again for more info about Oracle Text's CONTAINS function please refer to the relevant documentation.

I physicalized the output of the above query in a table (TWEET_TOKENS), which contains the TOKEN_TEXT together with the tweet's ID, so we can now join this table with the original one containing the list of tweets in Oracle Analytics Cloud.
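
Persisting the output could be as simple as a CTAS over the query above (a sketch using the same names as before; adjust storage options as needed):

CREATE TABLE tweet_tokens AS
SELECT
    a.id,
    token_text
FROM
    tweet_data_sqldev a,
    (
        SELECT DISTINCT token_text
        FROM dr$twitter_tweet_idx$i
    )
WHERE
    contains(a.full_text, '/' || token_text, 1) > 0;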

Note: one of the next versions of Oracle Analytics Cloud will provide the tokenization as a step of a Data Flow within the Database Analytics options! You'll be able to tokenize your strings without leaving OAC.

Analysis of the Data in Oracle Analytics Cloud

If you're a frequent blog reader, this is probably the easiest part. I just had to:

  • Create a connection to my Autonomous DataWarehouse by using the wallet file.
  • Create the two datasources: one for TWEET_DATA_SQLDEV and another for TWEET_TOKENS
  • Create a project and include both Datasources

Once in the project, the first visualization is about the most "liked" tweet... no surprise, it's when I became Oracle ACE Director, back in 2019, during the KScope keynote!

Analysing Social Media Activity with ADW and OAC

This first visualization is ok, but not really using any of the Oracle Text capabilities exposed in the Tokens... so my next analysis was...

Which words do I use more often when tweeting?

Easy... with the two datasources created above!

How does it change when I reply to people?

Well, you can see that words about OUG sessions and Oracle are still there, but there is the emerging topic of Food!

Who do I interact most with?

I plotted the # of Replies to specific Users…
No wonder (at least to me) that I get the most interactions with people that tag me with Italian food abominations…

And... how do I reply to them?

Again, very easy with Oracle Analytics Cloud's trellis options.
You can spot that food is the major discussion topic with HeliFromFinland and dw_pete, while my chats with @connor_mc_d and @stewartbryson cover more topics.

What about Nephentur?

One last mention of my interactions with Nephentur: his love for OUGs, airports, flights and hotels is clear… all driven by #TheBergEffect.

Analysing Social Media Activity with ADW and OAC

Hope you liked the story, just an example of what you can do with a dataset you own and tools available in the Oracle Cloud!

Categories: BI & Warehousing

OA Summit 2020: OA Roadmap Summary

Thu, 2020-06-11 05:30
 OA Roadmap Summary

If you are in the Oracle Analytics space, the OA Summit is a great source of content: from keynotes and customer stories to partner deep dives, the website is a collection of stories about Oracle Analytics. We've been part of the initial episode, talking about how you can start your journey to Data Science with Oracle Analytics!

In Tuesday's session, Gabby Rubin, VP of Product Management, and Jacques Vigeant, Senior Director Product Strategy, shared a series of insights about the product roadmap that we'll cover in this blog.

Public Roadmap & IdeaLab

This is quite big news in the Oracle space: there is now a clear vision of what's coming in the product, accessible by everyone on a public website.

OA Roadmap and IdeaLab

The public roadmap works well also in conjunction with IdeaLab: a place where everyone in the OA community can suggest new product features and up/downvote or add comments on other people's ideas. These hints are reviewed by Product Managers and, if considered valid, included in future releases of the product!

Converged Analytics

The state of the art in the Analytics space offers a neat separation between self-service tools and centralized IT-governed ones. Even for Oracle Analytics, we have two separate approaches as of now: self-service data preparation via Data Visualization vs enterprise IT-driven via RPD.

 OA Roadmap Summary

What was announced at the OA Summit is that the two approaches will converge: there will be the option to self-create a multi-table, multi-source federated dataset which can be shared. This approach empowers the end-user and works on top of the usual self-service data-source definition workflow, enabling, for each entity in the diagram, ML-based transformations and enrichment recommendations as well as caching settings.

Empowering the end-user also means enabling best software development practices like versioning, certification and promotion capabilities on top of the asset created. The multi-table federated dataset created by end-users will seamlessly transition into enterprise-level semantic models without IT intervention.

 OA Roadmap Summary

From the enterprise side, as announced during OOW19, we'll see a new web-based semantic modeler which will replace the good old Admin Tool. Unlike the current "Lite Data Modeler", the new product will enable the same level of modelling complexity we can find in today's RPDs and thus will be compatible with every OAC/OAS prebuilt repository. The new web-based semantic modeler is not only a pure replacement of the Windows-based Admin Tool: it also offers native source control integration with Git and in-built options for data lineage exploration.

 OA Roadmap Summary

As mentioned, the new tool is web-based; if you're an expert RPD developer and worried about a different development methodology slowing down the build process, well, there is big news for you! You will be able to edit repositories via the new JSON-based Semantic Modeling Markup Language! Yep, you'll be able to define all the layers of an RPD in JSON syntax, even outside the web-based semantic modeler tool itself. This opens a huge variety of opportunities for source control, CI/CD pipelines, as well as automated (coded) builds of RPD artifacts.

Oracle Analytics as Enterprise Keystone

As briefly described in my Data Virtualization blog post, Oracle Analytics can (and should) be considered the analytics keystone in the enterprise: the convergence of IT-driven and self-service metadata models can already be exposed via ODBC in OAS, enabling downstream applications to access the data while inheriting the security settings defined in the repository. The OA team is working to offer the equivalent in JDBC format for all Oracle Analytics products!

 OA Roadmap Summary

Oracle Analytics is also enhancing the number of source-system connectors available: we'll soon have the option to connect directly to Oracle EPM Cloud from Data Visualization, and similar connectors are arriving for JDBC sources and Google's Big Query.

Machine Learning in OA

Augmented Analytics and Machine Learning capabilities have existed for a long time in Oracle Analytics. A new important improvement in this area will enable Data Flows to use Machine Learning models created outside OA via the usual GUI.

 OA Roadmap Summary

Data scientists will be able to use their favourite environment and toolset to analyse the data and build models. The models built in external systems like Oracle Database, Oracle Data Science services, or 3rd-party services can then be registered within OA and used to score data in Data Flows, making collaboration between data scientists and data analysts much easier and increasing the ubiquity of ML in enterprises.

Predictions explainability will also be possible directly in OA, with details of each model exposed appropriately depending on the model type.

 OA Roadmap Summary

In addition to the above, OA will also allow the usage of advanced database analytics like sampling, un-pivoting, clustering, anomaly detection or tokenization, with more to come. This allows the usage of already existing functions (avoiding reinventing the wheel) that can run securely on massive amounts of data in the Oracle Database.

Data Visualization New Features

There is also some news which has already been shipped in the latest product or will be available soon. Some examples are:

  • Adaptive content: we can enable content scrolling in Data Visualization projects, allowing proper spacing of otherwise crowded visualizations
  • Canvas Filter control: a prompt that can filter only a subset of the analyses registered to it (similar to the master-detail concept in Analyses)
  • OAC Embedding Framework: allows you to integrate OA capabilities into external web applications
  • Digitalized Map Layers: create an infographic on top of any image, all directly in OAC
OA on Mobile

What about consuming OA content on mobile devices? Data Visualization content is adaptive: the visualization size and appearance will change depending on the screen type, so all created content can simply be accessed via a mobile browser. Oracle is also investing in Day by Day, which acts as a personal, proactive data assistant and now enables threshold-based notifications, with more visualization types coming later in the year.

 OA Roadmap Summary

The new announcement regarding mobile is the new Oracle Analytics Mobile app, which replaces Oracle BI Mobile HD and will provide a way to use, search and collaborate on curated content exposed by OA with an experience in line with modern apps.

 OA Roadmap Summary

More on this: all the Oracle Analytics apps will enable collaboration, with several people able to access and comment on visualizations and data, avoiding the need to switch to other tools like email.

A whole new wave of capabilities is in the Oracle Analytics roadmap; for any doubts or questions feel free to reach out to us!

Categories: BI & Warehousing

Data Virtualization: What is it About?

Fri, 2020-06-05 07:54
 What is it About?

The fast growth of companies' data, data sources and data formats is driving increasing interest in the field of Data Virtualization; what is it about? And what tools can provide this functionality?

Data Virtualization defines an approach to expose data coming from disparate data sources via a common interface to downstream applications.

 What is it About?

Do you have your sales data in an Oracle Database and client-specific information in some cloud app? Data Virtualization will show them as a unique data source while, in the backend, firing the proper queries to the source systems, retrieving the data and applying the correct joining conditions. Data Virtualization abstracts the technical aspects of the datasources from the consumer. Unlike the ETL approach, there is no data copy or movement, with source systems accessed in real time when the query is executed (also called Query in Place (QIP)).

To be clear: Data Virtualization is NOT a replacement for ETL; both paradigms are valid in specific use cases. Data Virtualization is a perfect fit when reduced amounts of data coming from various systems need to be joined and exposed. On the other side, when massive amounts of data need to be parsed, transformed and joined and data-retrieval speed is key, then an ETL (or ELT) approach is still the way to go.

Data Virtualization Components?

So far we had a look at the theory and goals of Data Virtualization, but what are the main components of such a tool? We can summarize them in four main points in our source-to-user path: Data Access Optimization, Business Logic, Data Exposure and Security.

 What is it About?

What should each layer do? Let's see them in detail.

Data Access Optimization

In this layer, the connection to data sources needs to be defined, tuning parameters set and data points of interest determined. This layer should be responsible for optimizing the query pushdown to source systems in order to retrieve the minimal row-set that needs to be displayed or joined with other datasources.

The Data Access Optimization layer should also be able to handle different datasources with different capabilities (e.g. aggregation functions) and, in case pushdown is not possible, perform transformations after data load. Finally, to ease the stress on the source system, caching options should be available in this layer.

This layer should also dictate the methodology of datapoints access, defining (with the help of the Security layer) if a certain column can be read-only or also written.

Business Logic

The Business Logic Layer should be responsible for translating data-points to company-defined metrics or attributes. Datapoints can come from various data-sources so it should also contain the definitions of joining conditions and data federations.

Metrics and attributes need to have aggregations and hierarchies defined in order to be used by downstream applications. Hierarchies can also help with vertical data federation, defining exactly the granularity of each datapoint, thus enabling query optimization when more aggregated datasources are available.

The Business Logic Layer is also responsible for decoupling Business Logic from datasources: when a change in a datasource happens (e.g. the creation of a datamart or a database vendor change) the metrics and attributes layout will not be altered, all the changes will happen in the Data Access Optimization and hidden in the Business Logic Layer definitions.

Data Exposure

The Data Exposure Layer is the one facing downstream applications or users, in this layer metrics and attributes should be organized in business-driven structures and made accessible via various methods like web-interfaces, ODBC, SOAP or REST.

Datapoints defined at this layer should contain business descriptions and be searchable via a data catalog, allowing the re-use of pre-built content.

Security

Exposing data to downstream applications needs to happen securely, thus datapoint access rules need to be defined. Data Virtualization systems should integrate with the company's Identity management tools to identify and dictate who can access any particular datapoint.

Security should not only work as ON/OFF, but also allow the access to subsets of data based on privileges: e.g. a country manager should only see data from his/her own country.

Security is not only about defining boundaries but also about auditing the correct access to data (GDPR?). Data virtualization systems should allow security evaluations or provide pre-built security checks.

Data Virtualization in Oracle

As you might understand, Data Virtualization is a new, hot and growing topic; what has Oracle to offer since there isn't an official Oracle Data Virtualization tool? Well, check the picture below:

 What is it About?

This is an Oracle Analytics repository (RPD)! A tool created in 1998, acquired by Oracle in 2007 and with a long history of successful implementations! The reality is that Data Virtualization is not a new topic (we talked about it in 2008), but simply an old capability, with a new name!

Oracle Analytics RPD's three-layer concept matches exactly the Data Access Optimization, Business Logic and Data Exposure layers mentioned above providing the same functionality. The Security is a key component across the whole Oracle Analytics Platform and successfully integrates with all major identity providers.

Oracle Analytics for Data Virtualization

Oracle Analytics is a tool born for data analytics but solves successfully the problem of Data Virtualization: data from different source systems can be queried, joined and exposed.

OA is capable of firing queries based on fields selected by the downstream application and optimized for the datasource selected. Query pushdown is enabled by default and specific datasource features can be turned ON/OFF via configuration files. Specific transformations can happen in OA's memory if the datasource doesn't allow the pushdown. Results of queries can be cached and served for faster response.

 What is it About?

In scenarios where massive amounts of data need to be sourced, probably Oracle Big Data SQL or Cloud SQL could be used, pushing part of the virtualization at the database level.

Vertical and horizontal Data Federation can be defined, so data can span across various tables and aggregated datasources can be used for faster response. Metrics, attributes, hierarchies and joins are defined in the model thus related complexity is hidden from downstream applications.

Data exposed can be accessed via a web browser using the traditional Answers and Dashboard or the innovative Data Visualization. Data can also be extracted via SOAP APIs and via ODBC when the related component is exposed. Data Sources defined in Data Visualization and BI Publisher can also be extracted via REST APIs. There is a plan to extend JDBC access also to RPD defined Subject Areas.

REST APIs

A utility, called Metadata Dictionary, automatically generates a Data Catalog which can be used to expose and share what datapoints are available. In the future, Oracle Analytics Cloud datapoints will also be available via Oracle Data Catalog, a specific offering around this area.

The security included with the tool allows datapoint access definition to specific roles as well as limits on data exports size or access times. Security settings can be exported and audited via external tools. The platform usage can also easily be monitored for performance and security reasons.

Self-service Data Virtualization

All we discussed so far describes the IT-driven Data Virtualization where the access to data is only managed by a restricted group of people (usually the IT department). Oracle Analytics enables it alongside Self-Service Data Virtualization, where new datasources, joins and security layers can be defined directly by Business Users enabling a faster and secure data sharing process which can still be audited and controlled.

Do you want to expose the business unit's forecast data in your Excel file alongside the revenue coming from your datamart? With Oracle Analytics this is only a few clicks away, including the security options controlling the audience.

Oracle Analytics offers both top-down and bottom-up approaches to Data Virtualization at the same time! IT-driven, highly secured and controlled data sources can coexist with user-defined ones. Joins and metadata models can be built from both sides and promoted to be visible by a wider audience, all within one environment that can be tightly secured, controlled and audited.

Conclusion

Data Virtualization is a hot topic indeed: with the ever-increasing number of different datasources in a company's portfolio, we'll see the rise of data abstraction layers. But don't look to shiny new tools to solve an old and common problem!

As described above, Oracle Analytics (Cloud or Server) can successfully be used in Data Virtualization contexts. Deployed in conjunction with Oracle Big Data SQL or Cloud SQL and Oracle Data Catalog, it offers an extremely compelling solution with a set of tools and knowledge already available in the majority of companies.

Categories: BI & Warehousing

Getting Smart View to work with OAC

Tue, 2020-05-05 11:29

I wanted to demonstrate what I thought would be a simple task to a client today; however, it turned out to be a little more complex than I first anticipated, so I thought I would publish it here. All I needed to do was get Smart View connecting to OAC.

Prerequisites
  1. Excel installed
  2. OAC up, running and available
Approach
  1. Download Smart View (see here) - for reference I downloaded version 11.1.2.5.910
  2. Double click to install and accept the defaults
Establishing a Connection

When I last did this (not sure how long ago that was), I think the OBIEE extension was installed by default, so I put in the OAC connection string and then couldn't work out why I kept getting an error. Fortunately a colleague of mine pointed out the OBIEE extension wasn't actually installed by default any more, so that's what I needed to do next.

Install the Extension

  • Open Excel
  • Create a Blank Workbook
  • Select the Smart View menu item
  • Select Options

Go to Extensions

The OBIEE extension is no longer installed out of the box

You need to check Check for updates when Microsoft Office starts, then you should get a link above saying Check for Updates, New Installs and Uninstalls

Select the OBIEE Extension to install, once successful you should see

Now you can create a connection to OAC

Create a Connection

  • Select the Panel

  • Choose Private Connection, then Create new connection from the bottom of the screen

Choose OBIEE for the connection type

Enter the URL in the following format:

https://<your-oac-host>/analytics/jbips

My example was

https://dv1dfree-xyzabc-ld.analytics.ocp.oraclecloud.com/analytics/jbips

You'll be asked to logon

You can then give the connection a name to save it as

Then you should see the Catalog displayed on the right

If you select a View you will get a number of options for how to display it; the simplest is Insert.

Once selected, this will load the data into the sheet.

Categories: BI & Warehousing

Oracle Analytics: Everything you always wanted to know (But were afraid to ask)

Thu, 2020-02-20 03:52
 Everything you always wanted to know (But were afraid to ask)

The release of Oracle Analytics Server (OAS) on 31st January 2020 has left many OBIEE users wondering what it means for them. Common questions we were asked at Oracle OpenWorld Europe last week included:

  • what’s the difference between OAS, OAC and OBIEE?
  • where does DV fit into this?
  • should I be migrating and if so, when; what are the benefits?

This blog post aims to answer all these questions!
First of all, let’s define each product in order to compare them.

Oracle Analytics Cloud (OAC)

Oracle’s strategic analytics platform, hosted in the Oracle Cloud.

Oracle Analytics Server (OAS)

OAS is the new on-prem version of OAC, and is intended as a bridge between legacy OBIEE platforms and Cloud. It has almost complete feature parity with OAC, including AI-powered, modern and self-service analytics capabilities for data preparation, visualisation, enterprise reporting, augmented analysis and natural language processing and search. If you are an OBIEE customer, you are automatically licensed for OAS, and Oracle Data Visualisation (DV) is included at no extra cost.

OAS vs OAC

The main difference between OAS and OAC is related to hosting and administration. OAC is hosted and managed by Oracle, while OAS needs to be installed, configured and patched by you in your datacenter. This also defines the level of control and customisation: with OAS you have full control over config files, styles, custom functions etc., while in OAC you’ll be able to change only what’s exposed in the cloud console by Oracle.
OAC will receive more frequent updates and new features, with OAS scheduled to have an annual release bringing the cloud features to on-premise customers.
So the choice between the two depends on the amount of customisations needed vs the time spent on supporting the platform.

OBIEE vs OAS

OAS was developed to replace OBIEE, however the two products are not exactly the same. There are one or two OBIEE features that are deprecated in OAS, such as BISQLProvider or Act As, but they are still present in the tool, and they’ll not go away until a proper replacement is in place. If you were using Scorecards, on the other hand, that tool is no longer shipped with OAS.
OAS brings almost full functional parity with OAC, providing a huge number of new features, especially in the self-service area; more info in the dedicated post.

Can I connect to a Database in my Datacenter with OAC?


Yes you can: Data Visualization offers the option to connect to any datasource which is reachable from the Cloud. If you don’t want to expose your database directly, Oracle Data Gateway enables connections from OAC (including RPD-based connections) to on-prem data sources without the need to open any firewall port.

Where does DV come into this?

Data Visualization (formerly known as Visual Analyzer) is Oracle’s self-service tool. If you’re an OBIEE 12c user, you may be paying extra license fees to use DV. If you’re already using OAC, you may have noticed DV is included with your license, and this will also be the case for OAS.  

Ultimately, Oracle Analytics’ aim is to provide a mix and match offering, where you can choose which components are Cloud and on-prem. For example, you can upgrade to OAC and point it to your on-prem database. Or if you’re Cloud averse for whatever reason, you can migrate to OAS and utilise many of OAC’s features.

What does Rittman Mead recommend you do next?

There is probably a different answer for everyone, depending on where you are in your Oracle Analytics journey, so we’d recommend contacting us for an initial chat to go through your options. Broadly speaking, if you’re using OBIEE 11.1.1.7 onwards and you’re considering upgrading to 12c, you should factor OAS or OAC into your decision making process.

To help you decide which product is best for you, we are offering a FREE two-day Oracle Analytics assessment which aims to help you create a business case for migrating to OAC or OAS based on your pain points and anticipated usage. Contact us for more information.

Rittman Mead also provides OAC, OAS and OBIEE training. Our next OAC bootcamp is taking place in London, March 23rd - 26th 2020. For more information go to our training page or contact Dan Garrod.

Categories: BI & Warehousing

Oracle Data Science - Accelerated Data Science SDK Configuration

Mon, 2020-02-17 02:31
Oracle Data Science - Accelerated Data Science SDK Configuration

In my last post, I introduced Oracle Data Science, the new tool from Oracle aimed at Data Science collaboration, including an Auto-ML Python SDK named Accelerated Data Science. The SDK allows the speed-up and automation of various tasks in the ML pipeline, from feature engineering to model and feature selection and model explainability. A very handy tool for newbies and experienced people facing data science problems.

First version, First Hurdle

In the first version of Oracle Data Science, there is a preliminary step to follow before using the Accelerated Data Science SDK. This step is needed to be able to connect to Oracle Cloud Infrastructure Object Storage and save our models in the model catalog.

Oracle Data Science - Accelerated Data Science SDK Configuration

Please note that all the other steps within the SDK will still be available even without this setting. You will still be able to execute the calls to the feature engineering, model and feature selection, and model explainability functions, but you'll not be able to save the model in the catalog.

If we want to accomplish this last step, we need to create a private/public key pair and set up a configuration file. Luckily, we can run the whole process within the Oracle Data Science notebook! Let's see all the steps.

ADS SDK Configuration

First of all, let's log in to the notebook and open a terminal session.

Oracle Data Science - Accelerated Data Science SDK Configuration

We can then create a folder named .oci under the /home/datascience

mkdir ~/.oci

In the next step, we need to generate an API signing Key

openssl genrsa -out ~/.oci/oci_api_key.pem -aes128 2048 

the command will ask for a password which will secure the key

Oracle Data Science - Accelerated Data Science SDK Configuration

now it's time to generate the public key with

openssl rsa -pubout -in ~/.oci/oci_api_key.pem -out ~/.oci/oci_api_key_public.pem

again the command will ask for a password and then generate the oci_api_key_public.pem file

Another item we need to generate is the key's fingerprint, which can be done with

openssl rsa -pubout -outform DER -in ~/.oci/oci_api_key.pem | openssl md5 -c

the command will ask for the key's password and then output the fingerprint

Oracle Data Science - Accelerated Data Science SDK Configuration

The next piece of information needed is the Tenancy OCID and the User OCID. The first one can be obtained by navigating to the Governance and Administration section and then selecting Administration and Tenancy Details

Oracle Data Science - Accelerated Data Science SDK Configuration

The OCID is shown in the main Tenancy Information section

Oracle Data Science - Accelerated Data Science SDK Configuration

The User OCID can be found by selecting the Identity -> Users

Oracle Data Science - Accelerated Data Science SDK Configuration

After selecting the User we want to connect with, the OCID is visible

Oracle Data Science - Accelerated Data Science SDK Configuration

The next step is to upload the public key generated a few steps before. Navigate again to the Identity -> Users -> Username screen in the console. Under the global user info, there is an API Keys section.

Oracle Data Science - Accelerated Data Science SDK Configuration

we can click on Add Public Key and paste the content of the oci_api_key_public.pem file generated before

Oracle Data Science - Accelerated Data Science SDK Configuration

Now it's time to use all the information collected so far to create a config file. The file needs to reside under the ~/.oci/ folder and must be named config, with the following entries:

[DEFAULT]
user=<OCID of the user>
fingerprint=<Fingerprint of the Key>
key_file=<Path to the private Key>
pass_phrase=<Passphrase used to Encrypt>
tenancy=<OCID of the Tenancy>
region=<Region where the Tenancy is hosted>

an example is

[DEFAULT]
user=ocid1.user.oc1........
fingerprint=c8:24:75:00:00....
key_file=~/.oci/oci_api_key.pem
pass_phrase=oracle123
tenancy=ocid1.tenancy.oc1.......
region=eu-frankfurt-1
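
Before moving on, a quick way to sanity-check the file is to load it with the OCI Python SDK from within the notebook (this assumes the oci package is available, which it normally is in the Data Science environment):

import oci

# Load the DEFAULT profile from the file just created and validate its entries
config = oci.config.from_file("~/.oci/config", "DEFAULT")
oci.config.validate_config(config)  # raises an error if something is missing or malformed
print(config["region"])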

If the settings are not correct, when trying to save the model you'll get an error like the following

Oracle Data Science - Accelerated Data Science SDK Configuration

If the settings are correct, the save model step (defined in my previous blog post) will execute correctly. Enjoy your trials with Oracle Data Science!

Categories: BI & Warehousing

Machine Learning Collaboration with Oracle Data Science

Fri, 2020-02-14 01:48
Machine Learning Collaboration with Oracle Data Science

This week I was sadly forced to skip my planned trip to OOW London, where Rittman Mead had a fantastic booth and Jon Mead presented the "How to Become a Data Scientist" session with Oracle Analytics Cloud and Oracle Machine Learning.

@jonmead prepped and primed to deliver @FTisiot’s How To Become A Data Scientist presentation. Get down to Zone 4! #OOWLON #DataScience pic.twitter.com/BRFobFuTX0

— RittmanMead (@rittmanmead) February 12, 2020

The last-minute plan change, however, gave me time to test a new product just made available in the Oracle Cloud: Oracle Data Science! I talked about the tool being announced in my OOW19 review post, describing it as a Data Science collaboration tool coming from the acquisition of DataScience.com; let's have a look in detail at the first release.

Instance Creation

Oracle Data Science can be found in the OCI console, under the Data and AI section with other products including the newly released Data Catalog and Data Flow

Machine Learning Collaboration with Oracle Data Science

Before starting creating a Data Science Project, you'll need to review some security and policy settings which are well described in this blog post. Those settings are not straightforward, but once in place, we can then create a Project, where a team can collaborate. All we need to define is the compartment name, the project name and description.

Machine Learning Collaboration with Oracle Data Science

After creating the project, it is time to create one (or more) notebook sessions within the project. Here we need to specify the compartment (the same as the project), the name of the notebook session, the shape (choosing from a pre-defined list of shapes), the size of the block storage and the networking details.

Machine Learning Collaboration with Oracle Data Science

If all the security and policy settings are working we'll have the notebook instance created in minutes

Machine Learning Collaboration with Oracle Data Science

We can then click on Open to enter the notebook itself.

What's in the Notebook

What we see immediately is that Data Science is based on instantiated images of Jupyter notebooks including Python 3. Data Science notebooks can also be version-controlled with a pre-built Git integration and contain all the best Python open-source ML libraries like TensorFlow, Keras, SciKit-Learn, and XGBoost as well as the common visualization ones like Plotly and Matplotlib. The terminal access also means that we have the freedom to install any other custom library we might be interested in.

Machine Learning Collaboration with Oracle Data Science

We can now create a Python notebook and start solving our ML problems.

Pro-tip: move all notebooks and data under the folder /block_storage, otherwise you'll lose the content when stopping and starting the notebook.

Oracle Accelerated Data Science (ADS) SDK

A new feature coming with Oracle Data Science is also the Accelerated Data Science (ADS) SDK: a python library simplifying and accelerating the data science process by offering a set of methods that covers all the phases of the process, from data acquisition to model creation, evaluation and interpretation.

The first step to use the SDK is to import the related libraries

from ads.dataset.factory import DatasetFactory
from ads.dataset.dataset_browser import DatasetBrowser

Then we can import the data and define pointsCat as our target column

df=DatasetFactory.open("Data/train_winedata.csv", target="pointsCat")

Immediately after importing, the library will suggest using two methods: show_in_notebook() and get_recommendations(). Let's execute them!

df.show_in_notebook()

This function creates a series of sections enabling us to better understand the dataset. The sections include a Summary expressing the overall features of the dataset; then, for each feature, a dedicated chart represents the distribution. The Correlation tab shows the similarity between features and, finally, the Data tab shows examples of the dataset.

Machine Learning Collaboration with Oracle Data Science

The next step to try is get_recommendations(), which generates a few interesting suggestions to transform our dataset. The function shows which primary columns should be deleted, gives suggestions on how to handle missing values and features with strong correlation, and lastly allows the identification of the positive label for the target.

Machine Learning Collaboration with Oracle Data Science

The end-user can keep or change any of the values proposed and apply the transformations. Please note that the original object is not modified; to get the transformed object we need to call

wine_transformed=df.get_transformed_dataset()

If we don't want to go through all the transformations we can rely on ADS to choose the best for us by calling the auto_transform() function, which will implement all the transformations suggested before with the default parameters. If we want to have a look at what transformations have been processed by auto_transform(), we can call the visualize_transforms() function, which will show the pipeline in an image similar to the one below.
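
As a minimal sketch (assuming, as described above, that auto_transform() returns the transformed dataset and that visualize_transforms() is then called on it):

# Apply the recommended transformations with default parameters,
# then inspect the resulting transformation pipeline
wine_transformed = df.auto_transform()
wine_transformed.visualize_transforms()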

Machine Learning Collaboration with Oracle Data Science

The next step is to create a predictive model; again, with the SDK it's just a few lines of code away.

import logging  # needed for the loglevel parameter below
from ads.dataset.factory import DatasetFactory
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
from ads.evaluations.evaluator import ADSEvaluator
ml_engine = OracleAutoMLProvider(n_jobs=-1, loglevel=logging.ERROR)
oracle_automl = AutoML(wine_transformed, provider=ml_engine)
automl_model1, baseline = oracle_automl.train()

The SDK will now start testing a few ML algorithms to find the one providing the best results. At the end of the execution, it'll show a summary output like the following, which also includes the Selected Algorithm, i.e. the ML model with the best score.

Machine Learning Collaboration with Oracle Data Science

For each of the ML algorithms tried, the SDK will also show some summary stats that can be used to evaluate all the models.

Machine Learning Collaboration with Oracle Data Science

We can reprint the models anytime passing the number of rows to visualize and the sort order as parameters with the following call

oracle_automl.print_trials(max_rows=20, sort_column='Mean Validation Score')

And have a graphical representation of why a certain algorithm was selected by calling the visualize_algorithm_selection_trials() function.

Machine Learning Collaboration with Oracle Data Science

ADS has similar calls to check the sampling size (visualize_adaptive_sampling_trials()) and the features selected (visualize_feature_selection_trials()). These functions can take a while to execute, probably due to this being the first version of the SDK.

The model train function accepts optional parameters (combined in the sketch after the list) such as:

  • if we are interested in just trying out a specific model (or models) we can pass it via model_list=['LogisticRegression']
  • to change the scoring function we use score_metric='f1_macro'
  • to give a hint about the amount of time to spend on training we pass time_budget=10
  • we can specify which features we want always to be included with min_features=['price', 'country']
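
Putting the options above together, a call might look like this (values are only illustrative):

automl_model2, baseline2 = oracle_automl.train(
    model_list=['LogisticRegression'],   # restrict the search to specific models
    score_metric='f1_macro',             # change the scoring function
    time_budget=10,                      # hint about the training time budget
    min_features=['price', 'country'])   # features that must always be kept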

Once the models are created, we can then evaluate them with

evaluator = ADSEvaluator(test_wine, models=[automl_model1, baseline],
                         training_data=wine_transformed, positive_class='Good')
evaluator.show_in_notebook(plots=['normalized_confusion_matrix'])
evaluator.metrics

which results in the following:

Machine Learning Collaboration with Oracle Data Science

The ADS SDK also contains functions for model explainability, but the ones available in v1 don't seem to work well; all the trials done so far stop with errors in the code.

Once we've identified the model, it's time to save it; we can do it via

from ads.catalog.model import ModelSummaryList, ModelCatalog
from ads.catalog.project import ProjectSummaryList, ProjectCatalog
from ads.catalog.summary import SummaryList
from ads.common.model_artifact import ModelArtifact
path_to_model_artifact = "/home/datascience/block_storage/my_model"
model_artifact = automl_model1.prepare(path_to_model_artifact, force_overwrite=True)

The prepare function will create several files within the chosen directory that describe the model; once done, we can store the model in the catalog with the following:

import os
compartment_id = os.environ['NB_SESSION_COMPARTMENT_OCID']
project_id = os.environ["PROJECT_OCID"]

# Saving the model artifact to the model catalog:
mc_model = model_artifact.save(project_id=project_id, compartment_id=compartment_id,
                               display_name="Wine LGBMClassifier",
                               description="Wine LGBMClassifier predictor",
                               training_script_path="Training.ipynb",
                               ignore_pending_changes=True)

Please note that this step will fail if you didn't configure the ADS SDK to access the OCI APIs; a separate post is coming covering this step. If, on the other side, all the setup is correctly done, the model is then visible within the Data Science project page.

Machine Learning Collaboration with Oracle Data Science

The model can now be downloaded and used by others; we can also think about exposing it as a function and calling it via REST APIs, all possible with the help of the ADS SDK as mentioned in the documentation.

A New Ecosystem for Data Science Collaboration

Oracle Data Science is an interesting product and covers a missing piece in Oracle's AI strategy. As of now the user experience is a bit rough around the edges, both during provisioning, with the policy configuration as a required pre-step, and during utilization, with some of the steps not working 100% of the time. This, however, is just the first release and we hope to see a speedy set of follow-up versions, as is already happening for all the other products in the Oracle Cloud.

Categories: BI & Warehousing

Oracle Analytics Server Step by Step Installation

Thu, 2020-02-06 03:47
Oracle Analytics Server Step by Step Installation

Oracle Analytics Server is available! There is already a blog post talking about how to have a Docker version up and running in minutes. Mine explains the GUI process, providing hints on how to automate the whole installation using response files. The steps are very similar to an OBIEE 12c installation; let's check the details.

Supported OS and Database

Before installing, it is always good practice to check the supported OS and databases (for the RCU schemas). News for Oracle Analytics Server is that the certification matrix has migrated from an Excel sheet to a website, making it easier to find information.

For this initial OAS release, the supported OS is Linux, with Oracle Linux 6 being the minimal release. Windows support will come later in the year.

Oracle Analytics Server Step by Step Installation

Also worth mentioning is that, as per the documentation, the OAS RCU is supported only on Oracle Databases starting from 11.2; support for other databases will come in the future.

Oracle Analytics Server Step by Step Installation

With the prerequisites cleared, let's now see how we can install OAS!

Files Download

The first step is to download files from e-delivery

Oracle Analytics Server Step by Step Installation

You can avoid downloading the Oracle VM Virtual Appliance since it's not needed, which will save you from downloading an extra 2.8GB.

Oracle Analytics Server Step by Step Installation

One of the prerequisites is the Java JDK 1.8 (8u211 or newer), which can be downloaded from the Oracle website.

Edit: You can find all the files now in a unique place.

Before starting the installation, let's have a look at the recommended directory structure: as for OBIEE 12c, the Oracle documentation suggests keeping the ORACLE_HOME and DOMAIN_HOME separate. This is not just good practice: by keeping binaries and configurations separate we avoid having problems when new OAS versions are released, since they'll be installed in new ORACLE_HOMEs while the configurations and application data will always reside in the DOMAIN_HOME.

Oracle Analytics Server Step by Step Installation

Be aware that the OAS default configuration step will suggest creating the DOMAIN_HOME under the ORACLE_HOME; you'll need to set the DOMAIN_HOME externally to be compliant with what the documentation suggests!
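
For reference, a minimal sketch of the separation (the ORACLE_HOME path is the one used later in this post, while the DOMAIN_HOME path is only an example; use whatever fits your standards):

# Binaries: replaced when a new OAS version is installed
export ORACLE_HOME=/opt/oracle/product/oas55
# Configuration and application data: kept outside the ORACLE_HOME
export DOMAIN_HOME=/opt/oracle/config/domains/bi
mkdir -p $ORACLE_HOME $DOMAIN_HOME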

Installation

The first step is to unzip the files V988574-01.zip and V983368-01.zip, then we can start the Fusion Middleware installation with

java -jar fmw_12.2.1.4.0_infrastructure.jar

This will open up the installation GUI, asking to set up the Oracle Inventory Directory and the OS group, if not already configured on the same server.

Oracle Analytics Server Step by Step Installation

The next steps require choosing the ORACLE_HOME; as per the diagram above, we selected /opt/oracle/product/oas55, which also gives a hint about the OAS version we're installing.

Oracle Analytics Server Step by Step Installation

The rest of the FMW installation involves selecting the installation type, verifying the prerequisites and checking the installation summary. An important step, if you want to reproduce the same installation via script in other environments, is to save the response file. The file (named fmw.rsp in the following command) contains all the details of the setup and, by changing some parameters, it can be reused for other installations.

java -jar fmw_12.2.1.4.0_infrastructure.jar -silent -responseFile fmw.rsp -invPtrLoc <INVENTORY_FILE_LOCATION>

After installing, we need to apply the 30657796 patch which was downloaded from e-delivery. To do that, we simply need to unzip the V988922-01.zip file which will create the 30657796 folder and then

cd 30657796
export PATH=$ORACLE_HOME/OPatch:$PATH
opatch apply

If the patching is successful we'll get a message like

Patching component oracle.fmwconfig.common.wls.shared.internal, 12.2.1.4.0...
Patch 30657796 successfully applied.
Log file location: /....log

OPatch succeeded.

The next step is the OAS installation; we can do it by running

java -jar Oracle_Analytics_Server_5.5.0.jar

After skipping the auto-updates step we just need to select the ORACLE_HOME (the same one defined above) and verify that all the prerequisites are met.

Oracle Analytics Server Step by Step Installation

Again, you may want to save the response file for later in case you want to reproduce the silent installation in another server.

Repository

Next step is to create the database schemas required by OAS, by executing the rcu command.

cd $ORACLE_HOME/oracle_common/bin
./rcu

The GUI will ask for Database location and credentials used to create the schemas

Oracle Analytics Server Step by Step Installation

Please note that you need to select the Oracle Business Intelligence checkbox in the Select Components tab. You can also specify tablespaces and passwords for each of the schemas created. Also in this case, an option to save the response file in order to do a silent installation is available.

The RCU schemas can also be created as part of the configuration process; I detailed it as a separate topic since the standalone RCU is more reliable and provides more options for tablespace and schema management.

Configuration

The next step is to configure OAS; for this we need to run

cd $ORACLE_HOME/bi/bin
./config.sh

The set of screens will ask for the DOMAIN_HOME mentioned above; the GUI will suggest a DOMAIN_HOME located under the ORACLE_HOME. If you want to be compliant with the Oracle documentation and avoid problems in case of future in-place upgrades, place the DOMAIN_HOME in a separate folder!

Oracle Analytics Server Step by Step Installation

Other details required are: the admin username and password, and the products to include (with BI Publisher and Oracle Analytics Enterprise Edition being the two options available). Then we'll have to point to the RCU schemas created before and decide the port range.

Oracle Analytics Server Step by Step Installation

If everything is set correctly you should get a successful configuration with all the services starting properly.

Oracle Analytics Server Step by Step Installation

And OAS should be available (by default at http://<SERVER_NAME>:9502/analytics)!

Oracle Analytics Server Step by Step Installation

The whole example above was performed on an OCI compute instance, installing the schemas on a DBaaS. This could be a solution for customers willing to use the Oracle Cloud but still needing a level of customization and control higher than what's achievable in OAC.

For more info, don't hesitate to contact us!

Categories: BI & Warehousing

Oracle Analytics Server is here

Mon, 2020-02-03 09:47
Oracle Analytics Server is here

Oracle has been talking about this product for months and we’re pleased to tell you the wait is over (for Linux users anyway): Oracle Analytics Server (OAS) 5.5.0 is now available for download on edelivery (search for Oracle Analytics Server).

Oracle Fusion Middleware needs to be downloaded separately

If your organisation uses Oracle technology for data analytics, then the likelihood is you’ve heard of Oracle Analytics Cloud (OAC). You may have also come across the latest addition to the family, OAS, the new on-prem version of OAC, set to eventually replace OBIEE.


The umbrella term, Oracle Analytics, now includes:

Oracle Analytics Server is here
  • Oracle Analytics Server (OAS)
  • Oracle Analytics Cloud (OAC)
  • Oracle Analytics for Applications (OAX)

Whilst OAC is the jewel in the crown, and will receive regular quarterly updates, these updates will be reflected in OAS. You’ll be pleased to hear current OBIEE users will be automatically licensed for OAS - the logic behind this is that OAS becomes a stepping stone in your journey to using Cloud. If you’re buying OAS new, the licensing model is the same as the current OBIEE model.

OAS looks almost the same as OAC, minus some features, like the Natural Language Generator. This feature generates explanations of your visualisations in 28 different languages and will probably be included in a later version of the tool.  

How does OAS compare to OBIEE and OAC?

  • Licensing: OAS now includes options like Data Visualization (DV) and Mobile which were previously considered extra.
  • Data Visualization: Oracle’s self-service visualisation tool does what it says on the tin, allowing you to decipher your enterprise data with intelligent visuals. It now includes almost all the new features available in OAC - a big step forward compared to the DV version available in the latest OBIEE.
  • Data Flows: Clean and Transform your data via a GUI based tool without leaving your analytical platform.
  • Machine Learning: All the goodies related to “one-click forecast” or  “Explain” and the full ML capabilities are now included in the on-prem Oracle Analytics Edition!
  • Configuration Options: OAS provides the "OBIEE"-type configuration options, where you can tweak each component individually.

Oracle’s aim is for users to achieve “100% data literacy”, and they plan to do this via their vision for analytics: augmented, collaborative and integrated. OAS really plays into this strategy, allowing users to employ data science and machine learning techniques to both analyse current trends and predict future ones (find out more in this blog post).

Oracle Analytics Server is here


Talk to us about how to migrate from OBIEE to OAS or OAC. We can help you with every deployment scenario including on-prem, hybrid, full public cloud, or a mix and match of these suited to your needs. Email us: info@rittmanmead.com to arrange a chat with one of our team.

Categories: BI & Warehousing

What's new in OAC5.5?

Mon, 2020-02-03 03:01

Last Friday, alongside Oracle Analytics Server (for which a blog post is coming), a new OAC version came out; let's have a quick look at all the new features it includes!

Maps

If you use Maps often, there is a good list of options available to you! The very first is the possibility to associate a Map Layer with a Data Column directly in the data source definition.

Let's say you have a column City Zones in your dataset, which divides a city into customized areas based on your company needs, and you have a map layer defining those areas geographically (e.g. with a GeoJSON file). I created an example with Verona, the city where I live. The custom GeoJSON file was created using geojson.io.
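
Purely as an illustration (the zone name and coordinates below are made up), such a file is a GeoJSON FeatureCollection in which every feature carries a property whose values match the City Zones column:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": { "CITY_ZONE": "Centro Storico" },
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [ [10.99, 45.44], [11.00, 45.44], [11.00, 45.45], [10.99, 45.45], [10.99, 45.44] ]
        ]
      }
    }
  ]
}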

What's new in OAC5.5?

The upload of custom shapes and their usage in OAC has been available for some time; however, as project creator you had to associate the City Zone column with the correct Map Layer for each map visualization included in your project. Now you can define the Map Layer to Data Column association once and for all at the datasource level, so every Map using the City Zone column will automatically use the correct Layer.

What's new in OAC5.5?

Another cool new feature in Maps is the AutoFocus on Data, meaning that the visualization will automatically zoom and center the map appropriately based on the dataset presented, and rearrange if the filtering changes.

Pivot Tables

Another new option is available in pivot tables, where you can now set Totals and Subtotals Above and Below like you were used to doing in Answers. Like the "old" tool, you can now set a different format for the Totals and Subtotals, with coloring, background and font formatting options available. You now have full control of the layout and can make beautiful or horrible (like the below) color choices.

What's new in OAC5.5?

Visualizations

The perfect visualization is now available: the Spacer Viz! This is an empty visualization that you can add to your canvas, allowing you to optimize the layout in cases where you need some extra white space.

What's new in OAC5.5?

Another new feature in this release is the Custom Background: it's now possible to define a color or an image as background for the whole Project or for a single canvas. The image can be a URL reference or uploaded from the desktop. There are also options to position the image on the screen and to auto-fit the image to the window size. Adding a custom background to the whole project means that every time a new canvas is added, it will already have the selected image/color by default.

What's new in OAC5.5?

Another addition is the Butterfly Viz: this view was already available as a plugin from the Oracle Analytics Library and now becomes native in OAC 5.5. The butterfly viz is useful when comparing two metrics across the same dimension.

What's new in OAC5.5?

By default the two metrics are on the same scale, but there is also a "Synchronized Scales" option that, when set to OFF, will show the metrics on different scales.

Datasources and Data Gateway

A new datasource definition for Oracle NetSuite is now available, allowing the connection by passing the parameters Datasource, Account ID and Role ID on top of the usual Host, Username and Password.

What's new in OAC5.5?

An enhancement has also been published for the Oracle Database connection: now you can select between a Basic and an Advanced connection. The Basic option should be used when connecting to single-node databases. The Advanced option, on the other hand, is useful when connecting to RAC clusters where multiple hostnames and ports need to be listed. When selecting the Advanced option we can simply add a custom connection string like the one below.

(DESCRIPTION=
      (ADDRESS_LIST=(LOAD_BALANCE=on)(FAILOVER=on)
      (ADDRESS=(PROTOCOL=tcp)(HOST=hostname1.subnet.com)(PORT=1529))
      (ADDRESS=(PROTOCOL=tcp)(HOST=hostname2.subnet.com)(PORT=1529))
      (ADDRESS=(PROTOCOL=tcp)(HOST=hostname3.subnet.com)(PORT=1529))
      ...
      (ADDRESS=(PROTOCOL=tcp)(HOST=hostnamen.subnet.com)(PORT=1529))
      )
)

An option is also available to use Data Gateway with Essbase sources, making the OAC connection to on-premises Essbase only one click away by just enabling the Use Data Gateway option on the screen. Data Gateway can now also be used with BI Publisher, allowing pixel-perfect reporting from on-premises datasources.

What's new in OAC5.5?

Another option in BI Publisher is Data Chunking, extremely useful for big reports since it allows the report to be executed as multiple parallel sub-jobs, with a final job consolidating the results into a single output.

Those are the new features in this release. Do you want more detailed examples of a particular feature? Let me know in the comments and I'll write about it!

Categories: BI & Warehousing

A decade in data, a decade to come

Fri, 2020-01-10 08:52

First of all I’d like to take the chance to wish everyone in our network, colleagues, customers and friends a very happy new year.

The last decade was an incredible one for the data industry.  I realised before writing this that 2010 was actually when I started at my first data management consultancy, so it’s given me the perfect chance to reflect on what I’ve witnessed in the last 10 years.  

Data Insights out of IT and into the business

One thing that stands out is the relevance of insights through data within business functions. The reason we do what we do at Rittman Mead is that we believe data solutions will drive positive change for us all in the future. Making data accessible, insightful and interesting to people who are doing great things for great companies is the goal for all of us.

During the last decade we saw the mass market adoption of line of business data discovery tools such as Tableau, Oracle Data Visualization, QlikSense & Microsoft PowerBI.  All wonderfully innovative products that have flown the flag for bringing insights through data into the business.  

Oracle DV Geo Layers

What is Big Data?

Also, what has happened to the Big Data hype which was huge back in 2010?

Ten years on, do we think it lived up to its hype? Do we even know what it means? On reflection, was it as groundbreaking as we might have thought at the time? This is an argument that could be had but might never end.

Public Cloud Cover

One thing is for sure - the adoption at scale of public Cloud, SaaS, storage and what seems to be infinite amounts of processing power has been huge since 2010. We've seen AWS and Microsoft Azure leading the way, and we see Google and Oracle Cloud Infrastructure catching up. We know there are pros, cons, knowns and unknowns when moving to Cloud-hosted applications and platforms. And one thing is for sure - come 2030 the Cloud will be bigger, whether we like it or not.

Streams of Events

Finally the emergence of event driven architectures is as innovative as it is complex. Large growth companies like Uber have built their business model on the ability to process huge amounts of events at low latency in order to operate.  Think demand + supply and Uber’s dynamic pricing model.  Fortunately open source projects like Kafka and particularly organisations like Confluent, the company behind Kafka, have made it easier to integrate this functionality to businesses throughout the world.

Analysing stream events in realtime using Kafka KSQL

It wouldn’t be fair to say that innovation in the data industry has always been used ethically over the last 10 years, and this is something that we all have a part to play in going forwards. The Facebook / Cambridge Analytica scandal certainly made everyone reflect on the importance of using data for positive and ethical outcomes rather than underhand and opaque tactics.

And what's next?

As we embark on a new decade, we at Rittman Mead are excited about what is to come. Artificial Intelligence, Machine Learning and Augmented Analytics are going to hit the mainstream and will play a part in all of our lives. But the good news is: there will always be a place in this market for people. Data is necessary and sophisticated machines are necessary, but it all relies on people to work, whether that be building an algorithm or making a decision based on the information that a particular algorithm has generated. We will see more automation, and this is a brilliant thing. Robotic Process Automation (RPA) and autonomous databases such as Oracle’s offering are designed to breed innovation, not replacement.

Come and meet us in February

For us it all starts in February at Oracle OpenWorld Europe. We will be exhibiting and presenting there, and there will be insightful, thought-provoking and relatable content throughout the two days. Augmented analytics and autonomous will be at the top of the agenda, and rightly so. What’s more, it’s completely free to attend. Registration details can be found here:

We’d love to meet with you if you do decide to come.  To make this easier (as we know there is a lot to cover on days like these) we have set up a calendar where you can book a meeting slot with one of our sales team or technical consultants.  We can talk about:

  • Oracle
  • AI/ML
  • What are we doing?
  • What are you doing?
  • Anything you like in an informal setting

Please feel free to book a time slot using the link below:

https://calendly.com/rittman-mead/oracle-openworld

Finally we wish everyone a very prosperous new decade - who knows where we will be by 2030!

Categories: BI & Warehousing

Machine Learning and Spatial for FREE in the Oracle Database

Fri, 2019-12-06 04:34

Last week at UKOUG Techfest19 I spoke a lot about Machine Learning, both with Oracle Analytics Cloud and, more in depth, in the database with Oracle Machine Learning, together with Charlie Berger, Oracle Senior Director of Product Management.

Machine Learning and Spatial for FREE in the Oracle Database

As mentioned several times in my previous blog posts, Oracle Analytics Cloud provides a set of tools helping Data Analysts start their path to Data Science. If, on the other hand, we're dealing with experienced Data Scientists and huge datasets, Oracle's proposal is to move Machine Learning where the data resides with Oracle Machine Learning. OML is an ecosystem of various options to perform ML with dedicated integration with Oracle Databases or Big Data appliances.

Machine Learning and Spatial for FREE in the Oracle Database

One of the best-known branches is OML4SQL, which provides the ability to do proper data science directly in the database with PL/SQL calls! During the UKOUG TechFest19 talk Charlie Berger demoed it using a collaborative Notebook on top of an Autonomous Data Warehouse Cloud.
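
As a rough, hedged sketch of what that looks like (the table, column and model names below are invented for illustration, not taken from the demo), building a classification model with OML4SQL boils down to a single call to the DBMS_DATA_MINING package:

BEGIN
  -- train a classification model on an existing table of labelled wine data
  -- WINE_TRAINING, WINE_ID and QUALITY are hypothetical names
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'WINE_QUALITY_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'WINE_TRAINING',
    case_id_column_name => 'WINE_ID',
    target_column_name  => 'QUALITY');
END;
/

Once the model exists, scoring new rows is plain SQL, e.g. SELECT wine_id, PREDICTION(WINE_QUALITY_MODEL USING *) AS predicted_quality FROM wine_new; - which is exactly why "move the ML to where the data is" works so well.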

Machine Learning and Spatial for FREE in the Oracle Database

Both Oracle ADW and ATP include OML by default at no extra cost. This wasn't true for the other database offerings, in cloud or on-premises, which required an additional option to be purchased (the Advanced Analytics one for on-premises deals). The separate license requirement was obviously something that limited the spread of this functionality, but I'm happy to say that it's going away!

Oracle's blog post yesterday announced that:

As of December 5, 2019, the Machine Learning (formerly known as Advanced Analytics), Spatial and Graph features of Oracle Database may be used for development and deployment purposes with all on-prem editions and Oracle Cloud Database Services. See the Oracle Database Licensing Information Manual (pdf) for more details.

What this means is that both features are included for FREE within the Oracle Database license! Great news for both Machine Learning and Graph Database fans! The following tweet from Dominic Giles (Master Product Manager for the Oracle DB) provides a nice summary of the licenses, including the two options, for Oracle DB 19c.

The #Oracle Database now has some previously charged options added to the core functionality of both Enterprise Edition and Standard Edition 2. Details in the 19c licensing guide with more information to follow. pic.twitter.com/dqkRRQvWq2

— dominic_giles (@dominic_giles) December 5, 2019

But hey, this license change also affects older versions, starting from 12.2, the oldest one still in general support! So, no more excuses: perform Machine Learning where your data is, in the database with Oracle Machine Learning!

Categories: BI & Warehousing

Rittman Mead at UKOUG TechFest 19

Tue, 2019-11-19 10:25

Like every year, December for the Oracle community means UKOUG! This time is special since the event, named TechFest19, will be hosted at The Grand Brighton Hotel, near our base office!

Rittman Mead at UKOUG TechFest 19

Let's talk about the important stuff first: we are organising a Techfest party featuring "The Chaps" on Monday 2nd Dec between 7PM and 10:30PM in our office at Platf9rm, Hove Town Hall. We'll be having a few drinks while enjoying the live music; if you are interested, register yourself here!

Rittman Mead at UKOUG TechFest 19

Now on to the main topic, the Techfest! Rittman Mead will be well represented this time with four talks:

Data the Missing Ingredient

Monday 2nd at 14:15, Location: Stage 4

Jon Mead, our CEO, will introduce you to the concepts of Data Management, Engineering and Governance, how they should be addressed across a wide range of projects and which Oracle tools are there to help you.

Become an Equilibrista: Find the Right Balance in the Analytics Tech Ecosystem

Tuesday 3rd at 11:00, Location: Stage 2

Rittman Mead at UKOUG TechFest 19

Do you want to understand how to avoid the "Excel Chaos" in your organization? Let me show you how you can balance centralized and self-service analytics, taking the best of both worlds while still using a single tool: Oracle Analytics Cloud!

Is it Corked? Wine Machine Learning Predictions with OAC

Tuesday 3rd at 15:45, Location: Stage 2

Do you love Wine? And maybe you don't dislike Analytics? Then join me in understanding how Oracle Analytics Cloud can be used for data-science! We'll be analysing a Wine dataset and using OAC to create a predictive model scoring wine quality! If you are a business analyst looking to start your path into Machine Learning, this session is a kickstarter!

Rittman Mead at UKOUG TechFest 19

Picking a Good Wine for <$20 Using Oracle Autonomous, Machine Learning and Analytics Cloud

Monday 2nd at 09:00, Location: Stage 2

Using the same dataset as the talk above, Charlie Berger, Sr. Director of Product Management, will show you how to build a predictive model by performing Machine Learning directly within the Oracle Autonomous Data Warehouse, all accessed through a notebook interface. I'll then show you how we can integrate such a model within OAC and display the main drivers as well as the model outcomes!

During the event, a few of the Rittman Mead folks will be around. If you see us in sessions, around the conference or during our talks, we'd be pleased to speak with you about your projects and answer any questions you might have.

Categories: BI & Warehousing
