February 28, 2013

My Perspective on OEID, Part 1

Time to eat a little crow.  OK, OK, I started this big "real world OEID" effort and I never got any further than here.  That's pretty lame on my part.  Sadly, 24 hours is not enough time in a day unless someone invents a pill that replaces sleep (and do I really want that?).  However, a few people have asked whether I'm going to continue this effort, and my answer is yes...but not yet.  I'm going to explain the why behind that answer in a series of posts on how I have come to view this product.

Believe it or not, a year has passed since Oracle acquired Endeca.  In that time, Oracle's messaging on what OEID is and how it should be used has meandered a bit.  I won't go so far as to say it has wildly fluctuated, as some might, but I do think of it as a lazy river curving back and forth across the landscape, winding itself around various barriers and pushing through others.  And while it's unfair to say that Oracle has it all wrong, I will say Oracle is missing an opportunity to frame the product in a very real way that I think many of us BI professionals could wrap our heads around.  This same framing could also be used as one leg of the development roadmap for OEID, and I'll speak to that in another post.

After much pontificating and some experience working with the tool, I think a really good way to position the product is as a BI ad hoc reporting capability.  I'll define what I mean by that more fully in a moment, but in case you're going to stop reading this post in a minute or two, here is the short version: I think OEID is the ad hoc reporting tool that users want "Answers" to be, but that Answers isn't.  If you get that, feel free to stop reading now and save yourself some of the precious time I lamented above.

Kevin's CliffsNotes History of Ad Hoc BI

Now, I'm not a reporter and this was not researched one bit, but in my somewhat long professional career in BI, ad hoc reporting has always been a desired component of what I delivered to companies, and it has always been tricky for a number of reasons.  In my world, the tools "began" as direct database query tools where users had to know table names, column names, joins, etc.  They could construct horribly performing queries and bring databases to their knees, but if the users were sophisticated enough in SQL, they had a lot of flexibility.  But there was still a gap: if the data they wanted to analyze didn't exist in their prescribed databases, they couldn't easily get at it, so it invariably meant dumping data to Excel and Access and performing more analysis there.

Over time, some of these tools adopted metadata layers to hide the complexities of the database from the user, supposedly making the tools easier to use and requiring fewer SQL skills.  While there was some truth to this claim, it introduced an IT effort because users were very rarely trusted with the metadata layer.  When a user wanted more data, IT had to intervene.  And despite having more control over what users could query, IT still couldn't prevent ad hoc users from creating horrible queries and bringing the database down, so a requirement for ad hoc access was often met with resistance.  So users still downloaded data into Excel or Access and took it from there.

There have probably been additional layers of change/improvement over time, but the popular trend today seems to be leveraging smaller scale tools that are somewhat friendly to the business from a development/maintenance/use perspective.  These tools, like Tableau, Qlik, and Spotfire, have sexy interfaces and offer a lot of value, but they tend to be rogue efforts and a tug of war erupts over the use and control of them.  One could argue that IT should just get out of the way, but that's not always in the best interest of the company, despite the perceived roadblocks that IT creates.

Ad Hoc Reporting and Why Answers (OBIEE) Falls Short

From what I've witnessed over the years, business users would look for the following characteristics in "ad hoc reporting":

  1. Flexibility - as few constraints as possible with regard to where the data comes from, what it looks like, and how it can be queried.
  2. Quick to Market - the ad hoc need is often identified just-in-time and it can't take weeks (or sometimes even days) to deliver the capability; ad hoc needs often vary greatly over time instead of being predictable and repeatable.
  3. Fast - the days of launching an ad hoc query and waiting an hour for it to return the results are long gone; it has to be fast enough to adapt and change the output quickly and frequently over the lifespan of the analysis.
  4. Believable - the data accuracy doesn't have to be perfect, but it has to be somewhat trustworthy.
  5. Accessible - power users tend to know a lot about data and data sources, but don't always know all the details of what and where.  If the tool can help without getting in the way, it's a bonus.

I'm sure there are other factors involved that vary from company to company, user to user, but those are the requirements I hear expressed consistently.  So let's look at why OBIEE "Answers" falls short in meeting these requirements.

  1. Flexibility - this gets back to my earlier point about IT having to intervene to define a data source in the metadata.  Very few companies allow power users into the RPD, and though BI Publisher has an interface to allow the upload of local files as data sources, the tool itself is a little too complex to use...for now.  The additional knock on OBIEE is that it can't handle *all* types of data well, and those fringe types are becoming more and more mainstream to analyze because they deliver real business value.  In fact, we often constrain users to a subset of the available data in an ad hoc subject area to help control the experience, limiting users to the data OBIEE actually handles well.  The last point I would make here is that the paradigm for constructing your ad hoc analysis is very SQL-like: "select columns from subject area where filter definition."  While SQL is tried, true, and dependable, it's a structured query language.  Ad hoc isn't always so structured, or at least shouldn't be constrained by a structure.
  2. Quick to Market - I'll get some pushback on this one from my professional peers, but OBIEE is not quite fast enough to market for ad hoc environments.  I've been promoting agile/rapid/iterative development in OBIEE for a long time and I think today's BI tools are faster to develop in than yesterday's, but I think it still misses the mark that ad hoc users are looking for, partly because of #1 above and partly because it's hard to deliver a highly tuned, high performing ad hoc environment, as I'm about to describe below.
  3. Fast - this is not really OBIEE's fault, per se, but the reality is that it's hard to tune databases for infinite query possibilities.  To meet the performance needs of a truly ad hoc BI tool, I believe you have to think differently with regard to your data store.
  4. Believable - this one is not really a product of the BI tool, per se, but having your ad hoc environment be able to feed from your "single version of the truth" is valuable.  In that sense, OBIEE is solid here.
  5. Accessible - the metadata layer helps to define standard data sets and where to pull them from, but again, *all* data has to go through the RPD for OBIEE, and that's a big problem.

Why OEID is a Pretty Good Ad Hoc BI Tool

So why do I believe that OEID is better than Answers?  Let's start by looking at our five characteristics again:

  1. Flexibility - while one has to be careful not to drink too much marketing Kool-Aid here, I feel pretty confident in saying that OEID can handle more data types and sources than OBIEE can.  You also don't have to spend as much time "modeling" that data into a star schema or any other schema, for that matter (I will address the "ETL" in Part 2).  Lastly, the combination of search across all attributes in the data store and the way the guided navigation allows you to rapidly apply/re-apply filtering lends itself to the ad hoc experience.  In Answers, I must view my results in the Results tab and go back to the Criteria tab if I want to change my filter, then go back to my Results tab again.  In OEID, it's all in one place and the filtering is very fluid, catering to a rapid ad hoc experience (more on that in #3).  While I will agree that much of the filtering in OEID is similar to the nested Boolean filtering we're used to working with in SQL, the way you apply it is much more flexible.
  2. Quick to Market - again, this will generate some controversy, but OEID should be faster to deliver for an ad hoc environment.  The amount of time you have to spend planning and tuning broad Answers access by power users can be significant, and this should, on average, be easier and faster to do in OEID.
  3. Fast - this may be the most compelling argument I have.  Ad hoc access has never been synonymous with blazing speed, but it could be with OEID.  When ad hoc users want outer joins, NOT INs, and vaguely filtered data across hundreds of attributes, we OBIEE developers cringe.  Now, I won't sit here and write that it's impossible to develop a slow OEID application, but the Endeca Server is much better suited to handle the needs of unpredictable ad hoc users with extremely fast performance across many attributes than the combination of the BI Server and a relational database.  It's not that a relational database can't perform a given query fast, but it's much harder to prepare a relational database for infinite query possibilities when compared to the Endeca Server (excuse me while I go polish my shield to defend myself from my database peers).
  4. Believable - not much to say here, other than the newest release of OEID has a wizard to ingest data from the BI Server, which goes a long way to establishing trust in the ad hoc analysis, assuming the data presented in OBIEE dashboards is trustworthy.
  5. Accessible - I'm not sure that OEID is that much more powerful here than OBIEE, at least in the current versions of OEID.  While people won't generally equate the OBIEE RPD with the graphs built in Endeca Integrator, the reality is that all the data has to flow through both to get to the end target.  In other words, data won't get into the Endeca Server until it gets mapped into Clover ETL first...at least right now.
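
To make the search-plus-guided-navigation idea from #1 a little more concrete, here is a toy sketch in Python.  To be clear, this is not OEID's API and says nothing about how the Endeca Server works internally; the records, the attribute names, and the functions (search, refine, remaining_values) are all made up for illustration.  It only shows the interaction pattern: search across every attribute, narrow by picking a value, and immediately see which values are left to pick from.

```python
from collections import Counter

# Hypothetical records; every attribute name and value here is invented.
RECORDS = [
    {"region": "East", "product": "Widget", "channel": "Web", "note": "late shipment complaint"},
    {"region": "East", "product": "Gadget", "channel": "Store", "note": "price match request"},
    {"region": "West", "product": "Widget", "channel": "Web", "note": "late delivery"},
    {"region": "West", "product": "Gadget", "channel": "Web", "note": "widget accessory question"},
]

# Attributes offered as refinements in this toy example (an assumption).
FACETS = ("region", "product", "channel")


def search(records, term):
    """Keep records where any attribute value contains the term (case-insensitive)."""
    needle = term.lower()
    return [r for r in records if any(needle in str(v).lower() for v in r.values())]


def refine(records, selections):
    """Keep records that match every selected attribute value."""
    return [r for r in records if all(r.get(a) == v for a, v in selections.items())]


def remaining_values(records, selections):
    """For each facet not yet selected, count the values still present."""
    counts = {}
    for facet in FACETS:
        if facet not in selections:
            counts[facet] = Counter(r[facet] for r in records)
    return counts


# Search everything, narrow by one value, then look at what is left to pick.
hits = search(RECORDS, "late")
west = refine(hits, {"region": "West"})
print(len(hits), len(west))
print(remaining_values(west, {"region": "West"}))
```

Notice there is no "select columns from subject area where filter definition" step and no bouncing between a criteria screen and a results screen: each step just narrows the current set and recomputes the available choices, which is the fluid feel I'm describing.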
To further expound upon #1 and #3, there is something to be said for offloading ad hoc processing from the data store that services the standard reporting and dashboards.  The interactivity on OBIEE dashboards has increased pretty dramatically with 11g, and an argument can be made that the existing interactivity (with more to come) is "ad hoc lite."  There are still constraints, but there is also flexibility.  But this flexibility comes at a cost: increased processing on your data store (an argument for buying Exalytics).  If OEID can be positioned to assume the pure ad hoc BI capability, it can be a win-win for all users: ad hoc users gain more speed while standard users get all the attention of the OBIEE/database resources (I'll address the licensing problem with that approach in Part 2).

It's also worth re-stating that having all data defined in the RPD is not a realistic endeavor.  While Oracle positions the RPD as the "Common Enterprise Model" (or something like that), not all data is enterprise data.  Removing the myriad of Excel spreadsheets and Access databases defined in the RPD is probably a good thing when it comes to maintaining your OBIEE environment over time.  And while BI Publisher made a few strides in 11g with local data sources and the web-based modeling tool, the nature of interacting with BIP report building is still very SQL-like and the interface, for ad hoc reporting, is probably worse than Answers.  OEID flat-out nails the user experience piece of what we all imagine "ad hoc analysis" to be like.

To Be Continued...

All of that said, my argument is that OEID is a better fit than OBIEE Answers, not a perfect fit...yet.  In Part 2 I'll take a look at how I think OEID could play the ad hoc role today and at some enhancements I believe it needs in order to play that role better tomorrow.

3 comments:

  1. excellent piece of information, I had come to know about your website from my friend kishore, pune,i have read atleast 8 posts of yours by now, and let me tell you, your site gives the best and the most interesting information. This is just the kind of information that i had been looking for, i'm already your rss reader now and i would regularly watch out for the new posts, once again hats off to you! Thanx a lot once again, Regards,obiee online training

  2. Kevin - can you expound a little on the "fast" aspect of OEID? do you have anecdotal examples of what kind of query performance gains can be achieved?

  3. Appreciation for nice Updates, I found something new and folks can get useful info about BEST obiee ONLINE TRAINING
