It's Time to Unleash the Semantic Layer

What is the semantic layer, you ask?

The semantic layer is the underpinning of modern business intelligence platforms. Vendors with robust semantic capabilities command more than 50% of a $14 billion plus market:

[Figure: semantic layer market share]

Source: Gartner Worldwide BI Market Share. (*I'm not sure whether SAS offers a semantic layer in the classical sense; please let me know in the comments.)

This data understates the true market share and widespread usage of semantic layers within enterprise BI shops. Several BI tools in the "Others" category, like Birst, also offer solid semantic capabilities. The real number might be closer to 60% or even higher.

So yeah, it's a big deal, worth at least $7 billion. The semantic layer was pioneered by Business Objects in the early '90s, well before the acquisition by SAP, and they remain the market leader in BI after all these years.

Back to the original question: what exactly is the semantic layer? From Wikipedia:

"A semantic layer is a business representation of corporate data that helps end users access data autonomously using common business terms."

Close, but I think that's too high-level and simplistic.

Here is my definition:

Semantic layers map concepts like Customer, Product, Clicks, and Revenue to a set of data transformation rules, aggregation rules, datasources, tables, and columns, allowing users to ask novel questions without knowing anything about the underlying structure of data.

There are three big reasons why enterprise IT loves the semantic layer:

  1. Knowledge Encapsulation: A bite-sized concept like Revenue is packed with details about where the data lives and how to query it correctly. Even if the original author of Revenue leaves the company, the knowledge lives on.
  2. Governance: It's repeatable and reliable. Every time you use a semantic concept, even in new contexts, you can rest assured you're getting the correct answer. A single source of truth where all your numbers tie out neatly, in theory.
  3. Self Service: This is probably the most important reason. Semantic concepts appear as words with an icon attached to them in user interfaces. Through some UI interaction, end users simply group words together, apply some constraints, and arrive at a desired dataset. All of that with no knowledge of the underlying data structures, zero programming, and little understanding of the business rules used to transform the data. That's kind of magical.
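
To make the definition a bit more concrete, here is a minimal sketch in Python of how a concept like Revenue could carry its own table, transformation rule, and aggregation rule, and how grouping two concepts plus a constraint could turn into a query. Everything in it is an assumption made up for illustration (the `orders` and `customers` tables, their columns, the join rule, the `build_query` helper); it is not how any particular BI product implements its semantic layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """A business term users group by, bound to a physical column."""
    name: str
    table: str
    column: str


@dataclass(frozen=True)
class Measure:
    """A business term users aggregate, bound to a rule and a table."""
    name: str
    table: str
    expression: str   # transformation rule (SQL fragment)
    aggregation: str  # aggregation rule, e.g. SUM


# Hypothetical join rule between the two illustrative tables.
JOINS = {("orders", "customers"): "orders.customer_id = customers.id"}

CUSTOMER = Dimension("Customer", "customers", "customers.name")
REVENUE = Measure("Revenue", "orders",
                  "orders.quantity * orders.unit_price", "SUM")


def build_query(dimension, measure, where=None):
    """Turn 'group these words, apply a constraint' into SQL text."""
    lines = [
        f'SELECT {dimension.column} AS "{dimension.name}",',
        f'       {measure.aggregation}({measure.expression}) AS "{measure.name}"',
        f"FROM {measure.table}",
    ]
    if dimension.table != measure.table:
        condition = JOINS[(measure.table, dimension.table)]
        lines.append(f"JOIN {dimension.table} ON {condition}")
    if where:
        lines.append(f"WHERE {where}")
    lines.append(f"GROUP BY {dimension.column}")
    return "\n".join(lines)


# The end user only ever picks "Customer", "Revenue", and a constraint.
print(build_query(CUSTOMER, REVENUE, where="orders.order_year = 2014"))
```

The end user never sees the join or the multiplication. Those details live inside the concept definitions, which is the knowledge encapsulation and self service described above.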

Okay, if you are still reading this, you are either a really good friend, a data nerd, or a very curious individual. In any case, thanks and congratulations: you're about to peek into the future.

Note: the BI semantic layer is different from the popular notion of the semantic web (RDF). In the BI tool, the links between semantic concepts are business rules and datasources, while in the semantic web the links point directly to other associated, relevant data (i.e. country -> country_table -> city_table -> city in a BI tool vs. country -> city in the semantic web).
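
As a rough illustration of that difference, the sketch below walks the same country-to-city relationship both ways. The table names come from the note above; the `hasCity` predicate and both helper functions are hypothetical and only there to show the shape of each path.

```python
# BI semantic layer: each concept is bound to a physical table, and two
# concepts are related through the tables that sit between them.
CONCEPT_TABLE = {"country": "country_table", "city": "city_table"}


def bi_path(source, target):
    """Hops a BI tool resolves to relate two concepts."""
    return [source, CONCEPT_TABLE[source], CONCEPT_TABLE[target], target]


# Semantic web: concepts are linked directly to each other by
# (subject, predicate, object) triples.
TRIPLES = [("country", "hasCity", "city")]


def web_path(source, target):
    """Direct link between two concepts, or None if no triple joins them."""
    for subject, _predicate, obj in TRIPLES:
        if (subject, obj) == (source, target):
            return [subject, obj]
    return None


print(" -> ".join(bi_path("country", "city")))
# country -> country_table -> city_table -> city
print(" -> ".join(web_path("country", "city")))
# country -> city
```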

Now for some bad news for semantic layer fans (I include myself in this group). After a decade and a half, we are seeing major cracks in this strategy, and sometimes downright collapse. End users are running away en masse from centrally managed BI tools, and the semantic layer is partly to blame. The proof is in the meteoric rise of Tableau, which once called enterprise BI a pencil attached to a brick. That brick, for better or worse, is the semantic layer. Here are four reasons end users are fleeing from enterprise BI and semantic layers:

  1. It's static and slow to change: Nobody likes stale bread. Business changes fast, and while having a true north for Revenue makes sense, it doesn't make sense to make it impossible to try new things. If you wanted to introduce a new way of calculating or sourcing Revenue, merely to experiment, you would have to go through lengthy change requests. Either that, or you would have to dump the data out somewhere else where you can wrangle it. Many users are opting for the latter, which raises the question: why bother using enterprise BI and semantically enabled tools at all?
  2. Brittle or non-existent collaboration models: The bane of my BI career. Everything is centrally managed, with few automated processes to collaborate on and integrate new semantic concepts. It's often manual and time consuming. We find more issues in production than during the tedious migration processes. Here you will also find the worst kind of version control, the kind that doesn't tell you what the actual difference is between two versions of a semantic concept.
  3. Extremely long learning curves: Sometimes it feels like that was done on purpose to help the service side of the business. If you are an end user or analyst and you undertake a three-month training course on one of these platforms, you're probably going to go back to doing things the manual way.
  4. You need expert operators to set up and maintain it: Because of (#3), you probably gave up and realized you need to hire expensive experts. That means every time you want to iterate, you have to go through another person, wasting precious time.
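
On the version control complaint in (#2), here is a small sketch of what a useful diff between two versions of a semantic concept could look like: a field-by-field comparison that says exactly which rule changed. The dictionary layout, the two Revenue definitions, and the `diff_concept` helper are all hypothetical, made up for illustration rather than taken from any existing tool.

```python
def diff_concept(old, new):
    """Return {field: (before, after)} for every field that changed."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


# Two hypothetical versions of the Revenue concept.
revenue_v1 = {
    "table": "orders",
    "expression": "quantity * unit_price",
    "aggregation": "SUM",
}
revenue_v2 = {
    "table": "orders",
    "expression": "quantity * unit_price - discount",
    "aggregation": "SUM",
}

for field, (before, after) in diff_concept(revenue_v1, revenue_v2).items():
    print(f"{field}: {before!r} -> {after!r}")
# expression: 'quantity * unit_price' -> 'quantity * unit_price - discount'
```

A reviewer looking at that output can see at a glance that only the calculation changed, not the source table or the aggregation.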

It’s time to unleash the semantic layer. The great unbundling of the BI stack is already underway with Visualizations out of the gate first. The next to go will be the Semantic Layer.

Startups like mine, Vero Analytics, are already working hard to solve the many problems I described above while preserving the benefits of a semantic layer. Over the next months and years you will see the emergence of a new category of tools under the banner of "Semantic Data Preparation." Like Visualization Tools, this new category will target business units and end users with lower learning curves, intuitive user interfaces, and a collaboration model that makes sense for the 21st century. It will make transparent what Revenue is and how it should be calculated, while allowing those rules to evolve seamlessly within a collaborative framework.

These new tools will marry reporting, analytics, and data preparation into one workflow. This means users will be able to self-serve questions like "Who are my Top 10 Customers?", then follow up by enriching, correcting, and transforming that dataset, all the while adding and discovering new semantic concepts that become shareable and reusable across the whole organization.

The enterprise BI stack will start to change with a bring-your-own-visualization-tool strategy at the top level. Each business unit will and should be free to choose its preferred visualization toolkit, even if that means sticking with Excel. This visualization tier will sit on top of the Semantic Data Prep tier, with direct integration from visualization tools. In the Semantic Data Prep tier, users will define datasets, build pipelines, collaborate on semantic definitions, and possibly even cache datasets.

I am looking forward to this future where truth and creativity live together.

My team and I have been BI professionals for the last 10 years, and as the founder and CEO of Vero Analytics, I can say we are thrilled to be part of this revolution. We recently announced the wide availability of our first release of the Vero Designer in open beta (free extended trials). Our work is not complete, but the Vero Designer will give you a taste of how the Semantic Engine will work in the future. You can download it here: