In my last post, Want to build a BI platform?, I discussed the three forks in the road that we are facing: Visualizations, Business ETL (Data Prep), and Cloud Semantic Layer. In this post I will focus on how Vero may do Visualizations.
While we have 3 major directions we can push deeper into, we feel that without some basic visualizations in our tool it will be harder for users to complete their data prep activities in Vero. Our current strategy is to implement basic Visualizations while probing customers for their Business ETL needs. The following discussion lays out an idea we would pursue if we go all in on Visualizations and Dashboards.
Step 1. Acquire data from a variety of data sources
For any Visualization tool the first problem is to acquire data, build a local or remote cache, and present an OLAP (online analytical processing) interface on that data. An OLAP interface allows you to pivot, sort, and filter data without reaching back into slower underlying data sources. The great news for us here is that we have an amazing query engine already. It's far more capable than what Tableau and other leading visualization tools provide. We spent the majority of our time over the last two years researching and building this engine. In fact, we even have patents on our ability to preemptively eliminate data, which makes complex queries run much faster.
Below is our intuitive query builder. It supports many popular data sources, including PostgreSQL, SQL Server, MySQL, Oracle, and Teradata.
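To make the OLAP idea concrete, here is a minimal sketch of pivot, sort, and filter running entirely against an in-memory cache. It is not Vero's engine; the `CACHE` rows and the function names are made up for illustration, and a real cache would be far larger and columnar.

```python
from collections import defaultdict

# Hypothetical rows, standing in for data already pulled from a slow source.
CACHE = [
    {"region": "East", "year": 2014, "sales": 120.0},
    {"region": "East", "year": 2015, "sales": 150.0},
    {"region": "West", "year": 2014, "sales": 90.0},
    {"region": "West", "year": 2015, "sales": 110.0},
]


def filter_rows(rows, predicate):
    """Keep only the cached rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def sort_rows(rows, column, descending=False):
    """Return the cached rows ordered by one column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


def pivot(rows, row_key, col_key, measure):
    """Sum a measure into a {row value: {column value: total}} table."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[measure]
    return {key: dict(cells) for key, cells in table.items()}


if __name__ == "__main__":
    print(pivot(CACHE, "region", "year", "sales"))
    print(sort_rows(filter_rows(CACHE, lambda r: r["year"] == 2015), "sales", descending=True))
```

None of these calls touch the original database, which is the whole point: once the data is cached, every pivot, sort, or filter is a local operation.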
Step 2. Examine all of the data at once
Most visualization tools dump you into a screen devoid of data and full of controls and buttons right after the data acquisition step. Some tools may present you with a static preview of the data with limited or no filtering and sorting. We believe this is a jarring experience, and for most analysts it is not an effective workflow. It's an awful experience to have to build a visualization to determine whether you even got the right data.
A better experience would be to enter a tabular view where you can see all of the data, sort and filter the data, and reposition columns if needed. This is closer to what an analyst would do in Excel before building a pivot table or chart. Here is a mockup of how we would implement this view:
In this view you can see all of the data returned from your data sources. Under each column header you have immediate access to sort and filter controls that are sensitive to the type of data. Then, when you are ready to visualize data you can hit the red chart icon on the toolbar.
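As a rough illustration of what "sensitive to the type of data" could mean, the sketch below inspects each column's values and picks the header controls to show. The type names, the `CONTROLS` table, and the function names are assumptions made for this example, not a description of how Vero actually does it.

```python
from datetime import date

# Hypothetical mapping from an inferred column type to its header controls.
CONTROLS = {
    "number": ["sort", "range filter"],
    "date": ["sort", "date range filter"],
    "boolean": ["sort", "true/false toggle"],
    "text": ["sort", "search", "value checklist"],
}


def infer_column_type(values):
    """Guess a column type from its non-null values; fall back to text."""
    present = [v for v in values if v is not None]
    if not present:
        return "text"
    # bool is a subclass of int in Python, so it has to be checked first.
    if all(isinstance(v, bool) for v in present):
        return "boolean"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "number"
    if all(isinstance(v, date) for v in present):
        return "date"
    return "text"


def controls_for(rows):
    """Return {column name: list of controls} for a list of row dicts."""
    columns = list(rows[0].keys()) if rows else []
    return {
        column: CONTROLS[infer_column_type([row.get(column) for row in rows])]
        for column in columns
    }


if __name__ == "__main__":
    sample = [
        {"customer": "Acme", "amount": 12.5, "ordered": date(2015, 1, 3), "paid": True},
        {"customer": "Globex", "amount": None, "ordered": date(2015, 2, 9), "paid": False},
    ]
    for column, controls in controls_for(sample).items():
        print(column, controls)
```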
Step 3. Visualizing Data
Obviously the big players in this space have hundreds of bells and whistles covering every possible visualization scenario. Unfortunately, the more features you have, the more complex your application inherently becomes. We propose an easier approach to visualizations while borrowing some of the cool ideas others have pioneered. Here is a mockup of our visualization tier:
The main differentiator we bring here, besides simplicity and the clutter-free design, is the always-on access to filtering and sorting by any column in the original dataset. Most visualization tools have simply copied the Tableau model and decided you need a drop zone to define everything, including adding filters and sorting rules to your viz. I personally think that is not the most productive user experience.
Notice that the column header controls for all of the columns in the original dataset are still accessible at the top, in the same way they were in the preview area. This means you can filter and sort by any column without requiring the additional step of dragging and dropping an item into a filter or paging drop zone. You will see that this makes your data discovery activities much faster.
Finally, in our collapsible chart controls tray, we have drop zones that allow you to specifically define your visualization. Define the X-Axis, Series, and Series Segment precisely.
One of the things I find annoying in Tableau is the amount of effort it takes to get a specific visualization. They do an amazing job of guessing visualizations and they have very innovative chart layouts. However, if you know exactly what you want, you will find yourself wrestling with “rows” and “columns”. As an alternative to this design, we present controls that allow you to be more intentional about your visualization.
Step 4. Sharing Visualizations and Dashboards
This is another area where we can shake things up in a big way. All major providers of data discovery products (BI, visualizations, etc.) have capitulated and followed the Tableau and QlikView model. That is, they have all created desktop data discovery products. In some cases, like MicroStrategy, this was after attempting a pure web-based data discovery product called Visual Insight.
Why did this happen? Here is my analysis of the situation:
- A desktop app provides a more personal experience with your own data files that aren’t by default accessible to admins or anyone else
- Cube performance for moderately sized data (less than 10GB) is much better when it is device local. A big problem with in-memory cubes and other central server-hosted shared caching strategies is that the performance degrades with concurrency. I have yet to experience sub-second performance in a web-based viz/dashboard backed by a moderately sized cube
Despite all of this, when it comes to sharing dashboards and visualizations, everyone and their mamma has opted for a web-only strategy. The Tableau Reader offers a native experience, but it can’t auto-update or connect to remote data sources. This sucks! End users get stuck with a worse experience than the creators.
We would pursue an all-native visualization experience across all devices. You would create a viz/dashboard in Vero and publish it to the Vero Server for scheduling and administration. All of your end users would have a very thin viewer client that does a Dropbox-like sync to keep their Visualizations up to date. Now you, the Analytics Pro, and your stakeholders will have the same visual experience. This has the added benefit of solving the email distribution problem, where static data is sent as Excel and CSV files. In this model, users get a notification on their device when dashboards and visualizations they follow are refreshed.
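One way to picture the Dropbox-like sync is as a comparison between what the server has published and what the viewer already holds. The sketch below is purely hypothetical: the manifest shape (dashboard name mapped to a version number) and the `plan_sync` function are assumptions for illustration, not a Vero Server API.

```python
def plan_sync(server_manifest, local_versions, followed):
    """Decide what a thin viewer client should do on its next sync.

    server_manifest: {dashboard name: version} as published on the server
    local_versions:  {dashboard name: version} currently stored on the device
    followed:        set of dashboard names this user follows
    """
    # Anything missing locally, or stored at a different version, is fetched.
    download = sorted(
        name
        for name, version in server_manifest.items()
        if local_versions.get(name) != version
    )
    # Anything no longer published is removed from the device.
    delete = sorted(name for name in local_versions if name not in server_manifest)
    # Only refreshed dashboards the user follows trigger a notification.
    notify = [name for name in download if name in followed]
    return {"download": download, "delete": delete, "notify": notify}


if __name__ == "__main__":
    server = {"sales": 3, "churn": 1, "ops": 2}
    local = {"sales": 2, "churn": 1, "old": 5}
    print(plan_sync(server, local, followed={"sales"}))
```

Because the viewer only pulls what changed, the client can stay thin while still rendering everything natively from its local copy.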
Imagine that, everyone has the same visualization experience!
I truly believe we can make a major dent here by making data acquisition easier, while providing a better visualization experience for all users. Let me know what you think about our strategy in the comments below.