1. Most BI companies rely on internal services teams or external partners to build the data transformation layers that feed their applications, which explains why there is always a healthy line item for “professional services” on their quotes. Not only is this misleading from a TCO perspective, but those resources also use traditional ETL processes behind the scenes to deliver data integration tasks. That means you can expect the same order of delay for data requests that you may be used to with your internal IT team, and it will only get worse as more of your colleagues request these services and more data sets come online for analysis.

2. Given the nature of silo-based business operations and the data silos they create, an enterprise will never standardize on a single BI/analytics solution. There are many competing solutions on the market today, and most are purpose-built for different tasks, e.g. Visualization, Visual Discovery, Advanced Analytics/Machine Learning, Real-Time Data Reporting. Installing a semantic layer for data search, discovery and transformation will future-proof your BI/analytics roadmap, with the added benefit that you can continually adopt best-in-class tools as they come to market.

3. Once you build an ETL/data transformation pipeline for a specific BI tool, the newly created data set can only be used in the tool for which it was designed, since it resides in that tool’s proprietary data format. What happens when someone using an Advanced Analytics tool wants the data set that was created in your Visual Discovery tool? Most BI tools are designed for individual users, so there is no “community learning” element: each user has to figure out his or her own data selections and subsequent analysis, and the aggregate time wasted re-deriving insights and institutional knowledge already learned elsewhere costs enterprises millions of dollars in lost productivity. The same applies to ETL support for the enterprise. Generally, an individual supports one silo group within the organization and is familiar with the data sets that group uses. This becomes a real problem when that individual leaves: hard-won institutional knowledge walks out the door and often takes weeks to relearn.
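One way around this lock-in is to keep transformations outside the BI tool and publish results in an open format. Here is a minimal sketch using PySpark and Parquet; the paths, table and column names (orders, region, amount, and so on) are hypothetical, and the transformation itself is illustrative only.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("shared-transform").getOrCreate()

# Hypothetical raw source; substitute your own tables and paths.
orders = spark.read.parquet("/data/raw/orders")

# The transformation itself is illustrative: filter, derive, aggregate.
sales_by_region = (
    orders
    .filter(F.col("status") == "complete")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("region", "order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# Publishing to an open columnar format instead of a proprietary extract
# means a Visual Discovery tool, an Advanced Analytics notebook, and a
# reporting engine can all read the same curated data set.
sales_by_region.write.mode("overwrite").parquet("/data/curated/sales_by_region")
```

Because the curated output lives in a neutral format rather than a tool-specific extract, the work done for one team is reusable by every other team, whichever tool they prefer.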

4. So-called Self-Service BI tools lack even basic data discovery capabilities that would let users search the data assets that live within an organization. As a result, the BI user has to already know which data sets the business generates. That may not be much of an issue within the individual department the analyst supports, but how can that knowledge worker add value across the organization without insight into, or access to, other departments’ data sets? This perpetuates data silos. BI tools do not break silos down; in most cases they encourage each line of business (LOB) to build its own data layer with desktop tools such as Access and Excel, which does not scale to the rising volumes of data the LOB has to deal with. At best, the analyst must call someone in IT to discover what data sets are available, a process that, by its very nature, negates the whole principle of “self-service”.
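If data sets are registered in a shared metastore, discovery can be self-service in the literal sense. The sketch below assumes a Hive-compatible catalog reachable from Spark; the database and table names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# Assumes data sets are registered in a Hive-compatible metastore.
spark = (
    SparkSession.builder
    .appName("data-discovery")
    .enableHiveSupport()
    .getOrCreate()
)

# Enumerate every database and table the organization has registered,
# instead of calling someone in IT to ask what exists.
for db in spark.catalog.listDatabases():
    for tbl in spark.catalog.listTables(db.name):
        print(db.name, tbl.name, tbl.tableType)

# Inspect a candidate data set's schema before committing to an analysis.
# "marketing.campaign_results" is a hypothetical table name.
for col in spark.catalog.listColumns("campaign_results", "marketing"):
    print(col.name, col.dataType, col.description)
```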

5. Heavy data processing and transformation on your BI/application servers will degrade in-app analytic performance for your end users. As data sets grow larger, and are often unstructured (social media or IoT sources, for example), the challenges of analysis and the responsiveness of the tools only compound. On complex analyses, the “coffee break render” is not uncommon, and as more analysts rely on the same BI servers, responsiveness continues to degrade: these tools were simply not architected for this growth in users or data set sizes. Moving the connection, discovery, cleansing, transformation and formatting of data sets to a highly scalable and responsive environment such as Hadoop and Spark lets the enterprise scale seamlessly and cost-effectively, delivering insights at the speed of business.
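As a concrete illustration, here is a minimal PySpark sketch of that connect/cleanse/transform/format flow running on a Spark cluster instead of the BI server. The landing-zone path, column names (device_id, event_ts, reading) and the hourly aggregation are all hypothetical placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-offload").getOrCreate()

# Connect: read semi-structured device events straight from the landing zone.
raw = spark.read.json("/data/landing/iot_events/")

# Cleanse: drop malformed records and deduplicate on the device/event key.
clean = (
    raw
    .dropna(subset=["device_id", "event_ts", "reading"])
    .dropDuplicates(["device_id", "event_ts"])
)

# Transform: shape the data for analysis, here hourly averages per device.
hourly = (
    clean
    .withColumn("hour", F.date_trunc("hour", F.to_timestamp("event_ts")))
    .withColumn("event_date", F.to_date("hour"))
    .groupBy("device_id", "event_date", "hour")
    .agg(F.avg("reading").alias("avg_reading"))
)

# Format: publish an analysis-ready, partitioned Parquet data set. The BI
# server now reads pre-shaped data instead of doing the heavy lifting itself.
(
    hourly.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("/data/serving/iot_hourly")
)
```

The BI tool then simply points at the curated output, so render times stay flat even as user counts and raw data volumes grow: the cluster, not the BI server, absorbs the scaling burden.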