On April 4, Andy Sheldon from Unifi, Eric Kavanagh from The Bloor Group, and David Wells from Infocentric joined a webinar on the critical decision of choosing a data catalog for your organization. Attendees learned twenty criteria to help them choose the right data catalog for their data management future.

You can register to view the recording here. Continue reading for answers to questions asked during this live event.

What’s the difference between a Data Catalog and a Business Glossary and/or Data Dictionary?

A data catalog contains a business glossary as part of the information that can be searched and returned – a catalog can give deeper information about your data. The glossary adds business terms to metadata that might otherwise be obscure. For example, in JD Edwards ERP data, F0101 is the table name for the Address Book and MCMCU is the field name for Business Unit. – Andy Sheldon, Unifi Software

A data catalog is actively integrated into the analytics life cycle. A data dictionary is typically passive and non-integrated. – Dave Wells, Eckerson Group
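To make the JD Edwards example above concrete, here is a minimal sketch of a glossary lookup in Python. It is only an illustration: the `GLOSSARY` dictionary and the `describe` and `search` functions are hypothetical names, not part of any Unifi API, and a real catalog would hold far richer metadata than a simple mapping.

```python
from __future__ import annotations

# Hypothetical glossary: technical name -> business term.
# The two entries come from the JD Edwards example in the answer above.
GLOSSARY = {
    "F0101": "Address Book",
    "MCMCU": "Business Unit",
}


def describe(technical_name: str) -> str:
    """Return the business term for a technical name, or the name itself if unmapped."""
    return GLOSSARY.get(technical_name.upper(), technical_name)


def search(term: str) -> list[str]:
    """Return the technical names whose business term contains the search text."""
    needle = term.lower()
    return sorted(name for name, label in GLOSSARY.items() if needle in label.lower())
```

With this in place, a business user can search for "business" and be pointed to the MCMCU field without ever needing to know the technical name.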

How much of the intelligence in the lineage is auto-generated by metadata discovery, and how much knowledge do you need to add manually?

Within Unifi, our lineage is auto-generated from the time of connection forward, so anything you do with a dataset in terms of transformation is captured and displayed in our lineage UI. We are actively working on integration with other lineage systems, specifically ASG Rochade, so you will be able to import that lineage information and combine it with what we capture, displayed as a single lineage view. In addition to lineage, which captures the provenance of a dataset, the audit features on the admin side of the platform show exactly who has accessed the data and what functions, if any, have been performed on each dataset. – Andy Sheldon, Unifi Software
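As a rough illustration of what "captured from the time of connection forward" can mean, the sketch below records each transformation as an edge and then walks those edges back to find a dataset's provenance. This is an assumption about the general technique, not a description of how Unifi implements lineage; `LineageLog`, `record`, `provenance`, and the dataset names in the usage note are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LineageLog:
    # Maps each derived dataset to the (source, transformation) steps that produced it.
    parents: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def record(self, source: str, transformation: str, target: str) -> None:
        """Capture one transformation step as it happens."""
        self.parents.setdefault(target, []).append((source, transformation))

    def provenance(self, dataset: str) -> list[str]:
        """Return every upstream dataset, nearest ancestors first."""
        seen: list[str] = []
        queue = [dataset]
        while queue:
            current = queue.pop(0)
            for source, _ in self.parents.get(current, []):
                if source not in seen:
                    seen.append(source)
                    queue.append(source)
        return seen
```

For example, after recording `raw_orders` filtered into `clean_orders`, and `clean_orders` joined with `customers` into `report`, calling `provenance("report")` lists all three upstream datasets.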

Will the data catalog also include reporting and dashboarding assets in the lineage?

Yes, we can read Tableau Server data, which provides lineage information. You can also use this feature to see other datasets within a visualization and use KnowledgeGraph to see attribute correlation. – Andy Sheldon, Unifi Software

How do you compare and contrast data cataloging with Enterprise Architecture tools, which also have concepts such as glossary, data catalog, data flows, etc.?

This is more a question of who is using the tool. Traditional EIM systems were designed for IT users to support the business, just as traditional BI tools such as Cognos, Informatica and BusinessObjects were designed for IT to produce reports and visualizations on behalf of their business “clients”. Modern BI tools such as Tableau and Power BI are helping the business be more self-sufficient in reporting and analytics and, more recently, data science. In the same way, modern EIM tools such as Unifi will help the business be more self-sufficient in the discovery and use of enterprise information. Our platform is designed to be intuitive for a business analyst and, with support for Natural Language Queries, for any business user who wishes to find answers in the corporate data connected to the platform. – Andy Sheldon, Unifi Software

Enterprise Architecture tools are about the enterprise. Data catalogs are for data consumers, so their features, functions and UI are geared toward data analysts and curators. – Dave Wells, Eckerson Group

Do you deal with geologic well data?

We don’t have anyone with that type of data on the platform today, but we’d like to understand what format it is stored in. Also, is that data coming in real time from drill sensors or mud pumps? We support streaming data services such as Kafka and Flume. One use case we have considered is prescriptive analytics: capturing historic drill data and identifying anomalies, then transforming new sensor data in real time against those anomaly profiles to detect and prescribe alert outcomes. – Andy Sheldon, Unifi Software
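The use case described above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Unifi functionality: the anomaly profile is just a mean and standard deviation built from historic readings, the z-score threshold is an arbitrary choice, and a plain Python iterable stands in for a real Kafka or Flume stream. The names `build_profile` and `alerts` are hypothetical.

```python
from statistics import mean, stdev


def build_profile(historic):
    """Summarize historic sensor readings as (mean, sample standard deviation)."""
    return mean(historic), stdev(historic)


def alerts(readings, profile, threshold=3.0):
    """Yield each new reading that deviates from the profile by more than `threshold` sigmas."""
    mu, sigma = profile
    for value in readings:
        # Skip the check if the historic data had no variation at all.
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            yield value
```

Because `alerts` is a generator, it can consume readings one at a time as they arrive, which is the shape a streaming consumer would need.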

Is it appropriate to use a metadata catalog to manage how changes to upstream applications affect downstream applications or analytic environments (similar to a CMDB)?

It is a good idea to capture any and all configuration management data as part of a dataset’s trustworthiness insights. The more information the catalog can present to a user, the more confidence that user can have in the validity of the dataset. This is especially true if those upstream settings may have a bearing on the values of the data being stored. – Andy Sheldon, Unifi Software

Does the tool show data models?

Yes, using KnowledgeGraph interfaces we can display the relationship between datasets, attributes, transformations, TWBX/TDE etc. – Andy Sheldon, Unifi Software

Is Unifi a data governance tool or only a data catalog?

Unifi has extensive data governance and security capabilities that are core to our integrated data catalog and data preparation platform. You can learn more about those at https://unifisoftware.com/product/governance-security/. – Andy Sheldon, Unifi Software