This is not a blog about visual data discovery, although (roughly) its first half is consumed with that topic. It’s actually about natural language processing, or NLP, and its close cousin, Natural Language Queries, which are how we put the underlying NLP to use. I want to make the case that NLP is a technology with near-universal applicability, especially for people who consume analytics. To this end, I invoke the example of visual data discovery, distinguishing, in turn, between the discovery paradigm (in which data visualization fulfills a critical function) and data visualization technology itself. The idea is that data viz technology is easily generalized and applied to uses outside of visual data discovery; the discovery paradigm, however, is less generalizable.

The larger claim isn’t just that NLP is even more generalizable than data viz technology—although it is—it’s that there’s something truly extraordinary about NLP.

A casual consumer of the business or technology press might assume that everyone who works with BI or analytics wants to do so via a visual data discovery tool of some kind, such as Power BI, Tableau, Looker, and the like. There are several reasons for this, not least of which has to do with cachet: visual discovery tools are sexy in precisely the way that BI reporting, ad hoc query/analysis, and dashboard tools are not[1]. That appeal is legitimized by reports such as the Gartner Magic Quadrant for Analytics and BI.

It’s a small step from this to the idea that the visual data discovery paradigm should be generalized and applied to almost everyone, almost everywhere, in almost every role. In other words, it’s a small step to assume that, given their druthers, executives, managers, and employees would all choose to become data discoverers, i.e., consumers and creators of insights.

Visual data discovery is a critical component of analytical development, both as a standalone practice (e.g., the business analyst-cum-insight-discoverer) and in the context of a larger, multi-disciplinary practice, such as data science. Discovery itself is an incredibly powerful metaphor that has transformed the work (and lives) of hundreds of thousands, perhaps millions, of people. But visual data discovery is just one paradigm among many. One metaphor among many. It isn’t applicable to, or suitable for, every potential information consumer for the simple reason that no paradigm or metaphor is—or could be. Discovery is a critical enabling technology for the production of analytics; it is not, by itself, identical with analytics.

Data visualization technology is different. It is highly generalizable: today, it’s used in all kinds of applications and services. It’s less a BI- or analytic-specific technology than a general-purpose technology that is optimized for displaying and conveying information or insights.




Having said this, I’d like to discuss another technology that, I think, is also highly generalizable: suitable for, and applicable to, almost any information consumer in almost any role.

This would be NLP, or Natural Language Processing. Google Search is a good example of NLP in action—although this wasn’t always the case. Fifteen years ago, a user had to take care when she constructed her queries, e.g., employing different combinations of engine-specific operators to refine the scope of her search. Today, I take it for granted that when I type (or voice) a natural-language query into Google Search, I’ll receive a list of topically relevant results. In addition, about one out of every eight times I search, I’ll receive a top-line result that, for all intents and purposes, “answers” the question I’ve just asked. If, for example, I ask “What is the best live album of all time?” Google will tell me—compellingly, if not irrefragably—that the Who’s Live at Leeds tops a tally of ten other titles. There’s no magic here, however. My top-line “answer”—which Google calls a “featured snippet block”—is actually an excerpt from the results of a Rolling Stone magazine readers’ poll[2].

NLP has the potential to “lead” a person into a problem, as, for example, when an analyst asks a question and receives a ranked summary of results—or “answers.”

You can get a feel for this in Google Search, but just a feel. There, it’s a latent capacity rather than a salient feature, as it is with NLP query. To the degree that an NLP implementation preserves context (i.e., a history of queries and results), the questions the analyst asks could and should lead to new (more concise, probing, inceptive, etc.) questions.
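To make "preserving context" a little more concrete, here is a minimal Python sketch of a query session that keeps a history of questions and results so that a follow-up can build on an earlier answer. Everything in it is hypothetical: the `QuerySession` class, the two stand-in answer functions, and the sales figures are my own illustration, not any vendor's API, and the hard part (turning the natural-language question into an operation) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class QuerySession:
    """Hypothetical session that keeps a history of (question, result) pairs."""
    history: list = field(default_factory=list)

    def ask(self, question, answer_fn):
        # answer_fn receives the question plus the prior history, so a
        # follow-up question can build on earlier results.
        result = answer_fn(question, self.history)
        self.history.append((question, result))
        return result


SALES = {"East": 420, "West": 380, "North": 150}  # made-up numbers


def sales_by_region(question, history):
    # Stand-in for whatever would actually interpret and run the query.
    return dict(SALES)


def lowest_of_previous(question, history):
    # Resolves "those" against the most recent result instead of
    # starting again from scratch.
    _, previous = history[-1]
    return min(previous, key=previous.get)


session = QuerySession()
session.ask("What are sales by region?", sales_by_region)
follow_up = session.ask("Which of those is lowest?", lowest_of_previous)
print(follow_up)
```

The point of the sketch is only that the second question is answerable because the first one was remembered; without the history, "those" has nothing to refer to.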

In this way, NLP permits an interactive, question-driven experience that mimics, in a sense, the human experience of discovery. This isn’t a literal claim that NLP in some way approximates the operation of the human mind; rather, it’s the idea that the experience of interacting with NLP mimics, mutatis mutandis, the process whereby one interrogates oneself, one’s peers, and the surrounding world. This is what’s neatest, coolest, and most superlative about NLP.

Bridging viz and NLP for the business user is simple: not all questions require a visual result. If I want to know the value of the largest deal in my quarterly sales pipeline, the result is a single value. If I want to know how many prospects who received an email from me this week have also visited my website in the past month, the result is, again, a single answer. So NLP lends itself to near-instant answers to specific-value questions, whereas visualizations lend themselves to trends and predictions. Both provide incredible value to the organization.
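The two questions above can be sketched in a few lines of Python to show why neither needs a chart: each one reduces to a single number. This is only an illustration under stated assumptions. The data, the function names, and the fixed "today" are all made up, "this week" is approximated as the last 7 days and "the past month" as the last 30, and the step where an NLP layer maps the question onto the right function is not shown.

```python
from datetime import date, timedelta

# Hypothetical in-memory data; a real system would query a CRM or warehouse.
pipeline = [
    {"deal": "Acme renewal", "value": 120_000},
    {"deal": "Globex expansion", "value": 310_000},
    {"deal": "Initech pilot", "value": 45_000},
]

today = date(2024, 3, 15)  # fixed so the example is reproducible
emails_sent = {  # prospect -> date of my last email to them
    "ann@example.com": today - timedelta(days=2),
    "bob@example.com": today - timedelta(days=3),
    "cy@example.com": today - timedelta(days=20),
}
site_visits = {  # prospect -> date of their last website visit
    "ann@example.com": today - timedelta(days=10),
    "bob@example.com": today - timedelta(days=45),
    "cy@example.com": today - timedelta(days=5),
}


def largest_deal_value(deals):
    """'What is the value of the largest deal in my pipeline?'"""
    return max(d["value"] for d in deals)


def emailed_and_visited(emails, visits, as_of, email_days=7, visit_days=30):
    """'How many prospects I emailed this week also visited the site this month?'"""
    emailed = {p for p, d in emails.items() if 0 <= (as_of - d).days <= email_days}
    visited = {p for p, d in visits.items() if 0 <= (as_of - d).days <= visit_days}
    return len(emailed & visited)  # size of the overlap: one number


print(largest_deal_value(pipeline))
print(emailed_and_visited(emails_sent, site_visits, today))
```

A trend question ("how has the pipeline changed over the last four quarters?") would return a series rather than a scalar, which is where a visualization starts to earn its keep.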

But this is just the beginning. I’ll have (much) more to say about NLP queries and how they can help support, drive, or even inspire analysis and, yes, discovery in a future blog posting.


[1] Journalists and op-ed writers like sexy things, too. Operational reporting is unsexy. Spreadsheets and OLAP cubes, too, are unsexy. And SQL queries? Muy unsexy! Even dashboards are, 40 years on, so last millennium. What’s more, all of these things are yoked to (and seen as consubstantial with) the same centralized BI model that has been bane and bête noire to business people for decades. By way of contrast, stories about discovery seem to take it for granted that the tools are uniformly easy to use, fun to work with, intuitively intelligible, etc.

[2] The technology isn’t completely consistent, alas. If I tweak my query (e.g., substituting “Which” for “What”), Google generates a completely different featured snippet block result.