If you’re like us and love the Discovery Channel hit series Gold Rush, you can’t help but see the similarities between gold mining and data mining. Here’s the gold miner’s challenge: dig through thousands of yards of overburden to find the pay dirt, then transport thousands of yards of pay dirt to the wash plant and run it through the plant to find flecks of gold in the sluice box.

An analyst, or Business Insight Guru (BIG) as we call them, faces the same challenge daily: dig through terabytes of raw data (overburden), feed it to the data wash plant (ETL processes) so it can be normalized (pay dirt), then view it with the BI visualization tool in the data sluice box to find the critical business insight (gold). In the old days this would take weeks and involve ongoing interaction between IT and the BIG.

Now, with UNIFi Software, the integration is automated and the BIG can integrate and normalize their own data. This frees up IT resources and dramatically speeds up the process of getting gold into the data sluice box.

We thought it would be fun to do a Wash Plant comparison.

Gold wash

| Section | Gold Wash Plant | UNIFi Software Data Wash Plant |
| --- | --- | --- |
| A | Grizzly Feeder | Structured, semi-structured, multi-structured and unstructured data pay dirt |
| B | Grizzly Wash System | The UNIFi Software Auto Data Integration parser replaces technology-oriented tools native to the Hadoop ecosystem such as Pig, Hive or Java MapReduce programs |
| C | Rock Discharge Chute | The big bits of data you’re not interested in |
| D | Heavy-Duty Gear Reducer | Hadoop processing node |
| E | Trommel | The UNIFi Transform In Place technology processes the data natively on Hadoop and prepares the data for discovery and visualization |
| F | Replaceable Screen | UNIFi presentation layer allows the Business Insight Guru to pursue ‘what if’ scenarios and drill down on specific data sets easily |
| G | Trommel Spray Bar | The BI visualization API |
| H | Nugget Trap | Structured data that does not need further processing |
| I | Electrical Panel | Hadoop processing performance |
| J | Structural Steel Skid | Hadoop cluster hardware |
| K | Sluice Recovery System | BI visualization layer where you find the insight |
| L | Rubber Trunnion Supports | UNIFi API integration with BI layer |
| M | Tailings Conveyor | The data you don’t need once processed |