Tech Tuesday Intro Presentation

Helping everyone make better
data-driven decisions
Bridging Structured and Unstructured
Data Using Context Intelligence
Razvan Nistor - BIPB
Vince Barrett - Squirro
Tech Tuesday Meetup
April 14, 2015
What is unstructured data?
Structured vs. Unstructured data
The unprecedented growth of unstructured data
How companies are currently using unstructured data, and
the untapped growth potential therein
What is the value in unstructured data?
Knowledge Prediction
Client examples
How? The bridge: Insight, Value, and Knowledge gained
by bridging structured and unstructured data
How can we make use of unstructured data?
How do we bridge data? Keyword Search?
Tools for text analysis: Process, index, and contextualize
text-based unstructured data.
Points to consider
What is unstructured data?
Context: What is unstructured data?
Unstructured data refers to information that does not have a predefined
data model and/or is not organized in a predefined manner
Context: Unstructured Structured
A machine
Model method
Types of unstructured data
We’re going to focus on text ‘only’…
The growth of unstructured data
What are companies currently doing with unstructured data?
Developing competitive pricing models
by scraping competitors’ websites
Indexing massive amounts of research
documents for search / retrieval
Maintaining company and employee
adherence to regulatory and
compliance requirements
Brand sentiment analysis
from Social Media
What is the value in unstructured data?
Access to information
What is the value in unstructured data?
BNY Mellon recently categorized
unstructured information on customer
interactions to build a more comprehensive
view of evolving banking needs.
What is the value in unstructured data?
Clothing retailer Chico’s is using chatter on
social media streams to enrich its
customer information systems for more
personalized advertising. Yikes!
What is the value in unstructured data?
Government is starting to categorize and organize
petabytes of unstructured information at the
National Archives and Records Administration.
Movement towards organizing medical records
for Medicare & Medicaid.
The Bridge: Connecting Unstructured and Structured
The true value is in capturing unstructured data and combining it with other
data to gain new insights to improve business performance
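As a minimal sketch of this bridge: assume a hypothetical customer table (structured) and some free-text call notes (unstructured). The customer records, notes, topic keywords, and function names below are all made up for illustration. The notes are tagged with simple keyword-based topics, which are then attached to the matching structured record:

```python
from collections import defaultdict

# Hypothetical structured data: one row per customer.
customers = {
    101: {"name": "Acme Ltd", "segment": "SME"},
    102: {"name": "Globex", "segment": "Corporate"},
}

# Hypothetical unstructured data: free-text notes keyed by customer id.
notes = [
    (101, "Client asked about mortgage rates and a new loan."),
    (102, "Complaint: the online banking app keeps crashing."),
    (101, "Follow-up call, still interested in the loan offer."),
]

# Illustrative topic lexicon; a real tool would use NLP or statistics.
TOPICS = {
    "lending": {"loan", "mortgage", "credit"},
    "digital": {"app", "online", "website"},
    "complaint": {"complaint", "crashing", "unhappy"},
}


def tag_topics(text):
    """Return the sorted topics whose keywords appear in the text."""
    words = {w.strip(".,:;!?").lower() for w in text.split()}
    return sorted(t for t, kws in TOPICS.items() if words & kws)


def enrich(customers, notes):
    """Attach the topics found in each customer's notes to their record."""
    topics_by_id = defaultdict(set)
    for cust_id, text in notes:
        topics_by_id[cust_id].update(tag_topics(text))
    return {
        cid: {**row, "topics": sorted(topics_by_id[cid])}
        for cid, row in customers.items()
    }


enriched = enrich(customers, notes)
```

Each enriched record keeps its structured fields (name, segment) and gains a `topics` field derived from text, so it can be filtered or aggregated like any other column.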
How can we make use of unstructured data?
So, what is the ‘bridge’? Why not just Google everything?
Seth Grimes
‘give me what I want’ NOT ‘give me what I said’
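One way to picture the gap between "what I said" and "what I want" is to compare a literal keyword match with a query that is expanded through a synonym map. The documents, the synonym map, and the function names below are hypothetical. Real contextual search is far richer than this sketch.

```python
# Hypothetical documents and synonym map, for illustration only.
DOCS = [
    "Quarterly revenue rose on strong automobile sales",
    "New car models boosted dealer income",
    "Weather forecast: rain expected on Tuesday",
]

SYNONYMS = {
    "car": {"car", "automobile", "vehicle"},
    "revenue": {"revenue", "income", "sales"},
}


def tokens(text):
    """Lowercase the text and split it into a set of bare words."""
    return {w.strip(".,:;!?").lower() for w in text.split()}


def keyword_search(query, docs):
    """'What I said': keep documents containing every literal query word."""
    q = tokens(query)
    return [d for d in docs if q <= tokens(d)]


def expanded_search(query, docs):
    """'What I want': each query word may match any of its synonyms."""
    groups = [SYNONYMS.get(w, {w}) for w in sorted(tokens(query))]
    return [d for d in docs if all(g & tokens(d) for g in groups)]
```

With the query "car revenue", `keyword_search` finds nothing because no document contains both literal words, while `expanded_search` returns the first two documents.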
You can’t Google everything. You can’t Google Chuck Norris
Actually, Google is quite advanced these days…
‘How does Google search work’
Safe search
Site & Page Quality
User Context
The tools: Categorizing and contextualizing unstructured data
Text analytics
The tools: Categorizing and contextualizing unstructured data
Hadoop and MapReduce
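To give a feel for the MapReduce programming model, here is a single-process word count in plain Python. It is a sketch of the map, shuffle, and reduce steps only. It does not use the Hadoop API, and the function names are chosen for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    pairs = []
    for raw in document.split():
        word = raw.strip(".,:;!?").lower()
        if word:  # skip tokens that were only punctuation
            pairs.append((word, 1))
    return pairs


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values for each key to get a count per word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map over every document, then shuffle and reduce the results."""
    mapped = []
    for doc in documents:
        mapped.extend(map_phase(doc))
    return reduce_phase(shuffle(mapped))
```

In a real cluster the map calls would run in parallel on different machines and the shuffle would move data across the network. The logic per step stays the same.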
The tools: Categorizing and contextualizing unstructured data
Natural Language Processing:
IBM Watson and Jeopardy
The tools: Categorizing and contextualizing unstructured data
Watson Applications:
Points to consider: Deploying unstructured data analytics
Key criteria when assessing these tools:
• NLP-based vs. statistics-based approach
• How easy is it to process different languages?
• What if your language has a specific taxonomy?
• How easy is it to pull information from multiple data sources?
• How easy is it to integrate the tool into an existing enterprise framework?
• What are the costs of implementation, maintenance, personnel?
Most businesses need an agile, affordable, easy-to-integrate tool that
can turn unstructured data into valuable information
Need to contextualize a sea of information with agile, ever-adapting
search capabilities to give you the information you want,
not only the information you asked for