Rosette® Name Indexer

www.basistech.com
[email protected]
+1 617-386-2090

[Figure: Sample RNI results for the record "Franklin D. Roosevelt" (32nd U.S. President, ID: USPRES32, DOB: Jan. 30, 1882). Matched name variants, with similarity scores ranging from 73% to 97%, include "Рузвельт, Франклин", "President Roosevelt", "Gov. Franklin Roosevelt", "富兰克林·罗塞费尔特", "Franklin Rosenvelt", and "F. D. R.". The figure also shows related Rosette products and their outputs: RES Entity Resolver (real identities), RNI Name Indexer (matched names), RNT Name Translator (translated names), RCA Categorizer (sorted content), and RSA Sentiment Analyzer (actionable insights).]
Accurate fuzzy name
matching in many languages
Names are the linchpin to connecting data points in financial compliance, antifraud, government intelligence, law enforcement, and identity verification.
Yet names are hard to connect because they vary so widely through misspellings, nicknames, initials, and titles. In international databases, a single name may also appear in many languages.
10
Supported
Languages
Rosette® Name Indexer (RNI) solves these challenges with a linguistic, knowledge-based system that compares and matches names of people, places, and organizations despite their many variations. RNI is unrivalled in its ability to match names because of its intelligent approach.

As linguistics experts with deep understanding at the intersection of language and technology, Basis Technology continually improves the Rosette product family with language additions, feature updates, and the latest innovations from the academic world. RNI is unrivalled in its ability to match the names of entities. Find out how your organization can utilize this pioneering technology for extraordinary results.

KEY FEATURES
- Component of the Rosette SDK
- Simple API
- Fast and scalable
- Industrial-strength support
- Easy installation
- Flexible and customizable
- Increases name search accuracy and finds hits other systems miss
- Java or web services
- Unix, Linux, Mac, or Windows
- Matches names of people, places, and organizations
- Ranks results by relevancy with a similarity score
- Built to work with Apache™ Solr and Elasticsearch
Select Customers
Start using RNI today
Try our free product evaluation
www.basistech.com
The Rosette product family
- RLI, Language Identifier: identify languages and encodings
- RBL, Base Linguistics: search many languages with high accuracy
- REX, Entity Extractor: tag names of people, places, and organizations
- RES, Entity Resolver: make real-world connections in your data
- RNI, Name Indexer: match names between many variations
- RNT, Name Translator: translate foreign names into English
- RCA, Categorizer: categorize everything in sight
- RSA, Sentiment Analyzer: detect the sentiments of your text

The Rosette Advantage

Our knowledge-based system uses the latest in Natural Language Processing (NLP) to intelligently match names based on their linguistic and cultural structures and norms.

Unlike expensive and less accurate legacy solutions driven by thousands of spelling variants from known names, RNI analyzes the intrinsic structure of each name component and performs an intelligent comparison using advanced linguistic algorithms.

Our approach is not limited to a particular list of variants and reduces the likelihood of both “false positives” (wrong matches) and “false negatives” (missed matches).

List-driven systems cannot equal RNI for matching never-seen-before names or missegmented names (Mary Ellen vs. MaryEllen).

Integration Options

Rosette® Name Indexer integrates easily into Apache Solr™ as a plug-in or into applications as a Java library to support its main use cases. RNI can also be adapted to match the needs of each application.

Apache Solr

Apache Solr™-based search systems can easily add high-quality fuzzy name matching to every search by simply adding name fields. RNI provides a special Solr field type for names. This mechanism means Solr can index documents with multiple name fields, each with multiple values (e.g., an “alias” field may contain more than one name). Each document could also contain non-name fields like dates or plain text.

<field name="primary">Muhammad Ali</field>
<field name="alias">Cassius Clay Jr</field>
<field name="alias">The Greatest</field>
<field name="dob">1/7/1942</field>

You can then construct a single query that gives different weight to the various fields. For example, a single query can find movies starring “Binedict Cumberbund” with screenplays by “Giyermo Diltoro” that were released around 2014.
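As a rough illustration of the multi-valued name fields described above, the following Python sketch serializes the sample record as a standard Solr XML update message. The field names come from the example record; the helper name build_solr_add_xml is hypothetical, and the sketch assumes that the RNI name field type has already been configured for the "primary" and "alias" fields in the Solr schema, which is not shown here.

```python
import xml.etree.ElementTree as ET


def build_solr_add_xml(record):
    """Serialize a record as a Solr <add><doc> update message.

    `record` maps a field name to one value or a list of values, so a
    multi-valued field such as "alias" becomes several <field> elements.
    (Hypothetical helper for illustration; not part of RNI or Solr.)
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, values in record.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


# The sample record from the text: one primary name, two aliases, one date.
record = {
    "primary": "Muhammad Ali",
    "alias": ["Cassius Clay Jr", "The Greatest"],
    "dob": "1/7/1942",
}
xml_message = build_solr_add_xml(record)
print(xml_message)
```

The resulting message could be posted to a Solr update handler. How the per-field weights are expressed at query time depends on the RNI plug-in and is not covered by this sketch.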
Java Library

Any application that needs name matching can directly integrate a Java library, which takes care of storing watch lists for you without incurring the overhead of a web-service call.

Use Cases

Financial Compliance

Financial institutions use RNI to manage and update watch lists to block terrorist access to funds, simultaneously avoiding compliance violations and protecting their reputation. Applications also include fraud detection, anti-money laundering, and document triage.

Government Intelligence

Names are often the most critical data point in intelligence, law enforcement, and border control. RNI is being adopted throughout the U.S. government to address the challenge of matching names in all their variations, particularly names from non-Latin languages such as Arabic, Russian, Chinese, Korean, or Persian.

Identity Verification in the Sharing Economy

Trust is foundational to the sharing economy. Whether booking room rentals, rides, or odd jobs, it is important to establish ways to connect the online and offline worlds to reinforce trust and confidence.

Name matching is a key component of verifying online identities with real-world documentation (passports, driver’s licenses). Members of the sharing economy such as Airbnb rely on RNI to match names originating from all over the world, including names written in alphabets other than the Roman A-to-Z.

Name Matching Capabilities
- Same name in multiple languages: Мао Цзэдун ↔ Mao Zedong ↔ 毛泽东
- Phonetic spelling differences: Cairns ↔ Kearns ↔ Kerns
- Transliteration spelling differences: Abdul Rasheed ↔ Abd-al-Rasheed ↔ Abdulrashid
- Nicknames: William ↔ Will ↔ Bill ↔ Billy
- Initials: J. E. Smith ↔ James Earl Smith
- Titles and honorifics: Dr. ↔ Mr. ↔ Ph.D.
- Out-of-order name components: Diaz, Carlos Alfonzo ↔ Carlos Alfonzo Diaz
- Missing name components: Phillip Charles Carr ↔ Phillip Carr
- Missing spaces or hyphens: MaryEllen ↔ Mary Ellen ↔ Mary-Ellen
- Truncated name components: McDonalds ↔ McD ↔ McDonald
- Name split inconsistently across database fields: Dick • Van Dyke ↔ Dick Van • Dyke

Customize To Your Needs
- Set the minimum threshold of the similarity score to manage the precision and recall of the returned search results.
- Ignore a given list of words (“stopwords”) with respect to matching (e.g., titles, honorifics).
- Force two name words to always match with a given score (e.g., “Elizabeth” and “Lisbeth” always match at 90%).
- Force two names to always match with a given score (e.g., “John Doe” and “Joe Bloggs” always match at 95%).
- Link multiple names to a single individual (e.g., queries for "Marilyn Monroe" and "Norma Jeane Mortensen" both include the same person).
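To make the customization options concrete (a minimum score threshold, stopwords, and forced matches between words or whole names), here is a toy Python sketch. It is not the RNI API and does not reproduce RNI's linguistic scoring: a generic string similarity from the standard library (difflib.SequenceMatcher) stands in for it, and every name in the code (STOPWORDS, TOKEN_OVERRIDES, NAME_OVERRIDES, tokens, token_score, name_score, search) is hypothetical. Comparing name words regardless of position also gives a simple picture of out-of-order matching.

```python
from difflib import SequenceMatcher

# Hypothetical configuration, mirroring the options listed above.
STOPWORDS = {"dr", "mr", "mrs", "ms", "phd"}  # titles and honorifics to ignore
TOKEN_OVERRIDES = {frozenset({"elizabeth", "lisbeth"}): 0.90}
NAME_OVERRIDES = {frozenset({"john doe", "joe bloggs"}): 0.95}


def tokens(name):
    """Lowercase a name, strip punctuation, and drop stopwords."""
    cleaned = name.lower().replace(".", "")
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in cleaned)
    return [t for t in cleaned.split() if t not in STOPWORDS]


def token_score(a, b):
    """Score two name words, honoring forced word-level matches."""
    forced = TOKEN_OVERRIDES.get(frozenset({a, b}))
    if forced is not None:
        return forced
    # Stand-in similarity; RNI's own algorithms are not modeled here.
    return SequenceMatcher(None, a, b).ratio()


def _directed(source, target):
    """Average, over source words, of the best score against any target word."""
    return sum(max(token_score(s, t) for t in target) for s in source) / len(source)


def name_score(query, candidate):
    """Order-insensitive score in [0, 1] for two full names."""
    q, c = tokens(query), tokens(candidate)
    forced = NAME_OVERRIDES.get(frozenset({" ".join(q), " ".join(c)}))
    if forced is not None:
        return forced
    if not q or not c:
        return 0.0
    return (_directed(q, c) + _directed(c, q)) / 2


def search(query, index, threshold=0.8):
    """Return (score, name) pairs at or above the threshold, best first."""
    scored = [(name_score(query, name), name) for name in index]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


watch_list = ["Diaz, Carlos Alfonzo", "Joe Bloggs", "Lisbeth Smith"]
results = search("Carlos Alfonzo Diaz", watch_list, threshold=0.8)
print(results)
print(name_score("John Doe", "Joe Bloggs"))
print(name_score("Dr. Elizabeth Smith", "Lisbeth Smith"))
```

Raising the threshold trades recall for precision, since fewer but closer candidates are returned, and lowering it does the opposite. A generic character-level similarity like the one used here handles cases such as nicknames, transliteration variants, or missing spaces poorly, which is the gap a linguistic system is meant to fill.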
Available Languages and Scripts

RNI matches names from these languages either in transliteration to English or written in their native scripts.
- Arabic scripts: Arabic, Persian, Pashto, Urdu
- Kanji, katakana, hiragana: Japanese
- Hangul: Korean
- Roman scripts: English, Spanish
- Cyrillic: Russian

Compatibility
Code Base
Platform Support

© 2015 Basis Technology Corporation. “Basis Technology Corporation”, “Rosette” and “Highlight” are registered trademarks of Basis Technology Corporation. “Big Text Analytics” is a trademark of Basis Technology Corporation. All other trademarks, service marks, and logos used in this document are the property of their respective owners. (2015-01-14-RNI)

HEADQUARTERS
One Alewife Center
Cambridge, MA 02140

FEDERAL
2553 Dulles View Dr., Suite 450
Herndon, VA 20171

WEST COAST
1700 Montgomery St.
San Francisco, CA 94111

EUROPE
Furzeground Way
Middlesex UB11 1BD, UK

ASIA
9-6 Nibancho, Chiyoda-ku
Tokyo 102-0084, Japan