Data Quality INSIDER
Strategic Data Quality Solutions for Developers
02/13
Volume 8, Issue 1
A Melissa Data Publication
In This Issue:
> How to Create a Golden Record from Duplicate Records
> Calling Melissa Data Web Services: A Best Practice
> Experience Data Quality “Uplift”
> Personator’s Full Contact Data Quality Comes to Contact Zone
>> In The Spotlight
The INSIDER Pro
It’s hard to believe it’s a brand new year.
We’re proud to announce that the past
year has proven to be a banner year for us.
Here are some of our biggest product
launches in 2012:
• Introduced a next-generation enterprise data quality solution
called Personator™
• Released Contact Zone®, powerful data quality software with
advanced ETL capabilities
• Went global as we expanded our data verification solutions to
include validating and correcting address data for more than
240 countries
We’re excited about the opportunities that lie ahead in 2013. We’ve got
several new product developments and
upgrades in the pipeline that will make
your data quality and data integration
initiatives even easier to perform.
As your partner in data quality, you can
count on us to provide you with all the
resources you need to help you achieve
your business goals!
Sincerely,
Bud Walker

How to Create a Golden Record from Duplicate Records
By Joseph Vertido, MVP Channel Manager/Data Quality Analyst
So you’ve invested in a state-of-the-art matching tool, and through careful
analysis have constructed a matching policy that will catch all the duplicates
in your database. After running the matching process, you are presented with
the duplicated records bundled nicely into duplicate groups and ready for
consolidation. Obvious matches, such as John Smith at 123 Main St and John
Smythe at 123 Mein Street, are both identified as having the same information.
Now what comes next? What do you do with the duplicates once they are detected?
In the matching methodology, choosing the unique or winning record – known
in industry terms as building the Golden Record – is the next logical step in this
process. The concepts and the game-changing Melissa Data approach that go
into this effort will be explored in this article.
Introducing the Golden Record Selection Tool
In order to intelligently automate the Golden Record selection process, Melissa
Data introduces new functionality in the MatchUp® SSIS Component called the
Golden Record Selection. MatchUp is a powerful tool for advanced matching and
duplicate management.
The Golden Record Selection allows for defining survivorship rules using
expressions. But the formulation of these rules is subject to the nature of
the data itself. Therefore, the first step in automating the selection of the
Surviving Record is to know one’s business requirements. For example:
Continued on page 3
Your Partner in Data Quality • MELISSA DATA
Calling Melissa Data Web Services: A Best Practice
Experience Data Quality “Uplift”
By Bud Walker, Director, Data Quality Solutions
By Admound Chou, Product Development Manager
Melissa Data uses multiple clustered servers across the world. They can be
physical servers or cloud based with Amazon EC2. This allows us to provide
fast service, minimize physical distances, and provide failover. In fact, in
2011, Melissa Data was recognized by website monitoring firm, Alertra, for
achieving 100% uptime.
As a best practice for consuming a commercial Web service, if the caller
detects latency or receives an error, it is best to retry the service several
times. This way there is no single point of failure, and the client code is
made much more robust. We recommend implementing a try/catch block
or equivalent code to catch any errors returned by the service, and then
retrying up to five times.
For example:
// Retry the call up to five times before giving up
int Retry = 0;
Boolean ReqRet = false;
do
{
    try
    {
        // Perform Phone Lookup and store results to the Response
        ResPhone = PhoneClient.doPhoneCheck(ReqPhone);
        ReqRet = true;   // success, so the loop ends
    }
    catch (Exception ex)
    {
        // Report the error, then count the failed attempt
        MessageBox.Show(ex.ToString());
        Retry++;
    }
} while ((ReqRet == false) && (Retry < 5));
It is important that you use the official URL to our Web services and not
an IP address. Using an IP address bypasses our redundancy system
and is not supported by Melissa Data.
Melissa Data’s recently released Personator Web
Service now includes brand new functionality to
append missing information to complete a contact
record.
By leveraging Melissa Data’s advanced matching
and retrieval technology, any combination of
contact input, such as a customer’s address, can
return any missing components, like a name and a
phone number, for powerful data quality uplift.
For example, if a phone number is submitted, a
matching name and an address can be returned.
Erroneous information can also be corrected with
up-to-date information available from Melissa Data.
Like the Verify function – which gives correlation
validity between the components of a contact
record – the Append function also leverages the
concept of centricity.
Centricity identifies the piece of information that
is treated as most important and on which everything
else is based. So if the user sets the centricity to
an address and an incorrect phone is provided,
the phone number can be corrected based on the
name and address.
This advanced Append feature is available to
subscribers of Personator, in addition to the Check
and Verify options that have already been released
to loud acclaim. If you want to learn more or take
the Personator Web Service for a test drive, please
call us today at 1-800-MELISSA (635-4772).
For more information on Personator, please go to:
www.MelissaData.com/dqi-personator
Continued from page 1
How to Create a Golden Record from Duplicate Records
From the chart on page one, we can see that the three records
are duplicates of each other with the exception of the first
record which contains a phone number. But which record do
we keep as our survivor and which ones do we discard?
A predefined business rule stating that the existence
of a phone number weighs in on the validity of a record then
becomes the determining factor for which record to keep. Of
course, the logic for survivorship can be simple or
extensive, depending on the complexity of the business rules.
These rules can then be defined from within the expression
builder of the Golden Record Selection tool.
The Golden Record Expression Builder offers several types
of operators for defining how the Golden Record should be
selected, including string, date, and numeric functions.
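To make the phone-number rule concrete, here is a minimal C# sketch of the same survivorship logic written as ordinary code. It is not MatchUp's expression syntax, and it says nothing about how the tool works internally. The ContactRecord type, its fields, the SelectSurvivor method, and the sample data are all hypothetical names and values invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shape; these field names are assumptions for the sketch.
public sealed class ContactRecord
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Address { get; set; } = "";
    public string Phone { get; set; } = "";   // empty when no phone is on file
}

public static class Survivorship
{
    // Business rule: a record that has a phone number outranks one that does not.
    // OrderByDescending is a stable sort, so among equally ranked records the
    // one that appears first in the group is kept.
    public static ContactRecord SelectSurvivor(IEnumerable<ContactRecord> group)
    {
        return group
            .OrderByDescending(r => string.IsNullOrWhiteSpace(r.Phone) ? 0 : 1)
            .First();   // throws InvalidOperationException on an empty group
    }
}

public static class SurvivorshipDemo
{
    public static void Main()
    {
        // Invented sample data: three duplicates, only the first has a phone.
        var group = new List<ContactRecord>
        {
            new ContactRecord { Id = 1, Name = "John Smith",  Address = "123 Main St",     Phone = "555-0100" },
            new ContactRecord { Id = 2, Name = "John Smythe", Address = "123 Mein Street" },
            new ContactRecord { Id = 3, Name = "John Smith",  Address = "123 Main Street" },
        };

        ContactRecord survivor = Survivorship.SelectSurvivor(group);
        Console.WriteLine("Survivor: record " + survivor.Id);   // prints "Survivor: record 1"
    }
}
```

More extensive business rules can be layered on in the same way by chaining further ordering keys after the first one.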
Making Smarter, Better Business Decisions
Choosing the Golden Record Based on Data Quality – A New Game-Changing Approach
Many of the common rules for Golden Record selection can be
applied with the tool, such as Last Updated or Most Complete.
But what makes the Melissa Data Golden Record Selection tool
such a powerful, game-changing approach?
The answer lies in its unique ability to discern Contact Data
Quality Information and select the surviving record based on
the level of quality of the information provided.
The basis by which we select our golden record should in fact
be primarily dependent on the actual quality of the data itself,
as it transcends and even overrules other determining factors,
such as which record was most complete or latest. For example:
It can be argued that the second record is the most recent
one, and should therefore be the survivor. But upon careful
consideration of the quality of the data, we can see that the
second record contains an invalid phone number. And from
a more human perspective, we come to a conclusion that the
first record has the better data and should therefore be our
golden record.
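The reasoning in this example, where quality outranks recency, can be sketched in C# as a two-key ranking. This is only an illustration under stated assumptions: the CandidateRecord type, the PhoneIsValid flag (assumed to have been set by an earlier phone verification step), the SelectGoldenRecord method, and the dates are all invented here and are not part of the Golden Record Selection tool.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shape. PhoneIsValid is assumed to have been filled in by
// an earlier phone verification step; it is not computed here.
public sealed class CandidateRecord
{
    public int Id { get; set; }
    public DateTime LastUpdated { get; set; }
    public bool PhoneIsValid { get; set; }
}

public static class QualityFirstSelection
{
    // Ranks by data quality first and uses recency only to break ties.
    public static CandidateRecord SelectGoldenRecord(IEnumerable<CandidateRecord> group)
    {
        return group
            .OrderByDescending(r => r.PhoneIsValid)   // true sorts ahead of false
            .ThenByDescending(r => r.LastUpdated)     // newest first among equals
            .First();                                 // throws on an empty group
    }
}

public static class QualityFirstDemo
{
    public static void Main()
    {
        // Invented sample data: record 2 is newer but carries an invalid phone.
        var group = new List<CandidateRecord>
        {
            new CandidateRecord { Id = 1, LastUpdated = new DateTime(2012, 3, 1),   PhoneIsValid = true  },
            new CandidateRecord { Id = 2, LastUpdated = new DateTime(2012, 11, 15), PhoneIsValid = false },
        };

        CandidateRecord golden = QualityFirstSelection.SelectGoldenRecord(group);
        Console.WriteLine("Golden record: " + golden.Id);   // prints "Golden record: 1"
    }
}
```

Swapping the two ordering keys would give a Last Updated rule in which quality only breaks ties, which is the behavior the article argues against.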
The new Melissa Data MatchUp and Golden Record Selection
tool allows us to easily rank and select the surviving record
based on the actual quality of our contact data. This new
technique for golden record selection offers a much more
effective and logical approach when it comes to record
survivorship.
Through the flexibility of the Golden Record Selection tool, we
can easily define our business rules for survivorship – a task
which may prove to be daunting especially to an inexperienced
user.
But the real defining moment for Melissa Data’s Golden Record
tool is its inherent ability to understand the quality of our data
and use that as the primary basis for survivorship. Ultimately,
this creates an automatable system that can make smarter and
better decisions for data cleansing and creating that single,
accurate version of the truth that actually makes sense.
The Golden Record Selection, currently available within the
MatchUp SSIS Component, is coming soon to the Contact Zone
data quality platform.
For more information on MatchUp, please go to:
www.MelissaData.com/dqi-matchup
Newsbytes continued from page 2
Personator’s Full Contact Data
Quality Comes to Contact Zone®
By Peter Brown, DQT Software Engineer
Melissa Data recently released the next-generation enterprise
data quality solution, Personator – an integrated Web service
offering powerful matching and retrieval technologies to provide
a whole new level of accuracy and completeness.
Melissa Data will offer Personator™ as a new component in
Contact Zone. Contact Zone users will now be able to
associate addresses, phone numbers, and email addresses with
an individual – giving an organization confidence that its
information is relevant, current, and up-to-date. And
most importantly – that the data is related to the customer across
all applications.
How Personator Works
Since Personator is a Web service, you would normally have to
write a program that could access the service. While that does
give you a great deal of flexibility in terms of how you can use the
service, it can be a little inconvenient.
With the upcoming Personator component for Contact Zone, you
get the full functionality of Personator with an easy-to-use drag
and drop GUI. By using Personator in Contact Zone, you can set
up a program that can process large numbers of records through
the service in just a few minutes, and without having to write a
single line of code.
The Personator Web service processes records containing some
combination of address, name, phone, and/or email data. Once
it has the data, Personator can perform three different functions.
It can check the contact data, which means parsing,
standardizing, and validating it. It can also verify the
data, confirming whether or not the different data
points of a record are associated with one another. It can also
append, adding data to partial or incomplete contact records.

Director, Data Quality Solutions
Bud Walker
[email protected]
1-800-635-4772 x159
Melissa Data Technical Support
[email protected]
1-800-635-4772 x4 (6 am to 5 pm PST)
Editor Abby Telleria
Writers Admound Chou, Allison Moon,
Archana Chippada, Bud Walker, Jatinder
Kumar, Joseph Vertido, Patrick Bayne,
Peter Brown, Tim Sidor
Art Director Melody Yen
Graphic Designers Levi Irwin,
Timothy Magoun
Contact the editor at:
[email protected]
Melissa Data Corp.
22382 Avenida Empresa
RSM California, 92688-2112
Ph 1-800-MELISSA (635-4772)
Fax 949-589-5211
www.MelissaData.com
© 2013 Melissa Data Corp. All rights reserved.