Project Gado: Building an Open Archival Scanning Robot Using

Project Gado
A powerful, durable Open Source robotic scanner for sensitive archival materials
BPEX, December 5 2012
Tom Smith
[email protected]
Afro American Newspaper
family operated for 120 years
1.5 million photographs
One Archivist
The Vision for Gado
An affordable option for digitization at small archives
An autonomous and open source robot for digitizing photographic collections, costing < $500
Gado 1
good proof of concept
much too complicated
Gado 2
40K images
1/2 of the cost
demo
http://youtu.be/OF8SAyHAi64?t=4m42s
Holistic Process
pre‐processing
digitization
post‐processing
publishing and monetization
Pre Processing
challenges
Digitization
42s per photo
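A rough sense of scale follows from two figures in this deck: 42 seconds per photo and the Afro's 1.5 million photographs. The sketch below is back-of-the-envelope arithmetic only. It assumes a single robot scanning continuously with no downtime, which is an illustrative assumption and not a reported result.

```python
# Back-of-the-envelope scanning time estimate.
SECONDS_PER_PHOTO = 42        # figure from the Digitization slide
COLLECTION_SIZE = 1_500_000   # Afro collection size from the deck


def scanning_time_hours(n_photos, seconds_per_photo=SECONDS_PER_PHOTO):
    """Total scanning time in hours for n_photos at a fixed rate per photo."""
    return n_photos * seconds_per_photo / 3600


hours = scanning_time_hours(COLLECTION_SIZE)
days = hours / 24  # assumes round-the-clock operation (hypothetical)
print(f"{hours:,.0f} hours, roughly {days:,.0f} days of continuous scanning")
```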
Live Analytics
Post Processing
Crop and deskew
Post Processing
Page analysis
Text segmentation
Post Processing
OCR for back of images
Post Processing
`The former Miss Violet W. Walston
cuts her cake following her wedding tohers of the family looking on include, left to right, Mrs. Thurman TillettBlonnie Walston, Mr. and Mrs. Thomas R. Walston, Mrs. Helena J. Bar;nings, Mrs. Daniel Trotman and Mrs. Marie Johnson.
Tagging
Publishing/Monetization
GadoImages.com
image
licensing
Get Involved
GadoImages.com
ProjectGado.org
@projectgado
[email protected]
References
http://www.blackpast.org/files/blackpast_images/Murphy__John_Henry__Sr.jpg
http://brandimpact.files.wordpress.com/2008/11/legos.jpg
http://www.softicons.com/free‐icons/system‐icons/colobrush‐icons‐by‐eponas‐deeway/database‐icon
Extras
overview
hardware
codebase
challenges
next steps
Design Philosophies
low cost
simple
modular
durable
Low Cost
Small archives don’t have a lot of cash.
For example, the Afro has a single archivist for its 1.5 million images.
So we set a ceiling of $500.
Simple
This robot is a kit, which keeps costs down.
We want John Doe to assemble it without a PhD in robotics.
Modular
Open source components and design.
No reinventing the wheel here.
If something breaks, it can be replaced.
Durable
1.5 million images
overview
hardware
codebase
challenges
next steps
Mechanical
1. Lasercut MDF
2. Screw‐together, tab and slot
3. Strong servo base
4. Mini actuator
5. Vacuum pump
“The Brain”
Custom shield + Arduino Uno
Sensors
overview
demo
codebase
challenges
next steps
overview
hardware
codebase
challenges
next steps
Basics
Arduino firmware
Python software (cross platform)
MySQL database
Arduino Communication
Pyserial
Custom Protocol
‐ int + char
‐ char → firmware function
‐ JSON response
Functions mirrored in firmware & software
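The protocol outlined above (an integer plus a one-character opcode going out, a JSON reply coming back) can be sketched as follows. The slides do not give the actual framing, so everything specific here is an assumption for illustration: the opcode letters, the function names, the ASCII encoding, and the newline-terminated JSON reply are all hypothetical.

```python
import json

# Hypothetical opcode table; the real firmware's letters are not given in the slides.
COMMANDS = {"move_arm": "a", "set_vacuum": "v"}


def encode_command(name, value):
    """Build one command: an integer argument followed by a one-character opcode."""
    opcode = COMMANDS[name]
    return f"{int(value)}{opcode}".encode("ascii")


def send_command(port, name, value):
    """Write a command, then parse the reply (assumed: one line of JSON)."""
    port.write(encode_command(name, value))
    reply = port.readline()
    return json.loads(reply.decode("ascii"))


class FakePort:
    """Stand-in for a serial port so the sketch runs without hardware."""

    def __init__(self):
        self.written = b""

    def write(self, data):
        self.written += data
        return len(data)

    def readline(self):
        # Pretend the firmware acknowledges every command.
        return b'{"status": "ok"}\n'


port = FakePort()
response = send_command(port, "move_arm", 90)
print(port.written, response)
```

With real hardware, a pyserial `serial.Serial` object provides the same `write` and `readline` methods, so it could be passed in place of `FakePort`. Keeping one opcode table shared by both sides is one way to get the "functions mirrored in firmware & software" property.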
Scanning
Python TWAIN
Heavily modified; steal our copy!
Direct system call to SANE in Linux
Needs: Store scanner in memory
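On Linux, a "direct system call to SANE" might look like the sketch below, which shells out to SANE's `scanimage` command-line tool. This is not the project's actual code: the function names and default values are hypothetical, and `--resolution` is a backend-specific option that most, but not all, scanner backends accept.

```python
import subprocess


def build_scanimage_command(resolution_dpi=300, image_format="tiff", device=None):
    """Assemble the argument list for SANE's scanimage CLI."""
    cmd = ["scanimage", f"--format={image_format}", "--resolution", str(resolution_dpi)]
    if device is not None:
        cmd += ["-d", device]  # -d selects a specific SANE device
    return cmd


def scan_to_file(path, **options):
    """Run scanimage and write the image it prints on stdout to `path`."""
    with open(path, "wb") as out:
        subprocess.run(build_scanimage_command(**options), stdout=out, check=True)


cmd = build_scanimage_command(resolution_dpi=600)
print(" ".join(cmd))
```

Each call to `scan_to_file` starts a new process and reopens the scanner, which may be part of what the "store scanner in memory" need refers to.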
Collection Management
MySQL Database
Needs:
Adapting to hierarchical collections
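One common way to represent hierarchical collections in a relational database is an adjacency list, where each node stores its parent's id. The sketch below illustrates the idea. It uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs anywhere, and the table names, column names, and sample collection names are all hypothetical and not taken from the Gado schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collection_node (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES collection_node(id),  -- NULL for a root
    name TEXT NOT NULL
);
CREATE TABLE image (
    id INTEGER PRIMARY KEY,
    node_id INTEGER NOT NULL REFERENCES collection_node(id),
    front_path TEXT,
    back_path TEXT
);
""")


def add_node(name, parent_id=None):
    """Insert a collection node and return its id."""
    cur = conn.execute(
        "INSERT INTO collection_node (parent_id, name) VALUES (?, ?)",
        (parent_id, name),
    )
    return cur.lastrowid


def node_path(node_id):
    """Walk parent links up to the root and return a slash-separated path."""
    parts = []
    while node_id is not None:
        name, node_id = conn.execute(
            "SELECT name, parent_id FROM collection_node WHERE id = ?", (node_id,)
        ).fetchone()
        parts.append(name)
    return "/".join(reversed(parts))


root = add_node("Photographs")
series = add_node("Weddings", root)
box = add_node("Box 12", series)
path = node_path(box)
print(path)
```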
overview
hardware
codebase
challenges
next steps
Camera Scans
DSLR
expensive, no Windows
Webcam
cheap, low quality
Consumer camera
no control code
Community Involvement
Coding
Building
Tagging
overview
hardware
codebase
challenges
next steps
Code
http://code.google.com/p/project‐gado/
Need sample images?
Code questions?
[email protected]
Build
At ProjectGado.org:
• Lasercut files
• PCB gerbers
• Eagle files
• BOM
• and more
Partner
• ESDA LLC offers turn‐key setup for institutions, multiple scanning options
• Future partners
– Johns Hopkins
– University of Maryland
– MD State Archives
– You?
Contributors
Get Involved
GadoImages.com
@projectgado
[email protected]