Applied Imagery Pattern Recognition Workshop

Dedicated to facilitating the interchange of ideas between government, industry and
academia in an elegant setting conducive to technical interchange across a broad range
of disciplines
2014 IEEE Applied Imagery Pattern Recognition Workshop
AIPR 2014: Democratization of Imagery
Cosmos Club, Washington DC
October 14-16, 2014
AIPR 2014 is financially sponsored by the IEEE Computer Society.
Technical sponsorship and support are provided by the IEEE and the IEEE Computer Society
Technical Committee on Pattern Analysis and Machine Intelligence.
The AIPR Committee thanks Cybernet Systems Corporation, Integrity Applications
Incorporated and ObjectVideo Incorporated for their generous support of this year's
workshop.
The AIPR Officers:
Chairman:
Deputy Chair:
Program Chairs:
Secretary:
Treasurer:
Associate Treasurer:
Local Arrangements:
Publicity:
Web Master:
External Support:
Registration Chair:
Proceedings Chair:
Student Paper Award Chair:
Neelam Gupta, ARL
John Irvine, Draper Laboratory
Guna Seetharaman, AFRL/RI, Rome, NY
Karl Walli, Col USAF
James Aanstoos, Mississippi State University
Al Williams, Self Employed Engineering Consultant
Carlos Maraviglia, NRL
Donald J. Gerson, Gerson Photography
Peter Costianes, AFRL, Emeritus
Charles J. Cohen, Cybernet
Pete Doucette, Integrity Applications Incorporated
Jeff Kretsch, Raytheon BBN Technologies
Franklin Tanner, Hadron Industries
Murray Loew, GWU
Paul Cerkez, DCS Corp.
AIPR Committee Members:
Jim Aanstoos, Mississippi State University
Eamon Barrett, Retired
Bob Bonneau, AFOSR
Filiz Bunyak, University of Missouri-Columbia
John Caulfield, Cyan Systems
Charles Cohen, Cybernet
Peter Costianes, AFRL, Emeritus
Paul Cerkez, DCS Corp.
Peter Doucette, Integrity Applications Incorporated
Roger Gaborski, RIT
Donald J Gerson, Gerson Photography
Neelam Gupta, ARL
Mark Happel, Johns Hopkins University
John Irvine, Draper Laboratory
Steve Israel, Integrity Applications/ONR
Michael D Kelly, IKCS
Jeff Kretsch, Raytheon BBN Technologies
Murray H. Loew, GWU
Carlos Maraviglia, NRL
Paul McCarley, AFRL Eglin AFB
Robert Mericsko, Booz Allen Hamilton
Keith Monson, FBI
Carsten Oertel, MITRE
William Oliver, Brody School of Medicine
K. Palaniappan, University of Missouri-Columbia
James Pearson, Remote Sensing Consultant
Surya Prasath, University of Missouri-Columbia
Amit Puntambekar, Intel Corporation
Mike Pusateri, LSI Corporation
Harvey Rhody, RIT
David Schaefer, GMU
Guna Seetharaman, AFRL/RIEA
Karl Walli, AFIT
Elmer "Al" Williams, Self-Employed Engineer
Alan Schaum, NRL
Franklin Tanner, Hadron Industries
Emeritus:
Larry Clarke, NCI
Robert Haralick, City University of New York
Heidi Jacobus, Cybernet
Joan Lurie, GCC Inc.
J. Michael Selander, Mitre
2014 AIPR Recognizer Sponsors
ObjectVideo, Inc.
11600 Sunrise Valley Drive, Suite 210, Reston, VA 20191 USA; +1 (703) 654-9300
www.objectvideo.com
Twitter: @OVLabs
Proud Sponsor of the Applied Imagery Pattern Recognition Workshop
IEEE AIPR-2014 - Democratization of Imagery
Cosmos Club, Washington DC – Oct 14-16, 2014
Day 1: October 14 – Tuesday
8:00 AM Check-in
8:30 Opening remarks and welcome notes (10 min): Neelam Gupta, Don Gerson & Karl Walli
8:40 Keynote (30 min): Dr. Peter Highnam, Intelligence ARPA: An Overview
9:10 Q&A (5 min)
9:15AM-10:15AM. Session 1: Image and Video Analysis. Session Chair: Col Karl Walli, USAF.
4 talks, 15 min each including Q&A.
P1 9:15 Lin, Chung-Ching; Pankanti, Sharath. Accurate Coverage Summarization of UAVs
P2 9:30 Madden, Don. Mobile ISR: Intelligent ISR Management and Exploitation for the Expeditionary Warfighter
P3 9:45 Pritt, Mark. Fast Ortho-rectified Mosaicking of Thousands of Aerial Photographs from Small UAVs
P4 10:00 Irvine, John. Imagery-based Modeling of Social, Economic, and Governance Indicators in Sub-Saharan Africa
Coffee Break: 10:15 AM – 10:30 AM.
10:30AM-11:15AM Session 2: Analytics of Democratized Imagery. Session Chair: K. Palaniappan
10:30-10:50 Invited Talk. Dr. Lee Schwartz, the Department of State
10:50-11:10 Invited Talk. Robbie Schingler, COO, Planet Labs Inc.
11:10-11:15 Q&A 5 mins
11:15AM – 12:00N. Session 3: Contextual Models for Persistent ISR. Session Chair: Dr.
Raghuveer Rao
3 talks, 15 min each including Q&A.
P5 11:15 Rosario, Dalton. Against Conventional Wisdom: Longitudinal Inference for Pattern Recognition in Remote Sensing
P6 11:30 Tucker, Jonathan; Stanfill, Robert; Subramanian, Suresh. Role of Context in ISR Tracking
P7 11:45 Stylianou, Abby. Images Don't Forget: Online Photogrammetry to Find Lost Graves
12:00 Noon–1:30PM. Lunch Break
1:30 PM – 3:00 PM Session 4: Persistent Surveillance Analytics. Chairman: Dr. Suresh Subramanian
1:30 Keynote: Dr. Philip Perconti. Vision to Perception - Imagery Analytics for Land Forces
2:00 Q&A (5 min)
2:05 Invited Talk: Dr. Steve Suddarth. WAMI and grand scale ISR systems in the era of ISIS: It takes a hobbyist to take a village
2:25 Poster Preview (12 posters, 2 min each)
2:50PM-3:10PM Coffee Break
3:10PM-4:40PM. Session 5: Hyperspectral / Polarimetric Imagery and Exploitation.
Session Chair: Dalton Rosario
P8 3:10 Romano, Joao. A Day/Night Range Invariant Anomaly Detection for LWIR Polarimetric Imagery
P9 3:25 Holder, Joel. Calibration Methodology and Performance Characterization of a Polarimetric Hyperspectral Imager
P10 3:40 Furey, John. Visible and SWIR Hyperspectral Polarimetric Imaging of Desert Soils
P11 3:55 Gross, Kevin (Martin, Jake). Exploring Polarimetric Hyperspectral Imaging as a Tool for Improved Material Identification
P12 4:10 Ben-David, Avishai. Geodesic Paths for Time-Dependent Covariance Matrices in a Riemannian Manifold
P13 4:25 Chenault, David. Polarized Microbolometers for Target Detection in Video
4:40 PM -4:45 Day 1 Closing remarks: Karl Walli, Don Gerson and Neelam Gupta.
Poster session and reception.
Poster Session PS-4:45PM. Session Chairman: Carlos Maraviglia. Venue: Poster Hall.
1. Lee, Lily. Automated Geo-referenced 3D Reconstruction from Multiview Satellite Images
2. Kumar, Brajesh. Integrating Spectral and Textural Features Using Wavelet Analysis for Hyperspectral Image Classification
3. Ferris, Michael. Image Structural Similarity Based Metrics: A Performance Study for No-reference Blurred Image Fusion
4. Skurikhin, Alexei. Learning Tree-structured Approximations for Conditional Random Fields
5. Mahajan, Manish. Secret Communication in Colored Images Using Saliency Map as Model
6. Rahnemoonfar, Maryam. Automatic Detection of Ice Layers in Synthetic Aperture Radar Images
7. Ngan, Henry. Robust Vehicle Edge Detection by Cross Filter Method
8. Baran, Matthew. Time-adapted CLAHE for HDR Tonemapping of Full Motion Video
9. Schmitt, Daniel. Timing Mark Detection on Nuclear Detonation Video
10. Chaubey, Himanshu. Enhanced View Invariant Gait Recognition Using Feature Level Fusion
11. Chen, Yu; Wu, Ryan; Blasch, Erik. A Container-based Elastic Cloud Architecture for Real-Time Full-Motion Video Target Tracking
The Cosmos Club has a unique history. Mr. Donald Gerson will give a tour of the Club on
Oct 14 and Oct 15 to those who are interested, in three small groups each day. We also
encourage you to check the Club's webpage for the dress code.
Day 2: October 15 – Wednesday
8:00 AM Check-in
8:20 AM Opening remarks by Neelam Gupta, Don Gerson & Karl Walli
8:30AM Keynote: Prof. John Schott, DIRSIG (30 min + 10 min Q&A)
9:10AM-10:15AM Session-6: Three Dimensional Models from Motion Imagery. Session Chair:
John Irvine
P1 9:10 Aliakbarpour, Hadi, et al. Analyzing Effects of On-board Sensor Imprecision in WAMI Bundle Adjustment
P2 9:25 Swett, Bruce. Advanced Video Activity Analytics: Integrating and Maturing Computer Vision and Pattern Recognition Algorithms
P3 9:40 Recker, Shawn. Depth Data Assisted Structure-from-Motion
P4 9:55 Givens, Ryan. The HSI/Lidar Direct Method for Physics-based Scene Modeling
Coffee break 10:10-10:25AM
10:25AM – 11:15AM – Advanced Systems and Technologies: Current and Future Challenges.
Session Chair: Richard Thissell
10:25AM – Invited Talk. Scott Fouse, Lockheed Martin Co, California
10:55AM - Invited Talk. Joshua C. Klontz. A Case Study of Automated Face Recognition: The
Boston Marathon Bombings Suspects
11:15AM-12:00 Noon. Session 7: Tracking in Persistent ISR. Session Chair: Jim Aanstoos
P5 11:15 Viguier, Raphael. Particle Filter-based Vehicle Tracking Using Fused Spatial Features and a Non-linear Motion Model
P6 11:30 Basharat, Arslan. Multi-Target Tracking in Video with Adaptive Integration of Appearance and Motion Models
P7 11:45 Pritt, Carrie. Road Sign Detection on a Smartphone for Traffic Safety
12:00 Noon – 1:30PM. Lunch Break.
12:00 – 13:30. Executive Committee Meeting.
1:30 PM – 1:40: Logistics of AIPR2015 announcements – Neelam Gupta
1:40PM – 3:00PM. Session 8: Novel sensors and Applications for Persistent Imagery. Session Chair:
Al Williams
1:40
2:00
Invited Talk – Prof Rama Chellappa. Persistent and Pervasive Imaging Analytics
for Environmental Assessment of Oceanic Resources and Fish Stocks
Q&A 5 min
6
7
2:10
2:40
2:45
Keynote Talk – Dr Nibir Dhar. Enabling Technologies for Advanced Imaging
Q&A 5 min
Poster Reviews. 12 Posters, 2 min each.
Coffee Break: 3:10 PM – 3:25 PM
3:25-4:40PM. Session 9: Multispectral Imagery and Exploitation. Chair: Pete Doucette
P9 3:25 Khuon, Timothy S. Adaptive Automatic Target Recognition in Single and Multi-Modal Sensor Data
P10 3:40 Moody, Daniela. Change Detection and Classification of Land Cover in Multispectral Satellite Imagery Using Clustering of Sparse Approximations (CoSA) over Learned Feature Dictionaries
P11 3:55 Caulfield, John. Small Pixel Focal Plane Array
P12 4:10 Gurram, Prudhvi; Rao, Raghuveer. Entropy Metric Regularization for Computational Imaging with Sensor Arrays
P13 4:25 Johnson, Bruce. Computing a Heuristic Solution to the Watchman Route Problem by Means of Photon Mapping
4:40PM. Closing remarks by Karl Walli, Don Gerson and Neelam Gupta
4:45PM. Poster session, 12 posters. Session Chairman: Neelam Gupta
1. Krucki, Kevin. Human Re-Identification in Multi-Camera Systems
2. Khryashchev, Vladimir. Gender and Age Recognition for Video Analytics Solution
3. Harrity, Kyle. Medical Image Segmentation Using Multiscale and Super-Resolution Methods
4. Varney, Nina. Volumetric Features for Object Region Classification in 3D LiDAR Point Clouds
5. Borel-Donohue, Christoph. Analysis of Diurnal, Long-Wave Hyperspectral Measurements of Natural Background and Manmade Targets under Different Weather Conditions
6. Bhowmik, Mrinal. Background Subtraction Algorithm for Moving Object Detection and Evaluation of Performance Using Different Datasets
7. Lu, Min. A Fast Coherent Point Drift Algorithm for 3D Point-Cloud Registration
8. Pless, Robert. Democratizing the Visualization of 500 Million Webcam Images
9. Brown, Andrew. 3D Geolocation Methods
10. Sasi, Sreela. Human Activity Detection Using Sparse Representation
11. Green, Ashley (Lt); Slaughter, Robert (Capt); McClory, John (Dr.). Modeling of Nuclear Detonation Optical Output in DIRSIG
Banquet at 7PM.
Banquet Speaker: Richard Koszarski, Professor of English and Film at Rutgers University, "The
Coming of Sound to Motion Pictures."
DAY 3 - October 16 – Thursday
8:00AM – Check in
8:15AM – Opening remarks by Karl Walli
8:20AM – Keynote Talk – Richard Granger. Beyond neural networks to brain engineering:
How brains recognize, localize, search, and track
9:00AM-10:00AM Session 9: Medical Image Processing. Chairman: Murray H. Loew
P1 9:00 Albalooshi, Fatema. Automatic Detection and Segmentation of Carcinoma in Radiographs
P2 9:15 Lam, Walter. Novel Geometric Coordination Registration in Cone-beam Computed Tomogram
P3 9:30 Schaum, Alan. Bayesian Solutions to Non-Bayesian Detection Problems: Unification through Fusion
P4 9:45 Borel-Donohue, Christoph. Rapid Location of Radiation Sources in Complex Environments Using Optical and Radiation Sensors
10:00-10:15 Coffee Break
10:15AM-11:00AM Session 10: Radiation Monitoring Applications. Chairman: Alan
Schaum
P5 10:15 Slaughter, Robert. 3D Reconstructions of Atmospheric Nuclear Detonations
P6 10:30 Wagner, Stefan. Smooth Chemical Vapor Detection
P7 10:45 Schmitt, Daniel. Machine Learning of Nuclear Detonation Features
11:00AM-12:00 noon. Session 11: Seeing through Sound. Session Chairman: David
Schaefer
P8 11:00 Malcom, Andy. Foley: The Art of the Transparent Soundscape
11:20 Invited Talk. Ella Striem-Amit. Visual-to-auditory sensory substitution and the "visual" functional specialization in the blind brain
P9 11:40 Schaefer, Edward. Representing Pictures with Sound
Lunch 12:00 – 13:00
13:00: Announcement on Student Paper Awards.
13:05-14:20 Session 12: Learning and Exploitation Frameworks I. Chairman: K.
Palaniappan.
P10 13:05 Comaniciu, Dorin. Marginal Space Deep Learning for Efficient Anatomy Detection in Volumetric Image Data
P11 13:20 Puntambekar, Amit (Intel). Cloud-based Power Efficient High Performance Video CODEC
P12 13:35 Collins, Roddy. KWiver: An Open-Source Cross-Platform Video Exploitation Framework
P13 13:50 Walvoord, Derek. An Automated Workflow for Observing Track Data in 3-Dimensional Geo-accurate Environments
P14 14:05 Verma, Nishchal. Large Displacement Optical Flow Based Image Predictor Model
14:20 – 14:35 Coffee Break
14:35 – 15:35 Session 13: Learning and Exploitation Frameworks II. Chairman: Frank
Tanner
P15 14:35 Chin, Tommy. Indoor Non-Linear Target Tracking Using Wi-Fi and Video Fusion
P16 14:50 Harrity, Kyle. Modified Deconvolution and Wavelets Based Fusion
P17 15:05 Prater, Ashley. Sparse Approximations to Generalized Fourier Series through Linear Programming
P18 15:20 Peng, Jing. Approximate Regularized Least Squares and Parzen Windows
15:35 – 15:45 Coffee Break
15:45 – 16:30 Session 14: Fusion and Mining for Video Analysis. Chairman: Ashley Prater
P19 15:45 Amankwah, Anthony. Motion Estimation of Flotation Froth Using Mutual Information and Kalman Filter
P20 16:00 Harrity, Kyle. Multiresolution Deblurring
P21 16:15 Oh, Sangmin. Towards Visual Analysis of Unconstrained Images in Social Forums: Studies on Concept Detection and Personalized Economy of Images in Social Networks
16:30: Closing Remarks by Karl Walli and Guna Seetharaman
16:35: Adjourn Notice: Neelam Gupta and Don Gerson.
IEEE AIPR 2014 KEYNOTE TALKS
ABSTRACTS AND SPEAKER BIOGRAPHIES
Keynote 1: “Intelligence ARPA: An Overview” by Dr. Peter Highnam, Director, IARPA.
Dr. Peter Highnam was named IARPA Director on 30 August 2012. Dr. Highnam joined IARPA
in February 2009 as the Office Director for Incisive Analysis. Prior to IARPA, he was a senior
advisor in the National Institutes of Health (NIH) and then in the Biomedical Advanced Research
and Development Authority (BARDA). From 1999 to 2003, Dr. Highnam was a DARPA
program manager with programs in electronic warfare and airborne communications. Before
joining DARPA, he worked for more than a decade in applied research in industry. Dr. Highnam
holds a Department of Health and Human Services Secretary’s Distinguished Service Award and
a Department of Defense Civilian Exceptional Service Award. He is a co-inventor on three
patents in commercial seismic exploration and holds a doctorate in computer science from
Carnegie Mellon University.
Keynote 2: “The MapGive Project” by Dr. Lee R. Schwartz, U.S. Department of State
As Geographer of the United States, Lee Schwartz holds the position of the Director of the Office
of The Geographer and Global Issues in the State Department's Bureau of Intelligence and
Research. Schwartz is the State Department’s 9th Geographer, a position that was established in
1921 and bears the statutory responsibility for providing guidance to all federal agencies on
questions of international boundaries and sovereignty claims. He also oversees the Humanitarian
Information Unit – a U.S. government interagency organization focused on unclassified data
coordination for emergency preparedness, response, and mitigation. Dr. Schwartz earned his
Ph.D. in geography from Columbia University, with a focus on political and population
geography. Prior to joining the Office of The Geographer, Schwartz was a member of the faculty
of The American University’s School of International Service. At the Department of State, he
has directed research and analysis on global issues primarily related to complex humanitarian
emergencies and has coordinated related fieldwork and applied geography projects overseas, in
particular in the Balkans, Central Asia, Russia, Afghanistan, Iraq, Sudan, the Horn of Africa,
Haiti, and Syria. His work has focused on ethnic conflict, refugee flows, peacekeeping
operations, strategic warning, and conflict mitigation and response – with an emphasis on
Geographic Information Systems (GIS) and Remote Sensing information coordination as well as
Participatory Mapping and Volunteered Geographic Information applications. Lee was the State
Department’s 2005 winner of the Warren Christopher Award for Outstanding Achievement in
Global Affairs and the 2012 recipient of the Association of American Geographers’ James R.
Anderson Medal of Honor in Applied Geography in recognition of his distinguished service to the
profession of geography.
Abstract: The United States Department of State’s Humanitarian Information Unit (HIU) is a
unique interagency entity designed to break down stovepipes between federal agencies and
coordinate geospatial information critical for humanitarian mitigation and response. Housed in
the Office of The Geographer and partnering with agencies that include the National Geospatial-Intelligence Agency (NGA) and the US Department of Defense (DOD), the HIU has developed
methodologies and toolkits for geospatial data coordination and collaboration – including
disseminating and sharing with NGO and intergovernmental organizations that do the bulk of
first-time response to both rapid onset and complex emergencies. Examples of The Geographer’s
role in disaster resilience efforts will be drawn from emergencies that include the Pacific Tsunami
and Haiyan Typhoon, and initiatives such as "Imagery to the Crowd/MapGive" and
“ROGUE/GeoSHAPE” that are helping to transform the way governments and nongovernmental
organizations collaborate on collecting, curating and sharing geospatial data critical for disaster
response. With test trials currently underway, the talk will address what has been learned so far
and how these technologies can enhance operational effectiveness: breaking down the silo systems
that inhibit data sharing and coordination; leveraging open source tools and platforms to facilitate
mobile and "disconnected" data collection and editing; and allowing multiple organizations to
contribute to the collection and maintenance of GIS information to improve overall situational awareness.
Keynote 3: “Vision to Perception - Imagery Analytics for Land Forces” by Dr. Philip
Perconti, Director, US Army Research Laboratory – Sensors and Electron Devices
Directorate, Adelphi, MD.
Dr. Perconti currently serves as the Director of the Sensors & Electron Devices Directorate of the
Army Research Laboratory. He has responsibility for leading and transitioning the Army’s
primary basic and applied research programs in sensors, electronics, signal processing, and power
and energy component technologies. His duties include operation of unique electronics and
photonics materials fabrication and characterization facilities that enable world-class, Army-relevant component research and development. He is responsible for planning, executing and
balancing mission and customer program needs to ensure science and technology dominance for
the Army. He served as the Director, Science and Technology Division, US Army CERDEC
Night Vision and Electronic Sensors Directorate (NVESD), from 2000-2013. He was also the
Director, Electronics & Photonics Technology Office, National Institute of Standards &
Technology (NIST) during 1996-2000. He led the Imaging Technology Branch, US Army
CERDEC Night Vision and Electronic Sensors Directorate during 1990-1996.
Abstract: It is now widely recognized that, owing to the rapid growth in sensor technologies and
their deployment, massive amounts of data are ingested in many operational and
reconnaissance missions, but progress in the ability to rapidly exploit the data has lagged behind.
Imagery in its various forms is typically the principal contributor to the data glut. While there has
been ongoing research for a long time in developing techniques for automated analysis that in
turn helps human analysts, the challenges to achieving practical solutions are many. Emphasis has
progressed from the relatively simple notion of automatic target recognition to one of recognizing
and predicting activities. This talk will outline the various challenges and provide an overview of
the Army Research Laboratory's research and collaborative efforts in the field, ranging from
much needed ground truth imagery collection to development of innovative solutions.
Keynote 4: “DIRSIG” by Dr. John Schott, Research Professor
Dr. John Schott is The Frederick and Anna B. Wiedman Professor in RIT's Chester F. Carlson
Center for Imaging Science. He has been a respected member of RIT's faculty since 1981. His
early impact at RIT laid the cornerstone for the university's imaging science program, where he
has been a leading researcher, educator, and mentor of students for decades. From this post, John
has also been a part of NASA's Landsat Science Team, and the founding director of the Digital
Imaging and Remote Sensing (DIRS) Laboratory at RIT.
http://www.rit.edu/alumni/ihf/inductee.php?inductee=21
Keynote 5: “Enabling Technologies for Advanced Imaging,” Dr. Nibir K. Dhar, Deputy
Director, Night Vision Electronic Sensors Directorate.
Advances in imaging technology have huge impact on our daily lives. Innovations in optics, focal
plane arrays (FPA), microelectronics and computation have revolutionized camera design. As a
result, new approaches to camera design and low-cost manufacturing are now possible. These
advances are clearly evident in the visible wavelength band due to pixel scaling, improvements in
silicon material and CMOS technology. CMOS cameras are available in cell phones and many
other consumer products. Advances in infrared imaging technology have been slow due to market
volume and many technological barriers in detector materials, optics and fundamental limits
imposed by pixel scaling and optics. There is of course much room for improvement in both
visible and infrared imaging technology. This presentation will describe the imaging technology
challenges and related projects currently fielded through the DARPA/MTO office. In particular,
new technology development under a portfolio program, "Advanced Wide Field of
View Architectures for Image Reconstruction and Exploitation (AWARE)," will be described,
highlighting the following: a modular and scalable camera architecture to overcome scaling
limitations of conventional imaging system design and to demonstrate the feasibility of near-linear growth of optical information throughput with increasing imaging system scale.
Advancement in infrared pixel scaling and high density FPA technology. Development of Focal
Plane Arrays with broadband and multi-band day/night FPA technology. Low cost manufacturing
and applications of microbolometer thermal technology.
Dr. Nibir Dhar received a master’s (1993) degree and Ph.D. (1996) in Electrical Engineering
from the University of Maryland at College Park in the area of Microelectronics and Electrophysics. He received a bachelor's degree in Electrical and Computer Engineering from George
Mason University. Dr. Dhar joined NVESD in April of 2014 as the Deputy Director for science
and technology to advance the S&T division R&D activities. Prior to joining NVESD, Dr. Dhar
served as a program manager in the Microsystems Technology Office at DARPA from March
2008. At DARPA he developed various innovative technologies in EOIR imaging and
transitioned several products. His efforts added significant value to the warfighter's objectives
and to the imaging community at large. Dr. Dhar’s work focused in the areas of novel
architectures in infrared detectors and imaging systems, nanoelectronics including NEMS/MEMS
components, novel materials synthesis techniques, bio-inspired concepts, and approaches to low
SWaP-C technologies. Prior to joining DARPA, Dr. Dhar worked as Branch Chief/Team Leader
at the Army Research Laboratory (ARL) where he led the Integrated Power and Sensing
Technology group to develop infrared sensors, soldier portable power sources, thin films,
nanomaterials development and integration of sensor/power technologies. Dr. Dhar was
responsible for a wide variety of infrared focal plane array technology including mercury
cadmium telluride materials based focal plane arrays (FPA), quantum well infrared
photodetectors, Type-II strained layer superlattice, quantum dot infrared detectors and inter-band
cascade Lasers. Dr. Dhar received numerous awards and recognitions including the Office of the
Secretary of Defense Medal for Exceptional Public Service in 2014. He is a fellow of SPIE.
Keynote 6: Beyond neural networks to brain engineering: How brains recognize, localize,
search, and track. Professor Richard Granger, Dartmouth College, NH.
Abstract: Great progress has been made in object recognition, but many of the characteristics of
real world vision tasks remain unsolved. In particular, i) recognition predominantly relies on
labeled datasets, yet these are hugely outnumbered by unlabeled image collections; ii) despite the
abundance of video data (and the fact that real-world vision occurs in a moving world), most
work in computer vision focuses on still images; iii) the tasks of segmentation, recognition,
localization, and tracking are typically treated as distinct, yet evidence suggests that information
from one can guide the others. We will describe systems being developed to recognize and track
objects in real-world environments based on algorithms derived from visual processing
mechanisms in brain circuitry.
Bio: Richard Granger is a professor at Dartmouth with positions in the Psychological and Brain
Sciences Department and the Thayer School of Engineering, and the director of Dartmouth's
Brain Engineering Laboratory. He received his Bachelor's and Ph.D. from MIT and Yale. His
research ranges from computation and robotics to cognitive neuroscience, and he is the author of
more than 100 scientific papers and numerous patents. He is a consultant to, and on the boards
of, a number of technology corporations and government research agencies, and he has been the
principal architect of advanced computational systems for military, commercial, and medical
applications.
INVITED TALKS
ABSTRACTS and BIOGRAPHIES
Robbie Schingler, Planet Labs Inc. Democratizing Satellite Image Galleries for Scientific
Studies
Planet Labs has launched, and is steadily expanding, a network of Earth-imaging satellites called
"Doves." The network is aimed at creating an open-data-access-fueled scientific revolution in
domains where persistent imagery of large areas is critical, at a scale previously possible only for
agencies such as NASA. In January 2014, the company delivered Flock 1, the world's largest constellation
of Earth-imaging satellites, made up of 28 Doves. Together with subsequent launches, they have
launched 71 Doves, toward imaging the entire Earth, every day. Planet creates commercial and
humanitarian value with the market's most capable global imaging network. Fresh data from any
place on Earth is foundational to solving commercial, environmental, and humanitarian
challenges. Our global sensing and analytics platform unlocks the ability to understand and
respond to change at a local and global scale.
Robbie is the Chief Operating Officer of Planet Labs, and is responsible for the company’s
business operations and product development. Previously, Robbie worked at NASA serving as
the Chief of Staff for the Office of the Chief Technologist, incubating the Space Technology
Program. He managed the exoplanet-finding mission TESS, and served as lead for NASA’s Open
Government activities. Robbie received a BS in Engineering Physics from Santa Clara
University, an MBA from Georgetown, and a Masters from the International Space University.
Steve Suddarth, Transparent Sky LLC, NM. WAMI and grand scale ISR systems
in the era of ISIS: It takes a hobbyist to take a village
Abstract: Recent events have shown that the world is currently at risk of major regional or even
larger conflicts if current religious, sectarian, and national trends continue unabated. A
significant goal in addressing these threats is to stop them while they are as small as possible. A
significant additional constraint is that this task must be accomplished using few, if any "boots on
the ground". Over the past decade, the U.S. Military created and deployed several massive ISR
systems to great effect. The author's view is that we could have even greater impact with some
very small, simple, lightweight ISR technologies that fit into a cloud architecture. Although
imperfect when compared to forces on the ground, such systems could provide the numbers,
reach, and detailed up-to-date information to balance advances in sectarian violence at the village
and town level. The talk will present a technical approach to engineering systems that may be
particularly apt to addressing the kinds of rapidly evolving widespread threats that are currently
seen in the Middle East. The solutions go beyond WAMI and involve large-scale data-handling
and functions that could be useful to a broad constituency in practical, affordable and rapidly
producible forms.
Dr. Steve Suddarth, a retired U.S. Air Force Colonel, is the founder and president of Transparent
Sky LLC, which is focused on building affordable and versatile wide-area sensing and
exploitation systems. He played several key roles in the creation of modern WAMI systems and
could reasonably be considered the architect of many of their key advancements. Dr. Suddarth's
work in related technologies spans nearly three decades, including contributions in machine
learning, real-time embedded image processing, autonomous drone flight control, and complex
systems. He conceived of and led the development of the concept known as Angel Fire and (later)
Blue Devil. This included: Developing the concept together with Livermore Lab, performing
initial feasibility analysis, naming the project, marketing it to USSTRATCOM, CENTCOM, AF
Materiel Command (AFMC); Forming a team that would ultimately involve Los Alamos Lab
(LANL), Kitware, Inc., AF Inst. of Technology (AFIT), AF Research Lab (AFRL), and Jet
Propulsion Lab (JPL); Arranging for and conducting initial data collections at LANL with the
assistance of the AFIT OpTech team; Finding continuing funding through a Congressional
appropriation; Providing the initial code that was used to start development. Developing the first
version of the Angel Fire software in two months’ time, leading a team of three key engineers.
Leading collaborations with the Constant Hawk program throughout the summer of 2005.
Leading Angel Fire’s hardware design at LANL and AFIT from fall 2005 through 2006. Leading
the growing team for two years through numerous field demonstrations and development of the
system to near deployment-ready status. Leading field demonstrations that led to the decision to
deploy with the U.S. Marines. Assisting with deployment planning for the USMC. Leading the
transition of the effort to AFRL for deployment the following year. Dr. Suddarth has proven the
ability to assemble the small, efficient, affordable team required, and a concept like Village
Peacekeeper could be realized in months with proper support.
Prof. Rama Chellappa. Summary of the Workshop on Robust Methods for the
Analysis of Images and Videos for Fisheries Stock Assessment
At a recent workshop organized by the National Research Council in May 2014, researchers
working in computer vision and fish stock assessment came together to explore potential areas of
collaboration. In this talk, the speaker will present a brief summary of the discussions that took
place at this workshop. While there are some commonalities between these two areas, such as
detection and tracking, shape analysis, metrology and fine-grained classification, the differences
in image acquisition conditions and the environments in which data are collected call for more
robust extensions of existing computer vision methods for fish stock assessment.
Prof. Rama Chellappa is a Minta Martin Professor of Engineering and the Chair of the ECE
department at the University of Maryland. He is a recipient of the K.S. Fu Prize from IAPR; the
Society Award, Technical Achievement Award, and Meritorious Service Award from the IEEE Signal
Processing Society (SPS); and the Technical Achievement and Meritorious Service Awards from
the IEEE Computer Society. At UMD, he has received college- and university-level recognitions
for research, teaching, innovation and mentoring of undergraduate students. He is a Fellow of
IEEE, IAPR, OSA, AAAS and ACM and holds four patents.
Joshua Klontz, Noblis, West Falls Church, VA. The investigation surrounding the
Boston Marathon bombings: A Case Study of Missed Opportunity.
The investigation surrounding the Boston Marathon bombings was a missed opportunity for
automated facial recognition to assist law enforcement in identifying suspects. We simulate the
identification scenario presented by the investigation using two state-of-the-art commercial face
recognition systems, and gauge the maturity of face recognition technology in matching low
quality face images of uncooperative subjects. Our experimental results show one instance where
a commercial face matcher returns a rank-one hit for suspect Dzhokhar Tsarnaev against a one
million mugshot background database. Though issues surrounding pose, occlusion, and
resolution continue to confound matchers, there have been significant advances made in face
recognition technology to assist law enforcement agencies in their investigations.
Josh is a software engineer at Noblis in Falls Church, VA. He received his B.S. in Computer
Science from Harvey Mudd College. Prior to Noblis, Josh worked at MITRE implementing face
recognition algorithms and applications. Afterwards, he studied under Dr. Anil Jain at Michigan
State University, focusing on unconstrained and cross-modality face recognition algorithms. Josh
is the primary author and maintainer of the Open Source Biometric Recognition project
(www.openbiometrics.org), which supports training, evaluating, and deploying biometric
algorithms. Josh is also the inventor of a domain specific programming language for image
processing and statistical learning called Likely (www.liblikely.org).
Visual-To-Auditory Sensory Substitution and the “Visual” Functional
Specialization in the Blind Brain
Ella Striem-Amit, [email protected]
Blindness is a highly limiting condition, affecting millions of people worldwide, and despite
many scientific advances there is currently still no widely applicable way to remedy many eye
diseases and conditions and thus restore sight. An alternative approach suggests bypassing the
eyes altogether, and delivering the visual information directly to the blind brain by using their
intact senses. Such sensory substitution devices (SSDs) rely on a real-time translation of visual
images to either sounds or touch, and teaching the blind how to process these sensorily-transformed inputs. But what can the blind perceive using such artificial visual information? Can
people who have been born blind learn to "see", and to identify objects and people in their
surroundings? And how does the blind brain, which was deprived of its natural input from birth,
even process this information? Does it develop the visual specializations that characterize the
normally developing brain? We tested these questions by teaching a group of fully congenitally
blind participants how to use a visual-to-auditory sensory substitution device (the vOICe),
examining their visual achievements and scanning their brains using functional magnetic
resonance imaging (fMRI). We found that following a relatively limited training paradigm of
only tens of hours (on average 73 hours), the blind could learn to identify that images contain
objects, people’s body-shapes, as well as read letters and words. Using the Snellen acuity test we
assessed their visual acuity, which surpassed the World Health Organization (WHO) blindness
acuity threshold. This demonstrates the potential capacity of SSDs as inexpensive, non-invasive
visual rehabilitation aids, alone or when supplementing visual prostheses. Using fMRI we showed
that several principles of organization of the normally developing visual cortex are retained in the
blind, such as the general division into two processing streams, and the category selectivity in
areas normally preferring visual written script (the visual word-form area) and body-shapes (the
extrastriate body area). Thus, the visual cortex showed retention of functional selectivity despite
the atypical auditory sensory input, the lack of visual experience, the limited training duration
and the fact that such training was applied only in adulthood. These findings have both practical
bearings in relation to sight restoration and the development of sensory aids for the blind as well
as theoretical bearings regarding our understanding of brain organization.
Dr. Striem-Amit earned her doctoral degree from the Hebrew University, Israel. The clinical aspect of
her research entailed developing and applying dedicated training methods for visual
rehabilitation, in teaching blind individuals how to use sensory-substitution devices. The training
program she led has so far enabled fully and congenitally blind individuals to conduct
extraordinary tasks, such as walk in busy corridors while avoiding obstacles, identify and locate
everyday objects, notice people in their surroundings and identify their facial expression. An
example of these achievements can be seen in a movie appended to one of her recent
publications:http://www.plosone.org/article/fetchSingleRepresentation.action?uri=info:doi/10.13
71/journal.pone.0033136.s001 (for a streaming version see: http://tinyurl.com/bqe6oz3).
Additionally, she assisted in the development and testing of new devices and sensory aids.
Abstracts
Accurate Coverage Summarization of UAVs
Chung-Ching Lin, Sharath Pankanti and John Smith. Exploratory Computer Vision, IBM T. J.
Watson Research Center, Yorktown Heights, NY 10598. [email protected], [email protected]
Automatic video summarization is an important representative workload for UAV operations. A
predominant fraction of UAV videos are never watched or analyzed and there is growing interest
in having a summary view of the UAV videos for human consumption for obtaining a better
overall perspective of the visual content. Real time summarization of the UAV video events is
also important from tactical perspective. Our research focuses on developing resilient algorithms
for summarizing videos that can be efficiently processed either onboard or offline. Our previous
work [1] on the video summarization has focused on the event summarization. More recently, we
have investigated the challenges in providing the coverage summarization of the video content
from UAV videos. Unlike traditional coverage summarization, which takes an SfM approach
(e.g., [3]) based on SIFT [2] feature points, UAV videos present many additional challenges,
including jitter, low resolution, poor contrast, and a lack of salient features. We have attempted to
use the conventional approach to summarize UAV videos and have found that the feature
correspondence algorithms do not perform well; consequently, the coverage summarization
results display many artifacts. To overcome these challenges, we propose a novel
correspondence algorithm that exploits 3D context to potentially alleviate correspondence
ambiguity. Our preliminary results on the VIRAT dataset show that our algorithm can find many
correct correspondences in low-resolution imagery while avoiding many of the false positives
produced by traditional algorithms.
Acknowledgments: This work is sponsored in part by Defense Advanced Research Projects
Agency, Microsystems Technology Office (MTO), under contract no. HR0011-13-C-0022. The
views expressed are those of the authors and do not reflect the official policy or position of the
Department of Defense or the U.S. Government. This document is: Approved for Public Release,
Distribution Unlimited.
Mobile ISR: Intelligent ISR Management and Exploitation for the
Expeditionary Warfighter
Donald Madden, Tae Eun Choe, Hongli Deng, Kiran Gunda, Himaanshu Gupta, Asaad
Hakeem (DAC), Narayanan Ramanathan, Zeeshan Rasheed and Ethan Shayne
ObjectVideo, Inc. and Decisive Analytics Corp.
[email protected], [email protected], [email protected],
[email protected], [email protected], [email protected],
[email protected], [email protected], [email protected]
Modern warfighters are informed by an expanding variety of ISR sources, but the timely
exploitation of this data poses a significant challenge. ObjectVideo presents a system, Mobile
ISR, to facilitate ISR knowledge discovery for expeditionary warfighters. The aim is to collect,
manage, and deliver time-critical information when and where it is needed most. The Mobile ISR
system consumes video, still imagery, and target metadata from airborne, ground-based, and
hand-held sensors, and indexes that data based on content using state-of-the-art video analytics
and user tagging. The data is stored in a geospatial database and disseminated to warfighters
according to their mission context and current activity. The warfighters use an Android mobile
application to view this data in the context of an interactive map or augmented reality display,
and to capture their own imagery and video. A complex event processing engine and image-
based search enable powerful queries to the knowledge base. The system leverages the extended
DoD Discovery Metadata Specification (DDMS) card format, with extensions to include
representation of entities, activities, and relationships.
Fast Orthorectified Mosaicking of Thousands of Aerial Photographs
from Small UAVs
Mark Pritt,
Lockheed Martin Company, [email protected]
Small unmanned air vehicles (UAVs) provide an economical means of imaging large areas of
terrain at far lower cost than satellites. Applications of these systems range from precision
agriculture to law enforcement to power line maintenance. Because small UAVs fly at low
altitudes of approximately 100 meters, their cameras have only a limited field of view and must
take thousands of photographs to cover a reasonably sized area. Furthermore, to provide a unified
view of the area, these photographs must be combined into a seamless photo mosaic. The
conventional approach for accomplishing this mosaicking process is called block bundle
adjustment, and it works well if there are only a few hundred photographs. When there are
thousands of photographs, however, this method fails because its memory and computational time
requirements become prohibitive. We have developed a new technique that replaces
block bundle adjustment with an iterative algorithm that is very fast and requires little memory.
After pairwise image registration, the algorithm projects the resulting tie points to the ground and
moves them closer to each other to produce a new set of control points. It fits the image
parameters to these control points and repeats the process iteratively to convergence. Results from
UAVs for precision agriculture will be presented. The resulting mosaics cover hundreds of acres
and have a GSD (ground sample distance) of less than one inch.
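The following minimal Python sketch illustrates one plausible form of the iteration described above; the image objects, their project_to_ground() and fit_to_control_points() methods, and the tie-point format are hypothetical placeholders for illustration, not the author's implementation.

import numpy as np

def iterative_mosaic_adjustment(images, tie_points, max_iter=50, tol=1e-3):
    """Iteratively refine per-image parameters without a global block bundle adjustment.
    images: list of hypothetical camera-model objects; tie_points: (img_i, img_j, px_i, px_j)."""
    for _ in range(max_iter):
        residuals, control_points = [], []
        for img_i, img_j, px_i, px_j in tie_points:
            # Project each matched pixel to ground coordinates with the current parameters.
            g_i = np.asarray(images[img_i].project_to_ground(px_i))
            g_j = np.asarray(images[img_j].project_to_ground(px_j))
            # Move the two ground points toward each other: their midpoint becomes a
            # common control point shared by both images.
            mid = 0.5 * (g_i + g_j)
            control_points.append((img_i, px_i, mid))
            control_points.append((img_j, px_j, mid))
            residuals.append(np.linalg.norm(g_i - g_j))
        # Refit each image's orientation and position parameters to its own control points.
        for k, img in enumerate(images):
            img.fit_to_control_points([(px, gp) for i, px, gp in control_points if i == k])
        if np.mean(residuals) < tol:  # converged: tie points now agree on the ground
            break
    return images

Because each image is refit independently against shared ground control points, memory use stays proportional to a single image's parameters rather than to the full block, which is the property the abstract emphasizes.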
Imagery-based Modeling of Social, Economic and Governance
Indicators in Sub-Saharan Africa
John Irvine. Chief Scientist for Data Analytics, Draper Laboratory [email protected]
Many policy and national security challenges require understanding the social, cultural, and
economic characteristics of a country or region. Addressing failing states, insurgencies, terrorist
threats, societal change, and support for military operations requires a detailed understanding of
the local population. Information about the state of the economy, levels of community support
and involvement, and attitudes toward government authorities can guide decision makers in
developing and implementing policies or operations. However, such information is difficult to
gather in remote, inaccessible, or denied areas. Draper’s previous work demonstrating the
application of remote sensing to specific issues, such as population estimation, agricultural
analysis, and environmental monitoring, has been very promising. In recent papers, we extended
these concepts to imagery-based prediction models for governance, well-being, and social capital.
Social science theory indicates the relationships among physical structures, institutional features,
and social structures. Based on these relationships, we developed models for rural Afghanistan
and validated the relationships using survey data. In this paper we explore the adaptation of those
models to sub-Saharan Africa. Our analysis indicates that, as in Afghanistan, certain attributes of
the society are predictable from imagery-derived features. The automated extraction of relevant
indicators, however, depends on both spatial and spectral information. Deriving useful measures
from only panchromatic imagery poses some methodological challenges, and additional research
is needed.
Against Conventional Wisdom: Longitudinal Inference for Pattern
Recognition in Remote Sensing
Dalton Rosario (a), Christoph Borel (b), Joao Romano (c)
(a) Army Research Laboratory, 2800 Powder Mill Rd., Adelphi, MD 20783 USA;
[email protected]; phone +1 301-394-4235
(b) Air Force Institute of Technology, WPAFB, OH 45433-7765, USA
(c) U.S. Army Armament RDEC, Picatinny Arsenal, NJ 07806, USA
In response to the 2014 IEEE AIPR theme (Democratization of Imagery), we discuss a persistent
imaging experiment dataset, which is being considered for public release in the foreseeable future,
and present our observations analyzing a subset of the dataset. The experiment is a long-term
collaborative effort among the Army Research Laboratory, Army Armament RDEC, and Air
Force Institute of Technology that focuses on the collection and exploitation of longwave infrared
(LWIR) hyperspectral and polarimetric imagery. In this paper, we emphasize the inherent
challenges associated with using remotely sensed LWIR hyperspectral imagery for material
recognition, and argue that the idealized data assumptions often made by the state of the art
methods are too restrictive for real operational scenarios. We treat LWIR hyperspectral imagery
for the first time as longitudinal data and aim to propose a more realistic framework for
material recognition as a function of spectral evolution over time, including its limitations. The
defining characteristic of a longitudinal study is that objects are measured repeatedly through
time and, as a result, data are dependent. This is in contrast to cross-sectional studies in which the
outcomes of a specific event are observed by randomly sampling from a large population of
relevant objects, where data are assumed independent. The scientific community generally
assumes the problem of object recognition to be cross-sectional. We argue that, as data evolve
over a full diurnal cycle, pattern recognition problems are longitudinal in nature, and that
applying this knowledge may lead to better algorithms.
Role of Context in ISR
Jonathan Tucker, Robert Stanfill and Suresh Subramanian, Lockheed Martin Company.
[email protected]; [email protected]; [email protected]
Detecting and tracking vehicles within Wide Area Motion Imagery can be a very difficult
problem. Merging static detectors with motion-based detectors can improve ROC curves. Traffic
patterns, situational awareness, and local knowledge can be exploited to guide algorithm
responses to better synthesize temporal information. Utilizing available shape files and context
labels has allowed for more complete information exploitation in detection and tracking.
Images Don’t Forget: Online Photogrammetry to Find Lost Graves
Abby Stylianou, Joseph D. O'Sullivan, Austin Abrams & Robert Pless
Department of Computer Science & Engineering, Washington University in St. Louis, St. Louis,
Missouri [email protected], [email protected], [email protected], [email protected]
The vast amount of photographic data posted and shared on Facebook, Instagram and other
social media platforms offers an unprecedented visual archive of the world. This archive captures
events ranging from birthdays, graduations and family trips, to lethal conflicts and human rights
violations. The public availability of much of this archive plays an important role in a new genre
of journalism, one led by citizens finding, analyzing, and synthesizing data into stories that
describe important events. To support this, we have built a set of browser-based tools for the
calibration and validation of online images. This paper presents these tools in the context of their
use in finding two separate lost burial locations. Often, these locations would have been marked
with a headstone or tomb, but for the very poor, the forgotten, or the victims of extremist violence
buried in unmarked graves, the geometric cues present in a photograph may contain the only
remaining reliable information about the burial location. The tools described in this paper allow
individuals without any significant geometry background to utilize those cues to locate the lost
graves, or to localize any other outdoor image with sufficient correspondences to the physical world. This
paper will also explain the difficulties that arise due to geometric inconsistencies between
corresponding points, especially when significant changes have occurred in the physical world
since the photo was taken, and highlight visualization features on our browser-based tools that
help users to address this.
Student Author: Joseph D. O'Sullivan
A Day/Night Range Invariant Anomaly Detection For LWIR
Polarimetric Imagery
Joao M. Romano, US Army RDEC, Picatinny, NJ [email protected] and Dalton S.
Rosario, US ARL, Adelphi, MD [email protected]
Long Wave Infrared (LWIR) polarimetric imagery has the potential to enhance the detection
of manmade objects against natural clutter backgrounds, relative to conventional broadband imagery, by
taking advantage of the fact that smooth manmade objects tend to emit strongly linearly polarized
radiation relative to natural clutter. In the past, detection of manmade objects was accomplished by
observing Stokes vector information such as the S1 and S2 components as well as the Degree of
Linear Polarization (DoLP). Although quite useful, one of the main complaints about Stokes vector
information is that it is viewing-angle dependent, and the same sensor placed at two different angles
from the manmade object surface may or may not detect the object of interest. This paper
explores moving away from Stokes information and instead focusing on the independent
polarization imagery measurements (e.g., 0 and 90) to compose a cube of polarization
information. From this cube it will be shown that 3-dimensional manmade objects can be
discriminated from natural clutter background by taking advantage of covariance-difference
discriminating functions. The paper proposes to use the M-Box covariance test to demonstrate
how well it performs compared to the Stokes parameters. The paper will also present a
background characterization methodology to be used with the M-Box covariance method to make
it range invariant by randomly sampling the test image with N blocks of data. The intention is to
do the following: 1) present the features that separate 3-dimensional manmade objects from
natural clutter; 2) propose a covariance-difference discriminant function; 3) demonstrate how
integrating a background random sampling approach with the proposed test allows the test to
become range invariant; and 4) demonstrate the impact that increasing N, the number of random
blocks collected from the image, has on the probability of detection for a variety of targets.
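For reference, assuming the "M-Box covariance test" refers to the standard Box's M test for homogeneity of covariance matrices (the abstract does not define it), the statistic compares block sample covariances against their pooled estimate:

M \;=\; (n_{\mathrm{tot}} - k)\,\ln\lvert S_p\rvert \;-\; \sum_{i=1}^{k} (n_i - 1)\,\ln\lvert S_i\rvert,
\qquad
S_p \;=\; \frac{1}{\,n_{\mathrm{tot}} - k\,}\sum_{i=1}^{k} (n_i - 1)\,S_i,

where S_i is the sample covariance of the n_i pixel vectors in block i, n_tot is the total number of pixel vectors, and k is the number of blocks compared (for example, a test block against the randomly sampled background blocks). Larger M indicates stronger evidence that the block covariances differ, which is the cue used here to separate 3-dimensional manmade objects from clutter.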
Calibration Methodology and Performance Characterization of a
Polarimetric Hyperspectral Imager
Joel Holder [email protected]
Polarimetric hyperspectral imaging (P-HSI) has the potential to improve target detection, material
identification, and background characterization over conventional hyperspectral imaging and
polarimetric imaging. To fully exploit the spectro-polarimetric signatures captured by such an
instrument, a careful calibration process is required to remove the spectrally- and
polarimetrically-dependent system response (gain). Calibration of instruments operating in the
long-wave infrared (LWIR, 8 μm to 12 μm) is further complicated by the polarized spectral
radiation generated within the instrument (offset). This paper presents a calibration methodology
developed for a LWIR Telops Hyper-Cam modified for polarimetry by replacing the entrance
window with a rotatable holographic wire-grid polarizer (4000 line/mm, ZnSe substrate, 350:1
extinction ratio). A standard Fourier-transform spectrometer (FTS) spectro-radiometric
calibration is modified to include a Mueller-matrix approach to account for polarized
transmission through and polarized self-emission from each optical interface. It is demonstrated
that under the ideal polarizer assumption, two distinct blackbody measurements at polarizer
angles of 0°, 45°, 90°, and 135° are sufficient to calibrate the system for apparent degree-of-linear-polarization (DoLP) measurements. Noise-equivalent s1, s2, and DoLP are quantified
using a wide-area blackbody. A polarization-state generator is used to determine the Mueller
deviation matrix. Finally, a realistic scene involving buildings, cars, sky radiance, and natural
vegetation is presented.
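As background for the quantities above, under the ideal-polarizer assumption the linear Stokes parameters and the apparent DoLP follow from the calibrated apparent radiances L(\theta) at the four polarizer angles in the standard way; these are the textbook relations, while the paper's full calibration additionally removes the polarized gain and self-emission terms through the Mueller-matrix model described above:

s_0 = L(0^\circ) + L(90^\circ) = L(45^\circ) + L(135^\circ), \qquad
s_1 = L(0^\circ) - L(90^\circ), \qquad
s_2 = L(45^\circ) - L(135^\circ),

\mathrm{DoLP} = \frac{\sqrt{s_1^2 + s_2^2}}{s_0}.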
Visible and SWIR Hyperspectral Polarimetric Imaging of Desert Soils
John S. Furey, Neelam Gupta [email protected]
Desert soils exhibit a complex mix and fascinating interplay of mineral content, grain sizes and
orientations, and other complicating features that make them of interest in military and civilian
applications of imaging technologies. We discuss the implementation of acousto-optic tunable
filter (AOTF) technology in the design of two novel hyperspectral polarimetric imagers. The
construction of the separate imagers in the Visible and shortwave infrared (SWIR) wavelength
bands is described, with particular attention to details enabling field deployment in the Summer
of 2014, and the engineering challenges and obstacles that were overcome. The narrative of the
rigors of the deployment in the hot American Southwest desert illustrates many of the technology
issues involved in getting these imaging technologies used. The Big Data aspects of the proper
analysis and handling of hyperspectral polarimetric images are outlined.
Exploring Polarimetric Hyperspectral Imaging as a Tool for Improved
Material Identification
Kevin Gross and Jake Martin. AFIT/ENP [email protected]
A new research effort is underway to investigate the degree to which polarimetric hyperspectral
imaging (P-HSI) improves material identification over conventional hyperspectral imaging. To
that end, the entrance window of a Telops LWIR (8-12 µm) hyperspectral camera was modified
to incorporate a holographic wire-grid polarizer (4000 lines/mm, ZnSe substrate, 350:1 extinction
ratio). To assess instrument performance and data reduction methods, preliminary measurements
of an uncoated glass (BK7) cylindrical lens and an uncoated quartz window were made. In the
LWIR, polarimetric measurements require careful two-point radiometric calibration to remove
the effects of polarized system response (gain) and polarized instrument self-emission (offset).
This was accomplished using on-board wide-area blackbodies which precede and overfill the
polarizing element. Treating the polarizer as ideal, degree-of-linear-polarization (DOLP) spectra
are formed from the appropriate apparent spectral radiances measured at polarizer angles of 0°,
45°, 90°, and 135°. Both unpolarized (S0) and DOLP spectra are compared to theoretical
predictions based on known surface-normal angles and spectrally-resolved complex indices of
refraction. Implications for material identification are discussed. The possibility of surface normal
estimation is also discussed.
Geodesic Paths For Time Dependent Covariance Matrices In A
Riemannian Manifold
Avishai Ben-David, Research, Development and Engineering Command (RDECOM), Edgewood Chemical
Biological Center, Aberdeen Proving Ground, MD 21010 [email protected];
Justin Marks, Bowdoin College, Brunswick ME 04011 [email protected]
Time dependent covariance matrices are important in remote sensing and hyperspectral detection
theory. The difficulty is that C(t) is usually available only at two endpoints C(t0)=A and C(t1)=B
where C(t0<t<t1) is needed. We present the Riemannian manifold of positive definite symmetric
matrices as a framework for predicting a geodesic time dependent covariance matrix. The
geodesic path A→B is the shortest and most efficient path (minimum energy). Although there is
no guarantee that data will necessarily follow a geodesic path, the predicted geodesic C(t) is of
value as a concept. The path for the inverse covariance is also geodesic and is easily computed.
We present an interpretation of C(t) with coloring and whitening operators to be a sum of scaled,
stretched, contracted, and rotated ellipses. We show that the volume of the geodesic covariance is
smaller than that of linearly interpolated (between A and B) covariance matrix, and thus using
time dependent geodesic covariance in detection algorithms will increase the separation between
the H0 (target absent) and H1 (target present) detection scores, hence, detection performance will
improve (false alarm and detection probabilities depend on the detection algorithm and location
of targets).
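The geodesic in question has a standard closed form under the affine-invariant metric, C(t) = A^(1/2) (A^(-1/2) B A^(-1/2))^t A^(1/2); a minimal numerical sketch (not the authors' code) is:

```python
import numpy as np

def _spd_power(M, p):
    """M**p for a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**p) @ V.T

def spd_geodesic(A, B, t):
    """Point C(t) on the geodesic between SPD matrices A and B: C(0)=A, C(1)=B."""
    A_half = _spd_power(A, 0.5)
    A_ihalf = _spd_power(A, -0.5)
    M = A_ihalf @ B @ A_ihalf        # B in A-whitened coordinates
    return A_half @ _spd_power(M, t) @ A_half
```

One consequence consistent with the volume argument above: det C(t) = det(A)^(1-t) det(B)^t, which by concavity of the log-determinant never exceeds the determinant of the linear interpolation (1-t)A + tB.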
Polarized Microbolometers for Target Detection in Video
David B. Chenault, John S. Harchanko, J. Larry Pezzaniti, Justin Vaden, Brian Hyatt.
Polaris Sensor Technologies 200 Westside Square, Suite 320, Huntsville, AL 35801
{david.chenault,John.Harchanko, Larry.Pezzaniti}@PolarisSensor.com
Infrared polarization depends on surface temperature, roughness, material properties, aspect angle to the sensor, and sky down-welling and background radiance reflecting from the object. Oftentimes, the polarization signature of a manmade object is different from that of the surrounding background.
Furthermore, that difference is often present even when the thermal signature of the same object
blends into the background. A novel sensing approach to detecting these signatures takes the
form of an infrared polarization imager or imaging polarimeter. In this paper, we describe several
approaches to making IR polarization measurements and specifically we describe the Polaris IR
Polarimetric Camcorder, a handheld infrared imaging polarimeter that produces live polarimetric video for several polarization products. The system is immune to motion artifacts of
either the sensor or the scene. The system is battery operated, rugged, and weighs about one
pound and can be helmet mounted or handheld. The operator views the polarization products in
real time on either a helmet mounted display or small external display integrated with a digital
video recorder. Polarization-sensitive microbolometers are used to produce, at a minimum, S0 and
S1 polarization products. A top level description of the camcorder is given followed by
performance characteristics and representative data including scenarios in which polarization
contrast remains high during periods of zero thermal contrast. Also, data will be presented that
shows the detection of objects in natural clutter, long after thermal equilibrium of the target with
the background has been established.
Automated Geo-referenced 3D Reconstruction from Multiview Satellite
Images
L. Lee* K. Wang+ J. Frahm+ *MIT Lincoln Laboratory +University of North Carolina
Chapel Hill [email protected] [email protected] [email protected]
There is a growing need for high resolution three-dimensional representations of earth’s terrain
and man-made features in applications such as urban planning, flood risk management, coastal
erosion, and disaster relief. Traditionally, such high resolution 3D models can only be achieved
through the use of LiDAR or IfSAR technologies. With the availability of high resolution
imaging satellites capable of collecting multi-look data, digital surface models (DSMs) can be acquired by applying 3D reconstruction techniques adapted from the computer vision field. Existing satellite sensors are also able to collect images of much larger areas in a shorter amount of time than LiDAR sensors and
hence provide the opportunity for country-scale coverage. Satellite images have sensor model
representations (rational functional model, or RFM) that are significantly different from the
sensor representations used in traditional 3D reconstruction techniques, which presents a major
challenge to the adaptation of existing algorithms. While DSMs have been produced from satellite images before, the process involves manual initiation of feature registration, and the results are most often not geo-referenced or are geo-referenced manually. We propose an entirely automated DSM generation
system that addresses the unique sensor model representation of satellite images by using the
RFMs without approximating them using traditional linear projection matrices that are familiar in
3D reconstruction in computer vision. In addition, we provide a solution to the lack of automated
geo-referencing capability by automatically registering the satellite images to a common reference map data set. The end result is a high-resolution, geo-referenced 3D surface model produced without any human intervention.
This work is sponsored by the Department of Air Force under Air Force Contract #FA8721-05-C-0002. Opinions,
interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the
United States Government.
Integrating Spectral and Textural Features Using Wavelet Analysis for
Hyperspectral Image Classification
Brajesh Kumar, Onkar Dikshit. Geoinformatics Group, Department of Civil Engineering,
Indian Institute of Technology Kanpur, India {brajeshk, onkar}@iitk.ac.in
This paper presents a supervised classification framework that integrates wavelet transform based
spectral and textural features for hyperspectral image classification. The experiments are
performed on DAIS 7915 data (512 x 512 pixels with 5 m pixel size) acquired over an area
known as ‘La Mancha Alta’ to the south of Madrid, Spain which is divided into eight land cover
classes. It was acquired over 79 bands ranging from 0.4 to 12.5 μm. Only 65 bands were retained
after preprocessing removing the noisy bands. Investigations involved application of 1-D discrete
wavelet transform (DWT) along the wavelength dimension of the hyperspectral data for
dimensionality reduction followed by 2-D DWT for texture feature extraction. The combined
spectral textural feature set is used for classification. The pixel wise classification is performed
using multi-class one-vs-one support vector machine (SVM) classifier. SVM is trained with
Gaussian radial basis function (RBF) kernel. The parameters C and γ for RBF kernel SVM are
determined optimally using 5-fold cross validation. Detailed accuracy analysis using measures such as overall accuracy and the kappa and tau statistics reveals that integrating spectral and spatial information significantly improves the classification accuracy.
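A rough sketch of this kind of pipeline, using pywt for the wavelet transforms and scikit-learn for the RBF-kernel SVM with 5-fold cross-validation (the parameter grid, wavelet choice, and function names are illustrative, not the paper's settings; scikit-learn's SVC performs one-vs-one multi-class classification internally):

```python
import numpy as np
import pywt
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def spectral_features(cube, wavelet="db4", level=2):
    """1-D DWT along the wavelength axis; keep the approximation coefficients
    as a reduced spectral feature vector per pixel. cube: (rows, cols, bands)."""
    return pywt.wavedec(cube, wavelet, level=level, axis=2)[0]

def texture_features(band_image, wavelet="db4"):
    """2-D DWT of one band; detail sub-bands carry local texture energy
    (at half resolution, so they would be upsampled before stacking)."""
    cA, (cH, cV, cD) = pywt.dwt2(band_image, wavelet)
    return np.stack([np.abs(cH), np.abs(cV), np.abs(cD)], axis=-1)

def train_classifier(X_train, y_train):
    """Pixel-wise RBF-SVM; C and gamma chosen by 5-fold cross-validation."""
    grid = {"C": [1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
    return GridSearchCV(SVC(kernel="rbf"), grid, cv=5).fit(X_train, y_train)
```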
Extension of No-Reference Deblurring Methods Through Image Fusion
Michael H. Ferris, Dr. Erik Blasch, Dr. Soundararajan Ezekiel, and Michael McLaughlin,
University of Binghamton, AFRL/RI, Indiana University of PA. [email protected],
[email protected], [email protected], [email protected]
An important and pressing issue in image quality enhancement is extracting an optimal amount of
information from a blurred image without a reference image for comparison. Most studies have
approached this issue by using iterative algorithms in an attempt to deconvolve the blurred image
into the ideal image. This process is very difficult due to the need to estimate a point spread function for the blur after each iteration, which can be computationally expensive over many iterations. In fact, this process often causes some amount of distortion or "ringing" in the deblurred image. However, image fusion may provide a solution. By deblurring the image without a reference, then fusing the result with the original blurred image, we were able to extract additional information from
the fused image. As stated above, the deblurring process causes some degree of information loss;
the act of fixing one section of the image causes distortion in another section of the image. Hence
by fusing the blurred and deblurred images together we can retain salient information from the
blurred image and gain important information from the deblurred image. We have found that this
process significantly reduces the "ringing" in the deblurred image. The fusion process is then
validated by three different evaluation metrics: Mutual Information (MI), Mean Square Error (MSE), and Peak Signal to Noise Ratio (PSNR). This paper details an extension of the no-reference image deblurring process, and the initial results indicate that image fusion has the potential to be an incredibly useful tool in the image deblurring field.
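The three validation metrics are standard; a compact reference implementation (illustrative only) is:

```python
import numpy as np

def mse(a, b):
    return np.mean((a.astype(float) - b.astype(float)) ** 2)

def psnr(a, b, max_val=255.0):
    m = mse(a, b)
    return np.inf if m == 0 else 10.0 * np.log10(max_val ** 2 / m)

def mutual_information(a, b, bins=64):
    """MI estimated from the joint grey-level histogram of two images."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```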
Learning Tree-structured Approximations for Conditional Random
Fields
Skurikhin, Alexei, MS D440, Space Data Systems Group, Los Alamos National Laboratory, Los
Alamos, NM, 87545, USA. [email protected]
Exact probabilistic inference is computationally intractable in general probabilistic graph-based
models, such as Markov Random Fields and Conditional Random Fields (CRFs). We investigate
spanning tree approximations for the discriminative CRF model. We decompose the original
computationally intractable grid-structured CRF model containing many cycles into a set of
tractable sub-models using a set of spanning trees. The structure of spanning trees is
generated uniformly at random among all spanning trees of the original graph. These trees
are learned independently to address the classification problem and Maximum Posterior Marginal
estimation is performed on each individual tree. Classification labels are produced via voting
strategy over the marginals obtained on the sampled spanning trees. The learning is
computationally efficient because the inference on trees is exact and efficient. Our objective
is to investigate the capability of approximation of the original loopy graph model with
loopy belief propagation inference via learning a pool of randomly sampled acyclic graphs.
We focus on the impact of memorizing the structure of sampled trees. We compare two
approaches to create an ensemble of spanning trees, whose parameters are optimized during
learning: (1) memorizing the structure of the sampled spanning trees used during learning and, (2)
not storing the structure of the sampled spanning trees after learning and regenerating trees anew.
Experiments are done on two image datasets consisting of synthetic and real-world images. These
datasets were designed for the tasks of binary image denoising and man-made structure
recognition.
Secret Communication in Colored Images Using Saliency Map as Model
Manish Mahajan and Navdeep Kaur [email protected] and [email protected].
Steganography is a process that involves hiding a message in an appropriate carrier, for example an image or an audio file. Many algorithms have been proposed for this purpose in the spatial and frequency domains. In almost all of these algorithms, however, embedding the secret data disturbs certain characteristics or statistics of the image. To deal with this problem, another paradigm known as adaptive steganography exists, which is based upon some mathematical model. The human visual system does not process the complete area of an image but rather focuses on a limited area of the visual field, and determining where visual attention is focused is an active research topic. Research on this psychological phenomenon indicates that attention is attracted to features that differ from their surroundings or that are unusual or unfamiliar to the human visual system. Object- or region-based image processing can be performed more efficiently with information about locations that are visually salient to human perception, with the aid of a saliency map. A saliency map may therefore act as a model for adaptive steganography in images. Keeping this in view, a novel steganography technique based upon a saliency map has been proposed in this work.
Automatic detection of ice layers in synthetic aperture radar
images
Maryam Rahnemoonfar, School of Engineering and Computing Sciences, Texas A&M
University-Corpus Christi, USA [email protected]
Global warming has caused serious damage to our environment in recent years. Accelerated loss
of ice from Greenland and Antarctica has been observed in recent decades. The melting of polar
ice sheets and mountain glaciers has a considerable influence on sea level rise and altering ocean
currents, potentially leading to the flooding of the coastal regions and putting millions of people
around the world at risk. Synthetic aperture radar (SAR) systems are able to provide relevant
information about subsurface structure of polar ice sheets. Manual layer identification is
prohibitively tedious and expensive and is not practical for regular, long-term ice-sheet
monitoring. Automatic layer finding in noisy echogram images is quite challenging due to the large
amount of noise, limited resolution and variations in ice layers and bedrock. This study presents
an efficient automatic algorithm to detect several layers of ice sheets using mathematical
morphology operations. Our approach involves the identification and selection of internal layers.
Experimental results of testing on publicly available SAR data of Greenland and Antarctica show
promising capabilities for automatically detecting ice layers.
Robust Vehicle Edge Detection by Cross Filter Method
Katy Po Ki Tang, Henry Y.T. Ngan, Senior Member, IEEE, Department of Mathematics, Hong
Kong Baptist University, Kowloon Tong, Hong Kong
[email protected]
In visual surveillance, vehicle tracking and identification are widely applied in areas such as traffic incident detection, traffic control, and management. Edge detection is key to the success of vehicle tracking and identification: it identifies edge locations or geometrical shape changes, in terms of pixel value, along the boundary between two regions in
an image. This paper aims to investigate different edge detection methods and introduce a Cross
Filter (CF) method, with a two-phase filtering approach, for vehicle images in a given database.
First, four classical edge detectors namely the Canny detector, Prewitt detector, Roberts detector
and Sobel detector are tested on the vehicle images. The Canny detected image is found to offer
the best performance in Phase 1. In Phase 2, the robust CF, based on a spatial relationship of
intensity change on edges, is applied on the Canny detected image as a second filtering process.
Visual and numerical comparisons among the classical edge detectors and CF detector are also
given. The average DSR of the proposed CF method on 10 vehicle images is 95.57%.
Student author: Katy Po Ki Tang.
Time-adapted CLAHE for HDR Tonemapping of Full Motion Video
Matthew Baran, Penn State University, [email protected]
High bit-depth data is becoming ubiquitous in the world of imaging and remote sensing. Single
frame images are often stored and processed at higher precision than can be visualized on
standard display technology. This problem is addressed with High Dynamic Range (HDR)
tonemapping, which nonlinearly maps brightness levels from a high bit-depth image into a low
bit-depth format. High bit-depth video is becoming increasingly available, and the latest video
encoding standards are being developed with high bit-depth support. We have developed an
approach to HDR tonemapping on high bit-depth video to map HDR data into older formats and
standard displays. We have updated the well-known Contrast Limited Adaptive Histogram
Equalization (CLAHE) algorithm to perform HDR video tonemapping with a time-adaptive
histogram transformation. In addition to brightness contrast, we use the L*a*b* colorspace to
amplify color contrast in the video stream. The transformed HDR video data maintains important
details in local contrast while maintaining relative brightness levels globally. Our results show
that time-adapted HDR tonemapping methods can be used in real-time video processing to store
and display HDR data in low bit-depth formats with less loss of detail compared to simple
truncation.
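The full method (tiled, contrast-limited, with L*a*b* color contrast) is more involved than can be shown here; the following simplified sketch only illustrates the core idea of a histogram-based tonemap whose histogram is adapted over time with an exponential moving average. It is global rather than tiled, works on a single 16-bit luminance channel, and the smoothing factor is an arbitrary assumption:

```python
import numpy as np

def time_adaptive_tonemap(frames, bits_in=16, alpha=0.1):
    """Tonemap a stream of uint16 luminance frames to uint8 using a running
    (time-adapted) histogram; yields one 8-bit frame per input frame."""
    levels = 2 ** bits_in
    running_hist = None
    for frame in frames:
        hist = np.bincount(frame.ravel(), minlength=levels).astype(np.float64)
        running_hist = hist if running_hist is None else \
            (1 - alpha) * running_hist + alpha * hist      # temporal adaptation
        cdf = np.cumsum(running_hist)
        cdf /= cdf[-1]
        lut = np.round(255 * cdf).astype(np.uint8)         # HDR level -> 8-bit level
        yield lut[frame]
```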
Timing Mark Detection on Nuclear Detonation Video
Lt Col Dan Schmitt, Air Force Institute of Technology, Dayton, OH. [email protected]
During the 1950s and 1960s the United States conducted and filmed over 200 atmospheric
nuclear tests establishing the foundations of atmospheric nuclear detonation behavior. Each
explosion was documented with about 20 videos from three or four points of view. Synthesizing
the videos into a 3D video will improve yield estimates and reduce error factors. The videos were
captured at a nominal 2500 frames per second, but range from 2300-3100 frames per second
during operation. In order to combine them into one 3D video, individual video frames need to
be correlated in time with each other. When the videos were captured a timing system was used
that shined light in a video every 5 milliseconds creating a small circle exposed in the frame.
This paper investigates several methods of extracting the timing from images, in cases when the
timing marks are occluded and washed out, as well as when the films are exposed as expected.
Results show an improvement over past techniques. For normal videos, occluded videos, and
washed out videos, timing is detected with 99.3%, 77.3%, and 88.6% probability with a 2.6%,
11.3%, 5.9% false alarm rate, respectively.
Enhanced View Invariant Gait Recognition Using Feature Level Fusion
Himanshu Chaubey*, Madasu Hanmandlu* and Shantaram Vasikarla# *Bharti School of
Telecommunication Technology & Management, *Department of Electrical Engineering, IIT
Delhi, New Delhi, India #Dept. of Computer Science, California State University, Northridge, CA
91330 [email protected] [email protected],[email protected]
In this paper, following the model-free approach for gait image representation, an individual
recognition system is developed using the Gait Energy Image (GEI) templates. The GEI
templates can easily be obtained from an image sequence of a walking person. Low dimensional
feature vectors are extracted from the GEI templates using Principal Component Analysis (PCA)
and Multiple Discriminant Analysis (MDA), followed by the nearest neighbor classification for
recognition. Genuine and imposter scores are computed to draw the Receiver Operating
Characteristics (ROC). In practical scenarios, the viewing angles of gallery data and probe data
may not be the same. To tackle such difficulties, View Transformation Model (VTM) is
developed using Singular Value Decomposition (SVD). The gallery data at a different viewing
angle are transformed to the viewing angle of probe data using the View Transformation Model.
This paper attempts to enhance the overall recognition rate by an efficient method of fusion of the
features which are transformed from other viewing angles to that of probe data. Experimental
results show that fusion of view transformed features enhances the overall performance of the
recognition system.
Student author: Himanshu Chaubey
A Container-based Elastic Cloud Architecture for Real-Time Full-Motion Video (FMV) Target Tracking
Ryan Wu, Yu Chen, Erik Blasch, Bingwei Liu, Genshe Chen, Dan Shen, Dept. of Electrical
and Computing Engineering, Binghamton University, Binghamton, NY 13902, USA; Air Force
Research Laboratory, Rome, NY 13440, USA; and, Intelligent Fusion Technology, Inc.
Germantown, MD 20876, USA frwu10, ychen, [email protected],
[email protected], fgchen, [email protected]
Full-motion video (FMV) target tracking requires the objects of interest be detected in a
continuous video stream. Maintaining a stable track can be challenging as target attributes change
over time, frame-rates can vary, and image alignment errors may drift. As such, optimizing FMV
target tracking performance to address dynamic scenarios is critical. Many target tracking
algorithms do not take advantage of parallelism due to dependencies on previous estimates which
results in idle computational resources when waiting for such dependencies to resolve. To address
this problem, a container-based virtualization technology is adopted to make more efficient use of
computing resources for achieving an elastic information fusion cloud. In this paper, we leverage
the benefits provided by container-based virtualization to optimize an FMV target tracking
application. Using OpenVZ as the virtualization platform, we parallelize video processing by
distributing incoming frames across multiple Virtual Environments (VEs). A concurrent VE
reassembles processed frames into video output. We implement a system that dynamically
allocates VE computing resources to match frame production and consumption between VEs. The
experimental results verify the viability of container-based virtualization for improving FMV
target tracking performance and provide a solution for mission-critical information fusion tasks.
Student Author: Ryan Wu
Analyzing the Effects of On-board Sensor Imprecision in WAMI
Bundle Adjustment
Hadi Aliakbarpour1, V. B. Surya Prasath1, Raphael Viguier1, Rengarajan Pelapur1,
Mahdieh Poostchi1, Guna Seetharaman2, Kannappan Palaniappan1.
Department of Computer Science, University of Missouri, Columbia, MO 65211
Information Directorate, Air Force Research Laboratory, Rome, NY 13441
[email protected]
Camera pose estimation has been explored for the past few decades but still remains an active
topic with the prevalence of new sensors and platforms. Among the existing pose estimation
methods, Bundle Adjustment (BA) based approaches are robust providing reasonable results even
when only partial information is available. BA refers to simultaneously refining the pose of N cameras and the 3D structure of scene points subject to a set of projective constraints such that an
appropriate error measure is minimized. Normally, in BA after extracting salient features and
establishing correspondences, an estimate of the camera rotation and translation, together known
as the camera pose, is obtained using either fundamental or homography matrix estimation. These
initial estimates are then used in a triangulation step where corresponding features are
geometrically fused to obtain an initial estimate for the 3D point cloud reconstruction of the scene
structure. The crucial part of BA is the optimization and refinement steps, given the initial
estimates. Unlike general BA utilized in other computer vision tasks where there is often no
sensor metadata, in BA for Wide Area Motion Imagery (WAMI) noisy camera pose
measurements from on-board sensors are available. This enables us to develop efficient
streamlined BA algorithms exploiting sensor and platform geometries and flightpaths. We show
that the fundamental matrix or homography transformation estimation step can be bypassed, but
errors in the metadata due to noisy sensor measurements and adverse operating environments,
must be taken into account. In this paper, we analyze the effects of measurement noise in position
(from GPS) and rotation (from IMU) sensors on BA results, in terms of accuracy and robustness
of the recovered camera parameters using a simulation testbed. We also investigate how matching
errors in a sequence of corresponding features used to perform 3D triangulation can affect overall
precision. The impact on the robustness of camera pose estimation for N-view BA in the context
of large scale WAMI-based 3D reconstruction is discussed.
Student Authors: Raphael Viguier, Rengarajan Pelapur, Mahdieh Poostchi
Advanced Video Activity Analytics: Integrating and Maturing
Computer Vision and Pattern Recognition Algorithms.
Bruce Swett, Chief Scientist and Vice President, EOIR Technologies, Inc. [email protected]
While significant strides continue to be made in computer vision and pattern recognition, the
software solutions are often fragile, incompatible with other algorithms, and unable to operate
either in real-time or on massive amounts of data (scale). The AVAA project has developed a
framework for integrating, standardizing and testing computer vision and pattern recognition
algorithms called VPEF (Video Processing and Exploitation Framework). VPEF allows
algorithms from commercial, Government, open-source and academic organizations to operate
together in a plug-in, pipeline architecture. AVAA parallelizes VPEF instances, allowing the
algorithms to operate on a high volume of video data without re-engineering the algorithm
software. The Hadoop-based AVAA cloud architecture also provides distributed ingestion,
indexing, analysis, data storage, data search / retrieval, and data visualization capabilities. By
providing an end-to-end cloud architecture, new computer vision and pattern recognition
algorithms can easily be added, parallelized, and improved by combining them with the suite of
existing capabilities. The AVAA project provides both a transition path for new algorithms for
use by the Department of Defense, as well as a model for public-private partnership in developing
and fielding new technologies.
Depth Data Assisted Structure-from-Motion Parameter Optimization
and Feature Track Correction
Shawn Recker(1,2), Christiaan Gribble(2), Mikhail M. Shashkov(1), Mario Yepez(1), Mauricio
Hess-Flores (1), and Kenneth I. Joy(1) 1 Institute of Data Analysis and Visualization, Univ. of
California Davis, Davis, CA 2 Applied Technology Operations, SURVICE Engineering, Belcamp,
MD 21234
Structure-from-Motion (SfM) applications attempt to reconstruct the three-dimensional (3D)
geometry of an underlying scene, from a collection of images, taken from various camera
viewpoints. Traditional optimization techniques in SfM, which compute and refine the camera
poses and 3D structure, rely only on feature tracks, or sets of corresponding pixels, generated
from color (RGB) images. With the advent of reliable depth sensor information, these
optimization procedures can be augmented to increase the accuracy of the reconstruction. This
paper presents a general cost function, which evaluates the quality of a reconstruction based upon
a previously established angular cost function and depth data estimates. The cost function takes
into account two error measures: first, the angular error between each computed 3D scene point
and its corresponding feature track location, and second, the difference between the sensor depth
value and its computed estimate. A bundle adjustment parameter optimization is implemented
using the proposed cost function and evaluated for accuracy and performance. As opposed to
traditional bundle adjustment, in the event of feature tracking errors, a corrective routine is also
present to detect and correct inaccurate feature tracks. The algorithm involves clustering depth
estimates of the same scene point and observing the difference between the depth point estimates
and the triangulated 3D point. Results on both real and synthetic data are presented and show that
reconstruction accuracy is improved.
Student Authors: Shawn Recker, Mikhail M. Shashkov, and Mario Yepez
Evaluating the HSI/Lidar Direct Method for physics-based scene
modeling
Ryan N. Givens, Karl C. Walli, Michael T. Eismann. Air Force Institute of Technology,
[email protected]
Recent work has been able to automate the process of generating three-dimensional, spectrally
attributed scenes for use in physics-based modeling software using the Hyperspectral/Lidar Direct
method. The Hyperspectral/Lidar Direct method autonomously generates three-dimensional
Digital Imaging and Remote Sensing Image Generation (DIRSIG) scenes from input high-resolution imagery, lidar data, and hyperspectral imagery and has been shown to do this
successfully using both modeled and real datasets. While the output scenes look realistic and
appear to match the input scenes under qualitative comparisons, a more quantitative approach is
needed to evaluate the full utility of these autonomously generated scenes. This paper seeks to
improve the evaluation of the spatial and spectral accuracy of autonomously generated three-dimensional scenes using the DIRSIG model. Two scenes are presented for this evaluation. The
first is generated from a modeled dataset created using the DIRSIG model and the second is
generated using data collected over a real-world site. Synthetic imagery over the recreated scenes
are then compared to the original input imagery to evaluate how well the recreated scenes match
the original scenes in spatial and spectral accuracy and to determine the ability of the recreated
scenes to produce useful outputs for algorithm development.
Particle Filter-based Vehicle Tracking Using Fused Spatial Features
and a Non-Linear Motion Model
Raphael Viguier1, Guna Seetharaman2, Kannappan Palaniappan1.
Department of Computer Science, University of Missouri, Columbia, MO 65211
Information Directorate, Air Force Research Laboratory, Rome, NY 13441
[email protected]
Tracking in full motion and wide area motion imagery poses many challenges for feature-based
techniques since appearance changes can easily distract the tracker away from the true target.
Motion prediction is often used to improve the robustness of target tracking by constraining
appearance matching to be within an expected region of interest based on the target motion
behavior. We improve upon the typical motion models by incorporating a more realistic non-isotropic prediction error noise model and an orientation- and magnitude-based
representation of vehicle dynamics. A particle filter-based approach is used to handle both model
complexity and fusion with object spatial appearance feature likelihood maps. Experiments using
several video datasets show a significant increase in the average track length, especially during
turns. The particle filter combined with a non-linear motion model outperforms our previous
Kalman filter-based prediction using a linear motion model in the context of the Likelihood of
Features Tracking (LoFT) system.
Student Author: Raphael Viguier
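A minimal particle-filter sketch with an orientation/speed state and a likelihood-map update, in the spirit of the description above (this is not the LoFT implementation; noise levels and the resampling scheme are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(particles, dt=1.0, sigma_v=1.0, sigma_theta=0.05):
    """Propagate particles with a heading/speed motion model.
    particles: (N, 4) array of [x, y, heading, speed]."""
    x, y, theta, v = particles.T
    theta = theta + rng.normal(0, sigma_theta, theta.size)  # small heading jitter
    v = np.abs(v + rng.normal(0, sigma_v, v.size))          # speed jitter, kept >= 0
    return np.column_stack([x + v * np.cos(theta) * dt,
                            y + v * np.sin(theta) * dt, theta, v])

def update(particles, likelihood_map):
    """Weight particles by an appearance-likelihood image, then resample."""
    xi = np.clip(particles[:, 0].astype(int), 0, likelihood_map.shape[1] - 1)
    yi = np.clip(particles[:, 1].astype(int), 0, likelihood_map.shape[0] - 1)
    w = likelihood_map[yi, xi] + 1e-12
    w /= w.sum()
    estimate = np.average(particles[:, :2], axis=0, weights=w)
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], estimate
```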
Multi-Target Tracking in Video with Adaptive Integration of
Appearance and Motion Models
Arslan Basharat1, Ilker Ersoy2, Kannappan Palaniappan2, and Anthony Hoogs1
1Kitware, Inc., 28 Corporate Drive, Clifton Park, NY 12065
2Dept of Computer Science, University of Missouri, Columbia, MO 65211
In recent years various appearance-based single target trackers have been proposed with high
accuracy in FMV and WAMI. CSURF and LOFT trackers are two such examples that are able to
continue tracking targets under difficult conditions but require manual initialization and
additional computational cost. Tracking at urban scale is challenging when the goal is to
automatically track hundreds to thousands of targets in real-time in WAMI or dozens of high-resolution targets in FMV. Here we propose a hybrid tracking architecture that utilizes motion detections to robustly initialize multiple tracks, uses a blended approach to integrate appearance-based trackers, provides a generalized API for interfacing such trackers, and adaptively uses
motion detection or appearance match to update a track. High quality motion detections are
evaluated for track updates prior to appearance-based updates due to lower computational
complexity. On the other hand, appearance-based tracker updates are preferred under difficult
conditions like temporary stopping, low contrast, partial occlusion, complex backgrounds and
clutter. Independent of the approach used to update the track, the system allows for the
appearance-based trackers to update their model after each track update. Moreover, this
architecture also includes time-reverse backward tracking over a limited period of time to exploit
asymmetric temporal information for increased target coverage and tracking success. We have
successfully interfaced CSURF and LOFT appearance-based trackers into the proposed
architecture. This was achieved by implementing the interface API from the Matlab library
implementation of these trackers into the overall C++ system. We present quantitative evaluation
of the proposed system with four different approaches for appearance modeling: CSURF and
LOFT are the two recently demonstrated trackers and for baseline comparison we use template
matching with sum-of-squared differences (SSD) and normalized cross-correlation. The results
show that CSURF appearance-based tracking produces the best track quality when integrated in
the proposed motion-based framework.
Road Sign Detection on a Smartphone for Traffic Safety
Carrie Pritt, IEEE Student Member [email protected]
According to the World Health Organization, 1.24 million deaths are attributed to traffic
accidents each year. One approach to reducing traffic fatalities is the use of expensive Advanced
Driver Assistance Systems, which are still under development. The goal of this work is the
development of a low-cost driver assistance system that runs on an ordinary smartphone. It uses
computer vision techniques and multiple-resolution template matching to detect speed limit signs
and alert the driver if the speed limit is exceeded. It inputs an image of the sign to be detected
and creates a set of multiple-resolution templates. It also inputs photographs of the road from the
smartphone camera at regular intervals and generates multiple-resolution images from the
photographs. In the first stage of processing, fast filters restrict the focus of attention to smaller
areas of the photographs where signs are likely to be present. In the second stage, the system
matches the templates against the photographs using fast normalized cross correlation (NCC) to
detect speed limit signs. The multiple resolutions enable the NCC approach to detect signs at
different scales. In the third stage, the system recognizes the sign by matching a series of
annotated speed templates to the image at the position and scale that were determined by the
detection stage. It compares the speed limit with the actual vehicle speed as computed from the
smartphone GPS sensor and issues warnings to the driver as necessary.
Student Author: Carrie Pritt
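A hedged sketch of the multiple-resolution NCC matching stage using OpenCV (the fast pre-filtering stage and the annotated speed templates are omitted; scales and threshold are assumptions):

```python
import cv2

def detect_sign(image_gray, template_gray,
                scales=(0.5, 0.75, 1.0, 1.5, 2.0), threshold=0.8):
    """Match a sign template at several resolutions with normalized
    cross-correlation; return (score, top-left corner, scale) of the best hit."""
    best = None
    for s in scales:
        templ = cv2.resize(template_gray, None, fx=s, fy=s,
                           interpolation=cv2.INTER_AREA)
        if templ.shape[0] > image_gray.shape[0] or templ.shape[1] > image_gray.shape[1]:
            continue
        res = cv2.matchTemplate(image_gray, templ, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(res)
        if max_val >= threshold and (best is None or max_val > best[0]):
            best = (max_val, max_loc, s)
    return best
```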
Adaptive Automatic Target Recognition in Single and Multi-Modal
Sensor Data
Timothy S Khuon1, Robert S. Rand1 and Eric Truslow2. 1National Geospatial Intelligence, 2Northeastern University (Truslow). [email protected], [email protected]
For single-modal data, target recognition and classification in a 3D point cloud is a non-trivial process due to the nature of the data collected from a sensor system, where the signal can be corrupted by noise from the environment, the electronic system, the A/D converter, etc. Therefore, an adaptive system with a specific desired tolerance is required to perform classification and recognition optimally. The feature-based pattern recognition algorithm described below is generalized for solving a particular global problem with minimal change, since for a given class set a feature set must be extracted accordingly. For instance, man-made urban target classification, rural and natural object classification, and human organ classification would each require distinct feature sets. This study compares adaptive automatic target recognition for a single sensor with distributed adaptive pattern recognition in multi-sensor fusion. The
similarity in automatic target recognition between sensor fusion and single-sensor is the ability to
learn from experiences and decide on a given pattern. Their main difference is that the sensor
fusion makes a decision from the decisions of all sensors whereas the single sensor requires a
feature extraction for a decision.
Change Detection and Classification of Land Cover in Multispectral
Satellite Imagery using Clustering of Sparse Approximations (CoSA)
over Learned Feature Dictionaries
Daniela I. Moody, Steven P. Brumby Los Alamos National Laboratory, MS D436, PO Box
1663, Los Alamos, NM 87545 [email protected]
Neuromimetic machine vision and pattern recognition algorithms are of great interest for
landscape characterization and change detection in satellite imagery in support of global climate
change science and modeling. We present results from an ongoing effort to extend machine
vision methods to the environmental sciences, using adaptive sparse signal processing combined
with machine learning. A Hebbian learning rule is used to build multispectral, multiresolution
dictionaries from regional satellite normalized band difference indexes data. Land cover labels
are automatically generated via our CoSA algorithm: Clustering of Sparse Approximations, using
a clustering distance metric that combines spectral and spatial textural characteristics to help
separate geologic, vegetative, and hydrologic features. Land cover labels are estimated in
example Worldview-2 satellite images of a particular region taken at different times, and are used
to detect seasonal and yearly surface changes. Our results suggest that neuroscience-based models
are a promising approach to practical pattern recognition problems in remote sensing.
Small Pixel Focal Plane Array Technology
John Caulfield(a), Jerry Wilson(a), Nibir Dhar(b). (a) Cyan Systems, 5385 Hollister Ave, Suite 105, Santa Barbara, CA, 93111, email: [email protected]; (b) ARMY, NVESD
Improved imaging systems using smaller, sub-diffraction-sized pixels have shown good imaging results. There are known limits in undersampled and critically sampled sensors regarding resolution and aliasing. Oversampling the image using sub-diffraction-size pixels offers much more than improved resolution, smaller FPAs, optics, and dewar systems. Oversampled pixels enable processing techniques for smaller pixels that provide a number of related system benefits such as improved Instantaneous Field of View (IFOV), Noise Equivalent Power (NEP),
False Alarm Rate, and detection range, as well as other system level benefits. We will show data
from the first 2.4 Megapixel 5 micron pitch ROIC and demonstrate that spatial oversampling can
improve aliasing, sensitivity, and drive reductions in False Alarms through oversampled
correlated processing. Oversampled pixels allow larger format FPAs and smaller optics, resulting
in reductions in size, power, and weight. Oversampled IR sensors will also improve detection
and acuity in turbulent and hazy conditions over larger pixel IR focal plane array sensors. We
will review why smaller pixels have lower SNR, and how temporal and spatial oversampling can compensate for and effectively recover the SNR lost with smaller pixels. We will quantify the limits of performance of oversampling based on theory, and also with Monte
Carlo type analysis using realistic parameters such as shot noise and thermal noise. We will show
quantitative data to illustrate the improvements in resolution, NEP, detection range, and false
alarm suppression of the oversampled IR sensor as the temporal and spatial oversampling are
increased.
Entropy Metric Regularization for Computational Imaging with Sensor
Arrays
Prudhvi Gurram(1) and Raguveer Rao(2). (1) MBO Partners Inc, and (2) Army Research Lab, Adelphi, MD. [email protected] and [email protected]
Correlative interferometric image reconstruction is a computational imaging method for
synthesizing images from sensor arrays and relies on estimating source intensity by using the
cross-correlation across near-field or far-field measurements from multiple sensors of the arrays.
Key to using the approach is the exploitation of a relationship between the correlation across the
sensor measurements and the source intensity. This relationship is of a Fourier transform type
when the sensors are in the far-field of the source and the velocity of wave propagation in the
intervening medium is constant. Often the estimation problem is ill-posed resulting in unrealistic
reconstructions of images. Positivity constraints, boundary restrictions, l1 regularization, and
sparsity constrained optimization have been applied on the recovered source intensity in previous
work. In recent work, the sensor measurements were assumed to be noise-less for entropy metric
optimization, which is untenable in reality. This paper considers the noisy case and formulates the
estimation problem as least squares minimization with entropy metrics, either minimum or
maximum, as regularization terms. Situations involving far-field interferometric imaging of
extended sources will be considered and results illustrating the advantages of these entropy
metrics and their applicability will be provided.
A Comparative Study Of Methods To Solve The Watchman Route
Problem In A Photon Mapping-Illuminated 3D Virtual Environment
Bruce A. Johnson(a), Hairong Qi(b), and Jason C. Isaacs(a). (a) Naval Surface Warfare Center,
Panama City Division, 110 Vernon Ave., Panama City, FL 32407 (b) Min Kao Electrical
Engineering and Computer Science Building 1520 Middle Dr., Knoxville, TN 37996 USA
Understanding where to place static sensors such that the amount of information gained is
maximized while the number of sensors used to obtain that information is minimized is an
instance of solving the NP-hard art gallery problem (AGP). A closely-related problem is the
watchman route problem (WRP) which seeks to plan an optimal route by an unmanned vehicle
(UV) or multiple UVs such that the amount of information gained is maximized while the
distance traveled to gain that information is minimized. In order to solve the WRP, we present the
Photon-mapping-Informed active-Contour Route Designator (PICRD) algorithm. PICRD
heuristically solves the WRP by selecting AGP-solving vertices and connecting them with
vertices provided by a 3D mesh generated by a photon-mapping informed segmentation algorithm
using some shortest-route path-finding algorithm. Since we are using photon-mapping as our
foundation for determining UV-sensor coverage by the PICRD algorithm, we can then take into
account the behavior of photons as they propagate through the various environmental conditions
that might be encountered by a single or multiple UVs. Furthermore, since we are being agnostic
with regard to the segmentation algorithm used to create our WRP-solving mesh, we can adjust
the segmentation algorithm used in order to accommodate different environmental and
computational circumstances. In this paper, we demonstrate how to adapt our methods to solve
the WRP for single and multiple UVs using PICRD using two different segmentation algorithms
under varying virtual environmental conditions.
Human Re-Identification in Multi-Camera Systems
Kevin Krucki, University of Dayton; Dr. Vijay Asari, University of Dayton; Dr. Chris Borel-Donohue, Air Force Institute of Technology; Dr. David Bunker, Air Force Institute of
Technology [email protected]; [email protected];
[email protected]; and,
[email protected]
We propose a human re-identification algorithm for a multi-camera surveillance environment
where a unique signature of an individual is learned and tracked in a scene. The feed from each
camera is processed using a motion detector to get locations of all individuals. To compute the
signature, we propose a combination of different descriptors on the detected body such as the
Local Binary Pattern histogram (LBPH) for the local texture and a HSV color-space based
descriptor for the color representation. For each camera, a signature computed by these
descriptors is assigned to the corresponding individual along with their direction in the scene.
Knowledge of the person’s direction allows us to make separate identifiers for the front, back,
and sides. These signatures are then used to identify individuals as they walk across different
areas monitored by different cameras. The challenges involved are the variation of illumination
conditions and scale across the cameras. We test our algorithm on a dataset captured with 3 Axis
cameras arranged in the UD Vision Lab as well as a subset of the SAIVT dataset and provide
results which illustrate the consistency of the labels as well as precision/accuracy scores.
Student Author: Kevin Krucki
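As an illustrative sketch of this kind of signature (an LBP texture histogram concatenated with an HSV color histogram; the bin counts and the histogram-intersection score are assumptions rather than the authors' exact design):

```python
import cv2
import numpy as np
from skimage.feature import local_binary_pattern

def person_signature(bgr_patch, lbp_points=8, lbp_radius=1, hsv_bins=(8, 8, 4)):
    """L1-normalised appearance signature for one detected person patch."""
    gray = cv2.cvtColor(bgr_patch, cv2.COLOR_BGR2GRAY)
    lbp = local_binary_pattern(gray, lbp_points, lbp_radius, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=lbp_points + 2, range=(0, lbp_points + 2))
    hsv = cv2.cvtColor(bgr_patch, cv2.COLOR_BGR2HSV)
    hsv_hist = cv2.calcHist([hsv], [0, 1, 2], None, list(hsv_bins),
                            [0, 180, 0, 256, 0, 256]).ravel()
    sig = np.concatenate([lbp_hist, hsv_hist]).astype(np.float64)
    return sig / (sig.sum() + 1e-12)

def match_score(sig_a, sig_b):
    """Histogram-intersection similarity between two signatures."""
    return float(np.minimum(sig_a, sig_b).sum())
```

In the system described above, one such signature would be kept per walking direction (front, back, sides) and matched across cameras.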
Gender And Age Recognition for Video Analytics Solution
Vladimir Khryashchev, Andrey Priorov and Alexander Ganin, 150000, Sovetskaya 14-309,
P.G. Demidov Yaroslavl State University, Yaroslavl, Russia e-mail: [email protected]
An application for video data analysis based on computer vision and machine learning methods
is presented. Novel gender and age classifiers based on adaptive features, local binary patterns
and support vector machines are proposed. Gender recognition, for example, can be used to
collect and estimate demographic indicators. Besides, it can be an important preprocessing step
when solving the problem of person identification, as gender recognition allows the number of candidates for analysis to be halved (in the case of an equal number of men and women in a database), and thus the identification process to be accelerated roughly twofold. More than 94% accuracy of
viewer's gender recognition is achieved. Human age estimation is another problem in the field of
computer vision which is connected with face area analysis. Among its possible applications one
should note electronic customer relationship management, security control and surveillance
monitoring, and biometrics. All the stages are united into a real-time audience analysis system. The system extracts all available information about people from the input video stream and aggregates and analyzes this information in order to measure different statistical parameters. These features allow the proposed system to be applied in various settings: places of mass gathering (stadiums, theaters and shopping centers), transport hubs (airports, railway and bus stations), digital signage network optimization, etc.
Medical Image Segmentation using Multiscale and Super-Resolution
methods
En-Ui Lin, Soundararajan Ezekiel, Waleed Farag, Indiana University of Pennsylvania.
[email protected], [email protected], [email protected]
In many medical imaging applications, a clear delineation and segmentation of areas of interest
from low resolution images is crucial. It is one of the most difficult and challenging tasks in
image processing and directly determines the quality of final result of the image analysis. In
preparation for segmentation, we first use preprocessing methods to remove noise and blur and
then we use super-resolution to produce a high resolution image. Next, we will use wavelets to
decompose the image into different sub-band images. In particular, we will use the discrete wavelet transform (DWT) and its enhanced version, the double-density dual-tree discrete wavelet transform (D3-DWT), as they provide better spatial and spectral localization of image
representation and have special importance to image processing applications, especially medical
imaging. The multi-scale edge information from the sub-bands is then filtered through an iterative
process to produce a map displaying extracted features and edges, which is then used to segment
homogenous regions. We have applied our algorithm to challenging applications such as gray
matter and white matter segmentations in Magnetic Resonance Imaging (MRI) images. Finally,
we apply performance metrics which demonstrate the strength of our proposed method in the
problem of medical image segmentation.
Student Author: En-Ui Lin.
Volumetric Features for Object Region Classification in 3D LiDAR
Point Clouds
Nina Varney and Vijayan Asari; University of Dayton [email protected]
LiDAR data is a set of geo-spatially located points which contain (X, Y, Z) location and intensity
data. This paper presents the extraction of a novel set of volume and texture based features from
segmented point-clouds. The data is first segmented into individual object regions using an
automatic seeded region growing technique. These object regions are normalized to an N x N x N
voxel space, where each voxel contains information about the location and density of points
within that voxel. A set of volumetric features including 3D form factor, fill, stretch, rotation
invariant local binary pattern (RILBP), corrugation, contour, plainness and relative variance are
extracted to represent the object region. The form factor, fill and stretch provide a series of
meaningful relationships between the volume, surface area and shape of the object. RILBP
provides a textural descriptor from the intensity of the LiDAR data. The corrugation, contour and
plainness are extracted by 3D Eigen analysis of the object volume to describe the details of the
object’s surface. Relative variance provides a representation of the distribution of points
throughout the object. The new feature set is found to be robust, and scale and rotation invariant
for object region classification. The performance of the proposed feature extraction technique has
been evaluated on a set of segmented and voxelized point cloud objects in the aerial LiDAR data
from Surrey, British Columbia, that was available through the Open Data Program. The
volumetric features, when used as an input to a two class SVM classifier, correctly classified the
object regions with an accuracy of 97.5%, with a focus on separating man made versus vegetation
objects. Future research will aim to extend these features as inputs to a multi-class classifier, to
identify man-made objects such as fences, vehicles, buildings, etc.
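A minimal voxelization sketch along these lines (grid size, the density normalisation, and the particular form-factor definition are illustrative assumptions, not the paper's definitions):

```python
import numpy as np
from scipy.ndimage import binary_erosion

def voxelize(points, n=32):
    """Normalise one object region's (M, 3) points into an n x n x n density grid."""
    mins = points.min(axis=0)
    span = np.maximum(points.max(axis=0) - mins, 1e-9)
    idx = np.minimum(((points - mins) / span * n).astype(int), n - 1)
    grid = np.zeros((n, n, n))
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)   # point count per voxel
    return grid

def simple_volume_features(grid):
    """A few illustrative shape descriptors of the voxelized object."""
    occupied = grid > 0
    volume = occupied.sum()                                  # occupied voxels
    fill = volume / grid.size                                # fraction of box filled
    surface = (occupied & ~binary_erosion(occupied)).sum()   # boundary voxels
    form_factor = surface ** 1.5 / (volume + 1e-9)           # dimensionless ratio
    return {"fill": fill, "surface": surface, "form_factor": form_factor}
```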
Analysis Of Diurnal, Long-Wave Hyperspectral Measurements Of
Natural Background And Manmade Targets Under Different Weather
Conditions
Christoph Borel, Research Associate Professor, Department of Engineering Physics, Air Force
Institute of Technology, WPAFB, OH 45433, [email protected] ; Dalton Rosario, U.S. Army
Research Laboratory, Adelphi, MD 20783, [email protected]; Joao Romano, U.S.
Army Armament RDEC, Picatinny Arsenal, NJ 07806, USA, [email protected]
In this paper we describe results of the analysis of diurnal Fourier Transform spectrometry data
taken at Picatinny Arsenal in New Jersey with the long-wave hyper-spectral camera from Telops
under different weather conditions. In the near future all or part of the data will be made available
to the public. The first part of the paper discusses the processing from raw data to calibrated
radiance and emissivity data. Data was taken over several months, under different weather conditions, every 6 minutes from a 213-ft-high tower viewing surrogate tank targets, for a project sponsored by the Army Research Laboratory in Adelphi, MD. An automatic calibration and
analysis program was developed which creates calibrated data files and HTML files. The first
processing stage is a flat-fielding operation where for the minimum and maximum extent of the
interferogram is used to estimate the gain or vignetting and the mean value or baseline of the
interferogram is the offset. During this step the mean base line is used to find dead pixels
(baseline low or at the maximum). Noisy pixels are detected by computing the standard deviation over the part of the interferogram that lies outside of the center-burst region; if a pre-defined threshold is exceeded, the pixel is flagged as noisy. A bad pixel map for dead and noisy pixels is created, and for each scan line the bad pixels' interferograms are replaced. Then a
flat-fielded and bad pixel corrected calibration cube using the gain and offset determined by a
single blackbody measurement is created. In the second stage each flat-fielded cube is Fourier
transformed and a real-valued un-calibrated radiance cube is created for a pre-defined
wavenumber range. Next the radiometric calibration is performed using a 2-point calibration
computed from the two blackbody measurements and then applied to each data cube. Two-point
calibrated radiance cubes are then created in ENVI format and the HTML file contains
quicklooks of spectra of selected pixels, the original and flat-fielded cubes as animated GIF
images and links to all the intermediate files that are created. For selected cubes a temperature-emissivity separation algorithm is applied in which the cloudiness and cloud temperature are varied. The resulting retrieved cloudiness fractions will be compared with measured cloud cover fractions
for opaque and thin clouds. The second part discusses environmental effects such as diurnal and
seasonal atmospheric and temperature changes and the effect of cloud cover on the data. The
effect of environmental conditions on the temperature-emissivity separation will be discussed.
Background Subtraction Algorithm for Moving Object Detection and
Evaluation of Performance Using Different Dataset
Kakali Das1, Mrinal Kanti Bhowmik1 Department of Computer Science and Engineering,
Tripura University, Suryamaninagar 799022, Agartala, India
Moving object detection using video streams has an important role in different computer vision
applications such as video surveillance, navigation systems, recognition, classification & activity
analysis. In this paper, a modified object detection technique is proposed, which is an improved version of an existing object detection technique called ViBe (Visual Background Extractor), obtained by adding a post-processing step. The post-processing step applies a simple median filter to obtain a more accurate background subtraction result. This paper also presents the newly created
SAMEER-TU (Society for Applied Microwave Electronics Engineering & Research- Tripura
University) dataset containing Visual videos for moving object detection. This technique is tested
on three video sequences, and ground truths for each frame for each video sequence are created.
Some of the input frames of these videos and their corresponding resultant ground truths are
reported in this paper. Comparative study is also carried out between the existing different
benchmark datasets and SAMEER-TU (Society for Applied Microwave Electronics Engineering
& Research- Tripura University) dataset in terms of accuracy. Some experimental results are
reported over some typical video sequences. The video sequences include some simple videos and some videos with dynamic backgrounds, which are a critical problem for moving object detection. A dynamic background is one in which the background objects are moving, such as swaying trees, flowing water, or a fountain. This paper shows that the use of the median filter in the post-processing step of the moving object detection algorithm increases the accuracy by approximately 13%. The experimental results validate the improvement over the existing ViBe (Visual Background Extractor) algorithm.
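A minimal sketch of the detection-plus-median-filter idea (stock OpenCV has no ViBe implementation, so its MOG2 background subtractor stands in here purely for illustration; the kernel size is arbitrary):

```python
import cv2

# MOG2 stands in for ViBe only to show the post-processing step.
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

def detect_moving_objects(frame, ksize=5):
    raw_mask = subtractor.apply(frame)       # per-pixel foreground/background label
    return cv2.medianBlur(raw_mask, ksize)   # median filter suppresses speckle noise
```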
A 3D Pointcloud Registration Algorithm Based on Fast Coherent Point
Drift
Min Lu, Jian Zhao, Yulan Guo, Jianping Ou, Janathan Li
[email protected], [email protected], [email protected],
[email protected], [email protected]
Pointcloud registration has a number of applications in various research areas. Computational
complexity and accuracy are two major concerns for a pointcloud registration algorithm. This
paper proposes a novel Fast Coherent Point Drift (F-CPD) algorithm for 3D pointcloud
registration. The original CPD method is very time-consuming. The situation becomes even
worse when the number of points is large. In order to overcome the limitations of the original
CPD algorithm, a global convergent squared iterative expectation maximization (gSQUAREM)
scheme is proposed. The gSQUAREM scheme uses an iterative strategy to estimate the
transformations and correspondences between two pointclouds. Experimental results on a
synthetic dataset show that the proposed algorithm outperforms the original CPD algorithm and
the Iterative Closest Point (ICP) algorithm in terms of both registration accuracy and convergence
rate.
Student Author: Yulan Guo
Democratizing the Visualization of 500 Million Webcam Images
Joseph D. O'Sullivan, Abby Stylianou, Austin Abrams & Robert Pless, Department of
Computer Science & Engineering, Washington University in St. Louis, St. Louis, Missouri
[email protected], [email protected], [email protected], [email protected]
Five years ago we reported at AIPR on a nascent project to archive images from every webcam in
the world and to develop algorithms to geo-locate, calibrate, and annotate this data. This archive
of many outdoor scenes (AMOS) has now grown to include 28000 live outdoor cameras and over
550 million images. This archive is actively being used in projects ranging from large-scale environmental monitoring to characterizing how built-environment changes (such as adding bike lanes in DC) affect physical activity patterns over time. But the biggest value of a very long-term, widely distributed image dataset is the rich set of “before” data that can be used to characterize changes in natural experiments. To exploit this we have built, and share, a collection of web tools supporting large-scale, data-driven exploration that allow anyone to compare imagery and find unusual events. One visualization tool is “A Tale of Two Years”, an image browser that visualizes each image in comparison to an earlier image. When the earlier image is exactly a year earlier, this tool highlights changes in biological response to climate; when used on urban images, it highlights changes in built-environment structures. A second visualization tool uses PCA to find the subspace that characterizes the variations in a scene and highlights imagery that does not fit that subspace. This anomaly detection captures both imaging failures such as lens flare and unusual situations such as street fairs. These tools, while not technically complicated, are the parts of AMOS most widely used by a non-technical audience, and we share case studies where these tools highlight interesting scene features and events.
Student Author: Joseph D. O'Sullivan
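A minimal sketch of the PCA-subspace anomaly scoring idea described above, assuming flattened grayscale frames from a single camera. The number of components is an illustrative choice, not the value used in the AMOS tools.

```python
import numpy as np

def pca_anomaly_scores(images, n_components=10):
    """Score webcam frames by how poorly a low-rank PCA subspace explains them.

    `images` is an (N, H*W) array of flattened grayscale frames from one
    camera; frames with large reconstruction residuals (lens flare, street
    fairs, sensor faults) stand out as anomalies.
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # Thin SVD gives the principal subspace of the scene's typical variation.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:n_components]
    reconstruction = centered @ basis.T @ basis
    residual = centered - reconstruction
    return np.linalg.norm(residual, axis=1)

# Frames with the largest scores are candidate anomalies:
# scores = pca_anomaly_scores(frames); anomalies = np.argsort(scores)[-20:]
```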
Automated 3D Geo Registration Methods
Andrew Brown, Mike Moore, Tim Fair, and John Berger, Toyon Research Corporation.
[email protected], [email protected], [email protected], and [email protected]
Toyon Research Corporation has developed a robust library of image processing algorithms for
automated 3D reconstruction and has recently developed a software application called 3D Geo
Registration (3DGR) for use in automating track alignment procedures and improving on
registration accuracy for various Wide Area Motion Imagery (WAMI) systems. Toyon’s
algorithms include advanced sensor models for focal plane and scanning imagers, in addition to
data driven algorithms for automatically registering and aligning images to produce highly
accurate models that represent the world. Toyon’s 3D model outputs enable alignment with
known 2D and 3D reference sources for use in airborne and satellite based surveillance
applications. Toyon has developed these algorithms in conjunction with various Government
sponsors and through various Small Business Innovation Research (SBIR) efforts with the
Department of Defense and the Department of Homeland Security.
Human Activity Detection using Sparse Representation
Dipti Killedar, Sreela Sasi. Department of Computer and Information Science, Gannon
University, Erie, PA, USA [email protected]; [email protected]
Human activity detection from videos is very challenging. It has numerous applications in sports evaluation, video surveillance, elder/child care, and related areas. Much research has been done to develop techniques for human activity detection, such as Hidden Markov Models (HMM), Maximum Entropy Markov Models (MEMM), and sensor-based methods. In this research, a model using sparse representation is presented for human activity detection from video data. Sparse representation models the video data as a linear combination of a dictionary and a coefficient matrix. The dictionary is created using spatio-temporal features of the video data. These spatio-temporal features are extracted from the training videos using the Spatio-Temporal Interest Points (STIP) algorithm. The K-Singular Value Decomposition (K-SVD) algorithm is used to learn dictionaries from these spatio-temporal points for the training video dataset. This algorithm is a generalization of the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the data based on the current dictionary and a process of updating the dictionary. After the dictionary learning stage, L1-normalization is used to solve the linear equations and obtain a sparse solution. Finally, human action is classified using the minimum residual value against a threshold for the corresponding action class in the testing video dataset. Experiments are conducted using the KTH dataset, which contains a number of action videos recorded in a controlled environment. The current approach performed well, classifying activities with a success rate of 90%.
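The sketch below illustrates the minimum-residual classification step under per-class dictionaries. It substitutes scikit-learn's Orthogonal Matching Pursuit for the paper's sparse coder, and the dictionary shapes and sparsity level are assumptions.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def classify_by_residual(dictionaries, feature, n_nonzero=10):
    """Classify a test feature by minimum sparse-coding residual.

    `dictionaries` maps each action class to a (d, K) dictionary learned
    offline (the paper uses K-SVD over STIP features; any dictionary learner
    could be dropped in). The class whose dictionary best reconstructs the
    feature under a sparsity constraint wins.
    """
    best_class, best_residual = None, np.inf
    for label, D in dictionaries.items():
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero)
        omp.fit(D, feature)          # columns of D are dictionary atoms
        residual = np.linalg.norm(feature - omp.predict(D))
        if residual < best_residual:
            best_class, best_residual = label, residual
    return best_class, best_residual
```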
Modeling of Nuclear Detonation Optical Output in DIRSIG
1st Lt Ashley Green, Capt Robert Slaughter, Dr. John McClory,
Air Force Institute of Technology [email protected]
Previous research has demonstrated the capability to simulate the sensor response to a nuclear
fireball within the Digital Imaging and Remote Sensing Image Generation (DIRSIG) model,
using both an analytical single temperature model and historic data. Modern nuclear effects codes
have been developed that incorporate multidimensional interactions. This research combines the
output of a modern radiation and shock physics multi-dimensional nuclear effects code with
DIRSIG, a Monte Carlo multi-bounce photon tracking code. The nuclear effects code was used to
determine the temperature and density of the three dimensional nuclear fireball. The output of the
code was then used as the input to DIRSIG. DIRSIG was used to analyze how environmental
interactions change the optical signal received by a realistic sensor. Emphasis was placed on the
significant effects of photon interactions in an urban canyon scenario following a nuclear
detonation such as reflections off buildings or other surrounding objects. Sensor models were
developed for silicon bhangmeters, terrestrial security cameras, and standard vehicle dash
cameras to analyze the performance constraints of these sensors from an optical diagnostic
perspective.
Student Author: 1st Lt Ashley Green
Automatic Segmentation of Carcinoma in Radiographs
Fatema Albalooshi1, Sara Smith2, Paheding Sidike1, Yakov Diskin1 and Vijayan Asari1
1University of Dayton, 2University of Cincinnati College of Medicine. [email protected]
A strong emphasis has been made on making the healthcare system and the diagnostic procedure
more efficient. In this paper, we present an automatic detection technique designed to segment
out abnormalities in X-ray imagery. Utilizing the proposed algorithm allows radiologists and their assistants to sort and analyze large amounts of imagery more effectively. In radiology, X-ray beams are used to detect various densities within a tissue and to display accompanying anatomical and architectural distortion. Lesion localization within fibrous or dense tissue is complicated by a lack of clear visualization as compared to tissues with an increased fat distribution. As a result, carcinoma and its associated unique patterns can often be overlooked within dense tissue. We introduce a new segmentation technique that integrates prior knowledge, such as the intensity level, color distribution, texture, gradient, and shape of the region of interest taken from prior data, within a segmentation framework to enhance the performance of region and boundary extraction of defective tissue regions in medical imagery. Prior knowledge of the intensity of the region of interest can greatly help guide the segmentation process, especially when the carcinoma boundaries are not well defined and when the image contains non-homogeneous intensity variations. We evaluate our algorithm by comparing our detection results
to the results of the manually segmented regions of interest. Through metrics, we also illustrate
the effectiveness and accuracy of the algorithm in improving the diagnosis efficiency for medical
experts.
Novel Geometric Coordination Registration in Cone-beam Computed
Tomogram
Walter Lam, [email protected]
The use of cone-beam computed tomography (CBCT) in the medical field enables clinicians to visualise the hard tissue of the head and neck region in a cylindrical field of view (FOV). The images are usually presented as a reconstructed three-dimensional (3D) image and its orthogonal (x-, y- and z-plane) images. The spatial relationship of the structures in these orthogonal views is important for diagnosis of disease as well as planning of treatment. However, the non-standardized positioning of the object during CBCT data acquisition often induces errors in measurement, since orthogonal images cut at different planes may look similar. To solve this problem, this study proposes an effective mapping from the Cartesian coordinates of a physical cube to its respective coordinates in the 3D imaging. The object (real physical domain) and the imaging (computerized virtual domain) can thereby be linked and registered. In this way, the geometric coordination of the object/imaging is defined and its orthogonal images are fixed on defined planes. The images can then be measured with vector information, and serial images can also be directly compared.
Bayesian Solutions to Non-Bayesian Detection Problems: Unification
Through Fusion
Alan Schaum, Naval Research Laboratory, [email protected]
In 1950 Abraham Wald proved that every admissible statistical decision rule is either a Bayesian
procedure or the limit of a sequence of such procedures. He thus provided a decision-theoretic
justification for the use of Bayesian inference, even for non-Bayesian problems. It is often
assumed that his result also justified the use of Bayesian priors to solve such problems. However,
the principles one should use for defining the values of prior probabilities have been controversial
for decades, especially when applied to epistemic unknowns. A new approach indirectly assigns
values to the quantities usually interpreted as priors by imposing specific constraints on a
detection algorithm. No assumptions about prior “states of belief” are necessary. The result shows
how Wald’s theorem can accommodate both Bayesian and non-Bayesian problems. The
unification is mediated by the fusion of clairvoyant detectors.
Rapid Location of Radiation Sources in Complex Environments using
Optical and Radiation Sensors
Christoph Borel(a), David Bunker(a) and Graham W. Alford(b). (a)Center for Technical Intelligence Studies and Research, Department of Engineering Physics, Air Force Institute of Technology, WPAFB, OH 45433; (b)University of Tennessee Knoxville, Knoxville, TN. [email protected]; [email protected]; [email protected]
The baseline radiation background is almost never known and changes constantly, particularly in urban areas. It is difficult to know what the expected background radiation should be and how a
radiological incident may elevate the radiation. Naturally occurring radiation from rocks and
building materials often contributes significantly to measured radiation. Buildings and other tall
structures also shield radiation and thus need to be taken into account. Models of natural
occurring background radiation can be derived from knowledge of geology, building material
origins, vegetation, and weather conditions. After a radiological incident, the radiation will be
elevated near the event, and some material may be transported by mechanisms such as airborne
transport and/or run-off. Locating and characterizing the sources of radiation quickly and
efficiently are crucial in the immediate aftermath of a nuclear incident. The distribution of
radiation sources will change naturally and also due to clean-up efforts. Finding source strengths
and locations during both the initial and clean-up stages is necessary to manage and reduce
contaminations. The overall objective of the “Rapid Location Of Radiation Sources In Complex
Environments Using Optical And Radiation” research project is to design and validate gamma ray
spectrum estimation algorithms that integrate optical and radiation sensor collections into high
resolution, multi-modal site models for use in radiative transport codes. Our initial focus will be
on modeling the background radiation using hyper-spectral information from visible through the
shortwave infrared sensors and thermal imagers. The optical data will complement available
ancillary data from other sources such as Geographic Information Systems (GIS) layers, e.g.
geologic maps, terrain, surface cover type, road network, vegetation (e.g. serpentine vegetation),
3-D building models, known users of radiological sources, etc. In the absence of GIS layers, the data from the hyper-spectral imager would be analyzed with special software to automatically create GIS layers and, together with radiation survey data, provide a method to predict the background radiation distribution. We believe the estimation and prediction of the natural background will be helpful in finding anomalous point, line, and small-area sources and will minimize the number of false alarms due to natural and known man-made radiation sources such as radiological medical facilities and industrial users of radiological sources.
Sparse 3D Reconstructions of Atmospheric Nuclear Detonations
Robert Slaughter, Tyler Peery, John McClory and Karl Walli.
[email protected], [email protected], [email protected] and
[email protected] Air Force Institute of Technology
Researchers at Lawrence Livermore National Laboratory (LLNL) have started digitizing nearly
10,000 technical films spanning the above ground atmospheric nuclear testing operations
conducted by the United States between 1945 and the 1960s. Researchers at the Air Force
Institute of Technology (AFIT) have begun employing modern digital image processing and
computer vision techniques to exploit this data set and determine specific invariant features of the
early fireball dynamic growth. The focus of this presentation is to introduce the methodology
behind three dimensional reconstructions of time varying nuclear fireballs. Multi-view geometry
algorithms were used to perform sparse reconstructions of the nuclear events from the multiple
cameras observing the detonation at different look angles. From matched image pairs, sparse
reconstructions are determined through manual selection of key features. This presentation will
focus on the applied technique, pursued techniques, initial results, and difficulties of performing a
3D reconstruction of a time varying nuclear fireball from digitized films.
Student Authors: Robert Slaughter and Tyler Peery
Smoothed Chemical Vapor Detection
Stefan Wager and Guenther Walther, Stanford University. [email protected],
[email protected]
Manolakis and D’Amico [2005] describe methods for chemical vapor detection by hyperspectral
imaging. These methods work well with strong signals; however, with weaker signals, their
performance is reduced by the presence of background noise. In this paper, we show how to
improve their performance using spatial smoothing with a spline basis tailored to the gas
detection problem. We present results on the 2014 DTRA/NSF/NGA chemical detection
challenge.
Student Author: Stefan Wager
Machine Learning Nuclear Detonation Features
Lt Col Dan Schmitt, Air Force Institute of Technology, Dayton, OH. [email protected]
Nuclear explosion yield estimation equations based on a 3D model of the explosion volume will
have a lower uncertainty than the current radius based estimation. To accurately collect data for a
volume model of atmospheric explosions requires building a 3D representation from 2D images.
The majority of 3D reconstruction algorithms use the SIFT (scale-invariant feature transform)
feature detection algorithm which works best on feature-rich objects with continuous angular
collections. These assumptions are different from the archive of nuclear explosions that have
only 3 points of view. This paper reduces 300 dimensions derived from an image based on
Fourier analysis and five edge detection algorithms to a manageable number to detect sunspots
that may be used to correlate videos of different viewpoints for 3D reconstruction. Furthermore,
experiments test whether histogram equalization, Wiener filtering, and median filters improve
detection of these features using four kernel sizes passed over these features. Dimension
reduction using principal component analysis (PCA), forward subset selection, ReliefF, and
FCBF (Fast Correlation-Based Filter) are combined with a Mahalanobis distance classifier to find
the best combination of dimensions, kernel size, and filtering to detect the sunspots. Results
indicate that sunspots can be detected with hit rates of 90% and false alarms < 1%.
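A minimal sketch of the PCA-plus-Mahalanobis classification pipeline described above. The feature dimensionality, component count, and covariance regularization are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def fit_pca_mahalanobis(features, labels, n_components=20):
    """Reduce feature dimensionality with PCA, then fit per-class Gaussians
    for a Mahalanobis-distance classifier.

    `features` is an (N, 300) array of per-image descriptors (Fourier and
    edge based, as described above); all sizes here are illustrative.
    """
    mean = features.mean(axis=0)
    _, _, Vt = np.linalg.svd(features - mean, full_matrices=False)
    basis = Vt[:n_components]
    reduced = (features - mean) @ basis.T

    models = {}
    for label in np.unique(labels):
        cls = reduced[labels == label]
        cov = np.cov(cls, rowvar=False) + 1e-6 * np.eye(n_components)
        models[label] = (cls.mean(axis=0), np.linalg.inv(cov))
    return mean, basis, models

def predict(x, mean, basis, models):
    """Assign the class with the smallest Mahalanobis distance."""
    z = (x - mean) @ basis.T
    dists = {label: float((z - mu) @ inv_cov @ (z - mu))
             for label, (mu, inv_cov) in models.items()}
    return min(dists, key=dists.get)
```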
Foley: The Art of the Transparent Soundscape
Andy Malcolm, [email protected]
In this talk by Emmy Award-winning foley artist Andy Malcolm, you will be introduced to the unique art of foley. Foley is the art of performing sounds for film, video and other media to make
viewers believe that the sound effects are actually real. Except for situations where the foley
becomes animated, the best foley tracks are transparent. The viewers should not be able to realize
that the sound was not actually part of the filming process itself. Andy will explain and
demonstrate how and why this process is used in producing sound for movies. The art of foley
reinforces what the audience is experiencing and brings the action to life. The most effective
sound isn't always the actual one. You will see how Andy creates the illusion of the real sound
using surprisingly unconventional objects in unexpected ways. The term "Foley" is named after
Jack Foley, the first practitioner of the art. Jack started in the motion picture business in the silent
picture era and lived through the exciting times when the industry converted to sound. Sound and
image have remained the primary components of cinema for many decades. Each is as
sophisticated and as carefully constructed as the other. Foley artists represent an anachronism; the
survival of acoustic invention in an era of digitized technology.
Representing Pictures with Sound
Edward Schaefer, System Development Engineer, [email protected]
A coarse representation of pictures can be created with sound. A series of such sounds can be
used to represent an animation or a movie. In this project, pictures are divided into a 4x4 array of
"sound pixels". The position of each sound pixel is assigned a musical note, and the contents of
each sound pixel is used to create a volume. The resultant sound is the representation of the
picture. Algorithms for creating notes and volumes will be described. The behavior of the
program will be illustrated with sequences pictures with sounds. Generating sounds for movies
using this technique will be discussed.
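As a rough illustration of the 4x4 sound-pixel idea, the sketch below maps cell position to pitch and cell brightness to volume. The particular note scale and tone synthesis are assumptions, since the talk does not specify them.

```python
import numpy as np

# Sixteen chromatic steps starting at A3 (Hz), one per "sound pixel";
# the actual note assignment in the talk is not specified, so this is illustrative.
NOTE_FREQS = 220.0 * 2.0 ** (np.arange(16) / 12.0)

def picture_to_sound(image, duration=0.5, sample_rate=44100):
    """Render a grayscale image as a short chord of 16 sine tones.

    The image is divided into a 4x4 grid; each cell's position selects a note
    and each cell's mean brightness sets that note's volume.
    """
    h, w = image.shape
    t = np.linspace(0.0, duration, int(sample_rate * duration), endpoint=False)
    signal = np.zeros_like(t)
    for row in range(4):
        for col in range(4):
            cell = image[row * h // 4:(row + 1) * h // 4,
                         col * w // 4:(col + 1) * w // 4]
            volume = cell.mean() / 255.0          # brightness -> loudness
            freq = NOTE_FREQS[row * 4 + col]      # position   -> pitch
            signal += volume * np.sin(2.0 * np.pi * freq * t)
    return signal / 16.0  # normalize so 16 full-volume notes cannot clip
```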
Marginal Space Deep Learning for Efficient Anatomy Detection in Volumetric Image Data
Bogdan Georgescu, Yefeng Zheng, Hien Nguyen, Vivek Singh, David Liu, Dorin Comaniciu
Imaging and Computer Vision, Siemens Corporate Technology, Princeton, NJ
[email protected]; [email protected]; [email protected];
[email protected]; [email protected]; [email protected];
Fast and robust anatomical object detection is a fundamental task in medical image analysis,
supporting the entire clinical workflow from diagnosis, patient stratification, therapy planning,
intervention and follow-up. State-of-the-art methods for automatic image analysis exploit large
annotated image databases, being based on machine learning techniques applied to relevant image
features associated with the anatomy of interest. Nevertheless, when the object appearance is
complex or the dimensionality of the object parameter space is large there are still challenges in
providing effective anatomy detection solutions. With this work we introduce a class of marginal
space deep learning techniques that combine the strengths of deep learning artificial neural
networks with learning on projected parameter subspaces of increasing dimensionality. One of
the characteristics of the deep learning architecture is the ability to encode complex image
patterns in hierarchical features on each neural network layer. In addition, by using approximate
decomposable weights, we can preserve the classification performance while significantly improving the speed of applying such a classifier to high-dimensional data. Furthermore, the mechanism of marginal space learning allows us to learn classifiers in marginal spaces of gradually increasing dimensionality. For example, to detect a 3D object described by 9 pose parameters (three parameters each for position, orientation and scale), we learn marginal
classifiers in the position space, position-orientation space and position-orientation-scale space.
As a result, the overall learning process is efficient, focusing the search on high probability
regions of the parameter space, thus providing excellent run-time performance. We demonstrate
the proposed marginal space deep learning technique for landmark detection in volumetric
computed-tomography data and cardiac magnetic resonance images. The cross-validated
experiments show a significant error reduction and speed-up in comparison with previously
reported results.
Cloud Based High Performance Video Transcoding Platform
Amit Puntambekar, Mike Coward, Craig Lee and Garrett Choi, QuickFire Networks
Corporation. San Diego, CA. [email protected], [email protected],
[email protected], [email protected]
The rapid proliferation of video capture devices (e.g. smartphones) coupled with the desire to
view content on TV Everywhere devices (tablets, smartphones, etc.) has exposed a new
bottleneck, namely the need to quickly transcode (translate) that massive amount of video into
formats that TV Everywhere devices can understand. We showcase one of the fastest video transcoding services currently known in the state of the art. Available via the public and private
cloud, QuickFire.TV is based on highly parallel / distributed processing software, enabling video
processing at 10-100x real-time vs. existing solutions that transcode at 1-2x real-time.
KWiver: An Open-Source Cross-Platform Video Exploitation Framework
Keith Fieldhouse, Matthew J. Leotta, Arslan Basharat, Russell Blue, David Stoup, Charles
Atkins, Linus Sherrill, Benjamin Boeckel, Paul Tunison, Jacob Becker, Matthew Dawkins,
Matthew Woehlke, Roderic Collins, Matt Turek, Anthony Hoogs [All with Kitware, Inc.]
Corresponding Author: [email protected]
We introduce KWiver, a cross-platform video exploitation framework that Kitware has begun
releasing as open source. Kitware is utilizing a multi-tiered open-source approach to reach as
wide an audience as possible. Kitware's government-funded efforts to develop critical defense
technology will be released back to the defense community via Forge.mil, a government open
source repository. Infrastructure, algorithms, and systems without release restrictions will be
provided to the larger video analytics community via KWiver.org and github.
Our goal is to provide a video analytics technology baseline for repeatable and reproducible
experiments, and to provide a focal point for collaboration and contributions from groups across
the community. KWiver plans to provide several foundational capabilities. A multi-processing
framework allows algorithmic worker code to execute and communicate in a multiprocessing
environment. A companion data abstraction layer allows code to scale from small-scale desktop
environments based on file I/O to large multi-core systems communicating via databases.
Visualization tools provide cross-platform GUIs for viewing algorithmic results overlaid on
source video data. Finally, an integrated evaluation framework enables not only quantitative
evaluation via common detection and tracking metrics, but qualitative feedback by feeding
annotation and scoring states to the visualization tools. KWiver is the technology behind a full-frame, frame-rate WAMI tracker which has been deployed OCONUS and has been released as
government open source on Forge.mil. Upcoming releases will include FMV source data, ground
truth, baseline tracking capability, computed tracking results, and evaluation products.
An Automated Workflow for Observing Track Data in 3-Dimensional
Geo-Accurate Environments
Derek J. Walvoord and Bernard V. Brower, Exelis, 400 Initiative Drive, Rochester, New
York 14606. [email protected], and [email protected]
Recent developments in computing capabilities and persistent surveillance systems have enabled
advanced analytics and visualization of image data. Using our existing capabilities, this work
focuses on developing a unified approach to address the task of visualizing track data in 3-dimensional environments. Our current structure from motion (SfM) workflow is reviewed to
highlight our point cloud generation methodology, which offers the option to use available
sensor telemetry to improve performance. To this point, an algorithm outline for navigation
guided feature matching and geo-rectification in the absence of ground control points (GCPs) is
included in our discussion. We then provide a brief overview of our on-board processing
suite, which includes real-time mosaic generation, image stabilization, and feature tracking.
Exploitation of geometry refinements, inherent to the SfM workflow, is then discussed in the
context of projecting track data into the point cloud environment for advanced visualization.
Results using the new Exelis airborne collection system, Corvus Eye, are provided to discuss
conclusions and areas for future work.
Large Displacement Optical Flow Based Image Predictor Model
Nishchal K. Verma and Aakansha Mishra Department of Electrical Engineering, Indian
Institute of Technology Kanpur, Kanpur, India [email protected] and
[email protected]
This paper proposes a Large Displacement Optical Flow based Image Predictor Model for generating future image frames from past and present image frames. The predictor model is an Artificial Neural Network (ANN) and Radial Basis Function Neural Network (RBFNN) model whose inputs are the horizontal and vertical velocity components estimated using Large Displacement Optical Flow for every pixel in a given image sequence. There has been a significant amount of research in the past on generating future image frames from a given set of image frames. The quality of the generated images is evaluated by Canny's edge detection Index Metric (CIM) and the Mean Structural Similarity Index Metric (MSSIM). For our proposed algorithm, the CIM and MSSIM indices for all generated future images are found to be better than those of the most recent existing algorithms for future image frame generation. The objective of this study is to develop a generalized framework that can predict future image frames for any given image sequence with large displacements of objects. In this paper, we validate the developed Image Predictor Model on an image sequence of a landing jet fighter, and the obtained performance indices are better than those of the most recent existing image predictor models.
On Parzen Windows Classifiers
Jing Peng1 and Guna Seetharaman2 1 Computer Science Department, Montclair State
University, Montclair, NJ. 2 Information Directorate, Air Force Research Laboratory, Rome, NY
[email protected] and [email protected]
Parzen Windows classifiers have been applied to a variety of density estimation as well as
classification tasks with considerable success. Parzen Windows are known to converge in the
asymptotic limit. However, there is a lack of theoretical analysis on their performance with finite
samples. In this paper we show a connection between Parzen Windows and the regularized least
squares (RLS) algorithm, which has a well-established foundation in computational learning
theory. This connection allows us to provide interesting insight into Parzen Windows classifiers
and their performance in finite sample settings. Finally, we show empirical results on the
performance of Parzen Windows classifiers using a number of real data sets. These results corroborate our analysis well.
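For concreteness, a minimal Gaussian-kernel Parzen Windows classifier looks like the sketch below; the bandwidth is an illustrative assumption and would normally be tuned.

```python
import numpy as np

def parzen_classify(X_train, y_train, x, bandwidth=1.0):
    """Parzen Windows classification with a Gaussian kernel.

    Each class's density at `x` is estimated by averaging kernel values over
    that class's training points; the class with the largest prior-weighted
    estimate wins.
    """
    scores = {}
    for label in np.unique(y_train):
        pts = X_train[y_train == label]
        sq_dists = np.sum((pts - x) ** 2, axis=1)
        kernel_vals = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
        # Summing (rather than averaging) folds in the class prior n_c / n.
        scores[label] = kernel_vals.sum()
    return max(scores, key=scores.get)
```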
Modified Deconvolution using Wavelet Image Fusion
Michael McLaughlin1, Erik Blasch2, Soundararajan Ezekiel1, Mark Alford2, Maria
Cornacchia2, Adnan Bubalo2, Millicent Thomas3 1Indiana University of Pennsylvania, Indiana,
PA, 2Air Force Research Lab, Rome, NY, 3Northwest University, Kirkland, WA.
[email protected], [email protected], [email protected], [email protected],
[email protected], [email protected], [email protected]
Image quality can be affected by a number of factors. The two predominant ones are noise and
blur. Blur typically manifests itself as a smoothing of edges. It can be described as the
convolution of an image with an unknown blur kernel or function. The inverse to this process is
known as deconvolution, which is a very difficult process even in the absence of noise. Removing
blur from an image has multiple stages: first, we must identify or approximate the blur kernel, and
then perform a deconvolution of the estimated kernel and blurred image. This is often an iterative
process, with successive approximations of the kernel leading to optimal results. However, it is unlikely that a given image is blurred uniformly; in real-world situations, blur can be a product of object motion or camera motion/defocus, which leads to separate blur kernels for different regions. Deconvolving with a single kernel will sharpen blurred regions, but also degrade
the regions previously unaffected by blur. It should be noted that this process is complex and
computationally expensive. To remedy this, we propose a novel modified deconvolution
approach to removing blur from a no-reference image. First, we estimate the blur kernel, and then
we perform a deconvolution on the blurred image. Finally, wavelet techniques are implemented
to fuse the blurred and de-blurred images. In this way we recover the details in the blurred image
that are lost by deconvolution, but retain the sharpened features in the de-blurred image. We
evaluate the effectiveness of our proposed approach using several metrics and compare them to
standard approaches. Our results show that this approach has potential applications to many
fields including medical imaging, topography, and computer vision.
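A minimal sketch of the two-stage idea, deconvolution followed by wavelet-domain fusion of the blurred and de-blurred images. The Wiener deconvolution, wavelet choice, and max-magnitude fusion rule here are stand-ins, not the authors' exact pipeline.

```python
import numpy as np
import pywt

def wiener_deconvolve(blurred, kernel, balance=0.01):
    """Frequency-domain Wiener deconvolution with an estimated blur kernel.

    The kernel is assumed to be placed at the array origin (top-left); the
    `balance` term regularizes against noise amplification.
    """
    H = np.fft.fft2(kernel, s=blurred.shape)
    G = np.fft.fft2(blurred)
    F = np.conj(H) / (np.abs(H) ** 2 + balance) * G
    return np.real(np.fft.ifft2(F))

def fuse_wavelet(blurred, deblurred, wavelet="db2", level=3):
    """Fuse blurred and deblurred images by keeping, per wavelet coefficient,
    whichever has the larger magnitude."""
    ca = pywt.wavedec2(blurred, wavelet, level=level)
    cb = pywt.wavedec2(deblurred, wavelet, level=level)
    fused = [np.where(np.abs(ca[0]) > np.abs(cb[0]), ca[0], cb[0])]
    for (a_h, a_v, a_d), (b_h, b_v, b_d) in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(a) > np.abs(b), a, b)
                           for a, b in ((a_h, b_h), (a_v, b_v), (a_d, b_d))))
    return pywt.waverec2(fused, wavelet)
```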
Sparse Generalized Fourier Series via Collocation-Based Optimization
Ashley Prater, Air Force Research Laboratory, Information Directorate, Rome, NY
[email protected]
Generalized Fourier series with orthogonal polynomial bases have useful applications in several
fields, including differential equations, pattern recognition, and image and signal processing.
However, computing the generalized Fourier series can be a challenging problem, even for
relatively well behaved functions. In this paper, a method for approximating a sparse collection of
Fourier-like coefficients is presented that uses a collocation technique combined with an
optimization problem inspired by recent results in compressed sensing research. The discussion
includes approximation error rates and numerical examples to illustrate the effectiveness of the
method. One example displays the accuracy of the generalized Fourier series approximation for
several test functions, while the other is an application of the generalized Fourier series
approximation to rotation-invariant pattern recognition in images.
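As an illustration of the collocation-plus-L1-optimization idea, the sketch below recovers sparse Legendre coefficients with a plain ISTA loop. The basis, node choice, and solver are assumptions and are far simpler than the method in the paper.

```python
import numpy as np

def sparse_legendre_coefficients(f, degree=40, n_collocation=200,
                                 lam=1e-3, n_iter=2000):
    """Approximate sparse generalized Fourier (here: Legendre) coefficients.

    Samples f at collocation points on [-1, 1], builds the Legendre
    Vandermonde matrix A, and solves min_c 0.5*||A c - y||^2 + lam*||c||_1
    with plain iterative soft thresholding (ISTA).
    """
    x = np.cos(np.pi * (np.arange(n_collocation) + 0.5) / n_collocation)  # Chebyshev nodes
    y = f(x)
    A = np.polynomial.legendre.legvander(x, degree)      # (n_collocation, degree + 1)
    step = 1.0 / np.linalg.norm(A, 2) ** 2               # step from the Lipschitz constant
    c = np.zeros(degree + 1)
    for _ in range(n_iter):
        grad = A.T @ (A @ c - y)
        c = c - step * grad
        c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft threshold
    return c

# Example with a function that is exactly sparse in the Legendre basis:
# coeffs = sparse_legendre_coefficients(
#     lambda x: 0.5 * x + np.polynomial.legendre.legval(x, [0, 0, 0, 1.0]))
```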
Indoor Non-Linear Target Tracking Using Wi-Fi and Video
Tommy Chin, Rochester Institute of Technology, [email protected]
Target tracking through object recognition software and wireless signal measurements on Wi-Fi enabled devices has been used in the past decade to enhance the security and safety of an area of interest. Many organizations, such as a municipal airport or a grocery store, use an array of distributed cameras to monitor the wellbeing of their premises. Object recognition software has many pitfalls when a target of interest is obscured by intervening items, such as a shelf, that hide the direction of travel of the individual. Additionally, visual tracking is also lost when the target moves into a crowded region that creates confusion for the recognition system. A common solution for estimating the direction and track of the individual is to apply Kalman and Gaussian filters in an effort to continuously track the target in obscured environments. This approach also has a weakness: the prediction becomes invalid when the target's motion is non-linear. To address this problem, the main intent of this research is to apply information fusion to Received Signal Strength Indication (RSSI) measurements collected from Wi-Fi enabled mobile devices and object recognition data to track a target through the distributed camera system within an indoor environment. The primary filtering mechanisms are Kalman and Gaussian methods. As a result of the research, measurements are made to compare the new tracking solution to the actual path for correlation. This is a novel and intuitive approach to target tracking, as it can be utilized in
indoor environments.
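A minimal constant-velocity Kalman filter of the kind referred to above. Camera detections and RSSI-derived fixes are both treated as noisy position measurements, and all noise values are illustrative assumptions.

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal 2D constant-velocity Kalman filter for fusing position fixes.

    The state is (x, y, vx, vy). Camera detections and RSSI-derived position
    estimates can both be fed in as noisy (x, y) measurements with their own
    measurement variances.
    """

    def __init__(self, dt=1.0, process_var=0.5):
        self.x = np.zeros(4)
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.Q = np.eye(4) * process_var
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z, meas_var):
        """Fuse one (x, y) measurement, e.g. a camera detection or an RSSI fix."""
        R = np.eye(2) * meas_var
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```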
Motion Estimation of Flotation Froth Using Mutual Information and
Kalman Filter
Anthony Amankwah1 and Chris Aldrich2 1Computer Science Department, University of
Ghana, P O Box LG Accra, Ghana (E-mail: [email protected]). 2Department of Mining
Engineering and Metallurgical Engineering, Western Australian School of Mines, Curtin
University, GPO Box U1987, Perth, WA, 6824, Australia
The estimation of motion of froth using image processing techniques is difficult since bubbles
collapse and merge leading to bubble deformations. The most popular motion estimation
technique is block matching. In the block matching algorithm, the image frame is typically divided into non-overlapping rectangular blocks. The best match to the current block of pixels is searched for in the previous frame of the video sequence within a search area about the location of the current block. The best solution is the full search algorithm, which exhaustively searches for the best-matched block over all locations of the search window. Due to the high computational cost of full search, several fast search techniques have been introduced. The fast search techniques reduce the number of matching calculations per block by selecting a subgroup of possible candidate locations. The fast search algorithms include three-step search, four-step search, diamond search, and 2D logarithmic search. The mean square error (MSE) and mean absolute difference (MAD) are considered the best similarity metrics for motion estimation. In this work we use mutual information with a bin size of two as the similarity metric. The computational cost is similar to that of MSE and MAD. To further improve the accuracy of the estimated results of our
algorithm we use the Kalman filter. Experimental results show that the proposed motion
estimation technique improves the motion estimation accuracy in terms of peak signal-to-noise
ratio of the reconstructed frame.
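A minimal sketch of full-search block matching scored by mutual information with two bins, as described above; block and search-window sizes are illustrative.

```python
import numpy as np

def mutual_information(block_a, block_b, bins=2):
    """Mutual information between two image blocks from a small joint histogram.

    With bins=2 (as in the abstract) each block is effectively binarized, which
    keeps the cost comparable to MSE/MAD.
    """
    joint, _, _ = np.histogram2d(block_a.ravel(), block_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))

def best_match(prev_frame, block, top_left, search=8):
    """Exhaustive (full-search) block matching that maximizes mutual information."""
    r0, c0 = top_left
    h, w = block.shape
    best_score, best_offset = -np.inf, (0, 0)
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + h > prev_frame.shape[0] or c + w > prev_frame.shape[1]:
                continue
            score = mutual_information(block, prev_frame[r:r + h, c:c + w])
            if score > best_score:
                best_score, best_offset = score, (dr, dc)
    return best_offset, best_score
```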
Multi-resolution Deblurring
Michael McLaughlin1, Erik Blasch2, Soundararajan Ezekiel1, Mark Alford2, Maria
Cornacchia2, Adnan Bubalo2, Millicent Thomas3 1Indiana University of Pennsylvania, Indiana,
PA, 2Air Force Research Lab, Rome, NY, 3Northwest University, Kirkland, WA.
[email protected], [email protected], [email protected], [email protected],
[email protected], [email protected], [email protected]
As technology advances, blur in an image remains as an ever present issue in the image
processing field. A blurred image is mathematically expressed as a convolution of a blur function
with a sharp image plus noise. Removing blur from an image has been widely researched and is
still an active field of research today. Without a reference image, identifying, measuring, and
removing blur from a given image is very challenging. This process involves estimating the blur
kernel to match with various types of blur including camera motion/defocus or object motion.
Various blur kernels have been studied over many years, but the most common function is the
Gaussian. Once the blur kernel (function) is estimated, a deconvolution is performed with the
kernel and the blurred image. Many existing methods operate in this manner, however, these
methods remove blur from the blurred region, but damage the un-blurred regions of the image.
This is because the actual pixel intensity values of the image are used in the deblurring process and are easily distorted. The method proposed in this paper uses multi-resolution
analysis techniques to separate blur, edge, and noise coefficients. Deconvolution with the
estimated blur kernel is then performed on these coefficients instead of the actual pixel intensity
values before reconstructing the image. Additional steps will be taken to retain the quality of unblurred regions of the blurred image. The result will then be compared against standard
deblurring techniques using several metrics including mutual information and structural similarity
based metrics. Experimental results on simulated and real data show that our approach achieves
higher quality results than previous approaches on various blurry and noisy images. Further, our approach has military, medical, and topographic applications.
Towards Visual Analysis of Unconstrained Images in Social Forums:
Studies On Concept Detection and Personalized Economy Of Images In
Social Networks
Sangmin Oh, Eric Smith, Yiliang Xu, Anthony Hoogs. KitWare Inc, NY.
[email protected]
In this work, we present our recent work on visual analysis of unconstrained images in social
forums. Recently, enormous amounts of images have been shared via social networks; these images exhibit extreme diversity in terms of semantic content, visual quality, and style. While such a large quantity of shared content can be used as a resource to extract valuable information, it has become crucial to develop algorithmic solutions that automate visual understanding to enable knowledge discovery from such challenging resources. For visual content retrieval, we show that concept detectors trained from example images based on extracted features can be effectively used to identify a variety of concepts from a large archive of data shared on social forums. Such categories include objects, scenes, and events, among others, and advanced algorithmic solutions are used to accelerate both the learning and detection processes. In addition, we
present a novel study on analyzing individual user's behavioral patterns regarding images shared
on social forums. In particular, we view diverse user activities on social multimedia services as an
economy, where the first activity mode of sharing or posting is interpreted as supply, and another
mode of activity such as commenting on images is interpreted as consumption. To characterize
user profiles in these two behavioral modes, we propose an approach to characterize users' supply
and consumption profiles based on the image content types with which they engage. We then
present various statistical analyses, which confirm that there is an unexpected significant
difference between these two behavioral modes. Furthermore, we introduce a statistical approach
to identify users with salient profiles, which can be useful for social multimedia services to block
users with undesirable behavior or to promote viral content.
Polarimetric Calibration and Characterization of the Telops Field Portable
Polarimetric-Hyperspectral Imager in the LWIR
Joel Holder, Air Force Institute of Technology, Dayton, OH [email protected]
Polarimetric-hyperspectral imaging brings two traditionally independent modalities together to
potentially enhance scene characterization capabilities. This could increase confidence in target
detection, material identification, and background characterization over traditional hyperspectral
imaging. In order to fully exploit the spectro-polarimetric signal, a careful calibration process is
required to remove both the radiometric and polarimetric response of the system (gain). In the
long-wave infrared (LWIR, 8 μm to 12 μm), calibration is further complicated by the polarized
self-emission of the instrument itself (offset). This paper presents a calibration methodology
developed for a LWIR Telops Hyper-Cam that has been modified with a rotatable linear wire-grid
polarizer (4000 line/mm, ZnSe, 350:1 extinction ratio). A standard spectro-radiometric calibration
method for Fourier-transform spectrometers (FTS) is modified with a Mueller matrix approach to
account for polarized transmission through and polarized self-emission from each optical
component. This is done for two cases: one assuming that the instrument polarizer is ideal, and a
second method which accounts for a non-ideal instrument polarizer. It is shown that a standard
two-point radiometric calibration at each instrument polarizer angle is sufficient to remove the
polarimetric bias of the instrument, if the instrument polarizer can be assumed to be ideal. For the
non-ideal polarizer case, the Mueller deviation matrix is determined for the system, and used to
quantify how non-ideal the system is. The noise-equivalent s1, s2, and DoLP are also quantified
using a wide-area blackbody. Finally, a scene with a variety of features in it is imaged and
analyzed.
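A minimal sketch of the two-point (gain/offset) calibration applied per polarizer angle, followed by a textbook ideal-polarizer Stokes estimate. This is a simplified stand-in for the paper's Mueller-matrix treatment, and the angle set and variable names are assumptions.

```python
import numpy as np

def two_point_calibration(counts_cold, counts_hot, radiance_cold, radiance_hot):
    """Per-pixel two-point (gain/offset) radiometric calibration.

    Performed independently at each instrument polarizer angle; inputs are
    same-shaped arrays of raw counts from cold/hot blackbody views and their
    known radiances.
    """
    gain = (radiance_hot - radiance_cold) / (counts_hot - counts_cold)
    offset = radiance_cold - gain * counts_cold
    return gain, offset

def stokes_from_polarizer_angles(calibrated):
    """Estimate s0, s1, s2 and DoLP from calibrated radiances at 0/45/90/135 degrees.

    Uses the ideal linear-polarizer relations, not the paper's treatment of a
    non-ideal instrument polarizer.
    """
    L0, L45, L90, L135 = (calibrated[a] for a in (0, 45, 90, 135))
    s0 = 0.5 * (L0 + L45 + L90 + L135)
    s1 = L0 - L90
    s2 = L45 - L135
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / s0
    return s0, s1, s2, dolp
```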