Fakultät Informatik
Zoomable Cell
Matthias Reimann, Anne Tuukannen, Michael Schroeder
Marcel Spehr, Dimitrij Schlesinger, Stefan Gumhold

Vision
[Figure: panels A to I at scales of 10,000 nm, 1,000 nm, 100 nm, 10 nm, and 1 nm]
- Data: microscopy images (>200,000 images from scientific literature)
- Data: protein interactions and 3D structures (>48,000 3D protein structures from PDB)
- Natural coordinate system
- Abstract cell
- Zoomable cell

Achievements
Protein interactions and 3D structures
- Algorithm for constraint-based construction of large complexes, with a use case in histone methyltransferase SET1 (in Proteomics 2010)
  Novelty: constrain combinatorial explosion with protein interaction data
- Network visualisation with power graphs, with disease applications (in Experimental Cell Research 2010, Neurological Research 2010, 2 submitted)
  Novelty: exploiting modules in graphs for visualisation
Natural coordinate system
- DOG4DAG: semi-automated generation of terms, definitions, and hierarchies (in Bioinformatics 2010)
  Novelty: first system to integrate all steps of ontology generation + evaluation
Microscopy images
- Image search and classification with natural coordinate system (manuscript in preparation)
  Novelty: image library with 1.3 million images, implementation of image similarity measures and filters, GoImage system with 745,000 images

Contents
- Constraint-based modelling of complexes
- Image search and classification with natural coordinate system
- Limits and Perspectives

Histones and histone methyltransferases
- DNA is a long string wound around histones (figure: Molecular Cell Biology, Lodish et al.)
- Histones are modified by histone methyltransferases
[Figure labels: Met, Histone H3, Dim-5 protein, Suv39-type histone K9 methyltransferase]

Interactions of SET1 subunits
AP/MS and Y2H: Roguev et al. 2004, Dehe et al. JBC (2006)
Positive interactions: Set1 Set1 Set1 Set1 Set1 Set1 Bre2 Swd1 Swd2 Bre2 - Bre2 Shg1 Ssp1 Swd1 Swd2 Swd3 Bre2 Swd3 Swd2 Sdc1
Negative interactions: Set1 - Sdc1; Bre2 - Sdc1 - Bre2

Workflow

Constraints

Model

Image search and classification
- Current engines: Yale Image Finder, (FigSearch), BioText Search Engine
  - No web
  - No filtering of graphs, tables, formulas, photos, etc.
- Zoomable Cell image search
  - 90% of images not suitable
  - Yahoo BOSS API to retrieve several million images
  - Bottom-up approach: manual selection of 2,000 from 20,000 images
  - Expansion of seed images to 745,000 images by image similarity
- Image similarity
  - Gist scene descriptor (960 image features)
  - Approximate nearest neighbour clustering

Similar images
- 266 images for endoplasmic reticulum
- 2 out of 266 for rodents

Navigation in Large Information Spaces
- A similarity-based vector representation of images transforms the problem into a search scenario in high dimensions
  - Image features constitute the space
  - Images are points in that space
- Kernel PCA permits consistent handling of similarities from different feature domains
[Diagram: similarity measures d_i -> kernel PCA -> combined feature space ϕ (dimensionality adjustable)]

Image features
Name | Runtime / image | Quality | Reference | Limits | Similarity
Gist | 0.15 sec | ++ | [Oliva and Torralba; 2006] | Image details | e L2
CEDD | 0.01 sec | +++ | [Chatzichristofis and Boutalis; 2008] | Specific content | Tanimoto coefficient
FCTH | 0.01 sec | ++ | [Chatzichristofis and Boutalis; 2008] | Specific content | Tanimoto coefficient
CLD; SCD; EHD | 0.01 sec | + | [Manjunath et al; 2002] | Specific content; intuitiveness | e L2
Segmentation | 1 sec | ++ | Unpublished; in-house development | Model complexity; computation time | Dot product
SIFT Bag-of-Features | 0.3 sec | ++ | [Csurka et al; 2004] | Perspective transformations; image clutter | e L2
Annotation | 0.001 sec | +++ | [Joachims, 1999] | Existence and quality of labels | String kernel (word frequencies)
Low Res | 0.004 sec | + | - | Affine image transformations; illumination changes | e L1
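The combination of per-feature similarities through kernel PCA can be illustrated with a minimal sketch. It rests on several assumptions that are not stated on the slides: the "e L2" and "e L1" entries are read as exponential kernels exp(-distance), the Tanimoto coefficient is taken in its real-valued form, the per-feature kernels are combined by a plain average, and the feature matrices are random placeholders. All function names are hypothetical.

```python
import numpy as np


def exp_distance_kernel(X, order=2):
    """K[i, j] = exp(-||x_i - x_j||_order); an assumed reading of 'e L2' / 'e L1'."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, ord=order, axis=2)
    return np.exp(-dist)


def tanimoto_kernel(X):
    """Real-valued Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b).

    Assumes every row of X is non-zero, so the denominator is positive.
    """
    dot = X @ X.T
    sq = np.diag(dot)
    return dot / (sq[:, None] + sq[None, :] - dot)


def combine_kernels(kernels):
    """Average the kernel matrices (an assumed, simple combination rule)."""
    return np.mean(np.stack(kernels), axis=0)


def kernel_pca(K, n_components):
    """Embed the n images described by the n x n kernel K into n_components dimensions."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H                             # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale             # one row of coordinates per image


# Placeholder data: 6 images, two feature domains of different dimensionality.
rng = np.random.default_rng(0)
gist_like = rng.random((6, 8))
cedd_like = rng.random((6, 5))

K = combine_kernels([
    exp_distance_kernel(gist_like, order=2),
    tanimoto_kernel(cedd_like),
])
embedding = kernel_pca(K, n_components=3)      # dimensionality is adjustable
```

Because every feature domain is reduced to an n x n kernel matrix before the combination step, descriptors of different length and type can be mixed without concatenating raw feature vectors.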
Directed Search Strategies in ϕ
- Idea from optimization, the taxi-cab method: reduce to 1D minimization along the ϕ axes
- User performs the search interactively
- Problem: sparsely sampled space
[Figure: start point x0 in the plane spanned by the axes ϕ1 and ϕ2]
- Sample along each ϕ axis
  - as close as possible to the axis
  - reachability of all images
- Solution
  - use Voronoi regions of the ϕ axes to assign all images
  - filter out all images that can be reached indirectly over remaining images

Results
- Star view and user feedback

Continuous zoom at different time steps and resolutions
- Problem: transforming one image into another by generating a sequence of intermediate images to achieve a seamless transition
- As automatic as possible and user adjustable
- Thin plate spline + optical flow

Contributions, Limits, and Perspectives
Contributions:
- Small-scale, semi-automated, constraint-based modelling of complexes possible
- Compact visualisation of protein interaction networks with power graphs
- Image search (library, textual and image similarity, navigation, learning)
- Zooming
Limits:
- Large-scale, automated modelling of complexes not possible
- Additional data needed on structures, interactions, EM maps, localisation
- Integration of coherent image data
Perspective:
- Concrete application in fruit fly development
- Integrating video microscopy data, manual annotation, protein and protein interaction data
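The directed search in ϕ described earlier (assigning images to the Voronoi regions of the ϕ axes and dropping indirectly reachable ones) can be sketched as follows. This is only one possible reading of the slides: an image is assigned to the signed axis that its offset from the current image is closest to, and within each region only the nearest image is kept, on the assumption that the others remain reachable in later steps. The function names and the toy coordinates are hypothetical.

```python
import numpy as np


def assign_to_axes(points, current):
    """Assign every point to the signed phi axis closest to its offset from `current`.

    The axis line nearest to an offset vector is the one along which the
    offset has its largest absolute component.
    """
    offsets = points - current
    axis = np.argmax(np.abs(offsets), axis=1)
    sign = np.sign(offsets[np.arange(len(points)), axis])
    return axis, sign


def axis_neighbours(points, current_index):
    """For each signed axis, return the index of the nearest image in its region.

    Keeping only the nearest image per region is an assumed simplification of
    'filter out all images that can be reached indirectly'.
    """
    current = points[current_index]
    axis, sign = assign_to_axes(points, current)
    dist = np.linalg.norm(points - current, axis=1)
    best = {}
    for i in range(len(points)):
        if i == current_index or sign[i] == 0:   # skip the current image and duplicates
            continue
        key = (int(axis[i]), int(sign[i]))
        if key not in best or dist[i] < dist[best[key]]:
            best[key] = i
    return best


# Toy 2D embedding (phi1, phi2); image 0 is the current image x0.
points = np.array([
    [0.0, 0.0],
    [1.0, 0.1],
    [3.0, 0.2],
    [-0.2, 2.0],
    [0.1, -1.0],
])
neighbours = axis_neighbours(points, current_index=0)
```

In an interactive setting, the user would pick one of the returned directions, the chosen image would become the new current image, and the assignment would be recomputed from there, which mirrors the one-dimensional steps of the taxi-cab method.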