Being Similar is Not Enough: How to Bridge Usability Gap Through Diversity in Medical Images

2014 IEEE 27th International Symposium on Computer-Based Medical Systems (CBMS), New York, May 2014. Repository record: http://www.producao.usp.br/handle/BDPI/45664 (Biblioteca Digital da Produção Intelectual - BDPI, Universidade de São Paulo, Departamento de Ciências de Computação - ICMC/SCC).

Lúcio F. D. Santos∗, Marcos V. N. Bedo∗, Marcelo Ponciano-Silva∗†, Agma J. M. Traina∗ and Caetano Traina Jr∗
∗ Department of Computer Science, University of São Paulo, Brazil
† Fed. Institute of Education, Science and Technology of the Triângulo Mineiro, Brazil
{luciodb, bedo, ponciano, agma, caetano}@icmc.usp.br

similarity-based searches, represented in metric spaces [5]. The two most well-known similarity-based comparison operators are the similarity range (Rq) and the k-nearest neighbor (k-NNq). While a query Rq retrieves the images within a given distance of the query image, the k-NNq operator retrieves the k elements most similar to it. Writing a correct query definition is a multidisciplinary subject, which can help to bridge both the semantic and the usability gap. According to [2], usability gaps are related to how easy it is to use a system from the health staff perspective, allowing them to easily and intuitively formulate queries and later interact with the result through query refinement and relevance feedback. However, in a clinical application, querying a massive database employing k-NNq and/or Rq predicates may often retrieve images too similar among themselves (i.e., retrieved images are more similar to each other than to the query center, being called near-duplicates) [6].
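The two operators above can be illustrated with a minimal linear-scan sketch. It assumes feature vectors stored as plain tuples and the Euclidean distance; the function names `range_query` and `knn_query` and the toy points are hypothetical, and a real CBMIR system would typically answer these queries through a metric index rather than a full scan.

```python
import heapq
import math


def euclidean(a, b):
    """L2 distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(dataset, query, radius, dist=euclidean):
    """Rq: every element whose distance to the query is at most `radius`."""
    return [s for s in dataset if dist(s, query) <= radius]


def knn_query(dataset, query, k, dist=euclidean):
    """k-NNq: the k elements closest to the query, nearest first."""
    return heapq.nsmallest(k, dataset, key=lambda s: dist(s, query))


# Toy 2-D feature vectors (illustrative data, not taken from the paper).
points = [(0.0, 1.0), (0.0, 1.1), (2.0, 0.0), (0.0, -3.0), (5.0, 5.0)]
q = (0.0, 0.0)

in_range = range_query(points, q, radius=2.0)
nearest = knn_query(points, q, k=2)
```

In this toy example the two nearest neighbors, (0.0, 1.0) and (0.0, 1.1), are closer to each other than either is to the query, which is the near-duplicate situation described above.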
Figure 1 (a) illustrates the result obtained by the traditional k-NNq algorithm in a search space, where dotted balls mark the group of near-duplicates. As can be seen, plain k-NNq returns two near-duplicates, which may be unsatisfactory to the expert, since they probably carry the same information when both are in the result set. Near-duplicates require more effort from the specialist to navigate and analyze the elements, often requiring several query reformulations until the desired answer is obtained [5]. In order to present a more useful result and bypass the problem of near-duplicates, several research areas from different domains have introduced “diversity” to similarity result sets [7], [8]. The intuition behind diversity is to generate a result set that includes elements not only similar to the query element, but also diverse among themselves. Recently, this property has been explored in medical image systems [9] as a bi-criteria optimization problem, where similarity and diversity compete with each other, ruled by a trade-off parameter defined by the user, which results in an NP-hard problem [10]. In spite of the efforts to reduce the computational cost required by the diversity approach, the problem remains costly and setting the trade-off parameter between similarity and diversity for each query is difficult and

Abstract: In this paper we present a technique developed to bridge the usability gap in Content-Based Medical Image Retrieval (CBMIR) systems by exploring both similarity and diversity. Usability gaps are related to how easy a software tool is to use from the radiologist’s perspective. Although much has been done to better express similarity queries, the use of CBMIR over massive databases may have drawbacks that impact its usability. We claim that many of the problems derive from the fact that many of the returned images are closer to each other than to the query element (near-duplicates).
To target this nuisance, we propose to boost similarity queries with diversity, using a technique to hierarchically cluster near-duplicates. We tailored a domain-independent and parameter-free method by controlling the maximum area reached in the search space. This novel approach to improving CBMIR systems takes advantage of diversity expectations. The proposed approach, BridGE (Better result with influence diversification to Group Elements), aims at adding new relevant information for the analysts, reducing the need for further query refinement or relevance feedback cycles. The results are displayed to the specialist as a traditional CBMIR result, while the radiologists are able to expand the clusters and navigate through them. The results support our claim that a CBMIR system empowered with diversity is able to bridge the usability gap, grouping near-duplicates and being at least 2 orders of magnitude faster than its main competitors.

Keywords: Content-Based Medical Image Retrieval; Similarity Queries; Search Result Diversification

I. INTRODUCTION

When radiologists analyze new cases in the clinical routine, they can be motivated to search a historic database for similar past cases that could have had similar known anomalies. In fact, retrieving similar images has the potential to help specialists interpret medical images, providing new insights and contributions to the current case. Also, differential diagnosis techniques may help to increase (or decrease) the professionals' degree of certainty about their previous diagnosis hypothesis [1]. The automatic retrieval of similar images has been studied by many researchers [2][3], culminating (among others) in the current components/methods of Content-Based Medical Image Retrieval (CBMIR) systems [4]. CBMIR-based tools retrieve images that are similar to the given query image instead of using traditional data (i.e., text or numerical attributes related to the image).
Usually, those systems are supported by operations involving

1063-7125/14 $31.00 © 2014 IEEE. DOI 10.1109/CBMS.2014.21

Figure 1. Element selection for similarity queries in a Euclidean bi-dimensional space. Squares are the selected elements. (a) The solution space for the traditional k-NNq centered at the query element (sq), including near-duplicates. (b) The result set retrieved by an optimization diversity approach. (c) The result diversification retrieved by the proposed diversity on CBMIR: clustered near-duplicates (circles) and their representative elements (pentagons). Triangles are elements not returned/used to answer the query.

at least generally unintuitive for some users [11]. Figure 1 (b) depicts an example of the diversity provided by the method developed in [9], which shows that even when diversity is considered, it is possible to have near-duplicates.

II. BACKGROUND

Most CBMIR systems rely on the similarity query paradigm [12], [13] to perform image retrieval. Similarity queries over images are performed by comparing pairs of feature vectors extracted from them. The vectors are compared using a function that measures their similarity and, based on a similarity criterion (either Rq or k-NNq), the system retrieves images that are close/similar to the given query image. The problem with those traditional criteria is that they produce a great number of near-duplicates in the result set. A solution for the near-duplicate problem is to take into account the diversity among the elements in the result set. Diversity has been tackled in various ways by different research areas [7], [8]. The most common method is solving a bi-criteria optimization problem, where similarity and diversity compete with each other, ruled by a trade-off parameter [14], [10], [9], which is defined by an expert.
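To make the role of the trade-off parameter concrete, the following is a minimal sketch of a greedy, MMR-style re-ranking in the spirit of [14]. It is an illustration under stated assumptions, not the objective function of [9] or [10]: the scoring formula, the function name `diverse_rerank` and the parameter name `lam` are hypothetical, and the toy points are not from the paper.

```python
import math


def euclidean(a, b):
    """L2 distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def diverse_rerank(candidates, query, k, lam=0.5, dist=euclidean):
    """Greedily pick k elements balancing closeness to the query against
    spread from the elements already picked.

    lam = 0 reproduces plain nearest-neighbor order; larger values of lam
    favor elements far from the ones already in the result.
    (Assumed scoring rule, used only for illustration.)
    """
    pool = sorted(candidates, key=lambda s: dist(s, query))
    result = []
    while pool and len(result) < k:

        def score(s):
            closeness = -dist(s, query)  # higher is more similar
            spread = min((dist(s, r) for r in result), default=0.0)
            return (1.0 - lam) * closeness + lam * spread

        best = max(pool, key=score)
        result.append(best)
        pool.remove(best)
    return result


# Candidate set, e.g. the output of a k-NNq asked for more than k elements.
cands = [(0.0, 1.0), (0.0, 1.1), (2.0, 0.0), (0.0, -3.0)]
plain = diverse_rerank(cands, (0.0, 0.0), k=2, lam=0.0)
mixed = diverse_rerank(cands, (0.0, 0.0), k=2, lam=0.5)
```

With `lam=0.5` the second pick jumps to the farthest candidate, which shows how a single numeric parameter can move the answer away from the query context, and why choosing its value per query is not obvious for a clinical user.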
Generally speaking, those methods receive the result of basic similarity algorithms configured to retrieve more elements than requested by the user and thereafter apply the bi-criteria objective function to re-rank the solution, inducing diversity among the elements sent to the user. Another definition of diversity considers that there exists a minimum distance ξp allowable between each pair of elements [15]. It assumes that if two elements are closer than ξp, then they probably carry the same information and only one should be returned (minimum-distance rule). Besides the higher computational cost, the main problem in adopting content-based diversity queries in CBMIR systems is the number of additional parameters required for each query, which contributes to widening the usability gap. Notice that a query using those methods may not return the requested k similar images (see Figure 1 (b)) when diversity is considered, showing that they sacrifice similarity to the query element to achieve diversity, imposing a trade-off between similarity and offering more options to the user.

In this paper we explore result diversification on CBMIR systems to group the near-duplicates automatically, retrieving more information for the specialist right in the first query (retrieving the answer illustrated in Figure 1 (c)). In addition, the user can hierarchically visualize each near-duplicate group in the result set and proceed interactively, either shifting the query element in an intuitive way or performing relevance feedback through the Rocchio method [1]. The results reported in this paper were obtained using images from two real medical datasets. Our proposed method highlights the usability difference between the traditional CBMIR and the CBMIR with diversity.
We follow two strategies to evaluate our proposal: 1) we measured the number of interesting elements returned by our approach and by the traditional k-NNq algorithm, and 2) we measured the time required to execute traditional similarity queries and those improved with diversity. The results showed that our method is at least 2 orders of magnitude faster than the competitors at grouping near-duplicate images and returns up to 10 times more elements from the solution space, at the cost of being at most 2 times slower than traditional (non-diverse) k-NNq. The remainder of this paper is structured as follows. Section II summarizes the main concepts and related works. Section III presents the methodology applied to our technique, while Section IV details experiments performed over two real medical datasets and analyzes the results achieved. Finally, Section V presents the conclusions and future works.

Another recent approach defines diversity without requiring more information from the user, the so-called result diversification based on influence (RDI) [5]. It is based on an automatically defined minimum distance between two elements si and sj that employs only the position of the elements relative to the query element sq. The minimum distance is estimated using the concept of “influence” intensity I, which is defined as the inverse of the distance between two elements, such as si and sq. Thus, an element sj is more influenced by si than by sq if I(si, sj) ≥ I(sj, sq). The BRID (Better Result with Influence Diversification) technique implements the influence concept for similarity queries. It considers that all the elements influenced by element si can be discarded from the result set.
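The influence rule can be sketched as a simple filter over the elements sorted by distance to the query. This sketch follows only the condition as stated in this section, I(si, sj) ≥ I(sj, sq); the complete definition of BRID in [5] may include further conditions. The function name `influence_filter` and the toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """L2 distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def influence_filter(dataset, query, k, dist=euclidean):
    """Return up to k elements, nearest first, skipping influenced ones.

    With I(a, b) = 1 / d(a, b), the test I(r, s) >= I(s, q) is the same
    as d(r, s) <= d(s, q) for positive distances; comparing distances
    directly avoids dividing by zero.
    """
    result = []
    for s in sorted(dataset, key=lambda e: dist(e, query)):
        d_query = dist(s, query)
        # s is discarded when some already selected element r influences
        # it at least as strongly as the query does.
        if any(dist(r, s) <= d_query for r in result):
            continue
        result.append(s)
        if len(result) == k:
            break
    return result


points = [(0.0, 1.0), (0.0, 1.1), (2.0, 0.0), (0.0, -3.0), (5.0, 5.0)]
selected = influence_filter(points, (0.0, 0.0), k=3)
```

Here (0.0, 1.1) is dropped because it lies closer to the already selected (0.0, 1.0) than to the query, so the answer contains three mutually distinct elements and the near-duplicate is lost.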
Furthermore, as BRID uses the distance from each element si to the query center to define the region of influenced elements, the estimated radius is strictly increasing and may exceed the context of the query (the nearest elements), which provides a holistic view of the dataset from the perspective of the query element. In this paper, we extend the BRID technique in two distinct ways. First, we expand the search algorithm not only to retrieve the answer taking the RDI concept into account, but also to retain and group the near-duplicates instead of discarding them. Second, we limit the search space using a heuristic that explores a context more focused on the query result, avoiding traversing the entire dataset.

We chose to adopt RDI because it does not require setting parameters, thus being transparent to the user. The BridGE technique uses the RDI concept to cluster medical images, retrieving for each of the k resulting images its own subset of similar images. The k image subsets allow a new, hierarchical way to present results, showing the neighborhood around each element with respect to its distance to the query element. It is important to highlight that near-duplicate images are stored to enable the user to ask about them, even though the images are similar among themselves. They may be interesting because of other data associated with them, distinct from the image itself, such as medical records and other metadata. We also propose a heuristic named Context-Boundary (CB), whereby we expect to restrict and control the size of the solution space's visualization. Our approach allows the user to perform query refinements: for instance, to browse the result set and select one of the retrieved images as a new query element. Moreover, BridGE also permits executing relevance feedback cycles, embedding an adequate strategy, for example the well-known Rocchio technique.

A.
BridGE: Grouping elements based on Influence

Our proposed method incrementally builds the result set by selecting representative elements for groups of near-duplicates. The first step selects the first element that is not influenced by others in the result set. The main contribution of BridGE over BRID is that it assigns the near-duplicates to the non-influenced elements (called representatives), keeping them hierarchically sorted, forming lists of clusters. Previous techniques simply discard the near-duplicates, considering that they cannot help the user. However, even though the images are near-duplicates, the additional information associated with each element can impact the decision support process when analyzing the medical images. Each cluster list is sorted by the distances between the elements and their representative, which is maintained as the list header. Thus a cluster is maintained as a list: Si = {si1, si2, ..., sin} | d(si1, si2) ≤ d(si1, si3) ≤ ... ≤ d(si1, sin), n ∈ N. Therefore, in a hierarchical way, the representative image summarizes the entire set of near-duplicate ones regarding the similarity to the query element. The algorithm processes the search space sequentially. After finding the first cluster, it iteratively continues searching for the next non-influenced element until retrieving k of them. Thus, BridGE always retrieves more information when compared to the traditional approaches, because each element can have an associated cluster list, which can be shown to the user on request to navigate the neighborhood of that element. We call this operation a “local expanded query”. Figure 1 (a) and Figure 1 (c) illustrate the difference between the traditional result produced by a similarity query and by our proposal when searching for the five most similar cases. Using the CBMIR front-end, the radiologist can navigate through the result set, expanding the sorted cluster

III.
PROPOSED METHOD

Intuitively, content-based queries should retrieve the most relevant images close to the query element. Therefore, the resources that the user can employ to express queries and to analyze the results are directly related to the system usability and its application to the clinical routine. A method intended to bridge the usability gap takes into account the so-called query usability levels: query statement, query feedback and query refinement. Accordingly, we developed the Better result with influence diversification to Group Elements technique (BridGE), which covers the aforementioned levels. It extends and improves the BRID algorithm, specifically targeting the task of retrieving medical images. It provides a clear way to express queries, facilitates expressing query refinements and allows the user to explore relevance feedback in an intuitive and unobtrusive way. Moreover, the proposed technique allows a clever way to cluster and hierarchically display results, reducing the clutter of presenting too many similar elements without discarding these elements. We select elements that are not near-duplicates of others in the search space, which keeps the fundamental nature of the operations that retrieve images similar to the query element, but internally considers the relationships among the elements, making the result analysis task more intuitive.

by selecting its representative images and retrieving their reports and metadata. Moreover, the specialist can perform relevance feedback or just select a new query element, by shifting the search space.

$$\xi_i = \frac{1}{i} \sum_{u=1}^{i} d(s_u, s_q) \qquad (1)$$

This process repeats until BridGE finds the required k representative elements or no other element exists in the search space.
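The grouping procedure with the Context-Boundary radius can be sketched as follows. This is a minimal linear-scan illustration under stated assumptions, not the authors' C++ implementation: it reads Equation 1 as the mean distance from the query to the first i representatives, so that ξ1 = d(s1, sq), and the function name `bridge` and the toy points are hypothetical.

```python
import math


def euclidean(a, b):
    """L2 distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bridge(dataset, query, k, dist=euclidean):
    """Return up to k clusters; each cluster is a list whose first item is
    the representative, followed by its near-duplicates sorted by their
    distance to that representative."""
    remaining = sorted(dataset, key=lambda e: dist(e, query))
    clusters = []
    rep_dist_sum = 0.0  # running sum of d(s_u, s_q) over representatives
    while remaining and len(clusters) < k:
        rep = remaining.pop(0)  # nearest element not yet grouped
        rep_dist_sum += dist(rep, query)
        radius = rep_dist_sum / (len(clusters) + 1)  # Equation 1
        near_dups, kept = [], []
        for s in remaining:
            (near_dups if dist(s, rep) <= radius else kept).append(s)
        near_dups.sort(key=lambda s: dist(s, rep))
        clusters.append([rep] + near_dups)
        remaining = kept  # grouped elements leave the search space
    return clusters


points = [(0.0, 1.0), (0.0, 1.1), (2.0, 0.0), (0.0, -3.0), (5.0, 5.0)]
top3 = bridge(points, (0.0, 0.0), k=3)
```

In this toy run the near-duplicate (0.0, 1.1) is kept inside the cluster of (0.0, 1.0) instead of being discarded, and the only input besides the query is k.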
Notice that controlling the growth of the influence radius based on the relative rank of the result set elements regarding the query element allows BridGE to define a search space focused on the query and to assign only the very similar elements as near-duplicates, without requiring the user to set parameters.

B. The Context-Boundary to CBMIR with diversity

Algorithms that analyze several elements of the dataset to diversify a query have two main problems: 1) it is hard to keep the search space focused on the query context, that is, around the elements that remain similar to the query element, and 2) it is hard to maintain an acceptable time to retrieve the answer (computational cost). The first problem is related to the semantics of the returned elements, as elements too dissimilar from the query center may induce users to think that the query was misinterpreted. The second problem is related to the system acceptability, because the user will not wait for minutes to receive the answer to a query that may even need to be reformulated or refined. To overcome such problems, we can consider that there exists a maximum context radius (MCR) (see Definition 1) related to the query element, which avoids returning elements farther than a distance ξmax. Indeed, some of the approaches reviewed in Section II use concepts equivalent to the MCR to reduce computational costs and define the query context. However, defining the distance ξmax to the query element relies on the user's expertise about the distance distribution among the elements in the whole dataset, which impairs the usability of the system.

IV. EXPERIMENTAL EVALUATION

In this section we compare our proposed BridGE algorithm to the traditional (non-diverse) k-NNq, to a diversity algorithm based on the k-medoid clustering algorithm (CLT) [10] and to the diversification algorithm based on the optimization approach (OPT) [9].
The CLT algorithm employs a concept similar to that used in BridGE, as it also groups similar elements and returns a set of representative elements. OPT represents the concept of diversity on similarity queries already applied to medical images. We follow two strategies to evaluate our proposal. The first one determines which method is more likely to empower a similarity query to recover new information. For this, we measured the number of images retrieved by our BridGE algorithm and by the traditional k-NNq one. The second is a performance test to evaluate the cost of providing more useful answers to a similarity query with diversity.

We evaluate the results by processing two real image datasets: the MRIBalan dataset [9] and the ImageCLEFmed dataset [16]. The former is composed of 704 magnetic resonance imaging (MRI) images obtained from the Clinical Hospital at Ribeirão Preto of the University of São Paulo (USP). The image feature vectors were obtained by the method proposed in [17] and were compared using the Euclidean distance function (L2) evaluated over the 30 features extracted. The ImageCLEFmed dataset is composed of 5,042 biomedical images from 32 manually assigned disjoint global categories, which is a subset of a larger collection of six different datasets used for the medical image retrieval task in ImageCLEFmed 2007 [16]. The feature vector of each image in this dataset was extracted using the SIFT descriptor and modelled using the Bag-of-Features approach with a dictionary of 310 visual words. The Manhattan distance function (L1) is used for this dataset. For each evaluated dataset, we randomly chose 100 different elements to be employed as query center elements. The parameter required by CLT and OPT to balance similarity and diversity was set to 0.5, which has been reported by their authors to achieve on average the best diversification [10]. Regarding BridGE, as it is a parameter-free technique, we

Definition 1.
Maximum Context Radius (MCR): Given a domain 𝕊, a dataset S ⊆ 𝕊, a query element sq ∈ 𝕊, a distance ξmax ∈ R and a distance evaluation function d, the elements most similar to sq are the subset S′ ⊂ S | ∀ sj ∈ S′, d(sj, sq) ≤ ξmax.

Our approach automatically estimates an influence radius for each element to maintain the representative elements among the most similar ones, which better adapts to the distance distribution around the query element without requesting any information from the user. The Context-Boundary heuristic is defined based on the relative rank of each element in the result set regarding the query element. BridGE starts by assuming that the influence radius around the query element is zero, since the result set is empty. The element most similar to the query element is inserted as the representative and its distance to the query element is used to define the first influence radius, ξ1 = d(s1, sq). All the elements si in the search space at a distance d(si, s1) ≤ ξ1 are grouped as near-duplicates of s1 and removed from the search space. Thereafter, the next element most similar to the query element that is not a near-duplicate of s1 is inserted in the result set and its influence radius is defined following Equation 1.

Figure 2. Result sets for a query over the MRIBalan dataset, centered at the “query element” shown and considering k = 5. (a) The result obtained by BridGE, highlighting the near-duplicate images hierarchically grouped under an element of the result set, and (b) the result set generated by the traditional k-NNq. Notice that the 5 images returned by k-NNq are the same ones grouped as near-duplicates of the third image returned by BridGE.

do not need to set any query parameter besides the number k of elements to retrieve. The experiments were performed on a computer with an Intel Core i7 processor and 8 GB of main memory, under Ubuntu Linux 11.10.
All the algorithms were implemented in C++, using the same programming framework for all of them to enable fair comparisons.

each other than the result presented by the plain similarity query shown in Figure 2 (b). In addition, each image in the diversity result also shows the number of other near-duplicate images, indicating to the user that grouped images do exist. Thereafter, if the near-duplicates are interesting to the user, it is possible to expand the query result without resubmitting a query to the system. Notice that we improved the query coverage to naturally include diversity without losing the context of retrieving the most similar images or requesting new parameters from the user.

A. Recovering New Information Experiment

In a massive image database, near-duplicate images usually have a lower probability of adding new information to the query result. Nevertheless, the near-duplicates should be made available to the user on request, since, if they are properly presented, it is possible to extract additional relevant information from them, increasing the understanding of the provided answer. The search for similar images with a high probability of new information (i.e., that are not near-duplicates of each other) using traditional k-NNq may need a large number of images to be retrieved. Therefore, the user is frequently required to interact with the system through query refinement and relevance feedback until the system has been fed with enough information to retrieve a sufficiently informative image set. For instance, suppose that a user is looking for the 5 cases most similar to the current one, taking into account the similarity of the patient's MRI image over the MRIBalan dataset. Two possible results for this example are presented in Figure 2. The result obtained by BridGE is shown in Figure 2 (a).
As can be seen, the diversity query retrieved 5 elements that are more dissimilar from

In order to evaluate the information retrieval power, we measured the number of images retrieved by BridGE and by the traditional k-NNq. Figure 3 (a) shows the measurements obtained when answering similarity queries over the MRIBalan dataset. Besides the high number of elements retrieved by BridGE, the results are sorted employing sets of similar elements, where the images that are more likely to add new information are shown at the top. Thus, BridGE retrieves the number of images defined by the specialist, as can be seen in Figure 3 (b), while k-NNq fetches those images linearly with a slight slope, in general returning less than 50% of images that are not near-duplicates of each other. We presented our approach to specialists in CBIR to evaluate both alternatives. They performed similarity queries using both the traditional k-NNq and BridGE. They stated that BridGE is indeed capable of presenting more meaningful images than k-NNq, providing more information about the relationships among the elements, while reducing the need for further query refinements. The specialists also highlighted that similarity queries with diversification that

set. However, the experiments showed that our proposed technique is faster than the other diversity approaches at executing similarity queries with diversity. In fact, BridGE is consistently around two orders of magnitude faster than OPT and around four orders of magnitude faster than the CLT algorithm, which is based on the k-medoid clustering algorithm. The result presented in Figure 4 indicates that our approach can be seamlessly integrated into a CBMIR system at the cost of doubling the required time compared to (non-diverse) k-NNq, while the competitors are easily 100 times slower.
Moreover, BridGE does not require new information from the specialist to retrieve relevant images in a feasible time, making the use of diversity on medical images transparent and intuitive.

Figure 3. Relation between the number of retrieved elements and the number of representative ones in the result set. (a) Total number of images retrieved for a query by BridGE and k-NNq and (b) number of representative images retrieved.

V. CONCLUSIONS

This paper presented an improved version of influence-based result diversification for k-NN queries to enhance the usability of CBMIR systems. This technique is able to automatically group near-duplicate entries and show only the most representative ones for the query, thus being more likely to retrieve new information for the specialist starting from the first query in an interaction session. The BridGE method that we proposed includes a new way to transparently express similarity queries with diversity, without cluttering the display with too much information, since it just shows the representative image of each cluster (of near-duplicates) and allows the users to dig deeper into the information only if they want to. Moreover, BridGE spots the similar images that can bring new useful information to the user. No information is omitted or lost compared to the traditional k-NNq approach. Instead, all the images considered near-duplicates are grouped and sorted, making it easier for the specialist to browse among them. To validate our method, we performed experiments using two real medical datasets that span up to 5,042 data elements and are represented by feature vectors of up to 310 dimensions. The experiments show that diversity techniques can enhance the use of CBMIR systems by avoiding query refinement and further relevance feedback cycles. Thus, BridGE can retrieve more representative images than k-NNq and group the near-duplicates to increase the usefulness of the data retrieved.
Moreover, unlike the competitors, our proposed technique has the desirable feature of not requesting any external parameter from the users, while keeping the context focused on the query element. The performance tests showed that BridGE is at least 2 orders of magnitude faster than the closest competitors, while being able to retrieve up to 10 times more elements than the traditional k-NNq algorithm. Evaluation performed by health personnel highlighted that being able to ask for similarity queries with diversification without setting external parameters is a more adequate

Figure 4. Average running time for the (a) MRIBalan dataset and (b) ImageCLEFmed dataset for k varying from 3 to 11.

do not require external parameters demand less effort from them, being simpler and more adequate to be embedded into CBMIR systems for use in the daily clinical routine.

B. Performance Experiment

In order to evaluate the retrieval performance of BridGE against its competitors, we also performed queries on the MRIBalan and ImageCLEFmed datasets. For each evaluated dataset, we randomly chose 100 different elements to be employed as query elements. Each point in the running time graphs represents the average number of microseconds required to evaluate 100 queries with a constant value of k, but posed at distinct query elements. Figure 4 shows, in log scale, the average time demanded by each technique to answer queries when k varies from 3 to 11 for both the MRIBalan (Figure 4 (a)) and ImageCLEFmed (Figure 4 (b)) datasets. For this experiment, we restricted the search space of the CLT and OPT algorithms to the farthest element retrieved by BridGE, because those algorithms require this information in advance (see Definition 1) to reduce the computational cost of using the entire dataset. Notice that if we do not have BridGE, this information must be estimated by the users, using their knowledge about the dataset, which is a factor hurting CLT and OPT usability.
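The measurement protocol described above can be sketched as a small timing harness. This is only an illustration: it assumes that each plotted point is the mean wall-clock time per query over the sampled query centers, and the names `average_query_time_us` and `knn_scan`, as well as the synthetic data, are hypothetical stand-ins for the actual algorithms and datasets.

```python
import math
import random
import time


def euclidean(a, b):
    """L2 distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_scan(dataset, query, k):
    """Stand-in query algorithm: k nearest elements by linear scan."""
    return sorted(dataset, key=lambda s: euclidean(s, query))[:k]


def average_query_time_us(run_query, dataset, k, n_queries=100, seed=0):
    """Mean time per query, in microseconds, over randomly chosen
    distinct query centers drawn from a non-empty dataset."""
    rng = random.Random(seed)
    centers = rng.sample(dataset, min(n_queries, len(dataset)))
    start = time.perf_counter()
    for q in centers:
        run_query(dataset, q, k)
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / len(centers)


# Synthetic 2-D vectors used only to exercise the harness.
gen = random.Random(42)
data = [(gen.random(), gen.random()) for _ in range(200)]
timings = {k: average_query_time_us(knn_scan, data, k) for k in range(3, 12, 2)}
```

Passing the same `seed` for every algorithm keeps the query centers identical across techniques, which matches the idea of comparing them on the same 100 queries.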
As expected, the traditional (non-diverse) k-NNq is the fastest algorithm. This occurs because it does not consider the relationship among the elements in the result

expression to pose queries in CBMIR tools. Thus, the main contribution of BridGE is to reduce the system's usability gap, by better expressing similarity queries that consider the relationship among the elements in the result set, without increasing the performance gap. Moreover, we believe that BridGE has great potential to be seamlessly integrated into CBMIR tools, facilitating later inroads of those systems into the clinical routine. As future work, besides incorporating the developed technique into a CBMIR system with traditional queries to perform a user-centered analysis of the feasibility of using diversity in the clinical routine, we will explore other ways to control the context boundary to adequately fit similarity queries into the daily clinical practice.

[8] M. Drosou and E. Pitoura, “Search result diversification,” SIGMOD Record, vol. 39, no. 1, pp. 41–47, Sep. 2010.
[9] R. L. Dias, R. Bueno, and M. X. Ribeiro, “Reducing the complexity of k-nearest diverse neighbor queries in medical image datasets through fractal analysis,” in Proceedings of 26th IEEE Symposium on Computer-Based Medical Systems, ser. CBMS, 2013, pp. 101–106.
[10] M. R. Vieira, H. L. Razente, M. C. N. Barioni, M. Hadjieleftheriou, D. Srivastava, C. Traina Jr, and V. J. Tsotras, “On query result diversification,” in IEEE 27th International Conference on Data Engineering. Hannover, Germany: IEEE, 2011, pp. 1163–1174.
[11] A. Angel and N. Koudas, “Efficient diversity-aware search,” in ACM SIGMOD International Conference on Management of Data. Athens, Greece: ACM, 2011, pp. 781–792.

ACKNOWLEDGMENT

The authors would like to thank FAPESP, CNPq, Capes and SticAMSUD for the financial support.

[12] D. Kaster, P. Bugatti, M. Ponciano-Silva, A. Traina, P. Marques, A. Santos, and J.
Traina, Caetano, “Medfmi-sir: A powerful dbms solution for large-scale medical image retrieval,” in Information Technology in Bio- and Medical Informatics, ser. Lecture Notes in Computer Science, C. B¨ohm, S. Khuri, L. Lhotska, and N. Pisanti, Eds. Springer Berlin Heidelberg, 2011, vol. 6865, pp. 16–30. R EFERENCES [1] M. Ponciano-Silva, J. P. Souza, P. H. Bugatti, M. V. N. Bedo, D. S. Kaster, R. T. V. Braga, A. D. Bellucci, P. M. Azevedo-Marques, C. T. Jr., and A. J. M. Traina, “Does a cbir system really impact decisions of physicians in a clinical environment?” in Proceedings of 26th IEEE Symposium on Computer-Based Medical Systems., ser. CBMS, 2013, pp. 41– 46. [13] M. V. N. Bedo, M. P. da Silva, D. dos Santos Kaster, P. H. Bugatti, A. J. M. Traina, and C. T. Jr, “Higiia: A perceptual medical cbir system applied to mammography classification,” in Demo and Applications Session of the XXVII Brazilian Symposium on Databases (SBBD). S˜ao Paulo, SP: SBC Brazilian Computer Society, 2012. [2] T. M. Deserno, S. Antani, and L. R. Long, “Ontology of gaps in content-based image retrieval.” Journal of Digital Imaging, vol. 22, no. 2, pp. 202–215, 2009. [14] J. Carbonell and J. Goldstein, “The use of MMR, diversitybased reranking for reordering documents and producing summaries,” in 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, NY, USA: ACM, 1998, pp. 335– 336. [Online]. Available: http://doi.acm.org/10.1145/290941. 291025 [3] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, “A survey of contentbased image retrieval with high-level semantics,” Pattern Recognition., vol. 40, no. 1, pp. 262–282, Jan. 2007. [4] M. O. G¨uld, C. Thies, B. Fischer, and T. M. Lehmann, “A generic concept for the implementation of medical image retrieval systems,” International Journal of Medical Informatics, vol. 76, no. 2-3, pp. 252–259, 2007. [15] T. Skopal, V. Dohnal, M. Batko, and P. 
Zezula, “Distinct nearest neighbors queries for similarity search in very large multimedia databases,” in 11th International Workshop on Web Information and Data Management. Hong Kong, China: ACM, 2009, pp. 11–14. [5] L. F. D. Santos, W. D. Oliveira, M. R. P. Ferreira, A. J. M. Traina, and C. Traina, Jr., “Parameter-free and domainindependent similarity search with diversity,” in Proceedings of the 25th International Conference on Scientific and Statistical Database Management, ser. SSDBM. New York, NY, USA: ACM, 2013, pp. 5:1–5:12. [16] H. M¨uller, T. Deselaers, T. Deserno, J. Kalpathy Cramer, E. Kim, and W. Hersh, “Overview of the imageclefmed 2007 medical retrieval and medical annotation tasks,” in Advances in Multilingual and Multimodal Information Retrieval, ser. Lecture Notes in Computer Science, C. Peters, V. Jijkoun, T. Mandl, H. M¨uller, D. Oard, A. Pe˜nas, V. Petras, and D. Santos, Eds. Springer Berlin Heidelberg, 2008, vol. 5152, pp. 472–491. [6] J. Banda, M. Schuh, T. Wylie, P. McInerney, and R. Angryk, “When too similar is bad: A practical example of the solar dynamics observatory content-based image-retrieval system,” in New Trends in Databases and Information Systems, ser. Advances in Intelligent Systems and Computing, B. Catania, T. Cerquitelli, S. Chiusano, G. Guerrini, M. K¨ampf, A. Kemper, B. Novikov, T. Palpanas, J. Pokorn´y, and A. Vakali, Eds. Springer International Publishing, 2014, vol. 241, pp. 87–95. [17] A. Balan, A. J. M. Traina, A. Traina, and P. AzevedoMarques, “Fractal analysis of image textures for indexing and retrieval by content,” in Proceedings of 18th IEEE Symposium on Computer-Based Medical Systems., ser. CBMS, June 2005, pp. 581–586. [7] L. F. D. Santos, W. D. Oliveira, M. R. P. Ferreira, R. L. F. Cordeiro, A. J. M. Traina, and C. T. Jr., “Evaluating the diversification of similarity query results,” Journal of Information and Data Management, vol. 4, no. 3, pp. 188–203, 2013. 293