MPEG-7 VISUAL SHAPE DESCRIPTORS

AUTHORS

Miroslaw Bober (corresponding author)
Visual Information Laboratory
Mitsubishi Electric Information Technology Centre Europe
20 Frederick Sanger Road, The Surrey Research Park
Guildford, Surrey GU2 5YD, United Kingdom
Phone: (+44) 1483 885818; Fax: (+44) 1483 579107
Email: [email protected]

Prof. Francoise Preteux
Institut National des Telecommunications
9, Rue Charles Fourier, 91011 Evry Cedex, France
Phone: (+33) 01.60.76.46.57; Fax: (+33) 01.60.76.43.81
Email: Francoise.Preteux@intevry.fr

Whoi-Yul Yura Kim
Associate Professor, School of Electrical and Computer Engineering
Hanyang University
Tel: +82 2 2290 0351; Fax: +82 2 2292 6316; Mobile: 011 9783 2353
Email: [email protected]

MPEG-7 VISUAL SHAPE DESCRIPTORS
Miroslaw Bober, Francoise Preteux and Whoi-Yul Kim

1 Introduction

This chapter is concerned with techniques for describing and matching the shape features of 2-dimensional and 3-dimensional objects. The Visual part of the MPEG-7 standard defines three descriptors with different properties: the contour-based shape descriptor, the region-based shape descriptor and the 3D shape spectrum descriptor. The theory underpinning each of them, their respective properties and their application domains are explained here in some detail. We also present example applications demonstrating the capabilities and applicability of the MPEG-7 shape descriptors.

Object shape features are very powerful when used in similarity search and retrieval, because the shape of an object is usually strongly linked to its functionality and identity. Humans can recognize characteristic objects solely from their shapes, an illustration that shape often carries semantic information. This property distinguishes shape from some other elementary visual features, such as colour or texture. A large body of research has been devoted to shape-based recognition, retrieval and indexing [1]. Many applications, including those concerned with visual object retrieval or indexing, are likely to use shape features. For example, an intelligent video security system may use shape techniques to determine the identity of an intruder. Shape descriptors can be used in e-commerce, where it is difficult to specify the required shape of an object with a text index, and query-by-example is simpler and faster. In fact, any application targeting visual scene interpretation and object recognition can hardly operate efficiently without powerful shape tools. Naturally, the best retrieval performance is achieved by using a combination of visual (and possibly other) feature descriptors, but that subject is beyond the scope of this chapter.

In the following section, an overview of the MPEG-7 shape descriptors is presented, and their functionalities, properties and application domains are analysed. Section 3 describes the performance tests and experimental results obtained in the MPEG-7 Core Experiments, including the databases used in the tests. The region-based shape descriptor, the contour-based shape descriptor and the 3D shape descriptor are presented in Sections 4, 5 and 6, respectively. Section 7 deals with the combination of the Multiple-View and contour shape descriptors. Example applications are outlined in Section 8, and conclusions are drawn in Section 9.
2 Overview of Shape Descriptors

The notion of object shape and shape similarity, while intuitively clear, may have many meanings. Most real-world objects are 3-dimensional, and a 3D shape descriptor has been developed by MPEG [2, 22]; it is presented in Section 6. However, in the image and video world only 2D projections of real-world 3D objects onto the image plane are visible, and often their exact 3D shapes are unknown. To cover this important domain, MPEG-7 provides contour-based and region-based shape tools.

Figure 1 Example of contour-based and region-based shape similarity

Figure 1 shows that even in the 2D case there can be two notions of similarity: one based on the object contour, and one based on the distribution of all pixels within the contour. Objects in the first row have a similar spatial distribution of pixels and are therefore similar according to a region-based criterion; however, they clearly have different outline contours. Where contour-based similarity is concerned, the objects shown in each column are similar. Posing a query with the object located in the first row and second column will retrieve the objects from the first row (if region-based similarity is used) or from the second column (if contour-based similarity is used). MPEG-7 supports both notions of similarity by providing region-based and contour-based shape descriptors.

2.2 Region-based shape descriptor

Figure 2 Example of shapes where the region-based shape descriptor is applicable

The region-based shape descriptor belongs to the broad class of shape analysis techniques based on moments. It uses a complex 2D Angular Radial Transformation (ART), defined on a unit disk in polar coordinates. The region-based shape descriptor expresses the pixel distribution within a 2D object region; it can describe complex objects consisting of multiple disconnected regions, as well as simple objects with or without holes (Figure 2). Some important features of this descriptor are:

• It gives a compact and efficient way of describing the properties of multiple disjoint regions simultaneously.
• It can cope with errors in segmentation where an object is split into disconnected sub-regions, provided that the information about which sub-regions contribute to the object is available and used during descriptor extraction.
• It is robust to segmentation noise, for example salt-and-pepper noise.

The ART descriptor is considered in detail in Section 4.

2.3 Contour-based shape descriptor

Figure 3 Examples of shapes where the contour-based shape descriptor is applicable

The contour shape descriptor is based on the Curvature Scale Space (CSS) representation of the contour. Objects whose characteristic shape features are contained in their contours are described efficiently by this descriptor (see Figure 3). Some important features of this descriptor are:

• It emulates well the shape similarity perception of the human visual system and has good shape generalisation properties.
• It is robust to significant non-rigid deformations (e.g. the outline of a running person or a walking beetle) and to perspective transformations.
• It is compact, and its size adjusts to the complexity of the contour.

Note that, as shown above, objects may have similar region shape properties and still exhibit different contour properties (and vice versa). If a complex object consists of multiple disjoint regions, the contour of each region component can be described separately, using the contour-based descriptor together with the MPEG-7 description scheme.
The descriptor, its properties and example applications are considered in detail in Section 5.

2.4 Multiple-View Descriptor for Shape

Sometimes a 3-dimensional object model may not be available, yet many applications require more than just 2D information. Take e-shopping as an example: the customer may imagine the required product, say a flower vase, but there is no actual 3D object from which a 3D descriptor could be extracted. Such a mental picture consists of a set of 2D views which, when combined, describe the 3D properties of the object. To cater for such situations, a combination of the Multiple-View descriptor with the contour-based or region-based shape descriptors can be used. Section 7 provides more details and experimental results.

3 Selection and testing of the MPEG-7 descriptors

All shape descriptors proposed for the MPEG-7 standard were rigorously tested in Core Experiments [3], where retrieval performance, descriptor size, matching complexity and other properties were examined. Based on the test results, the best techniques for the contour shape, region shape and 3D shape descriptors were selected.

A good descriptor captures characteristic shape features in a concise manner that lends itself easily to fast search and browsing. It is not required that the original shape can be reconstructed from the descriptor, and therefore a significant compactness of the representation can be achieved. Naturally, the descriptor should be invariant to scaling, translation and rotation. Furthermore, as one is frequently dealing with digital (sampled) images, scaling objects to a relatively small size (compared to the sampling grid) may significantly distort their shape; it is therefore critical that the shape representation is robust to such effects. Other distortions result from the imaging conditions: for example, a 2D shape may undergo a perspective transformation as a result of a change in the viewing angle. Some objects are flexible, and the representation should be able to cope with non-rigid deformations. Finally, the descriptors should cope with the distortions typical of object segmentation and extraction processes. To help the reader understand the properties of the shape descriptors, we present a short overview of the Core Experiments and the databases used in each category.

3.1 Testing contour-based shape descriptors

Testing of the contour-based descriptors was performed in Core Experiment 1 (CE-1) on a database of over 3400 images. Four experiments, referred to as A1, A2, B and C, were conducted, testing robustness to rotation and scaling (CE1-A1, CE1-A2), similarity-based retrieval and shape generalisation properties (CE1-B), and robustness to non-rigid deformation (CE1-C). The CE1 test database consists of three main components:

• Classified Shapes: 1400 shapes of various objects classified into 70 different classes (20 examples in each class).
• Geometrically Transformed Shapes: rotated and scaled versions of the first image in each class of the Classified Shapes database.
• Non-rigid motion shapes: 200 frames taken from the "Bream" sequence, depicting a swimming fish, together with about 1100 shapes of fish and other marine creatures.

3.1.1 Classified Shapes database for similarity-based retrieval

Since the notion of shape similarity is subjective and it may sometimes be difficult to rank similarity precisely, a new way of assessing performance was developed.
The idea is to construct the database in such a way that the shapes form "natural" clusters of similar shapes: the similarity between shapes within a cluster is greater than the similarity between shapes from different clusters.

Figure 4 Example of shape classes in the contour-based shape test set

This is illustrated in Figure 4, where cluster A contains shapes depicting horses (side view), cluster B contains shapes of camels, and cluster C shapes of cattle. While there is a different degree of shape variability within each cluster (see the position of the tail or legs), the degree of similarity between objects within each class is greater than the similarity between two objects from different classes. This property simplifies the assessment of retrieval performance: a query with a shape from cluster A should ideally return all remaining shapes from cluster A at the top ranks, before shapes from any other cluster are retrieved.

This part of the database consists of 1400 shapes in 70 clusters (classes) of 20 shapes each. It includes examples of natural objects with significant shape variability, such as beetles (Figure 5, top row); manually drawn objects (second row); objects extracted from cartoons, with a lesser amount of shape variability (third row); and drawn objects targeting specifically region-based similarity (bottom row). Each of the 1400 images is used as a query in turn, and the number of correctly retrieved shapes in the top 40 is calculated. (An object is considered to be correctly retrieved if it comes from the same class as the query shape.)

Figure 5 Example shape clusters from the database for similarity-based retrieval

3.1.2 Geometrically Transformed Shapes

All the techniques tested supported rotational and scaling invariance; however, it was imperative to test the robustness of each approach to effects such as changes to the shape caused by image resampling (rotation and scaling), quantisation of the representation, and the overall behaviour of the software implementations. For this reason, two additional databases were constructed, one to test scaling invariance (referred to as A1) and one to test rotational invariance (referred to as A2). One image from each of the 70 classes was used to derive digitally scaled versions for the A1 dataset (reduction by factors 0.3, 0.25, 0.2 and 0.1, and enlargement by a factor of 2). The A2 dataset contains digitally rotated versions of the reference images (rotation by angles of 9, 36, 45, 90 and 150 degrees). Thus there are 70 clusters in each database, every cluster containing 6 images (the reference image and 5 rotated or scaled versions), i.e. 420 images in total. A query is posed using every image in the database, and the number of correct matches in the top 6 shapes is calculated. The overall performance for each transformation is calculated by dividing the total number of correct retrievals by 2520 (the maximum number of correct retrievals for 420 queries). In these experiments, in order to be counted as a correct match, a shape must be retrieved in the top N matches, where N is the number of shapes in the cluster (here N = 6). This is a more stringent requirement than in the similarity-based retrieval experiment, where the correct matches are counted within the top 2N retrievals.

The scale-invariance test is the more challenging of the two, because some images are severely distorted when reduced by factors of 0.2 or 0.1. This was intended: the objective is to determine the robustness to various degrees of distortion.
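The scoring used throughout these experiments reduces to counting, for each query, how many of the top-ranked results belong to the query's class. A minimal sketch of such a scorer is given below; it is an illustrative reading of the protocol, with the descriptor distance function dist left as a stand-in for any of the tested descriptors.

```python
import numpy as np

def retrieval_score(desc, labels, dist, window=1):
    """Class-based retrieval score: each item queries the whole set and we
    count same-class items among the top window*N results, where N is the
    class size (window=1 for the A1/A2 tests, window=2 for part B)."""
    labels = np.asarray(labels)
    correct = total = 0
    for q in range(len(desc)):
        d = np.array([dist(desc[q], x) for x in desc])
        n = int(np.sum(labels == labels[q]))     # cluster size, query included
        top = np.argsort(d)[: window * n]        # best-ranked shapes
        correct += int(np.sum(labels[top] == labels[q]))
        total += n
    return correct / total                        # 1.0 means perfect retrieval

# Toy usage with random 8-element "descriptors" in 3 classes of 4 shapes:
rng = np.random.default_rng(0)
desc = list(rng.random((12, 8)))
labels = [i // 4 for i in range(12)]
print(retrieval_score(desc, labels, lambda a, b: float(np.abs(a - b).sum())))
```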
Figure 6 shows examples where scaling by a factor of 0.1 caused severe distortions, to the degree that even a human may find it difficult to determine the true shape.

Figure 6 Shapes of a child, a rat and a bottle (left box) and their scaled-down and resampled versions (right box)

3.1.3 Non-rigid motion shapes

The main characteristic features of an object's shape are usually retained under some non-rigid deformations. Many applications rely on this property, and it is therefore important that good shape descriptors can cope with such deformations. In this experiment, 200 frames showing a swimming fish (known in the MPEG community as the Bream fish) are combined with a database of 1100 shapes of marine creatures, including many images of other fish. The query is posed with the first image of the Bream fish, and the number of instances of other Bream shapes in the top 200 retrievals is computed. The shape of the fish is deformed as it swims, and the fish also turns, causing more dramatic changes to its shape. Figure 7 shows the query shape bream-000 (left box), three shapes of the Bream fish under non-rigid deformations (middle box), and some other fish present in the database (right box).

Figure 7 Examples of shapes used to test robustness to non-rigid deformations

3.2 Testing region-based shape descriptors

Region-based techniques were tested in Core Experiment 2 (CE-2). A database of trademarks is used to evaluate performance on complex shapes consisting of multiple disjoint regions, possibly with holes. The test database consists of two main components: classified trademark shapes and transformed shapes.

3.2.1 Classified Shapes

The S8 database contained 2800 trademark shapes: 678 objects classified into 10 different classes (with a variable number of examples in each class) and 2122 remaining unclustered trademarks. Examples of shapes from two classes are shown in Figure 8.

Figure 8 Example of shape classes in the region-based shape test set

3.2.2 Geometrically Transformed Shapes

Four tests are performed, assessing robustness to digital rotation and scaling (CE2-A1, CE2-A2), analogue rotation and scaling (CE2-A3), and perspective transformation (CE2-A4).

• Digital rotation and scaling (CE2-A1, CE2-A2): 20 randomly selected trademarks were digitally scaled by factors of 3, 1.5, 0.5 and 0.33, and rotated by 9, 36, 45, 90, 120 and 150 degrees. After merging with the original images, A1 contains 100 images and A2 contains 140 images. Queries are performed using all shapes in each class.

• Analogue rotation and scaling (CE2-A3): the A3 database was created by capturing a video sequence of 20 different trademarks printed on a board and subsequently digitising selected frames from the sequence. Scaling and rotation are effected by changes in the imaging conditions, such as the camera-to-object distance and relative position. Both analogue and digital cameras were used.

• Perspective transformation (CE2-A4): this dataset was created by capturing video sequences of 20 different trademarks printed on a rigid board. For each reference shape, 10 images exhibiting different degrees of perspective distortion were selected (200 shapes in total) and combined with the reference database S8.

Figure 9 Examples of shapes under perspective transformation used in the tests

3.3 Testing 3D shape descriptors
Constructing a database for testing 3D descriptors was perhaps the most difficult task, due to the lack of a sufficiently large publicly available dataset. Some 1290 models were collected by combining the 3D library used for the MPEG-4 experiments, the 3D Café dataset (http://www.3dcafe.com) and models of letters generated by one of the participants. The participants divided the dataset into 15 broad categories, such as 4-limbs, cars, aerodynamic shapes, trees, balloons, letter-a, letter-b, etc. As in the previous experiments, all classified models were used as queries, and the Bull's Eye score was used.

In the following sections we introduce each shape descriptor in turn: the region-based shape descriptor, the contour-based shape descriptor and the 3D shape descriptor.

4 The Region-based shape descriptor

In this section we present the MPEG-7 region-based shape descriptor, which can handle objects consisting of single or multiple regions. Three region-based shape descriptors were proposed to MPEG-7: a Multi-Layer Eigen Vector descriptor (MLEV) [9], a descriptor based on Zernike moments [28], and a descriptor based on the Angular Radial Transform (ART) [6]. The technologies competed through the process of Core Experiments, which concluded that the ART descriptor offers the best overall performance for region-based similarity.

The strength of region-based shape techniques is that they perform well for complex shapes consisting of several disjoint regions, such as trademarks, logos, emblems, clip-art and characters. The contour-based shape descriptors, presented in the next section, may not be suitable for such complex objects. One reason is that the contour of an object can change drastically if there is a small crack-like opening, or if the object touches neighbouring objects. For example, the shapes shown in Figure 10 (a) and (c) are very similar according to human perception. Their contours, shown in subfigures (b) and (d), are however very different, depending on whether or not the inner star touches the outer circle.

Figure 10 Visually similar images: (a) a shape consisting of a star touching a circle; (b) the contours of (a); (c) a similar shape with the star not touching the circle; (d) the contours of (c)

Naturally, contour-based shape descriptors are also not appropriate for describing shapes consisting of several disjoint regions; for example, they cannot be used to differentiate the shapes shown in Figure 11.

Figure 11 Example of complex shapes consisting of multiple regions

The region-based shape descriptor uses the Angular Radial Transform, defined next.

4.1 The Definition of ART

The ART is an orthogonal unitary transform defined on the unit disk, consisting of complete orthonormal sinusoidal basis functions in polar coordinates. The ART coefficients are defined by:

$$F_{nm} = \langle V_{nm}(\rho,\theta),\, f(\rho,\theta) \rangle = \int_0^{2\pi}\!\!\int_0^1 V_{nm}^{*}(\rho,\theta)\, f(\rho,\theta)\, \rho\, d\rho\, d\theta \qquad (1)$$

where $F_{nm}$ is an ART coefficient of order n and m, $f(\rho,\theta)$ is the image function in polar coordinates, and $V_{nm}(\rho,\theta)$ is the ART basis function, which is separable along the angular and radial directions:

$$V_{nm}(\rho,\theta) = A_m(\theta)\, R_n(\rho) \qquad (2)$$

In order to achieve rotation invariance, an exponential function is used for the angular basis function:

$$A_m(\theta) = \frac{1}{2\pi} \exp(jm\theta) \qquad (3)$$

The radial basis function is defined by a cosine function:

$$R_n(\rho) = \begin{cases} 1, & n = 0 \\ 2\cos(\pi n \rho), & n \neq 0 \end{cases} \qquad (4)$$
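As an illustration of equations (1)-(4), the ART coefficients of a rasterized shape can be approximated by sampling the basis functions on a pixel grid mapped onto the unit disk. The sketch below is a straightforward, unoptimized reading of the definition under that assumption; it is not the normative extraction procedure. The order ranges n < 3 and m < 12 follow the commonly used MPEG-7 setting of 36 coefficients.

```python
import numpy as np

def art_coefficients(img, n_max=3, m_max=12):
    """Approximate ART coefficients F[n, m] of a 2-D image array (eqs 1-4).
    The square image support is mapped onto the unit disk; pixels outside
    the disk are ignored. Returns a complex array of shape (n_max, m_max)."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    xs = (2 * x - w + 1) / w              # pixel centres mapped to [-1, 1]
    ys = (2 * y - h + 1) / h
    rho = np.sqrt(xs**2 + ys**2)
    theta = np.arctan2(ys, xs)
    inside = rho <= 1.0
    F = np.zeros((n_max, m_max), dtype=complex)
    for n in range(n_max):
        Rn = np.ones_like(rho) if n == 0 else 2 * np.cos(np.pi * n * rho)
        for m in range(m_max):
            Am = np.exp(1j * m * theta) / (2 * np.pi)
            # Cartesian pixel sum approximates the polar integral of eq (1),
            # since dx dy = rho drho dtheta; the constant pixel-area factor
            # is omitted, as it cancels after the later F00 normalization.
            F[n, m] = np.sum(np.conj(Am * Rn)[inside] * img[inside])
    return F
```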
The real and imaginary parts of the ART basis functions are shown in Figure 12.

Figure 12 Real and imaginary parts of the ART basis functions, for orders n = 0, ..., 8 and m = 0, ..., 8

Rotation invariance of the ART magnitudes

The magnitudes of the ART coefficients are rotation invariant. Let the image $f^{\alpha}(\rho,\theta)$ be the image $f(\rho,\theta)$ rotated by an angle α about its origin:

$$f^{\alpha}(\rho,\theta) = f(\rho,\, \theta + \alpha) \qquad (6)$$

The ART coefficients of the rotated image are then given by

$$F_{nm}^{\alpha} = \int_0^{2\pi}\!\!\int_0^1 V_{nm}^{*}(\rho,\theta)\, f^{\alpha}(\rho,\theta)\, \rho\, d\rho\, d\theta \qquad (7)$$

or, equivalently,

$$F_{nm}^{\alpha} = F_{nm} \exp(-jm\alpha) \qquad (8)$$

Hence the magnitudes of the ART coefficients of the rotated image and of the reference image are the same:

$$\left| F_{nm}^{\alpha} \right| = \left| F_{nm} \right| \qquad (9)$$

4.2 The ART Descriptor

The ART descriptor is defined as a set of normalized magnitudes of ART coefficients. For scale normalization, the ART coefficients are divided by the magnitude of the ART coefficient of order n = 0, m = 0. This coefficient is not itself used as a descriptor element, because it is constant after scale normalization. To keep the descriptor size to a minimum, each coefficient is quantised to four bits.

4.3 Similarity Measure

The distance between two shapes described by ART descriptors is calculated by summing the absolute differences of the corresponding descriptor elements:

$$\text{Dissimilarity} = \sum_i \left| M_d[i] - M_q[i] \right| \qquad (10)$$

Here the subscripts d and q denote the database image and the query image, respectively, and M is the array of ART descriptor values.
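Putting sections 4.2 and 4.3 together, descriptor extraction and matching can be sketched as follows, reusing art_coefficients from the previous sketch. The uniform 4-bit quantiser is an illustrative stand-in; the standard's exact quantisation tables are not reproduced here.

```python
import numpy as np

def art_descriptor(img, n_max=3, m_max=12):
    """ART descriptor (section 4.2): scale-normalized coefficient magnitudes.
    F[0, 0] is used only for normalization and is dropped from the output."""
    F = art_coefficients(img, n_max, m_max)
    mags = np.abs(F) / np.abs(F[0, 0])
    desc = np.delete(mags.ravel(), 0)        # remove the constant F00 element
    # illustrative uniform 4-bit quantisation (the standard uses fixed tables)
    return np.clip((desc * 16).astype(int), 0, 15)

def art_dissimilarity(m_q, m_d):
    """L1 dissimilarity of eq (10) between query and database descriptors."""
    return int(np.abs(m_q.astype(int) - m_d.astype(int)).sum())
```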
4.4 Experimental results

Table 1 (first data row) presents the results obtained in Core Experiment 2 using the ART descriptor implementation from the MPEG-7 XM (eXperimental Model) version 5.0. Good results are obtained in the tests of rotational and scaling invariance. The performance is somewhat degraded in the perspective transformation test (part CE2-A4); a possible way to improve it is suggested in the next section. Finally, the similarity-based retrieval test (CE2-B) shows a score of over 66%, which is high considering the complexity of the test database.

              CE2-A1    CE2-A2    CE2-A3    CE2-A4    CE2-B     Overall¹
ART (XM)      86.20%    98.06%    98.76%    62.65%    66.13%    76.28%
ART+Compact   87.60%    97.96%    99.06%    97.69%    65.04%    80.31%

Table 1 CE-2 Core Experiment results for the region shape descriptor
¹ The overall CE-2 performance is defined as: CE2 = {(A1 + A2 + A3 + A4)/4 + B}/2.

Figure 13 Example similarity retrieval results for a trademark database

Figure 13 shows an example of the retrieval results for query image 3075, for which the performance is close to the average obtained in part B. There are 22 shapes in this cluster. According to the manual clustering, the shapes numbered 4, 5, 7, 8, 9, 13 and 16 are considered members of a different cluster (and are therefore counted as errors); however, a large degree of similarity between them and the query image is still visible.

Results for Core Experiment CE-1 are presented in Table 2. It can be seen from parts A1 and A2 that the descriptor offers very good rotational and scaling invariance. As mentioned earlier, the region-based approach is more robust to the deformations caused by extreme scaling: in CE1-A1 it outperforms the contour shape descriptor by 7%. In similarity-based retrieval (part B), the region shape descriptor is on average about 15% worse than the contour shape descriptor. A detailed class-by-class analysis shows that, out of the 70 shape classes, the region shape descriptor significantly outperforms the contour-based descriptor in 8 cases, while the contour shape descriptor significantly outperforms the region shape descriptor in 42 cases. More details on Core Experiment CE-1 are provided in Section 5.4.

              CE1-A1    CE1-A2    CE1-B     CE1-C    Overall²
ART (XM)      97.60%    100%      68.34%    92%      86.40%

Table 2 CE-1 Core Experiment results for the region shape descriptor
² The overall CE-1 result is defined as: CE1 = {(A1 + A2)/2 + B + C}/3.

4.5 Compensation for perspective transformations

Perspective transformations of shapes frequently occur in real-world applications, and it is desirable that shape descriptors are able to retrieve shapes distorted by them. However, most region-based shape descriptors perform poorly in such situations. As shown in Table 1, the retrieval performance in the CE2-A4 experiment (involving perspective transformations) is relatively low compared with the performance under rotation and scaling. In this section, a method to improve retrieval performance in the presence of perspective transformations is discussed.

J. G. Leu proposed a compaction algorithm to obtain skew invariance [11]. The objective of the compaction algorithm is to normalize an image via a sequence of two linear transformations, so that the covariance matrix of the compacted image becomes a scaled identity matrix. The compacted image is thus invariant to translation, scaling and skew: as a result of the compaction, a triangle becomes an equilateral triangle, a rectangle becomes a square, and so on. Although the compaction algorithm can normalize the skew caused by a perspective transformation, such normalization is not desirable in all applications. When the normalization for perspective transformations is taken into account, the similarity distance between two images can be defined as:

$$\text{Dissimilarity} = \min(D_O, D_C) \qquad (11)$$

Here $D_O$ and $D_C$ are the similarity distances from the query image to the original image set and to the compacted image set, respectively. Table 1 compares the retrieval performance of the ART descriptor without and with perspective normalization (first and second data rows, respectively). The retrieval performance improves drastically where the normalization for perspective transformations is needed (CE2-A4), while the results of the other experiments remain almost the same.
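A minimal sketch of the covariance-based normalization idea is given below: it whitens the second-order moments of the shape's pixel coordinates so that their covariance matrix becomes a scaled identity. This is an illustrative reading of the compaction principle described above, not a reproduction of Leu's exact two-transformation algorithm [11].

```python
import numpy as np

def compact_shape(points):
    """Normalize 2-D shape points so that their covariance matrix becomes a
    scaled identity, removing translation, scale and skew (rotation remains,
    which the ART magnitudes absorb). points: (N, 2) array of coordinates."""
    pts = np.array(points, dtype=float)
    pts -= pts.mean(axis=0)                   # remove translation
    vals, vecs = np.linalg.eigh(np.cov(pts.T))
    white = (pts @ vecs) / np.sqrt(vals)      # equalize variances along axes
    return white / np.sqrt((white**2).sum(axis=1).mean())  # fix overall scale

def perspective_aware_dissimilarity(d_query, d_orig, d_compact, dist):
    """Eq (11): match the query against both the original and the compacted
    descriptor and keep the smaller distance."""
    return min(dist(d_query, d_orig), dist(d_query, d_compact))
```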
5 Contour Shape Descriptor

In this section the MPEG-7 contour shape descriptor is presented: we explain the underlying theory, discuss its properties and review some experimental results. The contour shape descriptor is based on the Curvature Scale-Space (CSS) representation of the contour [10]. The CSS representation has been successfully used for search and retrieval in the past, and it was further extended and optimised during the MPEG-7 development phase. Six different techniques were compared in the Core Experiments: Fourier-based techniques, shape representations based on Zernike moments and ART, turning angles, and wavelet-based techniques. It was concluded that the CSS-based descriptor offered the best overall performance for contour-based similarity. A detailed overview of the CSS representation and its use in the MPEG-7 standard is presented in [4].

5.1 The Curvature Scale-Space representation

The concept of the CSS representation is intuitively simple. It is based on the observation that, when comparing shapes, humans tend to decompose shape contours into concave and convex sections. The overall similarity is assessed from the similarity of the "corresponding" sections: for example, how prominent they are, their length relative to the contour length, and their position and order on the contour. The CSS representation likewise "decomposes" the contour into convex and concave sections by determining the inflection points (i.e. the points where the curvature is zero). This is done in a multi-resolution fashion, where the contour is analysed at various scales obtained by a smoothing process.

Figure 14 shows the shape of a fish (left) and its CSS image (right). There are three peaks in the CSS image (marked A, B and C), signifying that the fish contour has three concave parts (marked correspondingly on the contour). The CSS image also shows that the concave segments A and C are of approximately similar perceptual importance (because the peak heights are similar), while concavity B is less prominent. We can also deduce that concavity B is located in between concavities A and C, and we can find the relative distances between the inflection points.

Figure 14 Example of the CSS image representation for a fish contour

Extraction of the CSS representation involves calculating the curvature of a (usually digital) contour while progressively smoothing it. Figure 15 shows the shape evolution during filtering (left column), the corresponding curvature functions (middle column) and the CSS image (right column). The curvature of a contour is defined as the derivative of the tangent angle to the contour. Let us assume that the contour C is represented by a parametric vector equation c(u) = (x(u), y(u)), where x and y are the coordinates of the pixels belonging to the contour and u is an arbitrary parameter. For a closed planar contour, where u is the normalised arc length parameter, the curvature can be calculated as:

$$\kappa(u) = \dot{x}(u)\, \ddot{y}(u) - \ddot{x}(u)\, \dot{y}(u) \qquad (12)$$

where $\dot{x}(u)$ and $\ddot{x}(u)$ are the first and second order derivatives of x with respect to the parameter u (and analogously for y).

The smoothing is performed iteratively, and for each realisation of the contour (i.e. after each iteration) the zero crossings of the curvature function are computed. The filtering process continues until a convex contour results, that is, until there are no more zero crossings of the curvature function. Figure 15 shows three contours C1, C2 and C3, after increasing amounts of smoothing have been applied, together with their corresponding curvature functions (middle column). The zero crossings of the curvature function are marked on each contour. Contour C3 is convex, and therefore there are no zero crossings in its curvature function.

Each zero crossing has a two-dimensional vector (s, l) associated with it, where s is the amount of smoothing applied and l is the position of the zero crossing on the contour. l is expressed as the distance along the contour, circulating clockwise, from an arbitrarily chosen starting point to the inflection point. The CSS image is obtained by plotting all the zero crossing points on a 2-dimensional plane, where the vertical axis corresponds to the amount of filtering and the horizontal axis to the position on the contour. Thus each contour C1, C2, C3 defines a single horizontal line in the CSS image, shown in Figure 15 (right column).

Figure 15 Extraction of the CSS image for the fish contour of Figure 14
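A compact sketch of this process is given below, assuming the contour has already been resampled to N equidistant points. It follows the simple iterative procedure described here and in section 5.2: smooth with the (0.25, 0.5, 0.25) kernel, compute the curvature zero crossings of eq (12), and record them until the contour becomes convex.

```python
import numpy as np

def css_zero_crossings(x, y, max_iter=2000):
    """Return (position index, iteration) marks forming the CSS image of a
    closed contour sampled at N equidistant points; the arcs traced by these
    marks peak at the CSS maxima used by the descriptor."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    marks = []
    for it in range(1, max_iter + 1):
        # one pass of the (0.25, 0.5, 0.25) low-pass kernel, wrapped ends
        x = 0.25 * np.roll(x, 1) + 0.5 * x + 0.25 * np.roll(x, -1)
        y = 0.25 * np.roll(y, 1) + 0.5 * y + 0.25 * np.roll(y, -1)
        # central differences along the closed contour, then eq (12)
        dx = 0.5 * (np.roll(x, -1) - np.roll(x, 1))
        dy = 0.5 * (np.roll(y, -1) - np.roll(y, 1))
        ddx = np.roll(x, -1) - 2 * x + np.roll(x, 1)
        ddy = np.roll(y, -1) - 2 * y + np.roll(y, 1)
        kappa = dx * ddy - ddx * dy
        zc = np.nonzero(np.sign(kappa) != np.sign(np.roll(kappa, -1)))[0]
        if len(zc) == 0:                 # contour is now convex: stop
            break
        marks.extend((int(i), it) for i in zc)
    return marks
```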
5.2 Descriptor Syntax and Extraction

The descriptor consists of the eccentricity and circularity values of the original and filtered contours (each quantised to 6 bits), an index indicating the number of peaks in the CSS image (6 bits), the magnitude (height) of the largest peak (7 bits), and the x and y positions of the remaining peaks (9 bits per peak). The average size of the descriptor is 112 bits per contour.

Extraction of the circularity and eccentricity

The circularity of the contour is calculated using formula (13). It is uniformly quantised to 6 bits in the range [12, 110]; if the calculated value is above 110, it is clipped to 110.

$$\text{circularity} = \frac{\text{perimeter}^2}{\text{area}} \qquad (13)$$

The eccentricity is calculated as follows:

$$\text{eccentricity} = \sqrt{ \frac{ i_{20} + i_{02} + \sqrt{ i_{20}^2 + i_{02}^2 - 2 i_{20} i_{02} + 4 i_{11}^2 } }{ i_{20} + i_{02} - \sqrt{ i_{20}^2 + i_{02}^2 - 2 i_{20} i_{02} + 4 i_{11}^2 } } } \qquad (14)$$

where

$$i_{02} = \sum_{k=1}^{M} (y_k - y_c)^2, \qquad i_{11} = \sum_{k=1}^{M} (x_k - x_c)(y_k - y_c), \qquad i_{20} = \sum_{k=1}^{M} (x_k - x_c)^2,$$

M is the number of points inside the contour shape and $(x_c, y_c)$ is the centre of mass of the shape. The eccentricity is uniformly quantised to 6 bits in the range [1, 10]; if the calculated value is above 10, it is clipped to 10.

Extraction of the peak parameters

The CSS image does not have to be extracted explicitly; only the positions of its peaks are needed. Fast methods of extraction are described in [4]; here we outline a simple but slower procedure, for ease of understanding. First, N equidistant points are selected on the contour, starting from an arbitrary point and following the contour clockwise. The x-coordinates and y-coordinates of the selected N points are grouped into two series, X and Y. The smoothing of the contour is performed by repeated application of a low-pass filter with the kernel (0.25, 0.5, 0.25) to the X and Y coordinates of the selected contour points. As a result of the smoothing, the contour evolves and its concave parts gradually flatten out, until the contour becomes convex (no zero crossings of the curvature), at which point the iterations terminate. For each smoothed contour, the zero crossings of its curvature function are computed, and the CSS image is constructed by marking each zero crossing at the appropriate position.

The CSS image horizontal coordinates correspond to the indices of the contour points selected to represent the contour (1, ..., N), and the vertical coordinates correspond to the amount of filtering applied, defined as the number of passes of the filter. So, if the contour obtained after k = 20 iterations has a curvature zero crossing at the location corresponding to index l = 60, the pixel (60, 20) is marked. Once all zero crossings have been marked, the locations (x_css, y_css) of the prominent peaks in the CSS image are extracted. The location values are suitably normalised, so that they are invariant to the number of sampling points and to the size of the contour. The peaks are ordered by decreasing y_css value, transformed using a non-linear transformation, and quantised. The details of the quantisation are explained in [4].
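The two global parameters are straightforward to compute. Below is a small sketch of equations (13) and (14) for a shape given as a binary mask; treating the perimeter and area as precomputed inputs keeps the sketch independent of any particular contour-tracing method.

```python
import numpy as np

def circularity(perimeter, area):
    """Eq (13), clipped to the quantisation range [12, 110]."""
    return float(np.clip(perimeter**2 / area, 12.0, 110.0))

def eccentricity(mask):
    """Eq (14), from the central second-order moments of a binary mask
    whose nonzero pixels are the points inside the contour shape."""
    ys, xs = np.nonzero(mask)
    xc, yc = xs.mean(), ys.mean()
    i20 = float(np.sum((xs - xc) ** 2))
    i02 = float(np.sum((ys - yc) ** 2))
    i11 = float(np.sum((xs - xc) * (ys - yc)))
    root = np.sqrt(i20**2 + i02**2 - 2 * i20 * i02 + 4 * i11**2)
    ecc = np.sqrt((i20 + i02 + root) / (i20 + i02 - root))
    return float(min(ecc, 10.0))          # clipped to the range [1, 10]
```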
5.3 Properties of the contour shape descriptor

The contour shape descriptor has several very useful properties. First, the representation is invariant to orientation, scale and mirroring of the object contour, and it has been shown to be robust to noise on the contour [4]. Second, unlike the region shape descriptor, it retains local information about the input shape (e.g. the order of, and the neighbourhood relations between, the features of the shape). This is because every contour concavity and convexity has its own corresponding arc in the CSS image. Local information plays an important role in retrieval; for example, one can determine the number of concave segments on the contour just by counting the so-called peaks (maxima) in the CSS image. Such contour shape properties are important for the retrieval of semantically similar objects [1].

The descriptor is also very efficient in applications where high shape variability is expected, due, for example, to rigid or non-rigid deformations of the object, or to perspective deformations. It supports searching for shapes that humans find semantically similar, even when significant intra-class variability exists, such as in the images of beetles shown in Figure 5. Finally, many objects have distinguishing features in their contours while having similar region properties, and the descriptor can efficiently differentiate between such shapes. All of the above properties can be clearly seen in the experimental results described in the following section.

5.4 Experimental Results

The results obtained in Core Experiment 1, using the MPEG-7 XM implementation version 5.0, are summarised in Table 3. Very good results are obtained in the rotational and scaling invariance tests. However, the real strength of the contour shape descriptor is its performance in similarity-based retrieval (part B), where the average result is above 79%.

              CE1-A1    CE1-A2    CE1-B     CE1-C    Overall
CSS (XM)      91.03%    100%      79.15%    96%      90.22%

Table 3 CE-1 Core Experiment results for the contour shape descriptor

Figure 16 Example similarity-based retrieval results for the contour shape descriptor

Figure 16 shows example results of similarity-based retrieval for the butterfly-1 query image. The majority of butterfly images are retrieved correctly, despite significant variability in the shape of the wings, the size and spread of the antennae, and whether or not the body protrudes at the back. Images 16, 19 and 20 are errors, but even there some degree of similarity exists.

A class-by-class analysis of the performance of the shape descriptors in similarity-based retrieval (CE1-B) illustrates very well the differences between the region-based and contour-based notions of similarity. Let us consider the shape classes where the relative performance of one descriptor is better by 50% or more than that of the other, i.e. where one descriptor retrieved at least 50% more correct shapes for the given class, which is a very significant difference in performance. The contour shape descriptor outperforms the region shape descriptor by 50% or more for 17 classes; Figure 17 shows example shapes.

Figure 17 Examples of shapes where the contour descriptor significantly outperforms the region descriptor

The first class includes images of a running person, where significant non-rigid deformations occur; the contour-based technique is much more robust to such deformations. The fork (second object in Figure 17) is a good example of a shape whose characteristic features are conveyed by the contour: the region-based approach tends to confuse it with other elongated objects that expand at one end (e.g. spoon, hammer). Similarly, some characteristic features of the butterfly lie in its antennae, which have a relatively small area and are therefore not sufficiently emphasised by the region-based approach.
Finally, the classes of beetles, horses and elephants are examples where the natural variability of the shapes is much better captured by the contour-based approach.

The four classes where the region-based approach performs better are shown in Figure 18. The first, second and fourth shapes represent classes where the same "basic convex shape", such as a circle (Device9), a triangle (Device4) or a pentagon (Device6), has been altered by cutting out a variable number of long and narrow openings (refer to Figure 5, bottom row, for some shapes of the "pentagon" class). All such shapes have a similar distribution of pixels within their regions, but obviously very different contours, because the number and shape of the openings differ within each class. Finally, the third shape, a square with rounded corners, is convex, and therefore its contour carries very little characteristic information (no peaks in the CSS image); the contour-based technique tends to confuse it with other convex objects, for example with the circle.

Figure 18 Classes where the region shape descriptor significantly outperforms the contour shape descriptor

6 3D Shape Descriptor

The 3D Shape Spectrum Descriptor (3D SSD) captures characteristic shape features of 3D objects by exploiting local geometrical attributes of object surfaces, defined on their 3D mesh models. The shape spectrum has previously been used for indexing and retrieval of 2D images and 3D range data [13, 14].

6.1 Definition of the 3D Shape Descriptor

Before defining the shape spectrum, the concept of the shape index should be introduced. The shape index was first introduced by Koenderink [15] and is defined as a function of the two principal curvatures. Let p be a point on a regular 3D surface S, and let $k_p^1$ and $k_p^2$ denote the principal curvatures associated with the point p, with $k_p^1 \geq k_p^2$. The shape index at point p, denoted by $I_p$, is defined as:

$$I_p = \frac{1}{2} - \frac{1}{\pi} \arctan \frac{k_p^1 + k_p^2}{k_p^1 - k_p^2} \qquad (15)$$

The shape index is a local geometrical attribute of a 3D surface, expressed as the angular coordinate of a polar representation of the principal curvature vector. It ranges over the interval [0, 1] and is not defined for planar surfaces. The shape index provides a scale for representing salient elementary shapes such as convex, concave, rut, ridge and saddle (Figure 19), and it is invariant with respect to scale and Euclidean transforms.

Figure 19 Elementary shapes and their corresponding shape index (in brackets): (a) spherical cup (0.0); (b) rut (0.25); (c) minimal saddle (0.5); (d) ridge (0.75); (e) spherical cap (1.0)
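A direct transcription of equation (15) is short; the sketch below also flags the planar case, where the index is undefined, and handles the umbilic case (k1 = k2), where the arctangent argument diverges, by taking the limit implied by the formula.

```python
import math

def shape_index(k1, k2, eps=1e-12):
    """Shape index of eq (15) from principal curvatures; returns None for
    (near-)planar points, where the index is undefined."""
    if abs(k1) < eps and abs(k2) < eps:
        return None
    if k1 < k2:
        k1, k2 = k2, k1                # enforce k1 >= k2
    if k1 - k2 < eps:                  # umbilic: arctan argument diverges
        return 0.0 if (k1 + k2) > 0 else 1.0
    return 0.5 - (1.0 / math.pi) * math.atan((k1 + k2) / (k1 - k2))
```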
It may be recalled that the principal curvatures are defined as the eigenvalues of the Weingarten map W, given by the following expression:

$$W = I^{-1}\, II \qquad (16)$$

where I and II denote, respectively, the first and second fundamental differential forms. Considering a Cartesian coordinate system and a Monge parametrization S = (x, y, z) = (x, y, f(x, y)) of the surface, with f being C2-differentiable, the fundamental forms I and II can be expressed as symmetric, positive semi-definite matrices, given by:

$$I = \begin{pmatrix} \langle S_x, S_x \rangle & \langle S_x, S_y \rangle \\ \langle S_x, S_y \rangle & \langle S_y, S_y \rangle \end{pmatrix} = \begin{pmatrix} 1 + f_x^2 & f_x f_y \\ f_x f_y & 1 + f_y^2 \end{pmatrix}, \qquad II = \begin{pmatrix} \langle S_{xx}, N \rangle & \langle S_{xy}, N \rangle \\ \langle S_{xy}, N \rangle & \langle S_{yy}, N \rangle \end{pmatrix} = \frac{1}{\sqrt{1 + f_x^2 + f_y^2}} \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} \qquad (17)$$

where N denotes the normal vector to the surface at the considered surface point p, and the standard Monge notation is used for the partial derivatives of the surface S with respect to the variables x and y:

$$S_x = \frac{\partial S}{\partial x}, \quad S_y = \frac{\partial S}{\partial y}, \quad S_{xx} = \frac{\partial^2 S}{\partial x^2}, \quad S_{xy} = \frac{\partial^2 S}{\partial x \partial y}, \quad S_{yy} = \frac{\partial^2 S}{\partial y^2} \qquad (18)$$

The shape spectrum of a 3D mesh is defined as the distribution of the shape index calculated over the entire mesh. We note that the above definitions refer to surfaces in the continuous space $\mathbb{R}^3$; the extraction of the 3D SSD from discrete, polyhedral 3D meshes is described in the following section.

6.2 Extraction of the 3D SSD Descriptor

Many 3D models are expressed in the VRML format, which is primarily intended for graphics purposes. Before applying differential-geometry-based analysis to such data, it is necessary to transform it into orientable meshes, represented by a minimal number of connected components, without degenerate or overlapping polygons. To obtain the results presented here, a preliminary regularizing filtering of the 3D mesh models was applied [18]. Related approaches for repairing 3D mesh models to form manifold surfaces and solid regions are reported in [19-22]. In addition, in order to obtain the smoother representations necessary when considering surface derivatives, a subdivision algorithm based on Loop's subdivision scheme [23] was applied.

6.2.1 Estimation of the Principal Curvatures

Estimating the principal curvatures is the key step of the 3D SSD extraction: the performance of the 3D SSD strongly depends on the accuracy of the estimates and on the robustness of the estimation technique to the triangulation of the 3D model. Our approach is based upon second-degree polynomial surface fitting, with the principal curvatures associated with each face of the mesh.

First, the mean normal vector to a face $f_i$ of the mesh, denoted by $\tilde{N}_{f_i}$, is defined as the weighted average of the normal vectors of the 0-adjacent faces of the considered face $f_i$ (Equation 18a). Two faces of the mesh are said to be 0-adjacent if and only if they share a common vertex.

$$\tilde{N}_{f_i} = \frac{ \sum_{f_k \in F_0\{f_i\}} w_k N_{f_k} }{ \left\| \sum_{f_k \in F_0\{f_i\}} w_k N_{f_k} \right\| } \qquad (18a)$$

Here $F_0\{f_i\}$ denotes the set of 0-adjacent faces of $f_i$, $N_{f_k}$ denotes the normal vector to face $f_k$, $w_k$ is the weighting coefficient associated with the 0-adjacent face $f_k$, and $\|\cdot\|$ is the L2 norm of a vector in $\mathbb{R}^3$. Each weighting coefficient $w_k$ is set equal to the area of the face $f_k$, in order to take into account irregularities in the distribution of the mesh faces.
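For a triangle mesh stored as vertex and face index arrays, the area-weighted mean normal can be sketched as follows; the array layout and the vertex-sharing test are assumptions about the data representation, not part of the standard.

```python
import numpy as np

def face_normals_and_areas(V, F):
    """Unit normals and areas of the faces of a triangle mesh.
    V: (nv, 3) vertex positions; F: (nf, 3) vertex indices per face."""
    e1 = V[F[:, 1]] - V[F[:, 0]]
    e2 = V[F[:, 2]] - V[F[:, 0]]
    cross = np.cross(e1, e2)
    areas = 0.5 * np.linalg.norm(cross, axis=1)
    return cross / (2.0 * areas[:, None]), areas

def mean_normal(fi, V, F):
    """Area-weighted mean normal of eq (18a) over the faces sharing at
    least one vertex with face fi (the 0-adjacent faces, fi included)."""
    normals, areas = face_normals_and_areas(V, F)
    shared = (F[:, :, None] == F[fi][None, None, :]).any(axis=(1, 2))
    n = (areas[shared, None] * normals[shared]).sum(axis=0)
    return n / np.linalg.norm(n)
```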
A local Cartesian coordinate system (x, y, z) is then defined such that its origin coincides with the gravity centre of the face $f_i$ and the z-axis is aligned with the previously defined mean normal vector $\tilde{N}_{f_i}$. Let $W = \{(x_i, y_i, z_i)\}_{i=1}^{N}$ denote the cloud of points made up of the centroids of the considered face $f_i$ and of its 0-adjacent faces, with coordinates expressed in the local coordinate system. The parametric surface approximation is achieved by fitting a quadric surface through the set of points W.

Let $S = (x, y, z) = (x, y, f_a(x, y))$ denote a second-degree polynomial surface, expressed as:

$$f_a(x, y) = a_0 x^2 + a_1 y^2 + a_2 xy + a_3 x + a_4 y + a_5 \qquad (19)$$

where the $a_i$ are real coefficients. Denoting $a = (a_0\ a_1\ a_2\ a_3\ a_4\ a_5)^t$ and $b(x, y) = (x^2\ y^2\ xy\ x\ y\ 1)^t$, Equation (19) can be rewritten more compactly using standard matrix notation:

$$f_a(x, y) = b^t(x, y)\, a \qquad (20)$$

The parameter vector a is determined by applying a least-squares fit procedure. Given the data points $\{(x_i, y_i, z_i)\}_{i=1}^{N}$, with associated weights $\{w_i\}_{i=1}^{N}$, the parameter vector $\hat{a}$ corresponding to the optimal fit (in the weighted mean-square-error sense) is expressed as:

$$\hat{a} = \left( \sum_{i=1}^{N} w_i\, b(x_i, y_i)\, b^t(x_i, y_i) \right)^{-1} \sum_{i=1}^{N} w_i\, z_i\, b(x_i, y_i) = \arg \min_{a \in \mathbb{R}^6} \sum_{i=1}^{N} w_i \left( z_i - f_a(x_i, y_i) \right)^2 \qquad (21)$$

Note that representing the gravity centres $\{(x_i, y_i, z_i)\}_{i=1}^{N}$ in local coordinates guarantees the invariance of the approximation with respect to Euclidean transforms. Once the parametric approximation has been computed, the principal curvatures $k_1$ and $k_2$ can easily be determined as the eigenvalues of the Weingarten map W, since at (x, y) = (0, 0) the first and second fundamental forms I and II take the simple expressions given in Equation (22):

$$I = \begin{pmatrix} 1 + a_3^2 & a_3 a_4 \\ a_3 a_4 & 1 + a_4^2 \end{pmatrix}, \qquad II = \frac{1}{\sqrt{1 + a_3^2 + a_4^2}} \begin{pmatrix} 2 a_0 & a_2 \\ a_2 & 2 a_1 \end{pmatrix} \qquad (22)$$

Note that it is also possible to consider the set $F_n\{f_i\}$ of n-adjacent faces (n > 0), defined recursively from $F_0\{f_i\}$ by considering successively wider neighbourhoods. In this case, the computational complexity increases proportionally to the number of n-adjacent faces, without providing a significant improvement of the final estimate of the principal curvatures.

6.2.2 Computation of the 3D Shape Spectrum

The 3D SSD of a 3D mesh M is defined as the histogram of the shape index values calculated over the entire mesh. The histogram has $N_{bins}$ bins, uniformly quantizing the range of shape index values (the interval [0, 1]) into a set of intervals $\{\Delta_k\}_{k=1}^{N_{bins}}$, where:

$$\Delta_k = \left[ \frac{k-1}{N_{bins}},\ \frac{k}{N_{bins}} \right) \quad \text{for } k = 1, \ldots, N_{bins}-1, \qquad \Delta_{N_{bins}} = \left[ \frac{N_{bins}-1}{N_{bins}},\ 1 \right]$$

The relative area (with respect to the total area of the mesh) of each face whose shape index belongs to the interval $\Delta_k$ is added to the k-th component of the 3D SSD. Two additional components, denoted PlanarSurface and SingularSurfaces, representing the relative areas of planar and border patches respectively, are introduced in the 3D SSD. Planar surfaces must be considered separately, since the shape index is not defined for planar patches. Let $k_a$ denote the L2 norm of the curvature vector $(k_1, k_2)$, i.e. $k_a = \sqrt{k_1^2 + k_2^2}$. The degree of curvedness of the surface is defined as $C = \text{Area}(F_0\{f_i\}) \cdot k_a$. Each face with curvedness C less than a threshold T is considered to be part of a planar patch, and its relative area increments the PlanarSurface component of the 3D SSD. Unreliable estimates of the principal curvatures may occur for border faces with fewer than a predefined number $N_{min}$ of 0-adjacent faces; hence, border faces are not taken into account when computing the shape index, and their relative area is stored in the SingularSurfaces component.
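The per-face computation of equations (19)-(22), followed by the histogram accumulation, can be sketched with a weighted least-squares solve; shape_index from the earlier sketch converts the resulting eigenvalues into a spectrum entry. The points are assumed to be already expressed in the local face coordinate system.

```python
import numpy as np

def principal_curvatures(pts, weights):
    """Fit the quadric of eq (19) to local points (N >= 6 rows of x, y, z)
    and return the eigenvalues of the Weingarten map W = I^-1 II at the
    origin, built from the fitted coefficients as in eq (22)."""
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    B = np.stack([x**2, y**2, x * y, x, y, np.ones_like(x)], axis=1)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    a, *_ = np.linalg.lstsq(B * sw[:, None], z * sw, rcond=None)   # eq (21)
    a0, a1, a2, a3, a4, _ = a
    I = np.array([[1 + a3**2, a3 * a4],
                  [a3 * a4, 1 + a4**2]])
    II = np.array([[2 * a0, a2],
                   [a2, 2 * a1]]) / np.sqrt(1 + a3**2 + a4**2)
    k1, k2 = sorted(np.linalg.eigvals(np.linalg.inv(I) @ II).real, reverse=True)
    return k1, k2

def shape_spectrum(index_values, face_areas, n_bins=100):
    """3D SSD core: area-weighted histogram of shape index values over [0, 1]
    (the PlanarSurface and SingularSurfaces components are kept separately)."""
    hist, _ = np.histogram(index_values, bins=n_bins, range=(0.0, 1.0),
                           weights=face_areas)
    return hist / np.sum(face_areas)
```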
6.2.3 Example similarity measure

The similarity measure is the L1 or L2 norm computed on the vectors formed by the SSD histograms. Such similarity measures may or may not take into account the planar or singular surface components, depending on the target application.

The 3D SSD has been evaluated within the MPEG-7 Core Experiments (CEs). The results presented in the next section are those obtained and reported within the framework of the MPEG-7 3D Shape CE over 5 successive MPEG meetings (October 1999 to July 2000) [24-29].

6.3 Experimental Results

The 3D SSD has been evaluated on a data set consisting of 1300 3D meshes. To facilitate objective evaluation, a subset consisting of 228 models was selected and categorized into 15 distinct categories, based on the shape similarity perceived by the participants in the CE. The L1-based norm was used, the singular and planar histogram bins were not taken into account, and the 3D SSD was renormalized such that

$$\sum_{i=1}^{N_{bins}} SSD(i) = 1$$

Figures 20 and 21 show examples of similarity retrieval for queries from the "Cars" (Figure 20.a), "Aerodynamic" (Figure 20.b), "4-Limbs" (Figures 21.a and 21.b) and "Trees" (Figure 21.c) categories. The first 6 retrieved results are presented in decreasing similarity order, from left to right and from top to bottom, with the query shown in the upper left cell. Figures 21.a and 21.b show that good results are obtained for articulated objects in the "4-Limbs" category, where elements in various postures are retrieved (the humanoid with arms and legs spread out in the second row of Figure 21.a, the sitting dog in Figure 21.b).

For an objective evaluation of the retrieval performance of the proposed 3D SSD, the Bull's Eye Percentage (BEP) score was computed. The BEP is defined for each query as the percentage of correct matches (i.e. items within the same category as the query) among the 2Q top-ranked results, where Q denotes the size of the query's category. After computing the BEP scores for each categorized item, an average BEP is computed for each category. The global mean score of 85% demonstrates the good discriminatory properties of the 3D SSD.

Figure 20 Similarity retrieval results for queries within the "Cars" (a) and "Aerodynamics" (b) categories

Figure 21 Similarity retrieval results for queries within the "4-Limbs" (a, b) and "Trees" (c) categories

In order to reduce the complexity of the 3D SSD as much as possible, the effects on retrieval performance of uniformly quantizing the 3D SSD values and of decreasing the number of histogram bins were studied. Experimental results show that quantizing the values to 12 bits yields hit rates approaching the performance of the floating-point representation. More compact representations, down to 8 or 7 bits, are also possible, with about 3% degradation of the mean retrieval rate. Concerning the number of bins, good results are obtained in a range from 100 down to 10 histogram bins, with a degradation of the mean retrieval rate of only 5%. Thus, a compact representation of the 3D shape spectrum descriptor can be used while preserving near-optimal retrieval rates.
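A sketch of the matching used in these experiments (L1 distance on renormalized histograms, with the planar and singular components optionally excluded) is given below; storing PlanarSurface and SingularSurfaces as the last two components is an assumption about the layout, not a mandated format.

```python
import numpy as np

def ssd_distance(h1, h2, use_extra_components=False):
    """L1 distance between two 3D SSD histograms whose last two components
    are assumed to be PlanarSurface and SingularSurfaces. When these are
    excluded, the remaining bins are renormalized to sum to 1 (section 6.3)."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    if not use_extra_components:
        h1, h2 = h1[:-2], h2[:-2]
        h1, h2 = h1 / h1.sum(), h2 / h2.sum()
    return float(np.abs(h1 - h2).sum())
```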
7 Combining the Multiple-View and Contour Shape Descriptors

The Multiple-View descriptor has been developed to combine 2D descriptors representing a visual feature of a 3D object seen from different viewing angles [10]. The descriptor forms a complete 3D view-based representation of the object. Any 2D visual descriptor, such as the contour shape, region shape, colour or texture descriptor, can be used. The Multiple-View descriptor thus supports the integration of 2D descriptors, computed in the image plane, to describe features of 3D (real-world) objects. Experiments combining the Multiple-View descriptor with the contour-based shape descriptor have shown good performance in the multi-view description of 3D shapes [12].

8 Example Applications of the MPEG-7 Shape Descriptors

We now consider two example applications of the MPEG-7 shape descriptors: a cartoon search engine and a trademark search system. A trademark retrieval system based on visually salient shape features is described in [30].

Cartoon Search Engine

Figure 22 shows an example of a Web-based cartoon clip search application which uses MPEG-7 visual descriptors. Short cartoon stories are stored in MPEG-4 format on a server. The characters present in each key frame are segmented, and their contour shape and dominant colour descriptors are extracted and stored as MPEG-7 descriptions. The user selects the required cartoon character from the group of images in the upper part of the display; alternatively, the visual properties of the sought character can be drawn or selected with a mouse. A search is performed on the descriptions stored on the servers, and key frames from the video clips containing similar characters are displayed to the user (lower row of images). The user can then select the required clip, which is transmitted over the network and played back. The combination of the dominant colour and contour shape descriptors gives robust performance even in the case of partial occlusion, as can be seen in two of the retrieved key frames in Figure 22.

Figure 22 GUI of the shape and colour cartoon search engine

9 Conclusions

A coherent set of shape descriptors has been developed as an international collaborative effort, and it now forms a key part of the MPEG-7 Visual specification. Extensive tests have shown that the descriptors offer state-of-the-art retrieval performance and excellent generalisation properties, similar to those of the human visual system. High compactness and expressive efficiency, combined with fast extraction and matching, should satisfy even the most demanding real-time applications. We expect to see a wide range of novel devices and applications based on MPEG-7 shape description technology in the near future. Some very early examples were presented here, but there is clearly much more to come.

ACKNOWLEDGEMENTS

The authors would like to thank all participants in the MPEG "shape experimentation group", especially W.S. Kim, K. Müller, J.-R. Ohm, J. Cooper, T. Zaharia, H.-K. Kim and J.D. Kim, for their technical contributions towards the development of the MPEG shape descriptors.
References

[1] S. Ullman, "High-Level Vision", The MIT Press, 1997.
[2] Text of ISO/IEC 15938-3/FCD Information Technology - Multimedia Content Description Interface - Part 3: Visual, ISO/IEC JTC1/SC29/WG11/N4062, Singapore, March 2001.
[3] S. Jeannin and M. Bober, "Description of Core Experiments for Motion/Shape", ISO/IEC JTC1/SC29/WG11/N2690, Seoul, March 1999.
[4] F. Mokhtarian and M. Bober, "The Curvature Scale Space Representation: Theory, Applications, and MPEG-7 Standardization", Kluwer Academic Publishers, to appear in 2001.
[5] J.J. Koenderink and A.J. van Doorn, "Surface shape and curvature scales", Image and Vision Computing, Vol. 10, No. 8, 1992.
[6] W.-Y. Kim and Y.-S. Kim, "A new region-based shape descriptor", ISO/IEC JTC1/SC29/WG11, MPEG99/M5472, Maui, December 1999.
[7] C.-H. Teh and R.T. Chin, "On Image Analysis by the Methods of Moments", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10, No. 4, July 1988, pp. 496-513.
[8] M. Bober, "Shape descriptor based on Curvature Scale Space", MPEG-7 proposal P320, Lancaster, UK, February 1999.
[9] J.D. Kim and H.-K. Kim, "Shape Descriptor based on Multi-Layer Eigen Vector", MPEG-7 proposal P517, Lancaster, UK, February 1999.
[10] F. Mokhtarian and A.K. Mackworth, "A theory of multiscale, curvature-based shape representation for planar curves", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 14, No. 8, August 1992, pp. 789-805.
[11] J.G. Leu, "Shape Normalization Through Compacting", Pattern Recognition Letters, Vol. 10, pp. 243-250, 1989.
[12] K. Müller, J.-R. Ohm, J. Cooper and M. Bober, "Results of 2D/3D Shape Core Experiments MS-4", ISO/IEC JTC1/SC29/WG11, MPEG00/M6190, Beijing, China, July 2000.
[13] C. Nastar, "The Image Shape Spectrum for Image Retrieval", INRIA Technical Report RR-3206, July 1997.
[14] C. Dorai and A.K. Jain, "Shape Spectrum-Based View Grouping and Matching of 3D Free-Form Objects", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 10, pp. 1139-1145, 1997.
[15] J. Koenderink, "Solid Shape", The MIT Press, Cambridge, Massachusetts, 1990.
[16] M. do Carmo, "Differential Geometry of Curves and Surfaces", Prentice-Hall Inc., Englewood Cliffs, New Jersey, 1976.
[17] M. Spivak, "A Comprehensive Introduction to Differential Geometry", 2nd ed., Publish or Perish, Houston, 1979.
[18] P. Hoogvorst, "Filtering of the 3D mesh models", ISO/IEC JTC1/SC29/WG11, MPEG00/M6101, Geneva, June 2000.
[19] A. Gueziec, G. Taubin, F. Lazarus and W. Horn, "Converting Sets of Polygons to Manifold Surfaces by Cutting and Stitching", IEEE Visualization '98, 1998.
[20] T.M. Murali and T.A. Funkhouser, "Consistent Solid and Boundary Representations", Computer Graphics (1997 SIGGRAPH Symposium on Interactive 3D Graphics), pp. 155-162, March 1997.
[21] G. Butlin and C. Stops, "CAD Data Repair", Proceedings of the 5th International Meshing Roundtable, October 1996.
[22] G. Barequet and S. Kumar, "Repairing CAD Models", Proceedings of IEEE Visualization '97, October 1997.
[23] C.T. Loop, "Smooth Subdivision Surfaces Based on Triangles", Master's Thesis, Department of Mathematics, University of Utah, August 1987.
[24] T. Zaharia, F. Preteux and M. Preda, "3D Shape spectrum descriptor", ISO/IEC JTC1/SC29/WG11, MPEG99/M5242, Melbourne, Australia, October 1999.
[25] T. Zaharia and F. Preteux, "3D Shape descriptors: Results and performance evaluation", ISO/IEC JTC1/SC29/WG11, MPEG99/M5592, Maui, December 1999.
[26] T. Zaharia and F. Preteux, "Cross-check of the 3D shape spectrum descriptor", ISO/IEC JTC1/SC29/WG11, MPEG00/M5917, Noordwijkerhout, The Netherlands, March 2000.
[27] T. Zaharia and F. Preteux, "3D Shape Core Experiment: The influence of mesh representation", ISO/IEC JTC1/SC29/WG11, MPEG00/M6103, Geneva, June 2000.
[28] T. Zaharia and F. Preteux, "3D Shape Core Experiment: Semantic versus geometric categorization of 3D mesh models", ISO/IEC JTC1/SC29/WG11, MPEG00/M6104, Geneva, Switzerland, June 2000.
[29] T. Zaharia and F. Preteux, "The influence of the quantization step on the 3D shape spectrum descriptor performances", ISO/IEC JTC1/SC29/WG11, MPEG00/M6316, Beijing, China, July 2000.
[30] W.S. Kim and W.Y. Kim, "Content-Based Trademark Retrieval System Using Visually Salient Features", IEEE Conference on Computer Vision and Pattern Recognition, 1997, pp. 307-312.