A Comparison of Measures for Detecting Natural Shapes in Cluttered Backgrounds
International Journal of Computer Vision 34(2/3), 81–96 (2000)
© 2000 Kluwer Academic Publishers. Manufactured in The Netherlands.

LANCE R. WILLIAMS∗
Dept. of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA

KARVEL K. THORNBER
NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, USA

∗ This paper describes research performed while the first author was at NEC Research Institute.

Abstract. We propose a new measure of perceptual saliency and quantitatively compare its ability to detect natural shapes in cluttered backgrounds to five previously proposed measures. As defined in the new measure, the saliency of an edge is the fraction of closed random walks which contain that edge. The transition-probability matrix defining the random walk between edges is based on a distribution of natural shapes modeled by a stochastic motion. Each of the saliency measures in our comparison is a function of a set of affinity values assigned to pairs of edges. Although the authors of each measure define the affinity between a pair of edges somewhat differently, all incorporate the Gestalt principles of good-continuation and proximity in some form. In order to make the comparison meaningful, we use a single definition of affinity and focus instead on the performance of the different functions for combining affinity values. The primary performance criterion is accuracy. We compute false-positive rates in classifying edges as signal or noise for a large set of test figures. In almost every case, the new measure significantly outperforms previous measures.

Keywords: contours, saliency, closure, grouping

1. Introduction

The goal of segmentation is to partition a set of image measurements (e.g., edges) into equivalence classes corresponding to distinct objects. In this paper, we consider a somewhat simpler grouping problem which (following Shashua and Ullman, 1988) we call the saliency problem. The goal of the saliency problem is to assign a value to each edge which is correlated with whether that edge belongs to a shape or is background noise. Given the distribution of saliency values, it is then often possible to choose a threshold which will segment the edges into shape and noise classes.

Each of the saliency measures proposed in the literature is a function of a set of affinity values assigned to pairs of edges. Although the authors of each measure define the affinity between a pair of edges somewhat differently, all incorporate the Gestalt principles of good-continuation and proximity in some form. A saliency function maps the set of affinities between all pairs of oriented or directed edges (i.e., the affinity matrix) to a saliency vector. In this paper, we have chosen to compare the definitions of saliency, not affinity. The differences in the authors' definitions of affinity prevent a direct comparison since each requires its own set of parameters. The choice was either to (1) optimize the performance of each measure over its required parameters and compare the different measures with each using its optimal parameter setting; or (2) replace the individual affinity functions with a single function and compare performance using a single parameter setting. Apart from requiring an impractical amount of work, the first approach has the disadvantage of confounding the definitions of affinity and saliency so that the relative merits of each are difficult to
disentangle. The shortcoming of the second approach (which is the one we adopted) is that while providing the best comparison of the saliency functions, it says nothing about the relative merits of the affinity functions. Although unlikely, it also ignores possible dependencies between the specific affinity and saliency functions used in a given measure.

The affinity functions can be divided into three classes. Functions in the first class are based on co-circularity (Guy and Medioni, 1996; Hérault and Horaud, 1993; Parent and Zucker, 1989; Ullman, 1976). The disadvantage of these functions is that they are non-generic: a circle does not have sufficient degrees of freedom to smoothly join two arbitrary positions and orientations in the plane. They are also difficult to motivate using arguments based on the statistics of natural shapes. Functions in the second class are based on curves of least energy (Horn, 1981; Shashua and Ullman, 1988). The affinity between two directed edges is inversely related to the energy, $\int_\Gamma ds\,(\alpha\kappa^2(s) + \beta)$, in the curve of least energy joining the two edges. Functions in the third class are based on an explicit prior probability distribution of natural shapes modeled by a stochastic motion (Mumford, 1994; Thornber and Williams, 1996; Williams and Jacobs, 1997). A particle travels with constant speed in the direction $\theta(t)$. Change in direction is a normally distributed random variable with zero mean. Consequently, $\theta(t)$ is a Brownian motion. The variance of the random variable reflects the prior expectation of smoothness. In addition, a constant fraction of particles decay per unit time. The half-life reflects the prior expectation of shortness. The affinity between two edges, $i$ and $j$, is defined as the sum of the probabilities over all paths joining the two edges, i.e., $P'(j \mid i)$. Curves of least energy and stochastic motions are closely related. In fact, it is possible to show that the energy of the curve of least energy is a linear function of the log-likelihood of the maximum likelihood stochastic motion (Mumford, 1994; Williams and Jacobs, 1997). It follows that the function $P'(j \mid i)$ behaves very similarly to $\exp[-\int_\Gamma dt\,(\alpha\kappa^2(t) + \beta)]$ when $\Gamma$ is a curve of least energy. This is because the probability associated with $\Gamma$ (and curves of similar shape) dominates the probabilities summed over all paths.

To facilitate the exposition, we will introduce a single nomenclature for describing all of the saliency measures. One of the major differences between the measures is whether they are formulated using orientations or directions. Every edge has a single orientation and two distinct directions. Its orientation, $\theta$, is an angular quantity in the range, $0$ to $\pi$. Its directions are $\theta$ and $\theta + \pi$. We will use $x$ to represent a vector of values associated with edge directions and $y$ to represent a vector of values associated with edge orientations. If a stimulus contains $n$ edge segments then the vector, $x$, has $2n$ components while the vector, $y$, has $n$ components (see Fig. 1). The vectors $x$ and $\bar{x}$ are identical except for a permutation which exchanges the values associated with opposite directions, i.e., $x_i = \bar{x}_{\bar{\imath}}$ and $x_{\bar{\imath}} = \bar{x}_i$. For example, if $x_i$ is a value associated with an edge of direction, $\theta$, then $\bar{x}_i$ is a value associated with the same edge, but in the opposite direction, $\theta + \pi$ (see Fig. 2).

Figure 1. Every edge has a single orientation and two distinct directions. Its orientation, $\theta$, is an angular quantity in the range, $0$ to $\pi$. Its directions are $\theta$ and $\theta + \pi$. Left: A vector, $y$, with $n$ components, can be used to associate a single value, $y_i$, with each of $n$ edges. Right: A vector, $x$, of length $2n$, can be used to associate two values, $x_i$ and $x_{\bar{\imath}}$, with each of $n$ edges. These values correspond to the two distinct directions.

Figure 2. The vectors, $x$ and $\bar{x}$, are identical except for a permutation which exchanges the values associated with opposite directions. That is, $x_i = \bar{x}_{\bar{\imath}}$ and $x_{\bar{\imath}} = \bar{x}_i$.

All of the saliency measures in our comparison associate an affinity value with a pair of oriented or directed edges. We will use the $n \times n$ matrix $A$ to represent the affinities between all pairs of oriented edges and the
$2n \times 2n$ matrix $P$ to represent the affinity values between all pairs of directed edges (see Fig. 3). An important distinction between saliency measures based on orientation and those based on direction involves the symmetry (or non-symmetry) of the affinity matrices. While the affinity oriented edge $i$ has for $j$ equals the affinity that oriented edge $j$ has for $i$, this does not (generally) hold for directed edges. Basically, $A_{ij} = A_{ji}$ (i.e., $A = A^T$) but in general, $P_{ij} \neq P_{ji}$.

Figure 3. Left: The $n \times n$ matrix $A$ can be used to represent the affinity values between all pairs of oriented edges. Right: The $2n \times 2n$ matrix, $P$, is needed to represent the affinity values between all pairs of directed edges. While the affinity oriented edge $i$ has for $j$ equals the affinity that oriented edge $j$ has for $i$, this does not (generally) hold for directed edges. Basically, $A_{ij} = A_{ji}$ (i.e., $A = A^T$) but in general, $P_{ij} \neq P_{ji}$. Although not symmetric, $P$ exhibits another kind of symmetry. If we use the subscript $\bar{\imath}$ to denote the opposite direction to $i$ then $P_{ij} = P_{\bar{\jmath}\bar{\imath}}$ and $P_{ji} = P_{\bar{\imath}\bar{\jmath}}$. This is termed time-reversal symmetry.

Although not symmetric, $P$ exhibits another kind of symmetry. If we use the subscript $\bar{\imath}$ to denote the opposite direction to $i$ then $P_{ij} = P_{\bar{\jmath}\bar{\imath}}$ and $P_{ji} = P_{\bar{\imath}\bar{\jmath}}$. This is termed time-reversal symmetry. For the purposes of our comparison, we will define $A_{ij}$ to be $\max(P_{ij}, P_{\bar{\imath}j}, P_{i\bar{\jmath}}, P_{\bar{\imath}\bar{\jmath}})$, that is, the affinity between two orientations is defined to be the maximum of the affinities among all combinations of directions.

2. Saliency Measures

In the following section, we provide short synopses of the saliency measures used in the comparison.

2.1. Shashua and Ullman (SU)

Shashua and Ullman (1988) were the first to use the term saliency in the sense that it is being used in this paper. Building on earlier work by Montanari (1971), they described a saliency measure which could be computed by a local parallel network.¹ Using our nomenclature, the saliency of a network element (one for each position and direction) at time $t + 1$ is related to the saliencies of neighboring elements at time $t$ by the following update equation:

$$
x_i^{(t+1)} = 1 + \max_j P_{ij}\, x_j^{(t)}
$$

Recently, Alter and Basri (1996) have done an extensive analysis of Shashua and Ullman's method and give expressions for the saliency measure for the case of continuous curves.² The saliency of a directed edge $i$ equals the maximum of the saliencies of all continuous curves, $\Gamma$, which begin at that edge:

$$
\Phi(i) = \max_{\Gamma \in C(i)} \Phi(\Gamma)
$$

The saliency of a continuous curve, $\Gamma$, is given by the following expression:

$$
\Phi(\Gamma) = \int_{s_1}^{s_n} ds\, \sigma(s) \cdot \rho^{\int_{s_1}^{s} dt\,(1 - \sigma(t))} \cdot e^{-\int_{s_1}^{s} dt\, \kappa^2(t)}
$$

where $\sigma(\cdot)$ is an indicator function which equals one where the curve lies on an edge element (and equals zero elsewhere), $\rho$ is a parameter in the interval $[0, 1)$ which controls the rate of convergence, and $\kappa^2(\cdot)$ is the square of the curvature. The overall effect is that the Shashua-Ullman measure favors long smooth curves containing only a few short gaps. For the special case of a curve threading a sequence of $n$ edges of negligible length, we have the following simplification:

$$
\Phi(\Gamma) = \sum_{i=1}^{n} e^{-\int_{s_1}^{s_i} dt\,(\kappa^2(t) - \ln \rho)}
$$

We observe that the first two terms of this series will dominate all subsequent terms unless the radius of curvature is large and the curve is densely sampled.³ Consequently, for visual patterns consisting of a sparsely sampled curve in a background of noise, the Shashua and Ullman measure becomes local and greedy. We will see that this seriously limits its performance on such patterns in the presence of correlated noise.

2.2. Hérault and Horaud (HH)

Hérault and Horaud (1993) cast the problem of segmenting a set of oriented edges into figure and ground as a quadratic programming problem which is solved by
simulated annealing. The objective function consists of two terms, $-E_{\mathrm{saliency}} - E_{\mathrm{constraint}}$:

$$
\min_y \left( -\frac{1}{2}\, y^T H y - b^T y \right) \quad \text{for } y \in \{-1, +1\}^n
$$

where $H_{ij} = A_{ij} - \alpha$ and $b_i = \sum_j (A_{ij} - \alpha)$. The affinity function used by Hérault and Horaud is based on co-circularity, smoothness and proximity. Hérault and Horaud say only that $\alpha$ is a parameter related to the signal-to-noise ratio but do not say how it is chosen or provide the value they used in their experiments. Experimentally, we have found that their method is very sensitive to the choice of this parameter. If $\alpha$ is too large, the solution consists of all $-1$'s (i.e., all ground) while if it is too small it consists of all $+1$'s (i.e., all figure). Determining the proper value of $\alpha$ makes the job of fairly comparing Hérault-Horaud with measures lacking a comparable parameter difficult. Therefore (for the comparison) we decided to maximize $E_{\mathrm{saliency}}$ over 0–1 solution vectors with exactly $m$ components equal to 1:

$$
\max_y\; y^T A y \quad \text{for } y \in \{0, 1\}^n \text{ and } y^T y = m
$$

where $m$ is the number of figure edges and $n$ is the total number of edges. Although in a real application, we would generally not know the value of $m$, we do know this value for all of our test patterns. For this reason, the modified problem should provide a lower bound on the false-positive rate for the Hérault-Horaud measure.

2.3. Sarkar and Boyer (SB)

Sarkar and Boyer (1996) describe a saliency measure and apply it to the problem of distinguishing developed and undeveloped land in aerial images. Although this is a somewhat different application than the one considered in this paper, the similarity between Sarkar and Boyer's computation and our own makes a comparison worthwhile. In addition to good-continuation and proximity, Sarkar and Boyer's affinity function incorporates pairwise measures useful for detecting clusters of buildings such as parallelism and perpendicularity. The affinity function we used in the comparison is the same one we used with the other methods (i.e., equivalent to only a subset of the relations proposed by Sarkar and Boyer). Given an affinity matrix, $A$, Sarkar and Boyer propose that the saliency vector, $y$, maximizes the Rayleigh Quotient:

$$
\frac{y^T A y}{y^T y}
$$

When $A$ is symmetric, the Rayleigh Quotient is maximized by the eigenvector, $y$, associated with the largest positive real eigenvalue of $A$:

$$
\lambda y = A y
$$

This measure can also be optimized using the following recurrence equation:

$$
y_i^{(t+1)} = \frac{\sum_j A_{ij}\, y_j^{(t)}}{\sum_j \sum_k A_{jk}\, y_k^{(t)}}
$$

which has been independently proposed as a saliency computation by Yen and Finkel (1996) and by Perona and Freeman (1998). From linear algebra, we know that the vector $y$ will converge to the eigenvector associated with the largest positive real eigenvalue of $A$. Viewed this way, we see that $A$ is being used as a linear relaxation labeling operator and that maximizing the Rayleigh Quotient is equivalent to solving a linear relaxation labeling problem as defined by Rosenfeld, Hummel and Zucker (1976).

2.4. Guy and Medioni (GM)

Guy and Medioni (1996) describe a saliency computation which involves the summation of vector voting patterns based on co-circularity and proximity. The distribution of votes which accumulate at a point in the plane is represented by its $2 \times 2$ covariance matrix. The predominant orientation at a point is determined by the eigenvector of the covariance matrix with largest eigenvalue. Neglecting the clever device of representing the vote distributions by their covariance matrices, it is possible to interpret Guy and Medioni's voting patterns as representing the correlation between orientations at different locations in the image plane. In our nomenclature, the saliency at an edge would be the sum of the voting patterns due to all other edges:

$$
y_i = \sum_j A_{ij}
$$

which is essentially one iteration of linear relaxation labeling using the operator $A$ and a constant input vector.
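As a rough illustration of Sections 2.3 and 2.4 (not code from the paper), the following sketch computes the GM row-sum saliency and iterates the normalized SB recurrence on a small symmetric affinity matrix. The matrix entries, the function names `gm_saliency`, `sb_saliency` and `rayleigh_quotient`, and the iteration count are all assumptions chosen for illustration.

```python
# Illustrative sketch only: the affinity values and function names below
# are hypothetical and are not taken from the paper.

def gm_saliency(A):
    """GM-style saliency: y_i = sum_j A[i][j] (one pass with a constant input)."""
    return [sum(row) for row in A]


def sb_saliency(A, iterations=200):
    """Iterate y <- A y / sum(A y), the normalized recurrence of Section 2.3."""
    n = len(A)
    y = [1.0 / n] * n  # constant starting vector
    for _ in range(iterations):
        Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(n)]
        total = sum(Ay)  # equals sum_j sum_k A[j][k] * y[k]
        y = [v / total for v in Ay]
    return y


def rayleigh_quotient(A, y):
    """Return y^T A y / y^T y."""
    n = len(A)
    num = sum(y[i] * A[i][j] * y[j] for i in range(n) for j in range(n))
    return num / sum(v * v for v in y)


# Three mutually supporting edges (0, 1, 2) and one weakly connected edge (3).
A = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]

gm = gm_saliency(A)
sb = sb_saliency(A)
```

In this toy example both `gm` and `sb` give the weakly connected edge 3 the smallest value, and `rayleigh_quotient(A, sb)` is at least as large as the quotient for a constant vector, in line with the eigenvector interpretation given above.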
2.5. Williams and Jacobs (WJ)

Williams and Jacobs (1997) describe a method for computing a representation of illusory contours and occluded surface boundaries which they call a stochastic completion field. The magnitude of the stochastic completion field at $(u, v, \phi)$ is the probability that a particle following a stochastic motion (representing the prior distribution of boundary completion shapes) will pass through $(u, v, \phi)$ on a path joining two boundary fragments. Although not portrayed as a saliency measure, it is easy and natural to use this method to compute saliency. The saliency of an edge is defined to be the probability that a particle following a stochastic motion will pass through that edge on a path joining two others. The saliency vector is given by the (component-wise) product of $x$ and $\bar{x}$ where each component of $x$ is:

$$
x_i = \sum_j P_{ij}
$$

The value of $x_i$ is the probability that a particle will reach directed edge $i$ from some other edge $j$. The saliency of $i$ is just $x_i$ multiplied by $\bar{x}_i$ (i.e., the probability that a particle will reach the same edge but with opposite direction). From time-reversal symmetry, we see that this equals the probability that a particle starting at any edge will pass through edge $i$ and eventually reach another edge.⁴

3. A New Measure (WT)

We define the salience of an edge to be the relative number of closed random walks which visit that edge. By random walk, we mean a sequence of edges visited by a particle with edge-to-edge transition probabilities given by the matrix, $P$. By closed random walk, we mean a random walk which begins and ends at the same edge. Although it is important to distinguish between (1) random walks; and (2) the paths in the plane followed by a particle when $\theta(t)$ is a Brownian motion, these two notions are intimately related. This is because the probability that a particle located at edge $i$ at time-step $t$ will be at edge $j$ at time-step $t + 1$ is defined to be the sum over the probabilities of all paths between $i$ and $j$, i.e., $P_{ji} \equiv P'(j \mid i)$. It follows that the distribution of random walks of length $t + 1$ is related to the distribution of random walks of length $t$ through multiplication by the matrix, $P$. If $x_i^{(t)}$ represents the fraction of random walks of length $t$ which end at directed edge $i$, then $x_i^{(t+1)}$ (i.e., the fraction of length $t + 1$ random walks), is given by the following recurrence equation:

$$
x_i^{(t+1)} = \frac{\sum_j P_{ij}\, x_j^{(t)}}{\sum_j \sum_k P_{jk}\, x_k^{(t)}}
$$

The $\sum_j \sum_k P_{jk}\, x_k^{(t)}$ term in the denominator is a normalization factor. Without the normalization after each step, the vector $x$ would quickly approach zero, because random walks of increasing length have decreasing probability. In the steady-state, this normalization factor equals $\lambda$:

$$
\lambda x = P x
$$

where the eigenvector, $x$, represents the fraction of random walks located at any given edge and the eigenvalue, $\lambda$, represents the ratio of the number of random walks which reach one more edge to the number which drift off or die in every step of the random process.⁵ In the steady-state, the variation of the eigenvalue equals zero (i.e., $\delta\lambda/\delta x = 0$) and the eigenvalue itself is given by the following equation:

$$
\lambda = \frac{\bar{x}^T P x}{\bar{x}^T x}
$$

While there are other expressions for $\lambda$, the above expression is significant because it makes explicit the relationship between error in $x$ and error in $\lambda$.⁶ However, unlike the Rayleigh Quotient for symmetric matrices, $x$ does not maximize this expression. Furthermore, although $x$ is a fixed-point, there is no guarantee that a process which starts at a random vector and repeatedly applies the recurrence equation will converge to $x$.⁷

Recall that our stated goal was to compute the relative number of closed random walks through every edge. We defined a closed random walk to be a random walk which begins and ends at the same edge. Unfortunately, because $P$ is not a Markov matrix, the eigenvector with largest positive real eigenvalue does not represent a distribution of closed random walks; only a small subset of the random walks contributing to $x$ are actually closed. The problem which concerns us is how to characterize this subset.

To begin, we observe that any random walk of infinite length beginning and ending at edge $i$ can be exhaustively decomposed into an infinite number of closed random walks each of which visits edge $i$ exactly once. Because (after cyclic permutation) any visit to edge $i$ is also the beginning and ending state of an infinite length random walk, we conclude that the relative number of closed random walks which visit edge
$i$ and the relative number of random walks of infinite length which begin and end at edge $i$ are in one-to-one correspondence. The following expression gives the relative number of random walks of infinite length which begin and end at edge $i$:

$$
c_i = \lim_{t \to \infty} \frac{(P^t)_{ii}}{\sum_j (P^t)_{jj}}
$$

To evaluate the above expression, we first divide $P$ by its largest real positive eigenvalue, $\lambda$, and distribute the limit over the numerator and denominator, which yields

$$
c_i = \frac{\lim_{s \to \infty} \left[ \left( \frac{P}{\lambda} \right)^{s} \right]_{ii}}{\lim_{t \to \infty} \sum_j \left[ \left( \frac{P}{\lambda} \right)^{t} \right]_{jj}}
$$

To evaluate the numerator and denominator, we observe that $P$ can be written as $X D \bar{X}^T$ where the columns of $X$ are $x_\mu / (\bar{x}_\mu^T x_\mu)^{1/2}$, the right eigenvectors of $P$ and $D$ is diagonal with the corresponding eigenvalues, $\mu$, on the diagonal. From time-reversal symmetry, $\lambda x = P x$ implies $\lambda \bar{x} = P^T \bar{x}$. Consequently, the columns of $\bar{X}$ are $\bar{x}_\mu / (\bar{x}_\mu^T x_\mu)^{1/2}$, the left eigenvectors of $P$. Hence, for $t$ sufficiently large,

$$
\lim_{t \to \infty} \left( \frac{P}{\lambda} \right)^{t}
= \lim_{t \to \infty} X \left( \frac{D}{\lambda} \right)^{t} \bar{X}^T
= \lim_{t \to \infty} \sum_\mu \frac{x_\mu \left( \frac{\mu}{\lambda} \right)^{t} \bar{x}_\mu^T}{\bar{x}_\mu^T x_\mu}
= \frac{x_\lambda \bar{x}_\lambda^T}{\bar{x}_\lambda^T x_\lambda}
$$

This is because $\lambda$ is the largest real positive eigenvalue of $P$ and therefore $\lambda > |\mu|$ for all eigenvalues, $\mu$, except $\mu = \lambda$. Applying this result to the denominator of the expression for $c_i$ yields

$$
\lim_{t \to \infty} \sum_j \left[ \left( \frac{P}{\lambda} \right)^{t} \right]_{jj} = \sum_j \frac{(x \bar{x}^T)_{jj}}{\bar{x}^T x} = 1
$$

Consequently, the relative number of closed random walks through edge $i$ is given by the numerator

$$
c_i = \lim_{s \to \infty} \left[ \left( \frac{P}{\lambda} \right)^{s} \right]_{ii} = \frac{(x \bar{x}^T)_{ii}}{\bar{x}^T x} = \frac{x_i \bar{x}_i}{\sum_j x_j \bar{x}_j}
$$

To within a constant factor, this is simply $x_i \bar{x}_i$. The whole distribution is given by the (component-wise) product of $x$ and $\bar{x}$. In effect, at each edge we are constructing the cartesian product of the set of random walks bridging edges in all past-times and the set consisting of their time-reversed counterparts.⁸ This forms a set of closed random walks.

4. Directionality and Tangent Continuity

In the Yen and Finkel (1996) saliency computation, the support for oriented edge $i$ due to all other oriented edges $j$ is given by the linear relaxation labeling update step, $y_i^{(t+1)} = \sum_j A_{ji}\, y_j^{(t)} / \sum_j \sum_k A_{jk}\, y_k^{(t)}$. Because Yen and Finkel's intention was to model the visual cortex, the $y^{(t)}$ vector represents a fixed lattice of discrete positions and orientations. The update step can be viewed as convolution with a large kernel filter followed by normalization.⁹ The symmetry of the $A$ matrix manifests itself in the plane as mirror image symmetry in the kernel (see Fig. 4). The state of the computation at any given time is summarized by $y^{(t)}$, which represents a distribution of random walks of length $t$. In general, this iteration will converge to the eigenvector, $y$, associated with the largest positive real eigenvalue of $A$, i.e., it computes the SB measure.

Figure 4. In the Yen and Finkel (1996) saliency computation, the support for oriented edge $i$ due to all other oriented edges $j$ is given by the linear relaxation labeling update step, $y_i^{(t+1)} = \sum_j A_{ji}\, y_j^{(t)} / \sum_j \sum_k A_{jk}\, y_k^{(t)}$. Because Yen and Finkel's intention was to model the visual cortex, the $y^{(t)}$ vector represents a fixed lattice of discrete positions and orientations. The update step can be viewed as convolution with a large kernel filter followed by normalization. The symmetry of the $A$ matrix manifests itself in the plane as mirror image symmetry in the kernel. In general, this iteration will converge to the eigenvector, $y$, associated with the largest positive real eigenvalue of $A$, i.e., it computes the SB measure.

For illustrative purposes, it is worth considering a network analogous to that described by Yen and Finkel, but based on a lattice of discrete positions and directions (i.e., $x^{(t)}$ instead of $y^{(t)}$) and using a
non-symmetric linear relaxation labeling operator (i.e., $P$ instead of $A$), see Fig. 5. In this network, the support for directed edge $i$ due to all other directed edges $j$ is given by the linear relaxation labeling update step, $x_i^{(t+1)} = \sum_j P_{ji}\, x_j^{(t)} / \sum_j \sum_k P_{jk}\, x_k^{(t)}$. As before, the state of the computation at any given time is summarized by $x^{(t)}$, which represents a distribution of random walks of length $t$. However, because $P$ is not symmetric, there is no guarantee that this process will converge to $x$, the eigenvector with largest positive real eigenvalue. Furthermore, although $x$ is a fixed-point, $x$ alone does not represent a distribution of closed random walks. The salience of edge $i$ is given by the product of the $i$th components of the right and left eigenvectors, $x_i$ and $\bar{x}_i$. It is the product which represents the relative number of closed random walks through edge $i$.

Figure 5. Hypothetical saliency network is based on a lattice of discrete positions and directions (i.e., $x^{(t)}$ instead of $y^{(t)}$) and uses a non-symmetric linear relaxation labeling operator (i.e., $P$ instead of $A$). In this network, the support for directed edge $i$ due to all other directed edges $j$ is given by the linear relaxation labeling update step, $x_i^{(t+1)} = \sum_j P_{ji}\, x_j^{(t)} / \sum_j \sum_k P_{jk}\, x_k^{(t)}$. Because $P$ is not symmetric, this iteration is not guaranteed to converge. The salience of edge $i$ is given by the product of $x_i$ and $\bar{x}_i$ where $x$ is the eigenvector associated with the largest positive real eigenvalue of $P$. This quantity represents the relative number of closed random walks through edge $i$.

We observe that the use of directions and a non-symmetric linear relaxation operator is essential, even for visual patterns consisting of non-directional elements, e.g., Gabor patches of even phase. While the intermediate states of both networks at time $t$ (i.e., $y^{(t)}$ and $x^{(t)}$) can be interpreted as distributions of random walks of length $t$, the random walks of the Yen and Finkel network contain cusps (i.e., reversals in direction). The fundamental problem is that without representing directions in the state vector, the linear relaxation labeling process cannot remember the direction of travel of the random walk in the previous time-step; it only knows its orientation. Because the linear relaxation labeling operator is symmetric, at the next time-step, the random walk is equally likely to continue in either direction. One of these directions preserves tangent continuity. The other introduces a cusp. This is illustrated in Fig. 6. After the initial iteration, this process tends to increase the salience of the noise edges as much as the signal edges. It explains why the performance of Guy and Medioni's saliency computation (essentially one iteration of linear relaxation labeling using the operator $A$ and a constant input vector) is superior to that of Sarkar and Boyer. It also explains why WT outperforms both.

Figure 6. The explicit representation of two directions (i.e., $x$ and $\bar{x}$) and the use of a non-symmetric linear relaxation labeling operator (i.e., $P$) is essential if the intermediate states of the relaxation labeling process (i.e., $x^{(t)}$, $x^{(t+1)}$, etc.) are to be interpreted as distributions of random walks which are continuous in tangent. Repeated application of a symmetric linear relaxation labeling operator (i.e., $A$) to a vector of saliencies associated with orientations (i.e., $y$) yields distributions of random walks which can reverse direction at edge locations (left). After the initial iteration, this process tends to increase the salience of the noise edges as much as the signal edges. This explains why the performance of Guy and Medioni's saliency computation (essentially one iteration of linear relaxation labeling using the operator $A$ and a constant input vector) is superior to that of Sarkar and Boyer. It also explains why WT outperforms both.

The importance of directionality in enforcing tangent continuity in computations involving contours seems not to be generally recognized in the human vision community. See for example, the paper of Field et al. (1993), which suggests that the phenomenon of “pop-out” of salient contours is evidence for an
“association field” which looks very much like Fig. 4. A notable exception is the “bipole” cell of the boundary contour system of Grossberg and Mingolla (1985), which (like our model) combines inputs from opposite directions.

5. Contour Saliency

To develop some intuition for the meaning of the eigenvalue, it will be useful to consider an idealized situation. We know from linear algebra that the eigenvalues of $P$ are solutions to the equation $\det(P - \lambda I) = 0$. Now, consider a closed path, $\Gamma$, threading $m$ directed edges. The probability that a particle following this path will reach directed edge, $\Gamma_{(i \bmod m)+1}$, given that it is located at directed edge, $\Gamma_i$, equals $P'(\Gamma_{(i \bmod m)+1} \mid \Gamma_i)$. Assuming that the probability of a particle traveling from directed edge $\Gamma_i$ to $\Gamma_j$ when $\Gamma_j$ does not immediately follow $\Gamma_i$ on the closed path is negligible (i.e., $P_{ji} = P'(\Gamma_j \mid \Gamma_i)$ when $j = (i \bmod m) + 1$ and $P_{ji} = 0$ otherwise) then:

$$
\lambda(\Gamma) = \left( \prod_{i=1}^{m} P'\!\left(\Gamma_{(i \bmod m)+1} \mid \Gamma_i\right) \right)^{1/m}
$$

satisfies $\det(P - \lambda I) = 0$. This is the geometric mean of the transition probabilities in the closed path.¹⁰ Normally long contours have very low probability: $\prod_{i=1}^{m} P'(\Gamma_{(i \bmod m)+1} \mid \Gamma_i)$. However, the properties of the geometric mean are such that smoothness and closure are favored and long contours suffer no penalty. It is useful to compare this to the saliency which Shashua and Ullman assign to a curve, which is given by the following geometric series:

$$
\Phi(\Gamma) = \sum_{j=1}^{\infty} \prod_{i=1}^{j} P'(\Gamma_{i+1} \mid \Gamma_i)
$$

Shashua and Ullman desired a saliency measure which favored long, smooth contours yet converged to a finite value for contours of infinite length (i.e., for closed contours). Unfortunately, the rate at which this series converges depends critically on the values of the $P'(\Gamma_{i+1} \mid \Gamma_i)$. If the transition probabilities are too small, the series will converge too rapidly (and the measure becomes local and greedy). Conversely, if they are too large, the series will converge too slowly. In summary, we see that the geometric mean has the properties Shashua and Ullman wanted while lacking the undesirable ones.

6. Results

6.1. Saliency Maps for Simple Patterns

In order to gain some insight into the strengths and weaknesses of the various measures, it will be useful to apply them to a simple test pattern consisting of edges from a circle (either thirty, twenty, or ten uniformly spaced samples) in a background of one hundred edges of random position and orientation.¹¹ The three test patterns are shown in Fig. 7(a–c).

Figure 7. (a) Thirty edge circle in a background of one hundred noise edges; (b) twenty edge circle and (c) ten edge circle.

Figure 8. (a) Raw saliencies computed by SU for thirty edge circle; (b) raw saliencies computed by SU for twenty edge circle and (c) raw saliencies computed by SU for ten edge circle.

We will first look at the performance of the Shashua and Ullman (SU) measure. In order to visualize the saliencies assigned to each edge, the edges are displayed as rectangles with lengths and widths proportional to the raw saliency values. The raw saliency map for the thirty edge circle is shown in Fig. 8(a). It is clear that the SU measure assigns significantly higher saliencies to the edges of the circle than to the background edges. This would allow the circle to be segmented from its background simply by choosing an appropriate saliency threshold. The raw saliency map for the twenty edge circle is shown in Fig. 8(b). In this case, the saliencies of the circle edges are comparable to those of the background, making segmentation of the circle by thresholding impossible. Note also that one of the two most salient edges belongs to the background. This is an edge, which, simply through chance, lies almost exactly on the circle with the correct
orientation (approximately at the 10 o’clock position). It is unreasonable to expect any measure to be able to distinguish this edge from the edges of the circle. Finally, the performance of the SU measure on the ten edge circle (see Fig. 8(c)) is also poor.

We next applied the segmentation method of Hérault and Horaud (HH) to the same three test patterns. It is important to note that we are solving the modified optimization problem described previously. That is, we are optimizing $y^T A y$ over all vectors $y \in \{0, 1\}^n$ with $|y| = m$, where $n$ is the total number of edges and $m$ is the number of figure edges. Because it is given the number of figure edges, HH possesses a considerable advantage over the other measures and these results should be interpreted accordingly. The segmentations for the thirty and twenty edge circles are shown in Fig. 9(a) and (b). With the exception of omitting one circle edge at 1 o’clock to permit inclusion of the spurious 10 o’clock edge, HH computes a perfect segmentation. However, the results on the ten edge circle (see Fig. 9(c)) show that the method has failed to segment the circle from its background.

Figure 9. (a) Figure edges computed by HH for thirty edge circle; (b) figure edges computed by HH for twenty edge circle and (c) figure edges computed by HH for ten edge circle.

In order to better visualize the large range of saliency values computed by the method of Sarkar and Boyer (SB), the lengths and widths of the rectangles are drawn proportional to $\log(1.0 + 10^6 \cdot x_i)$, where $x_i$ is the saliency of edge $i$. Figure 10(a) shows the log saliencies for the thirty edge circle computed using the SB method. In general, the edges of the circle are assigned greater saliencies than the edges of the background. However, we observe that the saliency values on the upper left portion of the circle are significantly larger than those on the lower right. If the eigenvector with largest positive real eigenvalue is interpreted as a limiting distribution of random walks between the edges, we see that this distribution is dominated by random walks (with reversals in direction) through the spurious edge at the 10 o’clock position. Because the SB measure does not enforce tangent continuity or closure, the effect of a single unfortunately placed edge can be profound. In Fig. 10(b) and (c) the asymmetry becomes very pronounced as the sampling of the circle becomes less frequent. The consequence is that the SB measure has failed to isolate the circle from its background, even in a case where a very simple method like WJ has little problem (see Fig. 13(b); Note 12).

Figure 10. (a) Log saliencies computed by SB for thirty edge circle; (b) log saliencies computed by SB for twenty edge circle and (c) log saliencies computed by SB for ten edge circle.

Figure 11(a–c) shows the raw saliency values for the three circles computed using the method of Guy and Medioni (GM). Like the SU measure, the GM measure assigns significantly higher saliency values to the edges of the thirty edge circle than to the background edges. However, also like the SU measure, the GM measure performs more poorly on the twenty and ten edge circles. Overall, the similarities between the saliencies computed by the SU measure and those computed by the GM measure are quite striking. We speculate that, for contours defined by relatively few edges of negligible length separated by large gaps, the largest $P_{ij}$ are relatively small (e.g., approximately 0.1). Consequently, the geometric series computed by SU is dominated by its first term, which equals $\max_j P_{ij}$. In turn, because the spatial attenuation of the affinity function is rapid, $\sum_j P_{ij} \approx \max_j P_{ij}$, i.e., the sum of the affinities from all edges is dominated by the maximum affinity from all edges.

Figure 11. (a) Raw saliencies computed by GM for thirty edge circle; (b) raw saliencies computed by GM for twenty edge circle and (c) raw saliencies computed by GM for ten edge circle.

Given the similarity of the saliencies computed by the SU and GM measures on the thirty, twenty, and ten edge circles, it is useful to include an example where the saliency values computed using the two methods are
very different. Figure 12(a) shows an incomplete circle with twenty three edges. There are no noise edges. Figure 12(b) shows the raw saliency values computed using the SU measure. Edges near the two ends of the incomplete circle are assigned the highest saliencies, and edges far from the ends are assigned the smallest saliencies. In this figure, the magnitude of $P_{ij}$ for adjacent $i$ and $j$ is greater than one. Consequently, $\sum_{n=1}^{22} P_{ij}^{n} \gg \sum_{n=1}^{11} P_{ij}^{n}$. This explains the difference in relative saliency. Figure 12(c) shows the raw saliency values computed using the GM measure. Except for the edges on the ends, which are assigned somewhat lower values, most edges are assigned comparable saliencies.

Figure 12. (a) An incomplete circle with twenty three edges; (b) raw saliency values computed using the SU measure and (c) raw saliency values computed using the GM measure.

Figure 13(a) and (b) show the log saliency maps for the thirty and twenty edge circles computed using the method of Williams and Jacobs (WJ). With the exception of the spurious 10 o’clock edge, all of the background edges are assigned saliencies of negligible magnitude. The contrast between the saliencies computed by WJ and those computed by GM is quite striking. The difference in discrimination power is due solely to the use of directionality (i.e., $P$ and $x$ vs. $A$ and $y$) and the (component-wise) multiplication of the vector $x$ by the vector $\bar{x}$ in WJ. This multiplication enforces the constraint that an edge must form a bridge between two others. The result is saliencies with a greater range of magnitudes than possessed by the components of $x$ or $\bar{x}$ alone. However, this constraint is not enough to discriminate edges of the ten edge circle from those of the background. The log saliency map computed by the WJ measure for the ten edge circle is shown in Fig. 13(c). A significant number of background edges have saliencies comparable to those assigned to edges of the circle. Consequently, segmentation of the circle from its background using thresholding is not possible.

Figure 13. (a) Log saliencies computed by WJ for thirty edge circle; (b) log saliencies computed by WJ for twenty edge circle and (c) log saliencies computed by WJ for ten edge circle.

Finally, Fig. 14(a–c) shows the log saliency maps for the three circles computed using the WT measure. With the exception of the spurious 10 o’clock edge, all of the background edges are assigned saliencies of negligible magnitude. This holds even for the ten edge circle, which no other measure was able to segment from the background.

Figure 14. (a) Log saliencies computed by WT for thirty edge circle; (b) log saliencies computed by WT for twenty edge circle and (c) log saliencies computed by WT for ten edge circle.

6.2. False Positive Rates for Simple Patterns

The first quantitative comparison used test patterns which consisted of short oriented edges spaced uniformly around the perimeter of a circle in a background of edges with random positions and orientations (see Fig. 15(a)). We computed the saliency of both shape and noise edges using each of the six measures: SU, HH, SB, GM, WJ, and WT. The edges were then sorted in descending order based on their saliencies, so that the salience of the most salient edge is $\phi_1$ and the salience of the least salient edge is $\phi_n$. Given $m$ shape edges, we define a false-positive as a noise edge which is assigned a salience larger than $\phi_{m+1}$. The false-positive rate for each measure was computed for patterns consisting of different numbers of shape and noise edges. The false-positive rate for each combination (e.g., 20 shape edges and 70 noise edges) was estimated by averaging the false-positive rate for ten trials using different noise patterns (Note 13). The right half of Fig. 15(a) is a plot of the percentage false-positives versus the number of noise edges for the twenty-edge circle.

Figure 15. (a) Twenty edge circle; (b) ten edge sine curve; (c) twenty edge circle (dipole noise) and (d) ten edge circle. All patterns are shown at a noise-level of fifty.

All of the measures perform reasonably well (less than 10% false-positive rate) at the low noise-levels (40 noise edges or less). At higher noise-levels, the performance of the measures begins to diverge. It is interesting that GM significantly outperforms SB, since GM is essentially one iteration of SB. We speculate that the false-positive rate is increased by additional relaxation-labeling steps using the non-directional operator. We also observe that GM performs comparably to HH, even though the HH measure is significantly more expensive to compute. Finally, at the lower signal-to-noise ratios, WJ and WT have significantly lower false-positive rates.

The second comparison was identical to the first except that the shape edges formed an open-ended sine curve (see Fig. 15(b)). The right half of Fig. 15(b) is a plot of the percentage false-positives versus the number of noise edges for the ten-edge sine curve. The relatively poor performance of the WT measure compared to the other measures can be attributed to its explicit reliance on closure. Nevertheless, it still outperforms the SB measure for higher signal-to-noise ratios and has an error rate comparable to that of SB (i.e., within 5%) at lower signal-to-noise ratios. As in the previous comparison, the performance of GM and HH is nearly identical. The false-positive rates of these measures are somewhat larger than that of the WJ measure. The SU measure had the best performance.

In the third comparison, we used a background consisting of correlated (i.e., dipole) noise (see Fig. 15(c)). A dipole consists of two collinear edges separated by a gap of size equal to the distance between successive edges of the circle. Because the two edges forming a dipole are collinear, the affinity between them is greater than the affinity between adjacent circle edges. Consequently, it is impossible to distinguish noise edges from shape edges using purely local measures. Indeed, all of the measures but WJ and WT have nearly a 100% false-positive rate. In the case of the SU measure, this is because (for gaps of this size) the geometric series is dominated by the first two terms (Note 14).

In the fourth comparison (see Fig. 15(d)), we used a ten-edge circle. This is a challenging pattern because the sampling rate is so low: only one edge per 36 degrees of circumference. Most of the measures perform poorly, even at relatively high signal-to-noise ratios. For a noise-level of 80, the GM, SU and HH measures are performing almost at chance, or 90% false-positive rate. The SB and WJ measures perform slightly better, with false-positive rates of 80% and 70%, respectively. In contrast, the false-positive rate for WT is under 5%.

6.3. Fruit and Texture Patterns

Our intention in the last comparison was to test the saliency measures on a collection of “real images” but to do so in a way which would allow meaningful error rates to be estimated. In the past, when new grouping methods have been proposed, their performance has not been systematically compared to others from the literature. Although the proposed methods are typically demonstrated on two or three “real images,” because the computational goal is often not well defined, performance is impossible to gauge. Consequently, it is unclear whether or not the methods represent genuine improvements in the state of the art.

Figure 16. (a–c) Banana, pear, red onion; (d–f) banana edges, pear edges, red onion edges; (g–i) terrain, brick, water and (j–l) terrain edges, brick edges, water edges.

We decided to construct test patterns from pairs of real images in such a way that performance on the saliency problem could be objectively measured. Nine different fruits and vegetables were placed in front of a uniformly colored background (three of these are shown in Fig. 16(a–c)). This allowed their silhouettes
to be extracted using straightforward methods. The orientation at points uniformly spaced along the silhouette was then estimated using a robust line fitting technique (see Fig. 16(d–f)). Because the fruits and vegetables varied widely in size, the maximum of the x and y dimensions of the silhouette was determined (i.e., the bounding square) and the set of edges was rescaled to an absolute size of 32 × 32.

To serve as background, we selected nine images of natural texture from the MIT Media Lab texture database (three of these are shown in Fig. 16(g–i)). The Canny edge detector was applied to a 64 × 64 block from each texture and the resulting edges were filtered on contrast to create a set of nine masking patterns consisting of approximately 800 edges each (see Fig. 16(j–l)). Test patterns were constructed by insetting the fruit and vegetable silhouettes into the center 32 × 32 regions of the 64 × 64 natural textures. Figure 17(a) displays the banana with terrain background, Fig. 17(b) displays the pear with brick background, and Fig. 17(c) displays the red onion with water background. Because the actual fruits and vegetables varied widely in size, the distances between adjacent edges of their silhouettes after rescaling are also somewhat variable. For example, because the banana was somewhat larger than the pear, the edges of its silhouette are more closely spaced after rescaling. This explains why some methods perform consistently better on one test pattern than on another.

Figure 17. (a) Banana with terrain mask; (b) pear with brick mask and (c) red onion with water mask.

The raw saliency maps computed by the SU measure for the three fruit and texture test patterns are shown in Fig. 18(a–c). The figure edges computed by HH are shown in Fig. 19(a–c). The log saliency maps computed by the SB measure for the three fruit and texture test patterns are shown in Fig. 20(a–c). The raw saliency maps computed by the GM measure are shown in Fig. 21(a–c). The log saliency maps computed by the WJ measure are shown in Fig. 22(a–c). Finally, the log saliency maps computed by the WT measure are shown in Fig. 23(a–c). We observe that, qualitatively, the performance of the various measures on the fruit and texture test patterns is similar to that observed for the circle test patterns.

Figure 18. (a) Raw saliencies computed by SU for banana with terrain mask; (b) raw saliencies computed by SU for pear with brick mask and (c) raw saliencies computed by SU for red onion with water mask.

Figure 19. (a) Figure edges computed by HH for banana with terrain mask; (b) figure edges computed by HH for pear with brick mask and (c) figure edges computed by HH for red onion with water mask.

Figure 20. (a) Log saliencies computed by SB for banana with terrain mask; (b) log saliencies computed by SB for pear with brick mask and (c) log saliencies computed by SB for red onion with water mask.

Figure 21. (a) Raw saliencies computed by GM for banana with terrain mask; (b) raw saliencies computed by GM for pear with brick mask and (c) raw saliencies computed by GM for red onion with water mask.
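The false-positive criterion of Section 6.2, which the quantitative comparisons rely on, can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `false_positive_rate` and its arguments are hypothetical, and the normalization by the number of shape edges m is an assumption, chosen because it reproduces the chance levels quoted in the text (for example, roughly 90% for ten shape edges among eighty noise edges).

```python
def false_positive_rate(saliencies, is_shape):
    """Fraction of the m top-ranked slots taken by noise edges.

    saliencies: one saliency value per edge.
    is_shape:   parallel booleans, True for shape (signal) edges.
    Dividing by m is an assumption inferred from the reported chance levels.
    """
    m = sum(1 for flag in is_shape if flag)
    if m == 0 or m >= len(saliencies):
        raise ValueError("need at least one shape edge and one noise edge")
    ranked = sorted(saliencies, reverse=True)   # phi_1 >= phi_2 >= ... >= phi_n
    threshold = ranked[m]                       # phi_{m+1} in 1-based indexing
    # A false positive is a noise edge whose salience is larger than phi_{m+1}.
    false_positives = sum(
        1 for s, flag in zip(saliencies, is_shape) if not flag and s > threshold
    )
    return false_positives / m


# Three shape edges, two noise edges; one noise edge (0.7) outranks a shape edge.
example = false_positive_rate([0.9, 0.8, 0.1, 0.7, 0.2],
                              [True, True, True, False, False])
```

Under this reading, a measure that ranks every shape edge above every noise edge scores 0, and a measure that ranks every noise edge above every shape edge scores 1 whenever there are at least m noise edges.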
Figure 22. (a) Log saliencies computed by WJ for banana with terrain mask; (b) log saliencies computed by WJ for pear with brick mask and (c) log saliencies computed by WJ for red onion with water mask.

Figure 23. (a) Log saliencies computed by WT for banana with terrain mask; (b) log saliencies computed by WT for pear with brick mask and (c) log saliencies computed by WT for red onion with water mask.

For a quantitative comparison, edges from the nine fruit and vegetable silhouettes (signal) and nine natural texture masking patterns (noise) were combined to construct a set of 405 test patterns. The texture edges are undersampled to achieve a given signal-to-noise ratio. These patterns represent all 81 silhouette and texture combinations at five different signal-to-noise ratios (see Fig. 16(m–o); Note 15). Each of the six saliency measures was run on all of the test patterns and false-positive rates were computed as before. The results are plotted in Fig. 24. For a signal-to-noise ratio of 0.2, the false-positive rate for the SB measure is 72% (i.e., 8% better than chance performance). The false-positive rates for SU, GM, HH and WJ are all approximately 50%. In contrast, the false-positive rate for the WT measure is 20% (i.e., 60% better than chance performance). Furthermore, after the signal-to-noise ratio is reduced by a factor of two, the false-positive rate for the WT measure remains under 50%.

Figure 24. False-positive rate for fruit and vegetable silhouettes with natural texture backgrounds.

7. Conclusion

In this paper, we introduced a new measure of perceptual saliency and quantitatively compared its ability to detect natural shapes in cluttered backgrounds to five previously proposed measures. The saliency measure is based on the distribution of closed random walks through the edges. We computed false-positive rates in classifying edges as signal or noise for a large set of test figures. In almost every case, the new measure significantly outperforms previous measures. We observe that the explicit representation of two edge directions and a non-symmetric affinity matrix are necessary to enforce tangent continuity in salient contours. Apart from our method (WT), only the method of Shashua and Ullman (SU) possesses this property. However, unlike our method, the SU method does not explicitly enforce contour closure. Since smooth, closed contours are judged to be more salient by human observers (Field et al., 1993; Kovacs and Julesz, 1993), we conjecture that these two reasons underlie the improved performance of our method relative to previous methods.

Appendix

In this Appendix, we give the analytic expression for the affinity function used in the comparisons (see Thornber and Williams, 1996, for its derivation; Note 16). We define the affinity, $P_{ji}$, between two directed edges, $i$ and $j$, to be:

$$P_{ji} \equiv P'(j \mid i) = \int_0^{\infty} dt\, P(j \mid i; t) \approx F\, P(j \mid i; t_{opt})$$

where $P(j \mid i; t)$ is the probability that a particle which begins its stochastic motion at $(x_i, y_i, \theta_i)$ at time 0 will be at $(x_j, y_j, \theta_j)$ at time $t$. The affinity between two edges is the value of this expression integrated over stochastic motions of all durations, $P'(j \mid i)$. This integral is approximated analytically using the method of steepest descent. The approximation is the product of $P$
evaluated at the time at which the integrand is maximized (i.e., $t_{opt}$), and an extra factor, $F$. The expression for $P$ at time $t$ is:

$$P(j \mid i; t) = \frac{3 \exp\left[-\frac{6}{T t^3}\left(a t^2 - b t + c\right)\right] \cdot \exp\left(-\frac{t}{\tau}\right)}{\sqrt{\pi^3 T^3 t^7 / 2}}$$

where

$$a = \left[2 + \cos(\theta_j - \theta_i)\right] / 3$$
$$b = \left[x_{ji}(\cos\theta_j + \cos\theta_i) + y_{ji}(\sin\theta_j + \sin\theta_i)\right] / \gamma$$
$$c = \left(x_{ji}^2 + y_{ji}^2\right) / \gamma^2$$

for $x_{ji} = x_j - x_i$ and $y_{ji} = y_j - y_i$. The parameters $T$, $\tau$ and $\gamma$ determine the distribution of shapes (where $T$ is the diffusion coefficient, $\tau$ is particle half-life and $\gamma$ is speed). In all of our experiments, $T = 0.002$, $\tau = 5.0$ and $\gamma = 1$. The expression for $P$ should be evaluated at $t = t_{opt}$, where $t_{opt}$ is real, positive, and satisfies the following cubic equation:

$$-7 t^3 / 4 + 3\left(a t^2 - 2 b t + 3 c\right) / T = 0$$

If more than one real, positive root exists, then the root maximizing $P(j \mid i; t)$ is chosen (Note 17). Finally, the extra factor $F$ is:

$$F = \sqrt{\frac{2 \pi t_{opt}^5}{12\left(3 c - b\, t_{opt}\right)/T + 7 t_{opt}^3 / 2}}$$

For our purposes here, we ignore the $\exp(-t/\tau)$ factor in the steepest descent approximation for $t_{opt}$. We note that by increasing $\gamma$, the distribution of contours can be uniformly scaled.

Acknowledgments

The authors wish to thank Hong Pan for providing the silhouettes of fruits and vegetables. Thanks also to Shyjan Mahamud, Tairan Wang, and Majd Sakr for their assistance in running experiments and plotting results. Finally, we are grateful to David Jacobs, Michael Langer, and Shyjan Mahamud for many helpful discussions.

Notes

1. In our nomenclature, Montanari’s update equation is $x_i^{(t+1)} = \min_j\left(-\ln P_{ij} + x_j^{(t)}\right)$. After $t$ time-steps, $x_i^{(t)}$ equals the energy of the minimum energy curve of length $t$ beginning at edge $i$. In general, this quantity will not converge to a finite value as $t$ goes to infinity.
2. Anyone who is interested in understanding the Shashua and Ullman saliency computation in greater detail is encouraged to read Alter and Basri’s very helpful paper.
3. A typical value for $\exp\left(-\int_{s_i}^{s_{i+1}} dt\, (\kappa^2(t) - \ln\rho)\right)$ is 0.1, so that $\Phi(\Gamma) = 1 + 0.1 + 0.01 + \cdots = 1.11$. In such cases, ninety-nine percent of the saliency is contributed by the first two terms.
4. It is also worth noting that the WJ measure can be computed very efficiently using a multi-resolution method. See Williams et al. (1997).
5. Unlike a Markov process, $\lambda$ is usually very small: the great majority of particles never reach another edge. In a Markov process, the probabilities in every column of the transition-probability matrix must sum to one. Consequently, the largest eigenvalue also equals one.
6. Specifically, it shows that if $x$ were in error by $\delta x$, the calculated $\lambda$ would be in error by only $\delta x$ squared.
7. For this reason, in all of our experiments, we compute the eigenvector with largest positive real eigenvalue using EISPACK.
8. Those random walks which visit edges of opposite direction and in reverse order.
9. Probably the best example of this style of visual computation is the Parent and Zucker (1989) network, which uses (non-linear) relaxation labeling to improve local measurements of contour tangent and curvature. The relaxation labeling operator represents constraints between discrete tangent and curvature labels at all locations in the plane. Although the Parent and Zucker network and the Yen and Finkel network are similar at an algorithmic level, they compute very different functions. Indeed, the goal of the former is more accurately described as “sharpening” than “saliency.”
10. Equivalently, minus one times the logarithm of the eigenvalue equals the average transition energy: $-\ln \lambda(\Gamma) = -\sum_{i=1}^{m} \ln P'(\Gamma_{(i \bmod m)+1} \mid \Gamma_i) / m$.
11. The radius of the circle equals 16 and the noise edges are uniformly distributed within a square of size 64.
12. It is worth pointing out (again) that Sarkar and Boyer did not design their method for problems of this sort. Their intention was to segment aerial photos of construction sites. It is likely that tangent continuity and closure are less important for that domain. However, Yen and Finkel (1996) and Perona and Freeman (1998) describe the same measure and demonstrate it using simple test patterns very much like the ones shown here.
13. We wanted to ensure that HH was not unfairly penalized because of the inherent difficulty of solving the combinatorial optimization problem. We therefore computed the value of $y^T A y$ for the perfect segmentation and accepted a trial only when the simulated annealing procedure returned a greater or equal value. After ten failed attempts, we restarted that trial with a new noise-pattern.
14. For the thirty-edge circle, the geometric series converges more slowly. Presumably, SU would continue to improve (relative to the other measures) as the size of the gaps decreases.
15. The original fruit and texture images and files containing the silhouette and texture edges can be downloaded from http://www.cs.unm.edu/~williams/saliency.html.
16. For a derivation of a related affinity function, see the recent paper of Sharon, Brandt and Basri (1997).
17. For a discussion on solving cubic equations, see Press et al., 1988.
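The affinity approximation given in the Appendix can be turned into a short numerical sketch. What follows is an illustration under stated assumptions rather than the authors' implementation: the function names (`coefficients`, `density`, `real_cubic_roots`, `optimal_time`, `affinity`), the `(x, y, theta)` tuple layout for a directed edge, and the two guard clauses that return zero are all hypothetical choices. The cubic is solved with the textbook closed form for real roots, in the spirit of the reference cited in Note 17.

```python
import math

# Parameter values reported in the Appendix.
T = 0.002    # diffusion coefficient
TAU = 5.0    # particle half-life
GAMMA = 1.0  # speed


def coefficients(edge_i, edge_j):
    """Return (a, b, c) for directed edges given as (x, y, theta) tuples."""
    (xi, yi, ti), (xj, yj, tj) = edge_i, edge_j
    dx, dy = xj - xi, yj - yi
    a = (2.0 + math.cos(tj - ti)) / 3.0
    b = (dx * (math.cos(tj) + math.cos(ti)) + dy * (math.sin(tj) + math.sin(ti))) / GAMMA
    c = (dx * dx + dy * dy) / GAMMA ** 2
    return a, b, c


def density(a, b, c, t):
    """P(j | i; t), including the exp(-t / tau) decay factor."""
    exponent = -6.0 * (a * t * t - b * t + c) / (T * t ** 3) - t / TAU
    return 3.0 * math.exp(exponent) / math.sqrt(math.pi ** 3 * T ** 3 * t ** 7 / 2.0)


def real_cubic_roots(p, q, r):
    """Real roots of t^3 + p t^2 + q t + r = 0 (textbook closed form)."""
    Q = (p * p - 3.0 * q) / 9.0
    R = (2.0 * p ** 3 - 9.0 * p * q + 27.0 * r) / 54.0
    if Q > 0.0 and R * R < Q ** 3:  # three real roots
        theta = math.acos(max(-1.0, min(1.0, R / math.sqrt(Q ** 3))))
        return [-2.0 * math.sqrt(Q) * math.cos((theta + k * 2.0 * math.pi) / 3.0) - p / 3.0
                for k in (0, 1, -1)]
    A = -math.copysign((abs(R) + math.sqrt(max(R * R - Q ** 3, 0.0))) ** (1.0 / 3.0), R)
    B = Q / A if A != 0.0 else 0.0
    return [A + B - p / 3.0]  # single real root


def optimal_time(a, b, c):
    """Positive root of -7 t^3/4 + 3 (a t^2 - 2 b t + 3 c)/T = 0 that maximizes P."""
    lead = -7.0 / 4.0  # divide through by the leading coefficient
    roots = real_cubic_roots(3.0 * a / T / lead, -6.0 * b / T / lead, 9.0 * c / T / lead)
    return max((t for t in roots if t > 0.0), key=lambda t: density(a, b, c, t))


def affinity(edge_i, edge_j):
    """Approximate P'(j | i) as F * P(j | i; t_opt)."""
    a, b, c = coefficients(edge_i, edge_j)
    if c == 0.0:
        return 0.0  # coincident positions are outside this sketch (assumption)
    t_opt = optimal_time(a, b, c)
    denom = 12.0 * (3.0 * c - b * t_opt) / T + 7.0 * t_opt ** 3 / 2.0
    if denom <= 0.0:
        return 0.0  # guard for an undefined F (assumption)
    F = math.sqrt(2.0 * math.pi * t_opt ** 5 / denom)
    return F * density(a, b, c, t_opt)


origin = (0.0, 0.0, 0.0)
straight = affinity(origin, (1.0, 0.0, 0.0))        # collinear successor
turned = affinity(origin, (1.0, 0.0, math.pi / 2))  # same position, rotated 90 degrees
```

For two collinear edges one unit apart, the quadratic $a t^2 - b t + c$ reduces to $(t - 1)^2$, so the selected root lies very close to $t = 1$, the travel time at unit speed, and the resulting affinity is far larger than for the rotated edge. Because the affinity is built from a probability density, its value for well-aligned neighbors can exceed one.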
References

Alter, T. and Basri, R. 1996. Extracting salient contours from images: An analysis of the saliency network. In Proc. IEEE Conf. on Comp. Vision and Pattern Recognition (CVPR ’96), San Francisco, CA, pp. 13–20.

Field, D.J., Hayes, A., and Hess, R.F. 1993. Contour integration by the human visual system: Evidence for a local association field. Vision Research, 33:173–193.

Grossberg, S. and Mingolla, E. 1985. Neural dynamics of form perception: Boundary completion, illusory figures, and neon color spreading. Psychological Review, 92:173–211.

Guy, G. and Medioni, G. 1996. Inferring global perceptual contours from local features. Intl. Journal of Computer Vision, 20:113–133.

Horn, B.K.P. 1981. The curve of least energy, MIT AI Lab Memo No. 612, MIT, Cambridge, MA.

Hérault, L. and Horaud, R. 1993. Figure-ground discrimination: A combinatorial optimization approach. IEEE Trans. on Pattern Analysis and Machine Intelligence, 15:899–914.

Kovacs, I. and Julesz, B. 1993. A closed curve is much more than an incomplete one: Effect of closure in figure-ground segmentation. Proc. Natl. Acad. Sci. USA, 90:7495–7497.

Montanari, U. 1971. On the optimal detection of curves in noisy pictures. Comm. of the Assoc. for Computing Machinery, 14:335–345.

Mumford, D. 1994. Elastica and computer vision. In Algebraic Geometry and Its Applications, Chandrajit Bajaj (Ed.), Springer-Verlag: New York.

Parent, P. and Zucker, S.W. 1989. Trace inference, curvature consistency and curve detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:823–889.

Perona, P. and Freeman, W. 1998. A factorization approach to grouping. In Proc. of the 5th European Conf. on Computer Vision (ECCV ’98), Freiburg, Germany.

Press, W.H., Flannery, B.P., Teukolsky, S.A., and Vetterling, W.T. 1988. Numerical Recipes in C, Cambridge University Press.

Rosenfeld, A., Hummel, R., and Zucker, S. 1976. Scene labeling by relaxation operations. IEEE Trans. on Systems Man and Cybernetics, 6:420–433.

Sarkar, S. and Boyer, K. 1996. Quantitative measures of change based on feature organization: Eigenvalues and eigenvectors. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR ’96), San Francisco, CA, pp. 478–483.

Shashua, A. and Ullman, S. 1988. Structural saliency: The detection of globally salient structures using a locally connected network. In 2nd Intl. Conf. on Computer Vision (ICCV ’88), Clearwater, FL.

Sharon, E., Brandt, A., and Basri, R. 1997. Completion energies and scale. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR ’97), San Juan, Puerto Rico, pp. 884–890.

Thornber, K.K. and Williams, L.R. 1996. Analytic solution of stochastic completion fields. Biological Cybernetics, 75:141–151.

Ullman, S. 1976. Filling-in the gaps: The shape of subjective contours and a model for their generation. Biological Cybernetics, 21:1–6.

Williams, L.R., Wang, T., and Thornber, K.K. 1997. Computing stochastic completion fields in linear-time using a resolution pyramid. In Proc. of 7th Intl. Conf. on Computer Analysis of Images and Patterns (CAIP ’97), Kiel, Germany.

Williams, L.R. and Jacobs, D.W. 1997. Stochastic completion fields: A neural model of illusory contour shape and salience. Neural Computation, 9:849–870.

Yen, S. and Finkel, L. 1996. “Pop-Out” of salient contours in a network based on striate cortical connectivity. Investigative Ophthalmology and Visual Science (ARVO), 37(3):5293.