III. Experimental Measurement and Computation of Similarity
Measurement is the process of assigning numbers to objects according
to a set of rules. This process serves to describe and organize phenomena,
and it provides a means of testing theories about the measured objects.
For nonverbal subjects, the raw materials of similarity measurement consist
primarily of generalization, discrimination and transfer data based on
errors, response rates, and reaction times. This section describes
how one goes from these raw materials to the assignment of similarity numbers,
reviews a few of the advantages and disadvantages of different approaches,
and suggests some implications of similarity measures for models of avian
A few years ago a chapter on similarity in a book about animal behavior
would have been titled “Stimulus Generalization,” and similarity
would have been defined by relative responding across stimuli in generalization
and discrimination tasks. This approach may work for problems that
can be addressed by using stimuli that vary on one dimension and for which similarities
need only be roughly known. However, individual generalization gradients
are quite limited as a general tool for conceptualizing and measuring similarity.
Some time ago, in a book on generalization, Shepard and I both pointed
out some of the limitations (D. Blough, 1965; Shepard, 1965), which
are briefly summarized here.
Consider a prototypical generalization gradient published by Guttman
and Kalish (1956). Pigeons were rewarded on an intermittent schedule
for pecking a key illuminated by monochromatic light of 550 nm. Reward
then ceased, and 11 wavelengths from 490 to 610 nm were presented a number
of times in random order. Figure 8 shows the mean across
birds of the number of responses made to each stimulus.
What does the gradient in Figure 8 reveal about the similarities between test and
training stimuli and among the various test stimuli? Assume at
least an ordinal relation between response rate and similarity, that
is, the more a pigeon pecks at a stimulus the more similar
that stimulus is to the rewarded stimulus. Then the gradient permits the
ordering of similarities between the training stimulus, 550 nm, and the
various test stimuli, from 560 nm (marked "C"), which is most
similar, to 490 nm, which is least similar.
However, it is hard to say much more than this.
First, of course, the gradient provides comparisons only between the
test stimuli and the training stimulus, 550 nm. For example, one
cannot assume that stimuli giving the same number of responses on the test
are similar to each other; 570 nm (at D) is probably not similar to 540 nm
(at B), though the pigeons responded about equally to both.
Second, the metric properties of the scales used to plot the gradient are
unknown. On the response scale, for example, C differs from A by
about half as many responses as does D, but we cannot say that C is twice
as similar to A as is D. As for the stimulus scale, there
is no reason to assume that wavelength is related to similarity in a simple
way.
For example, the Guttman & Kalish gradient in Figure 8 suggests
that 560 nm is more similar to 550 nm than
is 540 nm, though both differ from 550 nm by 10 nm. In other words, the
“true shape” of the generalization gradient is indeterminate from these
data alone (Blough, 1965; Shepard, 1965).
Guttman & Kalish (1956) collected generalization gradients based on
training at several different stimuli, in addition to the one shown in
Figure 8. This permitted Shepard (1965) to resolve the stimulus scaling
problem in a unique and powerful way.
He noted that the slopes of the gradients
tended to covary where they overlapped.
For example, gradients centered at 550, 570, 580
and 600 nm were all rather flat in the vicinity of 570 nm. (This can
be seen near D in Figure 8, where the slope is less than that at B.) He
found that by rescaling the abscissa such that data points near 570
nm were pushed horizontally closer together (in addition to similar, smaller
adjustments elsewhere), he could make all the gradients take on approximately the same shape. This
rescaling process is illustrated in Figure 9.
The top panel shows the gradient from Figure 8 (around 550 nm) in
blue, together with an approximation to the empirical gradient
nm, in green.
Note that the blue gradient falls less steeply on the right than on
the left, whereas the green gradient falls less steeply on the left and is not
so sharply peaked.
The bottom panel shows the result of rescaling the abscissa,
mostly by reducing the distance between 560 nm and 580 nm.
The gradients are now about the same shape.
They are also approximately symmetrical around the training stimulus.
The transformation necessary to bring this about provides an
equal-interval similarity scale. (Shepard's analysis gained power because it
transformed 5 gradients, not just two, to about the same shape.)
The success of such rescaling is not preordained. Rescaling will produce a common shape for overlapping gradients only
if the same underlying similarity scale determines all the gradients.
But when rescaling is successful, the resulting scale would be
expected to apply generally to pigeon experiments using this range of
wavelengths. The potent idea that a number of interlocking measurements from the
same set of stimuli can uniquely determine a similarity scale has been
It is at the core of non-metric multidimensional scaling, to which we now turn.
Multidimensional scaling is the measurement procedure that corresponds
most closely with the geometric approach to similarity described
earlier. The procedure is typically applied
to a matrix of values each of which represents a behavioral measure of
the similarity between two objects in a set; all possible pairs of objects
are usually represented in the matrix. For the behavioral measure,
human subjects may rate object pairs for similarity; humans or other
animals may provide error or latency data from discrimination or search
tasks. Using these data as input, a computer algorithm provides a
spatial map, in which interstimulus distances correspond to dissimilarities
between stimuli. This map can efficiently describe patterns or structures
within such data that may bear directly on models for the mental representation
of similarity (e.g. Nosofsky, 1992; Shepard, 1987; Tversky, 1977).
Multidimensional scaling and related analyses can reduce a large amount
of data to a relatively simple structure that is often easy to visualize
and can present important relationships in an economical way. The
effectiveness of this analysis rises rapidly with the number of different
stimuli employed. Each pairing of objects provides a data value,
and the number of binary relations (R) between pairs of a set of objects
rises approximately as the square of the number of objects; specifically
$R = (n^2 - n)/2$, if self-similarity is excluded.
Thus, for example, the 26 letters of the alphabet can be paired
in 325 ways, providing a rich set of highly interconnected values.
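As a quick check of the pair-count formula, the following minimal Python sketch compares the formula with a direct enumeration of letter pairs. The function name `pair_count` is introduced here only for illustration.

```python
from itertools import combinations
import string


def pair_count(n: int) -> int:
    """Number of unordered pairs of n distinct objects: R = (n^2 - n) / 2."""
    return (n * n - n) // 2


letters = string.ascii_uppercase          # the 26 letters A-Z
pairs = list(combinations(letters, 2))    # every unordered pair, no self-pairs

print(pair_count(26))  # count from the formula
print(len(pairs))      # count from direct enumeration
```

Both counts agree, and the enumeration makes explicit that each pair is counted once regardless of order.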
A geometric map of similarity is like a geographic map; just as
the latter compactly represents thousands of distances among pairs of cities
or other locations, a similarity map captures many relations among perceptual
objects. A point in a similarity space corresponds to each object, and
distances between these points represent dissimilarities between the objects;
the smaller the distance between objects, the greater is their similarity.
This multidimensional map of similarity is most compelling when the
stimulus objects have an inherent dimensionality. In actual cases,
these dimensions often correspond to physical attributes such as size or
intensity, although they need not do so. Even in such cases, however,
as mentioned above for the generalization gradient, the correspondence
between the physical and psychological measures is typically non-linear,
and the nature of the psychological similarity scale is generally unknown.
Fortunately, non-metric multidimensional scaling algorithms assume nothing
about the data measurement scale except the ordinal relation. They
attempt to match as closely as possible the rank order of the data values
(errors, latencies, etc.) with the rank order of the distances in
multidimensional space. It might appear that much is lost when all
information in the data other than rank order is discarded, but Shepard
(e.g. 1980) and others have shown that with a sufficient number of objects
the metric structure of the space can be accurately constructed from rank
order alone. A forceful demonstration takes the direct airline distances
between 15 or more cities in the United States and distorts them with a
transformation that leaves their order unimpaired - for example, by taking
their logs, squaring, or raising them to an exponent. With
the distorted data as input, a non-metric scaling program such as ALSCAL
will place the cities accurately in a two-dimensional map of the
US, with distances conforming once again to a ratio scale.
Significantly, the program will also discover and plot the transformation
that was used to distort the data.
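The premise of this demonstration, that an order-preserving distortion leaves the rank order of the data untouched, can be illustrated with a small Python sketch. The dissimilarity values and pair labels below are made up for illustration, and the sketch does not perform the scaling itself; it shows only that the ordinal information a non-metric program relies on survives the distortion.

```python
import math

# Hypothetical dissimilarity values for six object pairs (made-up numbers).
distances = {"AB": 2.0, "AC": 5.5, "AD": 9.0, "BC": 4.0, "BD": 7.5, "CD": 3.0}


def rank_order(values: dict) -> list:
    """Return the pair labels sorted from smallest to largest value."""
    return sorted(values, key=values.get)


def distort(values: dict, transform) -> dict:
    """Apply a transformation to every value, keeping the pair labels."""
    return {pair: transform(v) for pair, v in values.items()}


original_order = rank_order(distances)
log_order = rank_order(distort(distances, math.log))
squared_order = rank_order(distort(distances, lambda v: v ** 2))

# The three orderings are identical because both transforms are monotonic
# for positive values.
print(original_order)
print(log_order == original_order, squared_order == original_order)
```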
Thus, using minimal assumptions, non-metric scaling can recover the
"psychological space" within which the similarities of objects are distributed,
and it can also recover the transformation that relates the behavioral
measure to similarity. Various methods can yield appropriate input data
in avian subjects, and the following are a few illustrations from studies
Figure 10. (Redrawn from Riggs et al., 1972.)
Among the earliest scaling experiments done with pigeons were descriptions
of hue similarity in the pigeon. In a study designed to investigate
early color processing, Riggs, P. Blough, and Schafer (1972) presented
a stimulus field consisting of a pattern of stripes that alternated rapidly
in wavelength. The luminance of the stimuli was equated, following D. Blough
(1957). The electrical response of the retina to this alternation
increased non-linearly with wavelength difference, and it could be taken
as a measure of the perceptual similarity of the stimuli. A matrix
of the electrical responses to all pairings of 12 wavelengths from
495 to 660 nm provided input to a nonmetric multidimensional scaling program.
Figure 10 shows the result, which can be seen as a partial color
circle for the pigeon. Possibly the function would have formed a
more complete circle if a wider range of wavelengths had been used.
Figure 11. A color circle for the pigeon; the colored spots suggest
the appearance of the stimuli to a human observer.
(Redrawn from Schneider, 1972, with colors added.)
At about the same time, Schneider (1972) used a behavioral method to
derive a more complete color circle. His pigeons discriminated between
identical and different pairs of wavelengths in a yes/no signal detection
task. The two lights appeared on the two halves of the center key
of a three-key chamber. If the wavelengths were the same, right key
pecks were rewarded; if they differed, left key pecks were rewarded.
The accuracy of performance on each possible pair of 11 wavelengths was
taken as a measure of the dissimilarity of the pair, and non-metric scaling
yielded the result shown in Figure 11.
These experiments suggest that similarities among colors for pigeons
are organized in a manner generally like those for humans. This is quite
interesting in view of the anatomical differences between the species,
particularly the colored oil droplets that filter the light reaching the
pigeon's cone receptors (see P. Blough, 2001, and Husband & Shimizu, 2001, for more
information regarding the structure and evolution of the pigeon retina).
In these experiments with wavelength, multidimensional scaling was applied
to a relatively well-understood continuum, and the similarity results clarify
aspects of the pigeon's visual function. This sort of scaling
can also be applied to stimulus sets especially constructed to explore
the processes involved in the identification and discrimination of objects.
We next consider an example of this sort, which also introduces a task
that is convenient for studying inter-object similarity in pigeons.
This study, from D. Blough (1988), used an "odd item" search task.
An array of 32 forms appeared on the screen of a computer monitor. The
forms were all identical, with one exception, which was the target item;
the pigeon got food on an intermittent schedule for pecking at the target,
which was randomly placed from trial to trial. Pigeons learn this
task readily, perhaps because it resembles a natural foraging situation.
Depending on the circumstances, search accuracy or search speed may
be the primary measure. Search is swift and accurate if the odd form
is quite different from all the others; search is slow and may be inaccurate
if the odd form is similar to the others. This odd-item task is particularly
efficient because all possible pairs of items may be presented in random
order and usually appear in the same experimental session; this scheme
automatically provides equal treatment of the stimuli and counterbalancing
for all the target items.
One of the sets of forms that was used appears in
Figure 12 (D. Blough, 1988). Each of the 16 items is a
block that varies in size together with a U that varies in width. There are 4 values on each of the two physical dimensions, though the objects
might be described in other ways as well, such as the overall size of the
pair of stimuli taken together. The two forms that appeared on any
trial were drawn from this set of 16; one was the target, the other, repeated
31 times, was the distractor. Over the course of a session each item appeared
as the target, paired with every other item as the distractor.
Two typical displays based on this set of forms appear in
Figure 13. In the upper one, the target differs from the distractors
on both block size and U width dimensions. In the lower display, the form that was the target for the
upper display has become the distractor. The new target is harder
to find; it differs from the distractors only in that its block size
is somewhat smaller. The data consisted of mean search speeds collected
for each of the 240 possible pairings of these forms.
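The structure of this design can be sketched in Python. The sketch assumes that the 240 pairings are ordered (target, distractor) pairs, which is consistent with 16 × 15 = 240, and it codes the four values on each dimension as hypothetical levels 0 to 3; neither the coding nor the names below come from the original study.

```python
from itertools import permutations, product

# Each form is coded as (block_size_level, u_width_level); levels 0-3 are
# placeholder codes, not the physical sizes used in the experiment.
forms = list(product(range(4), range(4)))

# Ordered pairs: the first form is the target, the second the distractor.
pairings = list(permutations(forms, 2))


def differing_dimensions(target, distractor) -> int:
    """Count the physical dimensions on which target and distractor differ."""
    return sum(t != d for t, d in zip(target, distractor))


one_dim = sum(1 for t, d in pairings if differing_dimensions(t, d) == 1)
two_dim = sum(1 for t, d in pairings if differing_dimensions(t, d) == 2)

print(len(forms), len(pairings))  # number of forms, number of pairings
print(one_dim, two_dim)           # pairings differing on one vs. two dimensions
```

Under these assumptions, most pairings differ on both dimensions, like the upper display described above, while a smaller share differ on only one, like the lower display.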
Non-metric scaling based on the resulting matrix produced the
structure shown in Figure 14. The figure suggests that a psychological
dimension corresponds to each of the physical dimensions, and that similarity
among the forms can be roughly equated with their distance in this similarity
space (also see Cook, Katz, & Cavoto, 1997, for another example of using scaling analyses to
look at avian discrimination behavior). As to the metric relation between the behavioral measure,
search reaction time, and similarity, I showed that to a good approximation
there is an exponential relation between the momentary probability of detecting
a target (which determines variations in search speed) and the similarity
between the target and the distractor (D. Blough, 1988).
Dimensions, and Attention
The data displayed thus far have all been scaled in a space analogous
to physical space. Distances in this space follow the so-called Euclidean
metric; in two dimensions, the Pythagorean theorem governs
the relation between the coordinates of points and the distances between
them. Thus, if the coordinates are a and b and the distance is c, then
(1) $c = (a^2 + b^2)^{1/2}$
or, in general, with several dimensions:
(2) $c = (a^2 + b^2 + \dots)^{1/2}$   Euclidean
There are other rules for computing distance. If one is
walking between opposite corners of a block filled with buildings, the
sidewalk distance is given by the simple sum of adjacent sides of the block
rather than by equation (1). This distance rule is aptly
named the “city-block” metric:
(3) $c = a + b + \dots$
where again c is the distance and a and b are the distances along
the coordinate axes.
Both Euclidean and City Block are special cases of the Minkowski metric:
(4) $c = (a^N + b^N + \dots)^{1/N}$   Minkowski
where N=1 for the city-block metric and N=2 for the Euclidean metric.
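A minimal Python sketch of the Minkowski rule follows. The function name `minkowski_distance` is introduced here for illustration; its input is the list of distances along the coordinate axes (a, b, and so on), and absolute values are taken so that signed coordinate differences can also be passed in.

```python
def minkowski_distance(differences, n):
    """Equation (4): c = (a^N + b^N + ...)^(1/N).

    differences: distances along each coordinate axis (a, b, ...).
    n: the Minkowski exponent N (1 = city-block, 2 = Euclidean).
    """
    return sum(abs(d) ** n for d in differences) ** (1.0 / n)


city_block = minkowski_distance([4, 3], 1)  # simple sum of the two sides
euclidean = minkowski_distance([4, 3], 2)   # Pythagorean distance

print(city_block, euclidean)
```

With axis distances of 4 and 3, the two exponents give distances of 7 and 5 respectively.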
A crucial property of the Euclidean metric is that it yields a constant
distance no matter how the coordinates are moved or rotated with respect
to objects in the space. It seems obvious that the distance between
objects does not change with axis rotation; it is less obviously true that
Euclidean space is unique in this respect. If the exponent N
in equation (4) does not equal two, distances change when the axes
are rotated. This result is exemplified in Figure 15, which
shows two dots separated by a distance “c” . Two sets of coordinates
are shown, one rotated with respect to the other. Euclidean distance
"c" between the dots is 5 regardless of the position of the coordinates.
However, by the city-block metric, the distance between the two dots, computed
from their coordinates, is 7 for one set of axes and 5 for the other set
of axes (see Figure 15 below).
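One set of axes (coordinate differences 4 and 3):
$c = (4^2 + 3^2)^{1/2} = (25)^{1/2} = 5$   Euclidean
$c = (4^1 + 3^1)^{1} = 7$   City-block
Other set of axes (coordinate differences 5 and 0):
$c = (5^2 + 0^2)^{1/2} = (25)^{1/2} = 5$   Euclidean
$c = (5^1 + 0^1)^{1} = 5$   City-block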
Figure 15. This figure illustrates the
different effects of axis rotation on distances determined by
the Euclidean metric and city-block metric. Euclidean distance
between the dots is constant regardless of rotation.
City-block distance changes with a change in axis orientation.
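The rotation effect can be reproduced numerically. The sketch below is an illustration rather than a reconstruction of the figure: it takes a separation of (4, 3) between the two dots and rotates the axes by the angle that brings one axis into line with the dots. The function names are hypothetical.

```python
import math


def rotate_axes(difference, angle):
    """Express a coordinate difference in axes rotated by `angle` radians."""
    x, y = difference
    return (x * math.cos(angle) + y * math.sin(angle),
            -x * math.sin(angle) + y * math.cos(angle))


def euclidean(difference):
    return math.hypot(*difference)


def city_block(difference):
    return sum(abs(d) for d in difference)


original = (4.0, 3.0)
# Rotate the axes so that the first axis points from one dot to the other.
rotated = rotate_axes(original, math.atan2(3.0, 4.0))

print(euclidean(original), euclidean(rotated))    # unchanged by rotation
print(city_block(original), city_block(rotated))  # changes with rotation
```

Apart from floating-point rounding, the Euclidean distance is 5 in both coordinate systems, whereas the city-block distance falls from 7 to 5.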
The significance of all this for similarity scaling is that the
metric structure of similarity space is initially unknown and it may reflect
important properties of inter-object similarity and the cognitive
operations involved in computing it. One example is the degree to
which a perceiver analyzes and attends to the dimensions that underlie
similarity computation. For example, when human subjects are
asked to estimate the similarity between pairs of colored patches
varying in hue, lightness and saturation, the matrix of data they
produce is best fit by placing the colored patches in a Euclidean space.
Three such patches appear in Figure 16 (the
stimuli on the left). Within limitations of the
display, the spots differ somewhat
in hue (vertical) and lightness (horizontal). One would expect the
dissimilarity between the stimuli connected by line c to be given by equation
2, the Euclidean metric.
Figure 16. Example of "Integral"
Figure 17. Example of "Separable"
This outcome agrees with other evidence that these dimensions are "integral";
that is, observers do not analyze them into component dimensions when they
compare them. This is one implication of the rotation-invariance of the
Euclidean metric; the psychological space has no preferred axes.
In contrast to the colored patches, forms varying in distinct, separately
identifiable aspects may fit best into a city-block similarity space.
One example for which this has been demonstrated in humans is a set of circles
that vary in size, with inscribed radii that vary in angle. This is illustrated
by the three forms in Figure 17 (the stimuli to the right). Similarity judgments for these forms were better fit by a city-block
than by a Euclidean metric; thus, in Figure 17 the dissimilarity
represented by line c should be best approximated by equation 3.
This finding agrees with other evidence that these dimensions are "separable";
it suggests that observers analyze the stimuli into their two prominent
aspects, determining a and b, and summing these to get c. The
presence of specific preferred axes corresponds to the non-rotatability
of the city-block representation (Shepard, 1987, 1991).
This analysis suggests another look at the pigeon similarity data for
U/block forms shown in Figure 14 above. To us these forms appear
to be separable; it seems easy to focus attention on the upper U or the
lower block. This may be true for pigeons as well; for these forms,
unlike other tested items, the city-block metric provided a slightly
better fit to the data than the Euclidean fit shown in Figure 14 (D. Blough,
A final implication of this analysis is that, in the case of separable
dimensions (city-block metric), differential attention may alter the observed
similarities among objects by changing the relative weights given the dimensions.
This is somewhat analogous to the attentional variation of feature weighting
that Tversky built into his contrast model, discussed above in the theory
section. Other data also suggest
that the pigeon may attend to different parts of simple forms (D. Blough,
1993), and this matter awaits further exploration. For further
discussion, see D. Blough (1989, 1991, 1992, 1993).
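One way to picture the effect of differential attention is a city-block distance in which each dimension carries a weight. The following Python sketch is purely illustrative: the weighting scheme, the coordinates, and the weight values are assumptions made for the example, not quantities estimated from pigeon data.

```python
def weighted_city_block(x, y, weights):
    """City-block distance with a separate attentional weight per dimension."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


# Hypothetical positions on two separable dimensions (block size, U width).
target = (2.0, 2.0)
differs_in_block = (4.0, 2.0)  # differs from the target on block size only
differs_in_u = (2.0, 3.0)      # differs from the target on U width only

equal_attention = (1.0, 1.0)
attend_to_u = (0.25, 1.75)     # most attention given to U width

for weights in (equal_attention, attend_to_u):
    d_block = weighted_city_block(target, differs_in_block, weights)
    d_u = weighted_city_block(target, differs_in_u, weights)
    print(weights, d_block, d_u)
```

With equal weights the form that differs in U width is the closer of the two to the target; when attention shifts toward U width, the ordering reverses, so the same physical stimuli yield a different pattern of similarities.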
Features and Cluster
Many natural objects do not lend themselves very well to description
in terms of dimensions, and even fairly simple forms often lack obvious
dimensionality, unless they are specifically designed with dimensions in
mind, such as those in Figures 12 and 17. Additional problems
with the geometric approach were discussed in the theoretical section on
is an alternative to a spatial representation, and it can also suggest
how subjects perceive objects; inspection of the results may suggest
the features of the objects that affect this classification.
A number of algorithms have been devised for use in classifying objects
on the basis of similarity. As a relatively simple example, consider
the popular program, CLUSTER, which may be applied to matrices of
object-comparison data like those already considered. Using such
a data set, CLUSTER puts the objects into a space with as many dimensions
as there are objects, and it then computes interobject distances in this
many-dimensional space. Finally, it places objects together in groups
based on the squared Euclidean distance between each pair of items.
A specific avian example is provided by a cluster analysis of pigeon
letter confusion data (D. Blough, 1985). To construct a cluster tree
from distances computed as just described, CLUSTER first grouped
U and V, the two letters that were separated by the least distance. This
pair is plotted at the bottom of Figure 18. D and O were separated
by the next smallest distance, and they are plotted next, and so on for
NW, then BP.
However, at this point, the next larger distance was not that between
two individual letters. Rather, it was determined that the average of the
M-N and the M-W distance was less than any similar value, so at this point
CLUSTER added M to NW to create a new group, MNW. This process continued,
with the distance criterion for joining a group gradually relaxed until
a final single cluster was reached. In Figure 18 the vertical axis reflects
the average distance between items in a cluster. Because U and V
had the smallest distance, they are the lowest in the picture, and so on.
Figure 18. (Redrawn from D. Blough, 1985.)
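The grouping procedure described above can be sketched as a simple average-linkage routine in Python. This is not the CLUSTER program itself: the distances below are invented for five letters so that the merges follow the same pattern as the description (pairs first, then a single letter joining an existing pair), and the step of computing distances in a many-dimensional space is skipped by taking the distances as given.

```python
from itertools import combinations

# Invented inter-letter distances (NOT the pigeon data), keyed by letter pair.
raw = {"UV": 1.0, "NW": 2.0, "MN": 3.0, "MW": 4.0, "UN": 8.0,
       "UW": 8.5, "UM": 9.0, "VN": 8.2, "VW": 8.7, "VM": 9.2}
distance = {frozenset(pair): value for pair, value in raw.items()}


def average_distance(cluster_a, cluster_b):
    """Mean distance over all item pairs drawn from the two clusters."""
    total = sum(distance[frozenset((a, b))]
                for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def build_tree(items):
    """Repeatedly merge the two clusters with the smallest average distance."""
    clusters = [(item,) for item in items]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        height = average_distance(a, b)  # vertical position in the tree
        clusters.remove(a)
        clusters.remove(b)
        merged = tuple(sorted(a + b))
        clusters.append(merged)
        merges.append((merged, height))
    return merges


merges = build_tree("UVNWM")
for members, height in merges:
    print("".join(members), round(height, 2))
```

Each printed line corresponds to one junction in a tree diagram, with the height giving the average distance at which the group was formed.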
The potential relation between this cluster analysis and the feature
approach discussed in the theory section is evident if one considers the
features that seem common to clusters - for example, the small upper loop
common to ARBP, the open centers of DOQ, the straight verticals of
ITL, and so forth. Human observers have commented that the pigeon
tree diagram in Figure 18 looks very plausible to them, and in fact cluster
diagrams based on human judgments are quite similar to the pigeon diagram
shown here. This suggests that humans and pigeons see these
forms in similar ways; the result also is consistent with the idea
that an analysis into features plays a role in the perception of such forms,
though this is by no means a necessary conclusion. Further discussion
of the featural aspects of these data, and related matters, may be
found in D. Blough (1985) and Blough & Franklin (1985).
In a different context, Dooling (e.g. 1990) faced the problem of relating
the differential behavioral effects of bird calls to the birds' perceptual
classification of those calls. To investigate this matter, calls
of budgerigars in a large breeding colony were recorded under several conditions
(e.g. for "contact calls" the birds were separated; for "alarm calls"
they were disturbed). Frequency spectra of samples of the calls are
shown in Figure 19. Then, in a discrimination task, birds
were rewarded for pecking a key within two sec following the successive
presentation of two different calls, but they were not rewarded if the
calls were identical. The bird's response latency on "different"
trials was taken as an index of stimulus similarity. Data analysis
was based on a matrix of these latencies that came from pairing each call
with every other call in a set. For the data shown in Figure 20, ten calls were used, five collected in a "contact" situation and five in an
"alarm" situation. A two-dimensional multidimensional scaling
solution and the output of a tree cluster analysis appear in Figure
20 (figures redrawn from Dooling et al., 1990). It is evident from these results that the two types of calls are indeed
distinct for the birds, and they are relatively similar within the alarm
and contact categories. Contact calls are less similar to each other
than are alarm calls, and this fits with other evidence that birds use
differences in contact calls for recognition of individual birds.
Measures of similarity exemplified in this section are conceptual and
analytic tools of considerable potential for the analysis of cognitive
processes in birds and other animals. Some key aspects:
(1) A variety of behavioral data can be used as input (e.g. search speed,
same-different discrimination speed and accuracy, generalization functions).
(2) Data on similarities can clarify the functioning of sensory and
perceptual systems (e.g. Schneider's pigeon color circle).
(3) Similarities determined by these methods may with some confidence
be compared across species boundaries (e.g. pigeon and human letter similarities).
(4) Similarity data may be used in conjunction with behavioral
observations to clarify the functional classification of stimuli
(e.g. the work of Dooling on bird calls).
(5) Scaling procedures can reveal the transformations that relate
behavioral measures of generalization and discrimination to perceptual
similarity (e.g. Shepard's Universal Law of Generalization).
(6) The spatial metrics determined by scaling can give information
on the analysis into features or dimensions performed by perceptual systems,
and can help to predict how attention or other variables may affect this
analysis (e.g. results of letter and artificial form experiments).