Large-Scale Open-Set Classification Protocols for ImageNet
This is a pre-print of the original paper accepted at the Winter Conference on Applications of Computer Vision (WACV) 2023.

Large-Scale Open-Set Classification Protocols for ImageNet

Andres Palechor, Annesha Bhoumik, Manuel Günther
Department of Informatics, University of Zurich, Andreasstrasse 15, CH-8050 Zurich
https://www.ifi.uzh.ch/en/aiml.html

arXiv:2210.06789v2 [cs.CV] 18 Oct 2022

Abstract

Open-Set Classification (OSC) intends to adapt closed-set classification models to real-world scenarios, where the classifier must correctly label samples of known classes while rejecting previously unseen unknown samples. Only recently, research has started to investigate algorithms that are able to handle these unknown samples correctly. Some of these approaches address OSC by including into the training set negative samples that a classifier learns to reject, expecting that these data increase the robustness of the classifier on unknown classes. Most of these approaches are evaluated on small-scale and low-resolution image datasets like MNIST, SVHN or CIFAR, which makes it difficult to assess their applicability to the real world, and to compare them among each other. We propose three open-set protocols that provide rich datasets of natural images with different levels of similarity between known and unknown classes. The protocols consist of subsets of ImageNet classes selected to provide training and testing data closer to real-world scenarios. Additionally, we propose a new validation metric that can be employed to assess whether the training of deep learning models addresses both the classification of known samples and the rejection of unknown samples. We use the protocols to compare the performance of two baseline open-set algorithms to the standard SoftMax baseline and find that the algorithms work well on negative samples that have been seen during training, and partially on out-of-distribution detection tasks, but drop performance in the presence of samples from previously unseen unknown classes.

1. Introduction

Automatic classification of objects in images has been an active direction of research for several decades now. The advent of Deep Learning has brought algorithms to a stage where they can handle large amounts of data and produce classification accuracies that were beyond imagination a decade before. Supervised image classification algorithms have achieved tremendous success when it comes to detecting classes from a finite number of known classes, which is commonly known as evaluation under the closed-set assumption. For example, the deep learning algorithms that attempt the classification of ten handwritten digits [16] achieve more than 99% accuracy when presented with a digit, but they ignore the fact that the classifier might be confronted with non-digit images during testing [6]. Even the well-known ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [26] contains 1000 classes during training, and the test set contains samples from exactly these 1000 classes, while the real world contains many more classes, e.g., the WordNet hierarchy [19] currently knows more than 100'000 classes.¹ Training a categorical classifier that can differentiate all these classes is currently not possible – only feature comparison approaches [22] exist – and, hence, we have to deal with samples that we do not know how to classify.

Only recently, research on methods to improve classification in the presence of unknown samples has gained more attention. These are samples from previously unseen classes that might occur during deployment of the algorithm in the real world and that the algorithm needs to handle correctly by not assigning them to any of the known classes. Bendale and Boult [2] provided the first algorithm that incorporates the possibility to reject a sample as unknown into a deep network that was trained on a finite set of known classes. Later, other algorithms were developed to improve the detection of unknown samples. Many of these algorithms require training on samples from some of the unknown classes that do not belong to the known classes of interest – commonly, these classes are called known unknowns [18], but since this formulation is more confusing than helpful, we will term these classes the negative classes. For example, Dhamija et al. [6] employed samples from a different dataset, i.e., they trained their system on MNIST as known classes and selected EMNIST letters as negatives.

¹ https://wordnet.princeton.edu
Figure 1: Class Sampling in our Open-Set Protocols. We make use of the WordNet hierarchy [19] to define three protocols of different difficulties. In this figure, we show the superclasses from which we sample the final classes, all of which are leaf nodes taken from the ILSVRC 2012 dataset. Dashed lines indicate that the lower nodes are descendants, but they might not be direct children of the upper nodes. Additionally, all nodes have more descendants than those shown in the figure. The colored bars below a class indicate that its subclasses are sampled for the purposes shown in the top-left of the figure. For example, all subclasses of “Dog” are used as known classes in protocol P1, while the subclasses of “Hunting Dog” are partitioned into known and negatives in protocol P2. For protocol P3, several intermediate nodes are partitioned into known, negative and unknown classes.

Other approaches try to create negative samples by utilizing known classes in different ways, e.g., Ge et al. [8] used a generative model to form negative samples, while Zhou et al. [30] try to utilize internal representations of mixed known samples.

One issue that is inherent in all of these approaches – with only a few exceptions [2, 25] – is that they evaluate only on small-scale datasets with a few known classes, such as 10 classes in MNIST [16], CIFAR-10 [14], SVHN [21] or mixtures of these. While many algorithms claim that they can handle unknown classes, the number of known classes is low, and it is unclear whether these algorithms can handle more known classes, or more diverse sets of unknown classes. Only lately, a large-scale open-set validation protocol has been defined on ImageNet [28], but it only separates unknown samples based on visual² and not semantic similarity.

Another issue of research on open-set classification is that most of the employed evaluation criteria, such as accuracy, macro-F1 or ROC metrics, do not evaluate open-set classification as it would be used in a real-world task. Particularly, the currently employed validation metrics that are used during training a network do not reflect the target task and, thus, it is unclear whether the selected model is actually the best model for the desired task.

In this paper we, therefore, propose large-scale open-set recognition protocols that can be used to train and test various open-set algorithms – and we will showcase the performance of three simple algorithms in this paper. We decided to build our protocols based on the well-known and well-investigated ILSVRC 2012 dataset [26], and we build three evaluation protocols P1, P2 and P3 that provide various difficulties based on the WordNet hierarchy [19], as displayed in Fig. 1. The protocols are publicly available,³ including source code for the baseline implementations and the evaluation, which enables the reproduction of the results presented in this paper. With these new protocols, we hope to foster more comparable and reproducible research into the direction of open-set object classification as well as related topics such as out-of-distribution detection. This allows researchers to test their algorithms on our protocols and directly compare with our results.

The contributions of this paper are as follows:

• We introduce three novel open-set evaluation protocols with different complexities for the ILSVRC 2012 dataset.

• We propose a novel evaluation metric that can be used for validation purposes when training open-set classifiers.

• We train deep networks with three different techniques and report their open-set performances.

• We provide all source code³ for training and evaluation of our models to the research community.

² In fact, Vaze et al. [28] do not specify their criteria to select unknown classes and only mention visual similarity in their supplemental material.
³ https://github.com/AIML-IfI/openset-imagenet
2. Related Work

In open-set classification, a classifier is expected to correctly classify known test samples into their respective classes, and correctly detect that unknown test samples do not belong to any known class. The study of unknown instances is not new in the literature. For example, novelty detection, which is also known as anomaly detection and has a high overlap with out-of-distribution detection, focuses on identifying test instances that do not belong to training classes. It can be seen as a binary classification problem that determines if an instance belongs to any of the training classes or not, but without exactly deciding which class [4], and includes approaches in supervised, semi-supervised and unsupervised learning [13, 23, 10].

However, all these approaches only consider the classification of samples into known and unknown, leaving the later classification of known samples into their respective classes as a second step. Ideally, these two steps should be incorporated into one method. An easy approach would be to threshold on the maximum class probability of the SoftMax classifier using a confidence threshold, assuming that for an unknown input, the probability would be distributed across all the classes and, hence, would be low [17]. Unfortunately, inputs often overlap significantly with known decision regions and tend to get misclassified as a known class with high confidence [6]. It is therefore essential to devise techniques that are more effective than simply thresholding SoftMax probabilities in detecting unknown inputs. Some initial approaches include extensions of 1-class and binary Support Vector Machines (SVMs) as implemented by Scheirer et al. [27] and devising recognition systems to continuously learn new classes [1, 25].

While the above methods make use only of known samples in order to disassociate unknown samples, other approaches require samples of some negative classes, hoping that these samples generalize to all unseen classes. For example, Dhamija et al. [6] utilize negative samples to train the network to provide low confidence values for all known classes when presented with a sample from an unknown class. Many researchers [8, 29, 20] utilize generative adversarial networks to produce negative samples from the known samples. Zhou et al. [30] combined pairs of known samples to define negatives, both in input space and deeper in the network. Other approaches to open-set recognition are discussed by Geng et al. [9].

One problem that all the above methods share is that they are evaluated on small-scale datasets with low-resolution images and low numbers of classes. Such datasets include MNIST [16], SVHN [21] and CIFAR-10 [14], where oftentimes a few random classes are used as known and the remaining classes as unknown [9]. Sometimes, other datasets serve the roles of unknowns, e.g., when MNIST builds the known classes, EMNIST letters [11] are used as negatives and/or unknowns. Similarly, the known classes are composed of CIFAR-10, while other classes from CIFAR-100 or SVHN are negatives or unknowns [15, 6]. Only a few papers make use of large-scale datasets such as ImageNet, where they either use the classes of ILSVRC 2012 as known and other classes from ImageNet as unknown [2, 28], or random partitions of ImageNet [25, 24].

Oftentimes, evaluation protocols are home-grown and, thus, the comparison across algorithms is very difficult. Additionally, there is no clear distinction of the similarities between known, negative and unknown classes, which makes it impossible to judge in which scenarios a method will work, and in which it will not. Finally, the employed evaluation metrics are most often not designed for open-set classification and, hence, fail to address typical use-cases of open-set recognition.

3. Approach

3.1. ImageNet Protocols

Based on [3], we design three different protocols to create three different artificial open spaces, with increasing levels of similarity in appearance between inputs – and increasing complexity and overlap between features – of known and unknown classes. To allow for the comparison of algorithms that require negative samples for training, we carefully design and include negative classes into our protocols. This also allows us to compare how well these algorithms work on previously seen negative classes and how well on previously unseen unknown classes.

In order to define our three protocols, we make use of the WordNet hierarchy that provides us with a tree structure for the 1000 classes of ILSVRC 2012. Particularly, we exploit the robustness Python library [7] to parse the ILSVRC tree. All the classes in ILSVRC are represented as leaf nodes of that graph, and we use descendants of several intermediate nodes to form our known and unknown classes. The definition of the protocols and their open-set partitions are presented in Fig. 1; a more detailed listing of classes can be found in the supplemental material. We design the protocols such that the difficulty levels of closed- and open-set evaluation vary. While protocol P1 is easy for open-set, it is hard for closed-set classification. On the contrary, P3 is easier for closed-set classification and more difficult in open-set. Finally, P2 is somewhere in the middle, but small enough to run hyperparameter optimization that can be transferred to P1 and P3.
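To illustrate the class sampling, the following minimal sketch collects all ILSVRC 2012 leaf classes that descend from a chosen WordNet superclass. It is not the released protocol code; the file name wordnet.is_a.txt, its parent-child line format, and the ilsvrc_leaves set are assumptions used only for illustration.

```python
# Sketch: gather ILSVRC leaf classes below a WordNet node (e.g., "dog").
# Assumes a file with one "parent_wnid child_wnid" pair per line and a set
# containing the 1000 ILSVRC 2012 class wnids.
from collections import defaultdict

def load_children(is_a_path="wordnet.is_a.txt"):
    children = defaultdict(list)
    with open(is_a_path) as f:
        for line in f:
            parent, child = line.split()
            children[parent].append(child)
    return children

def leaf_descendants(root_wnid, children, ilsvrc_leaves):
    """Depth-first traversal; keep only descendants that are ILSVRC classes."""
    stack, found = [root_wnid], []
    while stack:
        node = stack.pop()
        if node in ilsvrc_leaves:
            found.append(node)
        stack.extend(children.get(node, []))
    return sorted(found)

# Hypothetical usage: the known classes of P1 are the descendants of "dog"
# (wnid n02084071), given the set of ILSVRC 2012 leaf wnids.
# dogs = leaf_descendants("n02084071", load_children(), ilsvrc_leaves)
```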
Table 1: ImageNet Classes used in the Protocols. This table shows the ImageNet parent classes that were used to create the three protocols. Known and negative classes are used for training the open-set algorithms, while known, negative and unknown classes are used in testing. Given are the numbers of classes, followed by the numbers of training / validation / test samples.

P1  Known: All dog classes; 116 classes: 116218 / 29055 / 5800
    Negative: Other 4-legged animal classes; 67 classes: 69680 / 17420 / 3350
    Unknown: Non-animal classes; 166 classes: — / — / 8300
P2  Known: Half of hunting dog classes; 30 classes: 28895 / 7224 / 1500
    Negative: Half of hunting dog classes; 31 classes: 31794 / 7949 / 1550
    Unknown: Other 4-legged animal classes; 55 classes: — / — / 2750
P3  Known: Mix of common classes including animals, plants and objects; 151 classes: 154522 / 38633 / 7550
    Negative: Mix of common classes including animals, plants and objects; 97 classes: 98202 / 24549 / 4850
    Unknown: Mix of common classes including animals, plants and objects; 164 classes: — / — / 8200

In the first protocol P1, known and unknown classes are semantically quite distant, and also do not share too many visual features. We include all 116 dog classes as known classes – since dogs represent the largest fine-grained intermediate category in ImageNet, which makes closed-set classification difficult – and select 166 non-animal classes as unknowns. P1 can, therefore, be used to test out-of-distribution detection algorithms since knowns and unknowns are not very similar. In the second protocol P2, we only look into the animal classes. Particularly, we use several hunting dog classes as known and other classes of 4-legged animals as unknown. This means that known and unknown classes are still semantically relatively distant, but image features such as fur are shared between known and unknown. This will make it harder for out-of-distribution detection algorithms to perform well. Finally, the third protocol P3 includes ancestors of various different classes, both as known and unknown classes, by making use of the mixed 13 classes defined in the robustness library. Since known and unknown classes come from the same ancestors, it is very unlikely that out-of-distribution detection algorithms will be able to discriminate between them, and real open-set classification methods need to be applied.
To enable algorithms that require negative samples, the negative classes are selected to be semantically similar to the known classes, or at least in-between the known and the unknown. It has been shown that selecting negative samples too far from the known classes does not help in creating better-suited open-set algorithms [6]. Naturally, we can only define semantic similarity based on the WordNet hierarchy, but it is unclear whether these negative classes are also structurally similar to the known classes. Tab. 1 displays a summary of the parent classes used in the protocols, and a detailed list of all classes is presented in the supplemental material.

Finally, we split our data into three partitions, one for training, one for validation and one for testing. The training and validation partitions are taken from the original ILSVRC 2012 training images by randomly splitting off 80% for training and 20% for validation. Since training and validation partitions are composed of known and negative data only, no unknown data is provided here. The test partition is composed of the original ILSVRC validation set containing 50 images per class, and is available for all three groups of data: known, negative and unknown. This assures that during testing no single image is used that the network has seen in any stage of the training.

3.2. Open-Set Classification Algorithms

We select three different techniques to train deep networks. While other algorithms shall be tested in future work, we rely on three simple, very similar and well-known methods. In particular, all three loss functions solely utilize the plain categorical cross-entropy loss J_CCE on top of SoftMax activations (often termed the SoftMax loss) in different settings. Generally, the weighted categorical cross-entropy loss is:

J_{CCE} = -\frac{1}{N} \sum_{n=1}^{N} \sum_{c=1}^{C} w_c \, t_{n,c} \log y_{n,c}    (1)

where N is the number of samples in our dataset (note that we utilize batch processing), t_{n,c} is the target label of the nth sample for class c, w_c is a class weight for class c, and y_{n,c} is the output probability of class c for sample n using the SoftMax activation:

y_{n,c} = \frac{e^{z_{n,c}}}{\sum_{c'=1}^{C} e^{z_{n,c'}}}    (2)

of the logits z_{n,c}, which are the network outputs. The three different training approaches differ with respect to the targets t_{n,c} and the weights w_c, and how negative samples are handled.
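For reference, Eqs. (1) and (2) could be implemented as follows; this is a minimal PyTorch sketch with variable names following the paper, not the code released with the protocols.

```python
# Weighted categorical cross-entropy over SoftMax probabilities, Eqs. (1)-(2).
import torch
import torch.nn.functional as F

def weighted_cce(logits, targets, class_weights):
    """logits: (N, C) network outputs z; targets: (N, C) values t_{n,c},
    which may be soft; class_weights: (C,) values w_c."""
    log_y = F.log_softmax(logits, dim=1)            # log of Eq. (2)
    loss = -(class_weights * targets * log_y).sum(dim=1)
    return loss.mean()                              # batch average of Eq. (1)
```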
The first approach is the plain SoftMax loss (S) that is trained only on samples from the K known classes. In this case, the number of classes C = K is equal to the number of known classes, and the targets are computed as one-hot encodings:

\forall n, c \in \{1, \ldots, C\}: \quad t_{n,c} = \begin{cases} 1 & c = \tau_n \\ 0 & \text{otherwise} \end{cases}    (3)

where 1 ≤ τ_n ≤ K is the label of the sample n. For simplicity, we select the weights for each class to be identical: ∀c: w_c = 1, which is the default behavior when training deep learning models on ImageNet. By thresholding the maximum probability max_c y_{n,c}, cf. Sec. 3.3, this method can be turned into a simple out-of-distribution detection algorithm.

The second approach is often found in object detection models [5], which collect a lot of negative samples from the background of the training images. Similarly, this approach is used in other methods for open-set learning, such as G-OpenMax [8] or PROSER [30].⁴ In this Background (BG) approach, the negative data is added as an additional class, so that we have a total of C = K + 1 classes. Since the number of negative samples is usually higher than the number for known classes, we use class weights to balance them:

\forall c \in \{1, \ldots, C\}: \quad w_c = \frac{N}{C \, N_c}    (4)

where N_c is the number of training samples for class c. Finally, we use one-hot encoded targets t_{n,c} according to (3), including label τ_n = K + 1 for negative samples.

As the third method, we employ the Entropic Open-Set (EOS) loss [6], which is a simple extension of the SoftMax loss. Similar to our first approach, we have one output for each of the known classes: C = K. For known samples, we employ one-hot-encoded target values according to (3), whereas for negative samples we use identical target values:

\forall n, c \in \{1, \ldots, C\}: \quad t_{n,c} = \frac{1}{C}    (5)

Sticking to the implementation of Dhamija et al. [6], we select the class weights to be ∀c: w_c = 1 for all classes including the negative class, and leave the optimization of these values for future research.

⁴ While these methods try to sample better negatives for training, they rely on this additional class for unknown samples.
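The three variants differ only in how targets and class weights are built. The sketch below illustrates Eqs. (3)-(5) under the assumption that negative samples carry the label -1; this encoding and the function names are illustrative, not taken from the released code.

```python
# Targets t_{n,c} and weights w_c for the S, BG and EOS variants.
import torch

def make_targets(labels, num_known, variant):
    """labels: (N,) LongTensor in {0,...,K-1} for knowns, -1 for negatives
    (S is trained on known samples only)."""
    if variant == "S":                        # Eq. (3); knowns only, C = K
        return torch.eye(num_known)[labels]
    if variant == "BG":                       # Eq. (3) with extra class, C = K+1
        bg_labels = labels.clone()
        bg_labels[labels < 0] = num_known     # negatives -> background class
        return torch.eye(num_known + 1)[bg_labels]
    if variant == "EOS":                      # Eq. (5); uniform 1/C for negatives
        targets = torch.eye(num_known)[labels.clamp(min=0)]
        targets[labels < 0] = 1.0 / num_known
        return targets
    raise ValueError(variant)

def bg_class_weights(targets):
    """Eq. (4): w_c = N / (C * N_c), computed from the one-hot targets of the
    full training set (so that no class count is zero)."""
    counts = targets.sum(dim=0)               # N_c per class
    return targets.shape[0] / (targets.shape[1] * counts)
```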
3.3. Evaluation Metrics

Evaluation of open-set classification methods is a trickier business. First, we must differentiate between validation metrics to monitor the training process and testing methods for the final reporting. Second, we need to incorporate both types of algorithms, the ones that provide a separate probability for the unknown class and those that do not.

The final evaluation on the test set should differentiate between the behavior of known and unknown classes, and at the same time include the accuracy of the known classes. Many evaluation techniques proposed in the literature do not follow these requirements. For example, computing the area under the ROC curve (AUROC) will only consider the binary classification task: known or unknown, but does not tell us how well the classifier performs on the known classes. Another metric that is often applied is the macro-F1 metric [2] that balances precision and recall for a K+1-fold binary classification task. This metric has many properties that are counter-intuitive in the open-set classification task. First, a different threshold is computed for each of the classes, so it is possible that the same sample can be classified both as one or more known classes and as unknown. These thresholds are even optimized on the test set, and often only the maximum F1-value is reported. Second, the method requires defining a particular probability of being unknown, which is not provided by two of our three networks. Finally, the metric does not distinguish between known and unknown classes, but treats all classes identically, although the consequences of classifying an unknown sample as known are different from misclassifying a known sample.

The evaluation metric that follows our intuition best is the Open-Set Classification Rate (OSCR), which handles known and unknown samples separately [6]. Based on a single probability threshold θ, we compute both the Correct Classification Rate (CCR) and the False Positive Rate (FPR):

CCR(\theta) = \frac{\bigl|\{x_n \mid \tau_n \le K \wedge \arg\max_{1 \le c \le K} y_{n,c} = \tau_n \wedge y_{n,\tau_n} > \theta\}\bigr|}{N_K}
FPR(\theta) = \frac{\bigl|\{x_n \mid \tau_n > K \wedge \max_{1 \le c \le K} y_{n,c} > \theta\}\bigr|}{N_U}    (6)

where N_K and N_U are the total numbers of known and unknown test samples, while τ_n ≤ K indicates a known sample and τ_n > K refers to an unknown test sample. By varying the threshold θ between 0 and 1, we can plot the OSCR curve [6]. A closer look at (6) reveals that the maximum is only taken over the known classes, purposefully leaving out the probability of the unknown class in the BG approach.⁵ Finally, this definition differs from [6] in that we use a > sign for both FPR and CCR when comparing to θ, which is critical when SoftMax probabilities of unknown samples reach 1 up to the numerical limits.

⁵ A low probability for the unknown class does not indicate a high probability for any of the known classes. Therefore, the unknown class probability does not add any useful information.
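A minimal NumPy sketch of the two quantities in Eq. (6) for a single threshold is given below; it mirrors the definition rather than the evaluation code shipped with the protocols.

```python
# CCR(theta) and FPR(theta) from SoftMax scores over the K known classes.
import numpy as np

def ccr_fpr(known_scores, known_labels, unknown_scores, theta):
    """known_scores: (N_K, K); known_labels: (N_K,); unknown_scores: (N_U, K)."""
    pred = known_scores.argmax(axis=1)
    true_score = known_scores[np.arange(len(known_labels)), known_labels]
    ccr = np.mean((pred == known_labels) & (true_score > theta))
    fpr = np.mean(unknown_scores.max(axis=1) > theta)
    return ccr, fpr

# The OSCR curve is obtained by sweeping theta over [0, 1] (e.g., over all
# observed scores) and plotting CCR(theta) against FPR(theta).
```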
Note that the computation of the Correct Classification Rate – which is highly related to the classification accuracy on the known classes – has the issue that it might be biased when known classes have different amounts of samples. Since the number of test samples in our known classes is always balanced, in our evaluation we are not affected by this bias, so we leave the adaptation of that metric to unbalanced datasets as future work. Furthermore, the metric just averages over all samples, telling us nothing about the different behavior of different classes – it might be possible that one known class is always classified correctly while another class never is. For a better inspection of these cases, open-set adaptations of confusion matrices need to be developed in the future.

3.4. Validation Metrics

For validation of SoftMax-based systems, classification accuracy is often used as the metric. In open-set classification, this is not sufficient since we need to balance between accuracy on the known classes and on the negative class. While using (weighted) accuracy might work well for the BG approach, networks trained with standard SoftMax and EOS do not provide a probability for the unknown class and, hence, accuracy cannot be applied for validation here. Instead, we want to make use of the SoftMax scores to evaluate our system. Since the final goal is to find a threshold θ such that the known samples are differentiated from unknown samples, we propose to compute the validation metric using our confidence metric:

\gamma^+ = \frac{1}{N_K} \sum_{n=1}^{N_K} y_{n,\tau_n} \qquad \gamma^- = \frac{1}{N_N} \sum_{n=1}^{N_N} \Bigl(1 - \max_{1 \le c \le K} y_{n,c} + \frac{\delta_{C,K}}{K}\Bigr) \qquad \gamma = \frac{\gamma^+ + \gamma^-}{2}    (7)

where N_K and N_N denote the numbers of known and negative samples in the validation set. For known samples, γ⁺ simply averages the SoftMax score for the correct class, while for negative samples, γ⁻ computes the average deviation from the minimum possible SoftMax score, which is 0 in case of the BG class (where C = K + 1), and 1/K if no additional background class is available (C = K). When summing over all known and negative samples, we can see how well our classifier has learned to separate known from negative samples. The maximum γ score is 1 when all known samples are classified as the correct class with probability 1, and all negative samples are classified as any known class with probability 0 or 1/K. When looking at γ⁺ and γ⁻ individually, we can also detect if the training focuses on one part more than on the other.
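The following sketch computes the confidence metric of Eq. (7) from SoftMax scores; it follows the definition above and is not necessarily identical to the authors' implementation.

```python
# Confidence validation metric gamma, Eq. (7).
import numpy as np

def confidence_metric(known_scores, known_labels, negative_scores, num_known):
    """known_scores: (N_K, C); known_labels: (N_K,); negative_scores: (N_N, C);
    C = K for S/EOS and C = K+1 for BG."""
    C = known_scores.shape[1]
    # gamma^+: average SoftMax score of the correct class for known samples
    gamma_pos = known_scores[np.arange(len(known_labels)), known_labels].mean()
    # gamma^-: average deviation from the smallest achievable maximum score;
    # the Kronecker delta term adds 1/K only when there is no background class
    offset = 1.0 / num_known if C == num_known else 0.0
    gamma_neg = (1.0 - negative_scores[:, :num_known].max(axis=1) + offset).mean()
    return (gamma_pos + gamma_neg) / 2.0, gamma_pos, gamma_neg
```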
4. Experiments

Considering that our goal is not to achieve the highest closed-set accuracy but to analyze the performance of open-set algorithms in our protocols, we use a ResNet-50 model [12] as it achieves low classification errors on ImageNet, trains rapidly, and is commonly used in the image classification task. We add one fully-connected layer with C nodes. For each protocol, we train models using the three loss functions SoftMax (S), SoftMax with Background class (BG) and Entropic Open-Set (EOS) loss. Each network is trained for 120 epochs using the Adam optimizer with a learning rate of 10⁻³ and default beta values of 0.9 and 0.999. Additionally, we use standard data preprocessing, i.e., first, the smaller dimension of the training images is resized to 256 pixels, and then a random crop of 224 pixels is selected. Finally, we augment the data using a random horizontal flip with a probability of 0.5.
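A rough PyTorch sketch of this setup is shown below; dataset wiring and the choice of loss are omitted, and replacing the final fully-connected layer is one plausible way to realize the C-node output layer described above, not necessarily the authors' exact construction.

```python
# ResNet-50, Adam (lr 1e-3), and the described preprocessing/augmentation.
import torch
from torchvision import models, transforms

train_transform = transforms.Compose([
    transforms.Resize(256),                  # smaller side to 256 pixels
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])

def build_model(num_outputs):
    """ResNet-50 backbone with a C-way fully-connected output layer."""
    net = models.resnet50()
    net.fc = torch.nn.Linear(net.fc.in_features, num_outputs)
    return net

model = build_model(num_outputs=117)         # e.g., BG on P1: K = 116 knowns + 1
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
# Training then runs for 120 epochs over the known and negative samples.
```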
Figure 2: Open-Set Classification Rate Curves. OSCR curves are shown for test data of each protocol. The top row is calculated using negative test samples, while the bottom row uses unknown test samples. Curves that do not extend to low FPR values indicate that the threshold in (6) is maximized at θ = 1.

Fig. 2 shows OSCR curves for the three methods on our three protocols using logarithmic FPR axes – for linear FPR axes, please refer to the supplemental material. We plot the test set performance for both the negative and the unknown test samples. Hence, we can see how the methods work with unknown samples of classes⁶ that have or have not been seen during training. We can observe that all three classifiers in every protocol reach similar CCR values in the closed-set case (FPR=1); in some cases, EOS or BG even outperform the baseline. This is good news since, oftentimes, open-set classifiers trade their open-set capabilities for reduced closed-set accuracy. In the supplemental material, we also provide a table with detailed CCR values for particular selected FPRs, and γ⁺ and γ⁻ values computed on the test set.

Regarding the performance on negative samples of the test set, we can see that BG and EOS outperform the SoftMax (S) baseline, indicating that the classifiers learn to discard negatives. Generally, EOS seems to be better than BG in this task. Particularly, in P1 EOS reaches a high CCR at FPR=10⁻², showing the classifier can easily reject the negative samples, which is to be expected since negative samples are semantically and structurally far from the known classes.

When evaluating the unknown samples of the test set that belong to classes that have not been seen during training, BG and EOS classifiers drop performance, and compared to the gains on the validation set the behavior is almost similar to the SoftMax baseline. Especially when looking into P2 and P3, training with the negative samples does not clearly improve the open-set classifiers. However, in P1 EOS still outperforms S and BG for higher FPR, indicating the classifier learned to discard unknown samples up to some degree. This shows that the easy task of rejecting samples very far from the known classes can benefit from EOS training with negative samples, i.e., the denoted open-set method is good for out-of-distribution detection, but not for the more general task of open-set classification.

⁶ Remember that the known and negative sets are split into training and test samples so that we never evaluate with samples seen during training.

5. Discussion

After we have seen that the methods perform well on negative and not so well on unknown data, let us analyze the results. First, we show how our novel validation metrics can be used to identify gaps and inconsistencies during training of the open-set classifiers BG and EOS. Fig. 3 shows the confidence progress across the training epochs. During the first epochs, the confidence of the known samples (γ⁺, left in Fig. 3) is low since the SoftMax activations produce low values for all classes. As the training progresses, the models learn to classify known samples, increasing the correct SoftMax activation of the target class. Similarly, because of low activation values, the confidence of negative samples (γ⁻, right) is close to 1 at the beginning of the training. Note that EOS keeps the low activations during training, learning to respond only to known classes, particularly in P1, where values are close to 1 during all epochs. On the other hand, BG provides lower confidences for negative samples (γ⁻). This indicates that the class balancing technique in (4) might have been too drastic and that higher weights for negative samples might improve results of the BG method. Similarly, employing lower weights for the EOS classifier might improve the confidence scores for known samples at the cost of lower confidence for negatives. Finally, from an open-set perspective, our confidence metric provides insightful information about the model training; so far, we have used it to explain the model performance, but together with more parameter tuning, the joint γ metric can be used as a criterion for early stopping, as shown in the supplemental material.

We also analyze the SoftMax scores according to (2) of known and unknown classes in every protocol. For samples from known classes we use the SoftMax score of the correct class, while for unknown samples we take the maximum SoftMax score of any known class. This enables us to make a direct comparison between different approaches in Fig. 4. When looking at the score distributions of the known samples, we can see that many samples are correctly classified with a high probability, while numerous samples provide almost 0 probability for the correct class. This indicates that a more detailed analysis, possibly via a confusion matrix, is required to further look into the details of these errors, but this goes beyond the aim of this paper.
Figure 3: Confidence Propagation. Confidence values according to (7) are shown across training epochs of S, BG and EOS classifiers. On the left, we show the confidence of the known samples (γ⁺), while on the right the confidence of negative samples (γ⁻) is displayed.

Figure 4: Histograms of SoftMax Scores. We evaluate SoftMax probability scores for all three methods and all three protocols. For known samples, we present histograms of the SoftMax score of the target class. For unknown samples, we plot the maximum SoftMax score of any known class. For S and EOS, the minimum possible value of the latter is 1/K, which explains the gaps on the left-hand side.

More interestingly, the distribution of scores for unknown classes differs dramatically between approaches and protocols. For P1, EOS is able to suppress high scores almost completely, whereas both S and BG still have the majority of the cases providing high probabilities of belonging to a known class. For P2 and, particularly, P3 a lot of unknown samples get classified as a known class with very high probability, throughout the evaluated methods. Interestingly, the plain SoftMax (S) method has relatively high probability scores for unknown samples, especially in P2 and P3 where known and unknown classes are semantically similar.

6. Conclusion

In this work, we propose three novel evaluation protocols for open-set image classification that rely on the ILSVRC 2012 dataset and allow an extensive evaluation of open-set algorithms. The data is entirely composed of natural images and designed to have various levels of similarities between its partitions. Additionally, we carefully select the WordNet parent classes that allow us to include a larger number of known, negative and unknown classes. In contrast to previous work, the class partitions are carefully designed, and we move away from implementing mixes of several datasets (where rejecting unknown samples could be relatively easy) and the random selection of known and unknown classes inside a dataset. This allows us to differentiate between methods that work well in out-of-distribution detection, and those that really perform open-set classification. A more detailed comparison of the protocols is provided in the supplemental material.

We evaluate the performance of three classifiers in every protocol using OSCR curves and our proposed confidence validation metric. Our experiments show that the two open-set algorithms can reject negative samples, where samples of the same classes have been seen during training, but face a performance degradation in the presence of unknown data from previously unseen classes. For a more straightforward scenario such as P1, it is advantageous to use negative samples during EOS training. While this result agrees with [6], the performance of BG and EOS in P2 and P3 shows that these methods are not ready to be employed in the real world, and more parameter tuning is required to improve performances. Furthermore, making better use of or augmenting the negative classes also poses a challenge for further research in open-set methods.
Providing different conclusions for the three different protocols reflects the need for evaluation methods in scenarios designed with different difficulty levels, which we provide within this paper. Looking ahead, with the novel open-set classification protocols on ImageNet we aim to establish comparable and standard evaluation methods for open-set algorithms in scenarios closer to the real world. Instead of using low-resolution images and randomly selected samples from CIFAR, MNIST or SVHN, we expect that our open-set protocols will establish benchmarks and promote reproducible research in open-set classification. In future work, we will investigate and optimize more open-set algorithms and report their performances on our protocols.
References

[1] Abhijit Bendale and Terrance Boult. Towards open world recognition. In Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015.
[2] Abhijit Bendale and Terrance E. Boult. Towards open set deep networks. In Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016.
[3] Annesha Bhoumik. Open-set classification on ImageNet. Master's thesis, University of Zurich, 2021.
[4] Paul Bodesheim, Alexander Freytag, Erik Rodner, and Joachim Denzler. Local novelty detection in multi-class recognition problems. In Winter Conference on Applications of Computer Vision (WACV), 2015.
[5] Akshay Dhamija, Manuel Günther, Jonathan Ventura, and Terrance E. Boult. The overlooked elephant of object detection: Open set. In Winter Conference on Applications of Computer Vision (WACV), 2020.
[6] Akshay Raj Dhamija, Manuel Günther, and Terrance E. Boult. Reducing network agnostophobia. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
[7] Logan Engstrom, Andrew Ilyas, Shibani Santurkar, and Dimitris Tsipras. Robustness (Python library), 2019.
[8] Zongyuan Ge, Sergey Demyanov, and Rahil Garnavi. Generative OpenMax for multi-class open set classification. In British Machine Vision Conference (BMVC), 2017.
[9] Chuanxing Geng, Sheng-Jun Huang, and Songcan Chen. Recent advances in open set recognition: A survey. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 43(10):3614–3631, 2021.
[10] Izhak Golan and Ran El-Yaniv. Deep anomaly detection using geometric transformations. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
[11] Patrick Grother and Kayee Hanaoka. NIST special database 19 handprinted forms and characters, 2nd edition. Technical report, National Institute of Standards and Technology (NIST), 2016.
[12] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[13] Heinrich Jiang, Been Kim, Melody Guan, and Maya Gupta. To trust or not to trust a classifier. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
[14] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
[15] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems (NIPS), 2017.
[16] Yann LeCun, Corinna Cortes, and Christopher J. C. Burges. The MNIST database of handwritten digits, 1998.
[17] Ofer Matan, R. K. Kiang, C. E. Stenard, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, and Yann Le Cun. Handwritten character recognition using neural network architectures. In USPS Advanced Technology Conference, 1990.
[18] Dimity Miller, Lachlan Nicholson, Feras Dayoub, and Niko Sünderhauf. Dropout sampling for robust object detection in open-set conditions. In International Conference on Robotics and Automation (ICRA). IEEE, 2018.
[19] George A. Miller. WordNet: An electronic lexical database. MIT Press, 1998.
[20] Lawrence Neal, Matthew Olson, Xiaoli Fern, Weng-Keen Wong, and Fuxin Li. Open set learning with counterfactual images. In European Conference on Computer Vision (ECCV), 2018.
[21] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. Reading digits in natural images with unsupervised feature learning. In Advances in Neural Information Processing Systems (NIPS) Workshop, 2011.
[22] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning (ICML), 2021.
[23] Jie Ren, Peter J. Liu, Emily Fertig, Jasper Snoek, Ryan Poplin, Mark Depristo, Joshua Dillon, and Balaji Lakshminarayanan. Likelihood ratios for out-of-distribution detection. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
[24] Ryne Roady, Tyler L. Hayes, Ronald Kemker, Ayesha Gonzales, and Christopher Kanan. Are open set classification methods effective on large-scale datasets? PLOS ONE, 15(9), 2020.
[25] Ethan M. Rudd, Lalit P. Jain, Walter J. Scheirer, and Terrance E. Boult. The extreme value machine. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017.
[26] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision (IJCV), 115(3), 2015.
[27] Walter J. Scheirer, Anderson de Rezende Rocha, Archana Sapkota, and Terrance E. Boult. Towards open set recognition. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 35(7), 2013.
[28] Sagar Vaze, Kai Han, Andrea Vedaldi, and Andrew Zisserman. Open-set recognition: A good closed-set classifier is all you need? In International Conference on Learning Representations (ICLR), 2022.
[29] Yang Yu, Wei-Yang Qu, Nan Li, and Zimin Guo. Open-category classification by adversarial sample generation. In International Joint Conference on Artificial Intelligence (IJCAI), 2017.
[30] Da-Wei Zhou, Han-Jia Ye, and De-Chuan Zhan. Learning placeholders for open-set recognition. In Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Supplemental Material

Figure 5: Linear OSCR Curves. OSCR curves in linear scale for negative data (top row) and unknown data (bottom row) of the test set for each protocol.

7. Linear OSCR Curves

Depending on the application, different false positive values are of importance. In the main paper, we focus more on the low FPR regions since these are of more importance when a human needs to look into the positive cases and, hence, large numbers of false positives involve labor costs. When false positives are not of utmost importance, e.g., when handled by an automated system in low-security applications, the focus will be more on having a high correct classification rate. Therefore, in Fig. 5 we provide the OSCR plots according to figure 2 of the main paper, but this time with a linear FPR axis. These plots highlight even more that the Entropic Open-Set (EOS) and the SoftMax with Background class (BG) approaches fail when known and unknown classes are very similar, as provided in P2 and P3.

8. Results Sorted by Protocol

Fig. 6 displays the same results as shown in Figure 2 of the main paper, but here we compare the results of the different protocols. This shows that the definition of our protocols follows our intuition. We can see that protocol P1 is easy in open-set, but difficult in closed-set operation, especially when looking at the performance of BG and EOS, which are designed for open-set classification. On the contrary, protocol P3 achieves higher closed-set recognition performance (at FPR = 1) while dropping much faster for lower FPR values. Protocol P2 is difficult both in closed- and open-set evaluation, but open-set performance does not drop as drastically as in P3, especially considering the unknown test data in the second row. Due to its relatively small size, P2 is optimal for hyperparameter optimization, and hyperparameters should be able to be transferred to the other two protocols – unless the algorithm is sensitive w.r.t. hyperparameters. We leave hyperparameter optimization to future work.

9. Detailed Results

In Tab. 2 we include detailed numbers of the values seen in figures 2 and 3 of the main paper. We also include the cases where we use our confidence metric to select the best epoch of training. When comparing Tab. 2(a) with Tab. 2(b) it can be seen that the selection of the best algorithm based on the confidence score provides better CCR values for BG in protocols P1 and P2, and better or comparable numbers for EOS across all protocols. The advantage for S is not as pronounced since S does not take negative samples into consideration and, thus, the confidence for unknown γ⁻ gets very low in Tab. 2(a). When using our validation metric, training stops early and does not provide a good overall, i.e., closed-set performance. Therefore, we propose to use our novel evaluation metric for training open-set classifiers, but not for closed-set classifiers such as S.
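As a small illustration of this model selection, the best epoch in Tab. 2(b) can be chosen as the one maximizing the joint γ of Eq. (7) on the validation set; the function below is a hypothetical helper, not part of the released code.

```python
# Pick the epoch with the highest validation gamma (cf. Tab. 2(b)).
def best_epoch_by_gamma(gamma_per_epoch):
    """gamma_per_epoch: dict mapping epoch number -> joint gamma on validation."""
    best = max(gamma_per_epoch, key=gamma_per_epoch.get)
    return best, gamma_per_epoch[best]

# Example with made-up values: best_epoch_by_gamma({1: 0.41, 2: 0.47, 3: 0.45})
# returns (2, 0.47).
```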
Figure 6: Comparison of Protocols. OSCR curves are split by algorithm to allow a comparison across protocols. Results are shown for negative data (top row) and unknown data (bottom row).

10. Detailed List of Classes

For reproducibility, we also provide the list of classes (ImageNet identifier and class name) that we used in our three protocols in Tables 3, 4 and 5. For known and negative classes, we used the samples of the original training partition for training and validation, and the samples from the validation partition for testing – since the test set labels of ILSVRC 2012 are still not publicly available.
Table 2: Classification Results. Correct Classification Rates (CCR) are provided at some False Positive Rates (FPR) in the test set. The best CCR values are in bold. γ⁺ is calculated using known and γ⁻ using unknown classes.

(a) Using Last Epoch

          Epoch   γ⁺      γ⁻      CCR at FPR of:
                                  1e-3    1e-2    1e-1    1
P1 - S    120     0.665   0.470   0.077   0.326   0.549   0.678
P1 - BG   120     0.650   0.520   0.146   0.342   0.519   0.663
P1 - EOS  120     0.663   0.846   0.195   0.409   0.667   0.684
P2 - S    120     0.665   0.341   —       0.189   0.443   0.683
P2 - BG   120     0.598   0.480   —       0.111   0.339   0.616
P2 - EOS  120     0.592   0.708   —       0.233   0.476   0.664
P3 - S    120     0.767   0.240   —       —       0.540   0.777
P3 - BG   120     0.753   0.344   —       0.241   0.543   0.764
P3 - EOS  120     0.684   0.687   —       0.298   0.568   0.763

(b) Using Best Epoch

          Epoch   γ⁺      γ⁻      CCR at FPR of:
                                  1e-3    1e-2    1e-1    1
P1 - S    20      0.584   0.906   0.270   0.543   0.664   0.667
P1 - BG   99      0.660   0.628   0.167   0.420   0.572   0.673
P1 - EOS  105     0.672   0.877   0.283   0.535   0.683   0.694
P2 - S    39      0.603   0.528   0.031   0.134   0.384   0.659
P2 - BG   113     0.638   0.491   —       0.156   0.368   0.658
P2 - EOS  101     0.599   0.694   —       0.223   0.473   0.666
P3 - S    15      0.677   0.527   0.083   0.239   0.509   0.745
P3 - BG   115     0.747   0.379   —       0.243   0.539   0.761
P3 - EOS  114     0.696   0.688   —       0.266   0.585   0.772
Table 3: Protocol 1 Classes: List of super-classes and leaf nodes of Protocol P1 . Known Negative Unknown Id Name Id Name Id Name n02084071 dog n02118333 fox n07555863 food n02085620 Chihuahua n02119022 red fox n07684084 French loaf n02085782 Japanese spaniel n02119789 kit fox n07693725 bagel n02085936 Maltese dog n02120079 Arctic fox n07695742 pretzel n02086079 Pekinese n02120505 grey fox n07714571 head cabbage n02086240 Shih-Tzu n02115335 wild dog n07714990 broccoli n02086646 Blenheim spaniel n02115641 dingo n07715103 cauliflower n02086910 papillon n02115913 dhole n07716358 zucchini n02087046 toy terrier n02116738 African hunting dog n07716906 spaghetti squash n02087394 Rhodesian ridgeback n02114100 wolf n07717410 acorn squash n02088094 Afghan hound n02114367 timber wolf n07717556 butternut squash n02088238 basset n02114548 white wolf n07718472 cucumber n02088364 beagle n02114712 red wolf n07718747 artichoke n02088466 bloodhound n02114855 coyote n07720875 bell pepper n02088632 bluetick n02120997 feline n07730033 cardoon n02089078 black-and-tan coonho n02125311 cougar n07734744 mushroom n02089867 Walker hound n02127052 lynx n07745940 strawberry n02089973 English foxhound n02128385 leopard n07747607 orange n02090379 redbone n02128757 snow leopard n07749582 lemon n02090622 borzoi n02128925 jaguar n07753113 fig n02090721 Irish wolfhound n02129165 lion n07753275 pineapple n02091244 Ibizan hound n02129604 tiger n07753592 banana n02091467 Norwegian elkhound n02130308 cheetah n07754684 jackfruit n02091635 otterhound n02131653 bear n07760859 custard apple n02091831 Saluki n02132136 brown bear n07768694 pomegranate n02092002 Scottish deerhound n02133161 American black bear n03791235 motor vehicle n02092339 Weimaraner n02134084 ice bear n02701002 ambulance n02093256 Staffordshire bullte n02134418 sloth bear n02704792 amphibian n02093428 American Staffordshi n02441326 musteline mammal n02814533 beach wagon n02093647 Bedlington terrier n02441942 weasel n02930766 cab n02093754 Border terrier n02442845 mink n03100240 convertible n02093859 Kerry blue terrier n02443114 polecat n03345487 fire engine n02093991 Irish terrier n02443484 black-footed ferret n03417042 garbage truck n02094114 Norfolk terrier n02444819 otter n03444034 go-kart n02094258 Norwich terrier n02445715 skunk n03594945 jeep n02094433 Yorkshire terrier n02447366 badger n03670208 limousine n02095314 wire-haired fox terr n02370806 ungulate n03770679 minivan n02095570 Lakeland terrier n02389026 sorrel n03777568 Model T n02095889 Sealyham terrier n02391049 zebra n03785016 moped n02096051 Airedale n02395406 hog n03796401 moving van n02096177 cairn n02396427 wild boar n03930630 pickup n02096294 Australian terrier n02397096 warthog n03977966 police van n02096437 Dandie Dinmont n02398521 hippopotamus n04037443 racer n02096585 Boston bull n02403003 ox n04252225 snowplow n02097047 miniature schnauzer n02408429 water buffalo n04285008 sports car n02097130 giant schnauzer n02410509 bison n04461696 tow truck n02097209 standard schnauzer n02412080 ram n04467665 trailer truck n02097298 Scotch terrier n02415577 bighorn n03183080 device n02097474 Tibetan terrier n02417914 ibex n02666196 abacus n02097658 silky terrier n02422106 hartebeest n02672831 accordion n02098105 soft-coated wheaten n02422699 impala n02676566 acoustic guitar n02098286 West Highland white n02423022 gazelle n02708093 analog clock n02098413 Lhasa n02437312 Arabian camel n02749479 assault rifle n02099267 flat-coated retrieve n02437616 llama n02787622 banjo n02099429 curly-coated retriev 
n02469914 primate n02794156 barometer n02099601 golden retriever n02480495 orangutan n02804610 bassoon n02099712 Labrador retriever n02480855 gorilla n02841315 binoculars n02099849 Chesapeake Bay retri n02481823 chimpanzee n02879718 bow n02100236 German short-haired n02483362 gibbon n02910353 buckle n02100583 vizsla n02483708 siamang n02948072 candle n02100735 English setter n02484975 guenon n02950826 cannon n02100877 Irish setter n02486261 patas n02965783 car mirror n02101006 Gordon setter n02486410 baboon n02966193 carousel n02101388 Brittany spaniel n02487347 macaque n02974003 car wheel n02101556 clumber n02488291 langur n02977058 cash machine n02102040 English springer n02488702 colobus n02992211 cello n02102177 Welsh springer spani n02489166 proboscis monkey n03000684 chain saw n02102318 cocker spaniel n02490219 marmoset n03017168 chime n02102480 Sussex spaniel n02492035 capuchin n03075370 combination lock n02102973 Irish water spaniel n02492660 howler monkey n03110669 cornet n02104029 kuvasz n02493509 titi n03126707 crane Continued on next page
Protocol 1 Classes: List of super-classes and leaf nodes of Protocol P1 . Known Negative Unknown Id Name Id Name Id Name n02104365 schipperke n02493793 spider monkey n03180011 desktop computer n02105056 groenendael n02494079 squirrel monkey n03196217 digital clock n02105162 malinois n02497673 Madagascar cat n03197337 digital watch n02105251 briard n02500267 indri n03208938 disk brake n02105412 kelpie n03249569 drum n02105505 komondor n03271574 electric fan n02105641 Old English sheepdog n03272010 electric guitar n02105855 Shetland sheepdog n03372029 flute n02106030 collie n03394916 French horn n02106166 Border collie n03425413 gas pump n02106382 Bouvier des Flandres n03447721 gong n02106550 Rottweiler n03452741 grand piano n02106662 German shepherd n03467068 guillotine n02107142 Doberman n03476684 hair slide n02107312 miniature pinscher n03485407 hand-held computer n02107574 Greater Swiss Mounta n03492542 hard disc n02107683 Bernese mountain dog n03494278 harmonica n02107908 Appenzeller n03495258 harp n02108000 EntleBucher n03496892 harvester n02108089 boxer n03532672 hook n02108422 bull mastiff n03544143 hourglass n02108551 Tibetan mastiff n03590841 jack-o’-lantern n02108915 French bulldog n03627232 knot n02109047 Great Dane n03642806 laptop n02109525 Saint Bernard n03666591 lighter n02109961 Eskimo dog n03691459 loudspeaker n02110063 malamute n03692522 loupe n02110185 Siberian husky n03706229 magnetic compass n02110341 dalmatian n03720891 maraca n02110627 affenpinscher n03721384 marimba n02110806 basenji n03733131 maypole n02110958 pug n03759954 microphone n02111129 Leonberg n03773504 missile n02111277 Newfoundland n03793489 mouse n02111500 Great Pyrenees n03794056 mousetrap n02111889 Samoyed n03803284 muzzle n02112018 Pomeranian n03804744 nail n02112137 chow n03814639 neck brace n02112350 keeshond n03832673 notebook n02112706 Brabancon griffon n03838899 oboe n02113023 Pembroke n03840681 ocarina n02113186 Cardigan n03841143 odometer n02113624 toy poodle n03843555 oil filter n02113712 miniature poodle n03854065 organ n02113799 standard poodle n03868863 oxygen mask n02113978 Mexican hairless n03874293 paddlewheel n03874599 padlock n03884397 panpipe n03891332 parking meter n03929660 pick n03933933 pier n03944341 pinwheel n03992509 potter’s wheel n03995372 power drill n04008634 projectile n04009552 projector n04040759 radiator n04044716 radio telescope n04067472 reel n04074963 remote control n04086273 revolver n04090263 rifle n04118776 rule n04127249 safety pin n04141076 sax n04141975 scale n04152593 screen n04153751 screw n04228054 ski n04238763 slide rule n04243546 slot n04251144 snorkel Continued on next page
Protocol 1 Classes: List of super-classes and leaf nodes of Protocol P1 . Known Negative Unknown Id Name Id Name Id Name n04258138 solar dish n04265275 space heater n04275548 spider web n04286575 spotlight n04311174 steel drum n04317175 stethoscope n04328186 stopwatch n04330267 stove n04332243 strainer n04355338 sundial n04355933 sunglass n04356056 sunglasses n04372370 switch n04376876 syringe n04428191 thresher n04456115 torch n04485082 tripod n04487394 trombone n04505470 typewriter keyboard n04515003 upright n04525305 vending machine n04536866 violin n04548280 wall clock n04579432 whistle n04592741 wing n06359193 web site End of table Protocol 1
Table 4: Protocol 2 Classes: List of super-classes and leaf nodes of Protocol P2 . Known Negative Unknown Id Name Id Name Id Name n02087122 hunting dog n02087122 hunting dog n02085374 toy dog n02087394 Rhodesian ridgeback n02096051 Airedale n02085620 Chihuahua n02088094 Afghan hound n02096177 cairn n02085782 Japanese spaniel n02088238 basset n02096294 Australian terrier n02085936 Maltese dog n02088364 beagle n02096437 Dandie Dinmont n02086079 Pekinese n02088466 bloodhound n02096585 Boston bull n02086240 Shih-Tzu n02088632 bluetick n02097047 miniature schnauzer n02086646 Blenheim spaniel n02089078 black-and-tan coonho n02097130 giant schnauzer n02086910 papillon n02089867 Walker hound n02097209 standard schnauzer n02087046 toy terrier n02089973 English foxhound n02097298 Scotch terrier n02118333 fox n02090379 redbone n02097474 Tibetan terrier n02119022 red fox n02090622 borzoi n02097658 silky terrier n02119789 kit fox n02090721 Irish wolfhound n02098105 soft-coated wheaten n02120079 Arctic fox n02091244 Ibizan hound n02098286 West Highland white n02120505 grey fox n02091467 Norwegian elkhound n02098413 Lhasa n02115335 wild dog n02091635 otterhound n02099267 flat-coated retrieve n02115641 dingo n02091831 Saluki n02099429 curly-coated retriev n02115913 dhole n02092002 Scottish deerhound n02099601 golden retriever n02116738 African hunting dog n02092339 Weimaraner n02099712 Labrador retriever n02114100 wolf n02093256 Staffordshire bullte n02099849 Chesapeake Bay retri n02114367 timber wolf n02093428 American Staffordshi n02100236 German short-haired n02114548 white wolf n02093647 Bedlington terrier n02100583 vizsla n02114712 red wolf n02093754 Border terrier n02100735 English setter n02114855 coyote n02093859 Kerry blue terrier n02100877 Irish setter n02120997 feline n02093991 Irish terrier n02101006 Gordon setter n02125311 cougar n02094114 Norfolk terrier n02101388 Brittany spaniel n02127052 lynx n02094258 Norwich terrier n02101556 clumber n02128385 leopard n02094433 Yorkshire terrier n02102040 English springer n02128757 snow leopard n02095314 wire-haired fox terr n02102177 Welsh springer spani n02128925 jaguar n02095570 Lakeland terrier n02102318 cocker spaniel n02129165 lion n02095889 Sealyham terrier n02102480 Sussex spaniel n02129604 tiger n02102973 Irish water spaniel n02130308 cheetah n02131653 bear n02132136 brown bear n02133161 American black bear n02134084 ice bear n02134418 sloth bear n02441326 musteline mammal n02441942 weasel n02442845 mink n02443114 polecat n02443484 black-footed ferret n02444819 otter n02445715 skunk n02447366 badger n02370806 ungulate n02389026 sorrel n02391049 zebra n02395406 hog n02396427 wild boar n02397096 warthog n02398521 hippopotamus n02403003 ox n02408429 water buffalo n02410509 bison n02412080 ram n02415577 bighorn n02417914 ibex n02422106 hartebeest n02422699 impala n02423022 gazelle n02437312 Arabian camel n02437616 llama End of table Protocol 2