Fashion Coordinates Recommender System Using Photographs from Fashion Magazines
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Fashion Coordinates Recommender System Using Photographs from Fashion Magazines Tomoharu Iwata Shinji Watanabe Hiroshi Sawada NTT Communication Science Laboratories 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan {iwata.tomoharu,watanabe.shinji,sawada.hiroshi}@lab.ntt.co.jp Abstract from fashion magazines. We use a topic model to learn co- ordinate information that includes the intrinsic relationship Fashion magazines contain a number of pho- between fashion items. The proposed topic model is a proba- tographs of fashion models, and their clothing co- bilistic generative model of images for each fashion item re- ordinates serve as useful references. In this pa- gion. The learned coordinate information is used to recom- per, we propose a recommender system for cloth- mend coordinates. Intuitively, the proposed method finds ref- ing coordinates using full-body photographs from erence photographs that are similar to the query photograph, fashion magazines. The task is that, given a pho- and recommends fashion items that are similar to those in the tograph of a fashion item (e.g. tops) as a query, found reference photograph. By using photographs that are to recommend a photograph of other fashion items taken from the user’s favorite fashion magazines to train the (e.g. bottoms) that is appropriate to the query. With topic model, the proposed method can recommend personal- the proposed method, we use a probabilistic topic ized coordinates that coincide with the user’s preference. model for learning information about coordinates The proposed method is novel in the sense that it learns from visual features in each fashion item region. information about coordinates automatically from a collec- We demonstrate the effectiveness of the proposed tion of reference photographs. Some fashion recommen- method using real photographs from a fashion mag- dation methods have been proposed [Nagao et al., 2008; azine and two fashion style sharing services with Shen et al., 2007; Yonezawa and Nakatani, 2009]. For exam- the task of making top (bottom) recommendations ple, [Shen et al., 2007] proposed a recommender system for given bottom (top) photographs. clothing to suit a given situation using item attributes such as brands, types (e.g. jeans) and styles (e.g. formal, trendy and sporty). [Nagao et al., 2008] recommends clothing relevant 1 Introduction to the weather and the situation using annotated data. These Many people are interested in their fashion. They regularly methods require system developers and users to attach meta- check trends in fashion magazines, and refer to them when data to clothes, which is a tiresome and time-consuming task. buying clothes and choosing coordinates. Fashion magazines On the other hand, the proposed method does not require the contain a number of photographs of fashion models, and their attachment of meta-data; it only requires users to take pho- clothing coordinates serve as useful references. tographs of their own clothes and system developers to col- In this paper, we propose a recommender system for cloth- lect reference photographs. We can easily collect reference ing coordinates using full-body photographs from fashion photographs from fashion magazines, online clothes stores, magazines. The task is that, given a photograph of a fash- or fashion photograph sharing services, such as Chicisimo1 ion item (e.g. tops) as a query, to recommend a photograph and Weardrobe2. of other fashion items (e.g. bottoms) that is appropriate to The remainder of this paper is organized as follows. In Sec- the query. The proposed method can be helpful when select- tion 2, we present a method for detecting the top and bottom ing clothes to wear from one’s wardrobe. Sometimes, it is regions in full-body photographs from a wide variety of pho- difficult for a person to make a quick decision about stylish tographs. In Section 3, we propose a topic model for learning coordinates. The proposed method can also be used when information about coordinates from the reference full-body purchasing clothes that match one’s own clothes, and when photographs. In Section 4, we describe a coordinates recom- purchasing matching clothes for the top and bottom half of mendation method that uses the topic model. In Section 5, we the body in an online store. If the store employs experts to demonstrate the effectiveness of the proposed method by us- provide advice, we can ask them about coordinates. How- ing real photographs from a fashion magazine and two style ever, online stores have no such staff. sharing services with the task of top (bottom) recommenda- With the proposed method, coordinate information is 1 learned from visual features in each fashion item region (e.g. http://chicisimo.com 2 top or bottom half of the body) using full-body photographs http://www.weardrobe.com 2262
Table 1: Notation Symbol Description D set of reference full-body photographs R set of regions, e.g. R = {top, bottom} r region, r ∈ R V number of unique visual features (r) Nd number of features in region r of photograph d (r) xdn nth feature in region r of photograph d, (r) xdn ∈ {1, · · · , V } Figure 1: Example of top and bottom region detection using K number of latent topics a face detection algorithm. (r) zdn latent topic of the nth feature in region r (r) of photograph d, zdn ∈ {1, · · · , K} θd topic proportions for photograph d, tions given bottom (top) photographs. Finally, we present θd = {θdk }K k=1 , θdk ≥ 0, k θdk = 1 concluding remarks and a discussion of future work in Sec- (r) tion 6. φk multinomial distribution over features for topic k in region r, (r) (r) (r) (r) φk = {φkv }Vv=1 , φkv ≥ 0, v φkv = 1 2 Top and Bottom Region Detection The proposed method described in Sections 3 and 4 can be 3 Topic Model for Coordinates used to recommend arbitrary fashion items, such as tops, bot- Let D be a set of reference photographs. Each reference pho- toms, hats, bags, shoes and accessories. However, this pa- tograph consists of a pair of visual features, one in the top per focuses on top (bottom) recommendations given a bottom (t) (b) (r) and one in the bottom region (xd , xd ). Here, xd repre- (top) photograph, since it is the most basic and frequently sents visual features in region r of photograph d. The top and occurring situation. In such cases, the proposed method re- bottom regions can be detected using the method described in quires a collection of pairs of top and bottom images to learn Section 2. We can use arbitrary visual features, such as color, coordinates. This section describes a simple method of identi- texture and local descriptors such as the Scale Invariant Fea- fying full-body photographs from a collection of photographs (r) in fashion magazines, and detecting their top and bottom re- ture Transform (SIFT) [Lowe, 2004]. We assume that xd (r) gions. (r) (r) N is represented by bag-of-features; i.e. xd = {xdn }n=1 d , In addition to full-body photographs of fashion models, (r) where xdn is the nth visual feature in region r of photograph fashion magazines include photographs of clothes, bags, ac- d. Our notation is summarized in Table 1. cessories, wording and logos. Therefore, we need to select We use a topic model for learning information about co- only full-body photographs from all the given photographs. ordinates since the topic model can extract the intrinsic rela- We utilize a face detection algorithm [Rowley et al., 1998; tionship between fashion items from full-body photographs. Viola and Jones, 2004] for this task. First, we detect the lo- A topic model is a probabilistic generative model for dis- cation of faces in each photograph. Next, we select full-body crete data such as text documents. Topics in fashion pho- photographs by using the following two rules: they must con- tographs, for example, represent fashion categories, such tain a face region, and they must have sufficient space for a as casual, formal, elegant, classic and conservative. Topic full body; the space can be calculated from the size and po- models are successfully used for a wide variety of appli- sition of the face region. When more than one face region is cations including information retrieval [Blei et al., 2003; detected, we use one that is the closest to the vertical center Hofmann, 1999], collaborative filtering [Hofmann, 2003; line of the photograph. Iwata et al., 2009a], and modeling images [Blei and Jor- We determine the top and bottom regions by assuming that dan, 2003; Cao and Fei-Fei, 2007; Iwata et al., 2009b]. The the top region is double the width and 2.5 times the height proposed model is an extension of latent Dirichlet allocation of the face region, and the bottom region is double the width (LDA) [Blei et al., 2003], which is a representative topic and 3.5 times the height of the face region. Figure 1 shows an model. example of top and bottom region detection. The proposed topic model can learn cooccurrence informa- We can detect the top and bottom regions directly by using tion about visual features in top and bottom regions. For ex- pose estimation instead of face detection algorithms. Some ample, a white top is likely to be coordinated with a dark blue algorithms for detecting human body parts have been pro- bottom when color information is used for the visual features. posed [Ramanan, 2007; Andriluka et al., 2009]. However, we Using the learned cooccurrence information, we recommend utilize a face detection algorithm for simplicity, because face coordinates as described in the next section. The proposed detection performance is superior to that for other regions, model assumes that photograph d has its own topic propor- and most of the poses in fashion magazine photographs show tions θd , which is a low dimensional intrinsic representation the front view of a standing model. of the photograph. Each topic has its own visual feature prob- 2263
distributions. The first factor on the right hand side of (1) is (t) (t) (t) z x φ (t) β calculated as follows: α θ N(t) |D| Γ(Kα) k Γ(Ndk + α) z (b) x(b) φ(b) β(b) P (Z|α) = , (2) (b) Γ(α)K Γ(Nd + Kα) N D K d where Γ(·) is the gamma function, |D| is the size of set D, Figure 2: Graphical model representation of the proposed Ndk is the number of features assigned to topic k in the pho- topic model for recommending top and bottom coordinates. tograph d and Nd = k Ndk . Similarly, the second factor is given as follows: (r) Γ(V β (r) ) K Γ(N (r) + β (r) ) ability depending on region φk , which represents cooccur- P (X|Z, β) = w kv , (r) Γ(β (r) )V (r) (r) ) rence information. The visual feature xdn is generated ac- r k Γ(N k +Vβ (r) (3) cording to the region dependent probabilities φ (r) after topic (r) zdn where Nkv is the number of times feature v has been as- (r) zdn is chosen from topic proportions θd . (r) (r) signed to topic k in region r, and Nk = v Nkv . In summary, the proposed model assumes the following The inference of the latent topics Z given reference pho- generative process for a set of reference photographs X = tographs can be efficiently computed using collapsed Gibbs (r) {xd }r,d , sampling [Griffiths and Steyvers, 2004]. Given the current state of all but one variable zj , where j = (d, r, n), the as- 1. For each topic k = 1, · · · , K: signment of a latent topic to the nth feature in region r of (a) For each region r ∈ R: photograph d is sampled from the following probability: i. Draw the visual feature probability for region r (r) (r) Nkxj \j + β (r) φk ∼ Dirichlet(β (r) ) P (zj = k|D, Z\j ) ∝ (Ndk\j + α) · , (4) (r) 2. For each reference photograph d ∈ D: Nk\j + V β (r) (a) Draw topic proportions θd ∼ Dirichlet(α) where \j represents the count when excluding the jth feature. (b) For each region r ∈ R: The parameters α and β (r) can be estimated by maximiz- (r) ing the joint distribution (1) by using the fixed-point iteration i. For each feature in region r, n = 1, · · · , Nd : method described in [Minka, 2000] as follows: (r) A. Draw topic zdn ∼ Multinomial(θd ) Ψ(Nkd + α) − |D|KΨ(α) (r) (r) α ← α d k , (5) B. Draw feature xdn ∼ Multinomial(φ (r) ) K ( d Ψ(Nd + Kα) − |D|Ψ(Kα)) zdn (r) (r) where α and β (r) are Dirichlet distribution parameters. Fig- v Ψ(Nkv + β ) − KV Ψ(β (r) ) β (r) ← β (r) k , ure 2 shows a graphical model representation of the proposed (r) (r) ) − KΨ(V β (r) ) V k Ψ(N k + V β topic model for top and bottom regions, where shaded and un- shaded nodes indicate observed and latent variables, respec- (6) tively. The structure of the model is the same as that of the where Ψ(·) is the digamma function defined by Ψ(x) = ∂ log Γ(x) polylingual topic model [Mimno et al., 2009], which is used ∂x . By iterating Gibbs sampling with (4) and maxi- for extracting topics from documents in many languages. mum likelihood estimation with (5) and (6), we can infer la- The proposed topic model can learn coordinate informa- tent topics while optimizing the parameters. (r) tion about other regions in addition to top and bottom re- We can estimate φkv by using the following equation: gions, such as hats, bags, shoes and accessories. In this (r) case, a reference photograph consists of a set of visual fea- (r) Nkv + β (r) (r) φ̂kv = , (7) tures with multiple regions {xd }r∈R , where e.g. R = (r) Nk + V β (r) {top, bottom, hat, bag}. The joint distribution of visual features and latent topics is which is used for recommending coordinates as described in described as follows: the next section. P (X, Z, |α, β) = P (Z|α)P (X|Z, β), (1) 4 Coordinate Recommendation based on (r) (r) Topic Model where Z = {zd }r,d is a set of latent topics, zd = (r) (r) Nd (r) In this section, we describe a method for recommending co- {zdn }n=1 , and β = {β }r is a set of Dirichlet parame- ordinates using the topic model learned with reference pho- ters. We can integrate out multinomial distribution parame- tographs. Intuitively speaking, we recommend a bottom (top) (r) ters, {θd}d and {φk }k,r , because we use Dirichlet distri- that has the closest topic proportions to those of the given top butions for their priors, which are conjugate to multinomial (bottom). 2264
Let C (t) and C (b) respectively be a set of photographs full-body visual features topic visual query top photos for each region proportions features photo of top and bottom garments that the user owns. The pro- posed recommendation method finds a bottom (top) photo- graph ĉ ∈ C (b) (ĉ ∈ C (t) ) that matches the given top (bot- top bottom tom) photograph c∗ ∈ C (t) (c∗ ∈ C (b) ). Each photograph topic own bottom photos (r) is associated with visual features xc , c ∈ C (r) . Note that model (r) photographs in C need not be paired; they can be pho- tographs of a shirt, a blouse or a skirt. We can create those data from full-body photographs of the user by using the re- gion detection method described in Section 2. Here, we aim to recommend coordinates from among the clothes that the recommended user owns. With recommendation in an online store, C (r) bottom find a bottom that has the can be the store’s photographs. recommend closest topic proportions to system the query top In the following, we explain the case of a bottom recom- mendation given a top. We can recommend a top given a bottom in the same way. First, we estimate topic proportions Figure 3: Framework of proposed coordinate recommenda- θc∗ for a given top photograph and the user’s own bottom tion method. photographs C (b) using the topic model learned with refer- (r) ence photographs Φ̂ = {φ̂k }k,r . The assignment of a latent topic zj , where j = (c, r, n), is sampled as follows: women’s fashion magazine. There were 14,813 photographs including photographs of clothes, bags, shoes, cosmetics (r) P (zj = k|x(r) c , zc\j , Φ̂) ∝ (Nkc\j + α) · φ̂kxj , (8) and accessories as well as full-body fashion model pictures. Chicisimo and Weardrobe are fashion community web ser- (r) N (r) vices for women in which users can share style photographs. where zc = {zcn}n=1 c represents a set of latent topics in The communities have a rule stating that shared photographs region r of photograph c. Then, we recommend the bottom ĉ should show the full-body of the user. We used 1,059 and that has the closest topic proportions to those of the given top 1,410 photographs from Chicisimo and Weardrobe data, re- photograph as follows: spectively. ĉ = arg max S(θc∗ , θc ), (9) c∈C (b) 5.2 Region Detection where S(·, ·) represents a similarity function. For the similar- In the Magazine data, 2,062 photographs out of 14,813 were ity, we use the following negative KL divergence: estimated as full-body by the proposed method described θc ∗ k in Section 2. We checked the results for top and bot- S(θc∗ , θc ) = − θc∗ k log , (10) tom region extraction from the 2,062 photographs by hu- θck man judgement. The regions were adequately extracted k in 1,502 photographs (73%). We also checked the Chi- since topic proportions θc are a probability distribution, and cisimo and Weardrobe data, and the regions were adequately the KL divergence is one of the most commonly used dis- extracted from 75%(797/1,059) and 71%(1,004/1,410) of tances for probability distributions. Although the recommen- the photographs, respectively. For the Magazine data, we dation of one photograph is assumed above, we can also rec- checked randomly sampled 221 photographs that were de- ommend n photographs with the highest S(θc∗ , θc ) values. termined as not being full-body using the proposed method. Figure 3 shows the framework of the proposed coordinate 81%(180/221) of the photographs were truly not full-body. recommendation method. In the framework, full-body pho- Although the proposed detection method is simple, it can de- tographs for reference are converted to visual features for tect top and bottom regions with the high true positive rate, each region, and the proposed topic model is learned using and the high true negative rate. In the following recommenda- the visual features. The query top and the user’s own bottom tion experiments, we used 1,502, 797 and 1,004 photographs photographs are also converted to visual features, and topic respectively from the Magazine, Chicisimo and Weardrobe proportions of the query and the user’s own photographs are data that were checked by human judgement. estimated using the learned topic model. Then, a bottom is recommended that has the closest topic proportions to those 5.3 Baseline Recommendation Method based on of the query top. Visual Feature Similarities 5 Experiments Before describing our experimental coordinate recommenda- tion results, we describe a baseline method for recommenda- 5.1 Data Set tion based on visual feature similarities and compare it with We evaluated our proposed coordinate recommender system the proposed topic model based method. based on the topic model using three real photograph col- Assume that a photograph of a top c∗ is given as a query. lections: Magazine, Chicisimo and Weardrobe. The Maga- We would like to find a photograph of a bottom that matches zine data consists of photographs taken from 32 copies of a a given top from a set of the user’s own bottom photographs 2265
C (b) . First, the baseline method finds the top photograph dˆ ure 5 shows the average similarities and the standard devia- that is closest to the given top c∗ from a collection of refer- tions with different numbers of topics. The proposed method ence photographs D as follows: achieved higher accuracies and similarities for all three data sets compared with the baseline and random methods. This dˆ = arg max F (x(t) (t) c∗ , xd ), (11) result indicates that the proposed method can learn infor- d∈D mation about coordinates adequately using the topic model. where F (·, ·) represents a similarity function, For the similar- Since the proposed method uses the similarities of extracted ity function, we used the cosine similarity in the experiment, low dimensional intrinsic representations, it is more robust which is a commonly used similarity function in the bag-of- than the baseline method, which naively uses similarities be- features representation. Then, the baseline method recom- tween high dimensional visual features. mends bottom photograph ĉ that is the most similar to the When we analyze the extracted topics, photographs with bottom in the found reference photograph dˆ from the user’s similar coordinates form clusters. For example, a topic is own bottom photographs C (b) : frequently assigned to photographs showing white one-piece suits, and another topic is frequently assigned to photographs (b) ĉ = arg max F (xd̂ , x(b) c ). (12) showing white t-shirts and blue jeans, which correspond to c∈C (b) formal and casual topics, respectively. In the same way, the baseline method can recommend a The computational time of a query response was 0.04 sec- matching top given a bottom. ond, which includes estimating topic proportions of the query and sorting photographs by their similarities. 5.4 Recommendation We evaluated the proposed method for recommending coor- 6 Conclusion dinates based on a topic model with a baseline method based We have proposed a topic model for recommending fashion on the similarities between visual features. coordinates using a collection of photographs from fashion We used a color histogram for the visual features as de- magazines. We have confirmed experimentally that the pro- scribed in [Swain and Ballard, 1991]. Specifically, the three posed method can learn information about coordinates appro- color axes used for the histogram are defined as follows: priately, and can be used for recommending fashion coordi- rg = r − g, by = 2b − r − g, wb = r + g + b, where r, nates. g and b represent red, green and blue signals, respectively. Although our results have been encouraging to date, our The wb axis, which indicates intensity, was divided into eight approach can be further improved in a number of ways. First, sections, and the rg and by axes were divided into 16 sec- we would like to improve the region detection method. We tions, and therefore the dimension of the visual features was utilized a face detection algorithm for simplicity to detect V = 2, 048. top and bottom regions. However, top and bottom regions We generated 100 sets of training and test data, in which 10 may not appear immediately below the face region when the % of the samples were randomly selected as test data. With fashion model leans, sits or lies down; the border of top and the proposed method, the training data set is assumed to be a bottom regions differs depending on the clothes, the fash- collection of reference photographs D, and the parameters of ion model’s style, and the angle at which the photographs the topic model are inferred using the training data. The test were taken. By detecting regions directly using pose esti- data are assumed to be sets of photographs C (t) and C (b) mation algorithms, the method becomes applicable to pho- owned by a user. In the evaluation, each of the top (bot- tographs with a wide variety of poses. Second, the proposed tom) photographs in the test data is given as the input, and topic model can be used to recommend coordinates for fash- a matching bottom (top) photograph is recommended, while ion items other than tops and bottoms, such as shoes, bags, the bottom (top) photographs are held out. We assumed that hats and accessories. Third, we would like to investigate ap- the correct answer for a recommendation given a top (bottom) propriate visual features for recommending coordinates. In photograph was the bottom (top) photograph that was paired our experiments, we used color information for the features. with the given photograph. However, the texture and the shape of items may also consti- For the evaluation, we used the following two measure- tute important information. Finally, it is valuable to recom- ments: n-best accuracy and similarity. The n-best accuracy mend coordinates that match the user’s preference and situ- represents the rate of recommending the correct bottom (top) ation, such as business, dating or formal. This personalized photograph with n recommendations given a test top (bottom) recommendation can be achieved by using the history of the photograph. The similarity evaluation measurement is the av- clothes the user has worn and the context. erage similarity between the recommended clothing and the held-out paired clothing. References Figure 4 shows the average accuracies and the standard de- [Andriluka et al., 2009] Mykhaylo Andriluka, Stefan Roth, and viations over 100 experiments on the three data sets. The Pro- Bernt Schiele. Pictorial structures revisited: People detection and posed plot represents results for the proposed method where articulated pose estimation. In CVPR, pages 1014–1021, 2009. the number of topics K = 20, the Baseline plot represents [Blei and Jordan, 2003] David M. Blei and Michael I. Jordan. Mod- results for the baseline method based on the similarities be- eling annotated data. In SIGIR ’03: Proceedings of the 26th tween visual features, and the Random plot represents re- Annual International ACM SIGIR Conference on Research and sults for a method that recommends clothes randomly. Fig- Development in Information Retrieval, pages 127–134, 2003. 2266
0.8 0.9 0.9 Proposed Proposed Proposed 0.7 Baseline 0.8 Baseline 0.8 Baseline Random Random Random 0.6 0.7 0.7 0.6 0.6 0.5 accuracy accuracy accuracy 0.5 0.5 0.4 0.4 0.4 0.3 0.3 0.3 0.2 0.2 0.2 0.1 0.1 0.1 0 0 0 5 10 15 20 25 30 5 10 15 20 25 30 5 10 15 20 25 30 n n n (a) Magazine (b) Chicisimo (c) Weardrobe Figure 4: n-best accuracies with different numbers of recommendations. 0.55 0.5 0.55 0.5 0.5 0.45 0.45 0.45 0.4 similarity similarity similarity 0.4 0.4 0.35 0.35 0.35 0.3 Proposed 0.3 Proposed Proposed 0.3 0.25 Baseline Baseline Baseline Random Random Random 0.25 0.25 0.2 10 15 20 25 30 35 40 45 50 10 15 20 25 30 35 40 45 50 10 15 20 25 30 35 40 45 50 #topics #topics #topics (a) Magazine (b) Chicisimo (c) Weardrobe Figure 5: Similarities with different numbers of topics. [Blei et al., 2003] David M. Blei, Andrew Y. Ng, and Michael I. [Mimno et al., 2009] David Mimno, Hanna M. Wallach, Jason Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003. Naradowsky, David A. Smith, and Andrew McCallum. Polylin- [Cao and Fei-Fei, 2007] L. Cao and L. Fei-Fei. Spatially coherent gual topic models. In EMNLP ’09, pages 880–889, 2009. latent topic model for concurrent object segmentation and clas- [Minka, 2000] Thomas Minka. Estimating a Dirichlet distribution. sification. In Proceedings of IEEE Intern. Conf. in Computer Technical report, M.I.T., 2000. Vision (ICCV), 2007. [Nagao et al., 2008] S. Nagao, S. Takahashi, and J. Tanaka. Mirror [Griffiths and Steyvers, 2004] Thomas L. Griffiths and Mark appliance: Recommendation of clothes coordination in daily life. Steyvers. Finding scientific topics. Proceedings of the National In HFT ’08, pages 367–374, 2008. Academy of Sciences, 101 Suppl 1:5228–5235, 2004. [Ramanan, 2007] Deva Ramanan. Learning to parse images of ar- [Hofmann, 1999] Thomas Hofmann. Probabilistic latent semantic ticulated bodies. In Advances in Neural Information Processing analysis. In UAI ’99: Proceedings of 15th Conference on Uncer- Systems, volume 19, 2007. tainty in Artificial Intelligence, pages 289–296, 1999. [Rowley et al., 1998] Henry A. Rowley, Shumeet Baluja, and [Hofmann, 2003] Thomas Hofmann. Collaborative filtering via Takeo Kanade. Neural network-based face detection. IEEE Gaussian probabilistic latent semantic analysis. In Proceedings Transactions on Pattern Analysis and Machine Intelligence, of the 26th Annual International ACM SIGIR Conference on Re- 20(1):23–38, 1998. search and Development in Information Retrieval, pages 259– [Shen et al., 2007] Edward Shen, Henry Lieberman, and Francis 266, 2003. Lam. What am I gonna wear?: scenario-oriented recommenda- [Iwata et al., 2009a] Tomoharu Iwata, Shinji Watanabe, Takeshi tion. In IUI ’07, pages 365–368, 2007. Yamada, and Naonori Ueda. Topic tracking model for analyzing [Swain and Ballard, 1991] Michael J. Swain and Dana H. Ballard. consumer purchase behavior. In IJCAI ’09: Proceedings of 21st International Joint Conference on Artificial Intelligence, pages Color indexing. Int. J. Comput. Vision, 7(1):11–32, 1991. 1427–1432, 2009. [Viola and Jones, 2004] Paul Viola and Michael J. Jones. Robust [Iwata et al., 2009b] Tomoharu Iwata, Takeshi Yamada, and real-time face detection. Int. J. Comput. Vision, 57:137–154, May Naonori Ueda. Modeling social annotation data with content rel- 2004. evance using a topic model. In NIPS ’09, pages 835–843, 2009. [Yonezawa and Nakatani, 2009] Yuri Yonezawa and Yoshio [Lowe, 2004] David G. Lowe. Distinctive image features from Nakatani. Fashion support from clothes with characteristics. In scale-invariant keypoints. International Journal of Computer Vi- HCI International ’09, pages 323–330, 2009. sion, 60(2):91–110, 2004. 2267
You can also read