EXTRACTION OF TOURIST ATTENTION POINTS FROM LOW-RATED REVIEWS AND CLASSIFICATION BY VIEWPOINT - IHCI 2021
ISBN: 978-989-8704-32-0 © 2021

EXTRACTION OF TOURIST ATTENTION POINTS FROM LOW-RATED REVIEWS AND CLASSIFICATION BY VIEWPOINT

Junichi Fukumoto and Kazuki Ito
College of Information Science and Engineering, Ritsumeikan University
1-1-1 Noji-higashi, Kusatsu, Shiga 525-8577 Japan

ABSTRACT

Many Internet sites serve tourists, and tourists post a great deal of positive and negative word-of-mouth on them. Negative information can serve as attention points for sightseeing, helping first-time visitors avoid repeating the same mistakes. The purpose of this research is to extract such negative information from word-of-mouth as tourist attention points and to classify them so that they are easy to understand. In the experiments, we successfully extracted attention points from actual tourist reviews and classified them by the target of each point.

KEYWORDS

Tourism Information, Attention Points, Low-Rated Reviews, Classification, Dependency Relation

1. INTRODUCTION

Many Internet sites serve tourists, and a great deal of word-of-mouth is posted on them. When people plan to visit a sightseeing spot for the first time, they consult the word-of-mouth posted on these sites to get useful information and to select destinations. Information posted on the Internet has been used to improve sightseeing satisfaction and to recommend appropriate places to visit and stay (Dinçer, 2017) (Han, 2020) (Ogawa, 2014). The posts contain various positive and negative opinions, such as recommendations of a location, dissatisfying experiences, and stories of failures. The important point is that negative information can be used as “attention points for sightseeing” to prevent the same mistakes. Some approaches also use negative information for product improvement (Kurihara, 2014) (Ohmori, 2012). In tourism information, for example, if you read the review "There is no toilet in the castle and it is a little unsuitable for children because of the stairs", you know to use a toilet before entering the castle.
Now that a huge number of reviews have been posted, many of them include tourist attention points. However, reading all such reviews to find the attention points in negative reviews is difficult and time-consuming. The purpose of this research is to extract such negative information from the huge amount of word-of-mouth as tourist attention points and to present it to the user in an easy-to-understand manner. We focus on low-rated reviews of 1 to 3 stars, which, according to our preliminary survey of reviews, contain many expressions of contributors’ dissatisfaction and complaints. We extract attention points from sentences that include negative expressions and classify the extracted attention points by the viewpoint of the negative evaluation. In the following sections, we present our proposed method: the extraction method for attention points and the classification method for the extracted attention points. In the experiment, we show sample extraction and classification of attention points using tourist reviews of Wakamatsu Castle, and we discuss the extraction results and some problems of our current method.
International Conferences Computer Graphics, Visualization, Computer Vision and Image Processing 2021; Connected Smart Cities 2021; and Big Data Analytics, Data Mining and Computational Intelligence 2021

2. PROPOSED METHOD

2.1 Extraction of Attention Points from Negative Reviews

From the tourist site “Jalan net”, we obtained 3814 reviews of 1 to 3 stars related to the castle by web scraping. To select negative reviews effectively, we used the sentiment analysis module of the Watson NLU tool and classified the reviews into positive, neutral, and negative ones. As a result, we obtained 1260 negative reviews.

To extract attention points for tourists, we used negative expressions and the dependency structure of negative review sentences. First, each negative review is split into sentences at Japanese sentence delimiters such as punctuation marks. Next, the sentences are morphologically analyzed using the Japanese morphological analyzer MeCab with the NEologd dictionary and syntactically analyzed with the Japanese syntax analyzer CaboCha. We extract attention points from the dependency relations of the syntax structure using the following patterns.

Pattern 1: extract the phrases that depend on a negative expression, together with the phrases that depend on those phrases.
Pattern 2: extract the phrases that a negative expression depends on, together with the phrases that depend on those phrases.
Pattern 3: if there is a sentence with a negative expression, the next sentence is extracted.

In patterns 1 and 2, the elements related to the negative expression are extracted using the dependency structure; adverb phrases, however, are not extracted. Pattern 3 covers the case where an unpleasant situation is described and only a negative impression is added after the situation. This type of description is often used in blogs. We prepared the Japanese negative expressions shown in Table 1. English translations are shown in brackets.

Table 1. A list of Japanese negative expressions

残念 (unfortunate), 小さい (small), 大変 (hard), 狭い (narrow), がっかり (disappointed), 暑い (hot), 断念 (give up), キツイ (hard), 諦め (give up), しんどい (hard), 苦労 (hardship), 足りない (not enough), 後悔 (regret), 悪い (bad), 邪魔 (disturbing), 厳しい (severe), 退屈 (boring), 重い (heavy), 嫌悪 (disgust), 微妙 (negative mood), 汚い (dirty), 不便 (inconvenient), 怖い (scary), 興ざめ (disappointed), 辛い (hard), うるさい (noisy), 疲労 (fatigue), 危ない (dangerous), 不満 (dissatisfied), 難しい (difficult), 注意 (attention), 遠い (far), こじんまり (small), 退屈 (boring), 苦労 (hard), 混雑 (crowded)

2.2 Classification of Attention Points

Many kinds of attention points are extracted from negative reviews. We classify attention points by the viewpoint of their description, which makes the points easier to check. It is effective to classify attention points by the words most closely related to their negative expression. To choose this related word, we use the dependency analysis of the attention point description.

For extraction pattern 1 above, a noun or proper noun in the phrase that depends on the negative expression becomes the classification clue. If this dependent phrase contains no noun, the phrase that depends on it is checked for a noun or proper noun, and this step is repeated until one is found; the noun or proper noun found becomes the classification clue. All classification clue words are extracted from pattern 1 sentences, because pattern 2 and 3 attention points have no such related words. Pattern 2 and 3 attention points are instead classified by the clue words from this list that they contain. From the extracted classification clues, we exclude meaningless clues with a stop word list of Japanese one-character words such as “中 (middle)”, “上 (upper)”, and “事 (thing)”.
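The Pattern 1 extraction and the clue selection described in Section 2.2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Chunk` class, the part-of-speech tag names, and all function names are hypothetical, and the parse is built by hand instead of calling MeCab or CaboCha. The example phrases come from the paper's own illustration (“天守閣が”, “行くのが”, “大変です”); the adverb “とても” is added here only to show that adverb phrases are skipped.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One phrase (bunsetsu) of a dependency parse.

    A hand-built stand-in for parser output, not the CaboCha API.
    """
    words: list  # (surface, part_of_speech) pairs
    head: int    # index of the chunk this one depends on, -1 for the root

    @property
    def text(self):
        return "".join(surface for surface, _ in self.words)


NEGATIVE_EXPRESSIONS = ("大変", "残念", "汚い")  # small subset of Table 1
NOUN_TAGS = ("noun", "proper_noun")              # hypothetical tag names


def find_negative(chunks):
    """Index of the first chunk containing a negative expression, or None."""
    for i, chunk in enumerate(chunks):
        if any(expr in chunk.text for expr in NEGATIVE_EXPRESSIONS):
            return i
    return None


def dependents(chunks, target):
    """Indices of the chunks that depend directly on chunk `target`."""
    return [i for i, chunk in enumerate(chunks) if chunk.head == target]


def is_adverb(chunk):
    # Assumption: a chunk counts as an adverb phrase if its first word is an adverb.
    return chunk.words[0][1] == "adverb"


def extract_pattern1(chunks, neg):
    """Pattern 1: the negative expression, the non-adverb chunks depending on
    it, and the non-adverb chunks depending on those."""
    picked = {neg}
    for d in dependents(chunks, neg):
        if is_adverb(chunks[d]):
            continue
        picked.add(d)
        picked.update(g for g in dependents(chunks, d) if not is_adverb(chunks[g]))
    return "".join(chunks[i].text for i in sorted(picked))


def classification_clue(chunks, neg):
    """Walk outward from the negative expression, one dependency level at a
    time, and return the first noun or proper noun found."""
    frontier = dependents(chunks, neg)
    while frontier:
        for i in frontier:
            for surface, pos in chunks[i].words:
                if pos in NOUN_TAGS:
                    return surface
        frontier = [g for i in frontier for g in dependents(chunks, i)]
    return None


# Hand-built parse: 天守閣が -> 行くのが -> 大変です, plus とても -> 大変です.
EXAMPLE = [
    Chunk([("天守閣", "noun"), ("が", "particle")], head=1),
    Chunk([("行く", "verb"), ("の", "particle"), ("が", "particle")], head=3),
    Chunk([("とても", "adverb")], head=3),
    Chunk([("大変", "adjective"), ("です", "auxiliary")], head=-1),
]

neg = find_negative(EXAMPLE)
print(extract_pattern1(EXAMPLE, neg))
print(classification_clue(EXAMPLE, neg))
```

On this toy parse the sketch keeps the three related phrases, drops the adverb, and returns “天守閣” as the clue, because the directly dependent phrase “行くのが” contains no noun. A real pipeline would fill `Chunk` from actual parser output.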
Figure 1 shows an example of the dependency analysis of an extracted attention point; note that Japanese word order differs from that of the English translation. Word-level translations are shown in brackets below the words, and the full translation is at the bottom of the figure. In this example, the phrase “大変です (hard)” is a negative expression. The phrase “行くのが (go to)” modifies the negative expression, but it is not a noun. The phrase “天守閣が (the castle tower)” modifies the phrase “行くのが (go to)” and is a noun, so it becomes the classification clue.

Figure 1. Example of dependency analysis of attention point

3. EXPERIMENTS

We used 259 reviews of Wakamatsu Castle for the experiment. Part of the attention point extraction results is shown in Table 2. English translations are shown in brackets.

Table 2. Example of extraction of attention point

Negative expression: Attention point

悪い (bad): ライトアップしてましたが天気が悪かった (It was lit up, but the weather was bad.)

難しい (difficult): 登りは階段なので足が不自由な方は難しい (The climb is by stairs, so it is difficult for people with disabilities.)

大変 (hard): 子供も段差が大変 (The steps are also hard for children.)

悪い (bad): 全体的にみれば悪い場所ではないので商売っ気が少々強すぎる気がして、私には合わない場所でした (Overall, it's not a bad place, so I felt that the commercialism was a little too strong, and it was a place that didn't suit me.)

汚い (dirty): トイレが汚く (The toilet is dirty.)

残念 (disappointing): こちらで残念だったのは入場券を購入する際に何も言わないと茶室の入場券付のチケットがくる事、その茶室に行くとまた別料金で抹茶和菓子が如何か聞かれる事、城の出口に向かうと売店を通るようになる事が興ざめ (What was disappointing here was that if you say nothing when purchasing the admission ticket, you get a ticket that includes admission to the tea room; that when you go to the tea room, you are asked whether you would like matcha and Japanese sweets for an extra charge; and that on the way to the castle exit you are routed through the shop.)

To extract classification clues, we applied our method to the extracted attention points.
Among the extracted classification clues, 10 single-character clues were excluded by the stop word rule. Sample results of classifying attention points with the clue words “階段 (stairs)”, “トイレ (toilet)”, and “天守閣 (castle tower)” are shown below.
clue word: “階段 (stairs)”

登りは階段なので足が不自由な方は難しい (The climb is by stairs, so it is difficult for people with disabilities.)
階段は小さい子は抱っこしなきゃいけないし、狭いしきつい (The stairs are narrow and tough, and small children must be carried.)
階段しかないので、抱っこして上まであがるのが大変 (Since there are only stairs, it is hard to carry children and climb to the top.)
ご年配の方には階段を昇るは大変 (It is hard for elderly people to climb the stairs.)
階段が多くて当日は筋肉痛も有ったので辛かった (It was painful because there were many stairs and I had muscle pain on the day.)
2 回目の鶴ヶ城!修学旅行生で混んで居るけど学べましたまた桜の時期に行きたいです。年配の人には階段キツイ (My second time at Tsuruga Castle! It was crowded with students on school trips, but I learned a lot. I want to go again in the cherry blossom season. The stairs are hard for older people.)

clue word: “トイレ (toilet)”

トイレが汚くて (The toilet is dirty.)
お城の中を見ていて、子供がトイレに行きたくなり、中になく外まで行かないとないので大変です (It's hard because when my child wanted to go to the toilet while we were looking around inside the castle, there was none inside and we had to go outside.)
お城の中にトイレが無いので幼年には少し不向きそれだけのために階段の上り下り歩いて入口まで戻るのは辛い (Since there is no toilet in the castle, it is a little unsuitable for young children; it is hard to walk up and down the stairs and return to the entrance just for that.)

clue word: “天守閣 (castle tower)”

ただ連休中だったので天守閣の回りは人で溢れて一周回るのが一苦労 (However, since it was a consecutive holiday, the area around the castle tower was full of people and it was a struggle to walk around it.)
時間があれば、若松城周辺の公園等はじめ見所がたくさんあり楽しみがあります、強いて言えば、天守閣の窓が、普通の窓であるのが残念 (If you have time, there are lots of things to see, such as the parks around Aizuwakamatsu Castle, and you can have fun. If I had to point something out, it is a pity that the windows of the castle tower are ordinary windows.)
最上階天守閣は・・・思ったより狭い (The castle tower on the top floor is ... narrower than I expected.)
だけど、お城の中は年表やら何藩がどうだとかのボードばかりで、当時の物の展示が少なく、天守閣からの眺めは良かったですが、それ以外は残念 (However, inside the castle there are only boards with chronological tables and descriptions of the clans, and few exhibits of items from that time; the view from the castle tower was good, but the rest was disappointing.)
天守閣への入場が大変 (Admission to the castle tower is hard.)
天守閣も工事中で外の景色が見れず残念 (It's a pity that the castle tower was also under construction and I couldn't see the outside scenery.)
期待をして行ったが、天守閣の修理中で(中は入れる)外観および天守閣から風景を見ることができず残念 (I went with high expectations, but unfortunately the castle tower was being repaired (you can go inside), and I couldn’t see its exterior or the scenery from the castle tower.)
残念ながら…屋根の瓦のふきかえ工事にあたってしまいお城が見れませんでしたが、逆に工事の方がめずらしいので天守閣などに使う瓦に名前が刻める寄付的なコトもあり…記念にどうでしょう? (Unfortunately ... I couldn't see the castle because the roof tiles were being replaced, but on the other hand the construction is a rare sight, and there is a kind of donation that lets you engrave your name on the tiles used for the castle tower and so on ... how about it as a memento?)
雨天で天守閣からの眺めは厚い雲で覆われて山々がと見えず、残念 (Unfortunately, because of the rain, the view from the castle tower was covered with thick clouds and I couldn’t see the mountains.)
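The grouping shown in these sample results can be approximated with a short sketch. It assumes that clue words have already been extracted and that an attention point belongs to a clue when it contains that word as a substring; the function names and the length-based stop rule are illustrative assumptions, and the paper does not specify how a point containing several clues is handled.

```python
from collections import defaultdict

# One-character stop words named in Section 2.2.
STOP_WORDS = {"中", "上", "事"}


def filter_clues(clues):
    """Drop listed stop words and, as an assumption about the stop word rule,
    any other single-character clue."""
    return [c for c in clues if len(c) > 1 and c not in STOP_WORDS]


def classify(attention_points, clues):
    """Group attention points under every clue word they contain.

    A point that contains several clue words lands in several groups,
    which may differ from the authors' procedure.
    """
    groups = defaultdict(list)
    for point in attention_points:
        for clue in clues:
            if clue in point:
                groups[clue].append(point)
    return dict(groups)


# Attention points quoted from the sample results.
POINTS = [
    "トイレが汚くて",
    "ご年配の方には階段を昇るは大変",
    "天守閣への入場が大変",
]

CLUES = filter_clues(["階段", "中", "トイレ", "天守閣", "上"])

for clue, points in classify(POINTS, CLUES).items():
    print(clue, points)
```

Substring matching is the simplest choice for this sketch; matching on morphological tokens would avoid accidental hits inside longer words.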
4. DISCUSSIONS

As for the extraction of attention points, in negative reviews of 1 to 3 stars, sentences with negative expressions frequently co-occur with attention points. When a sentence contains only a negative expression, extracting the adjacent sentence also works well to capture the attention point. However, when a sentence negates a negative expression, the extracted attention point is actually a positive one. An attention point is extracted only when it appears with a negative expression; other attention points are missed. Moreover, a long sentence that includes a negative expression may contain several attention points and some positive points. In such cases the current method cannot extract all the attention points, and a positive point might be extracted by mistake. It is also necessary to make the extracted negative points more compact.

As for choosing classification clues, we choose a noun or proper noun related to a negative expression using the result of syntax analysis. This strategy works well because such nouns are closely related to the negative expression and are therefore appropriate as classification clues. In addition, classification by clue words made it easier to understand what users were dissatisfied with. To exclude meaningless clue words, we used a stop word list of Japanese single-character words, but in some cases important words were deleted. The rule needs to be improved, for example by using word frequencies in the review documents.

5. CONCLUSION

In this paper, we focused on low-rated reviews and extracted tourist attention points from sentences that include negative expressions using dependency relations. The extracted tourist attention points were classified using clue words related to the negative expressions. The extraction and classification of tourist attention points reached a level sufficient to help tourists understand them.
However, the stop word information and the handling of negated sentences need to be improved. The method should also be applied to more reviews from other domains, such as other sightseeing spots and product evaluations.

ACKNOWLEDGEMENT

The authors would like to express their sincere thanks to the anonymous referees for their useful comments.

REFERENCES

Dinçer, M. D. and Alrawadieh, Z., 2017, Negative Word of Mouth in the Hotel Industry: A Content Analysis of Online Reviews on Luxury Hotels in Jordan, Journal of Hospitality Marketing & Management, Vol. 26, No. 8, pp. 785-804.
Han, K. and Kitayama, D., 2020, An Association Method of Tourist Spots Using User Reviews for Advancing Explainability, IPSJ Transactions on Database, Vol. 13, No. 1, pp. 1-7.
Kurihara, K. and Shimada, K., 2014, Trouble Information Extraction from Twitter based on Bootstrap Method, Proc. of the 21th annual meeting of the ANLP, Japan, pp. 341-344. (in Japanese)
Ohmori, N. and Mori, T., 2012, Automatic Extraction of Words Representing Industrial Products and Their Parts: Classification Methods According to “Word Tangibility”, The IEICE Transactions on Information and Systems, 95(3), pp. 697-706. (in Japanese)
Ogawa, K., Sugimoto, Y., et al., 2014, Basic design of a sightseeing recommendation system using Characteristic Words, IPSJ SIG on DPS, 2014-DPS-159 (14), pp. 1-6. (in Japanese)