Datasets for Aspect-Based Sentiment Analysis in Bangla and Its Baseline Evaluation - MDPI
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
data Data Descriptor Datasets for Aspect-Based Sentiment Analysis in Bangla and Its Baseline Evaluation Md. Atikur Rahman * and Emon Kumar Dey * Institute of Information Technology, University of Dhaka, Dhaka 1000, Bangladesh * Correspondence: bsse0521@iit.du.ac.bd (M.A.R.); emonkd@iit.du.ac.bd (E.K.D.) Received: 20 March 2018; Accepted: 2 May 2018; Published: 4 May 2018 Abstract: With the extensive growth of user interactions through prominent advances of the Web, sentiment analysis has obtained more focus from an academic and a commercial point of view. Recently, sentiment analysis in the Bangla language is progressively being considered as an important task, for which previous approaches have attempted to detect the overall polarity of a Bangla document. To the best of our knowledge, there is no research on the aspect-based sentiment analysis (ABSA) of Bangla text. This can be described as being due to the lack of available datasets for ABSA. In this paper, we provide two publicly available datasets to perform the ABSA task in Bangla. One of the datasets consists of human-annotated user comments on cricket, and the other dataset consists of customer reviews of restaurants. We also describe a baseline approach for the subtask of aspect category extraction to evaluate our datasets. Dataset: https://github.com/AtikRahman/Bangla_ABSA_Datasets Dataset License: CC0 Keywords: ABSA dataset; Bangla ABSA; aspect extraction from Bangla 1. Summary People trust human opinion more so than traditional advertising. For example, consumers are used to seeking advice and recommendation from others before making decisions regarding important purchases. Word of mouth (WOM) has always been salient for consumers when making a decision. Such referrals have a strong impact on both customer decision-making and new customer acquisition for the purchasing of a company’s product or service [1]. On the other hand, organizations are eager to mine all the activities and interactions of people to understand what their weaknesses and strengths are. This understanding would help them to develop their organizational strategy in this competitive world. Sentiment analysis (or opinion mining) is a process to determine the viewpoint of a person on a certain topic. It classifies the polarity of a document (i.e., review, tweet, blog, or news), that is, whether the communicated opinion is positive, negative, or neutral. There are three levels at which sentiment is analyzed [2]: the document level, sentence level, and aspect level. The document level considers that a document has an opinion on an entity, and the task is to classify whether an entire document expresses a positive or negative sentiment. The task at the sentence level regards sentences and determining whether each sentence expresses a positive, negative, or neutral opinion. Neither the document level nor the sentence level analysis discover exactly what people liked and did not like. The aspect level (or aspect-based sentiment analysis—ABSA) performs a finer-grained analysis that identifies the aspects of a given document or sentence and the sentiment expressed towards Data 2018, 3, 15; doi:10.3390/data3020015 www.mdpi.com/journal/data
Data 2018, 3, 15 2 of 10 each aspect. This level of analysis is the most detailed version that is capable of discovering complex opinions from reviews. There are two major tasks when performing ABSA. The first is to extract the specific areas or aspects mentioned in the opinioned review. The second is to identify the polarity (either positive, negative, or neutral) for every aspect. For example, the following review of a restaurant reveals two aspects: service and food. Both aspects have a positive polarity. “The service was excellent and the food was delicious.” As one can see, the name of the aspect categories are explicitly mentioned in this review. A review might also contain implicit categories; for example, “The staff makes you feel at home and the chicken is great.” Here, the same aspects, “service” and “food”, are contained without being directly mentioned. Semantic Evaluation (SemEval), a reputed workshop in the NLP domain, introduced a complete dataset [3] in English for the ABSA task. Later this was expanded to the ABSA task by adding multi-lingual datasets in which eight languages over seven domains were incorporated. To perform ABSA, datasets of several languages, such as Arabic [4], Czech [5], and French [6], were created. There is no dataset for Bangla in the field of ABSA. Consequently, no work is being done to extract aspects and to identify corresponding polarities for Bangla reviews. We are currently working on a project to extract the aspects from a Bangla review or comments for a particular product of a company, as online shopping is very popular nowadays in Bangladesh and is growing rapidly. People like to buy products online after reading the comments of others. In this paper, we have created two new datasets that serve as a benchmark for the ABSA domain in Bangla texts. We present two datasets named “Cricket” and “Restaurant”. The first dataset contains 2900 comments on cricket over 5 aspect categories, and the second dataset contains 2600 restaurant reviews. Because there is no work in Bangla for the ABSA task, we have introduced ABSA by extracting aspect categories from Bangla texts in order to evaluate our datasets. We performed the task with different training approaches and found a satisfactory outcome compared to evaluations of other languages. There are some related works from which we founded the idea of this topic. The restaurant review dataset, provided by Ganu et al. [7], was used to improve rating predictions. Their annotations included six aspect categories and overall sentence polarities. They had not prepared a complete ABSA dataset, as the aspect category was present but the corresponding polarity of that aspect was absent. The SemEval 2014 evaluation campaign [3] extended their dataset by adding three more fields with the aspect category. They published their dataset with four fields being contained for each review, that is, with the aspect term occurring in the sentences, the aspect term’s polarity, the aspect category, and the aspect category’s polarity. They also provided a laptop-review dataset and manually annotated with similar entities as for the restaurant dataset. These are the benchmark datasets that [8–11] researches have used for performing the ABSA task. The task was repeated in SemEval 2015 [12], for which aspect categories were the combination of the entity type and an attribute type. Multilingual datasets were released in the SemEval 2016 workshop [13] on the seven domains (restaurant, laptop, mobile phone, digital camera, hotel, and museum) and in eight languages (English, Arabic, French, Chinese, Turkish, Spanish, Dutch, and Russian). A book-review dataset in the Arabic language was provided by [4]. They annotated book reviews into 14 categories and 4 types of polarities, including “Conflict”. In [5], the author created an IT product-review dataset for the ABSA task, in which a total 2200 reviews were contained. The contribution of this paper is as follows: • We have collected and presented two Bangla datasets for ABSA and have made them publicly available. • We performed statistical linguistic analysis on the datasets.
Data 2018, 3, 15 3 of 10 • We implemented state-of-the-art machine learning approaches for the collected datasets and Data 2018, 3, x 3 of 11 found satisfactory accuracies. 2.2.Data DataDescription Description “DigitalBangladesh” “Digital Bangladesh”is is thethe integral integral partpart ofBangladesh of the the Bangladesh government’s government’s VisionVision 2021. 2021. The The Internet Internet is growing very fast over the country, and people are using different online is growing very fast over the country, and people are using different online platforms in every aspect platforms in ofevery theiraspect of their lives. This lives. This encouraged usencouraged to constructus to construct datasets datasets to analyze to analyze the Bengali the Bengali people’s people’s opinions and to opinions and to extract their sentiments in different aspects. In this paper, we have constructed extract their sentiments in different aspects. In this paper, we have constructed two different datasets, two different datasets, namely, the Cricket dataset and Restaurant dataset, namely, the Cricket dataset and Restaurant dataset, to evaluate the people’s opinions. to evaluate the people’s opinions. 2.1. Cricket Dataset 2.1. Cricket Dataset The Cricket dataset consists of 2900 different comments from different online sources with The Cricket five different dataset aspect consists of 2900 categories. Mostdifferent of the comments commentsfrom are different collectedonline from sources Facebookwithpages five different aspect categories. Most of the comments are collected (https://www.facebook.com/BBCBengaliService/; https://www.facebook.com/DailyProthomAlo/). from Facebook pages (https://www.facebook.com/BBCBengaliService/; Some comments are collected from two popular Bengali https://www.facebook.com/DailyProthomAlo/). Websites, BBC Bangla (http://www.bbc.com/ Some comments are collected from two bengali), and the Daily Prothom Alo (http://www.prothomalo.com). popular Bengali Websites, This dataset BBC Bangla was collected by the (http://www.bbc.com/bengali), and the Daily Prothom Alo (http://www.prothomalo.com). authors of this paper. The comments are of different lengths and each review contains approximately This dataset was collected by the authors of this paper. The comments are of different 3–100 Bangla words. The reasons behind choosing these Websites for collecting data are given below:lengths and each review contains approximately 3–100 Bangla words. The reasons behind choosing these Websites for •collecting data areand BBC Bangla given thebelow: Daily Prothom Alo are very popular online news sites for the Bengali community BBC Banglaall around and the world. the Daily They Prothom Aloare popular are for publishing very popular trustworthy online news sites forand theauthentic Bengali news. community all around the world. They are popular for publishing trustworthy and share Bengali people frequently read the news and sometimes make comments to their authentic opinion. news. Bengali people frequently read the news and sometimes make comments to share theirof Although people write their comments or opinions in both Bangla and English, most the time, they opinion. choose Although Bangla. people Wetheir write studied different comments articles and or opinions found in both thatand Bangla in almost English,90% mostof the of cases, people expressed their opinion in Bangla. the time, they choose Bangla. We studied different articles and found that in almost 90% of the • The Facebook cases, page of Prothom people expressed Alo has their opinion over 13 million followers, and BBC Bangla has over in Bangla. 11 million. The These Facebook twoofpages page provide Prothom Alo enormous has over 13text postsfollowers, million as well as and a large BBCnumber Banglaof comments. has over 11 • million.isThese Cricket one oftwo thepages mostprovide popularenormous text postsfor games nowadays as Bengali well as apeople. large number We foundof comments. that people Cricket are moreisinterested one of the in most popular making games nowadays comments for Bengalinews on cricket-related people. thanWeonfound any that people other topic. are more interested in making comments Thus, we chose this category for our experiment. on cricket-related news than on any other topic. Thus, we chose this category for our experiment. Table 1 shows an example of comments collected from the Facebook pages. Table 1 shows an example of comments collected from the Facebook pages. Table 1. Example of cricket-related comments on Prothom Alo and BBC Bangla Facebook pages. Table 1. Example of cricket-related comments on Prothom Alo and BBC Bangla Facebook pages. Comments Source মাশরািফ এক জাদুকারী নাম । য নামটা নেলই মন ভের যায়। আমােদর Prothom Alo Facebook page জিহর,জনসন, পালক, টিল নাই তেব এক জন মাশরািফ আেছ বালারেদরও দাষ নই, দাষটা 100% ম ােনজেমে র। পস বালারেদর জন উইেকট তির তা Prothom Alo Facebook page তারাই কের না। দাষটা ম ােনজেম না মানেল আিম নব। আশা কির তাসিকন অিত ত ু দেল িফরেব আর িনয়িমত খলেব এবং চর আউট করেব। Prothom Alo Facebook page এখন দশকেদর কােছ জনি য় হে 20--- ট /ওয়ানেড মানুষ এখন দখেতই চায়না.. Prothom Alo Facebook page হারেলও বাংলােদশ জতেলও বাংলােদশ।আগামীেত আবার আমরাই জতেবা। ইশ! আমােদর যিদ BBC Bangla Facebook page িবরাট কাহিলর মেতা একটা ব াট ান থাকেতা আমার পরামশ হেলা েকটারেদর কেপােরট জগত থেক দুের রাখেত হেব। BBC Bangla Facebook page Peopleusually People usually comment comment inin Bangla Bangla about about the the news. news. We We also also found foundthat that5–10% 5–10%of ofthe thetime, time,they they commented in English and also wrote Bangla sentences written in the English alphabet. We did not commented in English and also wrote Bangla sentences written in the English alphabet. We did not consider these opinions in our dataset. In addition, some comments had only emoticons and no other consider these opinions in our dataset. In addition, some comments had only emoticons and no text or words. We also omitted these for our dataset. All of the processes were done manually by the other text or words. We also omitted these for our dataset. All of the processes were done manually authors. The following section describes the annotation process of the collected corpus of cricket- by the authors. The following section describes the annotation process of the collected corpus of related comments. cricket-related comments. 2.1.1. Annotation of Cricket Dataset The Bangla text on cricket was annotated jointly by the authors, a group of second-year students of BSSE, and two employees from the Institute of Information Technology, University of Dhaka,
Data 2018, 3, 15 4 of 10 2.1.1. Annotation of Cricket Dataset The Bangla text on cricket was annotated jointly by the authors, a group of second-year students of BSSE, and two employees from the Institute of Information Technology, University of Dhaka, Bangladesh. All participants agreed to categorize the whole dataset into five different aspect categories. Data These 2018, were3,3,xbowling, Data 2018, x batting, team, team management, and other. Given a comment,4 the of 114task of 12 of the annotators was to recommend the aspect category and polarity labels for each. Three types of Bangladesh. Bangladesh. All participants participantsagreed agreedtotocategorize categorize the whole dataset into five five different aspect polarities were All considered, that is, positive, negative,theandwhole dataset neutral. Tableinto 2 shows different aspect the information about categories. These These were bowling, batting, team, team management, and other. Given a comment, the thecategories. participants. were bowling, batting, team, team management, and other. Given a comment, the task of the theannotators annotatorswas wastotorecommend recommend thethe aspect aspect category category andand polarity polarity labels labels for each. for each. ThreeThree typestypes of polarities wereconsidered, polarities were considered,that thatis,is, positive, positive, negative, negative, andand neutral. neutral. Table Table 2 shows 2 shows the information the information Table 2. Information about the participants in data collection. about the the participants. participants. Participant ID Gender Profession Task Table Table2.2.Information Informationabout thethe about participants in data participants collection. in data collection. P1 Male MS student/author Data collection (Cricket) and annotation Participant ParticipantID Gender IDMale Gender Profession Profession TaskTask P2 Faculty/author Data collection (Cricket) and annotation P1 P1 Male MS student/author Data collection (Cricket) and annotation P3 MaleMale MS student/author Graduate student Data collection Annotation (Cricket) (Cricket) andand annotation translation (Restaurant) P2 P2 Male Male Faculty/author Faculty/author Data collection Data (Cricket) collection and annotation (Cricket) and annotation P4 Female Graduate student Annotation (Cricket) and translation (Restaurant) P3 Male Graduate student Annotation (Cricket) and translation (Restaurant) P5 P3 Female Male Graduate Graduate student Annotation student Annotation (Cricket) (Cricket) and translation and translation (Restaurant) (Restaurant) P4 Female Graduate student Annotation (Cricket) and translation (Restaurant) P6 P4 Male Female Graduate Graduate student Annotation student Annotation (Cricket) (Cricket) and translation and translation (Restaurant) (Restaurant) P5 Female Graduate student Annotation (Cricket) and translation (Restaurant) P7 P5 Male Female Graduate Graduate student Annotation student Annotation (Cricket) (Cricket) and translation and translation (Restaurant) (Restaurant) P6 Male Graduate student Annotation (Cricket) and translation (Restaurant) P8 P6 MaleMale Graduate Graduate student Annotation student Annotation (Cricket) (Cricket) and translation and translation (Restaurant) (Restaurant) P7 Male Graduate student Annotation (Cricket) and translation (Restaurant) P9 P7 Female Male Accountant Graduate student Annotation (Cricket) andAnnotation translation (Restaurant) P8 Male Graduate student Annotation (Cricket) and translation (Restaurant) P10 P8 MaleMale Officer Graduate student Annotation (Cricket) andAnnotation translation (Restaurant) P9 Female Accountant Annotation P9 P10 Female Male Accountant Officer Annotation Annotation P10 Male Officer Annotation Each participant categorized every comment of the dataset. We applied the majority voting Eachtoparticipant technique categorized make the final every decision comment about of thecategory the aspect dataset. and We applied the majority the polarity voting As an of a sentence. Each to technique participant categorized make the final decision every comment about the of the dataset. aspect category and theWe applied polarity of a the majority sentence. voting As an example, we have taken the following comment: technique to make the final decision about the aspect category and the polarity of a sentence. As an example, we have taken the following comment: এই following example, we have taken“the িপেচ রান করা টাফ, বািলং িনঃসে েহ ভােলা হেয়েছ ” comment: The voting result we “এই foundিপেচ for রান the করা টাফ,is is comment The voting result we found for the comment বািলং given িনঃসে in Table given েহ 3. ভােলা in Table 3. হেয়েছ ” The voting result we found for the comment is given in Table 3. Table 3. Voting example to define the category and polarity. Table 3. Voting example to define the category and polarity. Table 3. এই Comment: Voting িপেচ example to define রান করা টাফ, বািলংthe category িনঃসে andহেয়েছ েহ ভােলা polarity. Participant Comment: এই Voting িপেচ রানforকরা Category টাফ, বািলংVoting িনঃসে forেহPolarity ভােলা হেয়েছ Comment: P1 Bowling Positive Participant Voting for Category Voting for Polarity Participant P2 Voting for Category Bowling Voting for Polarity Positive P1 Bowling Positive P3 P1P2 Batting Bowling Negative Positive Bowling Positive P4 P2 Batting Bowling Negative Positive P3 Batting Negative P5 Other Neutral P3 Batting Negative P6P4 Batting Bowling Negative Positive P4P5 Batting Other Negative Neutral P7 Other Neutral P5 Other Neutral P8P6 Bowling Bowling Positive Positive P6 Bowling Positive P9P7 Other Bowling Neutral Positive P7 Other Neutral P8 P10 Bowling Batting Positive Negative P8 Bowling Positive P9 Bowling Positive P9 Bowling Positive From Table 3, we can P10 see that the commentsBatting Negative had three votes for Batting with a negative polarity, P10 Batting Negative two votes for Other with a neutral polarity, and four votes for Bowling with a positive polarity. Thus, From Table our method 3, we can determined thissee that theascomments comment had being in the three votes Bowling for Batting category with a negative with a positive polarity. polarity, We From two Table also votes 3, forfor had ties we Other can somewith see that a neutral comments. the comments Inpolarity, had and four this situation, three votesboth we took votes for for Bowling Batting with with with the categories their negative a a positive polarity.inpolarity, polarity Thus, two votes our forTable method dataset. Other withthis determined 4 shows aanneutral comment example polarity, asthis for being and four in the scenario. votes category Bowling for Bowlingwith with a positive a positive polarity.polarity. We Thus, alsoour hadmethod ties for determined this comment some comments. as beingwe In this situation, intook the Bowling category with both the categories withatheir positive polarity. polarity in Weour alsodataset. had tiesTable 4 shows for some an example comments. In for thisthis scenario. situation, we took both the categories with their polarity in our dataset. Table 4 shows an example for this scenario.
Data 2018, 3, 15 5 of 10 Data 2018, 3, x 5 of 11 Table 4. Voting category identification. Data 2018, 3, x Table 4. Voting category identification. 5 of 12 Comment: ওরা4.200 Table Comment: কেরেছ, Voting তামরাidentification. category 100 করেত পারেব না? Participant Participant Voting Comment: ওরা forfor Category 200 কেরেছ, Voting Voting for Polarity তামরা 100 করেতVoting Category পারেব না? for Polarity P1 P1 Participant Batting Voting Batting for Category VotingNegative for Negative Polarity P2P1 P2 Team Team Batting Negative Negative Negative P3P2 P3 Batting Batting Team Negative Negative Negative P4 P4P3 Batting Batting Batting Negative Negative Negative P5 Batting Negative P5P4 Batting Batting Negative Negative P6 Team Negative P6P5 P7 Batting Team Team Negative Negative Negative P7P6 P8 Team Team Batting Negative Negative Negative P8P7 P9 Team Team Batting Negative Negative Negative P10 P8 Batting Team Negative Negative P9 Team Negative P9 Team Negative P10P10 Team Team Negative Negative We can see from Table 4 that 50% of the evaluators voted for Batting, and 50% of the votes were We for the can We see Team can from category, see fromTable both 4 that with Table 50% 50%ofofthe a negative 4 that evaluators polarity. the voted As they evaluators had voted forfor Batting, tied, ourand Batting, and 50% algorithm 50% of votes took of the the votes both were of these were for the forTeam categories category, theinTeam the labeled category,both with dataset both a anegative with with anegative negativepolarity. polarity. polarity. Asthey As theyhadhadtied, tied, ourour algorithm algorithm tooktook bothboth of of theseWecategories these also in the categories faced labeled in the another dataset labeled kind dataset with withaato of problem negative negative constructpolarity. polarity. the dataset. After calculating the category, we found We We also also facedfaced another another dissimilarities kind forkind some ofof problemto problem comments toamong construct constructthethe thedataset. dataset. participants After calculating After calculating regarding thethepolarity category, the category, of the we comment. found we found For dissimilarities dissimilarities example, when for some for some comments comments we considered among theamong the participants the comment, following participants regarding weregarding the found the the polarity voting of the polarity result of the given comment. comment. in Table 5. For For example, example, when when weweconsidered considered thethe following following comment, comment, wewefound foundthe the voting resultresult voting given in Table 5. given in Table 5. Table 5. Problem related to polarity determination. Table 5. Problem related to polarity determination. Table 5. Problem related to polarity determination. Comment: রা াক বুেড়া হেয় গেছ, খলা পােরনা। তাহেল আজ িকভােব িক করেলা? Comment: Comment: রা াক বুেড়া হেয় Participant গেছ, Voting খলা পােরনা। তাহেলVoting for Category আজ িকভােব িক করেলা? for Polarity Participant Voting for Category Voting for Polarity Participant P1 Voting Bowling for Category Voting for Polarity Positive P1 Bowling Positive P1 P2 Bowling Bowling Positive Positive P2 Bowling Positive P2 P3 P3 Team Bowling Team Negative Positive Negative P3 P4 P4 Bowling Team Bowling Negative Negative Negative P5 Other Positive P4 P5 Other Bowling Positive Negative P6P6 Team Team Negative Negative P5 Other Positive P7P7 Other Other Negative Negative P6 P8 P8 Team Team Team Negative Negative Negative P7 P9 P9 Other Bowling Bowling Negative Positive Positive P8 P10 P10 TeamTeam Team Negative Negative Negative P9 Bowling Positive From Table P10 5, we can see that, for the comment, both the Bowling andNegative Team categories had four From Table 5, we can see that, for theTeamcomment, both the Bowling and Team categories had four votes. Thus, we keep both of these categories in our annotated dataset. In polarity, there were three votes. Thus, we keep both of these categories in our annotated dataset. In polarity, there were three positive From votes5,and Table weone cannegative see that,vote for bowling. Thus, we the considered it as having a positive polarity four positive andvotes addedand it toone our negative votefor dataset. Table the 6for comment, bowling. shows both Thus, a sample Bowling welabeled of the considered Cricket and it as Team havingcategories dataset. a positivehad polarity votes. and Thus,itwe added to ourkeep both ofTable dataset. these6 categories shows a sample in ourofannotated dataset. the labeled Cricket Indataset. polarity, there were three positive votes and one negative vote for bowling. Thus, In Table 7, the summary of the complete Cricket dataset is presented. We we considered it as having a positive can see polarity that there are a and added it to our dataset. Table 6 shows a sample of the labeled Cricket total of 3034 different comments with five different categories, that is, Batting, Bowling, Team, Team dataset. Management, and Other.Table Each 6. ofAthe categories part contains of the Cricket threein dataset different polarities: positive, negative, xlsx format. and neutral. For example, the Batting category contains a total of 583 comments, for which 138 are of positive, 389 are ofবয্াপার না। eটা and negative, ধু মাt56ঘর্টare না ছাড়া িকছু না। polarity. The Bowling category of neutral other contains 154neutral positive, 145 negative, and 33 neutral ভকামনা টাiগারেদরcomments. polarity জনয্। The Team, Team Management, team and Other categories positive contain totals of বাংলােদশ eখেনাand 774, 332, তািমেমর 1013 েযাগয্ oেপনার েপেলাrespectively. comments, না। batting negative বাংলােদশ হারেব আজ । team negative সাতজন িsনার িনেয় মােঠ েবাতল টানােনা যায় ময্াচ েজতা যায় না। bowling negative টািনর্ং িপচ বািনেয় ঔষধ খু েজ লাভ িক বাuিn িপচ বানােলi হয়। team management negative eটা পু রাদেম িফিkং eকটা েখলা হiেস। team negative জয় ধু সমেয়র aেপkা। other positive েযi জেয়র সমান!! other neutral
Data 2018, 3, 15 6 of 10 Data 2018, 3, x 6 of 12 6. A Table 6. Table A part part of of the the Cricket Cricket dataset dataset in in xlsx xlsx format. format. বয্াপার না। eটা ধু মাt ঘর্টনা ছাড়া িকছু না। other neutral ভকামনা টাiগারেদর জনয্। team positive বাংলােদশ eখেনা তািমেমর েযাগয্ oেপনার েপেলা না। batting negative বাংলােদশ হারেব আজ । team negative সাতজন িsনার িনেয় মােঠ েবাতল টানােনা যায় ময্াচ েজতা যায় না। bowling negative টািনর্ং িপচ বািনেয় ঔষধ খু েজ লাভ িক বাuিn িপচ বানােলi হয়। team management negative eটা পু রাদেম িফিkং eকটা েখলা হiেস। team negative জয় ধু সমেয়র aেপkা। other positive েযi জেয়র সমান!! other neutral েবালাররা েয পিরমােন শটর্ বল িদেc- তােত রান কতেবিশ হয় েসটাi েদখার িবষয়! bowling negative তােক েটs আর oিডআi দেল িনয়িমত চাi। team positive বাংলােদশ িkেকট আেরা eিগেয় যােব, oেপনারেদর eকটু ভােলা করেত হেব। team positive বাংলােদশ িkেকট আেরা eিগেয় যােব, oেপনারেদর eকটু ভােলা করেত হেব। batting negative িফেরi চমক েদখােলন রাjাক bowling positive নবীনেদর সু েযাগ েদয়া দরকার. other neutral েবািলং িপচ তেব আমােদর বয্াটসময্ানেদর আuট েলা আtহতয্া ছাড়া আর িকছু i নয়। batting negative েবািলং িপচ তেব আমােদর বয্াটসময্ানেদর আuট েলা আtহতয্া ছাড়া আর িকছু i নয়। bowling neutral দািয়tjান হীনতার aভাব? other negative In Table 7, the summaryTableof7.the Thecomplete completeCricket dataset statistics is presented. of Cricket dataset. We can see that there are a total of 3034 different comments with five different categories, that is, Batting, Bowling, Team, Team Management, and Other. Each of the categories contains Polaritythree different polarities: positive, negative, and neutral. ForCategory example, the Batting category contains a total of 583 comments, for Total which 138 are of Positive Negative Neutral positive, 389 are of negative, and 56 are of neutral polarity. The Bowling category contains 154 Batting positive, 145 negative, and 33 neutral138 389 polarity comments. The Team,56Team Management, 583 and Other Bowling 154 145 33 332 categories containTeam totals of 774, 332, and 166 1013 comments,502 respectively.66 774 Team Management 24 293 15 332 Other Table 7. The 89 complete statistics 828 of Cricket dataset. 96 1013 Total Comments Polarity 3034 Category Total Positive Negative Neutral 2.1.2. Analysis of Proposed Cricket Batting Dataset 138 389 56 583 Bowling 154 145 33 332 We used Zipf’s law [14] for our proposed Cricket dataset. Zipf’s law is a statement-based Team 166 502 66 774 observation that states that TeaminManagement a collection of data, 24 the frequency 293 of15a given332 word should be inversely proportional to its rank in the Other corpus. The word 89that is the 828most frequent 96 and 1013scores rank 1 in a dataset should occur approximately twice as the second Total most frequent word, three Comments 3034times as the third most frequent, and so on. Figure 1 shows the diagram in which we plotted the words of our Cricket dataset. 2.1.2.follows The plot Analysisthe of trend Proposed Cricketlaw. of Zipf’s Dataset We also calculated the reliability of the annotation process. The valueWe of the intraclass correlation (ICC) was 0.71. used Zipf’s law [14] for our proposed Cricket dataset. Zipf’s law is a statement-based observation that states that in a collection of data, the frequency of a given word should be inversely 2.2. Restaurant Dataset proportional to its rank in the corpus. The word that is the most frequent and scores rank 1 in a dataset should To occur create theapproximately twice asdataset, Bangla Restaurant the second we most tookfrequent word, three help directly from times as the third the English most benchmark’s frequent, and so on. Figure 1 shows the diagram in which we plotted the words of our Cricket dataset. Restaurant dataset [3]. All comments were abstractly translated into Bangla with their exact annotation. The plot follows the trend of Zipf’s law. We also calculated the reliability of the annotation process. The original English dataset contains a total of 2800 different comments. Participants from the same The value of the intraclass correlation (ICC) was 0.71. group involved in the Cricket dataset’s creation were involved in the translation process of the Restaurant dataset, except participants P9 and P10. We divided the original dataset equally into eight parts and distributed these to the participants. They translated their assigned parts of the original English dataset abstractly. Finally, participants P1 and P2 merged the separate sections and performed an extensive proofread.
Data 2018, 3, 15 7 of 10 Figure Figure 1. 1. Distributionof Distribution of word word fre que ncie s of frequencies ofCricke t datase Cricket t using dataset usingZipf’s law. Zipf’s law. Annotation Schema for Restaurant The Restaurant reviews [3] dataset used in this paper was abstractly translated into Bangla. There were five types of aspect categories, that is, Food, Price, Service, Ambiance, and Miscellaneous. As the objective was to identify the aspect category and corresponding polarity, the participants did not add aspect terms or their polarities. In terms of the polarity of an aspect category, we considered only three polarity labels, that is, positive, negative and neutral. The original dataset consisted of four different polarity labels: positive, negative, neutral, and conflict. In our translated Bangla dataset, we omitted the conflict category and assumed it to be the same as the neutral category. The annotators were asked to assign each translated Bangla restaurant review into their categories and their polarities for the original dataset. Table 8 shows a sample of the translated Restaurant dataset. Data 2018, 3, x 8 of 12 Table8.8.AApart Table partof ofthe theRestaurant datasetin Restaurant dataset inxlsx xlsxformat. format. খু ব সীিমত আসন আেছ eবং খাদয্ পাoয়ার জনয্ যেথ aেপkা করেত হেব। ambience negative খু ব সীিমত আসন আেছ eবং খাদয্ পাoয়ার জনয্ যেথ aেপkা করেত হেব। service negative Figure 2. Wordদাম freতুque লনামূncy of Bangla লকভােব কম। Re staurant datase t according to Zipf’s law. price positive াi িছল মজাদার food positive 3 যিদo খাবারিট চমৎকার িছল, eিট সsা িছল না। food positive যিদo খাবারিট চমৎকার িছল, eিট সsা িছল না। price negative খু ব ভাল! miscellaneous positive আচােরর সংেযাজন খু ব ভাল িছল । food positive ধু মাt রাnাi েয েসরা তা নয় , েসবা সবসময় মেনােযাগী eবং ভাল হেয়েছ। food positive ধু মাt রাnাi েয েসরা তা নয় , েসবা সবসময় মেনােযাগী eবং ভাল হেয়েছ। service positive সবর্দা eকিট সু nর িভড়, িকn েকান েকালাহল েনi। ambience positive সjা alsl eবং পির ার - িব াn বা pশংসা করা িকছু i েনi ambience neutral আিম িনি ত েয আমােক বারবর িফের েযেত হেব, !!! miscellaneous positive সmাবত eিট eকিট েছাট আরামদায়ক েরsু েরn,ভাল সjার সে েরামািnক aনু ভূিত। ambience positive যিদo খাদয্ ভাল িছল পিরেবষনা িছল িব । food positive যিদo খাদয্ ভাল িছল পিরেবষনা িছল িব । service negative Da ta 2018, 3, x; doi: কম রা মেনােযাগী eবং বnু tপূ ণ।র্ www.mdpi.c om/journal/data service positive খাবার ভাল িছল। food positive Table 9 shows the complete statistics of the Bangla Restaurant dataset. We can see from the table Table that9five shows the complete different categories, statistics of the that is, Food, Bangla Price, Restaurant Service, Ambiance,dataset. We can see and Miscellaneous, from the table contained that five713, different categories, 178, 336, thatreviews, 234, and 613 is, Food, Price, Service, respectively, Ambiance, with three and different Miscellaneous, polarities. contained For example, the 713, 178, 336,category 234, andFood 613contained reviews,500 positive, 126with respectively, negative, threeand 87 neutral different sentiment polarities. Forlabels. The Service example, the category category contained Food contained 186 positive, 500 positive, 118 negative, 126 negative, andand87 32 neutralsentiment neutral sentiments.labels. We alsoThe found that this Service category Restaurant dataset also followed Zipf’s law, which is shown in Figure 2. Table 9. Complete statistics of Bangla Restaurant dataset. Polarity Category Total Positive Negative Neutral Food 500 126 87 713
Data 2018, 3, 15 8 of 10 contained 186 positive, 118 negative, and 32 neutral sentiments. We also found that this Restaurant dataset also followed Zipf’s law, which is shown in Figure 2. Table 9. Complete statistics of Bangla Restaurant dataset. Polarity Category Total Positive Negative Neutral Food 500 126 87 713 Price 102 60 16 178 Service 186 118 32 336 Ambiance 138 53 43 234 Figure 1. Distribution of word fre que ncie s of Cricke t datase t using Zipf’s law. Miscellaneous 300 120 193 613 Figure2.2.Word Figure Wordfrequency fre que ncyof of Bangla Bangla Re staurant datase Restaurant datasett according accordingtotoZipf’s Zipf’slaw. law. 3 3. Baseline Evaluation Our objective is to provide benchmark datasets for Bangla ABSA. Our datasets are designed for two major tasks of ABSA. These are aspect category extraction and the identification of polarity for each aspect category. In this paper, we experimented with the first subtask, that is, the extraction of the aspect category. We applied three major steps to extract the aspect category. Firstly, preprocessing was performed on the dataset. After this, we extracted features from the data and finally performed classification using some popular classification models. 3.1. Preprocessing and Feature Extraction In the preprocessing phase, each Bangla document was represented as a “bag of words”. We applied traditional preprocessing steps for the evaluation. Firstly, punctuations Da ta 2018, 3, x; doi: and stop words www.mdpi.c om/journal/data were removed from each of the comments. After this, we removed the digits from our dataset, because we found that digits were not necessary for the aspect category. Finally, we tokenized each Bangla word from our dataset. Thus, a vocabulary of Bangla words was prepared after preprocessing. We created a feature matrix for which each review was represented by a vector of that vocabulary. Term frequency–inverse document frequency (TF–IDF) was used for calculating the features. 3.2. Results In the training phase, extracted feature sets were trained by the popular supervised machine learning algorithms. Because this was a multi-label classification problem, we trained our models
Data 2018, 3, 15 9 of 10 by setting up multi-label output. We used linear SVC in the support vector machine (SVM) implementation. The following machine learning algorithms were used: I. Support Vector Machine (SVM) II. Random forest (RF) III. K-nearest neighbor (KNN) After the training was completed, our proposed Bangla test dataset was executed on the trained model. The result is shown in the following table and figure. Table 10 shows the results for the task of aspect category extraction of the datasets we have presented in this paper. We can see that using the SVM, we obtained the highest precision rate for both of the datasets. Both datasets showed a low recall and F1-score. Figure 3 shows the overall accuracy of the models using our datasets. The inherent nature of the datasets is the reason behind the lower performance of the models for both datasets. People share their opinion with their individual judgment. Therefore, the variety of opinions in the datasets is much larger. On the other hand, aspect extraction is a multi-label classification problem. One’s opinion might have multiple aspect categories. Conventional classifiers miss some of these aspect categories. Table 10. Performance of proposed datasets. Dataset Model Precision Recall F1-Score SVM 0.71 0.22 0.34 Cricket RF 0.60 0.27 0.37 KNN 0.45 0.21 0.25 SVM 0.77 0.30 0.38 Restaurant RF 0.64 0.26 0.33 KNN 0.54 0.34 0.42 Data 2018, 3, x 10 of 11 Accuracy 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 SVM RF KNN SVM RF KNN Cricket Restaurant Precision Recall F1-score Figure 3. The result of three models of our datasets. Figure 3. The result of three models of our datasets. 4. Conclusions and Future Work Two datasets are provided for the ABSA of Bangla text. These datasets have been designed to These results can be improved if we process and train the datasets in a more sophisticated perform two tasks covering aspect category extraction and the identification of polarity for that aspect way. In this work, category. Wehave we taken also report all ofresults baseline the vocabulary as features to evaluate the task for theextraction. of aspect category evaluation after removing As future plans, we aim to enhance our work by including further punctuation, stop words, and digits. Some state-of-the-art techniques for information domains such as cars, mobiles, gain can be and laptops. We are working on more advanced methods for the ABSA of Bangla text using our applied to the dataset before classification and after the preprocessing steps to attain better results. datasets to achieve better performance. 4. Conclusions and Author Future Work Contributions: All authors contributed equally to this work, and have read and approved the final manuscript. Two datasets are provided for the ABSA of Bangla text. These datasets have been designed to Conflicts of Interest: The authors declare no conflict of interest. perform two tasks covering aspect category extraction and the identification of polarity for that aspect category. WeReferences also report baseline results to evaluate the task of aspect category extraction. 1. Trusov, M.; Bucklin, R.E.; Pauwels, K. Effects of word-of-mouth versus traditional marketing: Findings from an internet social networking site. J. Mark. 2009, 73, 90–102. 2. Jeyapriya, A.; Selvi, C.K. Extracting Aspects and Mining Opinions in Product Reviews Using Supervised Learning Algorithm. In Proceedings of the 2015 2nd International Conference on Electronics and Communication Systems (ICECS), Coimbatore, India, 26–27 February 2015. 3. Pontiki, M.; Galanis, D.; Pavlopoulos, J.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S. SemEval-2014 Task 4: Aspect Based Sentiment Analysis. Available Online:
Data 2018, 3, 15 10 of 10 As future plans, we aim to enhance our work by including further domains such as cars, mobiles, and laptops. We are working on more advanced methods for the ABSA of Bangla text using our datasets to achieve better performance. Author Contributions: All authors contributed equally to this work, and have read and approved the final manuscript. Conflicts of Interest: The authors declare no conflict of interest. References 1. Trusov, M.; Bucklin, R.E.; Pauwels, K. Effects of word-of-mouth versus traditional marketing: Findings from an internet social networking site. J. Mark. 2009, 73, 90–102. [CrossRef] 2. Jeyapriya, A.; Selvi, C.K. Extracting Aspects and Mining Opinions in Product Reviews Using Supervised Learning Algorithm. In Proceedings of the 2015 2nd International Conference on Electronics and Communication Systems (ICECS), Coimbatore, India, 26–27 February 2015. 3. Pontiki, M.; Galanis, D.; Pavlopoulos, J.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S. SemEval-2014 Task 4: Aspect Based Sentiment Analysis. Available online: http://www.aclweb.org/anthology/S14-2004 (accessed on 3 May 2018). 4. Al-Smadi, M.; Qawasmeh, O.; Talafha, B.; Quwaider, M. Human Annotated Arabic Dataset of Book Reviews for Aspect Based Sentiment Analysis. In Proceedings of the 2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud), Rome, Italy, 24–26 August 2015. 5. Tamchyna, A.; Fiala, O.; Veselovská, K. Czech Aspect-Based Sentiment Analysis: A New Dataset and Preliminary Results. Available online: https://pdfs.semanticscholar.org/cbd8/7f4201c427db33783b1890bca65f5bf99d2c.pdf (accessed on 3 May 2018). 6. Apidianaki, M.; Tannier, X.; Richart, C. Datasets for Aspect-Based Sentiment Analysis in French. Available online: http://www.lrec-conf.org/proceedings/lrec2016/pdf/61_Paper.pdf (accessed on 3 May 2018). 7. Gayatree, G.; Elhadad, N.; Marian, A. Beyond the Stars: Improving Rating Predictions Using Review Text Content. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.150.140&rep=rep1& type=pdf (accessed on 3 May 2018). 8. Kiritchenko, S.; Zhu, X.; Cherry, C.; Mohammad, S. NRC-Canada-2014: Detecting Aspects and Sentiment in Customer Reviews. Available online: http://www.aclweb.org/anthology/S14-2076 (accessed on 3 May 2018). 9. Kiritchenko, S.; Zhu, X.; Cherry, C.; Mohammad, S. Supervised and Unsupervised Aspect Category Detection for Sentiment Analysis with Co-cccurrence Data. In IEEE Transactions on Cybernetics; IEEE: Piscataway, NJ, USA, 2017. 10. Soujanya, P.; Cambria, E.; Gelbukh, A. Aspect extraction for opinion mining with a deep convolutional neural network. Knowl.-Based Syst. 2016, 108, 42–49. 11. Pengfei, L.; Joty, S.; Meng, H. Fine-Grained Opinion Mining with Recurrent Neural Networks and Word Embeddings. Available online: http://www.aclweb.org/anthology/D15-1168 (accessed on 3 May 2018). 12. Pontiki, M.; Galanis, D.; Papageorgiou, H.; Manandhar, S.; Androutsopoulos, I. Semeval-2015 Task 12: Aspect Based Sentiment Analysis. Available online: http://www.aclweb.org/anthology/S15-2082 (accessed on 3 May 2018). 13. Pontiki, M.; Galanis, D.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S.; AL-Smadi, M.; Al-Ayyoub, M.; Zhao, Y.; Qin, B.; De Clercq, O.; et al. SemEval-2016 Task 5: Aspect Based Sentiment Analysis. Available online: http://www.aclweb.org/anthology/S16-1002 (accessed on 3 May 2018). 14. Pak, A.; Paroubek, P. Twitter as A Corpus for Sentiment Analysis and Opinion Mining. Available online: http://crowdsourcing-class.org/assignments/downloads/pak-paroubek.pdf (accessed on 3 May 2018). © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
You can also read