Recognition of Good, Bad, and Neutral News Headlines in Portuguese
Advanced Science and Technology Letters Vol. 97 (UCMA 2015), pp. 88-93
http://dx.doi.org/10.14257/astl.2015.97.15

António Paulo Santos¹, Carlos Ramos¹, and Nuno C. Marques²

¹ GECAD, Institute of Engineering - Polytechnic of Porto, Portugal
² DI-FCT, Universidade Nova de Lisboa, Monte da Caparica, Portugal
pgsa@isep.ipp.pt, csr@isep.ipp.pt, nmm@di.fct.unl.pt

Abstract. This paper investigates the classification of news headlines as positive, negative, or neutral. A news headline is positive if it is associated with good things, negative if it is associated with bad things, and neutral in the remaining cases. The class of a news headline is predicted using a supervised approach. The experiments show an accuracy ranging from 59.00% to 63.50% when argument1-verb-argument2 relations are combined with other features, and from 57.50% to 62.50% when these relations are not used.

1 Introduction

In the future, some smart devices will be able to recognize the emotional state of humans. When a negative emotional state is identified, these devices will communicate with other devices and try to create a positive environment: playing suitable music or a movie to create a relaxed atmosphere, or choosing and displaying good news, among other actions. This motivates us to consider the problem of classifying news articles by overall sentiment, determining whether a news headline is positive, negative, or neutral. Using the corpus compiled for "Task 14: Affective Text" at the SemEval-2007 workshop [10], we apply different approaches and algorithms.

2 Classifying News Headlines - Applied Approach

To classify a news headline as positive (or good), negative (or bad), or neutral, we performed different experiments applying a supervised machine learning approach with two classification algorithms (SMO [8] and Random Forest [2]). The first step was to obtain an existing dataset: we used the SemEval-2007 dataset [10], created for "Task 14: Affective Text" of the International Workshop on Semantic Evaluations. The second step was to pre-process the dataset and represent each headline as a vector of features. In one of the experiments, the features were unigrams and bigrams (sequences of two words). This representation is known as bag-of-words (BOW) because word ordering is lost. To compensate for this loss, we extracted argument1-verb-argument2 relations (as described in the next section) from each news headline and used them as features. These syntactic features were combined with unigrams and bigrams in another experiment. In a further line of features, we investigated counts of certain types of words (e.g. the number of positive adjectives).
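As an illustration of this second step, the sketch below builds the unigram-and-bigram representation in Python with scikit-learn; the library choice and the sample headlines are our own assumptions, not part of the paper's Weka-based setup.

```python
# A minimal sketch of the bag-of-n-grams representation, assuming
# scikit-learn; the sample headlines are illustrative placeholders.
from sklearn.feature_extraction.text import CountVectorizer

headlines = [
    "João Pereira falha jogos com o Everton",
    "Sevilla coloca pé na final",
]

# ngram_range=(1, 2) produces both unigrams and bigrams; binary=True
# records presence rather than frequency, as in Pang et al. [6].
vectorizer = CountVectorizer(ngram_range=(1, 2), binary=True)
X = vectorizer.fit_transform(headlines)

print(vectorizer.get_feature_names_out())  # unigram/bigram vocabulary
print(X.toarray())                         # one feature vector per headline
```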
The third step was to apply a learning algorithm over the training set to obtain a classification model. We applied two different learning algorithms using Weka [4]: Sequential Minimal Optimization (SMO) [8], a Support Vector Machine (SVM) method, and the tree-based Random Forest [2]. The learning algorithm aims to recognize the features that allow a news headline to be assigned a given class. In the fourth step, we evaluated the classification model learned by the algorithm in the previous step.

3 Classifying News Headlines - argument1-verb-argument2 Relations

When dealing with subjective texts, such as texts containing opinions, it is common to rely on adjectives and adverbs to identify the polarity of those texts [11]. However, in factual text such as news articles, adjectives are much less frequent. Also, as described in the previous section, a bag-of-words representation of text does not take word ordering into account. We believe that extracting argument1-verb-argument2 relations from news headlines and using them as features for machine learning algorithms can mitigate both problems. An argument1-verb-argument2 relation is mainly a relation between nouns. These relations capture part of the meaning of a news headline, and most of the time they capture its essence. For example, from the news headline "João Pereira falha jogos com o Everton" (João Pereira misses games with Everton), the relation "João Pereira-falha-jogos" (João Pereira-misses-games) is extracted. As we can see, part of the information is lost, but the essential meaning is captured.

To extract the argument1-verb-argument2 relations, we applied the following steps (a sketch of the core extraction appears after the list):

1. A part-of-speech (POS) tagger is applied to each news headline, labelling each word with its part of speech (e.g. noun, verb, adjective, adverb). This step uses the OpenNLP POS Tagger 1.5.2 (http://opennlp.sourceforge.net/) with a maximum entropy model trained on the Bosque 8.0 corpus (http://www.linguateca.pt/foresta/corpus.html).

2. A named entity recognizer (NER) is applied to each news headline, recognizing persons, organizations, locations, facilities, and events. When a unit of text can be recognized as a potential named entity but cannot be classified into one of the mentioned classes, it is assigned the type "unknown entity". This classification remains useful for our main goal, which is to identify entities composed of more than one word. This step was done by adapting the NER of the ANNIE system (https://gate.ac.uk/ie/annie.html) to Portuguese.
3. Multiword expressions are labelled. In this step, consecutive words that represent a concept with the potential of leading to a positive or negative situation are labelled. For example, the concept "red card" is labelled and associated with a negative a priori polarity. This concept can also be used in a positive context, but the main objective of this step is to capture multiple-word concepts. It was performed using a pre-built dictionary of multiword expressions.

4. A phrase chunker is applied to each headline. Consecutive words are grouped into non-overlapping phrases, namely NP (noun phrase), VP (verb phrase), PP (prepositional phrase), ADJP (adjective phrase), and ADVP (adverb phrase), using the OpenNLP Chunker 1.5.2.

5. PHRASE1-VP-PHRASE2 triples are extracted from each news headline. These triples are a preliminary version of the argument1-verb-argument2 relations, where Phrase1 and Phrase2 are mainly NPs. An example is "Cesc Fabregas"-"wants to win"-"Premier League" (pattern: NP-VP-NP). The triples are extracted using syntactic patterns (e.g. NP-VP-NP), which were manually defined by examining 200 news headlines; the patterns found were then aggregated, producing the patterns shown in Table 1.

Table 1. Main patterns for extracting relations. Each relation is extracted by extracting each phrase separately (an implementation choice).

Patterns to extract PHRASE1:
  [NP1 (PP NPn)*] negation_word? VP
  [NP1 (, NPn)*,? (and|or) NPn+1] negation_word? VP

Patterns to extract PHRASE2:
  negation_word? VP (PP|ADVP)? [NP]
  negation_word? VP [ADJP|ADVP]

Symbols: parentheses group one or more phrases; negation_word stands for a word in a negation-word list (e.g. no, not, never); ? means the preceding element occurs 0 or 1 time; * means it occurs 0 or more times; "and" and "or" literally mean those words; | matches either the expression preceding it or the expression following it; [ ] encloses the phrases to be extracted if the entire pattern matches the text.

6. argument1-verb-argument2 relations are extracted from the PHRASE1-VP-PHRASE2 triples according to the following heuristics. From the VP we extract the main verb; since a VP may contain more than one verb, we assume that the main verb is always the last one in the VP. Both arguments are obtained by extracting the core element of the respective phrase: for an NP the core element is a noun, for an ADJP an adjective, and for an ADVP an adverb. Following these heuristics, we extract one argument1-verb-argument2 relation from each PHRASE1-VP-PHRASE2 triple. In addition to the three elements of a relation (argument1, verb, and argument2), other information (which we call attributes) about these elements is also extracted.

7. Inflected words are converted into their roots, i.e. the words are lemmatized. For example, the relation Sevilla-coloca-pé (in English: Sevilla-puts-foot) is converted into Sevilla-colocar-pé (in English: Sevilla-to put-foot). This procedure reduces the number of relation variants and allows querying the dictionary of sentiment words, where each word is lemmatized.
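The following sketch illustrates steps 5-7 on a pre-chunked headline. The chunk format, the head-word heuristic for the arguments, and the toy lemma table are our own simplifications for illustration; the authors' implementation operates on OpenNLP output with the full pattern set of Table 1.

```python
# A minimal sketch of steps 5-7, assuming the headline has already been
# POS-tagged and chunked (the paper uses OpenNLP for both). The chunk
# format, head-word heuristic, and toy lemma table are illustrative
# stand-ins, not the authors' implementation.

# Each chunk is (label, [(token, pos), ...]).
chunks = [
    ("NP", [("Sevilla", "noun")]),
    ("VP", [("coloca", "verb")]),
    ("NP", [("pé", "noun")]),
]

LEMMAS = {"coloca": "colocar", "falha": "falhar"}  # toy lemmatizer (step 7)

def core(chunk, pos):
    """Core element of a phrase: here, the last token with the given POS."""
    _, tokens = chunk
    matches = [t for t, p in tokens if p == pos]
    return matches[-1] if matches else None

def extract_relation(chunks):
    """Match the NP-VP-NP pattern (step 5) and build the triple (step 6)."""
    for i in range(len(chunks) - 2):
        if [c[0] for c in chunks[i:i + 3]] == ["NP", "VP", "NP"]:
            arg1 = core(chunks[i], "noun")
            verb = core(chunks[i + 1], "verb")  # main verb: last in the VP
            arg2 = core(chunks[i + 2], "noun")
            return (arg1, LEMMAS.get(verb, verb), arg2)
    return None

print(extract_relation(chunks))  # -> ('Sevilla', 'colocar', 'pé')
```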
4 Experiments and Results

4.1 Dataset

In our experiments, we used the SemEval-2007 Task 14 dataset [10], created for "Task 14: Affective Text" of the Semantic Evaluation Workshop 2007 (SemEval-2007). It is a corpus of 1,250 English news headlines covering multiple topics (e.g. sport, health, politics, world), extracted from news websites (such as Google News and CNN) and/or newspapers. Each news headline is annotated with a value indicating its valence (its degree of positivity or negativity), ranging from -100 (a highly negative headline) to 100 (a highly positive headline), where 0 represents a neutral headline. The dataset was independently labeled by six annotators; the average inter-annotator agreement, measured with the Pearson correlation, was 78.01. For our experiments, we performed two operations: 1) we translated the news headlines to Portuguese, and 2) applying the same rule as [10], we mapped the valence annotation to a negative/neutral/positive classification (negative = [-100,-50], neutral = (-50,50), positive = [50,100]).

4.2 Sentiment Classification - Evaluation

The goal of this experiment was to evaluate the use of different features for sentiment classification of news headlines by measuring the classification performance of combinations of features and machine learning algorithms. We performed 6 experiments: first, 3 experiments using different features but without the syntactic features; then the same 3 experiments with the argument1-verb-argument2 relations added as features. All experiments are compared using the accuracy measure with 10-fold cross-validation. Each experiment is summarized below.

Experiment 1 - word n-grams as features. The experimental setup closely follows that of Pang et al. [6]: representing each news headline as a bag-of-words (a bag of n-grams, in fact), we used unchanged unigram and bigram features as in Pang et al. [6].

Experiment 2 - numeric features with a generic dictionary. In this experiment, for each news headline, we counted words of certain types and used the counts as attributes, for example the number of positive, negative, and neutral verbs in the headline. The polarity of words was determined with a dictionary generated by the algorithm of [9]. The full list of features is (a sketch of their computation follows the list):

wrdsNeg, wrdsNeu, wrdsPos - Total number of negative, neutral, and positive content words (nouns, verbs, adjectives, and adverbs) within a news headline.

sentNegativity, sentNeutrality, sentPositivity - Total number of negative (respectively neutral and positive) words divided by the total number of content words (wrdsNeg/contWords, wrdsNeu/contWords, wrdsPos/contWords).

majorPolarity - Takes the value -1 if wrdsNeg > wrdsPos and wrdsNeg > wrdsNeu; 1 if wrdsPos > wrdsNeg and wrdsPos > wrdsNeu; 0 if wrdsNeu > wrdsPos and wrdsNeu > wrdsNeg; and 100 in all other cases.

negAdj, neuAdj, posAdj, negAdv, neuAdv, posAdv, negNouns, neuNouns, posNouns, negVerbs, neuVerbs, posVerbs - Total number of negative, neutral, and positive adjectives, adverbs, nouns, and verbs.

avgAdjPol, avgAdvPol, avgNounsPol, avgVerbsPol - The average polarity of all adjectives, adverbs, nouns, and verbs in the sentence.
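The sketch below computes the core counting features just listed from a pre-tagged headline. The toy polarity dictionary stands in for the dictionary generated with [9], and the input format is our own assumption.

```python
# A minimal sketch of the counting features of Experiments 2 and 3. The
# polarity dictionary maps (lemma, category) to -1/0/+1; this toy entry
# set and the pre-tagged input format are illustrative assumptions.

POLARITY = {
    ("falhar", "verb"): -1,
    ("jogo", "noun"): 0,
}

def numeric_features(content_words):
    """content_words: [(lemma, category), ...] for the headline's nouns,
    verbs, adjectives, and adverbs."""
    pols = [POLARITY.get(w, 0) for w in content_words]
    total = len(pols) or 1  # avoid division by zero on empty headlines
    neg = sum(1 for p in pols if p < 0)
    neu = sum(1 for p in pols if p == 0)
    pos = sum(1 for p in pols if p > 0)
    if neg > pos and neg > neu:
        major = -1
    elif pos > neg and pos > neu:
        major = 1
    elif neu > pos and neu > neg:
        major = 0
    else:
        major = 100  # ties fall through to the paper's sentinel value
    return {
        "wrdsNeg": neg, "wrdsNeu": neu, "wrdsPos": pos,
        "sentNegativity": neg / total, "sentNeutrality": neu / total,
        "sentPositivity": pos / total, "majorPolarity": major,
    }

print(numeric_features([("falhar", "verb"), ("jogo", "noun")]))
```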
Experiment 3 - numeric features with a custom dictionary. In this experiment, we used the same features as in Experiment 2, but with a dictionary automatically generated from the news headlines. Each entry in this dictionary is a word followed by its grammatical category and polarity (positive, negative, or neutral).

Table 2. Results for sentiment classification of news headlines, without and with the argument1-verb-argument2 relations as features.

Experiment     Classifier     Mean accuracy        Mean accuracy
                              (without relations)  (with relations)
Experiment 1   SMO            62.50%               62.70%
Experiment 2   Random Forest  57.50%               59.00%
Experiment 3   Random Forest  61.00%               63.50%

The main conclusion to be drawn from Table 2, as its last column shows, is that classification accuracy increased in all experiments where the argument1-verb-argument2 relations were used as features. The results also show that the custom dictionary (Experiment 3) provided better results than the pre-existing dictionary (Experiment 2). This improvement is probably because the custom dictionary carries domain knowledge, since it was generated from the news headlines (from the training set only).

Although not directly comparable to the results reported for SemEval-2007 Task 14 [10], our results are in a similar range: the best system there achieved an accuracy of 55.10%, while our best result was 63.50%. The results are not directly comparable because we used a translation of the SemEval-2007 dataset and, more importantly, the dataset made available to SemEval participants was split into 250 annotated headlines for training and 1,000 annotated headlines for testing, whereas our proportion was 1,125 news headlines for training and 125 for testing in each iteration of the 10-fold cross-validation.
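For concreteness, the evaluation protocol can be sketched as follows. Note that the paper used Weka's SMO and Random Forest implementations; scikit-learn's SVC and RandomForestClassifier, and the random placeholder data, are stand-ins only.

```python
# A minimal sketch of the 10-fold cross-validation protocol, assuming
# scikit-learn's SVC and RandomForestClassifier as stand-ins for Weka's
# SMO and Random Forest, which the paper actually used. X and y are
# random placeholders shaped like the dataset (1,250 headlines, 3 classes).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((1250, 20))         # placeholder feature matrix
y = rng.integers(0, 3, size=1250)  # placeholder class labels

for name, clf in [("SVM (SMO stand-in)", SVC(kernel="linear")),
                  ("Random Forest", RandomForestClassifier(random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    print(f"{name}: mean accuracy {scores.mean():.4f}")
```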
5 Conclusions

We conducted an empirical study on extracting argument1-verb-argument2 relations (along with some attributes) from Portuguese news headlines and using them as features for machine learning algorithms for sentiment classification. We have shown that using these relations as features improved the sentiment classification of news headlines. We also found that using a sentiment lexicon generated from labelled news headlines, instead of a general lexicon, improved the sentiment classification. Several interesting directions can be explored in the future. For example, the results for extracting argument1-verb-argument2 relations and for classifying news headlines suggest that there is room for improvement. Another direction could be to take the user profile into account when presenting relevant news articles.

Acknowledgments. António Paulo Santos is supported by FCT grant SFRH/BD/47551/2008.

References

1. Andreevskaia, A., Bergler, S.: CLaC and CLaC-NB: Knowledge-based and corpus-based approaches to sentiment tagging. In: Proceedings of the 4th International Workshop on Semantic Evaluations, pp. 117-120. Association for Computational Linguistics (2007)
2. Breiman, L.: Random forests. Machine Learning 45(1), 5-32 (2001)
3. Chaumartin, F.-R.: UPAR7: A knowledge-based system for headline sentiment tagging. In: Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), pp. 422-425. Association for Computational Linguistics, Prague, Czech Republic (2007)
4. Hall, M., et al.: The WEKA data mining software: an update. SIGKDD Explorations 11(1), 10-18 (2009)
5. Koppel, M., Shtrimberg, I.: Good news or bad news? Let the market decide. In: AAAI Spring Symposium on Exploring Attitude and Affect in Text, pp. 86-88. AAAI, Palo Alto, CA (2004)
6. Pang, B., et al.: Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10, pp. 79-86. Association for Computational Linguistics, Philadelphia, PA (2002)
7. Pang, B., Lee, L.: Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2(1-2), 1-135 (2008)
8. Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In: Schölkopf, B., et al. (eds.) Advances in Kernel Methods, pp. 185-208. MIT Press, Cambridge, MA (1999)
9. Santos, A.P., et al.: Determining the polarity of words through a common online dictionary. In: Antunes, L., Pinto, H.S. (eds.) 15th Portuguese Conference on Artificial Intelligence, pp. 649-663. Springer, Berlin, Heidelberg (2011)
10. Strapparava, C., Mihalcea, R.: SemEval-2007 Task 14: Affective Text. In: Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval-2007), pp. 70-74. Prague, Czech Republic (2007)
11. Turney, P.D.: Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 417-424. Association for Computational Linguistics, Morristown, NJ (2002)
12. Valdez, P., Mehrabian, A.: Effects of color on emotions. Journal of Experimental Psychology: General 123(4), 394-409 (1994)