COLLOCATIONS WITH THE VERB HACER IN THE SPANISH OF MAJORCA

MASTER’S THESIS

A COMPARATIVE STUDY OF PALMA AND ALCALÁ DE HENARES

Elizaveta Kardovskaia

Master’s Degree in Modern Languages and Literatures

(Specialisation/Pathway Theoretical and Applied Linguistics)

Centre for Postgraduate Studies

Academic Year 2020-21

University of the Balearic Islands

Key words:

language contact, interference, variation in Spanish, Spanish-Catalan bilinguals,

Spanish monolinguals, collocations, corpus linguistics, the verb hacer

Thesis Supervisor’s Name: Andrés Enrique-Arias

Table of contents

Abstract
1.     Introduction
2.     Research questions and hypotheses
     2.1.     Theoretical background
     2.2.     Research questions
     2.3.     Hypotheses
3.     Literature review
     3.1.     Definition of collocations
     3.2.     Language contact
     3.3.     Catalanisms in Spanish
     3.4.     Definition of a standard variety
     3.5.     Social variables
     3.5.1.      Language use and gender
     3.5.2.      Language use and age
4.     Method
     4.1.     Social variables
     4.2.     Sociolinguistic overview of Palma and Alcalá de Henares
     4.3.     Corpora PRESEEA: Palma and PRESEEA: Alcalá de Henares
     4.4.     Data collection
     4.5.     Participants
5.     Data analysis
     5.1.     Qualitative analysis
     5.2.     Quantitative analysis
6.     Conclusions
Bibliography

Abstract

       Language contact phenomena take place in numerous speech communities
around the world. This paper aims to study the effects of language contact on lexical
change in the Spanish of Majorca through an analysis of the frequency and use of non-
standard collocations with the verb hacer. The study considers social variables such as
age, gender and level of education (or social class). The data come from the PRESEEA
corpora for two cities: the bilingual community of Palma and the monolingual
community of Alcalá de Henares. These corpora include interviews with 54 participants
from each community, who are equally divided into three age groups (18 to 34, 35 to 55
and above 55 years old), two gender groups (men and women) and three educational
level groups (primary, secondary and higher education). In the case of Palma, speakers
are classified according to ethnolinguistic groups: Catalan-dominant, Spanish-dominant
and balanced bilinguals.

       After all the relevant examples had been selected, a typology of the non-standard
collocations with hacer found in the corpora was created, together with their number of
occurrences, and compared with the linguistic literature. Although some Catalan
expressions are recognized by Catalan speakers as stigmatized variants, we consider them
innovative. We accept their Spanish equivalents as influenced by Catalan and include
them in our typology. The quantitative analysis reveals that the two variables that
influence the use of the verb hacer are the geographical origin and the age of speakers.
These results confirm a strong influence of the Catalan language on the use of
collocations with hacer in Palma. They also show that younger generations use this verb
in collocations more often than older speakers, meaning that the phenomenon is expanding
and can be recognized as a change in progress.

1. Introduction

       Language contact phenomena take place in numerous bi- and multilingual
communities around the world. In recent studies, the contact between languages and/or
varieties is seen as a default scenario and zero-contact as the exception (Hickey 2020).
One such language contact situation is the Spanish of Majorca, a bilingual territory
where Spanish and Catalan have been in contact for several centuries. One of the
features that have been observed in the Spanish of Majorca is the use of noun (direct
object) collocations with hacer in cases in which monolingual speakers would use other
verbs such as dar, echar, poner and tomar (Enrique-Arias 2010: 111; Serrano Vázquez
1996: 390). The higher production of this structure in Spanish occurs as the result of
Catalan influence because the phenomenon is more frequent in that language than in
Spanish. As Sinner (2004: 522) similarly states:

       In Catalan speaking areas, there has been noticed a tendency to use the verb hacer instead of the
       verbs cometer, dar, llegar, seguir, tener, tomar imitating the Catalan model where the verb fer
       has a more extensive and flexible use. Therefore, it is relevant to talk about “multiusos
       distorsionadores” ‘distorting uses’ of the verb hacer.

       For example, while a monolingual Spanish speaker typically uses the verb
celebrar in the collocation celebrar la Navidad, for a bilingual Spanish-Catalan
speaker in Majorca it is the norm to use the verb hacer, as in hacer la Navidad.
       The Spanish spoken in contact with Catalan has been frequently addressed in
contact linguistics. However, the Spanish spoken in Majorca, and in particular the use
of collocations with the verb hacer in Palma, has not been thoroughly studied. In the
light of the need for further research, this project examines the factors that influence
the use of the verb by bilinguals in Majorca, i.e. the contact situation and
sociolinguistic factors including the age, gender and educational level of speakers.
Moreover, this work is motivated by the absence of a detailed corpus analysis and of a
comparative analysis with a monolingual community. Therefore, in order to contribute to
the research on this topic, the paper aims to study the effects of language contact on
lexical change in the Spanish of Majorca, i.e. on the frequency and use of non-standard
collocations. This
main objective is achieved by pursuing the following sub-aims: (i) to create a
typology of the non-standard collocations with the verb hacer; (ii) to statistically
compare the frequency of use of non-standard collocations in the bilingual community of
Palma and the monolingual community of Alcalá de Henares using chi-square tests of
significance; (iii) to statistically compare the frequency of use of non-standard
collocations among age, gender and educational level groups in the two speech
communities of Palma and Alcalá de Henares.
       To this end, the study is divided into five major sections. First, after this
introduction, the paper establishes the research questions and hypotheses, preceded by a
short theoretical background (Section 2). Second, the literature review (Section 3)
provides a thorough discussion of how linguists define the concepts of collocation,
language contact, bilingualism and bilinguals, interference, Catalanisms and standard
variety. The same section contains a sociolinguistic overview of the relationship
between language use and social variables such as age, gender and educational level.
Next, Section 4 describes the method of the study, including the description of
participants, a sociolinguistic overview of Palma and Alcalá de Henares, a presentation
of the corpora of the PRESEEA project and an account of how I collected the data for the
analyses. Section 5 provides a qualitative analysis and comparative quantitative
results for the Palma and Alcalá speech groups. The paper finishes with a summary of
findings and the conclusions (Section 6).

2. Research questions and hypotheses
2.1. Theoretical background

       In order to establish research questions and hypotheses, this subsection provides
a brief overview of linguistic theories related to language contact and sociolinguistic
analysis. As the primary focus of this paper is language contact and Catalan language
influence, the following theoretical statements are adopted:

  i.   In the speech of bilinguals, as a result of language contact and speakers’
       familiarity with more than one language, deviations from the norms of either
       language occur (Weinreich 1968);
 ii.   As the result of Catalan influence, in Spanish, “certain collocations that employ
       dar, tomar, poner or other verbs in standard general Spanish are used with hacer
       in the Spanish of Majorca” (Enrique-Arias 2010: 111).

        In addition, my study considers three independent social variables, namely age,
gender and educational level, that may influence the way people use standard or
non-standard language. In sociolinguistics, traditional studies confirm that younger
and older people tend to use more non-standard expressions than middle-aged speakers
(Holmes 2013; Labov 1994) and that women tend to use more standard forms than men
(Labov 1972: 243). As educational level is an objective measure of social class, the
following theory can be applied to this study: lower class individuals use more
stigmatized forms than upper class speakers (Silva-Corvalán and Enrique-Arias 2017).

2.2. Research questions

        The main research questions addressed in this study are the
following:

   i.   Do bilingual speakers of Spanish and Catalan from Majorca use the verb hacer
        in noun collocations differently from Spanish monolinguals in Alcalá de
        Henares?
  ii.   Do Catalan-dominant bilinguals use the verb hacer in direct object
        collocations more frequently than Spanish-dominant bilinguals in Majorca?
 iii.   Do age, gender and educational level of speakers in Palma and in Alcalá de
        Henares affect the frequency of use of the verb hacer in collocations?
 iv.    Is the phenomenon of collocations with the verb hacer a stable variable or a
        change in progress?

2.3. Hypotheses

        According to Salkind (2010: 365), a directional hypothesis is a researcher’s
assumption about difference, dependency or positive or negative change between two
variables in a community. This assumption is usually based on literature on the topic,
considerable experience, past research and/or accepted theory. Conversely, a non-
directional hypothesis does not specify “the change, relationship, or difference as being
positive or negative” (Salkind 2010: 365). Traditionally, a directional hypothesis is a
research hypothesis. The following hypotheses were developed from the research
questions of this study:

   i.   The first directional hypothesis of the research postulates that bilingual speakers
        in Majorca will overuse the verb hacer in collocations in comparison to the
        monolingual Spanish speakers in Alcalá de Henares.
  ii.   The second directional hypothesis of the study is that Majorcan bilingual men
        use non-standard collocations with the verb hacer in their speech more than
        women from the same bilingual group.
 iii.   The third directional hypothesis postulates that in Majorca, younger speakers use
        non-standard collocations with the verb hacer more than older speakers.
 iv.    The fourth directional hypothesis states that speakers with primary education use
        non-standard collocations with the verb hacer more than speakers with higher
        education.

        Statistical analyses do not test directional (or non-directional) hypotheses.
Therefore, in order to run statistical tests, I have to formulate null hypotheses
(Salkind 2010: 366). Salkind (2010: 365) defines a null hypothesis as an assumption that
there will be no change, difference or relationship between two variables. The null
hypotheses, which can be rejected or retained depending on the statistical results,
state that there is no difference in the frequency of use of collocations with the verb
hacer as opposed to other verbs between the following communities and sample groups:

   i.   Palma and Alcalá de Henares, i.e. between bilingual and monolingual
        communities;
  ii.   bilingual groups, i.e. between Catalan-dominant and Spanish-dominant speakers
        in the Palma community;
 iii.   gender groups, i.e. between men and women (separately for Palma and Alcalá);
 iv.    age groups, i.e. between younger, middle-aged and older speakers (separately for
        Palma and Alcalá);
  v.    educational level groups, i.e. between people with primary, secondary and
        higher education (separately for Palma and Alcalá);
 vi.    new age groups, i.e. between younger speakers (aged 18 to 34) and older speakers
        (aged 35 and above) in Palma.
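
As a rough illustration of how one of these null hypotheses could be tested, the sketch below computes a Pearson chi-square statistic for a 2×2 contingency table (community by verb choice). The counts are invented for illustration and are not taken from the PRESEEA corpora; the function names are hypothetical, no continuity correction is applied, and the p-value shortcut is valid only for one degree of freedom.

```python
import math

def chi_square(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

def p_value_one_dof(statistic):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts: rows = Palma, Alcalá; columns = hacer, other verb.
observed = [[60, 40], [30, 70]]
statistic, dof = chi_square(observed)
p_value = p_value_one_dof(statistic)
print(f"chi2 = {statistic:.2f}, df = {dof}, p = {p_value:.5f}")
```

If the p-value falls below the chosen significance level (for example 0.05), the null hypothesis of no difference between the two groups would be rejected.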

        The third, fourth and fifth hypotheses in the list were tested separately for
the Palma and Alcalá de Henares communities, so that hypotheses three to five refer to
Palma and the corresponding hypotheses six to eight refer to Alcalá. The sixth item in
the list is therefore the ninth hypothesis, regarding the new age groups in Palma (see
section 5.2 Table 6 for details). Consequently, there are nine null hypotheses in total
to test. The first hypothesis is the main hypothesis of the study and compares two
different communities: monolingual in Alcalá and bilingual in Palma. The second
hypothesis compares Catalan-dominant and Spanish-dominant bilingual speakers in
Palma. The next three hypotheses (six when both communities are counted) compare
speakers from different gender, age and educational level groups respectively, in Palma
and Alcalá de Henares separately. The ninth hypothesis compares speakers from two age
groups in Palma: the same younger group, aged 18 to 34, and a new older group that
consists of middle-aged and older speakers.
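
The count of nine null hypotheses can be checked with a short enumeration. This is only a bookkeeping sketch; the labels are shorthand introduced here and do not appear in the thesis.

```python
communities = ["Palma", "Alcalá de Henares"]
social_variables = ["gender", "age", "educational level"]

null_hypotheses = [
    "community: Palma vs Alcalá de Henares",
    "bilingual group: Catalan-dominant vs Spanish-dominant (Palma)",
]
# Hypotheses 3-5 (Palma) and 6-8 (Alcalá): one per social variable and community.
for community in communities:
    for variable in social_variables:
        null_hypotheses.append(f"{variable} groups ({community})")
# Hypothesis 9: regrouped age categories in Palma.
null_hypotheses.append("new age groups: 18 to 34 vs 35 and above (Palma)")

for number, hypothesis in enumerate(null_hypotheses, start=1):
    print(number, hypothesis)
```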

3. Literature review
3.1. Definition of collocations

       Before embarking on the study of collocations, we first need to define the
concept. The Collins English Dictionary (2021) contains the following entry: “In
linguistics, a collocate of a particular word is another word which often occurs with
that word”. From this definition it follows that the most important concept to take into
consideration when defining a collocation is the frequency with which collocational
units occur together (in a corpus). In the linguistic literature, there is a wide range
of definitions of the term COLLOCATION, and some authors confirm that it is a difficult
phenomenon to define (e.g., Nesselhauf 2003; Schmid 2003). In order to arrive at a
notion of collocation relevant to this work, I have adapted the state-of-the-art review
by López Pérez and Benali Taouis (2019), as this article (similarly to my analysis)
deals with noun (direct object) collocations with the high-frequency verb do. The
following lines present several definitions of the term ‘collocation’ in chronological
order. Palmer (1933: i) was the first author to introduce the concept of a collocation
as “a succession of two or more words that must be learned as an integral whole and not
pieced together from its component parts”. However, some sources (Koike 2001; Mel’čuk
1998) assign a pioneering role to Firth, saying that it was he who coined this term.
According to Mitchell (1971: 35), as quoted in Alonso Ramos (1994-1995: 9), Firth
(1957) was probably influenced by Palmer’s (1933) idea. Firth defines collocations as
“actual words in habitual company” (1957: 182) and famously states that “you shall
know a word by the company it keeps” (1957: 11). These first two definitions share the
general idea that “two or more words go together”. Similarly, Firth’s student Sinclair
(1991: 170) continues to develop the concept of ‘collocation’ in the context of corpus
analysis and offers his own definition, saying that “collocation is the occurrence
of two or more words within a short space of each other in a text”. However, when
analyzing Firth’s (1957) theory, Herbst (1996: 612) comes to the conclusion that “the
use of a term collocation is not restricted to combinations of two words”.
        A different perspective on collocations is given by Lewis (1997: 44), who states
that “collocations are those combinations of words which occur naturally with greater
than random frequency”, adding that “collocations co-occur, but not all words which
co-occur, are collocations”. This essentially means that co-occurrence is not the only
characteristic of collocations (Awaj 2018: 3). A more recent and narrower definition is
that of Parrot (2010). In line with the previous ideas, the author first repeats that a
collocation “describes the habitual partnering of words” (2010: 125) and that “the term
is also used to refer to any words that frequently occur together” (2010: 125), and then
adds that “in its narrow sense, however, ‘collocation’ is a term used to describe
two-word combinations where there is a restricted choice of which words can precede or
follow which” (2010: 125).
        Finally, López Pérez and Benali Taouis (2019: 101) come up with their own
definition in which collocations are considered “as the set of two or more words which
have an arbitrary restriction in their commutability and that must occur and combine in
order to produce accurate sentences from a grammatical point of view”. In this
definition, the idea of an ‘arbitrary restriction (on substitutability)’ is taken from
Nesselhauf (2003: 225), who develops a notion for verb-object-noun combinations
called ‘restricted sense’. According to this notion, “a sense of a verb (or noun) is
considered ‘restricted’ if at least one of the following criteria applies” (Nesselhauf 2003:
225):
        Criterion 1

        The sense of the verb (noun) is so specific that it only allows its combination with a small set of
        nouns (verbs).

        Criterion 2

        The verb (noun) cannot be used in this sense with all nouns (verbs) that are syntactically and
        semantically possible.

        In addition, the author establishes a relationship between these criteria of
restrictedness and word combinations, dividing the latter into three major classes: free
combinations, where both the noun and the verb can be freely combined (e.g. quiero el
agua); collocations, where “the verb in the sense in which it is used can only be
combined with certain nouns” (Nesselhauf 2003: 226) (e.g. hacer una pregunta, but
*hacer una respuesta); and idioms, where “both the verb and the noun are used in a
restricted sense, so substitution is either not possible at all or only possible to an
extremely limited degree” (Nesselhauf 2003: 226) (e.g. hacer añicos, hacer caso, hacer
polvo). Based on this classification, Nesselhauf (2003: 225) states the condition for a
word combination to be classified as a collocation: “either criterion 1 or criterion 2
or both apply to the verb of the combination”. Finally, the author postulates that in
all verb-noun collocations the verb is the dependent element and the noun the
independent one.
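
The three-way classification described above can be summarized as a small decision function. This is a minimal sketch, assuming that the judgement of whether the verb or the noun is used in a restricted sense has already been made by hand; the function name and the "unclassified" fallback for a case the text does not cover (restricted noun, unrestricted verb) are assumptions of the sketch, not part of Nesselhauf's proposal.

```python
def classify_combination(verb_restricted: bool, noun_restricted: bool) -> str:
    """Classify a verb-noun combination by the restrictedness of its parts."""
    if verb_restricted and noun_restricted:
        return "idiom"
    if verb_restricted:
        return "collocation"
    if not noun_restricted:
        return "free combination"
    # Restricted noun with an unrestricted verb: not covered in the text above.
    return "unclassified"

# Hand-made judgements for the examples quoted in the text.
examples = {
    "quiero el agua": (False, False),
    "hacer una pregunta": (True, False),
    "hacer añicos": (True, True),
}
labels = {phrase: classify_combination(*flags) for phrase, flags in examples.items()}
for phrase, label in labels.items():
    print(f"{phrase}: {label}")
```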
        One of the characteristics to take into consideration when defining a collocation
is the juxtaposition of collocates. According to Sinclair (1966: 415), the structure of
collocations in a text is a relevant factor:

        We may use the term node to refer to an item whose collocations we are studying, and we may
        then define a span as the number of lexical items on each side of a node that we consider
        relevant to that node. Items in the environment set by the span we will call collocates. The extent
        of the span is at present arbitrary.

        The idea of dependency of the elements is also present in Sinclair’s (1966) study
on collocations and their structure. While the node (noun) is a semantically independent
element that determines a lexical unit (collocation), the collocate (verb) expresses
concrete meaning. Sinclair also states that “the usual measure of proximity is a
maximum of four words intervening” (1991: 170). Men (2018) discusses Sinclair’s
(1991) definition, explaining that “significant collocates usually fall in a span of 4:4,
that is, four words to the left and four words to the right of the node”. Schmid (2003:
241) calls such collocations, whose parts are not adjacent, discontinuous collocations
and states that this type of collocation is more frequent. As Magín Perroni (2020: 4)
says, unlike idioms, collocations are syntactically flexible and have only a literal
meaning; they do not acquire an idiomatic sense. Therefore, the following expressions
found in the PRESEEA corpus are some of the discontinuous collocations considered for
the analysis:

   (1) las Navidades normalmente siempre las hacemos en familia ‘we usually always
       celebrate Christmas as a family’;
   (2) algún espectáculo que hagan ‘some show they would put on’;
   (3) la Misa que la hacen a las 8 ‘the Mass that they celebrate at 8:00 am’;
   (4) el desastre que hicimos ‘a mess that we made’;
   (5) cambiaban de película y cuál ponían ‘they changed the movie and which one
       did they show’.

       For example, (1) is a collocation between the verb hacer and the noun (phrase)
(las) Navidades, where the verb and the noun (phrase) are not adjacent to each other. The
corresponding continuous collocation for (1), whose parts are adjacent, is hacer las
Navidades. For (2), (3), (4) and (5), the collocations are hacer algún espectáculo ‘put on
some show’, hacer la Misa ‘say/celebrate Mass’, hacer el desastre ‘make a mess’ and
poner película ‘show/screen a movie’ respectively.
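
The notion of a 4:4 span can be illustrated with example (1). The sketch below is a simplification under stated assumptions: tokens are obtained by splitting on whitespace, the node is located by its surface form rather than its lemma, and the function name is hypothetical. It is not the search procedure used for the PRESEEA corpora.

```python
def collocates_in_span(tokens, node_index, span=4):
    """Return the tokens within `span` positions to the left and right of the node."""
    left = tokens[max(0, node_index - span):node_index]
    right = tokens[node_index + 1:node_index + 1 + span]
    return left + right

# Example (1), split on whitespace; the node is the noun "Navidades".
tokens = "las Navidades normalmente siempre las hacemos en familia".split()
node_index = tokens.index("Navidades")
window = collocates_in_span(tokens, node_index)
print(window)
print("hacemos" in window)
```

With the default span of four, the verb form hacemos falls inside the window of the node Navidades, so the pair counts as a discontinuous collocation candidate; with a span of three it would be missed.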
       After considering the above definitions, the definition by Lewis (1997) is
selected as the most relevant, because it suggests that the most important
characteristic for defining a collocation is the frequency with which the words occur
together. In this vein, a number of definitions emphasize the importance of frequency.
Corpas (1996) defines it as the frequency with which two lexical units co-occur.
Similarly, in the words of
Schmid (2003: 239), the notion of frequency is called ‘combined recurrence’; he
determines the words “that are adjacent in a given text […] eligible for the status of
collocations if they do not occur next to each other by mere chance but because they are
frequently used in this particular combination”. Schmid (2003: 239) briefly defines
collocations as “recurrent word combinations”. For example, in Catalan, a Boolean
search in Google for “fer un passeig” returns 79,800 results, whereas “donar un
passeig” returns 49,800. Conversely, in Spanish, the frequency of “dar un paseo”
‘take/have a walk’ is higher than that of “hacer un paseo” ‘take/have a walk’:
7,690,000 against 3,340,000 respectively. In corpora, this characteristic is indicated
in different ways. For instance, in the Catalan corpus Corpus textual informatitzat de
la llengua catalana (CTILC), there is an indication of Freqüència de coaparició
‘frequency of co-occurrence’ when making a collocational search. In Mark Davies’ (2012)
corpus, the column FREQ on the page FREQUENCY provides information about how many
occurrences of the searched words there are in this corpus.
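
The Google counts quoted above can be turned into relative shares, which makes the Catalan and Spanish figures directly comparable despite the very different totals. The sketch uses only the four counts given in the text; the helper name is hypothetical, and raw search-engine counts are at best a rough proxy for corpus frequency.

```python
# Counts quoted in the text (Google results for the quoted phrases).
counts = {
    "Catalan": {"fer un passeig": 79_800, "donar un passeig": 49_800},
    "Spanish": {"hacer un paseo": 3_340_000, "dar un paseo": 7_690_000},
}

def share(language, phrase):
    """Proportion of one variant among both variants counted for that language."""
    variants = counts[language]
    return variants[phrase] / sum(variants.values())

fer_share = share("Catalan", "fer un passeig")
hacer_share = share("Spanish", "hacer un paseo")
print(f"fer un passeig: {fer_share:.1%}")
print(f"hacer un paseo: {hacer_share:.1%}")
```

On these figures, the variant with fer accounts for roughly 62% of the Catalan pair, whereas the variant with hacer accounts for roughly 30% of the Spanish pair.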
        However, co-occurrence is not the only feature of collocations. Koike
(2001) distinguishes some of the formal and semantic traits of collocations, including
the frequent co-occurrence discussed above and formal compositionality. The latter
permits the (formal) flexibility of the collocational components at the syntactic and
morphological level. For example, one of the parts of the collocation can be substituted
by a lexical unit with a similar meaning, as in hacer/echar/dormir la siesta. Similarly,
a node or base of a collocation can be modified by an adjective, as in llevar una vida
(muy) holgada, poner distintas películas and hacer muchos kilómetros. Another possible
process is pronominalization, as in lo hacemos en casa el día de Navidad, where the
phrase el día de Navidad is substituted by the pronoun lo.
        To sum up, these are the features that we consider relevant to define a
collocation:

   i.   It does not need to be continuous;
  ii.   It has a high frequency of co-occurrence;
 iii.   Its components are flexible in form.

        Substitution of the standard component of the collocation by the verb hacer is
typical of Spanish speakers in Majorca. There are two possible reasons for this
phenomenon: it can be Catalan-driven in origin, in other words, influenced by
Catalan-Spanish language contact; or it can be a feature of colloquial Spanish. Though
these two factors (language contact and colloquial language) are not mutually exclusive,
this paper focuses on expressions in which the Catalan influence is more significant.

3.2. Language contact

        Before proceeding, we need to understand the phenomenon of LANGUAGE CONTACT.
Uriel Weinreich’s (1953) influential work is a classic study of language contact that
laudably provides a rigorous classification of the various types of this phenomenon. In
the preface to the sixth printing of Weinreich’s (1968) book, André
Martinet underlines the fact that “a linguistic community is never homogeneous and
hardly ever self-contained” (Weinreich 1968: vii); and, at the level of one individual,
each person is “a permanent source of linguistic interference” (Weinreich 1968: vii).
         As Uriel Weinreich had personally experienced a wide range of bilingual
situations, he was the first to point out that language contact happens in bilingual
speakers, saying that “the language-using individuals are thus the locus of the contact”
(Weinreich 1968: 1). He considers two or more languages to be in contact when they are
used alternately by the same persons. Consequently, Weinreich (1968) defines
bilingualism as “the practice of alternately using two languages” and bilinguals as the
persons involved in this language usage. Connected to these definitions is the concept
of INTERFERENCE phenomena, defined as “those instances of deviation from the norms of
either language which occur in the speech of bilinguals as a result of their familiarity
with more than one language, i.e. as a result of language contact” (Weinreich 1968: 1).
As Ravindranath Abtahian and Kasstan (2020:
235) state, Weinreich “reserves “interference” for the effects of language contact on
linguistic structure, i.e. the influence of one linguistic system on another”. This
interference is often interpreted as a deviation from the norm by the unilingual speakers
of the same language. Uncontaminated by the contact system and unaware of some
linguistic peculiarities of another language, monolinguals are not likely to adopt them
(Weinreich 1968: 33).
         One of Weinreich’s (1968: 188) general principles regarding language change,
with which most linguists agree, postulates:

         Linguistic and social factors are closely interrelated in the development of language change.
         Explanations which are confined to one or the other aspect, no matter how well constructed, will
         fail to account for the rich body of regularities that can be observed in empirical studies of
         language behavior.

         One of the indications of linguistic interference is a situation in which the
speaker is aware of which language the whole utterance belongs to, “where the
non-belonging elements can be separated as transferred” (Weinreich 1968: 7). At the
lexical level, TRANSFER can be defined as the incorporation of linguistic items from one
language system into another, which results in a consequent reorganization of the
subsystems involved (Silva-Corvalán 1994: 4). Silva-Corvalán (1994: 2; 5) also says
that researchers have paid much attention to “the linguistic phenomena which develop in
situations of societal bilingualism and multilingualism. […] There is a general
consensus, that intensive language contact is a powerful external promoter of language
change”.

3.3. Catalanisms in Spanish

       For several centuries, Spanish and Catalan have been in contact in Catalonia,
Valencia and the Balearic Islands, and Catalan has influenced the Spanish of these
areas. Among the studies that investigate Catalanisms in Spanish, I am particularly
interested in the following two theses: Catalanismos en el español actual by Szigetvári
(1994) and Catalanismos en la prensa digital: La influencia catalana en locuciones con
el verbo hacer by Bo (2017). In her work, Szigetvári demonstrates some lexical and
grammatical peculiarities that characterize the Spanish spoken in Barcelona (1994: iii),
which, like Majorca, is a Spanish-Catalan bilingual community. The work is divided into
two main parts, the first of which is arranged in alphabetical order according to the
words contained in typical expressions with Catalan calques. In particular, I am
interested in the section Hacer, where the author lists some examples of collocations
with this verb alongside their Catalan and standard Spanish equivalents, saying that
Spanish speakers who live in Barcelona do not realize the origin of calques, such as
hacer asco instead of dar asco, and use them unconsciously (Szigetvári 1994: 28). The
second part, Grammar Appendix, contains some grammatical considerations regarding
Catalanisms which are not relevant to this study.
       Similarly to Szigetvári (1994), Bo (2017) provides a list of collocations with
the verb hacer and their equivalents in standard Spanish, calling the former ‘correct
expressions in Spanish’. The author examines the existence of these calques in the
newspapers of Barcelona and Valencia (as these are also bilingual communities) and, out
of curiosity, Mexico. Bo (2017) questions the existence of such constructions with the
verb hacer in Mexico, saying that their occurrence in the speech of the inhabitants of
this place is due to Catalan immigration or other factors that need to be further
explored.
       Apart from these theses, the landmark book on the Spanish
language in Catalonia is El castellano de Cataluña: Estudio empírico de aspectos
léxicos, morfosintácticos, pragmáticos y metalingüísticos by the above-mentioned Carsten
Sinner (2004). Sinner was the first linguist to publish a book on the
features of the Spanish spoken in the Catalan community. The book is a thorough study of
over 700 pages. Apart from the author’s quantitative analysis, which compares a group of
speakers from Madrid with a group of Catalans, the subchapter Hacer + OD (‘hacer +
direct object’) contains references to many other studies, including the
above-mentioned conclusions of Szigetvári (1994). In addition, according to the linguist
Gutiérrez-Rexach (2016: para. 1):

       Spanish spoken in contact with Catalan in the Balearic Islands, the Valencian Community and
       Catalonia is a combination of typical traits that is a result of the interference with the Catalan in
       bilingual areas which may transfer to monolingual speakers of Spanish.

       Moreover, to check whether the verb hacer is used as a constituent of a collocation
or in its standard form, I consult the Corpus del Español NOW (News
on the Web), which provides information about the frequency of the searched words or
collocations in this corpus.

3.4. Definition of a standard variety

       One general definition of a STANDARD VARIETY, which excludes all the non-written
languages of the world (more than half of all existing languages), is
that of Janet Holmes (2013). Taking into account a definition by Joe Trotta (2011), I arrive at
the following concept: a standard variety is a prestige variety generally recognized in
written language and in formal speech contexts (e.g., in the media), which has been
codified in dictionaries, books and school grammars and is maintained in schools as the
norm by which speakers should abide (Holmes 2013; Trotta 2011). Non-standard
varieties are therefore those that deviate from the norm. The effects of sociolinguistic
variables such as age, gender and social class on the use of non-standard language have
already been discussed in numerous sociolinguistic studies.

3.5. Social variables

       In sociolinguistics, the social variables that influence the way people communicate
include gender, education, age and origin. A series of sociolinguistic studies provides
evidence that men and women speak differently. According to Labov (2001: 293),
“Women conform more closely than men to sociolinguistic norms that are overtly
prescribed”. Therefore, women are expected to use more language features associated
with standard varieties. In an earlier study, Labov (1966) concludes that men,
unlike women, tend to use vernacular or non-standard varieties and
show more variation in their speech; in other words, they speak more colloquially or
informally.

3.5.1. Language use and gender

       Amongst the sociolinguistic research on age, origin, social class and gender,
“the clearest and most consistent results […] are the findings concerning the linguistic
differentiation of men and women” (Labov 1990: 205). One of the principles that
summarize these results concerning men’s speech states that “In stable sociolinguistic
stratification, men use a higher frequency of nonstandard forms than women” (Labov
1990: 205). In her book, Tagliamonte (2012: 32) gives a chronological overview of
sociolinguistic observations on the relationship “between women and standard language
use” (e.g., Wolfram 1969; Labov 1972; Wolfram and Fasold
1974; Trudgill 1983; Cameron and Coates 1988) and arrives at the generalization
that “women tend to avoid stigmatized forms” (Tagliamonte 2012: 32).
       Speakers of both genders are aware that the language system requires them to
say, for instance, celebrar la Navidad ‘celebrate/spend Christmas’, but the way they
actually express themselves differs, and this difference correlates with gender. Classic
studies of the way men and women speak include Robin Lakoff’s (1973) study
“Language and Woman's Place”, in which the author introduces the theory of a women’s
register “with regard to lexicon (color terms, particles, evaluative adjectives), and
syntax (tag-questions)” (Lakoff 1973: 45). Although this theoretical framework
draws its conclusions from American English, some of its suggestions can
be applied to languages in general and to Spanish in particular. In her book of the
same name, the author recognizes that “social change creates language change, not the
reverse” (Lakoff 1975: 47), adding “that a sentence that is 'acceptable' when uttered by a
woman is 'unacceptable' when uttered by a man” (Lakoff 1975: 47) and that language use
changes depending on the social status of the speaker. Regarding the linguistic norm,
it follows from the author’s conclusions that the acceptable use of expressions is
determined not only by linguistic factors, such as rules of syntax, phonology or semantics,
but also, and more importantly, by the social context in which the speech occurs. For
example, whereas in the bilingual setting of Majorca the norm is to say hacer un paseo
‘take/have a walk’, in the monolingual setting of Alcalá the norm is
dar un paseo ‘take/have a walk’. No single general norm can be applied to
different societies. Finally, Lakoff (1975) underlines the need for the linguist to
draw on sociology in order to understand the aberrant behavior
of lexical items and to make relevant generalizations about them, since
such generalizations in the grammar of a language can only be made with reference to
social mores (Lakoff 1975: 50). Without taking social variables into consideration and
analyzing the society, the linguist will not be able to interpret the variety of ways in which
language works.
       Another important study of the close relationship between gender
differentiation and linguistic change is Sociolinguistics: An Introduction to Language
and Society by the sociolinguist Peter Trudgill (2000), originally published in 1974. In
this bestselling book, the author examines factors that influence language change, such
as social class, ethnic group, gender, context and geography. Like Lakoff (1975),
Trudgill (2000) thoroughly examines the social roles of men and women. The author states
that “men and women are socially different in that society lays down different social
roles for them and expects different behavior patterns from them. Language simply
reflects this social fact” (Trudgill 2000: 79) and therefore changes.

3.5.2. Language use and age

       Research shows that the social variable ‘age’ and the linguistic variable
‘language use’ correlate following the principle of age-grading, according to which
people of different generations use language appropriate to their age group (e.g., young,
middle-aged and old) (Downes 1984). As an individual speaker goes through life, their
relation with language use changes (Pym 2019). This change is explained by Holmes
(2013: 177): as people get older, they gradually start to use more standard language; then,
with time, their speech becomes less standard “and is once again characterized by
vernacular forms”. It has been confirmed that “adolescents and young adults use stigmatized
variants more freely than middle-aged speakers” (Labov 1994: 73). The most
standard speakers are therefore people in middle age, when the societal pressure to
conform is greatest (Holmes 2013). A classic study that illustrates the dependency of
language use on age is that of William Downes (1998). For different age groups, Downes
(1998) distinguishes different kinds of pressure that society and its norms place on people.
For example, younger generations are subject to the pressure of hypercorrection from above,
which promotes a model more standard than their parents’ vernacular. Moreover, in peer
groups of young people, there is strong mutual normative pressure, along with more
resistance “to society-wide norms conveyed to them by the institutions of the adult and
outside world, for example, in schools” (Downes 1998: 224). As adolescents intensively
produce idiolectal forms, their language becomes more creative (Pym 2019). As
people approach middle age, their lives become more ‘public’. Middle-aged people
rise in society, and their “language becomes more standardized” (Pym 2019). This
statement is a generalization found in much of the sociolinguistic literature.
Like Holmes (2013), Tagliamonte (2012: 47) says that during working age,
between the ages of 30 and 50, when people experience maximum social pressure
to obey the norms of the standard language, the use of standard forms reaches its
peak. After the age of fifty, as people complete their social ascent and adopt a
more relaxed way of life, social pressures gradually diminish and non-standard forms
may resurface (Tagliamonte 2012: 47).

4. Method

4.1. Social variables

       In sociolinguistics, social variables include age, gender and level of education
(or social class). These variables can be found in the PRESEEA corpora that I describe
in the next two sections (see 4.2 and 4.3). While in PRESEEA: Palma the variables are
automatically coded in the columns of the corpus as 1-H-1, for example, in PRESEEA:
Alcalá de Henares I have manually created an Excel table with the coded social
variables.
          The independent social variables are ‘city’, ‘dominant language’, ‘gender’,
‘age’ and ‘level of education’. The gender variable is a nominal, dichotomous variable
because it has only two categories: “men” and “women”. The dominant language
variable has the categories “balanced bilingual”, “Spanish-dominant bilingual” and
“Catalan-dominant bilingual”. There are three age categories: “18-34 year olds”, “35-55
year olds” and “over 55 year olds”. The educational level variable also has three categories:
“primary education”, “secondary education” and “higher education”. The city
variable, which records where participants live, has the categories “Palma”, a
Spanish-Catalan bilingual region, and “Alcalá”, a Spanish monolingual region.
          The dependent variable is the nominal (or dichotomous) variable ‘use of the
collocations with the verb’ which has two categories “hacer” and “other verbs”. The
verb hacer is used in the collocations such as hacer las Navidades, hacer la siesta; other
verbs are used in the alternative collocations such as pasar las Navidades, dormir la
siesta.

4.2. Sociolinguistic overview of Palma and Alcalá de Henares

          Palma is the capital city of the Balearic Islands, an autonomous community of
Spain in the Balearic Sea. The city is a bilingual community with a population of
approximately 472,000 inhabitants in 2021 and 371,000 in 2007 (Population Stat 2021).
Palma is a setting of intense contact between Spanish and
Catalan with three types of bilingual speakers: Catalan-dominant, Spanish-dominant
and balanced bilinguals, where the latter constitute “a large number of speakers who
make use of both languages on a daily basis” (Enrique-Arias and Méndez Guerrero
2020: 321).
          Alcalá de Henares is a city located approximately 30 km from Madrid, the
capital of Spain, with a population of nearly 200,000 inhabitants in 2020
(Foro-Ciudad.com 2021) and 165,620 in 1990 (Moreno Fernández 2014). Alcalá de Henares is
a heterogeneous city in terms of the linguistic behavior of its inhabitants, who come from a
variety of places of origin, including Andalusia, the Balearic Islands, Catalonia,
Galicia, the Basque Country and Valencia (for more details, see Moreno Fernández 2014:
13-14). People from different places have therefore brought with them linguistic features
typical of their places of origin, which may mean that the city has a mixture of linguistic
features of different varieties. However, the interviews were conducted with a limited
number of speakers who are Alcalá residents and have lived in the city since childhood
(before the age of 10) or their school years (Moreno Fernández 2014: 14), i.e. monolingual
Spanish speakers.

4.3. Corpora PRESEEA: Palma and PRESEEA: Alcalá de Henares

        This comparative study uses data from two corpora:
PRESEEA Palma and PRESEEA Alcalá de Henares. PRESEEA stands for the Project
for the Sociolinguistic Study of Spanish from Spain and America; in Spanish, the
project is called Proyecto para el Estudio Sociolingüístico del Español de España y de
América. Because of the way the PRESEEA project developed, the Palma interviews
were collected more recently, between 2007 and 2010, while the
interviews in Alcalá de Henares were carried out some ten years earlier, in the 1990s:
two thirds of them (i.e., 36 interviews) in 1998 and one third (i.e., 18) in 1991. In both
corpora, the data consist of 54 interviews that were carried out in an informal
environment. The questionnaire covers a variety of topics, including thematic
modules about the weather, the place where the speaker lives, family and friends, habits,
and memorable anecdotes and life stories (Moreno Fernández 2014), which favor
the appearance of collocations with the verb hacer and with alternative verbs. The
corpora contain data on age, gender and educational level. Apart from these
social variables, the Palma corpus provides data on language dominance, that is, whether
a speaker is Spanish-dominant, Catalan-dominant or a balanced bilingual.

4.4. Data collection

        In the Palma corpus, I searched for all the examples with the verb hacer, and
a total of 2,358 examples were extracted. The following tokens were excluded
from the analysis because they are universal in both languages:
- weather expressions such as hace calor, hace viento;
- temporal expressions such as hace tres años or expressions in the sense of cumplir
años;
- make-build expressions such as hacer muchísimas viviendas;
- impersonal expressions such as no lo tendría que hacer, hacen algo, hacen todo;
- universal and standard expressions in all varieties of Spanish such as hacer falta,
hacer amigos;
- verbal periphrases such as hace sentir;
- the causative use of the verb hacer such as in te hacen más mayor.
         After all these examples were eliminated, the sample was reduced to 40 collocations
that are Catalan-influenced, such as hacer una película ‘show/screen a movie’, hacer
quinielas ‘play the pools/lottery’ and hacer la siesta ‘take a nap’. In Catalan, these
expressions are formed with the verb fer, as in fer una pel·lícula, fer quinieles and fer la
migdiada respectively.
         The main selection criterion was that the collocation with hacer be used by bilingual
speakers of the Palma community and that hacer not be the only possible option but
alternate with other verbs, including echar, celebrar, cometer, dar, pasar, poner and tomar.
Further, in the Alcalá corpus, I looked up the nouns that are direct objects in the selected 40
collocations to find out which verbs accompany these nouns in place of the verb
hacer.

4.5. Participants

         In my study, the 54 participants from the Palma community are classified into three
language dominance groups: Catalan-dominant, Spanish-dominant and balanced
bilinguals. Following the PRESEEA methodological requirements, the 54 participants from
each community are equally divided into three age groups (18 to 34, 35 to 55 and above
55 years old), two gender groups (men and women) and three educational level groups
(primary, secondary and higher education).
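
The equal division described above can be illustrated with a short Python sketch. It assumes a fully crossed design with the same number of speakers in every age-gender-education cell, which is one reading of "equally divided"; the function name quota_grid and the label strings are chosen here for illustration only.

```python
from itertools import product

# Category labels as given in sections 4.1 and 4.5.
AGE_GROUPS = ("18-34", "35-55", "over 55")
GENDERS = ("men", "women")
EDUCATION_LEVELS = ("primary", "secondary", "higher")

PARTICIPANTS_PER_CITY = 54


def quota_grid(total=PARTICIPANTS_PER_CITY):
    """Return {(age, gender, education): quota}, assuming equal cell sizes."""
    cells = list(product(AGE_GROUPS, GENDERS, EDUCATION_LEVELS))
    if total % len(cells) != 0:
        raise ValueError("participants cannot be split equally across cells")
    per_cell = total // len(cells)
    return {cell: per_cell for cell in cells}


grid = quota_grid()
print(len(grid), "cells,", next(iter(grid.values())), "speakers per cell")
```

Under this assumption, three age groups, two genders and three educational levels give 18 cells, so 54 participants per city correspond to three speakers per cell.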
         To analyze the qualitative data, I used Excel, where I created lists of the collocations
extracted from both corpora. For the quantitative analysis, I generated tables
separately for each city with the numbers and percentages of collocations with hacer as
opposed to other verbs according to the social variables (origin, age, gender and
educational level), and one table with the numbers and percentages of collocations with hacer
as opposed to other verbs comparing the two cities. To carry out statistical significance tests
(chi-square, χ2) and to calculate p-values, I used the online chi-square calculator on the
Social Science Statistics website. The following section contains detailed data analyses.

5. Data analysis

       This section provides qualitative and quantitative analyses of the data
obtained from the PRESEEA corpora. In relation to the first objective, “to create a typology
of the non-standard collocations with the verb hacer” (see section 1), two tables (Tables
1 and 2) were created with the 40 examples of collocations. For the quantitative analysis,
a descriptive design is used to describe thoroughly the use of the verb hacer in
standard and non-standard collocations. Once I had stored all the examples in Excel, I
proceeded with descriptive statistics to obtain the percentage and frequency of use
of the verb hacer among the monolingual and bilingual populations. I then compare
the percentages between men and women, age groups and educational level groups in
the Palma and Alcalá de Henares communities.
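
As an illustration of the descriptive step, the percentage of hacer collocations in a group can be computed as in the following Python sketch. The counts are hypothetical placeholders, not results from the corpora, and the function name hacer_share is introduced here only for the example.

```python
def hacer_share(hacer_count, other_count):
    """Percentage of collocations formed with hacer rather than another verb."""
    total = hacer_count + other_count
    if total == 0:
        raise ValueError("no collocations in this group")
    return 100.0 * hacer_count / total


# HYPOTHETICAL counts, for illustration only (not taken from PRESEEA).
counts = {
    "Palma": {"hacer": 30, "other verbs": 70},
    "Alcalá": {"hacer": 5, "other verbs": 95},
}

shares = {
    city: hacer_share(c["hacer"], c["other verbs"]) for city, c in counts.items()
}
for city, share in shares.items():
    print(f"{city}: {share:.1f}% of collocations use hacer")
```

The same function can be applied to any subgroup (for example, men and women within one city) once the corresponding counts have been tallied.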
       As the variables are categorical, inferential statistics using a chi-square (χ2) test
are applied to analyze the relationship between the dependent variable ‘use of the
verb hacer’ and the independent variable ‘city’ in the Palma and Alcalá communities, and
the relationship between the dependent variable ‘use of the verb hacer’ and the
following independent variables: ‘dominant language’ (in Palma only), ‘age’, ‘gender’
and ‘educational level’ in each city separately. If the p-value of a test is less than
the significance level of 0.05, I reject the null hypothesis and conclude that there is
evidence to suggest an association between the city or dominant language (in the case of
Palma) and the use of the verb hacer, or between the social variables and the use of the
verb hacer in Palma and Alcalá de Henares. These statistics allow me to make
generalizations about how the members of the Spanish speech communities in each
of these two cities use the verb hacer.
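
The decision rule described above can be sketched in Python for the simplest case, a 2x2 table such as ‘city’ by ‘use of the verb hacer’. This is a minimal illustration and not the online calculator used in the study: it assumes Pearson's chi-square without a continuity correction, and the counts are hypothetical placeholders rather than corpus results.

```python
import math

ALPHA = 0.05  # significance level stated in the text


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# HYPOTHETICAL counts: rows = Palma, Alcalá; columns = hacer, other verbs.
observed = [[30, 70], [5, 95]]
chi2, p_value = chi_square_2x2(observed)
reject_null = p_value < ALPHA
print(f"chi2 = {chi2:.3f}, p = {p_value:.6f}, reject H0: {reject_null}")
```

Variables with three categories, such as age or educational level, give a table with two degrees of freedom, for which the closed-form tail probability used here no longer applies.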

5.1. Qualitative analysis

       In this section, I present two tables with the typology of the 40 Spanish collocations
selected from the PRESEEA corpus. Table 1 shows a typology of the Spanish verbs that are
used in standard collocations such as dar un paseo ‘take/have a walk’ and dormir la
siesta ‘take a nap’. It also contains a column with the Catalan collocations with the verb fer
‘do’, such as fer un passeig ‘take/have a walk’ and fer la migdiada ‘take a nap’. The
third column features the Spanish collocations with the verb hacer ‘do’ that are
produced because of Catalan influence. Table 2 provides information on occurrence and
also contains a column with all the nouns from the same semantic field, found in both
corpora, with which the verb hacer and other verbs collocate. In the
corpus, most of the examples show variation with other verbs such as celebrar, dar,
dormir, estar, echar, ganar, jugar, montar, tener, trabajar and vivir. While in Table 1 the
examples are presented in their dictionary form, Table 2 provides a list of expressions
with hacer as they were found in the corpus PRESEEA Palma.

Table 1
Typology of Spanish verbs, non-standard collocations with hacer and Catalan
equivalents

Spanish (standard)           Catalan                          Spanish (contact)            English
celebrar, pasar              fer el Nadal                     hacer la Navidad             ‘celebrate Christmas’
celebrar, pasar              fer la festa                     hacer la fiesta              ‘have a party’
celebrar                     fer Reis                         hacer reyes                  ‘celebrate Wise Men’
provocar, causar             fer un desastre                  hacer un desastre            ‘make a mess’
celebrar                     fer els bous                     hacer vaquillas              ‘celebrate bullfighting’
poner, echar                 fer una pel·lícula               hacer una película           ‘show/screen a movie’
dar                          fer el concert                   hacer un concierto           ‘put on a concert’
poner                        fer teatre                       hacer teatro                 ‘do a play’
poner                        fer un espectacle                hacer un espectáculo         ‘put on a show’
echar, jugar a               fer quinieles                    hacer quinielas              ‘play the pools/lottery’
jugar a, jugar               fer la loteria                   hacer la lotería             ‘play the lottery’
pasar                        fer hores                        hacer horas                  ‘make/spend hours’
dar                          fer un regal                     hacer un regalo              ‘give somebody a gift’
tomar                        fer la decisió                   hacer la decisión            ‘make a decision’
tener, dar                   fer ganes                        hacer ganas                  ‘have a desire’
dormir, echar                fer la migdiada                  hacer la siesta              ‘take a nap’
medir, tener                 fer set metres                   hacer siete metros           ‘measure seven metres’
pasar                        fer tardes                       hacer tardes                 ‘spend afternoons’
dar                          fer una conferència              hacer una conferencia        ‘hold a conference’
echar, jugar                 fer un partit                    hacer un partido             ‘play a game’
dar, tener                   fer una classe                   hacer una clase              ‘take/have/give a lesson’
dar                          fer un passeig                   hacer un paseo               ‘take/have a walk’
llevar, tener, vivir         fer una vida                     hacer una vida               ‘live a life’
tomar                        fer mesures                      hacer las medidas            ‘take measures/action’
cometer                      fer un delicte                   hacer un delito              ‘commit (a) crime’
dar                          fer una opinió                   hacer una opinión            ‘give an opinion’
ganar                        fer diners                       hacer dinero                 ‘make money’
cantar, decir                fer una missa                    hacer la misa                ‘say/celebrate Mass’
caer                         fer les gotes                    hacer 4 gotas                ‘drops fall’
tomar, beber                 fer un aperitiu                  hacer el aperitivo           ‘have an aperitif’
poner                        fer mala cara                    hacer mala cara              ‘make a face’
montar                       fer conya                        hacer cachondeo              ‘make fun’
estar hasta                  fer-se les 7 del matí            hacer las 7 de la mañana     ‘stay until’
pegar, tener                 fer una frenada                  hacer un frenazo
estar en la lista de espera  estar en llista d'espera         hacer lista de espera        ‘be on a waiting list’
cumplir                      fer el requisit                  hacer el requisito           ‘meet the requirement’
poner                        fer el (aquest) granet de sorra  hacer ese granito de arena   ‘do one's bit/give an easy hand’
echar                        fer comptes                      hacer cuentas                ‘do the math’
jugar                        fer un paper                     hacer papel                  ‘play/have a role’
dar                          fer el gust (d'algú)             hacer el gusto (de alguien)  ‘accommodate (someone's) wishes’

        Some of the examples have previously been observed and described in the linguistic
literature. For example, several sources (Casanovas Catalá 2002; Enrique-Arias
2010; Freixas 2016) provide a classic example of a non-standard collocation, hacer un
café ‘have a coffee’, which does not appear in the corpus PRESEEA Palma. Instead,
this expression occurs four times in its standard form tomar un café ‘have a
coffee’, and the verb hacer was found to collocate with other words from the same
semantic field such as in hacer un cubata ‘have a cocktail’ and hacer el aperitivo ‘have
an aperitif’. The last collocation hacer el aperitivo can be found in the list of noun
(direct object) constructions in Sinner (2004). Some other examples that were
previously discussed in the linguistic literature and found in the corpora include hacer
medidas ‘take measures/action’ (Beas Teruel 2009), hacer un paseo ‘take/have a walk’
(Blas Arroyo 2004; Sinner 2004), hacer una película ‘show/screen a movie’ (Sinner
2004), hacer la siesta ‘take a nap’ (Casanovas Catalá 2002; Sinner 2004), hacer mala
cara ‘make a face’, hacer una clase ‘take/have/give a lesson’, hacer una obra de teatro
‘do a play’ (Casanovas Catalá 2002). The last example was found in the corpus