Developing and evaluating a style guide for chatbots deployed in a technical setting - AGNES PETÄJÄVAARA


Degree project in Interactive Media Technology
       Second cycle 30 credits

       Developing and evaluating a style
       guide for chatbots deployed in a
       technical setting
       AGNES PETÄJÄVAARA

Stockholm, Sweden 2022

Developing and evaluating a style guide for chatbots
                    deployed in technical settings

                                                            Agnes Petäjävaara
                                       M.Sc. Interactive Media Technology, KTH, Stockholm, Sweden
                                                             agnespet@kth.se

Document date: 2022-02-18
© 2021 Copyright held by the author.

ABSTRACT
This study evaluates the perceived credibility of a technical chatbot based on its communication style, that is, the way it interacts with its users through text and emojis. A chatbot's initial communication style was compared to a humble version. The humble communication style was developed from a design workshop held with six participants and is presented in this paper as a design style guide.
The perceived credibility was divided into six dimensions: Competence, Goodwill, Honesty, Predictability, Reputation, and Trustworthiness. The results from the evaluation of the two chatbot versions showed that credibility was, in general, perceived as higher for the chatbot using a humble communication style. Two exceptions were found: (1) the dimension of Trustworthiness stayed at the same level between the versions, and (2) the dimension of Goodwill was perceived as higher for the chatbot not using the humble communication style. Satisfaction with the chatbot was measured and resulted in an NPS of 17 for the chatbot using the humble communication style compared to a negative score of -16 for the chatbot not using it.
This study found that a more humble communication style would not harm the perceived credibility of a technical professional chatbot.

SAMMANFATTNING
This study evaluates the perceived credibility of a technical chatbot based on its communication style: how it interacts with its users through text and emojis. A chatbot's initial communication style was compared with a humble version. The humble communication style was developed from a design workshop and is presented in this study as a design guide.
Perceived credibility was divided into 6 dimensions: Competence, Goodwill, Honesty, Predictability, Reputation, and Trustworthiness. The evaluation of the two chatbot versions showed that credibility was generally perceived as higher for the more humble chatbot. Two exceptions were found, however: (1) Trustworthiness was unchanged between the two communication styles, and (2) Goodwill received a higher value for the initial communication style. Users' satisfaction with the chatbot was evaluated and resulted in an NPS of 17 for the chatbot using the humble communication style, compared with a negative value of -16 for the chatbot using the initial communication style.
This study showed that a more humble communication style does not harm the perceived credibility of a technical professional chatbot.

Author Keywords
Chatbot; Communication style; Design style guide; Conversational Interface; Internal support systems; CUI

CCS Concepts
•Human-centered computing → Human computer interaction (HCI); Haptic devices; User studies;

INTRODUCTION
Perceived credibility is an important user experience (UX) feature for software systems [27, 20, 17]. A conversational user interface (CUI) is essentially a digital interface enabling users to interact with software following the same principles as human conversations [4]. Examples of CUIs are chatbots and voice or virtual assistants.
This study regards chatbots and the perceived credibility of a humble communication style. Humble is a part of the brand tone of the company that issued this thesis, Ericsson, where humble is used to describe a more personal and kind way of interacting that does not feel too mechanical. In this study's design workshop, humble chatbots were defined as unpretentious and respectful.
Large-scale software development organizations need to have an efficient development and support infrastructure. However, large-scale organizations, such as Ericsson, come with complexity in support and maintenance, which makes the need to simplify the process both necessary and challenging. Ericsson currently has a prototype of a chatbot to help alleviate these problems. The chatbot aims to be the primary point of entry where Ericsson employees can search for guidance and support. The main users of the chatbot are engineers who need support with technical questions quickly. Ericsson believes that, once the chatbot is in production, it will make the development process faster and more efficient. This would improve the quality of deliverables in the long term and the ease


of use, speed, quality, and efficiency of the current handling of support issues in the short term. It is of great importance for the user experience and business value that users trust and believe what the chatbot tells them, since the perceived credibility of a system influences the user's interest in it [2].
It has been found important for systems that act as a source of knowledge, aim to instruct or tutor users, or act as a decision aid to be perceived as credible [14]. Therefore, it is of great importance that Ericsson's chatbot is also perceived as credible. Tools and studies evaluating perceived credibility based on the communication styles of human-to-human interaction exist, but there is a gap in the qualitative research of human-computer interaction (HCI) [10, 25].
A study by Liebrecht, Sander, and van Hooijdonk investigated communication styles (informal vs. formal) of customer service chatbots of familiar and unfamiliar brands. It was found that a chatbot's informal communication style induced a higher perceived social presence, which in turn positively influenced the quality of the interaction and brand attitude [19].
For chatbots used in customer support, it is important that users trust the chatbot to provide the required support. An interview study by Nordheim, Følstad, and Bjørkli on customer-support chatbots found that trust was determined by the chatbot's interpretation of requests, its self-presentation, and its professional appearance [15].
A study by Beattie, Edwards, and Edwards found that when emojis are used in chats, the sender is considered more socially attractive, competent, and credible than verbal-only message senders, no matter if the sender is a human or a chatbot [5].
Based on the previous literature studies, there seems to be a gap in research investigating the perceived credibility of chatbots deployed in technical settings based on their communication style. This study aims to bridge this gap by investigating whether a more humble conversation style of a chatbot, embodied through text and emojis, affects its perceived credibility. Additionally, the study presents a style guide for a humble and professional chatbot and the results of its evaluation.
The research question of this study is “What is the impact of a humble communication style on the perceived credibility of a chatbot used in a technical environment?”

Six dimensions of credibility
Credibility is a perceived quality that results from evaluating multiple dimensions simultaneously [13]. A credible human is perceived to be competent, trustworthy, and have goodwill [5]. For computers in general, credibility is said to consist of trustworthiness and expertise [14]. The credibility of AI for private use, such as voice agents and personal assistants, is conceptualized along the dimensions of honesty, expertise, predictability, and reputation [23, 10].
This paper focuses on evaluating the perceived credibility of a support chatbot used in a professional technical environment. For this reason, the credibility concept consists of the union of the dimensions from the definitions for human-human interaction, computers in general, and AI for personal use. Competence, Goodwill, Honesty, Predictability, Reputation, and Trustworthiness were used in this study to evaluate credibility.

Communication style
Communication style is an expression of a person's personality and determines the way people interact with others. A person's communication style determines how they speak, act, or react in various situations [24]. The HEXACO Personality Inventory-Revised (HEXACO-PI-R) is an instrument to measure the six major dimensions of personality and has been found to have medium to strong associations with communication styles [8]. To be able to develop a CUI that feels human-like in conversation, it is important to think of the CUI's personality [4]. Deciding on a voice and tone is fundamental, and this is why this study focuses on the chatbot's communication style.

Humility
Humble was chosen as the communication style because it has been found to be an appreciated communication style in professional settings and is included in the brand tone of the company that issued this thesis. In research based on HEXACO-PI-R, the personality dimension “honesty-humility” was studied in relation to workplace behavior. It was found that personalities with a high rating on the “honesty-humility” dimension showed productive and dutiful behaviors at work. The “honesty-humility” dimension has also been found to be crucial for providing the foundation for moral action within organizations, where leaders with a high rating have more effective team performance. Additionally, personalities showing high honesty-humility are perceived to be more trustworthy than others [9, 26, 3, 22].
Ericsson's brand personality is said to be that of a challenger, meaning that they sometimes want to be perceived as daring, and sometimes as heartfelt. Their tone of voice therefore consists of five different pillars. One is “Humbly intelligent” and is said to surprise, engage, and captivate their audience [11].

METHODOLOGY
To answer the research question, four phases beyond the literature research had to be covered. First, by defining humility in the specific context of a technical setting. This phase was covered in a design workshop conducted with six participants who all came from outside of Ericsson. The workshop was used as a method to gather data on the perception of humbleness from tech office workers. To the author's knowledge, there is limited data available on humble chatbots, and this method was an approach to gain more data for the research work. There are two main reasons why these participants were recruited: (1) it was difficult to find volunteers from the company who could all participate in a design workshop, and (2) using non-Ericsson participants gives a non-company-biased approach to humility in a technical setting.
Second, by designing a style guide that can be easily implemented by chatbots. The style guide was developed based on data from the literature research and the outcomes from the design workshop using an affinity diagram. The affinity diagram

was evaluated and confirmed by a person with expertise in UX to ensure it represented the collected data accurately.
Third, by implementing the style guide on a chatbot used in a large-scale software organization. This was done by only changing the sentences Ericsson's chatbot prototype communicated to its users, embodied through text and emojis, and not changing any visual elements of the chatbot prototype.
And lastly, by evaluating the possible impact by measuring the perceived credibility of the chatbot using the humble style guide in comparison to a version of the chatbot using the initial communication style. Ericsson's chatbot is at a prototype stage, meaning that it has not been released to a wider audience yet. Methods used to evaluate products in production, such as A/B testing, are therefore not applicable. The methods used in this phase were think-alouds, interviews, and surveys. To the author's knowledge, there is no standardized credibility evaluation available for technical support chatbots. Therefore, a union of established credibility evaluation surveys from related fields was used and backed up by measuring the Net Promoter Score (NPS) of the two versions of the chatbot prototype. Twelve employees at Ericsson participated in the evaluation: six on the initial version and six on the version following the humble style guide.

DESIGN WORKSHOP
To better understand what a humble communication style used by professional chatbots is, a design workshop was held. It had two main aims: (1) to understand how the communication style of Ericsson's initial prototype of the chatbot was perceived, and (2) to understand how its communication style could be re-designed to instead be perceived as humble. Six participants, A, B, C, D, E, and F, volunteered to participate. The group of participants had a median age of 27 years and came from 4 different nationalities. Due to the COVID-19 pandemic and the several locations of the participants, the workshop was held online for an effective 90 minutes and was divided into five activities.

1. Perception of the initial version: As a first activity, the initial version of Ericsson's chatbot prototype was demoed for the participants of the workshop. They were asked to focus on the personality of the chatbot to later be able to discuss their perception of it. The participants described their perception of the chatbot as “basic but professional”, motivated by it only giving the least amount of response needed while using “strict” and “formal” language.

2. What is humble?: The next activity was to define humble in the context of CUI. The participants were to rank synonyms of humble based on how well they thought that the word corresponds to their definition of a humble online conversation. The results can be seen in Figure 1. “Unpretentious” and “Respectful” were the synonyms that were the truest to the participants' beliefs. From this activity, the need for a conversation with a humble CUI not to be overly complex was mentioned. Participant B explained it as “If you are humble you are using simple language, kind of straight to the point, which makes the conversation easy for everyone to understand”.

Figure 1. The results from the design workshop activity "What is humble?".

3. Place the sentences on the grid: In this activity, the participants were given ten sentences and were asked to rate them on a dimensional grid of humble vs. arrogant and professional vs. unprofessional. The dimensional grid was used as a tool to encourage discussion about different ways of expressing yourself online and how sentences are perceived differently by the sender and receiver depending on the relation between them. Therefore, the actual placement of the sentences on the grid was insignificant for this study.
   The sentences used were constructed by the author of this paper with inspiration from real online conversations between employees at Ericsson:
      1. "No worries! I am happy to help "
      2. "I hate giving people more work so you are welcome"
      3. " well done in deleting your history "
      4. "Your history was successfully deleted "
      5. "Hellooooooo "
      6. "Hi! Can I help with anything?"
      7. "Do you need support??"
      8. "Bye! Come back soon "
      9. "Oh, I’m sorry I didn’t get that."
      10. "How can you not already know that?"
   Following the participants' placements on the grid, two main areas were discussed: (1) the use of emojis, and (2) the need not to feel judged. Regarding the use of emojis/emoticons, the participants agreed that only the most common ones are OK to use to still be perceived as professional. Participant A referred to sentence nr. 3 and said:
        “It seems very sarcastic with the clapping between every word. If you are friends it is another story, then you know that the other person is joking. Otherwise, it feels very unprofessional to me” - A
   Participant F agreed with participant A's statement and added:
        “You can use emojis and still be professional. Just not a bunch of them. And use the simple one, like a smiley, wave, or thumbs up. Don’t use flowers unless you are talking about flowers [referring to sentence nr. 5].” - F
   The participants also stated the need not to feel judged or stupid for the chatbot to be perceived as humble. Sentences nr. 10 and nr. 9 essentially express the same thing but were placed on the grid as opposites. Sentence nr. 10 was rated as extremely arrogant and unprofessional. Participant B expressed that they felt judged by that sentence and said “Who are they [the chatbot] to tell me what I should know?”.
   Sentence nr. 9 was rated as humble and professional. Participant E expressed their reasoning behind it as:
        “This is the most humble one because it states like I am sorry that *I* did not get it. It’s not blaming anyone else, more like, ‘sorry it’s my fault, not yours’” - E
   Participant F agreed with participant E and said that nr. 9 does not imply that the receiver of the message is stupid. Sentence nr. 7 also made the receiver feel stupid, and according to participant F this was only due to the double question marks.

4. Which features are essential priorities?: Since one of the goals from Ericsson was for the chatbot to motivate focused engagement and deliver simple and compelling UX, the fourth activity of the workshop was to discuss design principles for CUIs based on what the participants consider a priority for a humble chatbot. The design principles come from Følstad and Brandtzaeg's paper "Chatbots: Changing user needs and motivation" and focused on response time, friendliness, human touch, and the presentation of chat elements [6]. The results from the participants' ratings can be seen in Figure 2. The participants considered the gender of the chatbot the least important feature. However, previous research in this field indicates that chatbot gender does have an effect on users' overall satisfaction and gender-stereotypical perception [21].

Figure 2. Results from the workshop activity “Which features are essential priority?”.

   The most important feature was that the chatbot should have a fast response time. The participants expressed that this is essential for a humble chatbot because it needs to be respectful of your time. The participants also agreed that it is important for the chatbot to be able to redirect the user to an actual human when they reach a dead-end. Participant C stated that:
        “If a chatbot should be humble, it has to realize that it may not have all the answers and instead pass [the user] on to someone who does.” - C
   The priority of the friendliness of the humble chatbot divided the participants. Half of the group said that it is an essential priority and the rest not so much. During the discussion that followed, it was stated that the importance of friendliness depends on how clever the chatbot is. Participant F explained it as:
        “If I get the answer that I need then it doesn’t matter if it is friendly or not. However, if it fails to give me the correct answers I would be annoyed if it also isn’t friendly. It [the chatbot] must be friendly if it is a bit stupid” - F
   The participants considered deemphasized chat elements a priority because they seem more time-efficient, and time efficiency is considered an essential priority for humble CUIs. Participant A explained their rating as:
        “I first rated this one quite low because I thought that I want the chatbot to adapt to how I am talking, but through our discussions, I realized that it should actually be a priority. Especially if you are either short on time or if you don’t know exactly how to phrase stuff, preselected options could help you out. Having preset options is essential for a humble bot because it would value/respect the user’s time” - A
   The level of human likeness was considered important but not essential. Participant D explained their ranking as:
        “There needs to be some kind of human element to a chatbot for me to be able to perceive it as humble. Adding these human-unnecessary-words here and there, for example [interjections such as] “oh”, makes it sound more human and hence perceived as more humble to me” - D

5. Scripting of a humble conversation: In the last activity of the workshop, a few humble conversations were scripted based on the support chatbot prototype demoed at the beginning of the workshop.
   The first task for the participants in this activity was to decide how a humble chatbot should greet. Almost all participants replied with an answer similar to "Hey! How can I help you? ". The biggest differences were the word chosen as a greeting and the emojis used. Participant E explained why they used “hey” as a greeting word by saying:

        “I think the ‘hey’ makes it a little bit more personal than “hi”. In the same way as [interjections such as] ‘oh’ make the chatbot seem a bit more human, and in that way also humble” - E
   Participant F agreed and further explained their choice of emojis with:
        “The emoji could emphasize that you are not bothering it. That means a real emoji, not a :) [referring to emoticons]” - F
   From the discussion, the participants also agreed that it is important that the chatbot, already when greeting the user, expresses what it is capable of so the user does not waste their time.
   The second scripting task for the participants focused on the scenario where the chatbot needs more information from the user to proceed. From the submitted results, the importance of the chatbot apologizing was clear, see Figure 3. Participant F explained their submission with:
        “Adding a “sorry” at the beginning makes me feel like I didn’t do the wrong thing by not giving the chatbot the right information. The blame is not on me. So adding that would make the chatbot seem a bit more humble” - F

Figure 3. Participants' submissions to activity five and the scripting task "How should a humble chatbot ask the user for more information?".

   In a discussion about when to use “sorry” and “please”, participant A explained that “sorry” is a better word because “it is a bit more ‘it’s my fault, not yours’ than ‘please’”.
   Participant B was clear in stating the importance of humble chatbots being able to connect the user to an actual human and explained it with:
        “If the chatbot is asking for more information, there is already a miscommunication happening. Therefore I think it is important to already here throw in the possibility of talking to a human” - B
   The next scripting task was regarding what the chatbot should do if the user tries to break it. Trying to break the chatbot is a very common human behavior when first getting in contact with a chatbot [4]. One example could be asking a technical support chatbot "I hear music, do you?".
   There was a distinction in the participants' answers: some stated the importance of helping the user back on track with the chatbot's capabilities, whereas others wanted the chatbot to be odd in its replies, in a similar fashion to the user's question. Participant D submitted a more playful answer and explained that “If the user says something odd you can as well say something odd back”. Participant B, on the other hand, expressed the importance of bringing the users back on track, focusing on deemphasized chat elements and the preset answer options:
        “The options are good to give [the user] just to be sure that if the user, in fact, tries to mess with the chatbot or if they just didn’t succeed in giving [the chatbot] the correct key-words” - B
   The fifth scripting question was regarding how a humble chatbot should present knowledge or results in response to a user query. All participants stated the importance of the chatbot being less confident in its answers. It can do this by explaining that what it found might not be what the user was looking for and that, in that case, it can try to find something better. Participant A explained their submission with:
        “By saying - ‘let me know if this isn’t what you are looking for’, or ‘I can help you find something else’ - is a nice way of putting ‘the blame’ on the chatbot in the case that it didn’t give you the right results. It also shows that the chatbot is open to help more like it is not bothered if you ask it for more things. Rather than ‘here are the results - that’s it’, where the chatbot intends that if you don’t ask for the right thing it is on you that the chatbot couldn’t find the right thing.” - A
   The last scripting task was regarding how a chatbot should end a session. Here, too, the participants were on the same page about not explicitly including a formal "good-bye word" and instead suggested things like "I hope I helped you today, come back anytime! " or "Thank you for asking me, just let me know if there’s anything else I can do for you ".
   Participant F explained it with:
        “I feel like the chatbot is always there in the background, so it would be weird if it gave me a definite bye” - F
In the general discussion at the end of the workshop, the participants determined that it is important to use smiling emojis during the entire conversation with a user to showcase that the chatbot has a “positive mindset”.

HUMBLE CHATBOT STYLE GUIDE
The humble chatbot style guide is based on the outcomes from the design workshop and literature study and was examined using an affinity diagram that was evaluated by a person with experience in UX. The style guide is applicable for support chatbots used in a technical setting, suggesting personal traits


of the user to be highly skilled, well-educated, and possibly stressed.

Sentiment
The chatbot needs to be fast and straight to the point to be perceived as humble. If you are humble, you don't waste someone's time. The chatbot should use simple, well-known words and phrases. This is to be as inclusive as possible for non-native English speakers. Interjections are important for the chatbot to seem more human-like and, in turn, to be perceived as more humble.
What to do
• “Hey ”
• “Oh, cool!”
• “That’s unusual”
What not to do
• “Hi”
• “Ok.”
• “That is arcane”

Clarity
The chatbot should state early in the conversation what it can do. This is so the user directly knows how the chatbot can help them. Showing preset options already when greeting the user would also help the user with problem-solving as soon as possible. All of this prevents the user from wasting their time, which is something a humble chatbot should avoid.
What to do
• “Select a topic or type your question below and I’ll do my best to help you [list of preset answer options]”
What not to do
• “What do you need help with?”

Assertiveness
The chatbot should be less assertive in what it tells the user. It is important to ensure that the blame is always on the chatbot if something goes wrong or if the results given are not correct.
What to do
• “I am sorry, I didn’t get that. Would you mind rephrasing the question or should I find you a human to talk to?”
• “Please let me know if this is not what you are looking for ”
What not to do
• “Your input is wrong”
• “This is how it is”

Emojis
Humble chatbots used in a professional setting can use emojis, but only the most common ones, such as a smile, a thumbs up, or a wave. This is to avoid possible miscommunications between age groups and cultures. For a humble communication style, it is important to keep a positive mindset during the entire conversation, and by using smiling emojis the chatbot emphasizes that the user is not bothering it. However, less is more. Emojis should not be overused, to keep the conversation at a professional level. The emojis used by a humble chatbot need to be relevant to the content of the message. The meaning behind the emoji needs to be well-known and perceived in a similar manner across different cultures. More informal emojis can be used if the context allows it, e.g. if a user tries to break the chatbot by asking open-ended, hypothetical, or rhetorical questions that have very little to do with the service the chatbot offers.
What to do
• Hey, how can I help you? [smiling face-emoji]
What not to do
• Hey, how can I help you? [smirking face-emoji and hibiscus-emoji]

Availability
The chatbot should always be there, ready to help the user. For this reason, it should never end a session with a formal closing phrase. If the user submits a good-bye word to the chatbot, the chatbot should end the ongoing conversation but explain that it will stay idle in the background.
What to do
• “I hope I helped you today, come back anytime! [waving hand-emoji]”
It is also important to show that the chatbot is open to help
• “Thank you for asking me, just let me know if there’s any-
more and is not bothered if the user asks it for more things.
thing else I can do for you [smiling face-emoji]”
The user should never get the perception that it is their fault
that the chatbot could not help them. Adding a “sorry” at the • “Let me know if there is anything else I can help you with.
beginning of a sentence when the chatbot was unsuccessful Have a nice day [smiling face-emoji]”
makes it perceived as more humble. The “sorry” makes the
user feel that the error is on the chatbot and not on the input What not to do
they gave the chatbot. It is also important to include a human • “Goodbye [waving hand-emoji]”
hand-off as soon as the user is not fully understood to prevent
deeper miscommunication from happening. The chatbot can IMPLEMENTATION OF STYLE GUIDE
ask the user to clarify or rephrase their question, but if the The style guide was implemented to Ericsson’s initial chatbot
chatbot after that still does not understand what the user is prototype by only changing the sentences the chatbot commu-
asking for, it is important to already here give the possibility nicated to the users embodied through text and emojis, and
of talking to a human. not changing any visual elements of the CUI.
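As an illustration only, the Assertiveness, Emojis, and Availability guidelines of the style guide could be sketched in Python as follows. The class and function names, the `understood` flag (standing in for whatever intent detection a real chatbot uses), the wording of the second-miss hand-off sentence, and the bracketed emoji-tag notation are assumptions made for this sketch and are not taken from Ericsson's prototype. The other example phrases are those listed in the style guide.

```python
import re
from itertools import cycle

# Assumption: emojis are written as bracketed tags, following the style guide's notation.
ALLOWED_EMOJI_TAGS = {"[smiling face-emoji]", "[thumbs up-emoji]", "[waving hand-emoji]"}

# Availability: alternate between friendly closings, never a bare "Goodbye".
CLOSINGS = [
    "I hope I helped you today, come back anytime! [waving hand-emoji]",
    "Thank you for asking me, just let me know if there's anything else "
    "I can do for you [smiling face-emoji]",
    "Let me know if there is anything else I can help you with. "
    "Have a nice day [smiling face-emoji]",
]

# Assertiveness: the chatbot takes the blame and offers a human early.
FALLBACK = ("I am sorry, I didn't get that. Would you mind rephrasing "
            "the question or should I find you a human to talk to?")
# Hypothetical wording for the second miss; not from the style guide.
HANDOFF = "I am sorry, I still didn't get that. Let me find you a human to talk to."

GOODBYE_WORDS = {"bye", "goodbye", "good-bye"}


class HumbleResponder:
    """Toy reply picker; `understood` would come from the real intent detection."""

    def __init__(self):
        self._closings = cycle(CLOSINGS)
        self._misses = 0  # consecutive messages the chatbot failed to understand

    def reply(self, user_text, understood, answer=""):
        if user_text.strip().lower().rstrip("!.") in GOODBYE_WORDS:
            self._misses = 0
            return next(self._closings)
        if understood:
            self._misses = 0
            return answer
        self._misses += 1
        # First miss: ask to rephrase and offer a human. Second miss: hand off.
        return FALLBACK if self._misses == 1 else HANDOFF


def uses_only_allowed_emojis(message):
    """Emojis guideline: accept only the common, well-understood emoji tags."""
    tags = re.findall(r"\[[^\]]*emoji\]", message)
    return all(tag in ALLOWED_EMOJI_TAGS for tag in tags)
```

Cycling through the closing phrases is one simple way to alternate replies; picking one at random would serve equally well.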

Figure 4. Chatbot’s initial reply when the user tries to break it (left) and after the implementation of the humble style guide (right).

The initial chatbot listed its capabilities in a list of preset answer options. For the humble version, the message was changed from “What kind of support do you need?” to the more humble “Hey! How can I help you? Select a topic or write your question below. ”.

When the user selects a topic that they are interested in, the initial chatbot was straightforward and demanding, directly asking the user to provide more information in order to proceed with the user’s request. The humble chatbot has a less demanding approach.

When the chatbot presents results to a user request, the humble chatbot says “I hope these results were helpful for you! If what you were looking for is not included here, please let me know and I will do my best to improve over time ”, in comparison to the initial chatbot that did not say anything.

If a user tried to break the initial chatbot by asking irrelevant questions such as “I hear music, do you?”, the chatbot replied with “I can’t handle that request”. In the humble version, the chatbot uses interjections such as “hm”, and instead of leaving the user in a dead-end, it lists the preset answer options of what the chatbot can do to further support the user. See Figure 4 for a comparison between the two versions.

If a user wants to end a session by typing a goodbye phrase, the initial version of the chatbot replies with a simple “Goodbye!”. The humble version of the chatbot alternates between several sentences as a reply, such as “I hope I helped you today, come back anytime! [waving hand-emoji]”, “Let me know if there is anything else I can help you with. Have a nice day [smiling face-emoji]”, and “Thank you for asking me, just let me know if there is anything else I can do for you [smiling face-emoji]”.

CREDIBILITY EVALUATION
To understand how much and in what way the communication style of technical support chatbots affects its users, the chatbot’s credibility was evaluated by 12 employees at Ericsson. The participants were asked to grade sample statements based on how well they agreed with them. The sample statements were compiled from credibility evaluation surveys from related fields. A Likert scale was used as a method to get the conversation going regarding each dimension of the perceived credibility; therefore, each statement was followed by a mandatory question asking the participant to further explain why they graded the statements the way they did. The survey data was collected in one-on-one interviews, where the first part was a think-aloud focusing on the general usability of the chatbot, and in the last part the participants were asked to answer the survey. The credibility evaluation was made on both the initial version and the “humble” version of the chatbot. The statements and the median of the survey results can be seen in Figure 5.

Initial version
Six participants took part in the evaluation of the initial version of the chatbot. Two of the participants worked as support engineers, doing similar work to what the chatbot is intended to do in the future. They expected the chatbot to be able to help them with more technically advanced questions. The others had general engineering roles at Ericsson, such as test coordinator, product owner, and developer, and expected the chatbot to help them with both technologically advanced and less advanced questions. All of them had a relation to the technical areas in which the chatbot is intended to give support.

Three participants stated that they strongly agreed with the Goodwill statement and the rest that they agreed. One participant motivated their answer with:
  “While it is not clear how the chatbot finds the content presented by it, I have no reasons to think the chatbot is not well-intentioned.”

The initial version was rated high on the Trustworthiness and Reputation dimensions of credibility. However, the initial chatbot prototype was perceived as less positive on the Competence, Honesty, and Predictability dimensions. Three participants said that they neither agreed nor disagreed with the Honesty statement, saying that the chatbot did not reach that level. One participant strongly disagreed with it and said:
  “I do not think that the chatbot is responsible for the reliability of the information. If the information source is not accurate (erroneous or incomplete), the chatbot will simply present erroneous or incomplete information. The chatbot can be trained to ignore unreliable information, but it is impossible to ensure that the information will always be reliable.”

Humble version
In the evaluation of the humble version of the chatbot, two participants worked as system architects, one as a UX specialist, and the rest as data engineers. They had less knowledge of the chatbot's area of expertise than the participants of the initial version evaluation, meaning that they expected the chatbot to help them with a wider range of technical questions.

Analyzing the six independent data collections from the humble version evaluation showed positive results in comparison to the initial chatbot, with two exceptions: (1) the dimension of Goodwill, and (2) the dimension of Trustworthiness. Two participants said that they strongly agreed with the


Figure 5. The median results of the two communication style evaluations for the six dimensions of credibility: the initial communication style of the chatbot and the version of the chatbot with the implemented "humble" style guide.

Goodwill statement, the rest that they just agreed with it. The dimension of Trustworthiness stayed at the same rating as for the initial version of the chatbot. One participant explained their score with:
  “If it finds some relevant info, the info page itself is of trustworthy sources, so I can freely accept it. However, I would never accept it if it would say something is not there, not found: that might just mean it cannot find it.”

The dimensions of Competence, Honesty, Predictability, and Reputation all increased by half a level on the Likert scale compared to the initial version of the chatbot. For the Reputation statement, several participants expressed that the humble version was "simple and efficient to use".

Net Promoter Score (NPS)
To get a more accurate overview of the credibility measurement, all 12 participants were asked “How likely are you to recommend the chatbot to a colleague?”. Since NPS measures customer satisfaction, it is also related to the dimensions of credibility and was therefore included as a standardized complement to the survey. The initial version of the chatbot got a negative result of -16. The humble version had a positive score with an NPS value of 17. See Figure 6 for diagrams.

Figure 6. The NPS results for the question “How likely are you to recommend the chatbot to a colleague?”. To the left are the results for the chatbot using the initial communication style, and to the right are the results for the chatbot using the humble communication style.

DISCUSSION
This thesis work aimed to investigate whether the perceived credibility of chatbots is affected by a more humble conversation style embodied through text and emojis. To understand the impact, this study had to cover several phases. First, defining humility in the specific context of a technical environment. Second, designing a style guide that can be easily implemented by chatbots. Third, implementing the style guide on a prototype used in a large-scale software organization. And lastly, measuring the possible impact by comparing the perceived credibility of the chatbot using the humble style guide with that of a version of the chatbot that does not. This study found that the perceived credibility is, in general, higher for chatbots using a more humble conversation style. Two exceptions were found: (1) the credibility dimension of Trustworthiness stayed at the same level for both chatbot versions in this study, and (2) the credibility dimension of Goodwill was lower for the humble chatbot than for the initial chatbot.

The neutral Trustworthiness result indicates that a humble communication style has no direct impact on the perceived trust of the chatbot. This means that the personality of a chatbot used for support in a technical environment cannot directly be compared to the personality of a human, since previous research on the HEXACO-PI-R instrument found that personalities with high honesty-humility are perceived to be more trustworthy than others [9, 26, 3, 22]. Additionally, this result can be extended to the study by Nordheim, Følstad, and Bjørkli where trust was determined by the chatbot’s interpretation of requests, its self-presentation, and its professional

appearance, by explaining that a humble communication style has no impact on the trustworthiness of chatbots that are used in a technical work environment [15].

The negative Goodwill result suggests that Goodwill does not necessarily contribute to the positive perception of credibility for humble chatbots. This contradicts the hypothesis by Beattie, Edwards, and Edwards, where they explain that “chatbots using emojis in their conversation may be perceived to be demonstrating human goodwill by taking steps to convey relational information and keeping information open via facing more conversational cues”. However, since the humble chatbot is perceived to be more credible in general, but also more competent in comparison to the chatbot using the basic communication style, the outcome of this thesis work also confirms Beattie, Edwards, and Edwards' study where the sender is perceived to be both more competent and credible when using emojis [5].

The previous research on communication styles (informal vs formal) of customer service chatbots of familiar and unfamiliar brands, by Liebrecht, Sander, and van Hooijdonk, can to some extent be extended by the findings of this thesis. The style guide is based on the outcome from the design workshop, where the six independent participants were not trying to use human approaches to be humble, but rather rethinking what this means for a chatbot. Some formalities were explained to be needed for the chatbot to be perceived as professional, but at the same time interjections, emojis, and simpler words, which can be considered less formal, were mentioned as fundamental features of a humble communication style. This is also something that the previous research explains to be important to improve customers’ brand attitude and quality of interactions [19].

Based on the present findings, there are two main outcomes of this thesis that can be directly applied in the industry: the style guide and the survey results. The style guide can be of interest to other designers and developers working with CUIs in general, and support chatbots in particular, since it gives practical guidelines of “dos” and “don’ts” in how to design a chatbot that is perceived as humble. In addition, the results from the evaluation, especially the positive NPS results, can be of interest for people working with developing chatbots in a technical environment in general, and in particular for stakeholders who need the statistics to allow or reject a humble communication style.

Limitations
To gain a deeper understanding of the mechanisms behind the perceptions of a credible technical support chatbot, more research is needed that takes the following limitations into account.

Design workshop limitations
The participants of the design workshop who evaluated the perceived communication style of the initial chatbot were not able to interact with it themselves before the workshop but only observed a demo of it during the workshop and drew their conclusions on its communication style from that. Ideally, the participants should have interacted with the chatbot on their own before drawing conclusions about its communication style.

An online interactive presentation tool, Mentimeter, was used to collaboratively brainstorm ideas and answers to the different activities of the workshop [1]. On the positive side, Mentimeter encourages all voices to be heard and generates great visualization of the data collected. On the downside, the participants could influence other participants’ answers. This was due to the time pressure to complete each task in time, and to the visualization showing all participants’ answers to everyone as soon as an answer was submitted. For example, in the first scripting activity, all six participants submitted very similar sentences for the task “How should a humble chatbot greet?”.

Credibility evaluation limitations
The perceived credibility evaluation generated mostly positive results for the humble communication style, but the results might have turned out differently with a more homogeneous set of user groups. On the one hand, it has been found that users who are familiar with the content of a CUI will evaluate it more stringently and likely perceive it to be less credible [14]. On the other hand, the main end-users of Ericsson’s internal chatbot are engineers who all should have some level of knowledge of the technical systems they are asking the chatbot for support with. It has been found difficult to measure perceived credibility if participants’ judgments are influenced by objective properties of the information or its source [12]. For these reasons, the outcome of this study might have turned out differently if the style guide had been implemented and evaluated on a chatbot that was not intended to be used in a technical professional environment.

In hindsight, there are disadvantages to using the Likert scale as a method to collect data. One example is acquiescence bias, a phenomenon arising in surveys in general where respondents are more likely to agree than disagree with the statement shown [18]. To prevent accidentally misleading the participants and to be able to ask more accurate follow-up questions, it would have been preferable to ask the participants to rate the credibility statements only in interviews instead of in the survey. For instance, the survey results show that several participants from the initial communication style evaluation overlooked the chatbot’s negative behavior and tried to find solutions for it, instead of purely explaining why they agreed or disagreed with the statement. This was also shown in the think-alouds, where they were more likely to make excuses for the chatbot’s bad behavior by saying things like “I should adapt my search query to fit this microservice” when the chatbot was unable to return their intended results, instead of thinking that the chatbot did something wrong. A reason for this may be that the participants of the initial evaluation had a higher knowledge of the expertise of the chatbot due to their daily work tasks.

Ethics
Interaction happens both ways. For this reason, it is important to be aware of the chatbot’s personality since this may influence user behavior in the long term. The decision to take a humble approach for the communication style of the CUI can be more ethically defensible since humble qualities are


an important trait at work and an appreciated communication style in professional settings [9, 26, 3, 22]. Beyond work benefits, the humble style guide encourages gratitude. A study by Grant and Gino has shown that showing gratitude, even the smallest thank you, can motivate prosocial behaviors in others [16]. Not only those who give or receive prosocial behavior benefit from it; it also affects the people observing the kind acts or being part of the community where prosocial behavior happens [7].

The style guide encourages the design of a chatbot that is humanlike, but not overly so, to ensure that it still fulfills its primary purpose of giving the user the right answer as fast as possible. Additionally, the style guide encourages chatbots to have a human handoff if miscommunication occurs. This is not only a more efficient approach to problem solving, but can also have a positive influence on the user’s mental health, since early studies indicate that a purely robotic conversation might generate an increased feeling of isolation, loneliness, and depression [6].

Future work
Future work should focus on better understanding why Goodwill is the only dimension on which the humble style lowered the perceived credibility. Would the Goodwill indication stay the same if the Goodwill statement had been phrased differently, if a more diverse set of participants had answered the survey, or if more participants had participated? It would also be of interest to investigate if the perception of Trustworthiness would increase if the chatbot had a defined gender, since previous research indicates that gender transparency of a bot can create trust among users [4]. Another interesting approach to future work would be to analyze if and how the credibility evaluation would have differed if the style guide had been applied to another CUI or evaluated on another, more diverse user group. Would the style guide also be perceived as humble by neurodiverse users? Additionally, the last phase of this research work was quite short. It would be interesting to investigate whether a longer evaluation phase might show effects on human behavioral change, since interactions happen both ways. Would a humble chatbot influence its users to be more humble too? Finally, to make the style guide easier to adopt by other software development organizations, it should be further improved, evaluated, generalized, and optimized for easier implementation and adaptation in the future. This could be done by looping over more rounds of the Double Diamond approach, which due to time limitations was not possible for this thesis work.

CONCLUSION
This study contributes to the field of HCI by highlighting the importance of the communication style of support chatbots used in a technical environment. It aims to bridge the theoretical gap between credibility evaluation of support chatbots and the use of a more humble communication style. The paper presents a style guide for a humble communication style based on a design workshop with participants from a diverse set of cultures and work areas. The designed humble communication style was implemented and evaluated by employees from a large-scale software development organization.

Participants of the evaluation study experienced that a more humble communication style positively influenced the perceived knowledge of the chatbot, the reliability of its stated information, as well as the predictability of the chatbot’s next action. The participants who used the chatbot with a humble communication style also stated that they were more likely to use the chatbot in the future as well as recommend the chatbot to a colleague. If a support chatbot used in a technical environment is perceived as credible, it would greatly increase the chances of employees trusting the chatbot as the primary source of information. If more employees use a chatbot for assistance, it will reduce the time it takes for the employees to get support and also free up time for the employees working with supporting others today.

In conclusion, this study shows that using a more humble communication style generally positively affects the perceived credibility of internal support chatbots used in a technical environment.

ACKNOWLEDGMENTS
The author thanks all the volunteers who participated in the design workshop, interviews, and credibility survey. A grateful thank you to Lori-Ann Robertson and Leif Jonsson at Ericsson for your help and support in providing feedback on ideas and help in collecting participants for the interviews and questionnaire. A big thank you to Madeline Balaam for being an excellent academic supervisor and thank you to my examiner Kristina Höök for keeping me updated on the latest in the field. Finally, the author gratefully acknowledges friends and family for providing endless feedback and support throughout the entire thesis work. Thank you!

REFERENCES
[1] Mentimeter. 2022. https://www.mentimeter.com/

[2] Farah Alsudani and Matthew Casey. 2009. The Effect of Aesthetics on Web Credibility. In Proceedings of the 23rd British HCI Group Annual Conference on People and Computers: Celebrating People and Technology (BCS-HCI ’09). BCS Learning & Development Ltd., Swindon, GBR, 512–519. DOI: http://dx.doi.org/10.5555/1671011.1671077

[3] Michael C. Ashton and Kibeom Lee. 2009. The HEXACO–60: A Short Measure of the Major Dimensions of Personality. Journal of Personality Assessment 91, 4 (2009), 340–345. DOI: http://dx.doi.org/10.1080/00223890902935878 PMID: 20017063.

[4] Rachel Batish. 2018. Voicebot and Chatbot Design: Flexible Conversational Interfaces with Amazon Alexa, Google Home, and Facebook Messenger. Packt Publishing, Limited, Birmingham. 1789139627

[5] Austin Beattie, Autumn P. Edwards, and Chad Edwards. 2020. A Bot and a Smile: Interpersonal Impressions of Chatbots and Humans Using Emoji in

Computer-mediated Communication. Communication                       York, NY, USA, 80–87. 0201485591 DOI:
     Studies 71, 3 (2020), 409–427. DOI:                                  http://dx.doi.org/10.1145/302979.303001
     http://dx.doi.org/10.1080/10510974.2020.1725082
                                                                     [15] Asbjørn Følstad, Cecilie Bertinussen Nordheim, and
 [6] Petter Bae Brandtzaeg and Asbjørn Følstad. 2018.                     Cato Alexander Bjørkli. 2018. What Makes Users Trust
     Chatbots: Changing User Needs and Motivations.                       a Chatbot for Customer Service? An Exploratory
     Interactions 25, 5 (aug 2018), 38–43. DOI:                           Interview Study. In Internet Science, Svetlana S.
     http://dx.doi.org/10.1145/3236669                                    Bodrunova (Ed.). Springer International Publishing,
 [7] Joseph Chancellor, Seth Margolis, and Sonja                          Cham, 194–208. 978-3-030-01437-7
     Lyubomirsky. 2018. The propagation of everyday                  [16] Adam Grant and Francesca Gino. 2010. A Little Thanks
     prosociality in the workplace. The Journal of Positive               Goes a Long Way: Explaining Why Gratitude
     Psychology 13, 3 (2018), 271–283. DOI:                               Expressions Motivate Prosocial Behavior. Journal of
     http://dx.doi.org/10.1080/17439760.2016.1257055                      personality and social psychology 98 (06 2010), 946–55.
 [8] Reinout E. de Vries, Angelique Bakker-Pieper, Femke E.               DOI:http://dx.doi.org/10.1037/a0017935
     Konings, and Barbara Schouten. 2013. The                        [17] Kieun Kim. 2016. The Relationship of UX and
     Communication Styles Inventory (CSI): A                              Perceptions of Credibility: The Case of the Mobile
     Six-Dimensional Behavioral Model of Communication                    Social Commerce Sites. International Journal of
     Styles and Its Relation With Personality.                            Affective Engineering 15, 2 (2016), 109–114.
     Communication Research 40, 4 (2013), 506–532. DOI:
     http://dx.doi.org/10.1177/0093650211413571                      [18] Ozan Kuru and Josh Pasek. 2016. Improving social
                                                                          media measurement in surveys: Avoiding acquiescence
 [9] Reinout E. de Vries and Jean-Louis van Gelder. 2015.                 bias in Facebook research. Computers in Human
     Explaining workplace delinquency: The role of                        Behavior 57 (2016), 82–92. DOI:
     Honesty–Humility, ethical culture, and employee                      http://dx.doi.org/https:
     surveillance. Personality and Individual Differences 86              //doi.org/10.1016/j.chb.2015.12.008
     (2015), 112–116. DOI:http://dx.doi.org/https:
     //doi.org/10.1016/j.paid.2015.06.008                            [19] Christine Liebrecht, Lena Sander, and Charlotte van
                                                                          Hooijdonk. 2021. Too Informal? How a Chatbot’s
[10] Cal W. Downs, Joan Archer, John McGrath, and Jeff                    Communication Style Affects Brand Attitude and
     Stafford. 1988. An Analysis of Communication Style                   Quality of Interaction. Følstad A. et al. (eds) Chatbot
     Instrumentation. Management Communication                            Research and Design. CONVERSATIONS 2020. Lecture
     Quarterly 1, 4 (1988), 543–571. DOI:                                 Notes in Computer Science 12604 (2021), 16–31. DOI:
     http://dx.doi.org/10.1177/0893318988001004006                        http://dx.doi.org/10.1007/978-3-030-68288-0_2
[11] Ericsson. 2020. Ericsson brand guidelines - extract from        [20] Jessica Lindblom and Rebecca Andreasson. 2016.
     ericsson brand house (v.1.0 ed.). 14 pages.                          Current Challenges for UX Evaluation of Human-Robot
     https://mediabank.ericsson.net/admin/mb/?h=                          Interaction. In Advances in Ergonomics of
     dbeb87a1bcb16fa379c0020bdf713872&p=                                  Manufacturing: Managing the Enterprise of the Future,
     dccda36951e6721097a93eae5c593859&display=list                        Christopher Schlick and Stefan Trzcieliński (Eds.).
[12] Andrew J. Flanagin and Miriam J. Metzger. 2007. The                  Springer International Publishing, Cham, 267–277.
     role of site features, user attributes, and information              978-3-319-41697-7
     verification behaviors on the perceived credibility of          [21] Marian McDonnell and David Baxter. 2019. Chatbots
     web-based information. New Media & Society 9, 2                      and Gender Stereotyping. Interacting with Computers
     (2007), 319–342. DOI:                                                31, 2 (04 2019), 116–121. DOI:
     http://dx.doi.org/10.1177/1461444807075015
                                                                          http://dx.doi.org/10.1093/iwc/iwz007
[13] B. J. Fogg, Jonathan Marshall, Othman Laraki, Alex
                                                                     [22] Lea Müller, Jens Mattke, Christian Maier, Tim Weitzel,
     Osipovich, Chris Varma, Nicholas Fang, Jyoti Paul,
                                                                          and Heinrich Graser. 2019. Chatbot Acceptance.
     Akshay Rangnekar, John Shon, Preeti Swani, and
                                                                          Proceedings of the 2019 on Computers and People
     Marissa Treinen. 2001. What Makes Web Sites
                                                                          Research Conference (2019). DOI:
     Credible? A Report on a Large Quantitative Study. In
                                                                          http://dx.doi.org/10.1145/3322385.3322392
     Proceedings of the SIGCHI Conference on Human
     Factors in Computing Systems (CHI ’01). Association             [23] Cecilie Bertinussen Nordheim, Asbjørn Følstad, and
     for Computing Machinery, New York, NY, USA, 61–68.                   Cato Alexander Bjørkli. 2019. An Initial Model of Trust
     1581133278 DOI:                                                      in Chatbots for Customer Service—Findings from a
     http://dx.doi.org/10.1145/365024.365037                              Questionnaire Study. Interacting with Computers 31, 3
                                                                          (08 2019), 317–335. DOI:
[14] B. J. Fogg and Hsiang Tseng. 1999. The Elements of
                                                                          http://dx.doi.org/10.1093/iwc/iwz022
     Computer Credibility. In Proceedings of the SIGCHI
     Conference on Human Factors in Computing Systems                [24] Robert Norton. 1983. Communicator style. Sage
     (CHI ’99). Association for Computing Machinery, New                  Publications.
