Degree project

Web Site Security Maturity of the European Union and its Member States
A survey study on the compliance with best practices of DNSSEC, HSTS, HTTPS, TLS-version, and Certificate Validation Types

Bachelor Degree Project in Information Technology with a Specialisation towards Network and System Administration
G2E (IT610G)
Date of examination: 2021-06-13
First Cycle 22.5 credits
Spring term 2021

Student: Axel Rapp (a18axera@student.his.se)
Supervisor: Johan Zaxmy
Examiner: Jianguo Ding
Acknowledgement

I would like to begin this report by thanking everyone who assisted me in the process of completing this bachelor thesis. First and foremost, my supervisor Johan Zaxmy, who was constantly present and able to guide me in the right direction every step of the way. I would also like to thank my examiner Jianguo Ding, who provided feedback and answers when questions arose. Finally, a thank you to my peers, friends, and family. Thank you!
Abstract

With e-governance steadily growing, citizen-to-state communication via Web sites is as well, placing enormous trust in the protocols designed to handle this communication in a secure manner. Since breaching any of the protocols enabling Web site communication could yield benefits to a malicious attacker and bring harm to end-users, the battle between hackers and information security professionals is ongoing and never-ending. This phenomenon is the main reason why it is of importance to adhere to the latest best practices established by specialized independent organizations. Best practice compliance is important for any organization, but maybe most of all for our governing authorities, which we should hold to the highest standard possible due to the nature of their societal responsibility to protect the public. This report aims to, by conducting a quantitative survey, study the Web sites of the governments and government agencies of the member states of the European Union, as well as Web sites controlled by the European Union, to assess to what degree their domains comply with the current best practices of DNSSEC, HSTS, HTTPS, SSL/TLS, and certificate validation types. The findings presented in this paper show that there are significant differences in compliance level between the different parameters measured, where HTTPS best practice deployment was the highest (96%) and HSTS best practice deployment was the lowest (3%). Further, when comparing the average best practice compliance by country, Denmark and the Netherlands performed the best, while Cyprus had the lowest average.

Keywords: Web Site Security, Information Security, E-governance, Best Practice, DNSSEC, HSTS, HTTPS, SSL, TLS, Certificate Validation
Table of Contents

1 Introduction
  1.1 Disposition
  1.2 Terminology
2 Background
  2.1 DNSSEC
  2.2 HSTS
  2.3 HTTPS
  2.4 TLS & SSL
  2.5 Certificate Validation
  2.6 Related Work
3 Motivation & Problem Definition
  3.1 Thesis Statement
  3.2 Objectives
4 Methodology
  4.1 Explanation of methodology
  4.2 Selection of Scope
  4.3 Data Analysis Methodology
  4.4 Validity
5 Implementation of Methodology
  5.1 Finding Domains
    5.1.1 Governments
    5.1.2 Government Agencies
    5.1.3 European Union Web Sites
  5.2 Finding Best Practices
    5.2.1 DNSSEC
    5.2.2 HSTS
    5.2.3 HTTPS
    5.2.4 TLS
    5.2.5 Certificate Validation
  5.3 Finding Compliance with Best Practices of the Domains
    5.3.1 DNSSEC
    5.3.2 HSTS
    5.3.3 HTTPS
    5.3.4 TLS
    5.3.5 Certificate Validation
  5.4 Automatic Data Gathering and Parsing
6 Results
  6.1 Best Practice Compliance of the Whole Population per Parameter
    6.1.1 DNSSEC
    6.1.2 HSTS
    6.1.3 HTTPS
    6.1.4 TLS
    6.1.5 Certificate Validation Type
  6.2 Best Practice Compliance per Population Group per Parameter
    6.2.1 DNSSEC
    6.2.2 HSTS
    6.2.3 HTTPS
    6.2.4 TLS
    6.2.5 Certificate Validation Type
  6.3 Best Practice Compliance per Country per Parameter
    6.3.1 DNSSEC
    6.3.2 HSTS
    6.3.3 HTTPS
    6.3.4 TLS
    6.3.5 Certificate Validation Type
  6.4 Best Practice Compliance Comparisons
    6.4.1 Comparison between Population Groups
    6.4.2 Comparison between Agency Types
    6.4.3 Comparison between Countries
  6.5 Conclusion
    6.5.1 Comparison with Related Work
    6.5.2 Contributions
7 Discussion and Future Work
  7.1 Result Validity
  7.2 Ethical and Societal Aspects
  7.3 Future Work
References
Appendix A. List of Government Domains
Appendix B. List of Armed Forces Domains
Appendix C. List of National Civil Police Agency Domains
Appendix D. List of Prison Agency Domains
Appendix E. List of Public Employment Service Domains
Appendix F. List of Taxation Agency Domains
Appendix G. List of europa.eu Domains
Appendix H. HTTPS Data Collection Script
Appendix I. DNSSEC Data Collection Script
Appendix J. TLS Data Collection Script
Appendix K. HSTS Data Collection Script
Appendix L. Certificate Type Data Collection Script
Appendix M. Data Parsing Script
Appendix N. Data Gathering and Parsing Overview
1 Introduction

It seems like every other day we hear news reports of breaches or hacking events that have taken place, leaving trusting users in the line of fire for criminals. In a time where e-governance is steadily increasing, and citizens expect to find information and services available online, it is paramount that government-affiliated web services lead by example and provide a high level of security for their users. A pillar of democracy is that all power should be transparent and scrutinized, which leads to the question: are the EU and the member states' governments leading by example in web security? If not, how can we expect the private sector to follow?

Even though sufficient solutions exist for many security threats, their implementation in the real world is not a given. For example, HTTPS support increased from 2016 to 2017; however, the support varied by region and popularity of the Web site (Felt et al., 2017), leaving more to wish for. This is further confirmed when looking at web security features in the light of the DigiNotar security breach in 2011, where the certificate authority was hacked and, as a result, issued fraudulent certificates. Techniques developed after the incident, such as Certificate Transparency (CT), which makes CA systems auditable; header additions to HTTPS; preventing protocol downgrade attacks with SCSV; and DNS-based extensions controlling certificate issuing with CAA, could all protect against a multitude of attacks, even mitigating the effect of DigiNotar-like breaches. Their deployment, however, was found to be disappointing (Amann et al., 2017). So, there is cause for security concern in the Internet ecosystem as a whole.

What about government security? When Thompson et al. (2020) compared a highly developed e-government state (Australia) with a developing nation as a contrast (Thailand), it was found that not much separated the two nations' Web site security levels. In an example from the study, only half of the Australian sites were configured to enforce HTTPS via HSTS, compared to Thailand's one third. Government agencies have the ability to set a mandate for the private sector to follow. But are they?

Certainly, there are many different ways of judging a Web site's security level. The focus of this study is Web site configuration regarding data exchange between the user and the web server within the EU. This project aims to look at a selection of security parameters for Web sites as part of a security chain and analyse the configuration of the selected parameters of the Web sites of governments, government agencies, and EU domains, using tools to collect the necessary data. The parameters considered in this paper are:

- DNSSEC (DNS Security Extensions) – Ensuring authenticated DNS lookups
- HSTS (HTTP Strict Transport Security) – Enforcing use of HTTPS and blocking insecure redirects
- HTTPS (HTTP over TLS) – Secure communication via encryption and server authentication
- SSL/TLS version (Secure Sockets Layer & Transport Layer Security) – The encryption layer of the HTTPS protocol
- Certificate Validation type – To what degree certificate ownership is validated

1.1 Disposition

The contents of this report begin with a background section intended to present the necessary information to provide the reader with an understanding of the protocols studied, as well as a summarization of related work.
Next, a section that motivates why this subject is of interest for a study is presented, and a thesis statement is defined. The methodology selected is presented and argued for in section four, together with a discussion on validity. Section five elaborates in detail on how the study was conducted, before the findings of the study are presented in a results section which also includes the conclusions drawn from the findings. Finally, a discussion of possible validity concerns, ethical and societal aspects of the study, and ideas for future work is found in section seven.

1.2 Terminology

This section provides an easily accessible list of the terminology and abbreviations used throughout the report. The list is in alphabetical order.

CA – Certificate Authorities are organizations that can issue digital certificates and store the information of public keys and their owners.
CN – Canonical Name is a record used in DNS to create an alias from one domain to another.
CSV – A Comma Separated Value file is a text file using (mainly) commas to separate values. A CSV file usually stores tabular data.
DNS – The Domain Name System is the system used to match domain names to IP addresses.
DNSSEC – Domain Name System Security Extensions. A detailed explanation of DNSSEC is available in section 2.1.
E-governance – Electronic governance is the act of using Information and Communication Technology to provide government services or information sharing.
HSTS – HTTP Strict Transport Security. A detailed explanation of HSTS is available in section 2.2.
HTTP – Hypertext Transfer Protocol is the protocol used to transfer data over the Web.
HTTPS – Hypertext Transfer Protocol Secure. A detailed explanation of HTTPS is available in section 2.3.
IANA – The Internet Assigned Numbers Authority is a department of ICANN responsible for registries of unique identifiers, such as domain names, protocol parameters, and IP addresses.
ICANN – The Internet Corporation for Assigned Names and Numbers is a non-profit organization responsible for, among other things, the IP protocols and address space.
IETF – The Internet Engineering Task Force is a non-profit standards organization creating standards to maintain and improve the usability and interoperability of the Internet.
MITM attack – A Man-in-the-Middle attack is an attack where the attacker is placed between an end-user and a service in order to control the information flowing between them. This attack can be used in different ways, for example, to eavesdrop on or manipulate information.
RFC – A Request for Comments is a document published by the IETF used to develop standards.
TCP – Transmission Control Protocol is the most common transmission protocol in IP networks.
TLS/SSL – Transport Layer Security/Secure Sockets Layer. A detailed explanation of TLS/SSL is available in section 2.4.
URL – Uniform Resource Locator is the network identification for any resource connected to the Web and is used to specify addresses on the World Wide Web network.
2 Background

This section of the report aims to provide the reader with information and an understanding of central concepts of web security. All parameters discussed are part of a linked security chain, meaning that each link directly affects the overall security of a Web site.

2.1 DNSSEC

The Internet addressing system, DNS, is arguably the most critical part of Internet infrastructure and has been in use since the 1980s. Like most protocols developed in the early days of the Internet, it was not designed with security in mind, leaving it vulnerable to attacks where users can be redirected to fraudulent sites and have valuable information stolen from them. Developed to combat these sorts of MITM attacks, DNSSEC makes use of public-key cryptography to authenticate and validate DNS data (Arends et al., 2005a). The idea is built upon a chain of trust: once a DNS response is given to a local DNS server, a public key from the responding DNS server is sent along with the signed response. The local DNS server uses this public key to validate the response's authenticity by querying its parent zone, usually a Top-Level Domain (TLD), which can vouch for the child's signature as well as sign its own response. The local DNS server then proceeds to query the root zone, which can vouch for the TLD zone's signature. If this process does not output any errors, the local DNS server is assured of the DNS data's authenticity and integrity. As an example, when resolving the IP address of www.example.com, the root zone (.) will verify the .com zone's signature and the .com zone will verify the example.com zone's signature. If any link in this chain is broken, it is enough cause for concern for the local DNS server not to accept the resolved IP address.

2.2 HSTS

Based upon Jackson & Barth's (2008) prototype approach of ForceHTTPS, which intended to enforce HTTPS in browser-client communication through a browser extension, HSTS instead has the websites declare themselves as HTTPS-only, mainly in an HTTP response header field (Hodges et al., 2012). Hodges et al. (2012) continue by explaining that with HSTS implemented on the server, the client will dynamically change insecure links to secure ones before accessing the web host (e.g., http://his.se to https://his.se). Further, if security errors occur with regard to TLS, or the certificate is not trusted, the connection will be terminated and the client will not be able to access the web application. If applied to the top domain name, HSTS can also be configured to apply to all subdomains, thus blocking any HTTP redirects within the domain.

2.3 HTTPS

A fundamental part of Internet security relates to the use of HTTPS, which was first introduced in 2000 in an RFC by the IETF (Rescorla, 2000). The communication protocol builds upon the application layer protocol HTTP, which makes up the foundation of World Wide Web communication. The incentive for implementing HTTPS is that the HTTP protocol makes no significant efforts towards security, thus exposing the confidentiality and integrity of network traffic. As explained by Rescorla (2000), the concept of HTTPS is simple: use HTTP over TLS as you would over TCP. Packets are then no longer sent in plain text but encrypted with TLS, achieving secure communication.
The HTTPS protocol fulfils the following features:
• Confidentiality – the message is encrypted.
• Integrity – the message has not been altered in transit.
• Server Authentication – the message is received from the correct sender.

An HTTPS session usually occurs between a web browser (client) and a web server, where the former initiates the process of agreeing upon what cryptographic parameters to use for the session. This process is called "the TLS handshake". There are multiple ways to configure a server and a client as to what this handshake should entail; however, according to Rescorla (2000), the most important and relevant steps are the following:

1. The client sends a ClientHello message with information regarding what TLS versions and encryption algorithms (also called cipher suites) it supports, along with a preference list for what to use if it can be matched by the server. A session ID is also present.
2. The server responds with a ServerHello message where the selected TLS version and cipher suite are established. It also passes along its server certificate.
3. The client responds by sending a premaster secret, which both sides use to derive the keys for a symmetric encryption channel.
4. The two sides confirm that the encryption is valid and working, and if it is, application data can be securely sent both ways.

Dierks and Rescorla (2008) give insight into the following issue: an HTTPS session is initiated by the client, but it is the server that dictates the terms of the session. A web server can be configured not to support certain TLS versions and cipher suites in order to avoid establishing insecure communication channels with clients. There is a balance here that is of interest: a web server which only supports the newest and most secure cipher suites might find itself denying most clients. Therefore, some support for older (and perhaps insecure) versions and suites may need to be present. A conflict of interest can occur for the web server owner between wanting to provide access to all clients and wanting to provide only secure access.

2.4 TLS & SSL

SSL is the now obsolete predecessor of TLS, the encryption protocol used in HTTPS communication. Originally developed by Netscape, SSL version 1.0 was never published due to security flaws. Its successor, SSL 2.0, released in 1995, was not free of security flaws either and forced a rather quick redesign of the protocol, released the following year as SSL version 3.0. SSL 3.0 is the protocol that the newer TLS versions build upon. Control of the protocol was given to the IETF which, in 1999, released TLS version 1.0 as an update to SSL 3.0. Since then, three more versions of TLS have been published, adding more security features and protections against newer attacks. In 2011 and 2015, SSL versions 2.0 and 3.0 respectively were deprecated, leaving only the TLS suite available for use (Barnes et al., 2015; Turner & Polk, 2011).

2.5 Certificate Validation

Another parameter important to web security is certificates, or more specifically, certificate validation levels. The main idea of a certificate is its ability to certify a public key's owner in some manner. If the public key is trusted, then clients can also trust that public key's corresponding private key. This trust can be accomplished only by a trusted third party – a Certificate Authority.
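To make the handshake outcome described above concrete, the following minimal sketch uses Python's standard ssl and socket libraries to open a TLS connection and print the negotiated protocol version, cipher suite, and the certificate's subject and issuer. It is an illustration only, not one of the data collection scripts used in this study, and the domain shown is just a placeholder.

```python
import socket
import ssl

def inspect_tls(hostname: str, port: int = 443) -> None:
    """Open a TLS connection and print what the server negotiated."""
    context = ssl.create_default_context()  # system CA store, modern defaults
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            print("Negotiated protocol:", tls.version())   # e.g. 'TLSv1.3'
            print("Negotiated cipher:  ", tls.cipher())    # (name, protocol, bits)
            cert = tls.getpeercert()
            print("Certificate subject:", cert.get("subject"))
            print("Certificate issuer: ", cert.get("issuer"))

if __name__ == "__main__":
    inspect_tls("example.com")  # placeholder target domain
```

Running the sketch against a server shows, in one connection, the outcome of the ClientHello/ServerHello negotiation (sections 2.3 and 2.4) and the certificate presented by the server (section 2.5).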
As explained by CABF (n.d.), when a CA issues certificates there are a few different ways of validating the entity to which the certificate is applied: (1) Domain Validated certificates, (2) Organization Validated certificates, and (3) Extended Validated certificates, each differing in cost and in how thorough the validation process is.

(1) Domain Validation (DV) is the lowest level of validation, where the CA essentially only validates that the purchaser has authority over the domain in question. A DV certificate can usually be obtained within a few minutes or hours from purchase and is the cheapest alternative since no human interaction is needed. It still provides the ability to establish an HTTPS connection to the Web site.

(2) Organization Validation (OV) is a step up, and a more expensive solution, in which the CA needs to validate the organization's identity before issuing the certificate. This process can be completed within a few days.

(3) Extended Validation (EV) is the strictest alternative, where the CA requires the organization to provide documentation of ownership, physical location, the legal existence of the organization, etc. before issuing the certificate. Naturally, with this level of thoroughness, the certificate takes longer (usually a few weeks) and costs more to obtain.

2.6 Related Work

This section aims to explain how this study fits into the environment of previously published work in the area and why it is a necessary contribution to the total body of literature on the subject of Web site security for governmental Web sites.

E-governance and security are two closely intertwined areas. Alharbi et al. (2014) identify security as a main factor for end-user adoption of e-governance and attribute similar importance to perceived risk. Further, their study finds that half of the respondents had privacy concerns when utilizing e-government services. The researcher believes this study, among others, highlights the importance of transparency in the development of Web security related to governments.

Another approach seen in the literature is the deep-dive analysis of a specific country's conditions regarding e-governance, or audits of specific services, often performed on developing nations. For instance, the vulnerability assessment of Burkina Faso's government-controlled Web sites showed known vulnerabilities on half of the inspected sites (Bissyandé et al., 2016). While this type of work is particularly important for the ability of these nations to better their infrastructure and overall security, it does not provide a general status of larger regions, such as Europe.

Studies similar to this one, where the adoption rates of certain security features or protocols are researched, are available; however, these also focus on specific countries. As mentioned in section 1, Thompson et al. (2020) investigated, as part of their study, the adoption rate of the HTTPS-enforcing protocol HSTS, thus providing a glimpse of the best practice implementation status of that protocol for the two countries included in the study at that specific point in time. This type of information has not been identified collectively for the government agencies of the European countries. A lot of data is available, however, only for the usage of certain protocols on the Internet as a whole, or for individual countries.
This type of data is often provided by non-scientific sources such as browser providers or certificate authorities, but it still gives a good estimate of the ecosystem's adoption as a whole. In contrast, this study aims to provide similar data, but specific to governmental Web sites within the EU and gathered in a scientific manner.

Another non-scientific resource that deserves to be mentioned in this section is Balter's (2021) analysis of federal .gov domains in the US. His work is closely related to the aims of this research study. The prevalence of a set of parameters has been analysed for all US federally controlled domains to serve as a status update on the current adoption of those parameters. It was shown that HTTPS was supported on 95% of domains, and 75% returned DNSSEC records. Further, 69% of domains supported HSTS in some capacity, whereof 44% were also present on the HSTS preload list (Balter, 2021).
3 Motivation & Problem Definition

As the use of the Internet increases, so does the adoption of e-governance by government agencies. Government agencies use the Internet to communicate with their citizens mainly through their Web sites and Web services, and citizens are to a greater extent expected to use the Internet in their communication with the government agencies, even more so in developed parts of the world, such as Europe (Thompson et al., 2020). As identified by Alharbi et al. (2014), security and perceived security are two important factors for this state-to-citizen relationship to work. A prerequisite for this type of communication to work as intended is for the communication to be secure. The security of the communication between Web sites and end-users is not always clear to the end-user, and not complying with best practices regarding Web site security can lead to a multitude of issues affecting the confidentiality and the integrity of the end-user. For example, by not supporting HTTPS and providing Web site communication over plain HTTP, the end-user is at risk of information theft and manipulation due to the loss of confidentiality, integrity and server authentication as compared to HTTPS communication (Rescorla, 2000). An intention of this report is to assist network and system administrators in maintaining secure Web sites by establishing the best practices and highlighting possible weak areas.

The goal of this study is to perform a collective data gathering from the Web sites of the governments and the government agencies of the member states of the European Union (EU), as well as EU-controlled Web sites. By analysing the collected data and presenting the results, the study will act as a snapshot of the current implementation adoption of relevant security protocols and security features, something which is not readily available at the moment. The benefit of such information is to provide a fair insight into the e-government security level of a large geographic area affecting a large demographic. Further, a snapshot of the current state can also serve as a benchmark when measuring improvement over time.

The selection of parameters to gather data upon is derived from the reasons explained in section 2. Essentially, to be able to assess the security maturity of multiple Web sites there is a trade-off between how thorough a security analysis to perform on each Web site and how many Web sites you can assess. This project aims to assess many different domains in multiple countries and, because of this, an in-depth analysis of each Web site is not feasible given the resource availability of this final year project. Rather, the parameters to be included must be able to be validated in an automated fashion without human interaction. A few key metrics have been chosen as part of a Web security chain that also fulfil the previously mentioned requirement of automated validation. Each security feature's relevance and importance as part of this chain are discussed in section 2, where their interplay is also clarified. The chosen parameters are:

- Prevalence of DNSSEC
- Prevalence of HSTS
- Prevalence of HTTPS
- SSL/TLS version support
- Type of Certificate Validation

The selection of subjects, where this study exclusively explores the government agencies and governments of EU member states and the EU itself, is explained by the interest in scrutinizing those whose power and trust is the greatest, to ensure that they comply with best practices. Since e-governance is an ongoing driving force, it is of importance that it is handled correctly.
Further, the researcher has chosen not to look at specific countries or regions within the EU; instead, all countries within the EU will be included in the study.

3.1 Thesis Statement

Considering the reasons explained in the previous section, the thesis statement of this study is:

This study aims to establish to what extent the European Union controlled Web sites, and the Web sites of the governments and a selection of the government agencies of the member states of the European Union, comply with the best practices of Web site security in regard to DNSSEC, HSTS, HTTPS, TLS, and Certificate Validation.

3.2 Objectives

Three subordinate objectives have been formulated and need to be answered in order to reach the aim of the thesis statement:

Objective 1: What Web sites should be included in the study?
To reach the aim, this first objective is essential as a starting point. Since there are several different countries with different polities to be included in the study, as well as an international union, a selection of relevant and comparable Web sites to be included must be found.

Objective 2: What are the best practices for the relevant security features?
To be able to establish to what extent the relevant Web sites comply with best practice, there needs to be a standard to which to compare the sites. This standard for best practices needs to be defined before any relevant conclusions or comparisons can be made from the material.

Objective 3: Do the Web sites in objective 1 comply with the best practices established in objective 2?
Objective 3 could be argued to be the core question in this study since it builds on top of objective 1 and objective 2 and relates directly to the thesis statement. If each Web site's compliance with the best practices can be assessed, the entire population of assessed Web sites will make up the total body of results needed to establish the extent of best practice use.

Answers to these objectives, and establishing the extent of best practice use as explained by the thesis statement, could be valuable not only to the organizations included in the study but to any organization evaluating its Web site security. Further, the findings of this study will provide a transparent security view of the organizations included in the study, which can be used by individuals communicating with them through their Web sites. Finally, the findings of this study would be beneficial for any research intended to measure Web site security feature implementation over time, by providing a snapshot of the current status.
4 Methodology

This section of the report aims to provide the reader with an explanation of how the study is to be conducted, and why the chosen methodology is the most applicable in this particular circumstance.

4.1 Explanation of methodology

The general process of this methodology is separated into three different steps, each corresponding to a specific objective. The first step is to decide on which entities are to be included in the study. The second is to map the current best practices of the parameters chosen to be included in the study. The third is to gather empirical data on the parameters for the entities and, further, to analyse the data to evaluate compliance with the best practices determined in step 2.

According to Wohlin et al. (2012), there are three different strategies for empirical studies such as this one, namely survey, case study, and experiment. A survey approach is useful for collecting information to be able to describe, explain or compare variables. This is suitable for this study since the objective is to describe Web site security and to compare implementation between different groups. A case study would not suit this study to the same extent since case studies aim to explain phenomena that are not well understood, which is not the case when it comes to Web site security. Although Web site security might be complex, the features looked at are inherently well understood and the best practices agreed upon. Further, the thesis statement can only be answered by looking at the whole population to which it relates, not by deep diving into only a selected few organizations. Finally, experiments manipulate a variable in a controlled setting to examine what effect it has on a subject. This study is not performed in a controlled setting but in a real-world setting, meaning that the researcher cannot control certain elements in the environment. An experiment also makes use of a hypothesis and tries to verify or falsify it. This is not applicable in this study. (Berndtsson et al., 2008)

Given the above reasons, the chosen methodology is to perform a survey-based study focused on gathering the same types of data from each subject in a larger population. In the context of this study, the data refer to the Web site security implementations of governments and government agencies of member states of the EU, as well as EU-controlled Web sites. Unlike a classical survey, this survey will be conducted on computers rather than people, which entails certain advantages over classical surveys worth mentioning (see the example after this list):

(1) The standardization of the questions used in the survey is important when dealing with people, to ensure all subjects interpret the questions in the same way. This becomes a non-issue when you query a Web site server: the use of standardized protocols ensures that the question is understood and that an expected response is given.

(2) As highlighted by Berndtsson et al. (2008), a common issue for surveys is that the motivation for participation often is low and that high response rates are difficult to achieve. By performing the survey on Web site servers, an answer is guaranteed as long as the subject is online.

(3) Responder bias, where respondents tend to want to show a positive image of themselves, or tend to favour a neutral position on a question, is also a non-issue concerning this survey. Any bias from answers being influenced to be pleasing or unpleasing is non-existent when the responder is a computer.
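As a concrete illustration of such a machine-readable "survey question", the sketch below sends one standardized HTTP request and prints the structured response metadata the server returns. It only illustrates the principle of point (1), it is not one of the study's actual collection scripts, and the domain used is a placeholder.

```python
import urllib.request

# A standardized "survey question": the same HTTP request is sent to every
# subject, and the protocol guarantees a structured, comparable answer.
def ask_server(url: str) -> None:
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        print("Final URL:    ", response.geturl())   # where redirects ended up
        print("Status code:  ", response.status)
        print("Server header:", response.headers.get("Server", "not disclosed"))

if __name__ == "__main__":
    ask_server("https://example.eu")  # placeholder domain
```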
4.2 Selection of Scope

According to Robson & McCartan (2016), the population refers to all cases within the scope of the research question. The sample (the entities included in the survey) is selected from that population using a sampling technique. The population of this study consists of all Web sites of governments of member states of the EU, all agencies under those governments, and all EU-controlled Web sites. As stated by Robson & McCartan (2016), the larger the sample is in relation to the population, the lower the risk of error in generalization. This is the reason for striving for an inclusive survey, where as much of the population as possible is included in the study.

The first group in the population – governments – allows for total inclusion since they are not very numerous. The second group – government agencies of member states – is a larger group that is not as easily available. If all member states of the EU had registries available displaying all national agencies, total inclusion could be achieved here as well. However, not all relevant states have public registries of this kind, and the resources available for this final year project for a bachelor's degree are not sufficient to locate all national agencies for all the EU member states. The non-probability sampling technique of purposive sampling was therefore utilized to achieve a sample within this population. It is described by Robson & McCartan (2016) as a technique where the researcher chooses the sample based on satisfying the needs of the study. The population is divided into groups – each member state is a group – and from each group representative agencies which are comparable to one another were chosen. The goal is to find five agency types expected to exist in all member states of the EU to include in the study. To achieve this, the researcher's starting point was to find five well-funded agency types. It is not necessary for the study that the selected agencies are the most well-funded in each state. Instead, a generalization and an assumption have been made that even though the countries in question might distribute their spending differently between their agencies, it is still fair ground for comparison if the same types of agencies are selected. By referencing the Swedish Financial Management Authority (Utgifter i Statens Budget, 2021), the allocation of state resources in Sweden was used to find five agency types that are well-funded and have counterparts in all member states of the European Union. The implications of this selection are discussed in section 7.1. The agency types selected were:

1. Armed Forces
2. National Civil Police Agency
3. Prison Agencies
4. Public Employment Services
5. Taxation Agencies

Finally, the third group, consisting of EU-controlled Web sites, will not be handled in precisely the same way as the government agencies of each member state, since the EU as an international union simply is not comparable to individual nations. However, an approach similar to the first group of governments can be taken, including all EU institutions, agencies, and bodies. This is an appropriate measure since all EU institutions, agencies, and bodies reside on the same second-level domain – namely europa.eu – along with a few other types of Web sites relevant to the thesis statement:

1. Inter-institutional cooperation entities or services.
2. Sites providing access to the information and services of an official programme.
3. Sites providing access to a service or database with a well-established brand name.
4. Sites requiring high visibility for promotional purposes.
(The Europa Domain, 2021)
A total inclusion of all third-level domains under the europa.eu domain will be selected, thus including the entire population.

4.3 Data Analysis Methodology

The data gathering process aims to extract data from each Web site that can be compared to a best practice standard, clarified in section 5.2, and also give the ability to compare results between different subject groups (e.g., comparing countries with one another). The different metrics included in the study all generate nominal data, since the result of all queries to the Web site will be constructed similarly to yes and no questions. For example: Are you configured with HTTPS? Yes. Or: Do you support TLS version 1.3? No. Of course, this is not an interview with a server; details of which tools and queries will be used to extract this information are presented in section 5. The analysis of this study is focused on measuring the gathered data for frequency, for example, how many domains have HSTS implemented on their Web sites.

4.4 Validity

For the trustworthiness of the result, this section will explore the specific validity concerns in regard to this specific study by using the classification presented by Wohlin et al. (2012) to discuss to what extent the results are true and not biased by the researcher's subjective perspective.

The first validity threat concerns the reliability of measures, which Wohlin et al. (2012) explain to be of great importance for any study which includes measurement of some sort. This reliability can suffer if, for example, bad questioning is used. As further explained by Wohlin et al. (2012), as a general principle, the less human interaction included in measurements, the more reliable the measure usually is. If something is measured twice, the outcome should be the same. For this study, human error is avoided by using scripts and tools that ensure that the queries are constructed in a standardized manner, ensuring that all subjects are treated the same way. This also allows for the automation of certain tasks within the process. Testing of the scripts and tools used for data gathering will be conducted by comparing the output with recognized tools and methods to ensure functionality ahead of data gathering. This will be performed by choosing a number of domains included in the study and running them through external testing tools to validate the results. The standardization and automation of data gathering also mitigate any threat to the reliability of treatment implementation, which refers to test subjects being treated differently.

Furthermore, it is important to acknowledge that even though the scripts and tools produce identical responses when run directly after one another, the results will not be repeatable over time due to the constant change of the studied environment. Not only are adoption rates of protocols likely to change, but new consensus on best practices is almost certain to emerge over time as the threats, and the security protocols used to combat them, change the dynamic of the Internet. However, this is not a threat to validity since it is to be expected when analysing public Web sites available on the Internet; it is an ever-changing environment. Therefore, due to the nature of the posed thesis statement, the result is only presented as a current state as of 2021-05-11.
No in-depth statistical analysis will be performed since the data are only nominal. However, the sample used should be a true representation of the total population, since all entities are being tested for the population groups of governments and europa.eu domains, and generalization errors were mitigated with the sampling technique explained in section 4.2.

Other validity threats include maturation, history, repeated testing, responsiveness, and statistical regression. These types of human changes in responses have been briefly touched on in section 4. With the use of standardized protocols and the respondents being Web site servers, these types of validity threats do not affect this study.

Construct validity relates to what extent the operational measures that are studied really represent what the researcher has in mind and what is investigated according to the research questions. If, for example, the constructs discussed in interview questions are not interpreted in the same way by the researcher and the interviewed persons, there is a threat to construct validity. As mentioned before, this is mitigated by the standardization of the protocols used when querying the Web site servers. This type of validity threat could also include the studied parameters not providing sufficient information to be able to answer the thesis statement. Web site security is a broad subject, and best practices extend beyond the scope of this study. Hence, it has been decided to present the results as compliance with the best practices of the parameters included in the study, not as holistic compliance with Web site security best practices.

As previously mentioned in section 4.1, surveying computers guarantees an answer as long as the subject is online. However, there can be a multitude of reasons why a server is unresponsive (downtime, etc.) and it is possible that unresponsiveness could impact the result of the study. This is important to keep in mind when consuming the results of this study, since an unresponsive server yields a failed test as the scripts are designed. Due to the constraints of resource availability of this bachelor thesis, no mitigation of this validity threat was achieved.
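As an illustration of the frequency-based analysis described in section 4.3, the sketch below counts best-practice compliance for one parameter from a results file. The CSV layout (columns such as domain and hsts, with yes/no values) is an assumption made for this example and does not necessarily match the format produced by the collection and parsing scripts referenced in the appendices.

```python
import csv
from collections import Counter

def compliance_frequency(path: str, parameter: str) -> float:
    """Return the share of surveyed domains whose nominal result for `parameter` is 'yes'."""
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            counts[row[parameter].strip().lower()] += 1  # e.g. 'yes' / 'no'
    total = sum(counts.values())
    return counts["yes"] / total if total else 0.0

if __name__ == "__main__":
    # Hypothetical results file with one row per surveyed domain.
    share = compliance_frequency("results.csv", "hsts")
    print(f"HSTS best practice compliance: {share:.0%}")
```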
5 Implementation of Methodology

This section aims to provide the reader with a more detailed explanation of the methodology described in section 4, as well as the outcome of the questions presented as objectives to the thesis statement, where objective 1 correlates to section 5.1, objective 2 correlates to section 5.2, and objective 3 correlates to section 5.3.

5.1 Finding Domains

This section explains how the researcher conducted the task of finding the domains, providing the answer to objective 1 for all population groups.

5.1.1 Governments

As discussed in section 4.2, a total inclusion of the first population of governments is used in the study. To find the Web sites of all governments of the member states of the EU, a premade list from an official EU Web site was utilized. Each Web site was manually extracted and visited by the researcher to verify that the link was valid and current. The full list of government URLs is available in Appendix A (List of National Governments, 2021; The 27 Member Countries of the EU, 2021).

5.1.2 Government Agencies

The sample of government agencies included in the study was established to consist of (1) Armed Forces, (2) National Civil Police Agencies, (3) Prison Agencies, (4) Public Employment Services, and (5) Taxation Agencies. The process of finding the URLs of these agencies was to search for premade lists already containing links to the agency in question, preferably lists residing on official EU Web sites. Where no such list could be found, other sources were utilized to locate the correct Web sites. All resources used are referenced accordingly.

(1) Armed Forces
No official EU Web site was found containing a premade list holding this information. Instead, a Wikipedia page listing military and paramilitary personnel was utilized to locate the Web sites of the national armed forces of the EU member states. Each Web site was manually extracted and visited by the researcher to verify that the link was valid and current. Where no general agency was found for a nation, the ministry of defence was selected instead. The full list of Armed Forces URLs is available in Appendix B (List of Countries by Number of Military and Paramilitary Personnel, 2021).

(2) National Civil Police Agencies
No official EU Web site was found containing a premade list holding this information. Instead, two Wikipedia pages listing law enforcement agencies of different countries were utilized. Differences in policing structure are obvious between different countries; the objective was to, as accurately as possible, locate an agency equivalent to a national civil police. The lists were cross-checked against one another to find the accurate URL. Each Web site was manually extracted and visited by the researcher to verify that the link was valid and current.
The full list of National Civil Police Agency URLs is available in Appendix C (Law Enforcement by Country, 2021; List of Law Enforcement Agencies, 2021).

(3) Prison Agencies
No official EU Web site was found containing a premade list holding this information. However, EuroPris, an organization dedicated to promoting professional prison practice and officially supported by the Justice Programme of the European Union, provides just that through its web tool EPIS. Using this tool, each Web site was manually extracted and visited by the researcher to verify that the link was valid and current. In cases where no general prison agency was listed, the Web site of the ministry of justice was used instead. The full list of Prison Agency URLs is available in Appendix D (European Prison Information System, 2021).

(4) Public Employment Services
A list residing on the European Commission's Web site was located holding information on all the member states' public employment services. Each Web site was manually extracted from this list and visited by the researcher to verify that the link was valid and current. An exception was made regarding Belgium, where four different agencies were listed. Instead of selecting only one, not possessing the ability to make a fair choice, all four Web sites were included in the list. The full list of Public Employment Service URLs is available in Appendix E (Public Employment Services, n.d.).

(5) Taxation Agencies
A list residing on the EU's official Web site was utilized to find the Web sites of all national taxation agencies. Each Web site was manually extracted and visited by the researcher to verify that the link was valid and current. In cases where no general taxation agency was listed, the Web site of the ministry of finance was used instead. The full list of taxation agency URLs is available in Appendix F (Tax Authorities Contact List, 2021).

5.1.3 European Union Web Sites

As for the governments in section 5.1.1, the European Union controlled Web sites have a total inclusion of the population in the study. To find all subdomains of europa.eu, a web tool by Wolfram Alpha was utilized which presents all subdomains of a given domain along with visiting statistics for each domain. The tool provided the researcher with 78 individual subdomains of europa.eu, all of which were included in the study. Each Web site was manually visited by the researcher to verify that it was valid and current. This resulted in the removal of a few entries from the list due to duplicates and domains no longer being active. In total, 78 domains are included in this population in the study. The full list of europa.eu URLs is available in Appendix G (WolframAlpha, n.d.).
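Once collected, URL lists like those in the appendices need to be reduced to bare host names before they can be fed to measurement scripts. The sketch below shows one way this could be done with Python's standard library; the input file name and its one-URL-per-line format are assumptions made for the example, not a description of the author's actual workflow.

```python
from urllib.parse import urlparse

def to_hostname(url: str) -> str:
    """Reduce a collected URL (e.g. 'https://www.government.example/en/') to its host name."""
    if "://" not in url:
        url = "https://" + url  # urlparse needs a scheme to identify the host part
    return urlparse(url).hostname or ""

if __name__ == "__main__":
    # Hypothetical input: one collected URL per line.
    with open("collected_urls.txt", encoding="utf-8") as handle:
        hostnames = sorted({to_hostname(line.strip()) for line in handle if line.strip()})
    for host in hostnames:
        print(host)
```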
5.2 Finding Best Practices

This section presents the concluded best practices for each parameter included in the study. To establish the best practices of the parameters included in the study, a multitude of sources were consulted to find a consensus on the matter. The open standards organizations the Internet Engineering Task Force and the Internet Assigned Numbers Authority provided a significant basis in this regard. Further, industry bodies, eminent organizations in the industry, and research articles regarding the subjects were also consulted in establishing the current best practices.

5.2.1 DNSSEC

The first published standard of DNSSEC was made public by the IETF in 1997 (Eastlake & Kaufman, 1997). This was followed by a revision of the initial RFC in 1999 (Eastlake, 1999). The protocol suite was rewritten in 2005, spanning RFC 4033-4035 (Arends et al., 2005a, 2005b, 2005c). At this point, any holistic implementation of DNSSEC was still limited due to the fact that the root zone had not yet been signed. This was achieved in July of 2010 when ICANN signed the root zone, greatly simplifying the deployment of DNSSEC resolvers. As of February 2021, the Root Zone Database of IANA (Internet Assigned Numbers Authority) contains 1589 TLDs adopting DNSSEC, proving its worth as a best practice (IANA & IANA, 2021).

5.2.2 HSTS

The importance of using HTTPS is covered in sections 2.3 and 5.2.3, and the HSTS protocol helps ensure the use of HTTPS and mitigates mainly SSL-stripping man-in-the-middle attacks, where a secure HTTPS connection is converted into a plain text HTTP connection. However, HSTS does not come without limitations. The same idea of SSL-stripping can be used by an attacker if the user is accessing the Web site for the very first time: by stripping the HSTS header, the attacker essentially prevents the protocol from being activated. This limitation is addressed by the browser vendors by distributing HSTS preload lists within their commercial browsers. An HSTS preload list contains known HSTS-supporting Web sites for which HTTPS is used for the initial request. A simple workaround indeed; however, it is unwieldy due to its scalability issues, as this list cannot cover the entire Internet. There are ongoing discussions of a more scalable solution with HSTS policies announced via DNS, secured using DNSSEC, but no consensus has been established. The best practice is to have HSTS configured on the main domain, covering redirects to subdomains, and to have the domain present on HSTS preload lists (Hodges et al., 2012).

5.2.3 HTTPS

The importance of offering HTTPS support and moving away from HTTP without TLS is manifested by the many promotion efforts invested in this development. In 2014, Google started using HTTPS support as a ranking parameter in search results, an effort to encourage websites to implement HTTPS support (Ait Bahajii & Illyes, 2014). It is now common for web browsers to warn their users when visiting a website over HTTP (Schechter, 2016; Vyas & Dolanjski, 2017). Further underlining the efforts made by the industry to make the Web a more secure place is the work of Let's Encrypt, a free Certificate Authority started in 2013 and currently serving close to a quarter-billion Web sites (Let's Encrypt Stats, 2021). It is clear from the industry that HTTPS support is best practice.
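In the spirit of the best practices above (and of the author's collection scripts referenced in the appendices, which are not reproduced here), the following minimal sketch checks two of the measured properties for a single domain: whether a plain HTTP request ends up on HTTPS, and whether the final response carries a Strict-Transport-Security header. It uses only Python's standard library, treats the domain as a placeholder, and is an illustration rather than the methodology actually used in the study.

```python
import urllib.request

def check_https_and_hsts(domain: str) -> dict:
    """Follow redirects from http://<domain> and report HTTPS enforcement and HSTS presence."""
    request = urllib.request.Request("http://" + domain, method="GET")
    with urllib.request.urlopen(request, timeout=15) as response:
        final_url = response.geturl()  # URL after any redirects
        hsts_header = response.headers.get("Strict-Transport-Security")
    return {
        "domain": domain,
        "redirects_to_https": final_url.startswith("https://"),
        "hsts_header": hsts_header,  # e.g. 'max-age=31536000; includeSubDomains; preload'
        "hsts_present": hsts_header is not None,
    }

if __name__ == "__main__":
    print(check_https_and_hsts("example.eu"))  # placeholder domain
```

Note that a sketch like this only observes the header; checking presence on the browsers' HSTS preload lists, as the best practice in section 5.2.2 requires, would need a separate lookup against those lists.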