PRESERVING SECURITY AND PRIVACY: A WIFI ANALYZER APPLICATION BASED ON AUTHENTICATION AND TOR - DIVA
DEGREE PROJECT IN ELECTRONICS AND COMPUTER ENGINEERING, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2020
Preserving Security and Privacy: a WiFi Analyzer Application based on Authentication and Tor
ALEXANDRA KOLONIA
REBECKA FORSBERG
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
Abstract

Numerous mobile applications have the potential to collect and share user-specific information on top of the essential data handling. This is made possible through poor application design and improper implementation. The lack of security and privacy in an application is of main concern, since the spread of sensitive and personal information can cause both physical and emotional harm if it is shared with unauthorized people. This thesis investigates how to confidentially transfer user information in such a way that the user remains anonymous and untraceable in a mobile application. To achieve that, the user first authenticates to a third party, which provides the user with certificates or randomly generated tokens. The user can then use these as communication credentials towards the server, with the communication carried over the Tor network. Further, once the connection is established, the WiFi details are sent periodically to the server without the user initiating the action. The results show that it is possible to establish a connection with both random tokens and certificates. The random tokens took less time to generate than the certificates, whereas the certificates took less time to verify, which balances the overall performance of the system. Moreover, the results show that the implementation of Tor works, since the system can hide the real IP address and provide a random IP address instead. However, communication is slower when Tor is used, which is the cost of achieving anonymity and improving the privacy of the user. In conclusion, this thesis proves that combining proper implementation and good application design improves the security of the application, thereby protecting the users’ privacy.

Keywords

Network security, Privacy, Anonymity, Onion routing, Tor Organization, Wi-Fi data collection
Sammanfattning (Summary)

Many mobile applications are able to collect and share user-specific information beyond the essential data handling. This problem is made possible by poor application design and incorrect implementation. The lack of security and privacy in an application is therefore critical, since the spread of sensitive and personal information can cause both physical and emotional harm if it is shared with unauthorized people. This thesis investigates how user information can be transferred confidentially in a way that allows the user of the mobile application to remain both anonymous and untraceable. To achieve this, the user first needs to authenticate to a third party, which provides the user with randomly generated tokens or with a certificate. The user can then use these to communicate with the server, which is done over a Tor network. Finally, once the connection has been established, the WiFi details are sent periodically to the server; this happens automatically, without the user initiating the transfer. The results show that it is possible to establish a connection both with a certificate and with random tokens. Generating the random tokens took less time than generating the certificates, whereas the certificates took less time to verify than the tokens. As a result, the two methods performed evenly when the system is considered as a whole. The results further show that the implementation of Tor works, since the system can hide the real IP address and provide a random IP address instead. Communication through Tor does, however, make the system slower, which is the cost of improving the user’s privacy and achieving anonymity. In summary, this thesis shows that combining correct implementation with good application design can improve the security of the application and thereby protect the users’ privacy.
Nyckelord (Keywords)

Network security, Privacy, Anonymity, Onion routing, Tor Organization, WiFi data collection
Acknowledgments

We would like to thank our supervisor Cihan Eryonucu at the Royal Institute of Technology for his support and encouragement throughout this project, and for the time, effort and patience he devoted to answering questions and providing feedback and guidance whenever needed. Further, we would like to thank our examiner Panagiotis Papadimitratos at the Royal Institute of Technology for his guidance and advice in choosing the right topic and for supporting and encouraging us throughout the whole project. We would also like to thank our family and friends for their support and attention throughout these last 3 years.

Stockholm, June 2020
Alexandra Kolonia and Rebecka Forsberg
Authors
Alexandra Kolonia and Rebecka Forsberg
Information and Communication Technology
KTH Royal Institute of Technology

Place for Project
KTH Royal Institute of Technology
Stockholm, Sweden

Examiner
Panagiotis Papadimitratos
KTH Royal Institute of Technology

Supervisor
Cihan Eryonucu
KTH Royal Institute of Technology
Contents

List of Figures
List of Tables
List of Acronyms and Abbreviations

1 Introduction
  1.1 Background
  1.2 Problem
  1.3 Purpose
  1.4 Goals
  1.5 Research Methodology
  1.6 Limitations
  1.7 Structure of the thesis
2 Background
  2.1 Network Security
    2.1.1 Cryptography
    2.1.2 Public Key Encryption
    2.1.3 Authentication
    2.1.4 Passwords
    2.1.5 Security Tokens
    2.1.6 Transport Layer Security
  2.2 Privacy
    2.2.1 Anonymity
    2.2.2 Pseudonymity
    2.2.3 Unlinkability
    2.2.4 Onion routing
  2.3 WiFi
    2.3.1 SSID
    2.3.2 BSSID
    2.3.3 Signal
    2.3.4 The localization of the user
    2.3.5 The localization of the access points
  2.4 Related Work
    2.4.1 Participatory sensing
    2.4.2 Privacy-respecting applications
    2.4.3 Privacy policies in mobile application
    2.4.4 RSA Cryptosystem
    2.4.5 X.509 Certificates
    2.4.6 Tor
    2.4.7 Apps that provide WiFi details
  2.5 Summary
3 Method
  3.1 Research Process
    3.1.1 Defining the app which this project is based on
    3.1.2 Design of the system
    3.1.3 Authentication
    3.1.4 Tor
    3.1.5 WiFi Analyzer
    3.1.6 Evaluation of the implementation
  3.2 Data Collection
  3.3 Experimental Design/Planned Measurements
  3.4 Assessing reliability and validity of the method and data collection
    3.4.1 Reliability
    3.4.2 Validity
  3.5 Planned Data Analysis
    3.5.1 Data Analysis Techniques
    3.5.2 Software Tools
  3.6 Evaluation Framework
    3.6.1 Collection of data
    3.6.2 Evaluation of data
4 Implementation
  4.1 Selection of the base app
  4.2 System design
  4.3 Authentication
    4.3.1 Study Review
    4.3.2 Implementation
  4.4 Tor
    4.4.1 Study Review
    4.4.2 Implementation
    4.4.3 Tor integration
  4.5 Wi-Fi Analyzer
    4.5.1 Review
    4.5.2 Implementation
  4.6 Summary
5 Results and Discussion
  5.1 Authentication
    5.1.1 Established Connection
    5.1.2 Authentication of users
    5.1.3 Session Tokens
    5.1.4 Certificates
  5.2 Tor
    5.2.1 Performance
    5.2.2 Anonymity
  5.3 WiFi Analyzer
    5.3.1 User Interface
    5.3.2 Periodic Request
  5.4 Discussion
6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Limitations
  6.3 Future work
  6.4 Reflections
References
A Raw Data Collected
  A.1 Raw data from attempt to establish a connection
  A.2 Raw data for users to be authenticated
  A.3 Raw data for session token to be verified
  A.4 Raw data for obtaining a new session token
  A.5 Raw data for obtaining a new certificate
  A.6 Raw data to establish connection with Tor
  A.7 Raw data for IP addresses used when running with Tor
  A.8 Raw data for periodic requests times
List of Figures

2.1 An image of the onion routing.
2.2 An image of the trilateration illustration.
3.1 An image of the system and the different communication steps between the parties
4.1 Message flow between client and third party to register
4.2 Message flow between client and third party to login
4.3 Message flow for client and server communication using session tokens
4.4 Message flow for client and server communication using certificates
5.1 Execution time for attempt to establish a connection over multiple times
5.2 Execution time for users to be authenticated
5.3 Execution time for session token to be verified
5.4 Execution time for obtaining a new session token or a new certificate
5.5 Execution time to establish connection with Tor and with SSL sockets
5.6 IP addresses
5.7 Screenshots of authentication pages

Note: The figures are created by the authors if nothing else is stated.
List of Tables

3.1 Evaluation framework for the project.
5.1 Delays on periodic requests
5.2 Summary of the average execution times for each operation

Note: The tables are created by the authors if nothing else is stated.
List of Acronyms and Abbreviations

BSSID   Basic Service Set-Identifier
CA      Certification Authority
CRL     Certificate Revocation List
DNS     Domain Name System
GUI     Graphical User Interface
ITU-T   The International Telecommunication Union’s Telecommunication Standardization Sector
KTH     Kungliga Tekniska Högskolan
MCS     Mobile Crowd Sensing
OP      Onion Proxy
OR      Onion Routers
PKI     Public Key Infrastructure
PS      Participatory Sensing
RSA     Rivest–Shamir–Adleman
RSSI    Received Signal Strength Indicator
SPPEAR  Security and Privacy-Preserving Architecture for Participatory-Sensing Applications
SSID    Service Set-Identifier
SSL     Secure Sockets Layer
TCP     Transmission Control Protocol
TLS     Transport Layer Security
Tor     The Onion Router
UDP     User Datagram Protocol
Chapter 1
Introduction

More than 70 percent of mobile applications suffer from various security vulnerabilities, which can lead to sensitive information being leaked [1]. Such a data leak can cause physical or emotional harm if the data is shared with unauthorized people [2]. Further, these applications can not only store and process sensitive information such as Social Security numbers, Internet banking credentials or passwords [2], but also access other parts of the phone such as the user’s location, photo gallery and contact list [3]. Moreover, the cause of the leaked information is often poor application design or inappropriate implementation [3]. Therefore, security and privacy are essential in order to protect the users and their integrity [1].

1.1 Background

The right to privacy is central in life and protected by laws. Equally important is the power to control sensitive information that is collected and used by others. As the usage of and dependence on technology increase, it becomes increasingly difficult to keep sensitive information secret [4]. Different apps simplify life by providing useful functionality such as service updates on public transportation, real-time GPS navigation, etc. Most mobile apps depend on the internet, which allows developers to implement them in such a way that massive amounts of data can be stored in databases, analysed and shared with third parties [5].

Privacy can be protected using different techniques such as anonymity, pseudonymity, unlinkability and unobservability. Anonymity is achieved by the user being part of a larger set of users. The larger the group is, the harder it becomes to identify the individual. Pseudonymity, on the other hand, is achieved through the use of fake names such as identifiers. However, a user is not as untraceable as under anonymity, since sometimes it
is possible to link the fake identifier to the real person. Also, the user’s actions can be observed and linked to the same pseudonym, revealing patterns and thus identity. Unlinkability is achieved when several objects cannot be linked together. For example, if two messages are sent in a system, it should not be possible to determine whether they are related. Finally, unobservability is achieved when an observer cannot identify whether an item of interest actually exists. This means that, if a message is sent, it should not be possible for uninvolved parties to determine whether the message exists [6].

The Onion Router (Tor) organization is working towards enhanced privacy and better information security over the internet. It uses the onion routing protocol to encapsulate messages in multiple layers that are not accessible to all the nodes [7].

Studies highlighting privacy issues in mobile applications have found major issues in most of them [8]. More specifically, a study of the privacy policies of apps with functionality similar to the proposed application, which is described in the Problem section, concluded that users’ information is collected and shared with third parties [9].

"Wifi Analyzer" is an open source application which provides information about the user’s surrounding WiFi networks, such as signal strength. Its developers are aware of the security and privacy concerns. An evaluation of the code of the Wifi Analyzer app indicates that it does not communicate with a server [10].

"Orbot" is an open source application based on the concept of the Tor Organisation. The application protects the user’s security and privacy by encrypting and hiding the user’s Internet traffic [11].

1.2 Problem

In recent years, the lack of privacy in communications has been revealed. Mobile applications can collect and share user information on top of the data the app requires.
This violates the user’s privacy [9]. Numerous scandals [12] have been widely discussed in public, highlighting the need for better security to solve the related privacy problems. A brief study of the privacy policies of mobile apps that analyze WiFi networks, communicate with a server and require the collection of location data in order to function concluded that most of them store user data and share it with third-party companies [9]. Progress has been made for web applications in the area of hiding sensitive user data [7], in contrast to native applications.

Individuals use internet applications on multiple occasions daily, not only
at home, where they have stable internet access through their own WiFi, but also when they move to different locations. It is then important that they are able to get information about the WiFi spots and the quality of the internet provided. An application which can provide such information based on the input of multiple users is necessary. How can such an application be developed while preserving the security and privacy of the user?

1.3 Purpose

The aim of the project is to investigate how user data can be transferred confidentially and untraceably in a native app, using techniques that protect the user’s privacy. Specifically, the application collects information about WiFi networks and their properties, such as speed, and provides useful information derived from these data to the user. In order to preserve the user’s privacy, Tor is integrated into the application. Moreover, the data is sent to the server at specified times, without the user triggering the action. The data is digitally signed, and the server should be able to authenticate the client and vice versa to create secure communication. The application is built on top of the open source project "WiFi Analyzer" [10].

Furthermore, this project aims at developing the ability to conduct scientific research in the Security and Privacy field and to demonstrate relevant knowledge and understanding. The problem covered in this project demands the ability to search for, collect and evaluate information. In this way the authors strengthen their confidence in reading academic documents and analyzing the material found. It is important for the project to be written as a report, in order to learn how to present a problem, its solution and conclusions, and to allow for further discussion. Additionally, it is necessary to realize the importance of having a well-structured plan and framework in order to complete a given task on time.
Hopefully, the industry can be inspired by the implementation to use similar techniques in order to protect users’ privacy and to reevaluate its current implementations.

1.4 Goals

The goal of this thesis is to implement the mobile application described above, which transfers WiFi and user data to a server in a secure and privacy-preserving manner. To achieve this, we add a third party and a server to the system. The goal has been divided into the following three sub-goals:
1. Improve the Graphical User Interface (GUI) of an open source application that provides information about the reachable WiFi networks.
2. Implement mutual authentication between client and server for security reasons.
3. Send data from the developed native application to a server through the Tor network.

1.5 Research Methodology

The project is conducted as an experiment in order to investigate ways to transfer WiFi and user data to a server anonymously and achieve mutual authentication. Specific steps are followed to ensure that the project follows a natural flow and eventually achieves the intended outcomes. In short:

• A continuous literature study of the field is performed.
• Proper tools are selected, based on the research, for the implementation of the app.
• The application work starts with the implementation of the app, the coding part, which eliminates the user profile.
• Testing of the functionality of the app follows, focusing on the data transmissions. The working assumption is that security vulnerabilities are limited, in order to narrow down the problem to the data transmission process and simple authentication.
• Collection of data follows, by using the app in different locations in order to verify that no sensitive data is being stored.
• The obtained results are analysed to ensure that the app works as intended, e.g. whether the user can be traced.
• The conclusions from the experimental data are drawn and the findings are reported.

The work is organized using the Agile methodology [13]. The experimental parts are broken down into shorter time periods, which provides a better understanding of the work. For each period, the amount of work to be done is decided and the work is divided into tasks. This way the definition of complete is clearer and possible mistakes or bugs are avoided.
Both students share equal responsibility for this project, so a traditional method would not work, since tasks should be assigned by mutual agreement. Also, both of the
students are familiar with this method.

1.6 Limitations

Due to limited time, this thesis further develops an already existing application instead of building one from the beginning. Furthermore, the thesis focuses on functionality rather than design.

1.7 Structure of the thesis

The thesis is organized as follows. Chapter 2 presents a comprehensive theoretical background, for example information about network security, anonymity and pseudonymity, together with related work, in order to give the reader the fundamental knowledge needed to proceed with the thesis. Chapter 3 presents the methodology used in this project, including how the research process was carried out and how the data collection was implemented. Additionally, the validity and reliability of the method are discussed. Chapter 4 gives a precise description of the implementation and shows some parts of the code. Chapter 5 presents the results together with an analysis and discussion of the project. Lastly, Chapter 6 presents the conclusions derived from this work, together with reflections and suggestions for future work.
Chapter 2
Background

The purpose of this chapter is to present a detailed description of essential background in order to give the reader the knowledge required to proceed in this thesis. Sections 2.1 and 2.2 provide the reader with information about network security and privacy and the fundamentals needed to achieve them, describing topics such as cryptography, authentication and anonymity. Section 2.3 describes WiFi terms together with ways to calculate the coordinates of a user, followed by Section 2.4, where related work is presented. Lastly, the chapter ends with a short summary.

2.1 Network Security

The information distributed over networks can be sensitive, and its leakage can result in significant costs, such as financial costs or privacy violations [14]. Network security is the practice of preserving the integrity, availability and confidentiality of information distributed over a network [15]. More specifically, integrity means that a message sent over a network should remain unmodified and, if it has been altered, the receiver should be able to detect the change. Confidentiality ensures that the transmitted messages can be read and understood only by the sender and the receiver. Lastly, availability means that data and services should be accessible to authorized users [16].

These goals can be threatened by multiple attacks. Network security attacks are divided into two categories: active and passive. In short, in active attacks the intruder aims to disturb the network’s operations. This could, for example, be accomplished by a denial-of-service attack, where the intruder floods the receiver node with requests in order to occupy the network and prevent legitimate requests from going through. In passive attacks the intruder aims to gain, and possibly use, information from the system without disturbing the
system resources [17]. Examples are monitoring and eavesdropping, where the intruder monitors the network and finds information that is not intended for unauthorized people, without altering the data.

Another important principle in network security is authentication. Authentication identifies the communicating entity and verifies its claim of identity [18]. One of the most common practices for verifying an identity is the use of user names and passwords, due to the ease of implementation and low cost [16]. However, this approach has weaknesses too; for example, it is vulnerable to dictionary and brute-force attacks.

2.1.1 Cryptography

Cryptography is the practice of preserving security in a system by encoding and decoding messages [19]. The term cryptography is derived from the Greek words "κρυπτός" and "γράφειν", which together mean secret writing [20]. Encryption refers to converting data in such a way that it is only available to authorized people [19]. Encryption requires two essential components: an algorithm and a key. In order for two entities to be able to encrypt and decrypt their messages, they need to use the same algorithm [20].

2.1.2 Public Key Encryption

Public key encryption arises from two main limitations of symmetric encryption:

1. Symmetric trust: since the two entities share the same key in order to encrypt and decrypt messages, they need to be extremely sure that they trust each other [21].
2. Key establishment: in a symmetric encryption scenario, the two entities need to agree on a secret key in advance [21].

Public key encryption allows two entities to employ cryptography to secure the data they send, using a public key and a private key. The public key of the receiver is used to encrypt the outgoing message, whilst the receiver’s private key is used to decrypt the incoming message [21]. The public key is known to any entity, whereas the private key is known only to its owner.
Moreover, a node can verify the source of a message given the public key of the sender [19].

Since public keys are publicly known, possession of a key cannot identify its owner; consequently, entities need to be able to verify the authenticity of keys. As a result, there exist key distribution mechanisms which ensure that keys belong to specific entities [19].

Public Key Infrastructure

A Public Key Infrastructure (PKI) is a set of programs, procedures and security policies which employs public key cryptography and digital certificates
for secure communications. Furthermore, the PKI can identify users, create and distribute certificates, maintain and revoke certificates, distribute and maintain encryption keys, and allow encrypted communications. A PKI enables public key cryptography by distributing public keys while at the same time keeping the private keys private. For the PKI to operate successfully and ensure security, trust within this framework is required [22].

Nominally, the components of a PKI are: certificate authority, registration authority, certificate server, certificate repository, certificate validation, key recovery service, time server and signing server [22].

Digital Certificates

As mentioned above, there is a need to ensure the authenticity of public keys, in other words, to verify that a key belongs to the entity which claims to own it [19]. Digital certificates have been conceived in order to provide this authenticity [23]. Since digital certificates are widely used on the internet and handled by different authorities, the format needs to be the same; thus they usually follow the “X.509” standard [23].

The fields included in a digital certificate are: version of the certificate, unique serial number, algorithm ID, issuer, validity dates, owner, public key, ID of the issuing authority and ID of the owner [22].

Certificate Authority (CA)

Digital certificates require a trusted third party, the Certification Authority (CA) [23], which is responsible for their maintenance, issuance and sometimes even distribution [22]. The CA adds its digital signature to the certificate in order to validate it [23]. The CA then delivers the certificate to the owner and controls it until it expires [22]. If the CA stops trusting an entity whose certificate has not yet expired, it can revoke the certificate by adding it to the Certificate Revocation List (CRL) [22].
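The public key operations and the CA's signing and revocation steps described in this section can be illustrated with a minimal Python sketch. It is not the implementation used in this thesis: it uses textbook RSA without padding, a deliberately tiny and hypothetical key (P = 61, Q = 53), and a plain byte string instead of an X.509 structure. All names (`encrypt`, `decrypt`, `ca_sign`, `verify`) are assumptions made for illustration only.

```python
import hashlib

# Hypothetical toy RSA key, far too small for real use.
P, Q = 61, 53
N = P * Q                    # modulus, shared by the public and the private key
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent: (E, N) is the public key
D = pow(E, -1, PHI)          # private exponent: (D, N) is the private key (Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer 0 <= message < N with the receiver's public key."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Decrypt with the receiver's private key."""
    return pow(ciphertext, D, N)


def digest(data: bytes) -> int:
    """SHA-256 hash reduced modulo N so that the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def ca_sign(cert_body: bytes) -> int:
    """The CA signs the hash of the certificate body with its private key."""
    return pow(digest(cert_body), D, N)


def verify(cert_body: bytes, signature: int, serial: int, crl: set) -> bool:
    """Accept a certificate only if it is not revoked and the CA signature matches."""
    if serial in crl:
        return False
    return pow(signature, E, N) == digest(cert_body)
```

For example, `verify(body, ca_sign(body), serial, crl)` succeeds for an unrevoked serial number and fails once that serial number is added to `crl`. Because the modulus is tiny, different certificate bodies can collide on the same reduced hash; real deployments use keys of at least 2048 bits together with a standardized padding scheme.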
2.1.3 Authentication

Authentication is the process of proving the identity of a user or an entity, thereby preventing unauthorized entities from accessing the system [24]. Authentication is not the same as identification, which is the process of confirming the identity of an entity by requesting credentials from the user [24]. There are four major types of authentication:

1. Something you know: cognitive information [25]
2. Something you have: possession of items [25]
3. Something you are: physiological and behavioral attributes [25]
4. Where you are: location information [24]
These methods of authentication can be combined in order to enhance the system's security.

2.1.4 Passwords
A password is a secret sequence of characters [24] that allows users to identify themselves in order to gain access to the resources of a system [25]. If the password is obtained by an adversary, the adversary can access the system with the rights of the legitimate user [25]. Passwords need to be stored securely by the system; otherwise they can be retrieved by intruders [24]. For that reason, the National Institute of Standards and Technology has published a guide on how to store passwords securely [26]. One way to store passwords is to use a cryptographic hash function.

Hashing Passwords
When storing a password, instead of saving the original plain text, a one-way function is applied to it and the output is what is stored [27]. That prevents attackers from obtaining the actual passwords if they manage to get access to where they are stored [24]. With a one-way hash function, the hash code, i.e. the output, is of fixed length irrespective of the actual size of the password; consequently it provides no information about the length of the password [28]. A competent hashing algorithm can compute the hash code efficiently, while finding an input that produces a given hash code should be hard [28]. One of the most common hashing algorithms is SHA-256 [24], which is commonly used in signing certificates and in ensuring data integrity [28]. In short, SHA-256 converts a message of variable length into a fixed-size 256-bit digest [28]. Hashing prevents attackers from obtaining the actual passwords; however, if two users in the database have the same password, their hash codes will be the same. To avoid that, a random number, a salt, can be appended to the password, so that the hash codes differ after the hash function is applied.
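The salting idea described above can be sketched in Java with the standard `java.security` classes. This is a minimal illustration rather than the thesis implementation: the class name `SaltedHashSketch`, the 16-byte salt size and the sample password are assumptions made for the example. It follows the text in using SHA-256 with the salt appended to the password; deployed systems often prefer a deliberately slow password hashing function such as PBKDF2.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper class, named only for this illustration.
final class SaltedHashSketch {

    /** Generates a fresh random salt (16 bytes is an assumed, illustrative size). */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Returns SHA-256(password || salt): the salt is appended to the password. */
    static byte[] hash(String password, byte[] salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(password.getBytes(StandardCharsets.UTF_8));
            md.update(salt);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    /** Recomputes the hash from the stored salt and compares it with the stored hash. */
    static boolean verify(String attempt, byte[] salt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(attempt, salt), storedHash);
    }

    public static void main(String[] args) {
        byte[] saltA = newSalt();
        byte[] saltB = newSalt();
        byte[] hashA = hash("example-password", saltA);
        byte[] hashB = hash("example-password", saltB);
        System.out.println("digest bits: " + hashA.length * 8);
        System.out.println("same password, equal hashes? " + Arrays.equals(hashA, hashB));
        System.out.println("verification succeeds? " + verify("example-password", saltA, hashA));
    }
}
```

The salt is not secret; it has to be stored next to the hash so that `verify` can recompute the same digest at login.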
Furthermore, salting makes it impossible to tell whether a user uses the same password across different platforms, and it increases the difficulty of dictionary attacks [24].

2.1.5 Security Tokens
Security tokens are devices which can identify a user [29]. The type of authentication that tokens provide is "something you have" [30]. Tokens can be used to generate one-time passwords, each of which can be used only once by an entity, i.e. an attacker will not be able to reuse one to obtain unauthorized access [29]. For tokens to be useful, the necessary infrastructure needs to exist, for example if
the token were a smart card, a card reader would be essential [30]. There exist stand-alone tokens, which do not require external devices to evaluate them, such as the one-time password example [30]. In these cases, security is improved if the user must identify itself to the token before obtaining the one-time password [30].

2.1.6 Transport Layer Security
The Transport Layer Security (TLS) protocol forms a secure connection between two entities over a network, such as a client and a server [31]. It is widely chosen in order to provide confidentiality, authenticity, integrity and privacy over the network [32]. There have been different versions of the protocol over the years, each aiming to make improvements and eliminate vulnerabilities of previous releases [32]. Initially the protocol was called Secure Socket Layer and its first version was SSLv2; the newest version today is TLSv1.3 [32].
The protocol is based on two sub-protocols: the handshake protocol and the record protocol [33]. The handshake protocol establishes or resumes secure sessions by authenticating the server and the client and negotiating the cipher suite [31]. The authentication is based on public key cryptography and requires a PKI, digital certificates, and private and public keys [34]. The record protocol is responsible for securing the application data and ensuring the messages' integrity by using the session keys that were created during the handshake protocol [34].

2.2 Privacy
A fundamental principle of privacy in applications is not to share information that the user does not want to share. To prevent such sharing, anonymity can be used together with unlinkability, so that actions cannot be traced back to the user and the user's identity is not revealed [35]. In the following sections anonymity, pseudonymity, unlinkability and onion routing are explained.

2.2.1 Anonymity
Anonymity means that a subject cannot be identified among other subjects.
Specifically, a subject here means a person in a crowd or an item in a group of similar items [36]. Anonymity is frequently used when an individual wishes to achieve personal privacy [37].

2.2.2 Pseudonymity
Pseudonymity is when a user uses a substitute identity instead of the real identity, because the real identity has to remain unknown [36]. However, users can still be recognised through usernames, usage patterns or other identifiers that could be linked back to them [38].
2.2.3 Unlinkability
Unlinkability, in terms of anonymity, can be divided into three types. The first is sender anonymity, which means that the items sent cannot be traced back to the sender. The second is receiver anonymity, in which the items sent cannot reveal the receiver [39]. Finally, there is so-called relationship anonymity, which means that it is not possible to trace either the sender or the receiver of the messages; in short, it is not possible to identify the two communicating parties [40].
In terms of pseudonymity, the level of unlinkability depends on the way pseudonyms are used. A pseudonym can sometimes reveal the identity of the user, with the result that all of his/her actions can be linked to the user [41]. One-time-use pseudonyms, on the other hand, are used only once and can be, for example, a random number [39]. However, the use of multiple one-time-use pseudonyms is not totally secure, as they can be semantically or syntactically linkable. Semantic linkability is when it is possible to predict what the new pseudonym of a user will be, whilst syntactic linkability is when it is possible to link two one-time pseudonyms to the same user, thereby threatening the anonymity of the user [42].

2.2.4 Onion routing
Onion routing is a technique for redirecting data traffic over public networks in order to achieve anonymous connections [43]. Because the user remains anonymous and the data are encrypted, the technique is resistant to attacks such as traffic analysis and eavesdropping, which can violate the user's privacy. Onion routing is used through secure socket connections in both directions [44].
A message in an onion routing network is redirected through three different nodes called Onion Routers (ORs) until it arrives at the final destination [45], as shown in Figure 2.1. The communication is encrypted in several layers, and each OR decrypts its corresponding layer.
First, the sender encrypts the package and sends it to the first OR, which is the only OR the sender contacts directly. The first OR then decrypts its layer and learns which OR the package needs to be forwarded to. The second OR decrypts the next layer in order to learn the next OR the package needs to be forwarded to. Finally, the third and last OR receives the package, decrypts its layer in order to learn the final destination of the package, and sends it there. In conclusion, each OR in the scheme is only aware of its predecessor and successor [46].
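The layering described above can be illustrated with a short Java sketch using the standard `javax.crypto` API. It is only a simplified model under several assumptions: the class `OnionLayersSketch` and its method names are hypothetical, each OR is assumed to already share an AES key with the sender, AES-GCM is chosen here for convenience, and the next-hop addresses that a real onion would carry inside each layer are omitted.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical class for illustration; real onion routing also negotiates keys per hop.
final class OnionLayersSketch {
    private static final int IV_BYTES = 12;   // commonly used GCM nonce size
    private static final int TAG_BITS = 128;  // GCM authentication tag size

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Adds one layer; the output is IV || ciphertext. */
    static byte[] wrap(SecretKey key, byte[] payload) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(payload);
            byte[] out = new byte[IV_BYTES + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, IV_BYTES);
            System.arraycopy(ciphertext, 0, out, IV_BYTES, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Removes one layer, as a single OR would do with its own key. */
    static byte[] peel(SecretKey key, byte[] layer) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, layer, 0, IV_BYTES));
            return cipher.doFinal(layer, IV_BYTES, layer.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            // Thrown, for example, when the wrong key is used for this layer.
            throw new IllegalStateException("layer could not be decrypted", e);
        }
    }

    /** Sender side: the innermost layer is for the exit OR, the outermost for the entry OR. */
    static byte[] buildOnion(byte[] message, SecretKey entry, SecretKey middle, SecretKey exit) {
        return wrap(entry, wrap(middle, wrap(exit, message)));
    }

    public static void main(String[] args) {
        SecretKey entry = newKey(), middle = newKey(), exit = newKey();
        byte[] onion = buildOnion("wifi details".getBytes(StandardCharsets.UTF_8),
                entry, middle, exit);
        byte[] afterEntry = peel(entry, onion);        // entry OR removes its layer
        byte[] afterMiddle = peel(middle, afterEntry); // middle OR removes its layer
        byte[] plain = peel(exit, afterMiddle);        // exit OR recovers the message
        System.out.println(new String(plain, StandardCharsets.UTF_8));
    }
}
```

Because each OR holds only its own key, the entry OR cannot read the inner layers, which mirrors the property that each OR knows only its predecessor and successor.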
Figure 2.1: An image of the onion routing.

2.3 WiFi
WiFi is a wireless networking technology that allows different devices to exchange information with each other and over the internet. Internet connectivity through WiFi is established via a connection to a wireless router that allows the device to interact with the internet. An access point can extend a wireless network, and a router can itself act as an access point. Further, an access point can, for example, provide security together with several other functions [47]. The following information is retrieved by the scanner of the application and sent to the server.

2.3.1 SSID
SSID stands for "Service Set Identifier" and is the technical term for a wireless network name. The name chosen by the owner of the network needs to be meaningful, so that an individual can distinguish it from other networks and connect to the preferred network [48].

2.3.2 BSSID
BSSID stands for "Basic Service Set Identifier". Within a WLAN there can be many access points, and an identifier called the BSSID is used to distinguish them from each other. The identifier is the MAC address of the specific access point [49].
2.3.3 Signal
The WiFi signal strength indicates how reliable the connection is, which in turn affects the internet speed that can be utilized. Further, the strength can be affected both by the distance to the router and by the surrounding materials, such as walls, which block the signal [50].

2.3.4 The localization of the user
To calculate the coordinates of the user, GPS gives the best localization results outdoors.

2.3.5 The localization of the access points
The coordinates of the routers are calculated with the multilateration algorithm, which locates a router from its distances to users [51]. The multilateration algorithm is sometimes called the trilateration algorithm, which indicates that the coordinates are calculated from only three reference points [52]. In order to use the multilateration algorithm, the distances between the access point and at least three users need to be measured. These distances can be calculated from the Received Signal Strength Indicator (RSSI). When the distances d1, d2 and d3 have been calculated, the model can be viewed as shown in Figure 2.2: the users are the centers of circles, and each circle has a radius equal to its user's distance to the access point. The intersection of the three circles shows where the access point is [53].

Figure 2.2: An image of the trilateration illustration.
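As a rough illustration of the circle model in this section, the following Java sketch computes the intersection point for exactly three users. Each user i at (x_i, y_i) with distance d_i defines the circle (x_i - x)^2 + (y_i - y)^2 = d_i^2; subtracting the first circle equation from the other two cancels the quadratic terms and leaves a 2x2 linear system, which is solved here with Cramer's rule. The class and method names are hypothetical, the coordinates are assumed to be planar (e.g. metres in a local grid rather than raw GPS degrees), and the distances are assumed to be exact, which RSSI-based estimates generally are not.

```java
// Hypothetical helper for illustration; not the method used in the cited works.
final class TrilaterationSketch {

    /**
     * Returns {x, y} of the access point given three user positions (xi, yi)
     * and their distances di to the access point.
     */
    static double[] locate(double x1, double y1, double d1,
                           double x2, double y2, double d2,
                           double x3, double y3, double d3) {
        // Circle 2 minus circle 1, and circle 3 minus circle 1, written as A * [x, y]^T = b.
        double a11 = 2 * (x2 - x1), a12 = 2 * (y2 - y1);
        double a21 = 2 * (x3 - x1), a22 = 2 * (y3 - y1);
        double b1 = d1 * d1 - d2 * d2 + x2 * x2 - x1 * x1 + y2 * y2 - y1 * y1;
        double b2 = d1 * d1 - d3 * d3 + x3 * x3 - x1 * x1 + y3 * y3 - y1 * y1;

        double det = a11 * a22 - a12 * a21;
        if (Math.abs(det) < 1e-12) {
            // The three users lie on one line, so the system has no unique solution.
            throw new IllegalArgumentException("user positions are collinear");
        }
        double x = (b1 * a22 - a12 * b2) / det;
        double y = (a11 * b2 - b1 * a21) / det;
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        // Users at (0,0), (6,0) and (0,8), each 5 units away from the access point.
        double[] ap = locate(0, 0, 5, 6, 0, 5, 0, 8, 5);
        System.out.println(ap[0] + ", " + ap[1]);
    }
}
```

With more than three users or noisy distances, a least-squares fit over all circle equations would be the natural extension of this sketch.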
In order to calculate where the access point is, the following equations need to be solved, where $(x_i, y_i)$ are the coordinates of user $i$, $d_i$ is that user's distance to the access point and $(x, y)$ are the unknown coordinates of the access point:

\[
\begin{aligned}
(x_1 - x)^2 + (y_1 - y)^2 &= d_1^2 \\
(x_2 - x)^2 + (y_2 - y)^2 &= d_2^2 \\
(x_3 - x)^2 + (y_3 - y)^2 &= d_3^2
\end{aligned}
\]

These equations can be solved with matrix methods, which give a unique solution, i.e. the coordinates of the access point [51].

2.4 Related Work
2.4.1 Participatory sensing
In their paper "SPPEAR: Security & Privacy-Preserving Architecture for Participatory-Sensing Applications", Gisdakis, Giannetsos and Papadimitratos discuss Participatory Sensing (PS) systems, which need to be secure in order to protect the users' personal data and integrity. They show that their SPPEAR architecture provides a set of security and privacy properties. Further, they use Tor for the purpose of maintaining anonymity [54].
In another paper, "Security, Privacy & Incentive Provision for Mobile Crowd Sensing Systems", which is a follow-up to the SPPEAR paper above, Gisdakis, Giannetsos and Papadimitratos discuss the new paradigm of Mobile Crowd Sensing (MCS). The openness of MCS systems makes concerns about privacy and integrity justified, as the users are expected to share a large amount of data. Previous works have handled these aspects separately instead of as one unit. Their work aims at a holistic solution that values the security and integrity of the users, as they might share personal data. Additionally, the MCS system needs at the same time to be resistant against unauthorized users or servers in order to retain security [55] and ensure data trustworthiness [56]. A related extension of a cellular authentication protocol was investigated in [57, 58, 59], and protective mechanisms for registration were considered in [60].
2.4.2 Privacy-respecting applications
In the paper "Privacy-respecting reward generation and accumulation for participatory sensing applications", Dimitriou discusses the importance of maintaining users' privacy in a mobile application and the challenge of making users participate. Further, he presents a reward system that motivates users to be active, and analyses the resulting challenge of how to reward anonymous users. Moreover, he discusses how to reward the users while at the same time retaining their anonymity [61]. Requirements were surveyed in [62], and remuneration and incentives were investigated in [55, 63].
2.4.3 Privacy policies in mobile applications
Privacy policies in mobile applications can be both long and uninformative, which hampers the users' decisions about what they want to share and what not. In the paper "Benchmarking Privacy Policies in the Mobile Application Ecosystem", Kandil, Akker, Baarsen, Jansen and Vulpen present a Privacy Policy Benchmark Model. The model evaluates the transparency together with the amount of data in a data collection, in order to help users avoid sharing information they do not want to share [64].

2.4.4 RSA Cryptosystem
The RSA (Rivest–Shamir–Adleman) cryptosystem is deployed in many commercial systems and is commonly used for secure data transmission, providing privacy and assuring authenticity [65]. In the RSA cryptosystem there is a pair of keys, an RSA public key and an RSA private key, which can be used to encrypt and decrypt data [66]. The public key is written as $(N, e)$ and the private key as $(N, d)$, where $N = pq$ is the product of two large prime numbers $p$ and $q$, and $e$ and $d$ are two integers satisfying $ed \equiv 1 \pmod{(p-1)(q-1)}$ [65]. To encrypt a message $M$, the sender computes $C = M^e \bmod N$ with the receiver's public key, and the receiver can then decrypt the message by computing $M = C^d \bmod N$ with its own private key [65].

2.4.5 X.509 Certificates
An X.509 certificate is a digital certificate that is extensively used in today's computer security [67]. The International Telecommunication Union's Telecommunication Standardization Sector (ITU-T) states that “Virtually all security services are dependent upon the identities of the communicating parties being reliably known, i.e. authentication”. This is critical when it comes to web transactions, as it is very important to know the identity of both the sender and the receiver [19]. Without certificates, TLS cannot verify the identities of the two sides; it can only encrypt the web page sent between them.
Therefore, X.509 certificates can be used together with TLS in order to solve this identity problem [19]. The certificate is issued by a CA that has identified the person as a trustworthy owner of an encryption key [68]. An encryption key, also called the public key, is part of a key pair in which the other part is the decryption key, also called the private key. The public key is used by the sender to encrypt a message, whilst only the corresponding private key, held by the receiver, can decrypt the message [69]. Furthermore, the key pairs can be used for authentication. If the sender in addition uses its own private key for encryption, the receiver can verify the sender's identity by decrypting with the sender's corresponding public key. Integrity can be
achieved through the use of digital fingerprints. The fingerprint is an encrypted message digest that makes it possible to check that the message has not been altered [70]. In conclusion, confidentiality, integrity and authentication can be improved by the use of digital certificates [71]. Modern PKIs can also provide privacy protection through the use of pseudonyms, e.g., earlier [72, 73] or more recently [74, 75].

2.4.6 Tor
Tor stands for The Onion Router [7] and is a network protocol that implements the original onion routing method described above, with a few modifications in order to achieve better distribution and security. Tor is an open source network and is therefore free to use for anyone who has the software called Onion Proxy (OP) [76]. The network consists of a large number of volunteer routers, from which three are chosen to build a scheme for the user. The three nodes are named the entry, middle and exit node [7].

2.4.7 Apps that provide WiFi details
There are several WiFi-related applications on the market today. These apps can provide information about the network, such as the signal strength, and measure the upload and download speed [77]. They can even show free networks on a map and give directions to them; such functionality is provided by the app "WiFi Map" [78]. Others can provide a comparison between available networks, show radio frequencies and show the Ethernet address of the router [79].
There are also apps that are related to WiFi but do not provide the above information. For example, "Fing" [80] is an app that shows users which devices are using their network. If the network speed is slow, the user can detect whether anyone is using the network without permission and in that case shut them out [79].
"Wifi Analyzer" is an open source app [10] that is used in this project.
The app can show the signal strength of nearby networks and show whether the signals overlap, so that the network administrator can avoid interference by using different channels [79].

2.5 Summary
This chapter has presented concepts related to security, privacy and WiFi. In the security part, a description of cryptography was given, together with how it is used to achieve security. Further, the role of certificates was presented together with authentication and how passwords are stored in a secure way. In the privacy section, two types of anonymity were mentioned along with a description of how this can be achieved with onion routing. Moreover, in the WiFi part, WiFi
terms were defined together with how it is possible to calculate the coordinates of an access point. Finally, related work within security and privacy was presented.
Chapter 3
Method
This chapter describes the method and every step taken in this thesis, so that the study can be replicated. In section 3.1 the research process is presented in detail, followed by a description of the data collection in section 3.2. Further, the test environment is described in section 3.3, and the reliability and validity methods come next in section 3.4. The chapter ends with the planned data analysis and the evaluation framework in sections 3.5 and 3.6 respectively.

3.1 Research Process
This section lists the steps conducted in order to carry out this research.

3.1.1 Defining the app which this project is based on
At the start of the project it was decided that an existing app would be enhanced with two features: authentication and communication with a server using Tor. The requirements on the app were the following:
• is open source
• provides WiFi details
• is implemented for Android devices
• lacks the intended features, i.e. the authentication and the communication with a server
• respects the user's privacy

3.1.2 Design of the system
Then, a literature study and a revision of knowledge gained from courses relevant to networked system security, privacy and network communication were
conducted. The main principles of security and privacy were clarified, as was the main idea of how to design and implement a system with mutual authentication that communicates using Tor. The system, i.e. how the project should be implemented, was then designed. The system is shown in Figure 3.1; more details about it are provided in section 4.2.

Figure 3.1: An image of the system and the different communication steps between the parties

3.1.3 Authentication
Literature review
A literature review was conducted in order to base the implementation of the authentication feature on scholarly literature. The sources were found using a combination of the Google Scholar search engine [81] and the KTH Library [82]. Keywords such as "authentication", "password authentication", "network security", "SSL" and "TLS" were searched.

Implementation
The first feature that was implemented was the authentication. It started from a basic implementation in which the client communicated with the third party using plain sockets, and was then built up using secure sockets to ensure security and privacy. Secure sockets, which meant that the communication was based on Transport Layer Security, required digital certificates to be issued, distributed and handled.
Once the communication was secure, code supporting registration and authentication was written. First, the user needs to register with the third party
by sending a request with a username and password. If the username does not exist, the third party responds that the registration was successful; otherwise it responds that the username already exists. After that, the user should be able to authenticate and log in using a username and a password. Since passwords were used, the next step was to store them safely.
The client was then authenticated to the third party; to be authenticated to the server as well, certificates and tokens were used. The third party provided the client with a token or a certificate, which the client could use to prove to the server that it is an authorized user. The server would then verify the validity of the token or the certificate with the third party.

Tools and Libraries
The language used to implement the different components in order to achieve authentication was Java [83]. This language was preferred because the developers were familiar with it and because the app which this project is based on was written in Java. The environment used to write the code was Android Studio [84] for the client and IntelliJ IDEA Community Edition 2019.3.2 [?] for the third party and the server.
Java has its own built-in security library [85], which was used in multiple parts of the implementation. Complementary to that, the Bouncy Castle Crypto APIs [86] were used.

Testing
In order to test whether the authentication was working as intended, test cases needed to be defined. For example, what happens if the client does not own the necessary certificates, or if the third party or the server does not own the required certificates? Does it work properly when all the requirements are fulfilled? What happens if a user tries to register with a username that already exists, and can a user register if the username has not been used before? Is a user logged in if the user provides a valid username and password, and what happens when the password is wrong or the user does not exist?
Are tokens correctly generated and how long does it take for the server to authenticate them? Are certificates correctly generated and how long does it take for the server to authenticate them? What happens if the user keeps reusing the same token? What happens if the user gets multiple tokens upon authentication? What happens if the user gets multiple certificates upon authentication? What happens if the user gets a new token every time it uses one? What happens if the user gets a new certificate every time it uses one? The test cases were assessed multiple times, manually, by the developers of the project.
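The registration, login and token steps described in this section can be sketched as a small in-memory model in Java. This is a hedged illustration and not the thesis code: the class `ThirdPartySketch`, its method names, the 32-byte token size and the salted SHA-256 storage are assumptions, and the rule that a token is accepted only once is a choice made for this example. Network transport (TLS sockets) and certificates are left out, and the class is not thread-safe.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical in-memory model of the third party; all names are illustrative.
final class ThirdPartySketch {

    private static final class Account {
        final byte[] salt;
        final byte[] hash;
        Account(byte[] salt, byte[] hash) { this.salt = salt; this.hash = hash; }
    }

    private final SecureRandom random = new SecureRandom();
    private final Map<String, Account> accounts = new HashMap<>();
    private final Set<String> validTokens = new HashSet<>();

    /** Returns true if the user was registered, false if the username already exists. */
    boolean register(String username, String password) {
        if (accounts.containsKey(username)) {
            return false;
        }
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        accounts.put(username, new Account(salt, hash(password, salt)));
        return true;
    }

    /** Returns a fresh random token if the credentials are valid, otherwise empty. */
    Optional<String> login(String username, String password) {
        Account account = accounts.get(username);
        if (account == null || !MessageDigest.isEqual(account.hash, hash(password, account.salt))) {
            return Optional.empty();
        }
        byte[] raw = new byte[32];
        random.nextBytes(raw);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        validTokens.add(token);
        return Optional.of(token);
    }

    /** Called on behalf of the server; in this sketch a token is accepted only once. */
    boolean verifyToken(String token) {
        return validTokens.remove(token);
    }

    private static byte[] hash(String password, byte[] salt) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(password.getBytes(StandardCharsets.UTF_8));
            md.update(salt);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Several of the test questions above map directly onto calls of this model, for example registering the same username twice or presenting the same token to `verifyToken` a second time.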
3.1.4 Tor
Study Review
Before starting to implement the Tor service within the app, it was necessary to obtain the right knowledge. Thus, multiple scholarly sources were reviewed. The search engines used to find information were again Google Scholar and the KTH Library. Examples of the keywords that were searched are "tor", "onion routing" and "privacy". In addition, GitHub projects relevant to Tor were evaluated, and previous work on Tor was searched for on the "DiVA portal" [87].

Libraries
There are different libraries which allow Tor to be integrated within an application. The ideal characteristics for a library were:
• to be written in Java
• to have good documentation
• to be maintained
The libraries that were taken into consideration were:
• Orchid [88]
• Orbot [11]
• Tor Onion Proxy Library [89]
• Stem [90]
Among these, the library that best fitted the above requirements was the Tor Onion Proxy Library. It is written in Java, which allows it to be simply imported into the Java application; there were a few examples in the repository explaining how it should be used; and it is not completely abandoned, in other words fixes seem to be made from time to time.

Implementation
The Tor feature was implemented by correctly importing the library, ensuring that the configuration was properly set and using the functions provided by the library. The strategy was to start from a basic functionality to make sure that the library worked as intended and had been correctly configured; it was then enhanced with code that supports the services provided by Tor in the application.

Testing
The implementation of Tor was tested to ensure that privacy is preserved. Multiple test transmissions of data from the client to the server were needed to check whether Tor works most of the time and whether the data was received by the server.
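As an illustration of what such a test transmission can look like on the client side, the sketch below sends one message through a local SOCKS proxy using only the standard `java.net` API. It does not use the Tor Onion Proxy Library and is not the thesis code: the class `TorSocksSketch` and its methods are hypothetical, and it assumes that a Tor client is already running and exposing a SOCKS port locally (127.0.0.1:9050 is the usual default of a stand-alone Tor client; a library-managed proxy may choose a different port).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; assumes a Tor client is already running locally.
final class TorSocksSketch {

    /** Describes the local SOCKS proxy offered by the Tor client. */
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, new InetSocketAddress(host, port));
    }

    /** Sends one message to the server through the proxy. */
    static void sendViaTor(Proxy tor, String serverHost, int serverPort, String message)
            throws IOException {
        try (Socket socket = new Socket(tor)) {
            // An unresolved address leaves the name lookup to the proxy
            // instead of performing a local DNS query.
            socket.connect(InetSocketAddress.createUnresolved(serverHost, serverPort), 60_000);
            OutputStream out = socket.getOutputStream();
            out.write(message.getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        Proxy tor = socksProxy("127.0.0.1", 9050);
        // Placeholder host and port; replace with the address of a reachable test server.
        sendViaTor(tor, "server.example", 4444, "test transmission");
    }
}
```

On the server side, the address reported for the accepted connection (for example via `Socket.getInetAddress()`) would then be expected to belong to a Tor exit node rather than to the client.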
Then, the source IP address was retrieved by the server and it was verified that it is not the real IP address of the client. Moreover, the performance of Tor was evaluated.

3.1.5 WiFi Analyzer
Review
Upon selection of the app that this project is based on, a more detailed review of the app was needed. A study was conducted in order to understand exactly how it is implemented, that is, the structure of the code and the functions used to retrieve the WiFi details from the reachable access points.

Implementation
The functionality of the WiFi Analyzer was enhanced by allowing the application to send the WiFi details of the reachable access points, together with their location, to the server. That needed to be done periodically and without the app necessarily being in use. These data could then be used and visualized by the server.

Testing
It is assumed that the implementation of the WiFi Analyzer is correct and that it can identify the access points and their details. Therefore, this thesis does not extensively test whether the routers and their details coincide with reality. It was tested whether the data was sent periodically to the server.

3.1.6 Evaluation of the implementation
In the end, when all the features were implemented and tested, they were merged together. Follow-up testing was needed to make sure that nothing stopped functioning after the merge. The same tests that were defined for each individual feature were evaluated.

3.2 Data Collection
There was limited data collection. The data collected during the project was needed to test the functionality of the app. The app was not distributed to the public for testing, due to time limitations and mostly because there was no need at this stage, since the functionality of the app could be tested by the developers. Furthermore, no sensitive information was stored.
The data collected was stored locally on the machine where the server was running and on the Android phone that was used for testing. For authentication, data such as usernames and passwords were stored in the authentication database, where passwords were encrypted. Certificates and session tokens were issued and stored locally. The WiFi details that were sent from the client