Residue-Free Computing - Logan Arkema* and Micah Sherr - Privacy Enhancing Technologies ...
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Proceedings on Privacy Enhancing Technologies ; 2021 (4):389–405 Logan Arkema* and Micah Sherr Residue-Free Computing Abstract: Computer applications often leave traces or rather to provide additional features and functionali- residues that enable forensic examiners to gain a de- ties. For example, modern operating systems maintain tailed understanding of the actions a user performed access times to quickly locate the most recently modi- on a computer. Such digital breadcrumbs are left by fied files; these access times help an examiner determine a large variety of applications, potentially (and indeed how the computer was used. Log-based filesystems are likely) unbeknownst to their users. This paper presents designed to facilitate fast data recovery, but also leave the concept of residue-free computing in which a user remnants of erased files and more generally allow exam- can operate any existing application installed on their iners to reconstruct the filesystem at various points in computer in a mode that prevents trace data from be- time. ing recorded to disk, thus frustrating the forensic pro- More generally, the traces left by operating systems cess and enabling more privacy-preserving computing. and applications leads to a status quo in which it is In essence, residue-free computing provides an “incog- exceedingly difficult to use a computer without leav- nito mode” for any application. We introduce our im- ing a readily available record of one’s actions. Coun- plementation of residue-free computing, ResidueFree, termeasures that enable greater privacy protections are and motivate ResidueFree by inventorying the poten- piecemeal, and are usually constrained to a particular tially sensitive and privacy-invasive residue left by popu- application—a notable example is the “incognito” or lar applications. We demonstrate that ResidueFree al- “privacy” mode of modern web browsers that are de- lows users to operate these applications without leaving signed to remove local traces of web activity left by the trace data, while incurring modest performance over- browser. More general solutions do exist [10, 12], but heads. they require modifications to the Linux kernel and fail to capture some types of application residue. (We ex- Keywords: privacy; forensics; anti-forensics plore the related literature in more detail in the next DOI 10.2478/popets-2021-0076 section.) Received 2021-02-28; revised 2021-06-15; accepted 2021-06-16. This paper introduces residue-free computing, a privacy-preserving (or anti-forensics) mode of operation for modern operating systems without requiring kernel 1 Introduction modifications. In residue-free computing, the user uses their normal operating system and has access to their Many of the actions that users perform on their com- existing files. The user may optionally choose to start puters leave digital traces. In a computer forensics in- any installed program in residue-free mode, which en- vestigation, these traces form digital breadcrumbs that ables the program to run on top of a union filesystem. allow a forensic examiner to gain a fairly detailed un- The application has read access to all existing data on derstanding of what the user did with their computer— any installed filesystem, but file modifications (includ- which applications and files were accessed, the times ing deletions) are made to a volatile filesystem stored in and potentially the duration of their accesses, and more memory (i.e., RAM). Upon exiting the application, the generally, the actions performed by the user. filesystem modifications are permanently erased. Con- Computer forensics is aided by modern operating ceptually, residue-free computing provides an incognito systems that tend to leave an enormous volume of these mode for any installed application. breadcrumbs. Operating systems leave traces not nec- Residue-free computing assumes an atypical threat essarily to ease the job of a forensic examiner, but model. It assumes a cooperative user who wants to pre- serve their privacy, and an application and operating system that do not actively attempt to prevent the ap- plication from executing in residue-free mode. That is, *Corresponding Author: Logan Arkema: Georgetown residue-free computing is designed to work on appli- University, Email: lja39@georgetown.edu Micah Sherr: Georgetown University, Email: cations and operating systems as they currently exist. micah.sherr@georgetown.edu We assume as our adversary a forensic examiner who is
Residue-Free Computing 390 able to take filesystem snapshots of the filesystem before Motivating examples. To motivate the use of and after the execution of an application in residue-free residue-free computing, we consider two example sce- mode. We describe our adversary in more detail below, narios. but in brief, we do not protect against a network ad- Privacy from invasive cohabitants: A computer user versary who learns which applications are being used shares a residence with an individual (for example, an by examining network traces, nor do we protect against abusive partner) who attempts to monitor the user’s malware running on the user’s device that attempts to computer usage. This adversary may be technically so- record actions. Our goal is to prevent our snapshot ad- phisticated, but is not necessarily a trained forensic ex- versary from learning which applications were used and aminer. The user is also not especially technically so- which files were accessed. phisticated, but is aware that ResidueFree is installed To be most useful, residue-free computing should on their computer. not require any program modifications and should be The user uses Skype. Although the user may not compatible with all software already installed on the be aware that Skype stores comprehensive log files that computer, including its operating system. Increased contain chat transcript and Skype call metadata (in- privacy should not come at a significant performance cluding participants and call durations), the user for- penalty; running an application in residue-free mode tunately opted to use ResidueFree to operate Skype. should incur limited overheads. Finally, residue-free Since Skype was used in residue-free mode, the cohab- computing should be user-friendly, allowing a user to itating adversary can neither identify that Skype was easily opt to use residue-free computing mode (or not) used nor learn with whom the user communicated. for any application. Privacy from a knowledgeable forensic investigator: There already exist techniques that achieve some of An investigative journalist enters a border crossing the goals of residue-free computing. For example, the when their computer is seized and imaged. The bor- user can simply run an application in a virtual ma- der crossing agent examines the journalist’s computer chine and use checkpointing to roll back filesystem and to learn what they are investigating. Fortunately, the memory changes after the application exits. Or, the user journalist used ResidueFree to run the VLC media could use a live CD operating system such as The Am- player, so the log of the videos that they viewed (in- nesic Incognito Live System (Tails) [2] that is tailored cluding some interviews streamed from the web) are not to frustrate forensic investigations by preventing traces available to the agent, as they otherwise would have had from being recorded to non-volatile storage. However, ResidueFree not been activated. these solutions incur high usability costs as they require Contributions. In summary, this paper makes the the user to significantly modify their behavior (to the following contributions: point of using an entirely different operating system) in – The design of residue-free computing: a mode of op- order to gain some privacy protections. eration that provides an “incognito mode” for any We present the design and implementation of application; ResidueFree, an instantiation of residue-free comput- – ResidueFree, an open-source implementation of ing that allows users to operate their existing applica- residue-free computing. ResidueFree is available tions (on their existing operating systems) in residue- as free open-source software and is available at free mode. We perform an in-depth forensic investi- https://larkema.github.io/residuefree/; gation in which we examine every persistent file cre- – A study of the residue (i.e., forensic traces) left ated or modified during a run in residue-free mode, by popular applications, and an examination of and find that ResidueFree leaks only minimal infor- how ResidueFree prevents the collection of these mation: namely, that ResidueFree was used. The ap- traces. plication being used and the affected files (both read and modified) are invisible to the forensic examiner. We show through extensive benchmark-based evalu- ation that ResidueFree incurs moderate-to-limited 2 Related Work overhead. Finally, we provide a simple and intuitive in- terface for ResidueFree for the Gnome desktop: users The literature on privacy-enhancing methods of com- run an application in residue-free mode by right-clicking puting is rich and diverse. In this section, we describe the application icon and selecting “Run in Residue- how residue-free computing fits in the context of this Free.” existing literature.
Residue-Free Computing 391 Most related to residue-free computing are from traditional files stored on the main filesystem to PrivExec [22] and TpriVexeC [10] which have the containerized filesystems, which themselves are stored shared goal of providing an incognito-like privacy mode on the main filesystem. In short, containers provide at for any application. Although residue-free computing, best a level of indirection, but do not significantly ob- PrivExec, and TpriVexeC all use a union filesys- fuscate filesystem changes. A knowledgeable forensic ex- tem, they differ in several important respects: unlike aminer could easily examine containerized filesystems. PrivExec and TpriVexeC, ResidueFree does not (We address the additional difficulty of securely deleting require kernel modifications and works out-of-the-box these containerized filesystems below.) with popular Linux distributions. We achieve this by Live CDs achieve many of the same privacy goals as leveraging existing efforts in Linux containerization. Ad- residue-free computing, which we describe in more de- ditionally, residue-free computing offers additional pri- tail in §3. However, both live CDs and VMs incur large vacy guarantees over both PrivExec and TpriVexeC: usability costs, as they require the user to maintain at residue-free computing obfuscates which applications least two computing environments: one for ordinary use, were run and it conceals which files were read by the and one for more private computing. Live CDs and VMs application running in privacy-mode; both are exposed prevent the user from making a quick impromptu deci- by PrivExec and TpriVexeC. Like PrivExec and sion to run an existing application (along with its exist- TpriVexeC, we also provide a performance evaluation ing configuration) in a more privacy-preserving mode, of our approach and explore its overheads; however, since configuring the live CD or the VM requires a non- unlike the prior work, we also conduct a forensics in- trivial amount of work. In contrast, residue-free comput- vestigation using filesystem snapshots to empirically ing enables the user to run any application in a privacy- demonstrate the efficacy of our technique. Finally, preserving mode on demand, at the user’s whim. This ResidueFree offers a forensics mode that allows a user allows the user to run any installed application as al- to explore the residues left by different applications. ready configured, without leaving a digital trace for a potential future forensics investigation. Techniques that use a separate privacy- preserving environment. As detailed in Private web browsing modes. Residue-free com- Garfinkel’s survey of anti-forensic techniques, there puting is partially inspired by the private web browsing are many techniques a user could employ to disrupt a modes of modern web browsers. These modes prevent potential future forensic investigation [14]. Residue-free websites from accessing cookies stored in non-private computing belongs to a class of techniques that attempt mode, and it erases browsing history, search history, to “minimize the footprint[s] [14]” available to computer and cookies upon exit. However, private browsing modes forensic tools. In Garfinkel’s survey, he considered two have historically not removed all forensic traces [27, 28]. footprint minimization techniques: the use of live CDs— Prior work has also shown that many users are un- or operating systems that boot from a removable device aware of private browsing modes, and many of those (e.g., a CD)—and virtual machines (VMs). In partic- who are aware do not understand the actual privacy ular, Tails [2] is a live CD operating system designed protections offered by these modes [13]. In particular, to provide strong privacy protections by avoiding any Habib et al. show that many users of private browsing writes to non-volatile storage and routing all network modes overestimate their protections from targeted ad- communication anonymously via Tor [9]. Similarly, us- vertisements and user tracking [16]. ing a virtual machine allows a user to potentially destroy Residue-free computing can be viewed as a form traces by reverting to an earlier snapshot. This latter of private or incognito mode for any application. Ar- approach however entails significant risk, since it relies chitecturally, however, it is very dissimilar to browser- on the secure erasure of the unsaved machine state; in based privacy modes. As discussed in more detail in §5, practice, it is unclear that such information is actually residue-free computing uses redirect-on-write filesys- unrecoverable. tems and volatile filesystems stored on RAM-disks to Similarly, so-called lightweight VMs such as con- ensure that all file modifications are transient and re- tainers (with Docker being a noteworthy example) can moved after the application exits. Private web brows- provide strong application isolation and ensure that ing modes require specific functionality in the applica- filesystem modifications are not directly applied to the tion (i.e., the web browser) to track file modifications system’s underlying filesystem. However, these contain- and ensure that they are removed after the browser ers do not eliminate traces; they merely shift the trace exits its privacy mode. We argue that residue-free
Residue-Free Computing 392 computing offers stronger guarantees than application- niques for securely removing data [26]. Much of the liter- specific solutions since the former operates at the oper- ature on secure deletion focuses on mobile devices (cf. [3] ating system/filesystem level, ensuring that any and all and [11]) and their NAND-based flash file systems. filesystem modifications made by a program running in Our implementation (see §6) sidesteps the difficul- residue-free mode are transient. In contrast, application- ties of erasing data on flash-based storage media and specific solutions require careful attention by the pro- versioning filesystems by storing all file modifications grammer to identify and revert all modifications. on volatile memory. Steganographic file systems. The goals of residue- Redirect-on-write filesystems. Residue-free com- free computing are also similar to those of stegano- puting makes extensive use of redirect-on-write filesys- graphic filesystems (also called plausibly deniable stor- tems. A redirect-on-write filesystem makes a virtual age systems) [4]. Steganographic filesystems typically copy of an underlying base filesystem. Read operations have multiple layers of files, with a given layer be- are usually directed to the underlying base filesystems, ing exposed by providing its corresponding key while while writes are made to a modified copy stored in a all other layers remain hidden. Conceptually, stegano- separate storage area. Subsequent reads to modified files graphic filesystems provide a mechanism where a co- are then directed to this separated storage area. Popular erced individual—for example, a detainee at a border implementations of redirect-on-write filesystems include crossing—could provide a key that reveals benign con- UnionFS [25], aufs [18], and overlay2 [6]. tent while obscuring the existence of the more sensitive Designed primarily to support efficient snapshotting covert data. There are several deployed steganographic (also called checkpointing) rather than anti-forensics, filesystems, with notable examples including StegFS [20] redirect-on-write filesystems are by themselves insuffi- and (the now defunct) TrueCrypt [1]. However, these cient for achieving the goals of residue-free computing. systems do not hide the fact that the steganographic The storage area that records filesystem modifications filesystem is installed. Czeskis et al. show that an op- constitutes a (fairly comprehensive) digital trace of ap- erating system’s regular functionality may betray the plication and data use. existence of the hidden volume, and demonstrate such attacks against TrueCrypt [8]. Chen et al. extend the no- tion of plausible deniability to include invisibility, which also conceals evidence of the steganographic file sys- 3 Threat Model and Goals tem [7]. Our system model includes the user, who is the opera- In theory, steganographic filesystems enable users tor of the machine; the operating system; the application to conceal the digital breadcrumbs left by their applica- to be run in residue-free mode; the filesystem that con- tions by obscuring the existence of the entire filesystem tains the operating system and the application1 ; and on which the application resides. In practice, however, the adversary, who conducts a forensic investigation. steganographic filesystems require maintaining multiple computing environments—one for an overt “cover” OS, Goals. The primary goals of residue-free computing and a separate environment for covert application usage. are to prevent the adversary from (1) identifying which This entails a significant and persistent effort by the application was used in residue-free mode and (2) identi- user. Residue-free computing attempts to lighten this fying any files that were modified (including file creation workload by enabling on-demand privacy-preserving op- and deletion) or accessed by an application operating in eration of existing applications. We do not require the residue-free mode. For ease of exposition, we consider a user to maintain multiple copies of data or applications, single use of residue-free mode in this paper, but our and only require that the user proactively starts an ap- goals naturally extend to sequential or concurrent use plication in residue-free mode when they want enhanced of residue-free computing: for example, the adversary privacy protections. should not be able to tell which applications were run in residue-free mode during their investigation. Secure deletion. Residue-free computing requires that any filesystem modifications are undone after a program running in residue-free mode terminates. On modern log-based and versioning filesystems, this can be 1 In practice, the user’s computer can have multiple filesystems. (perhaps) surprisingly difficult [23]. Reardon et al. pro- For ease of exposition, we consider the filesystem to be one log- vide a comprehensive survey of the challenges and tech- ical volume in this paper.
Residue-Free Computing 393 An important secondary goal of residue-free com- in residue-free mode exits) and optionally the first puting is ease-of-use. Although this is far more difficult filesystem snapshot (taken before the program spawned to quantify, residue-free computing should not require in residue-free mode begins execution) reveals nei- significant work or expertise on the part of the user. ther which application was run in residue-free mode Residue-free computing should be compatible with the nor which files were accessed (read, modified, created, user’s existing operating system and applications, and deleted, etc.) by that application. all of the user’s data (i.e., all data stored on their filesys- We emphasize that residue-free’s security proper- tem) should be available to an application operating in ties are met after the program running in residue-free residue-free mode. The behavior of an application run- mode terminates; during that program’s operation, we ning in residue-free mode should be identical to that ap- do not attempt to hide that it is running or what files plication running outside of residue-free mode, except- it accesses. ing for the existence of trace data not being available The adversary is permitted to learn that the com- after the program exits. Starting a program in residue- puter supports residue-free mode (e.g., that Residue- free mode should be intuitive and not require a high Free is installed) and the time that the user launched level of technical expertise. residue-free mode; which specific application(s) were Finally, operating in residue-free mode should not run remains hidden. incur significant overhead. The performance of a pro- gram operating in residue-free mode should be on par with its operation outside of residue-free mode. 4 Motivation: Case Studies of Threat model and adversary capabilities. We consider an adversary who has access to snapshots Popular Application Residues of the filesystem before and after an application runs To motivate the need for residue-free computing, we first in residue-free mode. The adversary has knowledge of investigate the frequency and degree to which popu- residue-free computing, and can apply any forensic tools lar applications leave residues. We also examine these or analyses to the snapshots. We assume the adversary residues to attempt to gauge the level of privacy loss has, at minimum, knowledge of publicly documented or information leaked to forensic examiners. Conceptu- forensic artifacts in the operating system, such as those ally, this provides a baseline of how much an examiner presented by Nishida [21]. can learn without the protections offered by residue-free The adversary is not permitted to eavesdrop on the computing. computer’s network communication or monitor the com- We selected applications across a wide range of cat- puter while an application is running in residue-free egories, based on their ranking in Ubuntu’s Software mode. In general, “live” or real-time forensics is consid- application. We focused our selections on popular ap- ered out-of-scope for residue-free computing. The goal plications, where popularity was based primarily on our of residue-free computing is to remove the residue left own experiences with these applications and the ease by applications on the filesystem in order to subvert a at which we could find independent information that forensic investigation. It does not protect against ad- hinted at the application’s popularity (e.g., blogs and versaries who use other means (e.g., network eavesdrop- YouTube videos made by users). While we attempted ping) to ascertain a user’s computer usage. to identify quantitative application metrics for Ubuntu We assume that the operating system (including the desktop applications, we could not identify a definitive implementation of the filesystem) and any applications metric similar to an Apple App Store or Google Play do not attempt to detect and thwart residue-free com- download metric. The applications, which are listed in puting. While applications should be agnostic to (and Table 1 in the Appendix, were installed either using even ignorant of) residue-free computing, we do not at- the default packages from Ubuntu Software on Ubuntu tempt to conceal its existence from an application that 18.04 (if available) or the version available on the appli- actively tries to detect it. Similarly, residue-free com- cation’s web site (if not). puting does not protect against spyware or other forms To perform our investigation, we used the forensic of malware that may be running on the user’s computer. mode of ResidueFree, which we describe more fully in Security properties. Residue-free computing meets §6. In brief, we used a redirect-on-write filesystem to the above two goals: a forensic examination of the isolate all files that the application modified while in second snapshot (taken after the program operating use. We also provided unique inputs, which we call ca-
Residue-Free Computing 394 nary strings (e.g., “gorgonzola penguins”) to the appli- Most of this data is stored without the user’s knowl- cations to determine whether the applications’ residues edge or explicit consent, which is not to say that captured these inputs. Their presence in an application’s these applications are categorically flawed. Much of the residue suggests that a forensic examiner would have residue applications leave is used to provide additional detailed information about the user’s inputs to that ap- features or performance benefits to users, and these ap- plication. plications do not consider an attacker with future ac- We manually inspected the application residues to cess to the user’s machine in their threat models. How- search for both our canary strings as well as other ob- ever, a user seeking to manually remove the details from vious information about the user’s session (e.g., times- their session would be hard-pressed to do so for every tamps, chat logs, names of transferred files, log entries, application. Most of the residue we found is stored in etc.). hidden directories (i.e., directories whose names begin We emphasize that our manual analyses do not al- with ‘.’), in files multiple directory levels deep, and/or low us to exhaustively determine the information con- in files without descriptive names. Rather than task a tained within an application’s residue. The presence of user with finding every file that may contain information canary strings tells us definitively that the user’s inputs from their session, we provide a solution that automati- are captured, but the absence could indicate that the cally discards all changes to the filesystem made during information is somehow otherwise recorded (e.g., in an their session. encoded form). As a result, the residues we identify are a subset of all the potential residues that could be un- covered by a trained forensic examiner (especially one well-versed in a particular application), and should be 5 Architecture taken as conservative estimates of the trace data left by Residue-free computing is composed of three main com- popular applications. ponents: a completely volatile RAM disk, a union Our findings are summarized in Table 1 in the Ap- filesystem [25], and an isolated execution environment. pendix. We found that these common user applications These components must not write to the user’s on- left a significant amount of residue on the filesystem disk filesystem nor modify any filesystem metadata and that, if left on disk for an attacker to read, could re- should sufficiently interact with the user’s system to veal significant data on how the user used the applica- provide a nearly indistinguishable user experience from tion. For example, Spotify leaves all songs the applica- running the application outside of residue-free mode. tion displayed to the user and the user’s social media This latter interaction requires special care, since the information on the filesystem; Skype records detailed application’s functionality may require it to communi- information from the user’s audio, video and instant cate with other processes, but such interprocess commu- message conversations, including message contents, re- nication (IPC) should not result in leaked information cipient usernames, and timestamps; Discord stores the about the application running in residue-free mode. As channels a user participated in; and VLC tracks and a result, system processes may need to be modified dur- updates timestamps on recently played media files. ing residue-free computing to eliminate sensitive logging Privacy-focused applications, such as Signal, Tele- and protect against information the operating system gram, Brave Web Browser, Tor Web Browser, and may otherwise leave on the user’s disk. other web browsers operating in “private” or “incog- The execution environment and RAM disk will both nito” mode do not record detailed data on the user’s contain information from the user’s residue-free session activity, but they sufficiently modify the filesystem to that must be kept private, while the union filesystem clearly denote that the user ran the applications and is a logical component that facilitates file access with- when the user ran them. Unfortunately, designing these out moving data to a new physical device. At the same private browsing modes is difficult since developers need time, residue-free must have access to the physical com- to track (and subsequently, eliminate) any state nat- puter hardware necessary to run user programs without urally kept by the browser. Subtle errors can lead to having the system log sensitive information. Figure 1 tracking. For example, Solomos et al. recently showed visualizes this architecture. that malicious websites could bypass incognito mode on We define a completely volatile RAM disk as a all major browsers using favicons [27]. ResidueFree filesystem physically hosted on RAM that can reliably avoids such tracking by eliminating all persistent data. have all of its contents made irrecoverable. To securely
Residue-Free Computing 395 Isolated Execution Environment processes, to provide a functional user experience with All File Calls as few performance costs as possible. In addition, sys- System Resources Union Filesystem User System tem processes may be modified to provide additional Read-Write Read-Only I/O Network privacy guarantees without limiting their functionality. CPU RAM In summary, a user should not notice that they are User Volatile RAM Disk Filesystem Misc System using residue-free computing while the underlying oper- Residue-Free Procs ations do not modify the user’s filesystem. Logical Operations 6 Implementation Private Residue-Free Session System processes with logging disabled Computer Hardware We implement the residue-free computing architecture using an encrypted RAM disk, the unionfs copy-on- Fig. 1. The architecture of residue-free computing. write filesystem, and a Docker container. We carefully configure these components to enable access to the erase the contents of RAM disk, an implementation user’s system resources and additionally modify select could either overwrite every bit of the disk or encrypt system processes. Our proof-of-concept implementation, the contents of disk in real time with an irrecoverable which we refer to as ResidueFree, implements the ba- key (we discuss the associated overhead costs of this im- sic requirements of residue-free computing while pro- plementation choice in §7.3). Any data written to the viding additional features to the user that go beyond RAM disk should be made unrecoverable quickly after architectural requirements. While this implementation a residue-free computing session ends and must be com- incurs moderate overhead and performance costs, par- pletely erased after the user’s system powers off or enters ticularly on file operations, it overwhelmingly meets the sleep mode. architecture’s privacy goals. The union filesystem takes the volatile RAM disk ResidueFree is released as free open-source soft- and merges it with the user’s filesystem to create a logi- ware and is available to download at https://larkema. cally unified filesystem interface while accessing two dif- github.io/residuefree/. ferent physical devices. The union filesystem must have ResidueFree operates on Linux, and has been read-write access to the RAM disk and read-only ac- tested with Ubuntu 18.04 LTS with kernel 5.4.0.48. As cess to the user’s filesystem. Conceptually, this allows noted in §3, ResidueFree is designed for standard sys- the application running in residue-free mode to appear tem configurations and cannot provide privacy guaran- to have full (i.e., read-write) access to the user’s regular tees when system administrators install enhanced log- filesystem. Every write a process using the union filesys- ging capabilities that are specifically designed to cap- tem makes must be written to the RAM disk and these ture and record all computer activity (e.g., kernel-level writes must be readable to processes that use the union provenance tracking [5, 24]). We emphasize that the filesystem (so long as the RAM disk is still intact). The “stock” versions of popular distributions (e.g., Ubuntu, union filesystem can read from the user’s filesystem for RedHat/Fedora, Debian, Mint, etc.) should be largely data that has not been created or changed in the iso- compatible with ResidueFree, but may require small lated execution environment, and these reads must not changes to how ResidueFree modifies system pro- modify any file data or filesystem metadata on the user’s cesses. 2 Finally, ResidueFree requires administrative regular filesystem. The union filesystem does not store privileges (e.g., sudo) to switch into residue-free mode; any data; it merely mediates access between the two the application itself runs with the user’s regular (non- storage devices. administrative) privileges. Finally, the isolated execution environment must We implement the volatile RAM disk by encrypting run on top of the union filesystem and should pro- files as they are written to RAM. We mount a temporary vide a nearly indistinguishable user experience from the filesystem (tmpfs), then mount an encrypted filesystem user’s system. The environment must ensure that all (ecryptfs [17]) on top of it. ResidueFree initializes calls to the filesystem are made to the union filesystem and should be sufficiently isolated to minimize interac- tion with user processes running outside of residue-free 2 For example, our implementation targets logs written by the mode. However, this isolation must be balanced with Gnome desktop environment, so implementations targeted at sufficient access to system resources, including system KDE distributions would have to account for KDE-specific logs.
Residue-Free Computing 396 and mounts the encrypted file system with a random tions to access abstract unix sockets set up by system key, and adds the key and its cryptographic hash to processes.3 These steps allow processes running inside the keyring. ResidueFree enables access to the files in ResidueFree to have transparent access to daemons, the tmpfs filesystem only while the ecryptfs filesystem GUI components, and other critical system resources. remains mounted. The tmpfs filesystem is initially set While these steps prevent applications running in- to one gigabyte, but the size can be configured by the side ResidueFree from writing data to disk, operating user. In edge-cases where the tmpfs filesystem is larger system features outside of ResidueFree may still write than available RAM, data in the tmpfs filesystem may sensitive data to disk, especially given that Residue- be written to swap space. Though this is not ideal, the Free intentionally provides access to system processes. encryption provided by the ecryptfs filesystem prevents To address this challenge, ResidueFree modifies the any meaningful residue from being retrieved by an ad- main filesystem’s mount options, mlocate, the system versary. journal daemon, the syslog daemon, the system ap- We construct the union filesystem using the unionfs port service, the user’s pulse audio daemon, the user’s package [25]. Unfortunately, Docker currently does not keyring, and the user’s Gnome shell. Before starting allow root (i.e., /) filesystem mounts, which is necessary the Docker container, ResidueFree remounts the main to ensure that the application running in residue-free filesystem with the noatime (no access time) option mode functions as it would if it were not in residue-free to ensure that execute and access operations are not mode. We sidestep this constraint by separately mount- tracked inside ResidueFree. ResidueFree updates ing the computer’s top-level directories. The union mlocate’s configuration file to not update file names un- mounts have the copy-on-write option set (among other der a ResidueFree mount point, restarts the system options) so that files modified in the Docker session are journal daemon with all journal logs written to none copied in their entirety to the RAM disk. Once a mod- (instead of the default disk), and stops the syslog and ified or newly created file is on the RAM disk, future apport services from running. ResidueFree also stops read operations read from RAM rather than the old ver- the user’s pulse audio and keyring daemons and then sion on the user’s filesystem. After mounting, Residue- starts them again inside the ResidueFree container.4 Free sets the permissions of the directories in the union Finally, ResidueFree sets the current Gnome shell to filesystems to match the permissions in the user’s filesys- stop recording application use and sets the file where tem. Gnome stores application use to read-only. A particularly challenging aspect of ResidueFree After ResidueFree exits, the ecryptfs is imme- is to provide an isolated execution environment while diately unmounted. At this point, none of the data simultaneously allowing the application running in in ResidueFree’s write-cache is logically recoverable, residue-free mode to interact with the necessary sys- as the key for its encrypted contents is not stored tem processes/daemons to ensure the program oper- in any process and ecryptfs clears the keyring when ates correctly. Rather than construct our own isolated the corresponding filesystem is unmounted. The tem- execution environment, we use Docker, which provides porary filesystem unmount follows, and all directories strong isolation and has a mature codebase. We mount that ResidueFree creates as well as the RAM disk’s the /dev filesystem inside the Docker container and (encrypted) contents are removed. ResidueFree then run the container as privileged to allow the applica- tion to interface with the system’s devices. However, simply mounting the union filesystems and /dev into a Docker container still does not provide sufficient access to the user’s system resources for a functional user expe- rience. To ensure sufficient access, we bind-mount criti- cal unix sockets from the host filesystem into the union 3 The most important of these is the user’s session dbus. Certain filesystems and keep ResidueFree in the same network applications will only interact with the abstract socket connec- namespace as the host. All sockets in /tmp/.X11-unix/, tion to the user’s dbus and refuse connections to the filesystem /tmp/.ICE-unix/, and the /run/user// directo- socket connection. 4 Pulse audio, by default, tracks the names of applications that ries, as well as the /run/lock and /run/dbus sockets, produce sound in a file. Applications can store user information, are bind-mounted into their respective union filesystem such as their login state, in the keyring. When these applications directories. In addition, running ResidueFree in the restart inside ResidueFree, these file changes are written to the same network namespace as the host allows applica- RAM disk.
Residue-Free Computing 397 restores the seven system modifications to their original Forensic mode. To increase ResidueFree’s use- state.5 fulness, our implementation also includes a “forensic” To address unexpected exits and ensure the proper mode in which, instead of writing all modified files to removal of any residue, ResidueFree handles inter- RAM, the user can opt to write all modified files to their rupt and termination signals by executing this cleanup filesystem. Conceptually, forensic mode allows the user process. ResidueFree also launches a detached back- to determine exactly what residue is left after running ground process that will execute this cleanup process an application. This mode operates identical to normal if ResidueFree exits and cannot do so itself (i.e., af- ResidueFree, except that it does not mount a tmpfs ter receiving a kill signal). In a scenario where the en- nor ecryptfs filesystem. Instead, ResidueFree’s foren- tire computer unexpectedly shuts down, the encrypted sic mode mounts the read-write portion of the unionfs filesystem and RAM disk unmount - leaving the residue filesystems in a folder of their choosing. When forensic unrecoverable. mode exits, the write-cache is unmounted and all its As we describe in more detail in §7.1, our evaluation files and directories have their owner set to the user. of the user’s filesystem contents reveal that no residue To reduce the risk of any potentially malicious files exists from the application run in residue-free mode, nor added during the ResidueFree session, ResidueFree is there any evidence that the particular application was removes the execute permissions from executable files. run in residue-free mode (or at all). The residue described in Table 1 was collected using forensic mode. Docker-based isolated execution environment. We use a Docker container to construct our iso- Persistent Files. Conceivably, a user may want to lated execution environment because of Docker’s well- store select files from their ResidueFree session, such documented, well-maintained features that allow us to as a word processing document or a PDF file down- incorporate our filesystem modifications without re- loaded via web browser. To enable these use cases, implementing Linux’s isolation capabilities. Docker’s ResidueFree creates a specific folder in the user’s home popularity makes user-customization and additional directory for files the user does not want deleted. When open-source contributions to ResidueFree much sim- ResidueFree exits, these files are transferred to the pler while allowing us to make modular backend filesys- user’s desktop. To prevent the inadvertent deletion of tem changes. While we could likely cut performance desired files, ResidueFree displays a brief message costs by only using the containerization features nec- when launched reminding the user that all files not essary for ResidueFree, these extensive modifications stored in the persistent folder will be deleted at the end are outside the scope of our proof-of-concept implemen- of their ResidueFree session. tation. Optimizations and additional features. Notably, our ResidueFree container is far outside ResidueFree includes a number of options during of Docker’s typical use case. While Docker is typically installation and at run-time for the user to tailor their used to run lightweight, single applications securely sep- use of ResidueFree. When running ResidueFree arated from the host operating system, our “container” for the first time, users have the option to enable the provides access to every file on the host file system, and Docker daemon to startup with system boot. If enabled, does not provide traditional Docker security assurances. it will reduce the costs of launching ResidueFree the A remote attacker who gains access via a process run- first time after system boot, but will stand out on many ning inside ResidueFree will not only have full read- client machines and may not be ideal for users wishing access to the entire filesystem, but the numerous system to raise as little attention as possible to the fact that resources and privileged configurations granted to the ResidueFree is on the system (we discuss associated Docker container make container-escaping trivial. performance costs in §7.3). A core design goal of ResidueFree is to make the interface as intuitive, unobtrusive, and user-friendly as possible. The user can opt to add a right-click option for every desktop application to run in ResidueFree 5 The Gnome shell updates application use every five min- mode, as shown in Figure 2. utes, so ResidueFree launches a background process that only Once installed, users can run ResidueFree in ei- restores Gnome application use recording five minutes after ResidueFree exits. All other system configurations are restored ther privacy (the default) or forensic mode. In privacy on ResidueFree’s exit. mode, users can specify the size of the RAM disk. In
Residue-Free Computing 398 7 Evaluation We conducted both forensic and performance analyses to evaluate our proof-of-concept implementation. For our forensic evaluation, we compared the filesystems of two different virtual machines across three different states: before running ResidueFree, immediately after running a series of applications within ResidueFree, and ten minutes after ResidueFree exits. For our per- formance analyses, we used the Iozone7 and Phoronix8 performance benchmark suites and timed tasks in com- Fig. 2. Application right-click menu with an option to run the mon user applications to analyze the filesystem and gen- application in residue-free mode. eral system performance of ResidueFree. Finally, we timed ResidueFree’s startup and shutdown overhead forensic mode, the user can specify the output direc- times. tory’s location and whether to compress the output file. We ran our performance analyses on an Ubuntu 18.04 machine running directly on a Toshiba Satellite Operating system portability. Our initial imple- E45W-C4200X with a two-core 2.10 GHz Intel i3-5015U mentation of ResidueFree operates on Linux. Because processor, 6 GB of RAM, and a 500 GB HDD. Which it uses Docker to help enforce sandboxing and Docker it- is to say, we ran our performance tests on a (metaphor- self uses a Linux kernel to drive its containers, Residue- ical) potato to demonstrate its functionality for users Free cannot straightforwardly be modified for other op- with access to limited hardware resources. erating systems. (Although Docker can run on OSX, the kernel used inside containers is still Linux, which would not permit local OSX applications to be run.) How- ever, much of the containerization enabled by Docker 7.1 Forensic Evaluation can be duplicated on OSX, and we have made sig- The primary goal of ResidueFree is to eliminate any nificant progress in doing so towards a future OSX- filesystem changes a user or process makes while us- compatible release of ResidueFree. ecryptfs is also ing ResidueFree. Our implementation should leave no Linux-dependent, but ResidueFree is agnostic to the residue on the filesystem, and ideally nowhere on disk, cryptographic filesystem that protects the ramdisk, and that could be useful in a forensic examination. Our could easily be adapted to support a FUSE-based so- forensic analysis shows ResidueFree leaves no informa- lution (e.g., EncFS6 ). We are planning to fully support tion about what processes the user ran inside Residue- OSX in a future releases of ResidueFree, which we Free and what file(s) or filesystem data those processes hope will lead to more widespread adoption. may have accessed. While an examiner can determine Like OSX, Windows has support for FUSE filesys- that ResidueFree is installed on the computer and the tems. FUSE-based copy-on-write filesystems exist, and times a user ran ResidueFree, no further information coupled with FUSE-based encrypted filesystems, many is left in the filesystem. of the logical components of ResidueFree are currently For our analysis, we compared the filesystems of two available for Windows. However, Windows presents virtual machines (VMs) across three states. First, we unique challenges such as the common use of shared key- created an Ubuntu 18.04 virtual machine (VM), down- stores (e.g., the Windows registry) as well as the use of loaded a series of applications and installed Residue- isolation features that significantly differ from those of Free, and then cloned the VM to ensure the two VMs Linux and OSX. Consequently, enabling Windows sup- would be as close to identical as possible. While one port remains a more distant future goal. VM, which we refer to as the “baseline VM,” remained active in the background, we used the other VM, or “ResidueFree VM”, to launch ResidueFree and in- 7 http://www.iozone.org/ 6 https://github.com/vgough/encfs 8 https://www.phoronix-test-suite.com/
Residue-Free Computing 399 teract with several applications inside ResidueFree. ating system and the Gnome Desktop Environment.10 After exiting ResidueFree, we immediately froze both For application downloads and use, he points to sys- VMs and made copies of their virtual disks. We then tem log files, application installation files, journal files, restored the ResidueFree VM, let it run for an addi- Gnome’s file for tracking application use, Gnome’s file tional ten minutes, and then froze the VM and copied for tracking recently opened files, and a handful of addi- its virtual hard disk again.9 We then mounted read-only tional miscellaneous files in the user’s cache or that are copies of these filesystems in a separate machine, gen- application-specific. ResidueFree ensures that none of erated hashes of each file in all three states (including the described forensic artifacts are written to disk. the virtual disks’ swap files), and manually inspected In addition, ResidueFree did not preserve any files the contents of any files whose hashes differed between from Firefox or Chromium, including favicons where the baseline VM and either of the ResidueFree VM’s the supercookie attempts to persist between sessions. filesystem snapshots. As Solomos et al. [27] note, these favicon-based super- Within the ResidueFree VM, we first connected cookies persisted past all major browsers’ “private” or to ProtonVPN using an OpenVPN configuration file. “incognito” modes. While these incognito modes are We then opened LibreOffice Writer to create and save generally effective, they are vulnerable to novel persis- a short document, ran Firefox to generate a favicon- tent storage techniques like this one. By ensuring that based “supercookie” from https://supercookie.me/, and all file modifications never touch the user’s disk, no played an entire Youtube video. While Firefox was run- matter an application’s settings, ResidueFree protects ning, we intentionally caused Chromium to crash. We against such user-tracking attacks. then opened Chromium successfully and downloaded a PDF, ended ProtonVPN, and opened Nautilus to select the PDF and read it using Evince. Next, we opened 7.2 Filesystem Costs Telegram and sent a message, then downloaded a sim- ple .txt file from another computer on the local network Since ResidueFree effectively intercepts file operations using Filezilla, and finally used apt to install a new ap- to prevent application residues, we performed a detailed plication (gnome-gmail). We then exited ResidueFree evaluation of ResidueFree’s file I/O overheads. That and immediately froze the VM. is, we sought to answer the question, what is the per- Thirty-nine files differed between the baseline VM formance penalty of using ResidueFree? We used the and the ResidueFree VM both immediately after Iozone Filesystem Benchmark suite to evaluate 13 differ- ResidueFree ran and after ten minutes idling. We ent filesystem operations on a baseline standard mode manually analyzed each file to determine that no sen- without ResidueFree, on ResidueFree without the sitive information from the ResidueFree session was ecryptfs mount on top of the RAM filesystem, and on part of these differences. We determined the differ- the fully-fledged (with ecryptfs) ResidueFree. We ran ences were caused by either routine filesystem activity the suite 30 times for each category, and took an average (e.g., timestamps differing by a few seconds or different across block sizes and file sizes for each file operation in unique identifiers for system resources) or indications each run. Figure 3 shows the average of those 30 av- that ResidueFree ran (e.g., Docker and Containerd erages for each file operation, with error bars denoting logs noting ResidueFree’s container invocation or log the standard distribution. For the ResidueFree tests, files that tracked the ResidueFree launch command). we ran the Iozone Benchmark 30 times in a row in the In the latter case, none of the files indicated which ap- same ResidueFree session. plication was run in residue-free mode. On average, ResidueFree file operations run at Notably, ResidueFree removes common residue 30.1% the speed of standard mode file operations: 636.9 identified by trained forensic examiners. Nishida [21] vs. 2113.8 MB/s. Without ecryptfs, ResidueFree runs describes locations in the Ubuntu operating system for 40.6% as fast as standard mode, averaging 858.9 MB/s. reliably identifying forensic artifacts left by the oper- ResidueFree runs 74.1% as fast as ResidueFree with encryption disabled; the difference between them al- most entirely comes from write operations (when the en- 9 We took the ResidueFree VM’s state twice to ensure that system processes with delayed writes, namely Gnome’s application-use tracking described in §6, did not generate any 10 While Nishida [21] evaluates Ubuntu 20.04 and we evaluate residue. Ubuntu 18.04, almost all forensic artifacts remain the same.
Residue-Free Computing 400 3000 Baseline Baseline Megabytes per Second Residue Free w/o Encryption 0.56 ResidueFree 2500 Residue Free 0.54 Gigabytes per Second 2000 1500 0.52 1000 0.50 500 0.48 0 0.46 e ad d e ad ad ad ad e d e e e rit ea rit rit rit ea rit rit re re re re re w w ew fw w w er fr re re re m d e om fr fr rid w o rd 0.44 bk nd nd st co ra ra re File Operation 0.42 0.40 Fig. 3. Filesystem operation speeds from Iozone Benchmark tests. Network Loopback Benchmark Fig. 4. Transfer speed for 10 GB of data over the loopback inter- cryption layer actually operates on the files). Residue- face from the Phoronix network-loopback benchmark. Free runs at 22.3% the speed of ResidueFree with- out encryption on write operations (120 vs. 537 MB/s), 295.0 Baseline but 95.1% as fast on read operations (1000 vs. 1134.8 292.5 ResidueFree RSA Signatures per Second MB/s). Finally, we note that the performance costs 290.0 shown in Figure 3 are not weighted by the frequency of the individual file operations. 287.5 285.0 282.5 7.3 Additional Performance 280.0 Considerations 277.5 275.0 OpenSSL Benchmark Impacts on network, CPU, and RAM speed. In addition to the filesystem costs, we evaluated Residue- Fig. 5. RSA signing speeds from the Phoronix OpenSSL bench- Free’s impacts on other key system performance mea- mark. sures. We used the Phoronix Test Suite to evaluate net- work device, processing, and RAM speeds, with and without ResidueFree. host, these results are expected. We show these results We ran all Phoronix Benchmarks 10 times consec- in Figure 4. utively in both standard mode and in ResidueFree. To measure processing performance, we used We ran all 10 ResidueFree runs in the same Residue- Phoronix’s OpenSSL12 and API-Test13 benchmarks. Free session. Unlike our other tests, the Phoronix runs The OpenSSL benchmark measures how fast the system did not fall into a uniform distribution. In particular, calculates RSA signatures, while the API benchmark tests running in ResidueFree were more prone to vari- measures how many frames per second the system can ance. However, the results show ResidueFree performs render using various graphics API operations. approximately the same as standard mode in these non- On average, ResidueFree performs 4096-bit RSA filesystem benchmarks. signatures at 99.8% the rate of standard mode, calcu- To measure performance on network devices, we lating 289.9 vs. 290.4 signatures per second. We show used Phoronix’s network-loopback benchmark11 to mea- these results in Figure 5. sure data transfer speeds over the loopback interface. We ran all 22 of the API benchmark’s per-API tests, Transfer speeds for both ResidueFree and standard though only 20 of them successfully ran on our system mode are practically equivalent, with ResidueFree (both inside and outside of ResidueFree). We ran the averaging 11 MB/s faster than standard mode. Since tests at a 1024x768 resolution. ResidueFree performs ResidueFree shares a network namespace with the graphics API operations at 98.5% the rate of standard 12 https://openbenchmarking.org/test/pts/openssl 11 https://openbenchmarking.org/test/pts/network-loopback 13 https://openbenchmarking.org/test/pts/apitest
You can also read