Software Evolution Big Data History Artificial Intelligence - IEEE Computer Society
COMPSAC 2023 CALL FOR PAPERS
UNIVERSITY OF TURIN - ITALY, JUNE 26-30, 2023
RESILIENT COMPUTING & COMPUTING FOR RESILIENCE IN A SUSTAINABLE CYBER-PHYSICAL WORLD

During the pandemic-induced chaos of the last few years, the world has mainly managed to resist chaos and maintain a sense of order and stability. This resistance has brought a new focus and meaning to our technology research direction, i.e., resiliency, availability, and sustainability. As we go forward, we have learned that we must carefully prepare a variety of alternatives, keep them in operation, and continue to adapt them and us to changing circumstances. For instance, the notion of security can be interpreted as the maintenance of high availability. It can also be said that resilience is the ability to keep the lights burning, even if the quality of stability is significantly reduced.

Inspired by these thoughts, COMPSAC 2023, which is organized as a tight union of symposia, will focus on all the innovative research aspects of computing in a sustainable cyber-physical world, including resilience computing, security, healthcare analytics, e-government and e-society, metaverse worlds, intelligent autonomous systems, Internet of Things networks, explainable artificial intelligence, augmented reality, and virtual augmented reality. The technical program will include keynote addresses, research papers, industrial case studies, fast abstracts, a doctoral symposium, poster sessions, workshops, and tutorials on emerging and important topics related to the conference theme. Highlights of the conference will include plenary and specialized panels that will address the technical challenges facing researchers and practitioners who are driving fundamental changes in Computers, Software, and Applications in Resilience Computing Systems and Tools. Panels will also address cultural and societal challenges raised by rapidly changing communication norms concerning computing and collaboration.

Authors are invited to submit original, unpublished research work and industrial practice reports. Simultaneous submission to other publication venues is not permitted except as highlighted in the COMPSAC 2023 JC & CJ program. All submissions must adhere to IEEE Conference Publishing Policies and will be vetted through the IEEE CrossCheck portal. Full details are available on the conference website: https://ieeecompsac.computer.org/2023/

Standing Committee Chair
Sorel Reisman, California State University Fullerton, USA

Standing Committee Vice Chairs
Sheikh Iqbal Ahamed, Marquette University, USA
Mohammad Zulkernine, Queen’s University, Canada

General Chairs
Marco Aldinucci, University of Turin, Italy
Ali Hurson, Missouri University of Science & Technology, USA
Marina Marchisio, University of Turin, Italy
Forrest Shull, Carnegie Mellon University, Software Engineering Institute, USA

Program Chairs in Chief
Yuuichi Teranishi, National Institute of Information & Communications Technology, Japan
Alfredo Cuzzocrea, University of Calabria, Italy
Moushumi Sharmin, Western Washington University, USA
Dave Towey, University of Nottingham Ningbo China (UNNC), China

Important Dates
Main conference papers due: January 15, 2023
Notification: April 1, 2023
Camera-ready & registration: May 1, 2023
Workshop papers due: April 7, 2023
Notification: May 1, 2023
Camera-ready & registration due: May 7, 2023
IEEE COMPUTER SOCIETY computer.org

STAFF
Editor: Cathy Martin
Publications Portfolio Managers: Carrie Clark, Kimberly Sperka
Senior Advertising Coordinator: Debbie Sims
Production & Design Artist: Carmen Flores-Garvey
Publisher: Robin Baldwin

Circulation: ComputingEdge (ISSN 2469-7087) is published monthly by the IEEE Computer Society. IEEE Headquarters, Three Park Avenue, 17th Floor, New York, NY 10016-5997; IEEE Computer Society Publications Office, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; voice +1 714 821 8380; fax +1 714 821 4010; IEEE Computer Society Headquarters, 2001 L Street NW, Suite 700, Washington, DC 20036.

Postmaster: Send address changes to ComputingEdge-IEEE Membership Processing Dept., 445 Hoes Lane, Piscataway, NJ 08855. Periodicals Postage Paid at New York, New York, and at additional mailing offices. Printed in USA.

Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author’s or firm’s opinion. Inclusion in ComputingEdge does not necessarily constitute endorsement by the IEEE or the Computer Society. All submissions are subject to editing for style, clarity, and space.

Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own Web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copy-editing, proofreading, and formatting added by IEEE.
For more information, please go to: http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html. Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or pubs-permissions@ieee.org. Copyright © 2022 IEEE. All rights reserved.

Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Unsubscribe: If you no longer wish to receive this ComputingEdge mailing, please email IEEE Computer Society Customer Service at help@computer.org and type “unsubscribe ComputingEdge” in your subject line.

IEEE prohibits discrimination, harassment, and bullying. For more information, visit www.ieee.org/web/aboutus/whatis/policies/p9-26.html.

IEEE Computer Society Magazine Editors in Chief
Computer: Jeff Voas, NIST
Computing in Science & Engineering: Lorena A. Barba, George Washington University
IEEE Annals of the History of Computing: David Hemmendinger, Union College
IEEE Computer Graphics and Applications: Torsten Möller, Universität Wien
IEEE Intelligent Systems: Longbing Cao, University of Technology Sydney
IEEE Internet Computing: George Pallis, University of Cyprus
IEEE Micro: Lizy Kurian John, University of Texas at Austin
IEEE MultiMedia: Shu-Ching Chen, University of Missouri, Kansas City
IEEE Pervasive Computing: Marc Langheinrich, Università della Svizzera italiana
IEEE Security & Privacy: Sean Peisert, Lawrence Berkeley National Laboratory and University of California, Davis
IEEE Software: Ipek Ozkaya, Software Engineering Institute
IT Professional: Irena Bojanova, NIST

2469-7087/22 © 2022 IEEE Published by the IEEE Computer Society November 2022
NOVEMBER 2022 • VOLUME 8 • NUMBER 11
Software Evolution
A Brief History of Free, Open Source Software and Its Communities, JESUS M. GONZALEZ-BARAHONA
A Watershed Moment for Search-Based Software Engineering, IPEK OZKAYA

Big Data
Major Computing Technologies of the Past 75 Years, NIR KSHETRI AND JEFFREY VOAS
Visual Analytics Review: An Early and Continuing Success of Convergent Research With Impact, DAVID S. EBERT, AUDREY REINERT, AND BRIAN FISHER

History
The Apollo Guidance Computer, MICHAEL MATTIOLI
The History of Franz and Lisp, FRITZ KUNZE AND LAUREN KUNZE

Artificial Intelligence
Knowledge-Intensive Language Understanding for Explainable AI, AMIT SHETH, MANAS GAUR, KAUSHIK ROY, AND KEYUR FALDU
Attacks on Artificial Intelligence, ELISA BERTINO

Departments
Magazine Roundup
Editor’s Note: Software Evolution in Action
Conference Calendar

Subscribe to ComputingEdge for free at www.computer.org/computingedge.
Magazine Roundup

The IEEE Computer Society’s lineup of 12 peer-reviewed technical magazines covers cutting-edge topics ranging from software design and computer graphics to Internet computing and security, from scientific applications and machine intelligence to visualization and microchip design. Here are highlights from recent issues.

Applying Machine Learning and Data Fusion to the “Missing Person” Problem
The authors of this article from the June 2022 issue of Computer present a system for integrating multiple sources of data for finding missing persons. It can help authorities find children, individuals who have wandered off, and persons of interest in investigations.

Black Hole Physics and Computer Graphics
In this article from the March/April 2022 issue of Computing in Science & Engineering, the authors note that black holes are among the most extreme objects known to exist in nature. As such, they are excellent laboratories for testing fundamental theories and studying matter in conditions that cannot be found anywhere else in the universe. The authors highlight the relevance of black holes in modern physical and astronomical research and present one of the possible paths to explain observations and probe physics with the aid of numerical simulations. They briefly review dynamical-spacetime general-relativistic magneto-hydrodynamic (GRMHD) calculations as fundamental tools to study the local properties of black holes and matter around them. Then, the authors discuss the need for general-relativistic radiation transport to propagate the local information about light obtained with GRMHD simulations to their telescopes.

Seeking High IMP Reliability in Maintenance of the 1970s ARPAnet
This article from the April–June 2022 issue of IEEE Annals of the History of Computing describes the first years of ARPAnet operations, a time when computers were not highly reliable, but the network was built from standard computers and was expected to function as a utility with high reliability. The article describes how the authors managed to achieve the desired reliability, as perceived by ARPAnet users, by making innovations in hardware, maintenance procedures, software, and network operations. This article draws heavily on the personal experiences of the authors, many of which have not been previously reported in the literature. The focus is on the 1969–1975 time period when ARPAnet was the sole responsibility of the Advanced Research Projects Agency (ARPA).

Bulsarapp: Interactive Visual Analysis for Surname Trend Exploration
The study of surnames for a given population, together with their distribution and spatial patterns identification, has been a long-standing problem in the fields of human biology, public health, and social sciences. The ancestry inferred from surname information can be a useful means to understand the dynamics of human populations. This knowledge allows us to characterize geographically the ethnicity of populations, and to understand the complex relationships between identity, migration, and health issues in a demographic view. However, in most cases, a detailed geolocalization of this data can be a daunting task. In this article from the July/August 2022 issue of IEEE Computer Graphics and Applications, the authors propose a visual analytic tool that summarizes the heterogeneous surname and geographic information collected from Argentinean electoral rolls. This tool allows a massive data analysis and facilitates interdisciplinary studies about population dynamics.

Maximizing Fairness in Deep Neural Networks via Mode Connectivity
With frequent reports of biased outcomes of AI systems, fairness rightfully becomes an active area of current ML research. However, while progress has been made on theoretical analysis and formulation of fairness as constraints on error probabilities, our ability to design and train modern deep learning models that reach the targeted fairness goals in practice is still limited. The authors of this IEEE Intelligent Systems May/June 2022 article focus on an interesting yet common fairness setting, where multiple samples are collected from each individual, and the goal is to maximally reduce performance disparity among individuals while maintaining overall model performance. To obtain such fair deep learning models, the authors use mode connectivity combined with multi-objective optimization to select the best model out of an identified feasible set of model weight configurations with similar overall performance but different distributions of performance over individuals.

Trustworthy Digital Twins in the Industrial Internet of Things With Blockchain
Industrial processes rely on sensory data for critical decision-making processes. Extracting actionable insights from the collected data calls for an infrastructure that can ensure the trustworthiness of data. In this article from IEEE Internet Computing’s May/June 2022 issue, the authors envision a blockchain-based framework for the Industrial Internet of Things to address the issues of data management and security. Once the data collected from trustworthy sources are recorded in the blockchain, product lifecycle events can be fed into data-driven systems for process monitoring, diagnostics, and optimized control. The authors leverage digital twins that can draw intelligent conclusions from the data by identifying the faults and recommending precautionary measures ahead of critical events.

Maya: Using Formal Control to Obfuscate Power Side Channels
The security of computers is at risk because of information leaking through their power consumption. Attackers can use advanced signal measurement and analysis to recover sensitive data from this side channel. To address this problem, the authors of this July/August 2022 IEEE Micro article present Maya, a simple and effective defense against power side channels. The idea is to use formal control to re-shape the power dissipated by a computer in an application-transparent manner, preventing attackers from learning any information about the applications that are running.

Why VR Games Sickness? An Empirical Study of Capturing and Analyzing VR Games Head Movement Dataset
Virtual reality (VR) technology is gaining popularity in a variety of fields, including education, games, movies, medicine, and engineering. 360° VR video could provide an immersive experience and attract more researchers’ and developers’ attention. Some literature focused on the head movement when users watched 360° videos and released head tracking datasets. With the popularity of VR games, how the game contexts influence players’ head movement and the effect of head movement on VR sickness is a topic worth studying. In this article from IEEE MultiMedia’s April-June 2022 issue, the authors collected a head movement dataset of 30 participants while playing five VR games (Aircar, Beat Saber, Moss, Arizona Sunshine, and SUPERHOT), and the participants filled out the Simulator Sickness Questionnaire (SSQ) after playing VR games. They then analyzed the SSQ scores and the impact of VR games on VR sickness.

Long–Short Ensemble Network for Bipolar Manic-Euthymic State Recognition Based on Wrist-Worn Sensors
In this article from IEEE Pervasive Computing’s April-June 2022 issue, the authors propose to perform user-independent, automatic mood-state detection based on actigraphy and electrodermal activity acquired from a wrist-worn device during mania and after recovery (euthymia). This article proposes a new deep learning-based ensemble method leveraging long (20 h) and short (5 min) time intervals to discriminate between the mood states. When tested on 47 bipolar patients, the proposed classification scheme achieves an average accuracy of 91.59% in euthymic/manic mood-state recognition.

Measures to Ensure Cybersecurity of Industrial Enterprises: A Legal Perspective
Cyberattacks on the industrial sector demonstrate that information security is the most important strategic task at the international level. In this IEEE Security & Privacy article from the July/August 2022 issue, the authors consider the legal regulation of the cybersecurity of industrial enterprises in certain foreign countries and Russia. The aim of the study is to analyze the sufficiency and effectiveness of legal instruments and mechanisms, to identify threats and attacks, and to eliminate their consequences.

OSSARA: Abandonment Risk Assessment for Embedded Open Source Components
Software needs to be continuously updated and maintained to continue being useful. This is particularly true for open-source software (OSS) components and libraries, which are increasingly integrated into large and complex systems. For companies developing long-term projects, all embedded OSS components should guarantee lengthy life expectancies and be maintained as long as systems are in service. Systems with unmaintained embedded OSS components are vulnerable to severe risks. In this July/August 2022 IEEE Software article, the authors introduce the OSS Abandonment Risk Assessment model to help companies avoid potentially dire consequences.

5G/SDR-Assisted Cognitive Communication in UAV Swarms: Architecture and Applications
In this May/June 2022 IT Professional article, the authors address the challenge of UAV swarm communications by presenting a framework that offers an open-interface communication and networking solution for surveillance operations in urban/outreach areas. It is based on a hybrid connectivity module that can enable the coexistence of 5G infrastructures, adaptive multiband SDR waveforms empowered with cooperative communication capacities, and satellite communications for continuous swarm operation in any demographic area. In addition, they discuss some of the current and futuristic applications and scenarios that can benefit from the provided solution.
Editor’s Note
Software Evolution in Action

Software is ever-changing. The techniques we use to develop software, the qualities we infuse in its design, and the ways in which we distribute it have all evolved over time. It’s important to look back at software’s history to understand the current landscape and to prepare for the future. This ComputingEdge issue features two articles that highlight different aspects of software’s evolution.

Computer’s “A Brief History of Free, Open Source Software and Its Communities” explains that, until the late 1960s, software was shared easily and was considered simply a companion to hardware. The author recounts how, as more software became proprietary in the 1970s, certain developers and companies placed their source code in the public domain. IEEE Software’s “A Watershed Moment for Search-Based Software Engineering” describes the emergence of metaheuristic algorithms for software engineering two decades ago.

Big data has an interesting history of its own. The authors of Computer’s “Major Computing Technologies of the Past 75 Years” identify big data as one of the most powerful drivers of the digital age. In “Visual Analytics Review: An Early and Continuing Success of Convergent Research With Impact,” from Computing in Science & Engineering, the authors discuss the rise of visual analytics as a highly effective means of representing big data.

Some software and hardware products have had an outsized impact on computing history. IEEE Micro’s “The Apollo Guidance Computer” celebrates the Apollo 11 spacecraft computer that was instrumental to the 1969 mission’s success. IEEE Annals of the History of Computing’s “The History of Franz and Lisp” presents a father-daughter interview about a once-popular programming language for artificial intelligence applications.

Finally, this ComputingEdge issue covers two aspects of artificial intelligence (AI): explainability and security. In IEEE Internet Computing’s “Knowledge-Intensive Language Understanding for Explainable AI,” the authors ponder methods for providing decision explanations in AI systems. In IEEE Security & Privacy’s “Attacks on Artificial Intelligence,” the author examines input and poisoning attacks and encourages the development of AI assurance processes.
DEPARTMENT: OPEN SOURCE EXPANDED
EDITOR: Dirk Riehle, Friedrich Alexander-University of Erlangen Nürnberg, dirk.riehle@fau.de
This article originally appeared in Computer, vol. 54, no. 2, 2021
Digital Object Identifier 10.1109/MC.2020.3041887
Date of current version: 11 February 2021

A Brief History of Free, Open Source Software and Its Communities
Jesus M. Gonzalez-Barahona, Universidad Rey Juan Carlos

FROM THE EDITOR
Welcome back, and welcome to a new thematic arc in our “Open Source Expanded” column! Until now, we have looked only at using open source software, mostly from a company perspective. Now, we will examine how community open source projects work, collaborating across volunteers and companies. I’m very happy to have convinced Jesus M. Gonzalez-Barahona, a long-time open source researcher and enthusiast, to write this opening article about the history of open source and its communities. He will take us through what are, by now, several decades of open source history. Enjoy! And, as always, happy open hacking, everyone, and be safe!
- Dirk Riehle

Free, open source software (FOSS) has a long history, beginning with the origins of software itself, when the terms free software and open source software were not yet defined. Learning about the milestones of this history may help to understand FOSS today.

The concept of “free software” (with free as in freedom) dates from the early 1980s. The term open source is much younger, from the late 1990s. But before free and open source software (FOSS) existed as such, some programs were paving the way. In fact, until the late 1960s, most software worked as FOSS: it was shared with relative ease between people who took care of computers. Only a few companies manufactured computers, with IBM being, by a large margin, the market leader. For all of them, software was just a companion to hardware: as long as you paid for maintenance, you had access to the software catalog of the manufacturer. User groups, such as SHARE (IBM) and DECUS [Digital Equipment Corp. (DEC)], favored software sharing. To some extent, prior to 1970, software was just an add-on to hardware, not something considered valuable in itself.

The situation changed in 1969, when IBM announced the unbundling of software: part of its catalog was to be sold separately. From that moment on, users had to purchase some of the software they needed. Various companies began to flourish with a business model based on producing software to be run on hardware sold by others. This kicked off the software market and, with it, the change of software’s status. Vendors implemented technical and legal means to limit sharing, modifying, and even studying programs. During the mid-1970s, proprietary (non-FOSS) software was already the norm. However, by the early 1980s, some programs were distributed in ways similar to what we now consider FOSS, among them, SPICE (Simulation Program with Integrated Circuit Emphasis), TeX, and Unix.
In 1973, SPICE and its source code were placed in the public domain by their author, Donald O. Pederson. The program was a tool for learning integrated circuit (IC) design, and it was quickly adopted by several universities. With time, SPICE and its derivatives evolved into the industry’s preferred tools to design ICs, becoming the de facto standard. It was the first example of how a FOSS-based strategy could lead to market dominance.

TeX was developed by Donald Knuth in 1978, during a sabbatical, as a typesetting system to produce quality output. Knuth intended to use it for typesetting his own books but distributed it as source code as well, through an authorization that today would be considered quite similar to a FOSS license. Since then, TeX has become the standard in scientific typesetting, and it is still popular.

Unix was created by Thompson, Ritchie, and others at AT&T Bell Labs, starting in 1972. Since 1973, Unix has been distributed to many universities, with a license permitting academic use. The software could not be disseminated beyond the signatories of the license, but during the late 1970s, those parties formed a community composed mainly of academic institutions and research centers that worked in a similar way to later FOSS groups. Its members shared and improved the code, and the Computer Systems Research Group (CSRG) at the University of California, Berkeley, began producing its own Unix distributions. This was a keystone of the emergence of FOSS during the late 1980s. Earlier in that decade, these cases provided some experience with how basic FOSS-enabled mechanisms worked.

THE 1980S: GNU, BERKELEY SOFTWARE DISTRIBUTION, AND THE INTERNET

In 1983, Richard Stallman announced the GNU Project, with the aim of producing a Unix-like system composed only of free software. Stallman, who was by then a programmer at the Massachusetts Institute of Technology (MIT) Artificial Intelligence Lab, quit his job to ensure that he had full ownership of the software he wrote. The project began with an editor (Emacs) and some other tools and quickly produced various key components. By 1987, it delivered a compiler (GNU Compiler Collection), a debugger (GNU Debugger), and several utilities. In 1985, Stallman founded the Free Software Foundation to support and foster the GNU Project and free software in general.

He also established the philosophical principles of free software, including the definition of the concept. This characterization was based on “four freedoms” for any user of a free software program: use, study and modify, redistribute copies, and distribute modifications. The GNU Project produced licenses for the software it was releasing. Those licenses were the legal projection of the four freedoms. In 1989, they were unified in the GNU General Public License (GPL), the first of one of the most successful families of FOSS licenses. The GPL was a clever hack: it protected software users’ freedoms by using copyright law.

The GNU Project’s work was structured in small teams of volunteers who produced different pieces of software, according to a carefully designed plan. Around 1990, the project had almost completed an operating system. However, its tools were always running on top of proprietary or non-free kernels because the project still lacked a kernel of its own.
Meanwhile, during the 1980s, the CSRG led a large community that was busy working on improving Unix, producing Berkeley Software Distribution (BSD) Unix. The community included people from the University of California at Berkeley, the University of California at Los Angeles, MIT, Stanford, Carnegie Mellon, and others. There were industry members, too, notably AT&T and Bolt, Beranek, and Newman, the company producing the first implementations of Internet protocols. The efforts were funded mainly by R&D grants from the U.S. government via DARPA.

With time, BSD Unix had less and less code from the original AT&T Unix and more and more code produced by its contributors. The original AT&T code was covered by the Unix license, but not all the new code was. In 1989, the code not covered by the Unix license was offered as Networking Release 1 (Net/1) under the BSD license, which was free. Net/1 still lacked some modules to be a complete, working operating system. At that time, several companies were using BSD Unix (including Unix licensed code) as the basis of their operating systems, and some of them were contributing to BSD with ports to specific hardware, new applications, and bug fixes. The effort of incorporating all this into the BSD code base was coordinated by the CSRG.

Another remarkable project of the late 1980s was the X Window System, which produced a platform-independent graphics system incorporating a protocol that enabled applications to use a graphics terminal, even remotely. X Window was released in 1986 under the MIT license, which was also free and, in many aspects, similar to the BSD license.

During the 1970s and early 1980s, another development community was creating software under similar models: the Internet (at first, the Arpanet) community. Since the early 1970s, it had collaboratively been producing requests for comments (RFCs) (specifications of standards) as open documents that were accessible to anyone. The protocols were complemented by reference applications, which were designed to be easily portable to manufacturers’ systems. During the 1980s, the community developing Internet protocols and applications was closely related to the Unix BSD group since BSD Unix was the usual target for developments. Later, as the Internet became popular at universities, its tools and protocols became fundamental for the development of communities supporting free software projects. In an epoch when remote coordination was still usually done via phone and postal mail, free software communities were already communicating via email lists and sharing software electronically via FTP or its poor man’s version, the Unix-to-Unix copy network (UUCPnet).

Those were also years of testing sustainability models for FOSS. Projects quickly became a mixture of people working on their own time as volunteers collaborating with people hired to assist them. In the beginning, hired developers mainly worked at universities, such as the teams at the CSRG and other BSD Unix contributors. In many cases, their funding came from R&D institutions, especially DARPA. But companies were involved in two major ways: by directly funding FOSS projects and by making their employees work on FOSS projects.

The most prominent case of a project funded by companies was X Window, developed at MIT, which jointly funded the work with DEC and IBM. This was one of the first projects to evolve from proprietary software (several licenses of X Window were sold) to being later released as FOSS in 1986. The project was so successful that several companies used it as the basis of their GUIs, at a moment when GUIs were a key characteristic of workstations. Several of these companies assigned large teams to port X Window to their systems and to build new applications for it. Some of the resulting software was contributed back to the FOSS project, showing the benefits of sharing upstream. In 1988, X Window was so important to numerous vendors that they decided they needed to formalize a neutral point to drive its evolution, forming the MIT X Consortium. This was the first case of competing companies
establishing a nonprofit to provide stewardship of a FOSS project.

The best-known case of a company assigning employees to work on FOSS projects at that time was Cygnus Support, which was founded in 1989 to commercially sustain some of the GNU tools. Its employees had led the development of some of those tools, such as the GNU debugger, assembler, and linker, all of which were fundamental to the GNU Project and the FOSS community at large. This not only helped to bring stability and resources to the project but also showed how companies could profit and grow from maintaining and building FOSS components by becoming a focal point of expertise. The fact that Cygnus was directly involved in the production and maintenance of FOSS projects signals how interesting such efforts were to many companies. Those organizations were using GNU tools in production environments, and they were ready to pay Cygnus for support and new functionality.

Another remarkable case of a company with a business model centered on FOSS was Aladdin Software, although for a different reason. Since 1986, Aladdin had developed Ghostscript, a PostScript interpreter, and released it under the GNU Project as GNU Ghostscript. But the company used a dual licensing model, maintaining its own version, Aladdin Ghostscript, under a non-FOSS license. With this model, it was exploring how dual licensing could prevent its competition from using the latest features in the software while maintaining a FOSS version that sustained the popularity of the program.

At the end of the 1980s, FOSS communities were complex in many ways, with people and companies collaborating by sharing software and, indirectly or directly, resources. They were exploring several sustainability models: public funding (via R&D grants), donations collected via nonprofits, direct funding from companies, the direct involvement of companies via neutral consortiums, pure volunteer work, and combinations thereof. They set up a legal infrastructure centered on the two families of FOSS licenses that are still in use today: those based on the principles of the GPL and those established on the principles of the BSD and MIT licenses. They had a solid philosophical basis, formalized in several documents that were widely known in their communities. And they were producing software of interest to individuals and companies, both for ethical and practical reasons.

Companies were also learning how to benefit from FOSS development. Some small companies were trying pure FOSS business models. Others used FOSS as a viable model. And FOSS emerged as a strategic tool that could be harnessed to build neutral consortiums, where competitors could collaborate to produce software that they all found interesting. Some companies noticed how existing FOSS components could be employed to build large parts of complex systems, enabling them to leverage their own developments at a fraction of the cost of creating the modules themselves.

By this time, FOSS development communities used digital means for communication (mailing lists, Usenet groups, FTP servers, and UUCPnet), enabling them to work in large, geographically distributed networks. They explored organizational mechanisms that included, in some cases, appointed figures (such as GNU "maintainers," who acted as leaders of their development communities) and de facto coordinators (as CSRG personnel were for BSD to some extent). Formal organizations for project stewardship were already common: GNU was conceived from the beginning as such, BSD had steering committees, X Window organized the MIT X Consortium, and so on.

THE EARLY 1990S: LINUX, *BSD, AND COMPANY

During the early 1990s, developments started during the previous decade converged in the first complete systems composed only of FOSS components: *BSD and Linux. In the BSD Unix camp, the CSRG had reimplemented most of the missing components to produce a complete Unix-like system under the BSD license. This was distributed as Net/2. In 1992,
386BSD was released with an implementation of the small pieces that Net/2 needed, thus resulting in the first full FOSS system. NetBSD, FreeBSD, and OpenBSD were later evolutions.

Meanwhile, in 1991, Linus Torvalds announced his project for writing an operating system kernel, which would soon be named Linux. It quickly gained traction and contributions from other developers. In 1994, Torvalds released Linux 1.0, the first "stable" version, although the software was already usable in 1993, and, in some respects, in 1992. Many tools, including from GNU and BSD, were ported to it, and different groups started to produce Linux-based distributions (such as Slackware, Debian, or Red Hat).

Around 1993, *BSD and Linux-based distributions were perfectly usable, complete operating systems that could be installed on PCs. With time, Linux became the most popular, and during most of the 1990s, many cohorts of young developers, including students at numerous universities, were exposed to it. New FOSS projects, small and large, launched in many places and domains, and the number of people involved in FOSS development and maintenance kept growing.

It was during the mid-1990s that the Internet evolved from an academic curiosity to a mass market service, with the web becoming the primary mechanism for accessing information and, later, digital services. The importance of FOSS components for Internet infrastructure was evident, being one of the enablers of the expansion of this technology. Most of the implementations of Internet protocols were either FOSS or derived from FOSS projects. Many of the most popular services were implemented as FOSS, such as Sendmail and NCSA HTTPd (and later Apache), which were dominant among email and HTTP servers, respectively.

A new kind of FOSS-related company appeared that was linked to Linux-based distributions. In fact, many of the major Linux-based distributions were promoted by companies: Red Hat, SuSe, Mandrake, and others. They all began by marketing a Linux-based distribution and expanded to offer a mixture of services, from training to support, that were, in general, loosely based on their distribution and, to some extent, their brand. Other companies, such as VA Linux, joined this growing market of FOSS-based solutions.

THE LATE 1990S AND THE 2000S: THE AGE OF FOUNDATIONS AND CORPORATIONS

In 1998, Netscape announced that its flagship application, Netscape Communicator, was to be released as FOSS. Netscape Communicator was one of the two web browsers that dominated the market (the other was Microsoft Internet Explorer), and Netscape was one of the most prominent companies of the new Internet era. Because of this, the announcement received plenty of attention from the media. To some extent, this event signaled that FOSS was becoming something real for companies, something that they could use as a part of their strategy. In preparation for the announcement, the term open source software was coined as an alternative for free software, and the Open Source Initiative was formed.

At about the same time, large FOSS communities emerged. The GNU Project included a growing number of tools and members. People were also joining the Free Software Foundation. New projects were bootstrapped. Debian was one of them. In 1993, it was established to maintain the Debian Linux-based distribution, and it was soon joined by tens, and eventually hundreds, of developers. Debian was a community of individuals, where companies didn't have a role. In this respect, it followed the GNU tradition, although from the beginning, its governing rules, which were explicit, led to a much more horizontal organization.

Another community of developers was Apache, first built around the Apache HTTP server and then expanded with other FOSS components. In 1995, the project was born as the Apache Group, which expanded and formed the Apache Software Foundation in 1999. This, too, was a community of individuals, although many of its members were hired by companies. However, Apache tried to remain neutral with respect to companies, following a spirit that resembled Debian's.

The group of developers producing the Linux kernel was one of the major software development communities formed during the 1990s. From its beginning, the project was very clearly directed by Torvalds, with only a few formal governing rules. Although companies had no direct role, they hired many Linux developers, who often had clear interests in the system's development. In 2000, the Linux Foundation was formed to organize contributions from these companies and support
the project although, in general, technical decisions remained relatively separate. Later, the Linux Foundation extended the model to many other projects that came under its umbrella.

In 1996, the Kool Desktop Environment (KDE) was born to develop a FOSS desktop application. Partly as a reaction to KDE using some non-FOSS components, the GNU Network Object Model Environment (GNOME) was announced in 1997, with similar objectives. Soon, hundreds of developers joined both of them. Various companies began hiring developers to work on the projects because they wanted to drive the evolution of certain applications. This was the case, for example, with SuSe and Red Hat: the desktop environment of their Linux-based distributions was to be improved. Some others, such as Helix Code and Eazel, were small start-ups funded to develop specific applications.

GNOME and KDE established nonprofits to support the projects, and both found ways to let companies participate directly. KDE's nonprofit was formed in 1997, and the GNOME Foundation was incorporated in 2000. Using different mechanisms, companies that contributed significant resources participated in the projects' decision making, and combined with the influence they obtained by hiring developers, they had a real impact on the initiatives. GNOME and KDE were the most prominent organizations exploring the path toward communities of companies, which had begun with the MIT X Consortium.

Netscape launched Mozilla to produce the FOSS version of Netscape Communicator. But Netscape's new owner, AOL Time Warner, lost interest in the initiative. In 2003, the Mozilla Foundation was formed to legally steward the project, independently from AOL. From then on, the Mozilla Foundation searched for lines of revenue, which it found in agreements with companies, notably Google, that were interested in its flagship program, Firefox. Thanks to this revenue, Mozilla hired a large team of developers, and it also built a large community of volunteer supporters.

In 2001, the Eclipse project was created by IBM and supported by a group of software companies to produce a FOSS integrated development environment and related tools, which then extended to many other domains. In 2004, the Eclipse Foundation was established as a neutral nonprofit to steward the project. It was formed by companies providing financial resources and by Eclipse developers. They all participated in strategic decisions.

These software development communities and their corresponding nonprofits have explored different relationship models between developers and the companies with interests in their projects. From the very developer-centric Debian and Apache to those with significant direct company participation (KDE, GNOME, and Eclipse), from those originated by companies (Mozilla and Eclipse) to those with origins in individual developers (almost all the others mentioned previously), from those with clear and detailed governance and participation rules (such as Apache, Debian, KDE, GNOME, and Eclipse) to those based more on practices and the personal charisma of some individuals (GNU and Linux), they all have produced FOSS components of interest. They have proved to be sustainable, remained attractive to developers (either hired or volunteer), and devised their own approaches to structuring productive FOSS communities. The current landscape of FOSS development is the result of this history. There has been progress, and there have been contradictions. There has been collaboration but also fierce competition between models, aims, and mechanisms. Today's FOSS is the product of it all.

JESUS M. GONZALEZ-BARAHONA is with the Universidad Rey Juan Carlos, Fuenlabrada, 28943, Spain. Contact him at jesus.gonzalez.barahona@urjc.es.
FROM THE EDITOR

A Watershed Moment for Search-Based Software Engineering

Ipek Ozkaya

EDITOR IN CHIEF: Ipek Ozkaya, Carnegie Mellon Software Engineering Institute, ipek.ozkaya@computer.org

This article originally appeared in vol. 38, no. 4, 2021.

Digital Object Identifier 10.1109/MS.2021.3075108
Date of current version: 18 June 2021
November 2022. Published by the IEEE Computer Society. 2469-7087/22 © 2022 IEEE

Software engineering is about understanding and making the right tradeoffs. Many software engineering tasks (such as test case generation, design, sprint planning, and refactoring) boil down to understanding tradeoffs, including concretely expressing the attributes to optimize and the elements and decisions that will maximize their intended outcomes. Consequently, many software engineering problems can be formulated as search problems. Driven by these observations, Harman and Jones in 2001 emphasized the importance of concentrated research on the application of search-based techniques in software engineering and coined the name search-based software engineering (SBSE) for this subfield of software engineering research [1]. During the two decades that followed, SBSE has seen a significant increase in research in which metaheuristic algorithms are used to create recommendations for software engineering tasks. Metaheuristic algorithms are designed to select a good enough solution with incomplete or imperfect information, making them suitable for the complex task of balancing tradeoffs that is common in many activities involving the development and deployment of software.

SBSE has been attractive to researchers as it has the potential to increase the availability of automated approaches to software engineers for tasks that are otherwise hard to provide automated support for, such as design tradeoff analysis. Retrospective studies that reflect on the research trends in SBSE demonstrate that most of the concentrated research activity in SBSE has been in software verification and validation, and design [2]. There are now a number of studies which explore specific applications and challenges in SBSE, such as search-based software testing [3], search-based refactoring [4], automatic bug repair [5], and incorporating interaction into SBSE [6].

The road from research to practical use always has its twists and turns. Research applications can be limited to small-scale, narrow problem sets. Retrospective studies assessing challenges in transitioning SBSE to practice demonstrate one seemingly trivial but key lesson: the role and the tasks of the user when applying SBSE techniques cannot be overlooked. Human intervention is crucial for evaluating solutions and for indicating preferences to the algorithms about how a given situation should be solved. Despite such common but hard-to-resolve research challenges, SBSE applications have seen some recent use in practice at scale in automated testing [7], [8].

A key barrier often voiced for adopting new automated approaches, such as accepting recommendations found through search, is developer acceptance. Developer tolerance for false positives is often very low, and the expectation can be end-to-end full automation. We need to understand and design workflows where developers are supported by such automated tools and are able to provide seamless feedback on the correctness and relevance of the recommendations. Incorporating developer feedback into workflows will improve tool robustness and reliability more effectively.

The ability to reap the benefits of the past two decades of research no doubt will require continuing to fine-tune the algorithms, their performance, and their validation. However, we will likely see accelerated change in practice, especially when researchers step back and start tackling smaller yet high-return-on-investment problems with SBSE. SBSE is in fact going through a watershed moment.
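To make the idea of treating a software engineering task as a search problem more concrete, the following is a minimal sketch, not taken from the article. It assumes a hypothetical test selection task and uses hill climbing, a simple local search chosen here only for brevity. The `COVERAGE` and `COST` tables, the `fitness` weighting, and all names are illustrative assumptions.

```python
import random

# Hypothetical data: the requirements each test covers and its runtime in seconds.
COVERAGE = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r4"},
    "t4": {"r1", "r2", "r3", "r4"},
    "t5": {"r5"},
}
COST = {"t1": 2.0, "t2": 2.0, "t3": 1.0, "t4": 6.0, "t5": 1.0}


def fitness(selection, cost_weight=0.1):
    """Score a set of tests: requirements covered minus a weighted runtime penalty."""
    covered = set()
    for test in selection:
        covered |= COVERAGE[test]
    return len(covered) - cost_weight * sum(COST[t] for t in selection)


def hill_climb(steps=200, seed=0):
    """Toggle one test per step and keep the change only if fitness improves."""
    rng = random.Random(seed)
    tests = sorted(COVERAGE)
    current = frozenset()
    for _ in range(steps):
        # Symmetric difference adds the chosen test if absent, removes it if present.
        candidate = current ^ {rng.choice(tests)}
        if fitness(candidate) > fitness(current):
            current = candidate
    return current


best = hill_climb()
print(sorted(best), round(fitness(best), 2))
```

The single weighted score is a deliberate simplification: it collapses two competing objectives (coverage and runtime) into one number, which is exactly the kind of tradeoff decision that multiobjective approaches leave open for the developer.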
INDUSTRIAL APPLICATIONS OF SBSE

The most recent and notable application of SBSE in practice has been in search-based testing with the deployment of the Sapienz tool at Facebook [7], where the application of search-based testing methods enabled replacing manual testing with automated test case generation. Facebook engineers report that the Sapienz search-based approach both speeds up the testing process and delivers acceptably accurate results (up to 75% accuracy). Identifying objectives that capture the tradeoff space to drive Pareto-optimal solutions appropriately with acceptable algorithm performance improved the applicability of Sapienz at scale. One of the key insights in Sapienz was that progressively replacing long test sequences with shorter ones identified equally good test cases and achieved higher effectiveness, resulting in detecting faults previously undetected in Android applications [8]. There are several competing objectives that any testing approach strives to accomplish, including test coverage, sequence length, execution time, readability, and replicability. No one tool can optimize all of them, while some achieve success on multiple fronts. The ability to balance such competing objectives simultaneously with improved outcomes provides an opportunity to improve software testing efficiency and accuracy. The multiobjective search-based approach successfully demonstrated by Sapienz and its application at scale at Facebook are the kinds of research successes that create watershed moments, which showcase exemplar research-to-practice-to-research cycles.

Other industrial applications of SBSE will follow the success of SBSE applications in testing at Facebook. For example, Bloomberg is experimenting with the application of automatic bug repair [9]. Automatic bug repair is also fundamentally considered a search problem [5]. One observation of Bloomberg engineers and researchers is that the more complex the problem to which academic research strives to apply SBSE techniques, the harder the initial buy-in and application. At Bloomberg, engineers were able to get buy-in for prototyping an approach by focusing on demonstrating the repair of small, frequent, and trivial bugs, not their hardest bugs. Research often misses the realities of economies of scale in the applicability of research and its transition to practice. Complex problems often involve high technical as well as political stakes. Solving small yet frequent issues reliably at scale eases both their acceptance and their adoption, opening the door to investigating harder problems to follow.

Search-based refactoring, similar to search-based testing and search-based bug repair, has made some progress in industrial validation. In particular, the ability to integrate the tools by identifying refactoring opportunities at commit time and recommending sequences of refactorings to fix the quality issues is an attractive industry-relevant scenario [10]. The application of refactoring ranges from making local changes through refactoring to refactoring software to support evolution scenarios at scale [11]. Identifying easy, small-scale applications will pave the way for the development of more complex tools that can automate design changes. Creating recommendations for such changes must take into account competing objectives that are hard to identify and reconcile, ranging from ensuring that the changes do not introduce code quality, security, maintainability, and reliability issues to minimizing changes.

THE WATERSHED MOMENT

The application of search-based testing research outcomes in industry at scale, along with the increased number of research projects in SBSE in general, is creating SBSE's watershed moment. The ability of Sapienz to identify faults that other rigorous testing techniques and tools, such as Google's Android Monkey, were not able to identify, and to do so faster, is one of the many reasons that triggered a buy-in to transition the technique to practical scale. In addition, understanding the critical value proposition for the developers allowed the researchers to focus on a high-return-on-investment technical approach, focusing on shorter testing sequences that are practical and simple to implement. Lessons learned from two decades of SBSE research and current feedback from developers using early industrial applications will inform and drive accelerated progress in the decades to come.

Understanding developer interactions and preferences is critical both for improving the performance of SBSE algorithms and for their acceptance in practice. SBSE techniques provide recommendations on a
Pareto-front set of all solutions that represent the satisfaction of objectives where more improvements are not possible without making one or more criteria worse. More often than not, several equally applicable solutions exist rather than one best solution. This implies that developers need to both be able to express different objectives and be comfortable in providing feedback to the automated tools in making the selections: how far off the automated recommendations are, how often the recommendations are applied, and what other objectives need to be incorporated into the search and selection process. These questions drive the need to investigate interaction mechanisms between developers and SBSE tools. Understanding how developers can express their preferences and how to incorporate these back into the algorithm design is important in further improving adoption of SBSE techniques.

The value of identifying simple, trivial tasks to make progress on cannot be discounted. Automation is valuable when it buys developers time. While there are many tasks that are tradeoffs and search problems in software engineering, researchers need to target applications of SBSE on tasks where quick gains for developer time can be demonstrated along with their reliable validation. Focusing applications of SBSE on tasks that can be compartmentalized as small, simple developer actions is essential for iterative and incremental progress. Automated testing, automated bug repair, and automated refactoring each provide opportunities to identify such small, mundane tasks to apply search operations to. During the next decade, we will likely see accelerated development and adoption of SBSE tools that are seamlessly integrated into developer workflows. Time will tell.

REFERENCES

1. M. Harman and B. F. Jones, "Search-based software engineering," Inf. Softw. Technol., vol. 43, no. 14, pp. 833–839, 2001. doi: 10.1016/S0950-5849(01)00189-6.
2. T. E. Colanzi, W. K. G. Assunção, S. R. Vergilio, P. R. Farah, and G. Guizzo, "The symposium on search-based software engineering: Past, present and future," Inf. Softw. Technol., vol. 127, p. 106372, 2020. doi: 10.1016/j.infsof.2020.106372.
3. M. Harman, Y. Jia and Y. Zhang, "Achievements, Open Problems and Challenges for Search Based Software Testing," in Proc. IEEE 8th Int. Conf. Softw. Testing, Verification Validation (ICST), 2015, pp. 1–12. doi: 10.1109/ICST.2015.7102580.
4. T. Mariani and S. R. Vergilio, "A systematic review on search-based refactoring," Inf. Softw. Technol., vol. 83, pp. 14–34, Mar. 2017. doi: 10.1016/j.infsof.2016.11.009.
5. C. L. Goues, M. Pradel, and A. Roychoudhury, "Automated program repair," Commun. ACM, vol. 62, no. 12, 2019. doi: 10.1145/3318162.
6. A. Ramírez, J. R. Romero, and C. L. Simons, "A systematic review of interaction in search-based software engineering," IEEE Trans. Softw. Eng., vol. 45, no. 8, pp. 760–781, Aug. 2019. doi: 10.1109/TSE.2018.2803055.
7. N. Alshahwan et al., "Deploying search based software engineering with Sapienz at Facebook," in Software Engineering. SSBSE 2018 (Lecture Notes in Computer Science, vol. 11036), T. Colanzi and P. McMinn, Eds. Birkhäuser Verlag AG: Springer, Cham. [Online]. Available: https://doi.org/10.1007/978-3-319-99241-9_1
8. K. Mao, "Sapienz: Intelligent automated software testing at scale," Facebook Engineering, May 2, 2018. https://engineering.fb.com/2018/05/02/developer-tools/sapienz-intelligent-automated-software-testing-at-scale/ (accessed May 10, 2021).
9. S. Kirbas et al., "On the introduction of automatic program repair in Bloomberg," IEEE Softw., vol. 38, no. 4, pp. 43–51, July/Aug. 2021. doi: 10.1109/MS.2021.3071086.
10. V. Alizadeh, M. A. Ouali, M. Kessentini, and M. Chater, "RefBot: Intelligent software refactoring bot," in Proc. IEEE/ACM Int. Conf. Automated Software Engineering, 2019, pp. 823–834.
11. J. Ivers, I. Ozkaya, R. L. Nord, and C. Seifried, "Next generation automated software evolution refactoring at scale," in Proc. ESEC/SIGSOFT FSE, 2020, pp. 1521–1524.
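As a small illustration of the Pareto-front notion discussed in the article's closing section, the sketch below filters a set of candidate solutions down to those that no other candidate dominates. It is an assumption-laden example, not code from Sapienz or any cited tool: the candidate names, the three objectives, and their values are hypothetical, and all objectives are treated as quantities to minimize.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on at least one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the names of candidates that are not dominated by any other candidate."""
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical test suites scored as (uncovered statements, sequence length, runtime in seconds).
suites = {
    "A": (10, 40, 3.0),
    "B": (12, 25, 2.0),
    "C": (10, 45, 3.5),  # dominated by A
    "D": (30, 10, 1.0),
    "E": (12, 30, 2.0),  # dominated by B
}
print(pareto_front(suites))  # prints ['A', 'B', 'D']
```

The three surviving suites are each best on a different objective, so none can be called the single best solution. Choosing among them is where developer preferences and feedback would come in.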