AMSTAT NEWS, January 2020, Issue #511
January 2020 • Issue #511
AMSTATNEWS
The Membership Magazine of the American Statistical Association • http://magazine.amstat.org

20 Years of Statistics Advocacy: After Two Busy Decades, SPAAC Wants to Be Busier

ALSO:
Twelve Years of ASA Science Policy: Highlighting the Scope and Breadth
International Prize in Statistics Nominations to Open Soon
AMSTAT NEWS
JANUARY 2020 • ISSUE #511

Executive Director
Ron Wasserstein: ron@amstat.org
Associate Executive Director and Director of Operations
Stephen Porzio: steve@amstat.org
Senior Advisor for Statistics Communication and Media Innovation
Regina Nuzzo: regina@amstat.org
Director of Science Policy
Steve Pierson: pierson@amstat.org
Director of Strategic Initiatives and Outreach
Donna LaLonde: donnal@amstat.org
Director of Education
Rebecca Nichols: rebecca@amstat.org
Managing Editor
Megan Murphy: megan@amstat.org
Editor and Content Strategist
Val Nirala: val@amstat.org
Production Coordinators/Graphic Designers
Olivia Brown: olivia@amstat.org
Megan Ruyle: meg@amstat.org
Advertising Manager
Claudine Donovan: claudine@amstat.org
Contributing Staff Members

Amstat News welcomes news items and letters from readers on matters of interest to the association and the profession. Address correspondence to Managing Editor, Amstat News, American Statistical Association, 732 North Washington Street, Alexandria VA 22314-1943 USA, or email amstat@amstat.org. Items must be received by the first day of the preceding month to ensure appearance in the next issue (for example, June 1 for the July issue). Material can be sent as a Microsoft Word document, PDF, or within an email. Articles will be edited for space. Accompanying artwork will be accepted in graphics file formats only (.jpg, etc.), minimum 300 dpi. No material in WordPerfect will be accepted.

Amstat News (ISSN 0163-9617) is published monthly by the American Statistical Association, 732 North Washington Street, Alexandria VA 22314-1943 USA. Periodicals postage paid at Alexandria, Virginia, and additional mailing offices. POSTMASTER: Send address changes to Amstat News, 732 North Washington Street, Alexandria VA 22314-1943 USA. Send Canadian address changes to APC, PO Box 503, RPO West Beaver Creek, Rich Hill, ON L4B 4R6. Annual subscriptions are $50 per year for nonmembers. Amstat News is the member publication of the ASA. For annual membership rates, see www.amstat.org/join or contact ASA Member Services at (888) 231-3473.

American Statistical Association
732 North Washington Street
Alexandria, VA 22314-1943 USA
(703) 684-1221
ASA GENERAL: asainfo@amstat.org
ADDRESS CHANGES: addresschange@amstat.org
AMSTAT EDITORIAL: amstat@amstat.org
ADVERTISING: advertise@amstat.org
WEBSITE: http://magazine.amstat.org
Printed in USA © 2020 American Statistical Association

The American Statistical Association is the world’s largest community of statisticians. The ASA supports excellence in the development, application, and dissemination of statistical science through meetings, publications, membership services, education, accreditation, and advocacy. Our members serve in industry, government, and academia in more than 90 countries, advancing research and promoting sound statistical practice to inform public policy and improve human welfare.

features
3 President’s Corner
5 Highlights of the November 2019 ASA Board of Directors Meeting
7 Formal Privacy: Making an Impact at Large Organizations
10 Nominations Sought for ASA President-Elect
10 ASA Seeks New Science Policy Fellow
11 International Prize in Statistics Nominations to Open Soon
12 NERDS Workshop Fills Void
13 Triage Judges Needed for COMAP Modeling Contest
14 20 Years of Statistics Advocacy: After Two Busy Decades, SPAAC Wants to Be Busier

columns
16 SCIENCE POLICY
Twelve Years of ASA Science Policy: Highlighting the Scope and Breadth
This column is written to inform ASA members about what the ASA is doing to promote the inclusion of statistics in policymaking and the funding of statistics research. To suggest science policy topics for the ASA to address, contact ASA Director of Science Policy Steve Pierson at pierson@amstat.org.

22 STATS4GOOD
Data for Good: The Year in Review
This column is written for those interested in learning about the world of Data for Good, where statistical analysis is dedicated to good causes that benefit our lives, our communities, and our world. If you would like to know more or have ideas for articles, contact David Corliss at davidjcorliss@peace-work.org.

24 STATtr@k
What Supports the Big Tent for Statistics and Data Science?
STATtr@k is a column in Amstat News and a website geared toward people who are in a statistics program, recently graduated from a statistics program, or recently entered the job world. To read more articles like this one, visit the website at http://stattrak.amstat.org. If you have suggestions for future articles, or would like to submit an article, please email Megan Murphy, Amstat News managing editor, at megan@amstat.org.
departments
29 meetings
ASA to Cosponsor AI in Clinical Drug Development Symposium in May

member news
30 Awards and Deadlines
33 People News
35 Section • Chapter • Committee News
38 Calendar of Events
43 Professional Opportunities

[Photo, page 33] Winston Richards (left) awards the Winston A. Richards Prize in Statistics to Ariel Stewart. Ariel had the best II and III performance in statistics.

ASA Day Contest Winners

History Quiz Contest
Question 1: What library is used to archive ASA historical materials?
Answer: Iowa State University Parks Library
Winner: Karl Broman
Question 2: What ASA president served as president (chancellor) of the University of Rochester and served as an adviser to US presidents Dwight Eisenhower, Richard Nixon, Gerald Ford, and Ronald Reagan?
Answer: W. Allen Wallis
Winner: Steve Wang

Haiku Contest
Student Entry First Place: Doha Akad
As a bird far flies
As stars are above the skies
Statistics applies

Non-Student Entry First Place: Larry Lesser*
So much more than math
ASA illuminates
like Nightingale’s lamp

Non-Student Entry Second Place (Tie): Anne Milley
Many shades of gray
Statistics leads the way
to understand much more

Non-Student Entry Second Place (Tie): Barry Nussbaum
One eighty years old
Helping a profession grow
ASA still bold

*Larry Lesser has previously published statistics poetry (e.g., April 2018 Amstat News) and has several more statistics poems (including a haiku) in this month’s issue of Journal of Humanistic Mathematics.

Follow us on Twitter: www.twitter.com/AmstatNews
Join the ASA Community: http://community.amstat.org
Like us on Facebook: www.facebook.com/AmstatNews
Follow us on Instagram: www.instagram.com/AmstatNews
president's corner
Take Advantage of the ASA’s Data Challenge Opportunities
Wendy Martinez

One might expect the first column of an ASA president to describe the ASA initiatives for the coming year. I am going to deviate from this practice and write about our initiatives in the February issue of Amstat News. Instead, I will focus on ASA data challenge opportunities in this first article.

Many readers might be familiar with the Kaggle competitions (www.kaggle.com/competitions), the KDD Cup (www.kdd.org/kdd-cup), and, of course, the famous $1 million Netflix prize (www.netflixprize.com). However, did you know the ASA has a history of issuing data challenges that pre-date all of these? The Statistical Computing and Statistical Graphics sections have held a Data Exposition competition with entries being presented and judged at the Joint Statistical Meetings (https://bit.ly/2RKVcRP) since 1983. Some of these data sets (such as the airline on-time performance data from Data Expo 2009) continue to be used to demonstrate and teach statistical machine learning concepts, which illustrates the importance and impact of these challenges.

[Image caption] ThisIsStatistics has an annual Fall Data Challenge for high-school and college students and a spring competition, Statsketball, which uses statistics to make predictions about the NCAA Basketball Tournament.

The ASA also has an annual Fall Data Challenge for high-school and college students hosted at https://thisisstatistics.org. This challenge typically focuses on a call to address real problems affecting our society. For example, the 2019 Fall Data Challenge used data from the US Department of Housing and Urban Development (HUD) relating to Los Angeles, New York City, and Seattle. There is also a spring competition, Statsketball, to keep the excitement going throughout the school year. This contest uses statistics to make predictions about the NCAA Basketball Tournament.

Then there is the Statistical Impact Competition (https://bit.ly/2rCay0b), which was part of 2019 ASA President Karen Kafadar’s impact initiative and JSM theme, Statistics: Making an Impact. The goal of this challenge was to use data to illustrate areas that have been and could be impacted by the field of statistics. Submissions for the competition have been received and will form the foundation of an Innovation Workshop to be held in the spring of 2020. Participants at the workshop will share ideas, which will result in transdisciplinary collaborations impacting our world.

Another data challenge opportunity will be announced in mid-January. This challenge will be issued as part of the Women in Data Science (WiDS) Conference being held March 2. The WiDS conference is a global event during which data scientists from around the world come together virtually and locally to inspire data scientists, regardless of gender (www.widsconference.org). Regional events can be organized, and there has been one in the DC-MD-VA area for the past several years. We are issuing a data challenge as part of the WiDS 2020 DC-MD-VA regional event, but we are still working on what data set will be used. The plan is to issue the challenge in January, and contestants will present results at the DC-MD-VA WiDS 2020. So, stay tuned to ASA communication channels for details and think about organizing a WiDS event in your region!

Now, back to the longtime data challenges held at the Joint Statistical Meetings, because now is the time to consider entering. Three ASA sections (Computing, Government, and Graphics) came together to sponsor a now-annual Data Challenge Expo. The contest is open to anyone who is interested in participating, including college students and professionals from the private or public sector. This contest challenges participants to analyze a data set using statistical and visualization tools and methods.

The data set for the Data Challenge Expo 2020 is the Global Historical Climatology Network (GHCN). Public use data files and documentation are available at https://bit.ly/2P7rAME. Contestants must use some portion of the GHCN data, but are strongly encouraged to combine other data sources in their analysis such as IPUMS (https://ipums.org), NASA’s EarthData (https://earthdata.nasa.gov), the European Data Portal (https://data.europa.eu), or the National Agricultural Statistics Service (https://bit.ly/2LDPCww).

There are two GHCN data sets containing climate data from land surface stations placed around the world and ranging in time from 175 years ago to the past hour. One data set (GHCN Monthly) contains monthly mean temperatures that can be used for climate monitoring. However, the data set that would perhaps be more useful for entries in the competition is the GHCN Daily database. For instance, these data could be used for understanding changes in various growing seasons, assessing the frequency of heavy rainfall and other weather patterns, and describing the frequency of heat waves (see “An Overview of the Global Historical Climatology Network-Daily Database” in the Journal of Atmospheric and Oceanic Technology at https://bit.ly/2E49SmL).

Here are some questions to think about for an analysis; however, contestants should not feel constrained by them. They are just to get the ideas flowing.
• Is there a long-term trend with respect to temperature? Are there any outliers or anomalies in space or time?
• Is there a spatial pattern with respect to temperature changes?
• Are there different geographic regions/clusters that behave differently (e.g., increases, no increases at all, or decreases)?
• Can you construct a spatio-temporal model that predicts temperatures in 2030 (i.e., some slight extrapolation)? What else might affect the temperatures 10 years from now?

Contestants will present their results in a speed poster session at JSM and must submit their abstracts to the JSM online system. Note that judging takes place at JSM and is based on the results presented there. Presenters are responsible for their own JSM registration and travel costs, as well as any other costs associated with JSM attendance. Group submissions are acceptable. To enter, contestants must do the following by February 4:
• Submit an abstract for a speed poster session via the JSM 2020 website (https://bit.ly/2E6ocew). Specify the Statistical Computing Section as the main sponsor. You may include the Government Statistics Section and Statistical Graphics Section as additional sponsors.
• Forward the JSM abstract submission email to me at martinez.wendy@bls.gov.

I would like to end this first column by thanking the outgoing ASA Board members, Lisa LaVange (2018 ASA President), David Williamson (vice president), Amarjot Kaur (treasurer), James Lepkowski (Council of Sections representative), Cynthia Bocci (international representative), and Julia Sharp (Council of Chapters representative), for their service to our profession. Also, of course, I want to welcome our newest board members: Rob Santos (2021 ASA President), Dionne Price (vice president), Ruixiao Lu (treasurer), Rebecca Hubbard (Council of Sections representative), Alexandra Schmidt (international representative), and Ji-Hyun Lee (Council of Chapters representative). And to all of our members, thank you for letting us serve you.
Highlights of the November 2019 ASA Board of Directors Meeting

ASA President Karen Kafadar convened the final ASA Board meeting of 2019 at the ASA headquarters in Alexandria, Virginia, November 22–23. The 2019 Board of Directors were joined by the incoming 2020 board members. The highlights of the board meeting follow.

Discussion Items
• As it does annually, the board discussed the status of committees in the Education Council and the Professional Issues and Visibility Council. These councils serve as the connection between their committees and the board. The board expressed gratitude for the great work these committees do on behalf of the profession and the association.
• The board welcomed two former US chief statisticians, Katherine Wallman and Hermann Habermann. They briefed the board on the citizenship data collection being carried out by the Census Bureau using administrative data per Executive Order 13880.

Action Items
• The board changed the names of three committees:
  • The ASA/MAA Joint Committee on Undergraduate Statistics will become the ASA/MAA Joint Committee on Undergraduate Statistics Education (and, perhaps, further be changed to ASA/MAA Joint Committee on Undergraduate Statistics and Data Science Education, pending MAA approval). The size of this committee was reduced at the suggestion of the committee.
  • The ASA/NCTM Joint Committee on Curriculum in Statistics and Probability will become the ASA/NCTM Joint Committee on K–12 Education in Statistics and Probability. A change to the charge of the committee was made as well.
  • The ASA LGBT Concerns Committee will become the ASA LGBTQ+ Advocacy Committee.
• The board created the ASA Task Force on Statistical Significance and Reproducibility. Its charge is to develop thoughtful principles and practices the ASA can endorse and share with scientists and journal editors. The task force will be appointed by Kafadar with advice from the ASA Board and make its recommendations to the board by November 2020.
• As it does each year, the board reviewed the ASA’s strategic plan and how it is being implemented by the association. While no action to change the plan was taken, several suggestions for improved implementation will be followed up on by staff and board leadership.
• The board appointed initial members to the ASA Review Board, which is the body responsible for carrying out the policies for violations of the ASA Activities Conduct Policy. Board members Katherine Monti, Dionne Price, and Ron Wasserstein are the initial members of this review board.

Reported Items
• Associate Executive Director and Director of Operations Steve Porzio summarized the ASA’s financial activity through September 30, 2019. He said the ASA’s financial health is very good, with net assets over $21 million. He predicts a positive annual net revenue at year’s end, but that depends on market activity the rest of the year.
• ASA Treasurer Amarjot Kaur reported on the ASA’s investments. She noted the ASA’s portfolio had gained nearly $3 million in value in the first three quarters of 2019.
• Amanda Malloy, ASA director of development, summarized the results of ASA Giving Day. We raised more than $80,000 from 300+ donors. The number of donors on ASA Giving Day was much higher than last year, which, Malloy noted, is exactly the goal of Giving Day. Malloy also noted there are more than 90 members in the Helen Walker Society (HWS). HWS members are those who have given at least $1,000 to the ASA in the past year.
• The board received progress reports on the strategic initiatives launched by Kafadar. In addition, ASA President-elect Wendy Martinez updated the board on planned activities for 2020.
• The Council of Chapters Governing Board (COCGB) and Council of Sections Governing Board (COSGB) reported on their recent activities. The COCGB was actively supportive of Giving Day, has launched its new reporting mechanism, continues to monitor chapter health, and continues to improve the traveling course program. The COSGB reported on the continued growth in the number of interest groups and changes in procedures to financially support new sections. It also made suggestions on ways the website could be easier to navigate for section use.
• Mark Glickman, co-chair of the ad hoc Advisory Committee on Data Science, updated the board on the progress of that committee, noting recommendations will be coming to the April 2020 board meeting. Tian Zheng, chair of the Section of Statistical Learning and Data Science (now the largest section of the ASA), updated the board on activities of the section and shared her perspective on what is going on in various data science communities. Glickman and Zheng then answered questions from the board.
• The board heard an update from Andreas Georgiou, the former president of ELSTAT (the Hellenic Statistical Authority), on the status of his continued trials and tribulations in the Greek court system. After meeting with Georgiou, the board approved another statement of support for him.
• Steve Snapinn, chair of the Publications Committee, and Scott Evans, publications representative to the Board of Directors, reported on discussions of the CHANCE Magazine Task Force. The board is considering a recommendation from the task force to rebrand CHANCE as a data science–focused publication.

The board will next meet April 3–4 at the ASA headquarters.

2019 Board of Directors
Karen Kafadar, President
Wendy Martinez, President-elect
Lisa LaVange, Past President
David Williamson, 3rd-Year Vice President
Katherine Monti, 2nd-Year Vice President
Richard De Veaux, 1st-Year Vice President
Julia Sharp, 3rd-Year Council of Chapters Representative
Don Jang, 2nd-Year Council of Chapters Representative
Anamaria Kazanis, 1st-Year Council of Chapters Representative
Jim Lepkowski, 3rd-Year Council of Sections Representative
Katherine Halvorsen, 2nd-Year Council of Sections Representative
Mark Glickman, 1st-Year Council of Sections Representative
Cynthia Bocci, International Representative
Scott Evans, Publications Representative
Amarjot Kaur, Treasurer
Ron Wasserstein, Executive Director and Board Secretary

2020 Board of Directors
Wendy Martinez, President
Rob Santos, President-elect
Karen Kafadar, Past President
Katherine Monti, 3rd-Year Vice President
Richard De Veaux, 2nd-Year Vice President
Dionne Price, 1st-Year Vice President
Don Jang, 3rd-Year Council of Chapters Representative
Anamaria Kazanis, 2nd-Year Council of Chapters Representative
Ji-Hyun Lee, 1st-Year Council of Chapters Representative
Katherine Halvorsen, 3rd-Year Council of Sections Representative
Mark Glickman, 2nd-Year Council of Sections Representative
Rebecca Hubbard, 1st-Year Council of Sections Representative
Alexandra Schmidt, International Representative
Scott Evans, Publications Representative
Ruixiao Lu, Treasurer
Ron Wasserstein, Executive Director and Board Secretary
COMMITTEE ON PRIVACY AND CONFIDENTIALITY
Formal Privacy: Making an Impact at Large Organizations
A JSM 2019 Session Summary

MORE ONLINE
To access the presentations from JSM, visit https://bit.ly/36UvJt8.
The ASA Privacy and Confidentiality Committee is sponsoring a webinar on Privacy Day, January 28. The speaker is Michael Hawes, senior advisor for data access and privacy at the US Census Bureau. Details will be provided at https://bit.ly/2RWbNlL.

With the growing amount of data collected every day, data confidentiality is increasingly at risk. Many of the traditional approaches to statistical disclosure control are no longer deemed sufficient to protect the confidentiality of the data. Formal privacy guarantees are provable privacy guarantees that typically hold, regardless of assumed knowledge and attack strategy of a malicious user. The formal privacy guarantees are especially important for large producers of statistics, such as national statistical agencies or large private companies. These organizations are increasingly designing and engineering systems with improved disclosure limitation systems, with strong consideration for formal privacy.

To learn more about this, the Committee on Privacy and Confidentiality (https://bit.ly/2RWbNlL) organized a Joint Statistical Meetings topic-contributed session, Formal Privacy: Making an Impact at Large Organizations. The session brought together four experts from large organizations who have developed, proposed, and implemented formal privacy models or variants of differential privacy. The presentations described challenges, discussed how the challenges were met, and provided an outlook for future implementation of formal privacy.

Lars Vilhuber of Cornell University, a member of the Committee on Privacy and Confidentiality, organized the session. The committee’s co-chair, Aleksandra Slavkovic of The Pennsylvania State University, moderated the panel.

Simson Garfinkel of the US Census Bureau gave a talk titled “Deploying Differential Privacy for the 2020 Census of Population and Housing.” The 2020 decennial census requires an actual enumeration. The data is collected under a pledge of confidentiality.

The 2010 Census data released to the public used a disclosure avoidance technique called household swapping. Swapping was limited to households within a state and of the same size. However, the swapping rate is confidential.

More recently, the Census Bureau conducted a reconstruction attack of the 2010 Census and re-identified data from 17% of the US population. The Census Bureau began to look for new approaches and has adopted differential privacy for the 2020 Census and Economic Census. The bureau is also working toward a similar solution for the American Community Survey, though no final decisions have been made.

Garfinkel noted that, despite its size, the decennial census is the easiest to make differentially private. There are only six variables per person: age, sex, race, ethnicity, relationship to householder, and location. There are no weights since it is a census.

The Disclosure Avoidance System (DAS) developed by the bureau allows it to enforce global confidentiality protections that rely on injections of formally private noise. The advantages of noise injection with formal privacy are transparency, tunable privacy guarantees (privacy guarantees do not depend on external data), protection against accurate database reconstruction, and protection of individual data. The challenges are that the entire country must be processed at once for best accuracy and every use of confidential data must be tallied in the privacy-loss budget. To do this, the Census Bureau created new differential privacy algorithms and processing systems (the aforementioned DAS) that produce accurate statistics for large populations (e.g., states and counties), constructed protected microdata that can be used for any tabulation without additional privacy loss, and fit the system into the decennial census production system.

The basic approach to creating a differentially private decennial census is to treat the entire census as a set of queries on histograms. The selected queries measure six geolevels (nation, state, county, tract, block group, block) and allow thousands of queries per geounit, resulting in billions of queries overall. Each histogram therefore has billions of cells.
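To make the idea of noise injection on a histogram concrete, here is a minimal sketch in Python. It is not the Census Bureau's DAS; it swaps in the textbook Laplace mechanism, and it assumes that neighboring data sets differ by adding or removing one person, so the whole histogram has L1 sensitivity 1. The function names and the toy cells are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng=None):
    """Return epsilon-differentially-private noisy counts for disjoint cells.

    Assumption: adding or removing one person changes exactly one cell by 1,
    so the L1 sensitivity is 1 and the Laplace scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    return {cell: n + laplace_noise(scale, rng) for cell, n in counts.items()}


# Hypothetical toy histogram: (age group, sex) -> number of people.
toy_counts = {
    ("0-17", "F"): 12,
    ("0-17", "M"): 9,
    ("18+", "F"): 40,
    ("18+", "M"): 37,
}
released = noisy_histogram(toy_counts, epsilon=0.5, rng=random.Random(2020))
for cell, value in released.items():
    print(cell, round(value, 2))
```

A smaller epsilon means a larger noise scale and a stronger guarantee, which is the tunable tradeoff described above. The noisy values can be negative or fractional; turning them back into consistent, nonnegative tables is a separate post-processing step that this sketch leaves out.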
The Census Bureau first created a block-by-block algorithm designed to independently protect each block by measuring queries for each block, privatizing queries, and then converting results back to microdata. It also developed a top-down mechanism by first generating a national histogram without geographic identifiers and then allocating counts to each geography from the “top down.” This approach is easy to parallelize, and each geounit can have its own strategy selection. Using the high-dimensional matrix mechanism, there is parallel composition at each geolevel and reduced variance for many aggregate regions.

The Census Bureau then tested both algorithms on the 1940 census data, available at IPUMS. It turns out the advantages of the “top-down” mechanism outweigh the disadvantages when compared to the “block-by-block” mechanism on various measures, and the Census Bureau has opted to implement the “top-down” algorithm. Various runs of the 1940 data through the DAS, covering various values of the privacy parameter epsilon, were released to the public and are available to researchers (see https://bit.ly/2Ei1sbH).

Garfinkel also noted several organizational challenges. For one, all uses of confidential data need to be tracked and accounted for. Ideally, all desired queries (tables) should be known in advance, together with their desired accuracy. Furthermore, the verification of correct implementation is a check. Finally, traditional tabulations rely on data quality checks, but under differential privacy, these must be conducted without looking at the confidential raw data! The largest policy challenge, however, is the choice and allocation of the privacy budget.

Finally, the data user concerns are even more challenging, as is the determination of the right value of epsilon. See Disclosure Avoidance and the 2020 Census (https://bit.ly/38Eb2TZ) for more information about differential privacy and the 2020 Census.

Ilya Mironov, recently at Google and now at Facebook, gave a talk titled “Differential Privacy in the Industry: Challenges and Successes.” A differential privacy framework measures the privacy guarantees provided by an algorithm. In this context, he described modalities of privacy, as practiced at Google. To frame the discussion, he provided a cross-classification of various algorithms by where the data are stored (distributed or centrally) and by what use is made of the algorithm and the data (statistics/analytics or machine learning). Mironov said, “Statistics is old school and machine learning is where industry is heading.” This raises an important question for our statistics community: Why such a perception of statistics?

The goal for distributed data analytics is to learn about the data from distributed sources, such as individual devices (or other distributed data or database settings). Mironov described the use of the RAPPOR (randomized aggregatable privacy-preserving ordinal response) algorithm in the Google Chrome browser. It has inspired new theory and applications. The main challenges are that the absolute error increases with the square root of N and there is privacy loss over time.

He then went on to describe the development of a new software stack called Cobalt as part of the new Fuchsia operating system, still within the context of statistical analysis of distributed data (distributed analytics). It is also based on randomized response. The main challenge is who is anonymizing the data. The anonymization methodology must be transparent. There are various options enforced by organizational methods.

Data analytics on centrally stored data is, according to Mironov, the “standard setting” in the differential privacy world. Examples include privacy integrated queries (PINQ), an early implementation of a data analysis platform designed to provide unconditional privacy guarantees for the records of the underlying data sets. The main challenges and risks are mission creep and the expense of implementing the platform over time, forcing the analysts to make choices.

There are two main approaches to differentially private machine learning (in the context of centrally stored data): a family of algorithms called private aggregation of teacher ensembles (PATE) and the differentially private stochastic gradient descent (DP-SGD) method. PATE uses a collection of hundreds of models to train a student model. DP-SGD applies differential privacy at each gradient step. According to Mironov, DP-SGD is a better fit for standard machine learning pipelines.

Mironov also said, “Right now, machine learning is more of an art than a science, which requires adjustments to models to train the models for privacy.” Again, this is a sentiment familiar to the statistics community and often heard when describing data analysis with real data versus pure mathematical modeling.
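The randomized response idea behind the distributed setting can be sketched in a few lines of Python. This is plain binary randomized response, not RAPPOR or Cobalt themselves (RAPPOR adds Bloom filters and two stages of randomization); the function names, the value of epsilon, and the simulated population are all assumptions made for illustration.

```python
import math
import random


def randomize_bit(true_bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_bit if rng.random() < p_truth else 1 - true_bit


def estimate_fraction(reports, epsilon):
    """Debias the mean of the noisy reports to estimate the true fraction of 1s."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[report] = (1 - p_truth) + fraction * (2 * p_truth - 1); solve for fraction.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


# Hypothetical population: 100,000 devices, 30% of which have the feature on.
rng = random.Random(42)
true_bits = [1] * 30_000 + [0] * 70_000
reports = [randomize_bit(b, epsilon=1.0, rng=rng) for b in true_bits]
print(round(estimate_fraction(reports, epsilon=1.0), 3))
```

Each device randomizes its own answer before sending it, so the collector never sees a true value. The price is accuracy: the error of the estimated fraction shrinks like one over the square root of N, which means the absolute error of the estimated count grows like the square root of N, matching the challenge noted above.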
Juan M. Lavista Ferres of Microsoft gave a talk titled “Differential Privacy in Windows 10, and Why Many DP Implementations Fail.” Introduced in 2015, Windows 10 is a series of personal computer operating systems produced by Microsoft. Microsoft collects metrics in an anonymous way as part of telemetry, a service that gathers technical data about how Windows 10 devices and their related software are working and periodically sends this data to Microsoft so issues can be fixed. Users have the option to opt out of telemetry. Hundreds of millions of devices don’t opt out. The problem, as Ferres showed, is that the information from opt-out machines is not missing at random.
In telemetry, data is systematically collected many times across the lifetime of a device, which results in a privacy leakage problem. The solution is to discretize the numbers into buckets, that is, to represent or approximate a quantity or series using a discrete quantity or quantities. To address this challenge, Microsoft developed a solution that could provide the signal without affecting the privacy of the individuals. Using a new approach to the local differential privacy (LDP) model, differential privacy is adapted for repeated collection of counter data and is applied before the data is transmitted. Windows 10 includes an API allowing developers to leverage a built-in differential privacy solution.
Turning to implementation challenges, Ferres stated that many differential privacy projects fail because customers do not understand the solution. Ninety percent of developers surveyed had never heard of DP. Once introduced to it, they tend to think it is a magic box that can solve all their problems. A common frustration is that they can query global models, but not the individual data. The data is not accessible in a raw format.
Ferres also explained that he is passionate about DP because it can provide data-driven input to health issues such as Sudden Infant Death Syndrome (SIDS). The current approach for accessing data for research at the US Centers for Disease Control and Prevention requires writing scripts, submitting them to a trusted curator, seeking approval, and finally being able to run the script. This process takes three months and costs $900 for each script. Ferres said, “Research doesn’t work if every query takes three months to run.” He concluded by noting, “Differential privacy can be an amazing tool for opening these data sets while preserving the privacy of the individuals.”
Shiva Kasiviswanathan of Amazon stated that differential privacy provides provable protection and allows clear quantification of privacy losses; however, there are challenges with implementing differential privacy at Amazon. Some are technology-oriented, while others are based on human and cultural factors:
• Different teams own different services, so differential privacy products have to be negotiated across teams
• The teams do not have proper differentially private data cleaning and exploration tools
• Software developers want code they can start with, not technical papers
• Explaining the legal implications of differential privacy is challenging
A large body of research has been developed to design algorithms and tools to achieve differential privacy, understand the privacy-utility tradeoffs in different data access setups, and integrate differential privacy with machine learning and statistical inference. Amazon is working to address privacy challenges, especially by building differential privacy tools that are accessible to developers (both within and outside of Amazon).
Kasiviswanathan mentioned the autoDP package maintained by Yu-Xiang Wang on GitHub. It implements Rényi DP (which goes back to Mironov) and is particularly useful when the data set is accessed by a sequence of randomized mechanisms. This approach weighs the tradeoffs through a privacy calibrator that numerically calibrates noise to privacy requirements. They are working to integrate this with Apache MXNet, a fast and scalable training and inference framework with an easy-to-use, concise API for machine learning.
Kasiviswanathan briefly described other privacy projects at Amazon, such as participant roles and analysis of false discovery rates.
Aleksandra Slavkovic of The Pennsylvania State University moderated the discussion at the end of the session. There was a focus on topics including achieving higher accuracy in large aggregations (e.g., large cities), defining federated learning (combining traditional and differential privacy methods), and how the privacy-loss budget will be set.
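The idea Ferres described, discretizing a counter into buckets and applying privacy protection on the device before anything is transmitted, can be illustrated with a small sketch. This is not Microsoft’s mechanism or the Windows 10 API. It assumes a generic textbook technique, k-ary randomized response, and the bucket edges, the epsilon value, and all function names are hypothetical choices made for illustration. It also covers a single report per device only; the repeated-collection problem mentioned in the talk needs additional machinery that is not shown.

```python
import math
import random
from collections import Counter


def bucketize(value, edges):
    """Return the index of the first bucket whose upper edge exceeds value.

    `edges` is an increasing list of upper bounds (hypothetical, chosen by
    the analyst); values at or above the last edge fall in a final bucket.
    """
    for index, upper in enumerate(edges):
        if value < upper:
            return index
    return len(edges)


def randomize(bucket, k, epsilon, rng):
    """Device side: k-ary randomized response over k >= 2 buckets.

    The true bucket is reported with probability e^eps / (e^eps + k - 1);
    otherwise one of the other k - 1 buckets is reported uniformly.
    """
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return bucket
    other = rng.randrange(k - 1)  # pick among the k - 1 other buckets
    return other if other < bucket else other + 1


def estimate_counts(reports, k, epsilon):
    """Server side: unbiased estimate of how many devices are in each bucket."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)  # P(report = truth)
    q = 1.0 / (math.exp(epsilon) + k - 1)                # P(report = a given other bucket)
    observed = Counter(reports)
    # E[observed_b] = n_b * p + (n - n_b) * q, solved for n_b.
    return [(observed.get(b, 0) - n * q) / (p - q) for b in range(k)]
```

In this sketch, each device would call `bucketize` and then `randomize` locally, and the collector would pass the list of noisy reports to `estimate_counts`. A smaller epsilon gives stronger protection for each device but noisier estimated counts, which is the privacy-utility tradeoff the speakers referred to.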
ASA Seeks New Science Policy Fellow
The ASA is accepting applications for its science policy fellowship for fall 2020. A one- to two-year position, the fellow will be based at the ASA headquarters in Alexandria, Virginia; however, they will spend the bulk of their time in Washington, DC, advocating for statistics and experiencing firsthand how federal science policy is formed. Applications are due by March 31, but the ASA will consider particularly high-quality applications until the position is filled.
The fellowship was created to elevate the profile of statistics in policymaking and advocate on behalf of the profession. Amy Nussbaum was the ASA’s inaugural science policy fellow, and Daniel Elchert is the second and current fellow.
Recently, Elchert spearheaded the ASA’s involvement with the social media company Reddit by organizing Ask Me Anything (AMA) digital town hall events to highlight statistical perspectives on issues of the day, starting with the decennial census (https://bit.ly/2LH8k6i).
Through Capitol Hill meetings, media outreach, grassroots organizing, and coalition building, he has helped lead the ASA’s advocacy for evidence-based policymaking and championed principal federal statistical agencies such as the Economic Research Service in the United States Department of Agriculture and the National Center for Education Statistics in the Department of Education.
Elchert is also leading the ASA’s State of the US Data Infrastructure article series by interfacing with former federal statistical agency leaders to create thought pieces highlighting statistical agencies as key to our country’s capacity to make policy decisions informed by data and evidence (https://bit.ly/2PwSqgf). As part of this effort, Elchert led the creation and building of the LinkedIn group Count on Stats (https://bit.ly/2qDU5If), a growing community of professionals with an interest and stake in the work of the federal statistical agencies.
Finally, Elchert also served as co-author of the ASA’s first survey of master’s graduates, providing insight about graduates’ degree satisfaction and related job market demands (https://bit.ly/358sa25).
Nussbaum represented the ASA at meetings from the National Academies to Capitol Hill and even introduced her own member of Congress to climate scientists. Among the many projects she worked on were the documents “Guidance on Statistical Evidence in Legislation,” “Recommendations to Funding Agencies for Supporting Reproducible Research,” and “Guidance for Service on Federal Advisory Boards and Committees.”
MORE ONLINE: Learn more about the ASA Science Policy Fellowship at https://bit.ly/38nlqj0 and view a video of former fellow Amy Nussbaum discussing her experience at https://bit.ly/2P9QgUQ. Questions about this opportunity may be directed to ASA Director of Science Policy Steve Pierson at pierson@amstat.org.
Nominations Sought for ASA President-Elect
Send the ASA your picks!
Nominations are being sought for ASA president-elect and vice president candidates for the 2021 election year. Yes, the 2020 elections have yet to be held, but the Committee on Nominations needs time to evaluate recommendations to propose the best possible slate of candidates for these critical positions.
As a member of the ASA, you recognize the importance of leadership in our diverse, complex, and multidisciplinary field. You and all fellow ASA members deserve visionary leaders who can ensure our discipline has a voice at the table when appropriate, whether it be in academe; research firms; federal, state, or local government; or nonprofit organizations. This is why we need your input.
For this election cycle, the president-elect will be selected from government and the vice president will be selected from academe. Think about your colleagues and associates who are members of the ASA and would make good candidates for these positions. Think about members who have helped run a conference or are active in your section or chapter. Then, nominate your choices for the 2022 president-elect and vice president by emailing elections@amstat.org.
Supply as much information about your nominee as possible to assist the committee in researching each candidate thoroughly and discreetly.
The deadline for nominations is February 1, 2020.
International Prize in Statistics Nominations to Open Soon
The International Prize in Statistics, one of the highest honors in statistics, is awarded every two years to an individual or team “for major achievements using statistics to advance science, technology, and human welfare.”
Nominations for the 2021 International Prize in Statistics will open in early 2020. Here are some points to consider when choosing a nominee for the prize:
• The prize will be awarded for a single work or body of work, rather than for more diffuse reasons such as “lifetime achievement.” Not only should powerful and original ideas be recognized by the prize, but also contributions that lead to breakthroughs in other disciplines or works with important practical effects on the world.
• Generally, the prize will be awarded to individuals, but in some cases, groups of individuals working on similar ideas, or even teams of individuals or organizations, could be recognized.
• The recipient(s) must be living at the time of selection for the award.
• The 2021 prize will be announced in October 2020 and presented at the ISI World Statistics Congress in July 2021 in The Hague.
A nomination packet consists of the following:
• Name, address, phone number, and email address of nominator (person making the nomination)
• Name, address, and email address of the candidate (person being nominated)
• Nomination statement (maximum of 1,200 words) addressing why the candidate should receive this award (The statement should explain the contributions of the candidate in terms understandable to a non-specialist. The nomination statement should also indicate what the relationship is between the nominator and the candidate.)
• Copy of the candidate’s CV, listing publications, honors, service contributions, etc.
• Up to four letters of support (The committee reserves the right to contact the nominator and writers of the support letters to seek additional information and insight.)
Unsuccessful nominations are carried over for one selection cycle (two years). For more information, visit https://bit.ly/2Prds0V.
MORE ONLINE: Learn more about the International Prize in Statistics at https://bit.ly/2Prds0V. Visit https://bit.ly/359CNSz to download the nomination form. Email the form and related materials to nominations@statprize.org by August 15.
NERDS Workshop Fills Void
Photo caption: Yongqiang Tang, Bingming Yi, Xinming Hao, Ming-Hui Chen, and Daoyuan Shi stop for a smile October 11 during the workshop.
Photo caption: John Loewy, Honghong Zhou, Jian Zhu, and L.J. Wei talk animatedly during the NERDS Workshop.
The New England Rare Disease Statistics (NERDS) Workshop, the first of its kind in the nation, attracted more than 160 attendees from all over the country on October 11, 2019. This sold-out event filled a void for statisticians working to bring cures for rare diseases.
The last 10–15 years have seen an emergence of drug development efforts in the rare disease space. Contributing factors include increased public awareness, encouraging drug regulation changes, scientific advancements in cellular/molecular biology and genetics, development of innovative trial designs, an influx of capital investment, and availability of scientific talent through decades of cultivation. As a result, a number of regulators, academicians, and industry statisticians now work to bring these orphan drugs to patients, facing unique technical issues and challenges.
Recognizing the need for a conference like NERDS, Ouhong Wang, vice president and head of biostatistics at Vertex, proposed the conference so statisticians across the rare disease drug development spectrum would have a forum to exchange ideas, share experiences, and network.
The one-day workshop included detailed presentations and discussions. Vertex’s incoming CEO, Reshma Kewalramani, kicked off the event by delivering the keynote speech, “Tyranny of Numbers.” The scientific program featured speakers including L.J. Wei of Harvard University, Robert Beckman of Georgetown University, Jingjing Ye of FDA, Ziliang Li of Vertex, Chenkun Wang of Vertex, Rima Izem of Children’s National, Ming-Hui Chen of the University of Connecticut, Qing Liu of Quantitative and Regulatory Medical Science, LLC, Balram Gundapaneni of Pfizer, Feng Tai of Agios, and Peng Sun of Biogen. In addition to several actual case studies, topics discussed included treatment effects in rare diseases, trial designs, pediatric trials, historical control, comparative effectiveness, and matching methods.
Attendees’ feedback was overwhelmingly positive, and there are plans to continue the workshop. For more information, visit the workshop website at https://nerds.nestat.org.
NERDS Organizing Committee
Yang Song, Vertex (Co-Chair)
Sammi Tang, Servier (Co-Chair)
Kun Chen, University of Connecticut
Charlie Cao, Biogen
Roee Gutman, Brown University
Mike Hale, Takeda
Daniel Meyer, Pfizer
Jeffrey Palmer, Pfizer
John Zhong, REGENXBIO
Triage Judges Needed for COMAP Modeling Contest
One Team Will Receive ASA Data Insights Award
In 2016, the Consortium for Mathematics and Its Applications (COMAP) added a data insights problem, Problem C, to its annual Mathematical Contest in Modeling (MCM, https://bit.ly/36F2XNe). In this new modeling challenge, teams are presented with a modeling problem and data set.
The American Statistical Association will designate one outstanding team as the winner of the ASA Data Insights Award, and qualified triage judges are needed to assist in the initial review of submissions. In mid-February, triage judges will receive the judging guidelines, initial allocation of papers to review, and examples. Also, there will be a web training session February 22 at 1 p.m. ET.
The MCM is open to both high-school students and college undergraduates. In 2020, more than 5,000 teams are anticipated to participate in Problem C.
If you are interested in serving as a Problem C triage judge, contact Dave Olwell at dholwell@me.com. Judging must be completed by March 22, and judges are compensated $10 per paper scored. For more information about Problem C, contact Stacey Hancock at stacey.hancock@montana.edu.
20 Years of Statistics Advocacy: After Two Busy Decades, SPAAC Wants to Be Busier
Steve Pierson, ASA Director of Science Policy
The ASA’s work to increase the visibility of statisticians in policy and more broadly goes back to at least the creation of the Scientific and Public Affairs Advisory Committee (SPAAC) nearly 25 years ago. SPAAC serves as a sounding board for a variety of policy issues the ASA may consider acting upon.
Activities of SPAAC have included discussing statistical perspectives on current issues and whether the ASA should sign onto letters circulating in the scientific community or send its own letter. The committee works closely with the ASA director of science policy and science policy fellow, who are the ASA staff liaisons to the committee.
The committee’s activities have covered an array of issues over the years. A regular activity has been to organize sessions for the Joint Statistical Meetings, which have covered topics such as statistics and the Supreme Court, statistical measurement on public policy, election integrity, and accuracy of election polls. The committee also hosts an annual JSM poster competition highlighting the contributions statisticians make to society, from health care and the economy to national security and the environment.
In the 2000s, the committee organized a workshop on climate change and was active in election integrity issues. Both efforts resulted in an ASA board statement. SPAAC also created several Statistical Significance pieces, which serve the same purpose as the pieces for the JSM poster competition, and oversaw the 2009 congressional visits.
Moving into the next decade, the committee was instrumental in the writing and introduction of Rep. David Loebsack’s bill, the Statistical Teaching, Aptitude, and Training Act of 2010 (STAT Act). SPAAC also monitored and supported the establishment of the Office of Financial Research and led the ASA’s response to the US Department of Veterans Affairs Secretary’s decision to ban the book How to Lie with Statistics from its training sessions.
Since the mid-2010s, the committee has been the ASA’s lead on the work of the federal Commission on Evidence-Based Policymaking and the subsequent work it set in motion. In 2017 and 2018, under Jerry Reiter’s leadership, the committee led the development of the joint ASA and American Mathematical Society statement regarding drawing of voting districts and partisan gerrymandering. Also in that period, SPAAC was active in responding to House bills requiring the EPA to only take regulatory actions based on research for which the underlying data is openly available. With the committee’s leadership, the ASA sent letters to Congress about the original Secret Science Act and Congress’s Honest Act. The committee’s work also led to an op-ed in The Hill, “HONEST Act Needs Honest Engagement of Scientific Community.”
In the last two years, the committee has been especially involved in responding to federal calls for comments on issues covering the US Environmental Protection Agency, a citizenship question on the decennial census, and policy-comment embargo times for the release of federal economic statistics. For more about the committee’s work, see the article about the scope and breadth of ASA science policy activities in this issue.
Looking ahead, the committee is seeking to expand upon its current activities and invites you to contact Larry Hedges, committee chair, at l-hedges@northwestern.edu with your suggestions and comments.
MORE ONLINE: Get involved! Contact any of the committee members with your ideas. https://bit.ly/2YBAoNY
CHARGE OF THE SCIENTIFIC AND PUBLIC AFFAIRS ADVISORY COMMITTEE
• Consider public policy issues (1) which affect the statistical community or (2) to which statisticians can contribute
• Recommend policies to the ASA Board of Directors
• Serve as a liaison between ASA and other statistical experts, professional and governmental organizations, and the media on these issues
What Has SPAAC Been Up To?
The creation of the Scientific and Public Affairs Advisory Committee can be traced to a 1987 document recommending and outlining the creation of an ASA Office of Scientific and Public Affairs (OSPA), which led to the ASA hiring an OSPA director, Marilyn Humm. She served in that position from 1988 to 1998. In the minutes of the December 1991 ASA board meeting, the board approved an advisory committee to OSPA, and its charge was approved at the December 1995 board meeting. The early activity of the advisory committee seemed to focus on public affairs and included the creation of the ASA media experts list. Today, the committee is much more active. Take a look at some of what it has accomplished in the last 20 years:
JSM Poster Competition: https://bit.ly/2YGVGtj
Statement on Climate Change: https://bit.ly/2YFLSjB
Statement on Electoral Integrity: https://bit.ly/2E6lHJe
Statistical Significance: https://bit.ly/2E7zLlV
2009 Congressional Visits: https://bit.ly/2YOW89j
STAT Act: https://bit.ly/38tLY2a
Office of Financial Research: https://bit.ly/2t4bPNO
Reaction to How to Lie with Statistics Ban: https://bit.ly/2PFuDuo
Commission on Evidence-Based Policymaking: https://bit.ly/2PCahCs and https://bit.ly/2r7UiDQ
Statement Regarding Drawing of Voting Districts and Partisan Gerrymandering: https://bit.ly/38E95Hs
Secret Science Act Letter: https://bit.ly/34diHFm
Honest Act Letter: https://bit.ly/2Q9QWsz
The Hill Op-Ed: https://bit.ly/2YGwLGg
SCIENCE POLICY
Twelve Years of ASA Science Policy: Highlighting the Scope and Breadth
Steve Pierson earned his PhD in physics from the University of Minnesota. He spent eight years in the physics department of Worcester Polytechnic Institute and later became head of government relations at the American Physical Society before joining the ASA as director of science policy.
The ASA’s science policy activity has covered a variety of topics since the creation of the ASA’s science policy staff position 12 years ago. In her September 2019 column, 2019 ASA President Karen Kafadar mentioned several recent ASA science policy and advocacy initiatives, including Count on Stats, protecting a USDA statistical agency, and ensuring the integrity of the decennial census. She also touched upon the ASA’s forensic science reform work, in which she has played a leadership role over the last decade.
The responsibilities of the position are to raise the profile of statistics and statisticians in policymaking and to advocate for the interests of statisticians. The execution of these responsibilities can be grouped into the following nonexclusive categories:
• Statistics improving governance, justice, democracy, and other aspects of society
• Scientific freedom and human rights
• Scientific integrity
• Science to inform policymaking
• Evidence-based policymaking
• Improving science and its process
• Nominations
As you read this, it should be clear that the following activities extend beyond the science policy staff, which now includes a science policy fellow. This work has been accomplished with the help of or by members, committees, sections, ad hoc groups, and task forces, sometimes without the input of the ASA science policy staff at all.
Statistics Improving Governance, Justice, Democracy, and Other Aspects of Our Society
Statistics has the potential to improve broad aspects of our society and everyday life. As part of the responsibility to raise the profile of statistics and statisticians in policymaking, we have supported the following activities that do just that:
Election Integrity
The controversies clouding the 2000 US presidential election led the ASA Scientific and Public Affairs Advisory Committee (SPAAC) and various ASA members to investigate ways statistics can help bolster election integrity. Their work led to various advances, as well as partnerships with many organizations working in the area. One such partnership brought about the ASA playing an integral role in the development of the 2008 Principles and Best Practices for Post-Election Audits. That document, which the ASA endorsed, encouraged risk-limiting audits (RLAs) to take the place of auditing a fixed percentage of ballots, no matter the margin and with no scientific justification for the percentage.
Philip Stark laid out the framework for rigorous RLAs in 2008 and went on to pilot them with various California counties. In 2010, the ASA explicitly endorsed RLAs, recommending they be routinely conducted and reported in all federal and most statewide election contests.
Though activity by the ASA on election integrity has slowed in recent years, Stark remains actively involved in advancing RLA theory and methods (the latest approaches rely heavily on sequential tests derived from martingale inequalities) in addition to achieving their wider use.
RLAs have been piloted in at least nine US states and Denmark. RLAs are required by or mandated in statute in California, Colorado, Nevada, Rhode Island, Virginia, and Washington. Further, earlier this Congress, Sen. Ron Wyden introduced a bill requiring RLAs, which the ASA endorsed.
Forensic Science
Kafadar reviewed in her September 2019 article the ASA’s forensic science work, carried out under the guidance of the ad hoc Advisory Committee on Statistics in Forensic Science, noting board statements on the importance of statistical research in strengthening forensic science and recommendations for the use of statistical statements in expressing the strength of forensic evidence. She also mentioned the National Institute of Standards and Technology (NIST) funding of the Center for Statistics and Applications in Forensic Evidence (CSAFE).
There are other indications of the central role of statisticians in forensic science reform, including Kafadar’s September 2019 congressional testimony, Constantine Gatsonis’s December 2011 testimony, the strong engagement of the NIST Organization of Scientific Area Committees for Forensic Science with statisticians, and the CHANCE special issue on forensic science.
Photo caption: The ASA has worked to reform the field of forensic science through efforts such as congressional testimony, the Advisory Committee on Statistics in Forensic Science, and a special issue of CHANCE.
MORE ONLINE: For additional information, the online version includes numerous hyperlinked resources. See https://bit.ly/2tcT0Z0.
Use of Value-Added Models for Evaluation of Teachers
In 2014, the ASA board issued a position statement to better inform the use of value-added models (VAMs) for educational assessment. With technical input and guidance from Sharon Lohr, Daniel McCaffrey, and Walter Stroup, the statement noted the strengths and limitations of VAMs and made recommendations for their use. While it’s hard to measure the impact of the statement, it has been cited extensively in discussions about the use of VAMs for the evaluation of teachers.
Statistical Perspective for Federal Calls for Comment
The ASA, through its committees, has been active in responding to federal calls for comment. For example, SPAAC has provided statistical perspective on the policy comment embargo time for federal economic statistics, the federal poverty measure, a citizenship question on the decennial census questionnaire, and the proposed US Environmental Protection Agency (EPA) transparency rule (as noted below) in the past two years. The ASA Privacy and Confidentiality Committee (P&CC) has also been instrumental in the ASA’s responses to calls for comment.
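Returning to the risk-limiting audits discussed under Election Integrity, a short sketch may help show how a sequential test can replace a fixed-percentage audit. The code is only an illustration under strong assumptions: it follows the general shape of a ballot-polling test in the style of the BRAVO method of Lindeman, Stark, and Yates, simplified to a two-candidate contest with no invalid ballots and sampling with replacement. It is not the procedure of the ASA, of the 2008 Principles document, or of any state, and the function and parameter names are hypothetical.

```python
import random


def ballot_polling_audit(ballots, reported_winner_share, risk_limit, rng, max_draws):
    """Simplified two-candidate ballot-polling audit (illustrative only).

    `ballots` is a list of "W" (reported winner) and "L" (reported loser).
    Returns (confirmed, draws). If the cap is reached without confirmation,
    a real audit would escalate, for example to a full hand count.
    """
    if not 0.5 < reported_winner_share < 1.0:
        raise ValueError("reported winner share must be between 0.5 and 1")
    threshold = 1.0 / risk_limit
    statistic = 1.0  # likelihood ratio: reported share versus an exact tie
    for draws in range(1, max_draws + 1):
        ballot = rng.choice(ballots)  # draw one ballot with replacement
        if ballot == "W":
            statistic *= reported_winner_share / 0.5
        else:
            statistic *= (1.0 - reported_winner_share) / 0.5
        # If the reported winner did not actually win, this statistic is a
        # nonnegative supermartingale, so the chance it ever reaches
        # 1 / risk_limit is at most risk_limit.
        if statistic >= threshold:
            return True, draws
    return False, max_draws
```

The point of the design is that the sample size adapts to the margin: a wide reported margin that the sampled ballots support is confirmed after few draws, while a narrow or incorrect one requires many more ballots or a full count.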