Response Rates and Nonresponse in Establishment Surveys – BLS and Census Bureau

Rita Petroni, Richard Sigman, Diane Willimack (Census Bureau)
Steve Cohen, Clyde Tucker (Bureau of Labor Statistics)

This paper has been prepared for presentation to the Federal Economic Statistics Advisory Committee (FESAC) on December 14, 2004. It represents work in progress and does not represent any agency's final positions on issues addressed. The FESAC is a Federal Advisory Committee sponsored jointly by the Bureau of Labor Statistics of the U.S. Department of Labor and by the Bureau of Economic Analysis and the Bureau of the Census of the U.S. Department of Commerce.

The authors wish to thank the subject matter specialists and program managers who provided information reported in this paper. We also express our gratitude for the many helpful comments received from agency reviewers.
I. Introduction

This paper supplements and extends information on response provided by the Interagency Group on Establishment Nonresponse (IGEN) for the Bureau of Labor Statistics (BLS) and the U.S. Census Bureau. IGEN was formed in 1998 to examine nonresponse issues in establishment surveys. The group presented papers at the 1998 Council of Professional Associations on Federal Statistics (COPAFS) Conference, the 2000 International Conference on Establishment Surveys (ICES-2), and the 2000 COPAFS Conference (Interagency Group on Establishment Nonresponse, 1998 (IGEN98); Shimizu, 2000; Ramirez et al., 2000).

The Bureau of Labor Statistics (BLS) is the main collector and provider of data for the Federal Government in the broad field of labor economics and statistics. BLS conducts a wide variety of establishment surveys to produce statistics on employment, unemployment, compensation, employee benefits, job safety, and prices for producers, consumers, and U.S. imports and exports. Data are collected from the full spectrum of establishments, including manufacturers, retailers, services, state employment agencies, and U.S. importers and exporters of goods and services.

The Census Bureau is the Federal Government's main collector and provider of data about the people and economy of the United States. It conducts over 100 different establishment surveys and censuses that measure most sectors of the nation's economy, with the exception of agriculture. These surveys collect a wealth of general statistics such as sales, shipments, inventories, value of construction, payroll, and employment. The Census Bureau also conducts about 30 establishment surveys sponsored by other government agencies that collect specialized establishment data from manufacturers, educational institutions, hospitals, prisons, and other establishments.

After providing a literature review, we consider survey response measurement for BLS and Census Bureau surveys. We discuss each agency's approach to defining and measuring response rates, and provide examples of response rates and trends. In Section IV, we examine methods that the BLS and Census Bureau have used to encourage response. We follow with a discussion of nonresponse reduction research that the two agencies have conducted before concluding with future plans.

II. Literature Review

IGEN98 provides an excellent literature review of nonresponse for both household and establishment surveys and censuses. The following summarizes the IGEN98 literature review of establishment nonresponse:

• In an international survey of statistical agencies (Christianson and Tortora, 1995), about half reported declines in establishment survey response rates for the previous 10 years. Steady or increasing response rates were attributed to increased effort and resources devoted to nonresponse follow-up, automation, improved pre-notification, reductions in the amount of data collected, and other changes to data collection procedures.
• Hidiroglou, Drew, and Gray (1993) offer a "conceptual framework for the definition of response and nonresponse that is suitable for both business and social surveys." They provide an extensive list of terms and definitions to support response and nonresponse measurement, along with descriptions of both business and social survey procedures to reduce nonresponse. The authors recognize several characteristics unique to businesses that impact nonresponse in establishment surveys.

• Tomaskovic-Devey et al. (1994), in a study of North Carolina businesses, found that several characteristics of businesses not relevant in household surveys affected survey participation. For example, some types of establishments were more likely than others to respond (e.g., manufacturing versus retail trade), larger establishments were less likely to respond, and establishments in industries with high profits were less likely to respond.

• Paxson, Dillman and Tarnai (1995), Osmint, McMahon and Martin (1994), and Edwards and Cantor (1991) also discuss establishment nonresponse in the context of the differences between establishment and household surveys.

• One body of literature on establishment survey nonresponse tests the application or adaptation of household survey methods to establishment surveys. Jobber (1986) looked at surveys that included industrial populations and found mixed results for pre-survey notifications, monetary and non-monetary incentives, degree of personalization of survey mailings, and other factors. James and Bolstein (1992) successfully used monetary incentives to increase response rates in collecting employee health insurance information among small construction subcontractors. Walker, Kirchmann, and Conant (1987) used a number of elements from Dillman's Total Design Method (Dillman, 1978) to increase response to an establishment mail survey. Paxson, Dillman and Tarnai (1995) found that mandatory reporting and telephone follow-ups in establishment surveys did produce high response rates. Each of these studies had shortcomings.

• Werking and Clayton (1995) discuss the use of automated data collection methods in the Current Employment Statistics (CES) survey, including the use of facsimile machines (FAX), electronic data interchange, touchtone data entry, and others. Many of these techniques require that respondents have (or have access to) office machines such as a FAX machine or personal computer. Data are primarily quantitative in nature, and respondents must be in the survey on repeated occasions with sufficient frequency to warrant the initial training required to provide data in these nontraditional ways.

• Changes in the sample design have been explored to help increase business survey response. Permanent random numbers and rotational designs (Ohlsson, 1995; Srinath and Carpenter, 1995) have been used to minimize the amount of overlap in establishment survey samples and the resulting respondent burden over time. Survey data imputation and the particular problems unique to business survey data are discussed by Kovar and Whitridge (1995).

IGEN98 focuses on unit nonresponse in establishment surveys, where unit nonresponse refers to a complete failure to obtain information for a sample unit. A respondent who fails to provide
sufficient or key information may also be treated as a unit nonresponse. The paper provides two basic formulas for computing unit response rates. The most basic is the unweighted response rate, which can be formulated as:

Unweighted Response Rate = (Number of responding eligible reporting units / Number of eligible units in survey) * 100     (1)

The other is a weighted response rate that takes into consideration the relative importance assigned to reporting units. This rate can be formulated as:

Weighted Response Rate = (Total weighted quantity for responding reporting units / Total estimated quantity for all eligible reporting units) * 100     (2)

The unweighted response rate is used to indicate the proportion of eligible units that cooperate in the survey, while the weighted rate is generally used to indicate the proportion of some estimated population total that is contributed by respondents. In establishment surveys, a small number of large establishments may account for a major proportion of the population total. In these cases, the weighted response rate is probably a better indicator of the quality of the estimates. For example, if 80 of 100 eligible establishments respond but the 20 nonrespondents account for half of the estimated total, the unweighted rate is 80 percent while the weighted rate is only about 50 percent.

While equations (1) and (2) are frequently the formulas used to calculate response rates, IGEN98 observed that definitions for the numerator and denominator components of the response rates varied by more than just the units sampled or targeted in the respective surveys. For example, some rates are calculated with only eligible units, while the rates for other surveys include all units at which data collection was attempted. In addition to the variations in the definitions of components for response rates, it was noted that other rates may also be calculated by using equations (1) and (2) and replacing numbers of respondents with numbers of units that were refusals, out-of-scopes, and so forth in the rate numerators (Shimizu, 2000).

IGEN98 reports that there is not a clear trend in response rates for establishment surveys and that survey type and design features seem to have more prominent roles in determining response rates than survey period. For example, use of mandatory reporting has significantly increased response rates in some Census Bureau surveys (Worden and Hamilton, 1989). For many of the 12 surveys IGEN considered, response rates improved from increased recontact efforts and the use of incentives targeted to response rate improvement.

IGEN98 also provides a comprehensive summary of methods and activities to reduce nonresponse. Most federal establishment surveys engage in some type of pre-survey public relations emphasizing the purpose and importance of their surveys, attempt to reduce respondent burden, take actions to encourage reporting, and implement procedures to handle nonresponders.

According to IGEN98, agencies attempt to reduce the biasing effects of remaining nonresponse in establishment surveys by employing post-survey weighting or non-weighting adjustment techniques. Weighting adjustment, post-stratification, and raking are weighting adjustment techniques. These techniques increase the weights of respondents so that they represent the nonrespondents. Imputation is a non-weighting adjustment technique. This non-weighting
technique derives or imputes a missing value for non-respondents from other respondent records. Some recurring surveys may use more than one technique to adjust for nonresponse.

IGEN98 also provides a summary of past research on establishment survey nonresponse conducted by the BLS, Census Bureau, National Agricultural Statistics Service, and National Center for Education Statistics. It concludes by outlining eight broad areas that IGEN identified as needing further research or clarification: the correlates of nonresponse, management of establishments in recurring samples, initiation of new respondents and retention of 'old' respondents, effectiveness of nonresponse follow-up, effect of different collection modes on nonresponse, authority vs. capability to respond, uses of persuasion, and investigation of the potential biasing effect of nonresponse on establishment surveys.

Shimizu (2000) is a follow-on to IGEN98. It includes an inventory of the response rates and definitions used for rate components by the eight Federal agencies represented by the IGEN members. Standardized terminology and notation are developed to categorize the survey case outcomes that agencies commonly used in rates across the 49 nonrandomly selected surveys included in the study. The paper also presents rates these surveys routinely calculate for other purposes such as monitoring survey progress and assessing collection operations.

Ramirez et al. (2000) reviews the similarities and differences among the rates identified in Shimizu (2000), discusses the extent to which there is standardization in the calculation and publication of such rates and the reasons why this is so, and explores the possibilities, advantages, and disadvantages of fostering greater coordination and standardization across IGEN-member agencies. The paper concludes that standardization of response rates might not be as feasible in establishment surveys as in household surveys. Many roadblocks, such as variations in establishment survey populations and survey designs and disruption of the continuity of many time series of response rate data, would need to be overcome for effective standardization guidelines to be identified and implemented in governmental establishment surveys.

III. Survey Response Measurement

A. BLS

In the 1980's, BLS developed a framework for computing similar response rates across all BLS surveys. Over the last several years, response rate definitions and formulas for each survey were revised to conform to the BLS-wide framework. Using the response rates computed under these definitions and formulas, BLS has begun analyzing response rates across similar surveys. This section will present BLS's framework and definitions, and describe the current status of the agency-wide analysis.
1. Overview of BLS Surveys

This paper reviews four major BLS establishment-based surveys that were studied qualitatively as part of the BLS response rate initiative. Table 1 provides a summary of these four surveys in terms of Office, purpose, scope, sample, and collection methods (Ferguson et al., 2003). These surveys vary greatly across these variables. For example, the CES Survey has a sample of about 350,000 establishments and collects very limited data elements from respondents monthly to produce estimates of employment by industry and State; the Producer Price Index (PPI) survey contacts 38,000 establishments monthly to collect prices for 100,000 items; and the National Compensation Survey (NCS) collects data from 42,000 establishments annually (quarterly with some establishments) on an extensive number of variables. The International Price Program (IPP) survey has two components that are sampled and tracked separately. BLS's surveys also use a varied set of collection methods, ranging from personal visit to mail/FAX, to CATI, to automated self-response methods such as touchtone data entry and web collection via the BLS Internet Data Collection Facility.

Table 1 – Summary of BLS Establishment Surveys

Office | Survey | Survey Size | Initial Collection Mode | Update Frequency | Primary Update Collection Modes | Mandatory?
OPLC | PPI | 38,000 E; 100,000 Q | PV | Monthly | Mail, FAX | No
OPLC | IPP Exports | 3,000 E; 11,500 Q | PV | Monthly/Quarterly | Mail, Phone, FAX | No
OPLC | IPP Imports | 3,400 E; 14,300 Q | PV | Monthly/Quarterly | Mail, Phone, FAX | No
OEUS | CES | 350,000 E | CATI | Monthly | TDE, CATI, Electronic, FAX | Yes, in 5 states
OCWC | NCS | 42,000 E | PV | Quarterly/Annually | PV, Mail, Phone | No

Key: OPLC – Office of Prices and Living Conditions; OEUS – Office of Employment and Unemployment Statistics; OCWC – Office of Compensation and Working Conditions; PPI – Producer Price Index; IPP – International Price Program; CES – Current Employment Statistics; NCS – National Compensation Survey; E – Establishments; Q – Quotes; PV – Personal Visit; CATI – Computer Assisted Telephone Interview; TDE – Touchtone Data Entry
2. Historical Response Rate Computations within Each Program

Like most large survey organizations, for administrative and operational reasons BLS has organized its survey operations into several offices, as shown in Table 1. Each office is thus responsible for the surveys within its domain. Given the separation of survey operations between the offices, the scope and size of the surveys, and the modes of collection, each office has historically worked independently; that is, each developed its own internal procedures for survey operations without much direct contact with counterparts in the other program offices. This philosophy also extended to the calculation of response rates.

In addition to the nature of BLS survey operations, there are a number of pragmatic reasons for the development of separate and perhaps distinct response measures for each program or survey. Many of BLS's surveys have a separate "field" component for initiation, relying on other methods for ongoing collection. As a result, for the update surveys, each program broke down operations by "stage" of processing (i.e., initiation vs. ongoing collection). This often involved establishment of separate databases for field tracking and update collection. Once these separate databases were in place, each survey could calculate response rates for each "stage"; however, it was more difficult to calculate response rates across the entire survey.

Once periodic update survey operation was ongoing, it became more efficient to track the "update" or "repricing" rate for each survey. From an operational standpoint, tracking this rate provided managers with an accurate and consistent measure of how the monthly or quarterly collection was going.

For many of the price programs there are other complexities that dictated a separate stage-of-processing response computation. These primarily relate to the difference between the number of establishments in the survey and the number of "quotes" obtained from each. For example, the PPI samples 38,000 outlets but obtains price quotes for over 100,000 commodities. Thus, the initiation rate measures what percent of the 38,000 outlets agree to provide price quotes. However, we may not receive complete price information for all desired commodities. Should the response rate look only at what percent of the 38,000 outlets provide all quotes? What if we ask for ten quotes but only receive data for five of them? Once in the monthly repricing cycle, it is more important to track collection at the quote level rather than at the outlet level, since the quotes are the actual inputs into the index calculation. (A short illustrative computation of the two rates appears below.)

In the mid 1980's, the BLS Commissioner became concerned about the effect increased telephone collection had on data quality. The Mathematical Statistics Research Center (MSRC), formerly the Division of Statistical Research and Evaluation, set out to develop tests that might be performed to investigate the effects of alternative modes of data collection. The Commissioner wanted suggestions for areas for program improvement. MSRC found that surveys did not fully document the modes used in their data collection methodology, making evaluations of mode effects on quality almost impossible to pursue. MSRC recommended that BLS embark on an effort to document and standardize the data collection process at BLS. With this standardization in place, the Bureau could develop a framework for evaluating and testing its data collection modes to determine the trade-offs with respect to cost, time, and quality.
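To make the outlet-versus-quote distinction concrete, the following is a minimal, hypothetical sketch in Python; the counts are invented for illustration and this is not BLS production code.

```python
# Minimal sketch (not BLS code): establishment-level vs. quote-level response
# rates for a price survey, using invented counts.

cooperating_outlets = [
    # (quotes requested, quotes actually priced) for each cooperating outlet
    (10, 10),
    (10, 5),   # outlet cooperates but supplies only half of the requested quotes
    (8, 0),    # outlet agreed at initiation but priced nothing this month
]
refusing_outlets = 2   # outlets that declined at initiation

attempted_outlets = len(cooperating_outlets) + refusing_outlets

# Establishment-level (initiation) rate: an outlet counts as a respondent if it
# agreed to provide any quotes at all.
establishment_rate = len(cooperating_outlets) / attempted_outlets

# Quote-level (repricing) rate: the unit of analysis is the individual quote,
# since quotes are the actual inputs to the index calculation.
quotes_requested = sum(req for req, _ in cooperating_outlets)
quotes_received = sum(got for _, got in cooperating_outlets)
quote_rate = quotes_received / quotes_requested

print(f"Establishment-level rate: {establishment_rate:.0%}")   # 3/5 = 60%
print(f"Quote-level rate: {quotes_received}/{quotes_requested} = {quote_rate:.0%}")   # 15/28 = 54%
```

In this invented example, an outlet that provides five of ten requested quotes counts fully toward the establishment-level rate but only half toward the quote-level rate.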
In March 1985, the BLS formed a Data Collection Task Force (DCTF) to develop a system for compilation of standardized information on data collection across programs. The task force (Bureau of Labor Statistics, 1985) recommended a framework of accountability codes that met the following criteria:

1. At any level, the classes should be mutually exclusive and exhaustive subsets of the next higher level;
2. The framework should be flexible enough to be applicable to any BLS establishment or housing unit survey;
3. The framework should reflect the longitudinal nature of most BLS surveys;
4. The framework should allow for the differentiation of two basic survey operations: data collection and estimation;
5. The framework should be consistent with the standard definition of a response rate;
6. The framework should allow for the computation of field collection completion rates; and
7. The framework should provide for the capability of mapping all current BLS classification schemes into it.

The task force developed the following data collection and estimation phase classification schemes:

Data Collection/Accountability Status Codes
- Eligible
  - 10 Responding
  - 20* Refusal
    - 21 Refusal – Data Absent – Unable to Cooperate
    - 22 Refusal – Unwilling to Cooperate
- Eligibility Not Determined
  - 23 Eligibility Not Determined
- Ineligible
  - 30* Ineligible
    - 31 Existent – Out of Scope
    - 32 Nonexistent
* Use these codes only if data are not available for subclasses.

Data Collection/Estimation Codes
- Eligible for Estimation
  - 10 Eligible for Estimation
  - 11* Included in Estimation
  - 12* Scheduled for Inclusion in a Previous Period
  - 13 Included in a Previous Estimation Period
  - 14 Excluded from Previous Estimation Period
  - 15 Not Scheduled for Inclusion in Previous Estimation Period
  - 19* Exclusion for Estimation
  - 20 Not Responding at Data Collection
  - 23 Eligibility Not Determined at Data Collection
  - 25 Failed to Meet Prescribed Criteria
- Ineligible for Estimation
  - 30 Ineligible for Estimation
* Use these codes only if data are not available for subclasses.

This proposed framework supports the following definition of an unweighted response rate:

Unweighted Response Rate = Number of responding units / (Number of eligible units + Number of sample units with eligibility not determined)
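As a rough illustration of how accountability codes feed this rate, the following Python sketch uses hypothetical codes, groupings, and counts; it is not BLS processing logic. Summing survey weights instead of counting units would give the weighted version described next.

```python
# Minimal sketch (not BLS code): DCTF-style unweighted response rate computed
# from hypothetical accountability codes assigned to sample units.

codes = [10, 10, 10, 10, 21, 22, 20, 23, 31, 32]   # invented example codes

RESPONDING = {10}
ELIGIBLE = {10, 20, 21, 22}            # responding units plus refusals
ELIGIBILITY_NOT_DETERMINED = {23}
INELIGIBLE = {30, 31, 32}              # out-of-scope or nonexistent units

responding = sum(c in RESPONDING for c in codes)
eligible = sum(c in ELIGIBLE for c in codes)
not_determined = sum(c in ELIGIBILITY_NOT_DETERMINED for c in codes)

# Ineligible units drop out of both the numerator and the denominator.
response_rate = responding / (eligible + not_determined)
print(f"Unweighted response rate: {response_rate:.0%}")   # 4 / (7 + 1) = 50%
```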
Each survey can also use this strategy to compute weighted response rates by summing the appropriate weight across all units in the category. For a weighted response rate, the numerator would be the sum of the weights for all responding units, while the denominator would include the sum of the weights for all eligible or eligibility-not-determined units. Depending on the survey, the weight may be the inverse of the probability of selection, or it may be the current employment or volume of trade.

Over the 1990s, BLS staff initiated efforts to ensure that all surveys were collecting response codes that could support this framework. There was no funding to move all programs over to this taxonomy. Managers supported the taxonomy by ensuring that revisions to the processing systems and collection protocols would be consistent with it as surveys modernized and updated their methodologies and computer systems.

Over the late 1990s, response rates in general were declining. Most survey managers reported that maintaining good response rates was becoming more difficult. Program managers routinely monitored only the response of active sample members, generating survey-specific stage-of-processing response rates. Even though individual programs could aggregate their response codes into a compatible taxonomy, the response codes differed from program to program because of different internal monitoring requirements. Thus, the BLS could not use the individual response codes monitored by each survey to identify systematic problems across surveys.

3. Current Practices

In early 2000, a team was formed to compile response rates based upon the DCTF methodology in order to develop a BLS-wide strategy for improvement initiatives. This team generated its first report in October 2000, including response rates from 14 household and business surveys. The team updates the report every three months to include as much data as possible from as many surveys as possible. Some surveys provide stage-of-processing rates as well as overall survey response rates, while other surveys provide only an overall rate or only one or more stage-of-processing rates.

Table 2 shows a recent summary of the unweighted response rates that appear in this quarterly report (Bureau of Labor Statistics, 2003) for the four surveys in this paper. All surveys studied in this paper except NCS have multiple closings. The CES, PPI, and IPP are monthly surveys that have short data collection periods to meet tight release dates. Due to unavailability of data in a reporting period, sampled establishments can report data in a later period for a scheduled revision. For surveys with multiple closings, we present the response rates for first closing in Table 2. The table, as described in the notes below it, contains five columns to provide space for reporting the individual stage-of-processing rates and the overall survey response rate.
Table 2 – Sample BLS Unweighted Response Rates for the Most Recent Reporting Period (Fourth Quarter 2003)

Survey | Initial Data Collection Response Rate | Update Collection Response Rate | Update Estimation Response Rate | Total Survey Response Rate
PPI | 81% E | 85% Q | |
IPP Exports | 87% E, 70% Q | 72% Q | 63% Q | 44% Q
IPP Imports | 84% E, 66% Q | 71% Q | 64% Q | 46% Q
CES | 73% E | 92% E | | 66% E
NCS | | | | 68% E

NOTES:
• E = Establishment unweighted response rate
• Q = Quote unweighted response rate
• Data are for the most recent survey panel completed on or before the March 2003 update cycle and for which response data were available.
• Blank cells indicate that the response rate is not available for this survey at this time.
• Survey: This is the name of the BLS establishment survey. See Table 1 for a description of the surveys.
• Initial Data Collection Response Rate: This is the response rate based on the initial contact with the establishment for the individual survey. For most surveys, this rate is computed based on sampled establishments, where an establishment is considered a cooperative establishment if the company agreed to provide any of the requested data – whether or not the company provided all requested data.
• Update Collection Response Rate: Where applicable, this is the response rate for the most recent update period for the survey. The update collection response rates show the ratio of establishments (or quotes) for which the survey collected any data during the update period, whether the data were usable for estimation purposes or not. Some surveys compute the update collection rate using establishment counts, while other surveys compute this rate for quotes, or the specific items for which the BLS asks the company to provide information. This rate only applies to surveys that perform an initiation process to gain initial cooperation and then gather updated data on a regular basis for several years.
• Update Estimation Response Rate: This rate includes only establishments/quotes for which the company provided enough data for the survey to include the establishment/quote in the actual survey estimates. Both the Update Collection and Update Estimation response rates are processing rates where the denominator includes only those items that were cooperative at the initiation contact with the establishment. This rate only applies to surveys that perform an initiation process to gain initial cooperation and then gather updated data on a regular basis for several years.
• Total Survey Response Rate: The last column shows the overall survey response rate, when available. For this rate, the numerator includes all sample units with usable reports while the denominator includes all in-scope sample units. This rate applies to all BLS surveys. The PPI cannot estimate the rate at this time.

4. Trends in Response Rates at BLS

Calculating response rates quarterly is valuable not only as a quality indicator for the published data, but also as a tool for comparing response over time and identifying actions needed to maintain survey quality. Response rates have been relatively stable over the last 5 years at BLS. Table 3 compares the current response rate to the average of response rates over 3, 12, and 36 months. Three trends are discernible:

• The response rates for the CES have been improving. This was a result of an intensive effort by the CATI centers to elicit response from companies and maintain continued response from sample attritors over time.
• The response rate for the NCS has improved because of the introduction of a new sample panel. Due to a major re-design of the program, sample replacement schemes had been curtailed for a time.
• The response rates for the IPP-Export and IPP-Import have fallen slightly over time.

Table 3 – BLS Unweighted Response Rates for Studied Surveys

Survey | Current DCTF Response Rate | Three Month Average Response Rates | Most Recent 12 Month Averages | Most Recent 36 Month Averages
CES | 66%* | 64.3% | 62.6% | 53.6%
NCS | 68.6% | N/A | N/A | 66.2%
IPP Exports | 44%* | 45.3% | 46.8% | 48.6%
IPP Imports | 46%* | 46.7% | 47.6% | 48.2%
PPI | N/A | N/A | N/A | N/A

We further analyzed PPI, IPP-Export, and IPP-Import response rates by looking at first closing rates. First closing rates for these three surveys declined substantially, starting in the winter of 2001, due to problems with mail delivery to Federal offices following the anthrax incidents.
However, by the final closing enough additional response is obtained to improve the final response rates based on DCTF methodology. This is graphically displayed in Figures 1 and 2 below. It should be noted that the response rates in Figure 1 are for quotes that are scheduled for monthly repricing. These rates are higher than the response rates based on the DCTF methodology shown in Figure 2.

Figure 1: Index Programs – Usable Repricing at 1st Release. [Line chart of response rates (vertical axis 0.45 to 0.90) by reference date, January 2000 through October 2003, for three series: PPI repricing at first closing, IPP-Export usable response rate at first release, and IPP-Import usable response rate at first release.]
Figure 2: International Price Program Response Rates. [Line chart of overall IPP-Export and IPP-Import response rates (vertical axis 0.40 to 1.00) by reference date, 2000 through 2003.]

B. Census Bureau

The Census Bureau consists of several directorates that conduct censuses and surveys. The Economic Programs Directorate conducts an economic census every five years and conducts current economic surveys monthly, quarterly, annually, and a few surveys less frequently than annually. In 1993 the Economic Programs Directorate adopted standard definitions for two response rates for economic programs (Waite, 1993). These standard definitions are widely used by the Economic Directorate's methodologists, survey practitioners, and system developers. In 1995 the Economic Programs Directorate decided to consolidate multiple processing systems for current surveys by developing a Standard Economic Processing System, referred to as StEPS (Ahmed and Tasky, 1999, 2000, 2001). Some response rate measures are calculated by this system. In 2004, the Economic Directorate established standard practices for coding actions by survey analysts when they use StEPS to correct data containing reporting or processing errors and when they enter data into StEPS obtained from follow-up activities. Also in 2004, the Economic Programs Directorate collaborated with the other Census Bureau directorates to prepare standard response rate definitions.

This section presents the Economic Programs Directorate's current response rate definitions, selected response rate trends, the standard practices for data corrections and entries by survey analysts, and the proposed Census Bureau standard response-rate definitions for economic surveys and censuses.

1. Overview of Census Bureau Economic Surveys

The Census Bureau conducts current economic surveys in areas of manufacturing, construction, retail trade, wholesale trade, transportation, commercial services, government services, and
foreign trade. Though a few of these surveys allow respondents to provide data over the internet, these surveys are primarily mail surveys with telephone follow-up. The Census Bureau's current economic surveys are diverse in terms of size, frequency, and type of reporting units. Reporting units for current economic surveys include business establishments, divisions of companies, entire companies, and construction projects.

This section will review the response rate trends for three monthly Census Bureau establishment surveys, summarized in Table 4. The sampling units for these surveys are aggregates of establishments. We selected these surveys because they have maintained records of their response rates over a twelve-year period.

Table 4 – Summary of Selected Census Bureau Establishment Surveys

Survey Name | Survey Size (# of sample units) | Data Collection Method | Mandatory?
Monthly Retail Sales | 12,000 | Mail with telephone follow-up | No
Monthly Retail Inventories | 4,000 | Mail with telephone follow-up | No
Monthly Wholesale | 4,000 | Mail with telephone follow-up | No

2. Response Rate Computations – Current Practices

The Census Bureau's Economic Programs Directorate adopted two standard definitions for response rates in 1993. The first response rate measures the proportion of attempted cases that provide a response, where an attempted case is a case for which data collection has been attempted. It is defined as follows:

Response rate #1 = R / M,

where
R = the number of units that provide a response, and
M = the number of units for which one attempts to obtain a response.

The definition of this first response rate includes the following comments:

• All modes of response prior to publication are included, including last minute phone calls.
• Partial responses (defined as one or more data items reported) are included in the numerator and denominator.
• Administrative records used in place of a planned attempt to collect data are excluded from the numerator and the denominator.
• Whole form imputes are excluded from the numerator and included in the denominator.
• Postmaster returns (undeliverable as addressed) are excluded from the numerator and denominator.
• Refusals are excluded from the numerator and included in the denominator.
• This measure is easy to understand and compute but may not be suitable as a measure across widely differing surveys.
• This measure is useful for survey managers in controlling and monitoring operations and permits them to know what proportion of forms have been received at any point in time.
• This measure can be misleading in an environment where the underlying population is very skewed. For example, one could measure 92 percent response but only have 50 percent of the values of sales covered.

The second response rate measures the proportion of an estimated total (not necessarily the published total) that is contributed by respondents for an individual variable. It is defined as follows:

Response rate #2 = ( Σ_{i=1}^{R} w_i t_i ) / T,

where
w_i = the design weight of the ith unit before adjustments for nonresponse,
t_i = the reported value for the ith unit of variable t for which the response rate is to be computed, and
T = the estimated (weighted) total of the variable t over the entire population represented by the sampling frame.

The definition of this second response rate includes the following comments:

• Imputed values are excluded from the numerator but included in the denominator.
• Administrative records used in place of a planned attempt to collect data are excluded from the numerator and the denominator.
• The value of this measure will change for different variables in the same survey.
• This measure will be influenced by imputation methodology because the denominator includes imputed values.
• This measure requires programs to carry indicators of which variables are reported or imputed.
• This measure cannot apply directly to estimates of rates of change and proportions.
• This measure cannot be used as a monitoring tool during data collection.
• The choice of the survey variable used in calculating this measure can affect the value of the measure. It should be determined prior to data collection which primary variables will be used for this measure.
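The contrast between the two rates can be illustrated with a short, hypothetical computation. The Python sketch below uses invented weights, values, and an assumed estimated total; it is not the Directorate's processing code.

```python
# Minimal sketch (not Census Bureau code): Response Rate #1 and Response Rate #2
# for a small hypothetical sample of attempted cases.

units = [
    # (design weight, reported sales value or None if nothing was reported)
    (5.0, 100.0),   # small establishments
    (5.0, 120.0),
    (5.0,  80.0),
    (1.0, None),    # large certainty unit that did not respond (to be imputed)
    (1.0, 900.0),   # large certainty unit that responded
]

# Response Rate #1 = R / M (responding units over attempted units)
M = len(units)
R = sum(1 for _, value in units if value is not None)
rate_1 = R / M

# Response Rate #2 = weighted reported values over the estimated total T for the
# variable. T would normally come from the estimation system (reported plus
# imputed data); here it is simply assumed.
T = 3_600.0
rate_2 = sum(w * value for w, value in units if value is not None) / T

print(f"Response Rate #1: {rate_1:.0%}")   # 4/5 = 80%
print(f"Response Rate #2: {rate_2:.0%}")   # 2,400 / 3,600 = 67%
```

In this invented example, 80 percent of attempted cases respond but they account for only about two thirds of the estimated total, the kind of gap in skewed populations that the comments above warn about.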
Response Rate Number 1 is frequently labeled "return rate." It is generally calculated only at disaggregated levels—for example, disaggregated by mode of data collection, questionnaire version, or size-based strata. Among Census Bureau surveys, there have been varied interpretations of whether returned forms that do not contain any respondent data, or do not contain respondent data for specified key items, should be included in the numerator. When Response Rate Number 1 is used to monitor the receipt of forms for initial processing, the general practice has been to include all received forms in the numerator. When Response Rate Number 1 is used for other purposes, however, the general practice has been to include only "usable" returned forms in the numerator. The definition of "usable" varies across surveys. For some surveys, a returned form containing at least one item of respondent data is usable. For other surveys, a returned form must contain respondent data for certain key items before it is considered usable.

Response Rate Number 2 excludes imputed data from its numerator. Consequently, the quantity (1 – Response Rate #2) is frequently calculated and is labeled the "imputation rate." The Economic Directorate includes imputation rates in the explanatory notes of press releases and in the "Reliability of the Estimates" section of its publications. For example, the following text appeared in an explanatory note of the August 22, 2003, press release titled "Retail E-Commerce Sales in Second Quarter 2003 were $12.5 Billion, up 27.8 Percent from Second Quarter 2002, Census Bureau Reports":

"Firms are asked each month to report e-commerce sales separately. For each month of the quarter, data for nonresponding sampling units are imputed from responding sample units falling within the same kind of business and sales size category. Approximately 14 percent of the e-commerce sales estimate for second quarter 2002 was imputed. Imputed total retail sales data accounted for approximately 19 percent of the U.S. retail sales for the second quarter 2003."
The reason this example discusses imputation rates instead of response rates is that the Census Bureau's publication guidelines encourage the discussion of sources and magnitudes of errors in published estimates. As can be seen from this example, imputation rates are easily discussed in connection with the published estimates for different items. Response rates, on the other hand, are associated with the response process and are not as easily discussed in connection with the published estimates.

StEPS calculates response measures similar to Response Rate Number 1 in its management information module. This module is primarily used to monitor the progress of data collection operations and initial data editing. The management information module calculates the following measures:

Check-in Rate = (# checked-in cases) / (# attempted cases)

Delinquent Rate = (# delinquent cases) / (# attempted cases) = 1 – Check-in Rate

Extension Rate = (# delinquent cases granted extensions to respond) / (# attempted cases)

Received Rate = (# cases with respondent data) / (# active cases)

Edit Rate = (# edited cases) / (# active cases)

A count of the number of "attempted cases" appears in the denominator of the Check-in Rate, the Delinquent Rate, and the Extension Rate. "Attempted" refers to an attempt to collect data. A given case is an attempted case in a particular survey cycle if there are one or more attempts to collect data from the given case during the survey cycle.

A count of the number of "active" cases appears in the denominator of both the Received Rate and the Edit Rate. A case is active when it is initially selected for a survey. It becomes inactive when updates to the Business Register determine it is out of business or out of scope. A case can also become inactive by becoming a "ghost unit"—that is, by its being absorbed by another reporting survey unit or by its data being included in a combined report provided by another survey unit.
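As a rough illustration of how these monitoring measures relate to one another, the following Python sketch computes them from made-up case counts for a single survey cycle; it is not StEPS code.

```python
# Minimal sketch (not StEPS code): the management-information rates above,
# computed from hypothetical case counts for one survey cycle.

attempted = 1_000            # cases with at least one data-collection attempt
checked_in = 840             # forms checked in
extensions_granted = 25      # delinquent cases granted extensions to respond
active = 980                 # cases still active (not out of scope, not ghost units)
with_respondent_data = 790   # active cases that reported at least some data
edited = 640                 # active cases that have completed initial editing

check_in_rate = checked_in / attempted
delinquent_rate = 1 - check_in_rate            # = (attempted - checked_in) / attempted
extension_rate = extensions_granted / attempted
received_rate = with_respondent_data / active
edit_rate = edited / active

for name, rate in [("Check-in", check_in_rate), ("Delinquent", delinquent_rate),
                   ("Extension", extension_rate), ("Received", received_rate),
                   ("Edit", edit_rate)]:
    print(f"{name} rate: {rate:.1%}")
```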
StEPS calculates imputation rates similar to (1 – Response Rate Number 2) in its estimates and variances module. The definition of Response Rate Number 2 excludes from the denominator "administrative records used in place of a planned attempt to collect data." The StEPS imputation rate, however, includes administrative data in the denominator when administrative data are included in the published estimate. For surveys that use weight adjustment to handle unit nonresponse, StEPS treats this as a type of imputation because the adjusted weights allow units that report to represent both reporting and non-reporting units.

For surveys that use imputation to handle unit nonresponse or item nonresponse, StEPS calculates imputation rates based on the outcomes of processing performed in the StEPS general-imputation module. The general-imputation module imputes data using estimator-type techniques (Giles and Patrick, 1986) and adjusts data items associated with additive relationships so that detail items sum to total items (Sigman and Wagner, 1997). With one exception, all reported item data changed by the imputation module are flagged as being imputed item data. This includes (1) items for which no data were reported, and the general-imputation module creates data; and (2) reported data that fail defined edits, and as a result the general-imputation module changes some of the data. The one exception is when reported or missing data are replaced by administrative data that are considered to be equivalent in quality to respondent-provided data. In this case, the changed data are treated neither as imputed data (used to calculate the imputation-rate numerator) nor as reported data (used to calculate the numerator of Response Rate Number 2) but are used to calculate the numerator of an associated administrative-data rate.

StEPS uses the following formula to calculate imputation rates:

imputation rate = [ Σ_{Imp} w_i y_i + Σ_{Adj} (w′_i − w_i) y_i ] / [ Σ_{not Adj} w_i y_i + Σ_{Adj} w′_i y_i ]

where
w_i = sampling weight for the ith unit, not adjusted for nonresponse,
w′_i = sampling weight for the ith unit, adjusted for nonresponse (if nonresponse adjustment is not used, then w′_i = w_i),
y_i = unweighted data (reported, imputed, or administrative) for the ith unit,
Imp = imputed cases,
Adj = respondents in those sub-domains that use weight adjustment to handle unit nonresponse, and
not Adj = active cases not in sub-domains using weight adjustment to handle unit nonresponse.
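To show how the pieces of this formula combine, here is a small hypothetical Python sketch; the weights, values, and case mix are invented, and the logic is a simplification of the actual StEPS processing.

```python
# Minimal sketch (not StEPS code) of the imputation rate defined above.
# Each case carries its unadjusted weight w, its nonresponse-adjusted weight
# w_adj (equal to w when no adjustment is used), its value y, and its source.

cases = [
    # (w,  w_adj, y,     source)  source: "reported", "imputed", or "adjusted"
    (5.0, 5.0,   200.0, "reported"),   # reported, no weight adjustment used
    (5.0, 5.0,   150.0, "imputed"),    # nonresponse handled by imputation
    (4.0, 6.0,   300.0, "adjusted"),   # respondent in a weight-adjusted sub-domain
    (4.0, 6.0,   250.0, "adjusted"),
]

numerator = 0.0
denominator = 0.0
for w, w_adj, y, source in cases:
    if source == "imputed":
        numerator += w * y             # imputed data
        denominator += w * y
    elif source == "adjusted":
        numerator += (w_adj - w) * y   # share carried on behalf of nonrespondents
        denominator += w_adj * y
    else:                              # reported data, not weight-adjusted
        denominator += w * y

print(f"Imputation rate: {numerator / denominator:.1%}")   # 1,850 / 5,050 ≈ 36.6%
```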
The calculation of imputation rates requires that StEPS maintain tracking information indicating if a case is active, if the case has responded, and, if the case has not responded, how nonresponse is to be handled during processing. StEPS stores a status code for each case indicating if the case is active or inactive. StEPS also stores a "coverage code" for each case that specifies a reason why StEPS handles the case the way it does during data collection and subsequent data processing. The first and second columns of Table 5 list the StEPS coverage codes for active cases, and the third column lists the coverage codes for inactive cases.

Table 5 – StEPS Coverage Codes

Active cases, data collection attempted:
10 Initial sample; 11 Birth; 12 Supplemental birth; 13 Reactivation; 14 Formerly out-of-scope; 15 Previously omitted in error; 16 Previously deleted in error; 30 Purchase; 31 New ID resulting from population split; 32 Plant reorganized; 33 New industry; 37 New ID resulting from merger; 38 Combined report; 40 Other (active & attempted)

Active cases, data collection not attempted:
41 Out of business, pending; 43 Out-of-scope, pending; 46 Chronic delinquent, refusal; 60 Other (active but not attempted)

Inactive cases:
40 Out-of-business, confirmed; 42 Out-of-scope, confirmed; 44 Duplicate; 45 Idle; 47 Small plant, under size cutoff for survey; 48 Erroneously included in sample, no weight recalculation needed; 49 Erroneously included in sample, weight recalculation needed; 69 Other (inactive)

To indicate response status, StEPS maintains a response code for each case indicating if the case has responded and, if it has not, whether it is to be imputed or if it is contained in a subpopulation for which weight adjustment will be used to handle unit nonresponse. For nonrespondents, StEPS has the capability to track the classification of a case as a "hard refusal," in which the respondent informs the Census Bureau that it will not participate, or as a "soft refusal," in which the respondent does not report over a period of time but never actually informs the Census Bureau that it will not participate.

StEPS provides survey analysts with interactive capabilities to enter, review, and change survey data. These capabilities are used to correct reporting and processing errors and to enter data obtained by following up nonrespondents. In version 4.0 of StEPS, analysts will be required to specify values for two one-character fields when entering or changing data in StEPS. The first one-character field, called the StEPS data flag, is a high-level description of the source of the
entry or correction. It is used to control various post-data-collection processing steps, including the calculation of response rates. The second one-character field, called the detailed source code, is a detailed description of the entry or correction. For data entries or corrections associated with items on the survey instrument, analysts will set the StEPS data flag to one of four values—"A", "I", "R", or "O"—according to the following four principles:

Principle 1. The "R" flag indicates questionnaire data provided by the respondent. Modes of collecting R-flagged data include conventional or electronic mail, telephone, fax, personal interviews, or electronic data transmission. Data that are improperly keyed and then corrected to values present on an image or the actual questionnaire are assigned a data flag of "R." Similarly, imputed data that are reviewed and then replaced with originally reported data are also assigned a data flag of "R." Data provided by respondents that are not questionnaire responses—such as brochures, company reports, or utility bills—are assigned a data flag of "R" only if approved by a supervisory analyst and it is clear how to link individual questionnaire items to individual non-questionnaire data values.

Principle 2. The "A" flag indicates that an analyst judged data provided by a respondent to be incorrect and consequently corrected the data, using comments provided by the respondent on the questionnaire or the analyst's knowledge, instead of following up with the respondent to obtain the data. For an analyst correction of a reported value to be assigned an "A" flag, it must be obvious that the correction reverses one or more specific reporting errors present in the value provided by the respondent.

Principle 3. The "O" flag indicates the analyst entered or replaced a data value with data from another data source that has been determined to be of equivalent quality to data provided by the respondent. The use of the "O" flag for another source requires that the source has passed a validation study, in which the data from the other source are compared to data provided by the respondent.

Principle 4. The "I" flag should be used in all situations in which Principles 1 through 3 do not apply. Analyst actions that are definitely assigned an "I" flag include, but are not limited to, the following:

• Actions that are not specific to an individual respondent and time period – i.e., techniques such as the use of industry averages, the respondent's history of period-to-period change, and model-based calculations
• Direct substitution of data from another data source that has not been validated to be of the same quality as data provided by the respondent

Data entries or corrections by analysts assigned a data flag of "R", "A", or "O" are excluded from the numerator of the StEPS imputation rate. Data assigned a data flag of "I" are included in the imputation rate numerator. Figure 3 shows a portion of the computer screen analysts will use to select the data flag and detailed source code. Possible values of the detailed source code appear on the right side of the screen. For example, administrative data that has been validated
to be of equivalent quality to data provided by the respondent is assigned a detailed source code of "W". Data that have been assigned a detailed source code of "W" are included in the numerator of the administrative data rate calculated by StEPS. (See Sigman (2004) for additional discussion of the StEPS data flag and the detailed source code.)

Figure 3. Screen capture of StEPS data flags and detailed source codes [screen image not reproduced]

3. Response Rate Computations – Proposed Practices

As part of the attempt to standardize response rates across the Census Bureau, the Economic Programs Directorate is proposing the use of these three response rates:

Response Rate = [s / (e + u)] * 100 = [s / (n – a – i)] * 100
Quantity Response Rate = [ Σ_{S} w_i t_i / T ] * 100

Total Quantity Response Rate = [ Σ_{S+A} w_i t_i / T ] * 100

where

S = the set of eligible cases with successful data collections—that is, the set of units for which an attempt was made to collect data, it is known that the unit belongs to the target population, and the unit provided sufficient data to be considered a report.

s = number of units in S.

e = number of eligible units for which data collection was attempted—that is, the number of units for which an attempt has been made to collect data and it is known that the unit belongs to the target population. (Includes set S, plus eligible units that participated but did not provide sufficient data to be considered a report.)

u = number of units of unknown eligibility—that is, the number of survey units for which there is an unsuccessful attempt to collect data and no information is available about whether or not the unit is a member of the target population. (Includes refusals.)

n = number of units in the sample or census.

A = the set of units belonging to the target population for which it was decided that data would be obtained from sources determined to be of equivalent quality as data provided by respondents.

a = number of units belonging to the target population for which it was decided to not collect survey data, for reasons other than that the unit had been a refusal in the past, but to instead obtain data from sources determined to be of equivalent quality as data provided by respondents.

i = number of ineligible units—that is, units for which data collection was attempted and it is known these units do not belong to the target population.

w_i = the sampling weight for the ith unit.

t_i = the quantity of a key variable for the ith unit.

T = the estimated (weighted) total of the variable t over the entire population represented by the frame. T is based on actual data (and administrative data for some surveys) and on imputed data or nonresponse adjustment.
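A small hypothetical computation may help fix the notation. The Python sketch below classifies a handful of invented units and computes the three proposed rates; it is illustrative only, not Census Bureau code.

```python
# Minimal sketch (not Census Bureau code) of the three proposed rates,
# computed for a handful of hypothetical units.

units = [
    # (category, weight, value of key variable t)
    ("S", 3.0, 400.0),   # eligible, successful data collection (a report)
    ("S", 3.0, 250.0),
    ("E", 3.0, None),    # eligible, attempted, but insufficient data to be a report
    ("U", 1.0, None),    # unknown eligibility (e.g., unresolved nonresponse)
    ("A", 2.0, 500.0),   # data taken from an equivalent-quality source instead
    ("I", 1.0, None),    # ineligible (not in the target population)
]

s = sum(1 for c, _, _ in units if c == "S")
e = s + sum(1 for c, _, _ in units if c == "E")   # e includes the units in S
u = sum(1 for c, _, _ in units if c == "U")

response_rate = 100 * s / (e + u)

# T would normally be the estimated total built from reported, administrative,
# and imputed or nonresponse-adjusted data; here it is simply assumed.
T = 3_500.0
quantity_rate = 100 * sum(w * t for c, w, t in units if c == "S") / T
total_quantity_rate = 100 * sum(w * t for c, w, t in units if c in ("S", "A")) / T

print(f"Response Rate:                {response_rate:.0f}%")        # 2 / (3 + 1) = 50%
print(f"Quantity Response Rate:       {quantity_rate:.0f}%")        # 1,950 / 3,500 ≈ 56%
print(f"Total Quantity Response Rate: {total_quantity_rate:.0f}%")  # 2,950 / 3,500 ≈ 84%
```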
The proposed response rate differs from the StEPS check-in rate, which using the above notation is given by

StEPS check-in rate = (e + i_c) / (e + u + i) = (e + i_c) / (n – a),

where

i_c = checked-in ineligible cases—that is, units for which data collection was attempted, after which the cases were checked in, but it is known that these units do not belong to the target population. (The check-in process also checks in cases that do not possess sufficient data to be considered a response.)

The proposed Total Quantity Response Rate is equal to 1.0 minus the StEPS Imputation Rate, and the proposed Quantity Response Rate is equal to the Total Quantity Response Rate minus the StEPS Administrative Data Rate.

4. Trends in Response Rates at the Census Bureau

Figures 4 through 7 display response rates (i.e., Response Rate #2 = 1 – Imputation Rate) for the three Census Bureau surveys summarized in Table 4.
Figure 4 displays the response rates for retail sales in the Census Bureau's Monthly Retail Trade Survey from April 1991 through November 2003.

Figure 4. Monthly Retail Sales. [Line chart of response rates (100% minus the imputation rate) from April 1991 through November 2003, plotted separately for the BSR 87, BSR 92, BSR 97, and BSR 2K samples.]
Figure 5 displays the corresponding information for the Census Bureau's Monthly Retail Inventory Survey from April 1991 through November 2003.

Figure 5. Monthly Retail Inventories. [Line chart of response rates (100% minus the imputation rate) from April 1991 through November 2003, plotted separately for the BSR 87, BSR 92, BSR 97, and BSR 2K samples.]
Figures 6 and 7 display response rates for sales and inventories, respectively, for the Census Bureau's Monthly Wholesale Survey from April 1991 through November 2003.

Figure 6. Monthly Wholesale Sales. [Line chart of response rates (100% minus the imputation rate) from April 1991 through November 2003, plotted separately for the BSR 87, BSR 92, BSR 97, and BSR 2K samples.]
Figure 7. Monthly Wholesale Inventories. [Line chart of response rates (100% minus the imputation rate) from April 1991 through November 2003, plotted separately for the BSR 87, BSR 92, BSR 97, and BSR 2K samples.]
These three surveys are redesigned approximately every five years. The samples in use between April 1991 and November 2003 were introduced at the beginning of the years 1987, 1992, 1997, and 2000. In the figures, the different samples are labeled "BSR" (for "Business Sample Redesign") followed by the year that the new sample was introduced. The new and old samples overlap for three months. Response rates tend to increase for a short period of time after a new sample is introduced, but then they tend to decrease.

The BSR 97 eliminated rotating panels and implemented a design with a single fixed panel. Under the rotating-panel design, respondents were asked to respond every third month with two months of data. Under the BSR 97 fixed-panel design, respondents were asked to respond every month. Cantwell and Caldwell (1998) describe the survey research supporting the BSR 97.

The BSR 2K produced large increases in response rates. A possible reason for this is that, in order to decrease respondent burden, small and medium size firms that were in the BSR 97 sample were not selected for the BSR 2K sample. This procedure had not been used in earlier sample revisions. Possible additional reasons for the increases in response rates with the introduction of the BSR 2K sample are that, for the first time, the mandatory annual survey was mailed prior to the voluntary monthly surveys, and that extra resources were devoted to training clerical staff to increase response and to monitoring response progress. Cantwell and Black (1998) and Kinyon et al. (2000) provide additional details about the BSR 2K.

C. BLS vs. Census Bureau Differences in Response Rates

In addition to design and content differences in surveys conducted by the two agencies, differences in response rates may be due to differences in authority and data collection mode. Many Census Bureau surveys are mandatory and as a result obtain high response rates. Respondents may also think of the Census Bureau's nonmandatory surveys as mandatory, resulting in higher response rates. On the other hand, except for the annual refiling survey for a few states and one national survey, the BLS surveys are voluntary.

To compensate for the voluntary nature of the BLS surveys, BLS uses interviewers in the initiation process, if not also for routine data collection. BLS turns to self-administered modes only after sample initiation and/or indoctrination, while the Census Bureau relies on self-administration alone for nearly all of its survey or census programs, using personal intervention (usually by telephone) only for nonresponse reminders or follow-up.

IV. Methods to Encourage Response

Many methods used by the BLS and Census Bureau to reduce nonresponse on their establishment surveys run the gamut of traditional survey nonresponse reduction strategies, while some methods reflect characteristics unique to establishment surveys.

Both BLS and the Census Bureau conduct pre-survey notification activities, providing advance notification to respondents of upcoming survey contacts. BLS Regional Offices