Methodology and new data sources including big data - Census Methodology in Estonia - unece
Methodology and new data sources including big data Census Methodology in Estonia By Diana Beltadze and Ene-Margit Tiit from Statistics Estonia
ROAD-MAP FOR REGISTER-BASED CENSUS IN ESTONIA
Preparatory work 2009–2020; census 2021–2022

I STEP 2009–2015 (TEST IN 2014), activities:
1. Redesigning administrative data for census purposes
2. Data acquisition test from registers (contracts, description of the data set, checks on data quality and the acquisition procedure)
3. Formation of census characteristics, programming of the necessary rules
4. Registers' data quality assessment

II STEP 2016–2017 (I PILOT IN 2016), activities:
1. Development of register-based census methodology
2. Setting up structures and procedures according to the needs of harmonized population statistics
3. Development of a software system for the implementation of big data and its estimation methods
4. Data quality assessment after the 1st pilot

III STEP 2018–2020 (II PILOT IN 2019), activities:
1. Improvement of register-based census methodology
2. Specification, design and development of software and standards to support the statistical production
3. Improvement of data quality by register holders
4. Data assessment after 2nd pilot in 2019

20.09.2018
General prerequisites
Availability of unique identifiers:
- Unique address codes, including geo-coordinates
- Business ID number of enterprise
- Personal ID numbers
National legislation:
- Access to administrative data
- Right to link microdata
- Free-of-charge delivery
Action plan for register-based census rehearsal in 2019
The main preparatory work for Pilot census II involves the following tasks:
A. Data acquisition from registers (contracts, description of the data set, checks on data quality and the acquisition procedure);
B. Formation of census characteristics, programming of the necessary rules;
C. Testing the system as a whole.
Data extraction from 24 registers
- New universal X-Road service and X-Gate service
- Automatic data transactions for 17 registers
Results from 2016 pilot census
The first pilot census of REGREL was successfully completed. It showed that a register-based population and housing census is feasible and that the preparations for the census have been purposeful. In total, 38 census characteristics were formed. The following had the best quality ratings: sex, age, legal marital status, country of birth, country of citizenship, total population, ethnicity, native language, location of dwelling and living quarters by type of building.
Problematic formation of family structure due to wrong addresses
- Increasing number of lone parents
- Decreasing number of families who are legally married couples or cohabiting couples with children
Comparing PHC2011 with the first pilot register-based census, the differences were the following:
- The number of lone parents increased by 67% (from 86,000 to 143,000)
- The number of registered or cohabiting partners decreased by 26% (from 548,000 to 407,000)
Index-based methodology for using registers
How is it possible to correct registers using only register data? The answer comes from an old tradition in statistics: repeated measurements give more precise results. In a similar way, combining a large number of registers makes it possible to improve the quality of administrative data.
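The statistical intuition behind repeated measurements can be written in one line. This is a textbook identity, not a formula from the presentation, and it assumes that the n measurements x_i are independent, unbiased and share the same variance \sigma^2:

```latex
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{\sigma^2}{n}
```

Register data rarely satisfy these assumptions exactly (registers can share errors), so the identity is motivation for combining sources rather than a guarantee of the achievable precision.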
Index-based methodology for register-based census
• Residency index: 2015
• Partnership index: 2017
• Placement index: 2018
Defining signs
Each register containing information about people living in the country yields signs that are useful for making decisions about persons.
Why are these indexes necessary?
People do not register their actual home address. WHY?
- Child's place in kindergarten and school;
- Free city transport;
- Taxation rules for land;
- Renter's unwillingness to pay tax for rent;
- Some local bonuses for pensioners or groups of population;
- Laziness, negligence, etc.
The Population Register (PR) is over-covered: people who have left Estonia do not register their leaving (or their coming back) in the PR.
Administrative signs of life
1) Estonian Population Register (marriages, divorces, changes of place of residence)
2) Estonian Education Information System (students, teachers)
3) Social Services and Benefits Registry
4) Health Insurance Information Database
5) National Defence Obligation Register
6) Estonian National Pension Insurance Register
7) Estonian Unemployment Information System
8) Register of Residence and Work Permits
9) E-file system (crime documents, court documents, etc.)
10) Estonian Traffic Register (changes of driver's licenses, changes of vehicles)
11) Register of Employment
12) Register of Identity Documents
13) Estonian Medical Prescription Center
14) The State Human Resources Database
In 2017, we had 33 signs of life in total.
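One way to picture how signs of life from several registers could be combined into a single residency score is a weighted share of observed signs. The sketch below is only an illustration: the sign names, the weights and the threshold are hypothetical and are not the values or the formula used by Statistics Estonia.

```python
from typing import Dict

# Hypothetical signs and weights, loosely named after registers in the list
# above. These are NOT the weights used by Statistics Estonia.
SIGN_WEIGHTS: Dict[str, int] = {
    "health_insurance": 3,
    "employment": 3,
    "education": 2,
    "prescription": 2,
}
RESIDENT_THRESHOLD = 0.5  # illustrative cut-off, an assumption


def residency_score(signs: Dict[str, bool]) -> float:
    """Weighted share (0..1) of the signs of life observed for one person."""
    total = sum(SIGN_WEIGHTS.values())
    observed = sum(w for name, w in SIGN_WEIGHTS.items() if signs.get(name, False))
    return observed / total


def is_probable_resident(signs: Dict[str, bool]) -> bool:
    """Classify a person as a probable resident if the score reaches the cut-off."""
    return residency_score(signs) >= RESIDENT_THRESHOLD
```

A person with signs only in the health insurance and employment registers would score 0.6 under these made-up weights and be treated as a probable resident; a person with no signs at all would score 0.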
Who can be a partner to a lone parent?
The conditions are similar to those required of partners in the family algorithm:
- Adult
- Of opposite sex
- Not a close relative
- Age difference less than 18 years
- Not a partner in another existing household
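The eligibility conditions listed on this slide translate naturally into a filter. The following is a minimal sketch, not the production algorithm: the `Person` record, the `close_relatives` set of person IDs and the adult age of 18 are assumptions introduced for illustration (the slide says "adult" without giving an age).

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Person:
    person_id: str
    age: int
    sex: str                          # "F" or "M"
    in_other_partnership: bool = False


ADULT_AGE = 18     # assumption: the slide does not define "adult"
MAX_AGE_GAP = 18   # from the slide: age difference less than 18 years


def can_be_partner(parent: Person, candidate: Person, close_relatives: Set[str]) -> bool:
    """Check the five slide conditions for a candidate partner of a lone parent.

    close_relatives holds the IDs of the parent's close relatives
    (a hypothetical input, e.g. derived from population register links).
    """
    return (
        candidate.person_id != parent.person_id
        and candidate.age >= ADULT_AGE
        and candidate.sex != parent.sex
        and candidate.person_id not in close_relatives
        and abs(candidate.age - parent.age) < MAX_AGE_GAP
        and not candidate.in_other_partnership
    )
```

For example, for a 34-year-old mother, a 40-year-old unrelated man who is not in another partnership passes the filter, while a 60-year-old man fails on the age difference.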
Signs of partnership
- Legally married or registered partnership
- Legally divorced
- Common child
- Common ownership
- Common loan
- Common car
- Jointly filed tax return
- Shared parental leave / parental compensation
- No demand for child support
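To illustrate how such signs might be used to choose between candidate partners, the sketch below counts signs per candidate and picks the highest score. Counting every sign as +1, treating a legal divorce as -1 and requiring a minimum score of 2 are illustrative assumptions, not the weighting published for the partnership index; the sign identifiers are hypothetical names paraphrasing the slide.

```python
from typing import Dict, Optional, Set

# Hypothetical identifiers paraphrasing the slide. Equal weights and the
# negative treatment of divorce are assumptions made for this sketch.
POSITIVE_SIGNS = {
    "married_or_registered", "common_child", "common_ownership", "common_loan",
    "common_car", "joint_tax_return", "shared_parental_leave",
    "no_child_support_claim",
}
NEGATIVE_SIGNS = {"legally_divorced"}


def partnership_score(signs: Set[str]) -> int:
    """Number of positive signs minus number of negative signs for one pair."""
    return len(signs & POSITIVE_SIGNS) - len(signs & NEGATIVE_SIGNS)


def best_candidate(candidates: Dict[str, Set[str]], min_score: int = 2) -> Optional[str]:
    """Return the candidate ID with the highest score, or None if no candidate
    reaches min_score (an illustrative threshold)."""
    if not candidates:
        return None
    best_id = max(candidates, key=lambda cid: partnership_score(candidates[cid]))
    return best_id if partnership_score(candidates[best_id]) >= min_score else None
```

With one candidate sharing a child and a loan, and another sharing a child but legally divorced from the person, the first scores 2 and the second 0, so the first is selected.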
Placement index
- Registered address for father in county X
- Registered address for mother and child in county Y
- Family ownership in county C
Signs of placement
- The dwelling must be inhabited all year round (electricity consumption information)
- Enough living space (1 room per adult and 0.5 per child)
- Dwelling with amenities (water and kitchen)
- Occupied by somebody else
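The living-space rule on this slide (1 room per adult and 0.5 per child) is simple arithmetic, and the other signs can be treated as yes/no checks. The sketch below is a hypothetical illustration: the `Dwelling` record is invented for the example, and treating "occupied by somebody else" as disqualifying is an assumption, since the slide does not say how that sign is used.

```python
from dataclasses import dataclass


@dataclass
class Dwelling:
    rooms: int
    inhabited_all_year: bool   # e.g. derived from electricity consumption
    has_water: bool
    has_kitchen: bool
    occupied_by_others: bool


def required_rooms(adults: int, children: int) -> float:
    """Rooms needed under the slide's rule: 1 per adult, 0.5 per child."""
    return 1.0 * adults + 0.5 * children


def suitable_for(dwelling: Dwelling, adults: int, children: int) -> bool:
    """Check whether a family could plausibly be placed in the dwelling."""
    return (
        dwelling.inhabited_all_year
        and dwelling.has_water
        and dwelling.has_kitchen
        and not dwelling.occupied_by_others   # assumption: treated as disqualifying
        and dwelling.rooms >= required_rooms(adults, children)
    )
```

Under this rule a family of two adults and two children needs at least three rooms.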
Big data as an opportunity
All Statistics Estonia can do is adopt additional, alternative data sources that would help to improve the quality of census results or to validate the results obtained.
Big data and census
For census purposes, big data are not crucial, because the data are not accurate enough: in a census, information is collected on each resident. In 2018, this is more of a research project on a potential additional data source. There is no usable methodology for a census based on big data. We considered big data as a possible data source for determining partnership, but this is not crucial, and such data could be used only after serious work on evaluating data quality, which has not been done so far.
Why are big data needed in the census project?
Many countries have accepted the use of big data in census preparation, not for enumeration but for correcting data, finding errors and evaluation. The future direction is to use big data for topics "not covered" by the census.
Objectives of pilot survey I
- Test opportunities to specify the actual place of residence by using mobile positioning data
- Study problems related to linking mobile phone number to user
Data analysis steps
1. Linking the participant's and close relatives' places of residence and real estate in the population register and land register
2. Comparing the addresses collected from databases with the actual place of residence indicated during the survey
3. Creating models using anchor points from mobile positioning that would:
- decide whether the participant's residence in the population register is the actual place of residence;
- find the likeliest place of residence from among other addresses.
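Step 3 can be pictured with a very simple distance rule: take a home anchor point estimated from mobile positioning and pick the candidate address closest to it. This is only a sketch of the idea, not the model used in the pilot; the function names, the 5 km cut-off and the single-anchor simplification are assumptions.

```python
import math
from typing import Dict, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, using a mean Earth radius of 6371 km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def likeliest_residence(anchor: Coord, candidates: Dict[str, Coord],
                        max_km: float = 5.0) -> Optional[str]:
    """Return the label of the candidate address nearest to the anchor point,
    or None if even the nearest one is farther than max_km (hypothetical cut-off)."""
    if not candidates:
        return None
    name = min(candidates, key=lambda k: haversine_km(anchor, candidates[k]))
    return name if haversine_km(anchor, candidates[name]) <= max_km else None
```

If the function returns the label of the address registered in the population register, the registered residence is consistent with the anchor point; if it returns another label (for example real estate owned by a relative), that address is the likelier actual residence under this rule.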
Objectives of pilot survey II
- Using electricity consumption data to specify dwelling occupancy
- After the implementation of the partnership index, there will be a need to rearrange individuals' places of residence in the statistical register; electricity consumption data should be useful here as well.
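A minimal sketch of how electricity consumption could indicate year-round occupancy is to require every month's consumption to exceed a minimum. The threshold of 30 kWh per month is a hypothetical value chosen for illustration, and the function is not taken from the presentation.

```python
from typing import Sequence

MIN_KWH_PER_MONTH = 30.0  # hypothetical threshold, not from the source


def occupied_all_year(monthly_kwh: Sequence[float],
                      min_kwh: float = MIN_KWH_PER_MONTH) -> bool:
    """Treat a dwelling as inhabited all year round if consumption in each of
    the 12 months reaches min_kwh."""
    if len(monthly_kwh) != 12:
        raise ValueError("expected 12 monthly readings")
    return all(kwh >= min_kwh for kwh in monthly_kwh)
```

A dwelling with near-zero consumption in the winter months, such as a summer house, would fail this check even if its annual total is substantial.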
Next steps in using electricity consumption data
- Elering data for 2016, 2017 and 2018.
- Improving algorithms and methods.
- Using electricity consumption data to create the total population of dwellings and in the partnership index.
Current stage of development in the area of census
The Bayes method, data mining, etc. are used in connection with big data; however, there are essentially no analysis methods specific to big data. The main focus of big data analysis is currently on organising the data and preparing them for analysis with classical methods.
Conclusion
The methodology of census statistics develops alongside the data, i.e. new possibilities for processing data will emerge. New data categories and data formats require improvements in methodology and new methodological approaches. Data analysis methodology is significantly affected by the available computing capacity and by the opportunity to apply increasingly complex and resource-demanding calculations.
References
Levenko, V., Tiit, E.-M., Visk, H. (2018). Partnership index. Quarterly Bulletin of Statistics Estonia 1/18, pp. 29–42. https://www.stat.ee/publication-2018_quarterly-bulletin-of-statistics-estonia-1-18
Tiit, E.-M., Maasing, E. (2016). Residency index and its applications in censuses and population statistics. Quarterly Bulletin of Statistics Estonia 3/16, pp. 53–60.
Tiit, E.-M., Vähi, M. (2017). Indexes in demographic statistics: a methodology using nonstandard information for solving critical problems. Papers on Anthropology XXVI/1, pp. 72–87.