Towards High Quality Administrative Data - A Case Study: New Zealand Police
Gavin M. Knight
New Zealand Police

This report was commissioned by Official Statistics Research, through Statistics New Zealand. The opinions, findings, recommendations and conclusions expressed in this report are those of the author(s), do not necessarily represent Statistics New Zealand, and should not be reported as those of Statistics New Zealand. The department takes no responsibility for any omissions or errors in the information contained here.

Citation: Knight, G. (2008). Towards high quality administrative data – A case study: New Zealand Police, The Official Statistics System, Wellington, Official Statistics Research Series, Vol 3. ISSN 1177-5017. ISBN 978-0-478-31514-1. [Online], available: www.statisphere.govt.nz/official-statistics-research/series/vol-3
Abstract

Much has been written about principles and standards for designing surveys to ensure that good quality statistical information results. Less has been written about standards for administrative data. While it is thought that many of the same principles may apply, the terminology is often different and contextual differences exist that may require changes in the form, if not the substance, of design-standards. For example, a survey questionnaire is usually designed to be completed by a sampled respondent just once, whereas a form used to capture data for an operational IT system may be filled out many times a day by the same person in order to record information required by that person to perform their job. Efficiency and relevance may therefore have different implications for the design of such forms.

This paper documents a project, undertaken as a case study on New Zealand Police, that sought to identify principles to assist with designing for good quality administrative data. Recommendations are made, based on these principles.

Keywords: Administrative data, quality, form, New Zealand Police
Contents

The main body of this paper describes the project, its methodology, and a summary of results and conclusions.

Background
Methodology
Phase I: Business Processes
    Description of Phase I
    Results of Phase I
Phase II: Design-Principles
    Description of Phase II
    Results of Phase II
Recommendations
    Process and accountabilities
    Design-rules for guardians
    Ensuring compliance
Next Steps
Conclusions
Appendix 1: Summary of focus-group workshops
Appendix 2: Design-rule recommendations
Appendix 3: Integrationworks' report at conclusion of Phase I
Background

This project was undertaken between November 2006 and August 2007, as part of the Official Statistics Research programme administered by Statistics New Zealand. Principal partners in this project were New Zealand Police (NZ Police), which was responsible for leading the research and was the subject of the case study, and Statistics New Zealand (Stats NZ), which not only contributed significant funding to the project, but also provided a 'project ownership' role to which Police were accountable for reporting on and implementing the project as planned. NZ Police, Stats NZ and other organisations identified later in this paper contributed to the research itself.

The project aimed to produce a standard, consisting of principles and rules, as appropriate, that will be applied by NZ Police to ensure capture of quality statistical information in administrative systems, and which is sufficiently generic that it could potentially be applied by other agencies.

Note: This paper uses the terms 'standard', 'principles', and 'rules' as interrelated concepts, often preceded by the word 'design'. When the word 'standard' is used as a noun in this paper, it should be interpreted as a combination of 'rules' and 'principles'. It may be thought of as a concept; not necessarily a specific tangible document. Rules, on the other hand, must be explicitly documented.

Methodology

The project took the form of a case study on NZ Police, with two phases. The first phase aimed to understand the current business processes where decisions are made about the design of forms and IT applications and then, from this understanding, identify where in these processes design-principles should be applied. The second phase aimed to determine what these design-principles should be.

An experienced project team was formed, whose makeup varied throughout the project, adapting to the requirements of the current stage of work. However, for continuity, two members of the team were involved from start to finish: Gavin Knight from Police National Headquarters (PNHQ) and Simon Thomson from Stats NZ's Collection and Classification Standards unit.

The lead for Phase I was contracted out to Integrationworks - a firm specialising in data, system and application integration. Integrationworks reported to the project team, which acted as a steering group. This phase involved interviewing Police staff involved in changing forms, IT applications and business processes. It also reviewed Police documentation about the business processes relevant to these functions.

Phase II was led by Police, who facilitated a number of workshops, as a form of focus group, involving practitioners from various government agencies who work with administrative data. This group, informed by existing literature and the results of Phase I, discussed practical implications and issues. It considered options for addressing these through an administrative data design-standard (consisting of principles and rules), and formed, by consensus, a view of what this standard should be.
Phase I: Business Processes

Description of Phase I

Phase I commenced in November 2006 and concluded in March 2007. Initially in Phase I, the project team consisted of:

• Gavin Knight, National Statistics Manager, PNHQ, NZ Police
• Fiona Morris, Performance Officer, PNHQ, NZ Police
• Simon Thomson, Statistical Analyst, Collection and Classification Standards, Stats NZ
• Bridget Murphy, Justice Subject Matter Project Manager, Social Conditions, Stats NZ
• Barb Lash, Statistical Analyst, Social Conditions, Statistics New Zealand

PNHQ project team members, assisted by members of Police's Information Technology Communications Service Centre (ICTSC), selected Integrationworks from three prospective IT consulting firms to take the lead in Phase I.

Nick Borrell from Integrationworks interviewed eighteen Police Subject Matter Experts (SMEs), identified by PNHQ members of the project team. These SMEs included a mix of sworn and non-sworn police staff and represented a variety of roles within Police, including:

• Systems analysts
• Business analysts
• Project managers
• A file centre manager
• Various IT managers
• Area commanders
• An Area tactical response manager
• Intelligence section supervisors

Most SMEs interviewed were middle management in seniority, ranging from Sergeant to Inspector in rank or rank-equivalent (for non-sworn staff). In selecting SMEs to interview, Police took into account who had been involved in requesting or managing changes to forms, IT applications and business processes, either regularly or in recent projects or initiatives.

In addition to interviewing SMEs, Nick Borrell reviewed existing Police documentation relating to the business processes involved in making changes to forms and IT applications.

The project team acted as a steering group for Phase I, providing direction and feedback to Nick Borrell as the project progressed. It also assisted with answering questions and removing roadblocks, such as facilitating access to SMEs.
By the end of Phase I, the project team makeup had altered slightly, as Stats NZ sought to provide the appropriate expertise to give feedback to Nick Borrell and review Nick's draft report. In particular, Stats NZ replaced Bridget Murphy with Liping Jiang, Subject Matter Project Manager, Collection and Classification.

Results of Phase I

Nick Borrell submitted his final report on 22 March 2007. This report described the process undertaken, documented findings, and made a number of recommendations, primarily relating to the business processes by which forms and IT applications are designed. Nick's report is attached as Appendix 3. Key findings included:

• No framework exists to standardise and manage data capture,
• There are no resources (e.g. guideline manuals) available to staff for addressing data capture,
• The existing process to manage form-changes does not consider data quality or standards,
• The process to manage forms can be circumvented by staff - leading to unauthorised changes,
• Staff do not appear to know how to initiate changes to forms or policing procedures, and
• ICTSC is perceived as the de-facto owner of all data quality issues, yet resolving issues of statistical information quality is not a core function of ICTSC.

The report also made a number of recommendations aimed at addressing the key findings in a way that minimises the barriers to implementing change, by avoiding significant process reengineering. Instead, the report takes into account existing business processes and functional groups. It recommends the minimum modification necessary to existing processes and work-group functions to effect the required improvements.

It is acknowledged that this tactical approach, applied to other organisations, may result in different business processes. However, there is a tension to be balanced between achieving an organisation's buy-in to making change, and creating a business process that is common to all organisations. The latter was viewed as unrealistic and not necessarily desirable anyway. While principles for data quality may be common, it may be appropriate for different types of organisations to have different business processes. Phase I therefore made recommendations about distinct features of an effective process, rather than simply recommending a specific process. Such features may have greater relevance to other organisations than the Police-specific processes that are recommended.

The key recommendations in Integrationworks' report are:
• Create a data quality framework which supports guidelines and principles for data quality,
• Establish form-change guardianship,
• Appoint data quality guardianship to IT applications,
• Develop design-standards for data to be captured,
• Bring ICTSC processes in line with design-standards, and
• Develop policy to support design-standards.

These are detailed more fully in Appendix 3. However, in short, they involve standardisation of processes, creation and enforcement of design-standards, and allocation of responsibility for application of standards when changes to forms and IT applications are proposed by the business. Phase II considered what these design-standards should be.

Phase II: Design-Principles

Description of Phase II

Phase II commenced in April 2007 and concluded in August 2007. The project team for Phase II consisted of:

• Gavin Knight, National Statistics Manager, PNHQ, New Zealand Police
• Chris Worsley, Statistics Business Analyst, PNHQ, New Zealand Police
• Simon Thomson, Statistical Analyst, Collection and Classification Standards, Statistics New Zealand
• Matt Flanagan, Statistical Analyst, Collection and Classification Standards, Statistics New Zealand
• Barb Lash, Statistical Analyst, Social Conditions, Statistics New Zealand
• Robyn Smits, Manager, Data Management Unit, Ministry of Education
• Dr. Karolyn Kerr, Manager Information and Analysis, Central Region Technical Advisory Services (Health)
• Jason Gleason, Senior Data Analyst, Justice Sector Information Strategy, Ministry of Justice.

Additionally, Ian Smith, Police's National Applications Manager, and Senior Sergeant Bernie Geraghty, Police's National Coordinator of Business Analysts, attended one meeting. Senior Sergeant Geraghty subsequently continued to provide feedback on notes from workshop meetings.

Informed by the Phase I report, which identified how a design-standard would be used, project team members from Statistics New Zealand collated documents containing principles and standards which it was thought might usefully inform the development of principles and design-rules for administrative data. These documents included:
• "Quality Protocols" of the (New Zealand) Official Statistics System, produced by Statistics New Zealand
• "A Guide to Good Survey Design" (July 1995), produced by Statistics New Zealand, ISBN 0-477-06492-2
• "Best Practice Guidelines for Classifications", used by Statistics New Zealand's 'Classifications and Standards' unit
• "Official Statistics System Administrative Data Guidelines"
• "Draft principles for designing forms, processes and IT applications to ensure desired statistics have acceptable quality" (August 2006), a desk-file used by the Statistics Unit at PNHQ

It was apparent to the project team that the principles in most of the above documents had been developed in the context of surveys. In a couple of instances there had been an attempt to adapt principles developed for surveys to administrative data. However, gaps remained.

Having collated and reviewed the above documents, the project team explicitly identified the principles they contain (many of which appeared in more than one document) and discussed them in terms of their relevance to administrative data. Project team members were asked to consider both these principles and gaps in the principles, based on their experience in working with administrative data.

Discussion occurred in six workshops, held over a five-month period. Notes were taken at the workshops, particularly concerning conclusions and the associated rationale. These notes were reviewed by workshop participants between meetings.

There were no a priori assumptions of validity or intrinsic merit of any suggestions expressed by team members in the workshops. Team members were asked to consider all information presented, taking into account both its consistency with their own experiences and the soundness of the rationale behind ideas. The result, which is documented in Appendix 1, should therefore not be treated as empirical, but as expert opinions that have survived peer review in a focus-group context.

The project sponsor (Police) acknowledges the willingness of project team members to participate in such an exercise, where opinions were challenged in an effort to achieve a robust result. In general, project team members felt that the result was superior to what could have been produced by any individual, with ideas from one team member prompting ideas in others.

Results of Phase II

Notes from the discussions are attached as Appendix 1. A summary of key conclusions is as follows:

Many of the principles that are applicable to the design of survey data are applicable to administrative data as well. However, sometimes a slight change in terminology is needed, to reflect the different context.
Ensuring good administrative data requires some additional components which are either not relevant to survey data or manifest differently from it. These include preventing 'work-arounds' by staff wanting to report statistical information without complying with rules, and the need to relate the level of detail in a classification to the level of detail required by the underlying operational process generating the data.

The Phase I report (Appendix 3) proposed specific processes for developing forms and IT applications. In Phase II, it was identified that the process of developing IT applications typically produces increasing design detail as the analysis and design phases of IT projects progress. Not all of the factors necessary to determine alignment with the principles and design-rules may be evident at the outset. The implication is that the checking of a proposed design may be more iterative than indicated in the process diagram suggested by the Phase I report.

As identified in Phase I, to effect quality statistical information from administrative data, staff require resources (in the form of processes, practices and design guidelines) relevant to their roles. Generic quality principles are not as useful as principles relevant to the task at hand. The implication, recognised in Phase II, is that we must identify the tasks and develop resources for each task. Such resources include:

• A template and guideline for developing a proposal for a new form or a modification to a form or IT application (proposer)
• A manual for checking alignment of a proposal with the design-rules (guardian)
• Documentation of the business process(es) by which new and modified forms and IT applications are made
• A data dictionary for the organisation
• Policy

Recommendations

Process and accountabilities

1. Create a documented standard process for processing proposed forms or modifications to forms and IT applications.

2. Create policy that assists and ensures compliance with this process.

3. Appoint a person or group in the organisation as 'guardian' in the process for creating or modifying forms and provide this guardian with resources such as training and/or guidelines that incorporate design-principles and rules. The guardian's role is to consider proposed new forms or modifications to forms against these design-principles and rules, provide feedback and suggested changes to the proposer, and to ensure proposals do not proceed until and unless they comply with the design-rules.

4. Appoint a person or group in the organisation as 'guardian' in the process for creating or modifying IT applications and provide this guardian with resources such as training and/or guidelines that incorporate design-principles and rules. The guardian's role is to consider proposed IT application designs against the
design-principles and rules, provide feedback and suggested changes to the proposer, and to ensure proposals do not proceed until and unless they comply with the design-rules.

• There may be a separate IT application guardian for each IT application, or one group may take this responsibility for many or all IT applications. The guardian may or may not be the same as the guardian for forms.
• For NZ Police, the guardian for forms should be the Applications Support group in ICTSC, as this would require the minimum change from existing accountabilities. The guardian or guardians for IT applications should be determined by the Manager ICT. However, to avoid overlapping accountabilities, there should be no more than one guardian for any given IT application.

5. Appoint a person or group in the organisation to have responsibility for developing, maintaining and communicating 'design-rules for administrative data' that ensure quality statistical information.

• For NZ Police this should be the National Statistics Manager in PNHQ, as this role best aligns with existing competencies.

6. Have a standard 'change-request' form that accompanies proposals for new forms or modifications to forms or IT applications. The change-request form is distinct from any business case justifying and seeking approval for the change. Rather, its purposes are to (a) prompt the proposer to identify and consult with stakeholders, and (b) capture the information required by the guardian to assess the proposal against the design-rules and provide feedback or suggest changes in order to comply. (An illustrative sketch of such a change-request template is given after the Next Steps section.)

7. Create a data dictionary for the organisation, as a central register of variables, containing their labels, definitions, formats, scales, ranges, and instructions to be given to people capturing data.

• The data dictionary should be made as accessible as possible online to all staff in the organisation (E.g. Police).

Design-rules for guardians

8. A manual should be created to assist guardians in checking proposals against the design-rules.

• The manual should incorporate the recommendations in Appendix 2.

Ensuring compliance

9. Policy should be created to assist and ensure compliance with these recommendations.

10. IT designers should endeavour to prevent statistical reporting of data that has been excluded from the data to which the design-rules have been applied.
• For example, for NZ Police this may involve ensuring all statistical reporting occurs via the data warehouse, rather than directly from proprietary operational IT systems. It may also require that 'Business Objects' universes be designed to support the design-rules and not report other data in a form that facilitates statistical reporting.
• One mechanism for controlling this is through the governance processes for approving expenditure. For example, the organisation should consider how payments are currently approved to IT system suppliers for developing reports or reporting capability on their systems.
• One type of change-request may involve making data from an operational IT system conform with the design-rules, thereby improving its quality and enabling statistical reporting from it.

11. IT designers should endeavour to prevent the creation of or modification to database fields in ways that bypass guardians and/or do not comply with the design-rules.

12. The Assurance group at PNHQ should incorporate auditing of new or newly modified forms and IT applications into its audit framework, to check for compliance with policy.

Next Steps

13. Statistics NZ should consider the potential for developing a guideline/manual for government agencies, incorporating the principles recommended by this project.

14. Police's National Statistics Manager should, in consultation with ICTSC, develop the recommended manual for guardians and the template and guidelines for proposing new forms or modifications to forms and IT applications.

15. Police's ICTSC should, in consultation with the Statistics Unit in the Organisational Performance Group, PNHQ, document a standard process for creating and modifying forms and IT applications.

16. Police's Applications Support group should, in consultation with the Statistics Unit in the Organisational Performance Group, PNHQ, develop an initial data dictionary and a manual for its use, then integrate maintenance of the data dictionary into the new standard process above.

17. The Manager ICT should allocate guardianship responsibilities to staff, in line with the recommendations of this project.

18. Police's Policy unit should draft, consult on, and seek appropriate approval for policy that supports these recommendations.
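To make recommendation 6 more concrete, the following is a minimal sketch, in Python, of the kind of information a standard change-request template might capture. The class and field names are illustrative assumptions made for this paper, not an existing Police artefact; the actual template would be developed under recommendation 14.

```python
# Illustrative sketch only: a machine-readable change-request template.
# All names here are hypothetical; the real template would be developed by
# the National Statistics Manager in consultation with ICTSC.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldSpec:
    """One field on the proposed form or IT application input screen."""
    label: str                      # proposed variable label (checked against the data dictionary)
    definition: str                 # plain-language definition
    variable_type: str              # e.g. "categorical", "numeric", "date", "free-text"
    scale_or_range: Optional[str]   # permitted categories or numeric range
    in_statistics_subset: bool      # False = excluded; cannot drive statistical reporting


@dataclass
class ChangeRequest:
    """Accompanies a proposal for a new or modified form or IT application."""
    proposer: str                                           # person or unit proposing the change
    target: str                                             # affected form or IT application
    description: str                                        # what is changing and why
    stakeholders_consulted: List[str] = field(default_factory=list)
    fields: List[FieldSpec] = field(default_factory=list)

    def excluded_fields(self) -> List[str]:
        """Fields the proposer has excluded from the statistics subset,
        and from which statistics therefore cannot be produced."""
        return [f.label for f in self.fields if not f.in_statistics_subset]
```

Whatever its eventual format, the point of the template is that it forces the proposer to list stakeholders and to spell out, field by field, what is in or out of the statistics subset before the guardian assesses the proposal.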
Conclusions

This project, although not empirical, has provided a valuable step forward in understanding both design-principles for administrative data, and the issues that require addressing in order to introduce such principles into the business.

The project did not examine in any detail what constitutes 'quality' in administrative data. Rather, using quality principles already identified in the literature, the project looked at how these manifest in an administrative data context and considered design-principles to address them.

Many of the principles applicable to the design of survey data are also applicable to administrative data. However, the context of creating survey data differs markedly from that of creating administrative data. This leads to a need to create resources in a variety of forms, in order to effect good quality statistical data. These resources include:

• Standard documented processes,
• An organisation-wide data dictionary,
• Allocation of defined roles and responsibilities,
• Documented guidelines for staff to undertake these responsibilities,
• Organisational policy to assist and ensure compliance, and
• Audit, to ensure compliance.

Although undertaken as a case study on New Zealand Police, most results - at the principles and rules level - do not appear to be uniquely applicable to Police, but would be of more generic applicability to any organisation producing administrative data. The results are therefore encouraging.
Appendix 1: Notes from workshops in Phase II

1. Aims of this policy

A policy is required that assists and ensures Police staff create and modify forms and IT applications in a standardised way that incorporates principles which ensure quality statistical information can be produced.

Determining content is not in scope for the policy/guidelines

This policy affects front-end design; not IT system data architectures or back-end extraction tools and standards. These are also required to deliver quality statistical reporting, but are out of scope for this particular policy.

This policy does not dictate what information Police must record; neither does it dictate what recorded information will be used to generate statistical information. Instead, once the business has made these decisions, this policy affects aspects of how forms and IT application front-ends will be designed.

That said, it is acknowledged that unless centralised control exists over the process by which decisions are made about the design of forms and IT applications, standards will be impossible to apply in practice. Therefore, this policy must include relevant aspects of business process. As part of this, consideration must be given to whose responsibility it is to determine what statistical information is required from a particular form or IT application development. The options considered, along with their strengths and weaknesses, were:

1. Gatekeeper/guardian
   Strengths: A centralised role who is involved in the process at the appropriate stage.
   Weaknesses: Places much higher responsibility on the guardian than envisaged. The guardian is a facilitator, rather than a business owner.

2. Proposer
   Strengths: Likely to come from the business area driving the change.
   Weaknesses: Involvement in this process is one-off, so the proposer is unlikely to have expertise in identifying stakeholders and determining information needs.

3. Business owner of the functional area that creates the data
   Strengths: Will be a key stakeholder for the affected data and likely to understand the business implications in that area.
   Weaknesses: May not appreciate broader synergies, information management implications and stakeholder needs.

4. Business owner of the IT application
   Strengths: Will understand the directly affected system.
   Weaknesses: IT groups should have a capability responsibility, rather than business ownership of the data.

5. A centralised non-IT business information group (E.g. Headquarters Statistics Unit)
   Strengths: Is centralised and is likely to understand statistical issues and how to identify the breadth of information needs.
   Weaknesses: This function may not be consistent with the core function of the group.

Additionally, wherever the function sits, there may be resourcing implications.
Conclusion: It is not necessary for one group to carry this responsibility alone. Instead, a standardised 'change-request' form should be created for use at step 2 of the processes recommended in sections 3.2 and 3.3 of the Integrationworks report (Appendix 3). Responsibility for completing the change-request form should sit with the proposer. However, the form should guide the proposer to identify stakeholders and ensure they are consulted. Such stakeholders will be a mix of operational business managers and centralised expert groups, E.g. District Commanders, the National Statistics Manager and the Manager Applications Support. On certain key projects, it may be appropriate for one of these stakeholders to take over responsibility for proposing the change. Such an approach leaves responsibility for doing the leg-work with the proposer, but ensures input from a centralised group who can apply expertise that the proposer does not necessarily have.

So what is in scope? - Designing the capture of what is recorded

Our aim is to ensure the quality of whatever statistical information Police desire to be produced by:

• Requiring that any proposed creation or change of a Police form be considered by a 'gate-keeper' or 'guardian' with the responsibility and knowledge to ensure the form is designed in such a way that the resultant statistical information will be of good quality.
  o The project team notes that what is meant by 'quality' requires definition. However, such a definition may not differ markedly from existing documented definitions regarding survey data, such as are contained in the documentation reviewed.
• Requiring that any proposed creation or change of a Police IT application be considered by a 'gate-keeper' or 'guardian' with the responsibility and knowledge to ensure the IT application will capture information in such a way that any statistics Police desire to be produced from this system will have acceptable quality.
• Ensuring that these gatekeepers ask the following questions:
  o "Of all of the information that is proposed to be recorded on the form/in the IT application, what subset of this information do Police require statistics to be derived from?" (the statistics subset), and
  o "Is this subset sufficient to provide the desired statistical information?" (If not, either the statistics subset needs to be extended, additional information needs to be recorded or, where not possible, it needs to be accepted that not all of the desired statistical information can be produced from this particular form or IT application.)

Note: It is anticipated that most data recorded will, by default, be part of the statistics subset unless specifically excluded. Subsequent discussion will therefore speak as if all data is included in the statistics subset. It is noted here, however, that
design-principles applied to this data will not necessarily be relevant to data specifically excluded from the statistics subset.

• Preventing data that has been excluded from the statistics subset from being extracted from Police systems in a way that enables such data to be used for calculating statistical information. E.g. a free-text field used for capturing rough notes may be excluded from the statistics subset. This field may be extracted to a data warehouse along with other data and may be queried, but it may not be used to select records for aggregation, such as in a 'Condition' statement in SQL. (This stops work-arounds that would jeopardise quality by avoiding proper design.)
• Providing guardians with design-rules that ensure forms and IT applications capture data in such a way that good quality statistical information will result.

Note: This policy alone cannot ensure quality. It will do so as far as the design of forms and IT applications is concerned; however, to complement this policy, three further aspects are required:

• policy is required to govern staff recording practices,
• data storage, transformation and access mechanisms (I.e. the IT back-end) need to be designed appropriately, and
• correct practice throughout the system needs to be monitored in a performance management framework.

These components are addressed through other aspects of Police's Statistics Strategic Plan.

Furthermore, some important existing Police systems have already been built in a way that would not comply with this new policy. As a result, the statistical information available from these systems is inferior in scope and quality to that desired. Projects to retrospectively redesign aspects of these systems would be required if these limitations are to be addressed. This new policy will, however, prevent the design practices seen in these legacy systems from being repeated.

2. Design Rules (drafted as if for a guardian)

Full vs partial modification

When only a part of a form or IT system is being modified, a decision is required on whether to limit the application of these design-principles to just that part, or whether to take the opportunity to review the whole form, IT application, or relevant module of the IT application. While any change to a form or IT application provides an opportunity to address a number of issues at the same time, a comprehensive change may require extensive investment that may not be warranted by the desire for a minor improvement in a form.
Four distinct issues need to be considered:

• implications for the form/screen layout,
• implications for other systems that might use the same variable(s),
• implications for time series, and
• operational implications.

Implications for the form/screen layout

This is the simplest impact to assess. The main requirement is that any changes must be consistent with the rest of the form and avoid confusion. E.g. another part of the form may refer to a field in the affected portion.

Implications for other systems

This may be more difficult to assess, unless some mechanism to link and/or match fields in different systems exists. If affected variables are contained in the organisation's data dictionary, and if this data dictionary identifies all forms and systems using the same variable, this will make the task easier.

Conclusion: A data dictionary should be created for the organisation. (Refer to section 3.3 for details.)

Implications for time series

Altering variables or classifications can interrupt time series. Examples include mandating entry of what was previously an optional field, introducing a new category, obsoleting an existing category, or changing whether or not 'don't know' or 'missing data' is permitted. Similarly, even altering how information is prompted can affect what is entered. Literature on the design of survey measuring instruments makes this clear. In the absence of evidence to the contrary, we should not assume this principle differs for administrative data.

Changes should therefore only be made if important and, wherever possible, where the impacts on time series have been analysed. Metadata accompanying the resultant statistical information should note changes that potentially affect time series.

Operational implications

Will the change being proposed for the form/IT application work for the business, including all affected stakeholders? For example, if a new field/variable is proposed, is it operationally appropriate to collect this information at the stage of the business process at which the data is being recorded? Or, if a new category is being proposed, does that category have operational relevance?
Conclusion

The decision about whether to make a partial or full modification when a change is proposed should be made on a case-by-case basis. The policy should recognise that judgment needs to be applied. However, it should always require consideration of the above four types of implications and, where appropriate, the provision of documentation (E.g. metadata, training materials, etc.).

Statistical units vs attributes

In short, both may be variables, appearing as fields in forms and data entry screens. However, statistical units are what is counted, and attributes are the descriptors of what is counted, which can be used to select what to count (E.g. in 'Condition' statements of SQL queries).

Statistical units (or 'measures')

Considering the information to be collected on a form or in an IT system record, identify the desired statistical units, I.e. what is to be counted. Statistical units can be considered as being one of two types: 'direct' and 'derived'.

Direct: These will typically be the 'record' (E.g. a form) that corresponds to the action or object to which the form relates. For example, a transaction, occurrence, property item, person, etc.

Derived: These do not necessarily have a one-to-one relationship with the core form or record being captured in an instance. Derived measures need to be considered carefully, firstly to determine feasibility and how to derive desired measures that are not one-to-one with records, and secondly to be sure that the measure's definition is valid.

An example of a derived measure would be where we wish to count the number of children present at domestic violence incidents. A single record per incident is created from the relevant form (for Police, this is the POL400 form). However, this form contains fields specifying the number of children present at the incident, so it is possible to count the number of children present at domestic violence incidents.

From a statistical perspective, where it is possible to obtain statistical units both directly and derived, it is preferable to obtain them directly (E.g. a separate form for each person present at an incident, rather than a count of such people on the incident report). However, this needs to be balanced against respondent burden, firstly through requiring additional information to be recorded that is not required for operational purposes, and secondly through double-entering information that may already be recorded elsewhere in the system.

Attributes

Identify the factors that will be used to select and characterise statistical units. These are known as 'attributes'. Typical attributes may be:
• time periods,
• geographical areas,
• categories of measures, such as job-type, ethnicity, gender, age, cost-centre, job-closure category, etc., or
• numerical values that describe the measured object, such as age, height, weight, etc.

Different attributes may have different formats, such as categorical, numeric, date, etc.

Note: Unique identifiers of individual records (E.g. a file number) are not statistical attributes, as they only apply to a single record and do not characterise measures.

Once statistical units and attributes have been decided, determine the hierarchical relationships. For example, one Occurrence may contain one or more Offences, for which there may be none, one or many Apprehensions of offenders. Similarly, one District may contain a number of Police Areas, which may contain a number of Police Stations.

Check fitness-for-purpose

Design the change-request template indicated in sections 3.2 and 3.3 of the Stage 1 report to provide all of the information required by the guardian. For example, it should work through all fields in the proposed form or IT application input screens, identifying which fields are to be excluded from the statistics subset. For all other fields, suggested labels, definitions, variable-type, scale and range should be included in the template. The template should also identify stakeholders of the relevant statistical information and have their sign-off. Such sign-off is to mean that the stakeholder is satisfied that the statistics subset is sufficient to provide all of the statistical information they expect from what is collected on that form or IT application input screen.

In deciding how much data to record, balance respondent burden against the desired information. Also, design the structure of any data recorded to maximise the potential for integrating other data and deriving additional information. The template must make it clear to the proposer and stakeholders that it will be impossible to produce statistics using excluded fields.

Design of variables

Check all of the variables in the proposed form or IT application against the organisation's data dictionary. This data dictionary is a central register of variables, containing their labels, definitions, formats, scales, ranges, and instructions to be given to people capturing data. The data dictionary is not specific to IT systems; rather it defines variables that may appear in any system. It is important to retain the distinction between variable definitions, which are about information, and IT systems, which are about technology.
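As a concrete illustration of the data dictionary described above, the following is a minimal Python sketch of a central register of variables. The structure and names are assumptions made for illustration; they do not describe an actual Police or Statistics New Zealand system.

```python
# Illustrative sketch only: a central register of variables, independent of any one IT system.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DictionaryEntry:
    label: str                         # unique, intuitive variable label
    definition: str                    # what the variable means (information, not technology)
    format: str                        # e.g. "categorical", "numeric", "date"
    scale: Optional[str] = None        # e.g. classification name, or units for numeric values
    valid_range: Optional[str] = None  # permitted values
    capture_instructions: str = ""     # instructions given to staff recording the data
    used_in: List[str] = field(default_factory=list)  # forms/screens that use this variable


class DataDictionary:
    def __init__(self) -> None:
        self._entries: Dict[str, DictionaryEntry] = {}

    def add(self, entry: DictionaryEntry) -> None:
        # Labels must be unique across the organisation.
        if entry.label in self._entries:
            raise ValueError(f"Label already in use: {entry.label}")
        self._entries[entry.label] = entry

    def find(self, label: str) -> Optional[DictionaryEntry]:
        return self._entries.get(label)

    def forms_using(self, label: str) -> List[str]:
        # Supports impact analysis: which other forms/screens share this variable.
        entry = self.find(label)
        return entry.used_in if entry else []
```

Recording which forms and screens use each variable also supports the impact analysis on other systems and on time series discussed earlier.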
If a proposed variable is the same as or similar to an existing variable in the data dictionary, and such similarity is considered adequate for operational purposes, the existing variable should be used in preference to creating a new one. In general, use a structured format in preference to free text, wherever possible. Wherever the guardian proposes a change from the proposal, this should be worked through with the proposer and, in turn, the stakeholders identified on the template. Variables on different systems with the same label must be fully compliant with the specification in the data dictionary, including definitions, categories, etc.

Where no suitable variable already exists in the data dictionary, consideration should be given to external standards with which Police may desire to be compatible. For example, we may wish to compare crime statistics with Australia; the data dictionaries for the Australian Standard Offence Classification (ASOC) and the Recorded Crime Victims Statistics (RCVS) should be considered. Where no such explicit standard applies, the Justice Sector data dictionary should be consulted and, failing that, Statistics New Zealand should be consulted, in order to identify a potentially suitable existing variable. Where none exists, a new variable should be created and entered into the data dictionary, with all required information about it completed. This new variable must have a unique label - one that is not already in use elsewhere. Subject to this limitation, the label should be as intuitive as possible.

The data dictionary should be made as accessible as possible online to all staff in Police.

Free-text format fields should only be used in the statistics subset as a last resort. Where they are included, comprehensive instructions must be provided, specifying how to use these fields, and coding schemes must be created to encode the data in them for the purpose of producing statistical information.

Note: The contents of excluded fields may be reported as qualitative information, but excluded fields may not be used to determine which records to aggregate when calculating statistics (E.g. they are not to be used in 'Condition' statements of SQL queries).

Design of classifications

Statistics New Zealand's 'Best Practice for Classifications' should be referred to and complied with when designing categorical variables in forms and IT applications. (Note: 'classification' requires definition.) Some specific requirements include:

For categorical variables, categories must be mutually exclusive and exhaustive. For each categorical variable, a flat or hierarchical classification structure must be used. "A flat structure should be used when a simple listing is required or when there is no requirement to aggregate or group categories into more meaningful categories."
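To illustrate the flat versus hierarchical distinction and the mutually-exclusive-and-exhaustive requirement, the following is a minimal Python sketch of a two-level classification check. The category names are invented for illustration and are not drawn from ASOC or any Police classification; genuine mutual exclusivity depends on category definitions, so the check below only catches structural problems.

```python
# Illustrative sketch only: a two-level (hierarchical) classification and a basic structural check.
from typing import Dict, List

# Level 1 (broad groups) mapped to Level 2 (detailed categories that roll up to them).
# A flat classification would simply be a single list of categories.
classification: Dict[str, List[str]] = {
    "Dishonesty offences": ["Theft", "Burglary", "Receiving"],
    "Violence offences": ["Assault", "Robbery"],
    "Other offences": ["Other", "Unknown"],   # residual categories help keep the scheme exhaustive
}


def check_classification(levels: Dict[str, List[str]]) -> List[str]:
    """Flag structural problems: a detailed category listed under more than one
    group (breaking mutual exclusivity), or no residual category (risking
    operational scenarios that fit nowhere)."""
    problems: List[str] = []
    detailed = [category for categories in levels.values() for category in categories]
    duplicates = sorted({c for c in detailed if detailed.count(c) > 1})
    if duplicates:
        problems.append(f"Categories appear under more than one group: {duplicates}")
    if not any(c in ("Other", "Unknown") for c in detailed):
        problems.append("No residual category ('Other'/'Unknown'): may not be exhaustive")
    return problems


print(check_classification(classification))   # prints [] for this example
```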
Statistical balance should be considered when deciding the level of aggregation in establishing category boundaries. Although it is accepted that certain infrequently occurring types of instance may require a unique category for operational purposes, unless such granularity is required for operational purposes, aggregation in setting category boundaries should seek to make the frequency counts in each category of a similar order.

The level of precision that values or categories can take should reflect the level of precision appropriate for operational purposes in the context where the data is recorded. For example, it is inappropriate to record the specific level of aggravation in a violent offence when a member of the public is reporting an assault over the phone; conversely, when charging an alleged offender, a specific charge is required.

When modifying classifications or designing new ones, the design should attempt to be robust against future needs. Statistical feasibility is part of this consideration. Instead of attempting to fit all responses, as is required for a survey, for administrative data we need to ensure all operational scenarios can be fitted into the classification.

Mandatory entry

Data entry should not be prompted with default values in fields, as this introduces both an error component and statistical bias. Allow an 'Unknown' category wherever not knowing is a valid operational scenario. Avoid force-fitting unknown values into other categories, as doing so would introduce an error component. The IT front-end should enforce valid and mandatory entry of all fields which may be used in producing statistical information.

Reliability of measurement

If a variable or category cannot be measured with adequate reliability, even if such information is desired, it should be excluded from the statistics subset. Rationale: this avoids misinformation. (Definitions are required for 'reliability' and 'adequate'; this may be difficult to assess.)

Modifications to existing designs

Modifications to existing forms or IT applications may impact the continuity of time series or the reliability of existing variables. Guidelines should be given for the redesign to account for this: for example, any impact analysis, mapping, metadata creation, classification version numbers, documentation of form and system changes, etc.

One option is to create a new version number to reflect real-world change or a change to structure or content. For example, a new version could be created when
new categories are required, old categories are deleted, or where the statistical unit being classified changes. Also, where a definition changes, this needs to be addressed in metadata; for example, a change in what Police treat as being 'rural', even though the form itself doesn't change.

Ideally the data dictionary should include a database that identifies which variables are contained in which forms and IT application screens. Modifications to forms, screens or variables should check this record to maintain alignment.

Prevention of work-arounds

It is acknowledged that administrative data is used for different purposes by different users throughout an organisation. Such users often seek to 'innovate' expedient solutions to report statistics, without giving due consideration to 'quality' issues. Although there is no single method of preventing this, a combination of steps should be taken if an organisation wishes to ensure its statistical information has adequate quality. These steps should attempt to make it easy for staff to do the right thing, make it hard for staff to break the rules, and include consequences for breaking the rules.

Specific steps an organisation can take include:

• Producing guidelines
• Producing policy
• Including compliance in audit and performance management frameworks
• Ensuring back-end IT applications used to extract and present statistical information restrict the inclusion of non-compliant variables/fields in statistical reports.

Optional tactical approaches include:

1. Ensure that any back-end applications used to extract and present statistical information are designed in such a way that variables outside the statistics subset cannot be used as statistical attributes. For example, they cannot be used in SQL Condition statements or as categorical variables in SQL Select statements.
2. Ensure that any back-end applications used to extract and present statistical information are designed in such a way that if variables outside the statistics subset are used to select records to report, they can only be used to report lists; not measures.
3. Ensure that any back-end applications used to extract and present statistical information are designed in such a way that if variables outside the statistics subset are used to select records to report, such reports will not include algebraic computations (E.g. summation).
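As a sketch of how tactical approach 1 might be enforced in a back-end reporting layer, the following Python function rejects statistical queries whose WHERE or GROUP BY clauses reference fields excluded from the statistics subset. The field names and the simple text check are assumptions made for illustration; a real implementation would sit in whatever extraction or universe-design layer the organisation actually uses (e.g. the data warehouse or Business Objects universes mentioned in the recommendations).

```python
# Illustrative sketch only: refuse to let excluded fields select or aggregate records.
import re
from typing import Iterable

# Hypothetical fields excluded from the statistics subset.
EXCLUDED_FROM_STATISTICS_SUBSET = {"officer_rough_notes", "free_text_summary"}


def check_query(sql: str, excluded: Iterable[str] = EXCLUDED_FROM_STATISTICS_SUBSET) -> None:
    """Raise an error if an excluded field appears in a WHERE or GROUP BY clause.
    Excluded fields may still be displayed (e.g. as qualitative information)."""
    selecting_clauses = re.split(r"\bWHERE\b|\bGROUP\s+BY\b", sql, flags=re.IGNORECASE)[1:]
    for clause in selecting_clauses:
        for field_name in excluded:
            if re.search(rf"\b{re.escape(field_name)}\b", clause, flags=re.IGNORECASE):
                raise ValueError(
                    f"Field '{field_name}' is outside the statistics subset and "
                    "cannot be used to select or aggregate records"
                )


# Allowed: the excluded free-text field is only displayed, not used to select records.
check_query("SELECT occurrence_id, officer_rough_notes FROM occurrences WHERE district = 'X'")

# Rejected: the excluded free-text field drives record selection.
try:
    check_query("SELECT COUNT(*) FROM occurrences WHERE officer_rough_notes LIKE '%burglary%'")
except ValueError as err:
    print(err)
```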
Appendix 2: Recommendations for the design-rules

Determine the scope of the change

When only a part of a form or IT system is being modified, a decision is required on whether to limit the application of these design-rules to just that part or whether to take the opportunity to review the whole form, IT application, or relevant module of the IT application. This decision should be made on a case-by-case basis, applying the principles articulated in section 2.1 of Appendix 1. The policy should recognise that judgment needs to be applied. However, it should always require consideration of the principles articulated in section 2.1 of Appendix 1 and, where appropriate, the provision of documentation (E.g. metadata, training materials, etc.).

Define the variables/fields to incorporate, with the aid of a Data Dictionary

Check all of the variables determined to be in scope in section 3.1 above against the organisation's data dictionary. If a proposed variable is the same as or similar to an existing variable in the data dictionary, and such similarity is considered adequate for operational purposes, the existing variable should be used in preference to creating a new one. In general, use a structured format in preference to free text, wherever possible. Wherever the gatekeeper proposes a change from the proposal, this should be worked through with the proposer and, in turn, the stakeholders identified on the template. Variables on different systems with the same label must be fully compliant with the specification in the data dictionary, including definitions, categories, etc.

Where no suitable variable already exists in the data dictionary, consideration should be given to external standards with which Police may desire to be compatible. For example, we may wish to compare crime statistics with Australia; the data dictionaries for the Australian Standard Offence Classification (ASOC) and the Recorded Crime Victims Statistics (RCVS) should be considered. Where no such explicit standard applies, the Justice Sector data dictionary should be consulted and, failing that, Statistics New Zealand should be consulted, in order to identify a potentially suitable existing variable. Where none exists, a new variable should be created and entered into the data dictionary, with all required information about it completed. This new variable must have a unique label - one that is not already in use elsewhere. Subject to this limitation, the label should be as intuitive as possible.

Free-text format fields should only be used in the statistics subset as a last resort. Where they are included, comprehensive instructions must be provided, specifying how to use these fields, and coding schemes must be created to encode the data in them for the purpose of producing statistical information.
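The order of preference described above (reuse an existing dictionary variable, then consult external standards such as ASOC/RCVS, the Justice Sector dictionary, or Statistics New Zealand, and only then create a new variable) can be summarised as a simple decision procedure. The sketch below is an illustrative Python rendering only: the similarity judgement and the external lookups are placeholders for what would in practice be a human decision by the gatekeeper.

```python
# Illustrative sketch only: the gatekeeper's order of preference for a proposed variable.
from typing import Callable, Dict, List, Optional


def resolve_variable(
    proposed_label: str,
    data_dictionary: Dict[str, dict],                         # label -> entry details
    external_lookups: List[Callable[[str], Optional[str]]],   # e.g. ASOC/RCVS, Justice Sector, Stats NZ
) -> str:
    """Return the label the form or IT application should use."""
    # 1. Prefer an existing, operationally adequate variable in the data dictionary.
    #    (Reduced here to an exact label match; in practice this is a judgement call.)
    if proposed_label in data_dictionary:
        return proposed_label

    # 2. Otherwise consult external standards and sector dictionaries, in order.
    for lookup in external_lookups:
        match = lookup(proposed_label)
        if match is not None:
            return match

    # 3. Failing that, register a new variable with a unique, intuitive label.
    data_dictionary[proposed_label] = {
        "definition": "",            # to be completed before the proposal proceeds
        "format": "",
        "capture_instructions": "",
    }
    return proposed_label
```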