We created the SIDF code for use within Stats NZ's Integrated Data Infrastructure (IDI). The code creates analysis-ready datasets, with variables based on a population and time period that the user specifies.
In other words, a user can easily specify who, when and what they are interested in analysing, and datasets ready for analysis will be created.
Different projects require different datasets for analysis. The SIDF code is flexible: it creates datasets tailored to user specifications, and power users can extend it with their own code to build custom variables.
Building datasets ready for analysis is a time-consuming task – especially if you are a first-time IDI user. Automating as much of the data preparation as possible reduces the time needed to build datasets, leaving more time for analysis.
The code has been designed as a modular framework so you can run the code from end-to-end or create your own flow.
An example of the code being used in a modular format can be found in the Social Housing test case code.
The Social Investment Analytical Layer (SIAL) code creates a series of social sector service event tables, allowing the services people receive to be analysed in a standard way. This speeds up analysis and encourages consistent definitions.
The tables generated by the SIAL code are one of the inputs into the SIDF framework. They're used to create service-metric variables that summarise a person’s interactions with government over a given time period.
Other inputs into the SIDF include demographic information and outcome indicators that require more detailed coding logic than can easily be incorporated into the SIAL.
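To make the idea of a service-metric variable concrete, here is a minimal Python sketch of how event rows might be summarised per person over an analysis window. It is an illustration only and is not the SIDF or SIAL code: the row layout, the agency labels, the `service_metrics` function and all values are hypothetical, and the rule of counting an event when its start date falls inside the window is an assumption.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event rows in the spirit of a service event table:
# (person_id, agency, start_date, end_date, cost). All values are invented.
EVENTS = [
    (1, "agency_a", date(2015, 1, 10), date(2015, 3, 10), 1200.0),
    (1, "agency_b", date(2015, 6, 1), date(2015, 6, 3), 800.0),
    (2, "agency_a", date(2013, 5, 1), date(2013, 8, 1), 900.0),
    (2, "agency_b", date(2015, 2, 1), date(2015, 2, 2), 150.0),
]


def service_metrics(events, population, window_start, window_end):
    """Count events and sum costs per person and agency.

    Assumption: an event is counted when its start date falls inside
    the window (inclusive at both ends). People outside the population
    are ignored.
    """
    metrics = {
        pid: defaultdict(lambda: {"events": 0, "cost": 0.0})
        for pid in population
    }
    for pid, agency, start, _end, cost in events:
        if pid in metrics and window_start <= start <= window_end:
            cell = metrics[pid][agency]
            cell["events"] += 1
            cell["cost"] += cost
    # Convert the inner defaultdicts to plain dicts for easier inspection.
    return {pid: dict(by_agency) for pid, by_agency in metrics.items()}


# Who: people 1 and 2. When: calendar year 2015. What: events and cost by agency.
result = service_metrics(EVENTS, {1, 2}, date(2015, 1, 1), date(2015, 12, 31))
```

In this sketch, person 2's 2013 event falls outside the window, so it contributes nothing to their metrics. A real implementation would also need to decide how to treat events that span the window boundary.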
Authorised IDI users can create tables by running the SIDF code inside the IDI.
Detailed instructions can be found in the README file.
You're welcome to contribute your code, or raise issues, on GitHub.
Please email your feedback to email@example.com.