Coverage Technical Report, Census of Population, 2021
7. Census Undercoverage Study
The primary objective of the Census Undercoverage Study (CUS) is to estimate the number of persons in the 2021 Census target population who were not enumerated at the national, provincial and territorial levels. A sample of individuals was drawn from six sampling frames independent of the 2021 Census. The data for the selected persons (SPs) were linked with tax data and other administrative sources to obtain recent information about their usual residence, contact addresses, household members, and related groups of persons.
A set of complex automated linkages and manual searches was done to find the SP in the 2021 Census Response Database (RDB). The census coverage studies (CCS), including the CUS, were carried out based on the version of the RDB that was available in mid-October 2021 (i.e., before the end of census processing). This version, which predates the final 2021 RDB, was called the CCS-RDB. There are a few minor differences between the CCS-RDB and the later versions of the census databases.
When a search produces no matches, multimode collection is done to determine whether the SP was a member of the target population and to get additional information (including addresses) to help find the SP in the CCS-RDB. At the end of the search, each SP is classified as out-of-scope (deceased, emigrated, temporarily outside Canada), enumerated or missed. A small number of non-response cases, consisting mostly of persons who could not be traced through collection, must be processed and are used to adjust respondent weights based on a non-response adjustment model.
7.1 Sampling
The sampling frame for the CUS target population, which includes all persons who should have been enumerated in the 2021 Census, is constructed from six frames independent of the 2021 Census. The first five frames were used to select a sample to estimate undercoverage in the 10 provinces, while estimates for the three territories were calculated using samples from the last frame only.
At the provincial level, sampling began with the persons who were in the 2016 Census target population. This includes all persons enumerated in the 2016 Census and all persons missed by the 2016 Census, represented by the portion of the sample of SPs in the 2016 CUS who were classified as missed. To account for persons added to the target population since the last census, intercensal (i.e., between the 2016 and 2021 censuses) births and immigrants were added, as were non-permanent residents as of Census Day in 2021. The data sources for these frames are as follows:
- Census frame: Persons who were enumerated in the 2016 Census and appear in the 2016 CCS-RDB.
- Missed frame: There is no comprehensive list of missed persons. However, there is a representative sample of these persons: the 2016 CUS sample of SPs classified as missed. They are all included in the 2016 sample with their 2016 weights.
- Birth frame: Vital statistics data on intercensal births. Since the final vital statistics file on births is only available late, the CUS sample of births is drawn from a mix of preliminary, final and raw vital statistics data files.
- Immigrant frame: Administrative data from Immigration, Refugees and Citizenship Canada (IRCC) on immigrants who arrived in Canada during the intercensal period.
- Non-permanent resident frame: Administrative data from IRCC on persons claiming refugee status on Census Day and persons with a valid work or study permit on Census Day.
For each territory, the main survey frame consisted of health insurance files for persons eligible for health care on Census Day. Although this frame has excellent coverage, it is incomplete, so the sampling weight must be adjusted. Each frame for a given territory is independent of the other territory frames and is used to estimate the undercoverage only for that given territory. In addition, the territory frames are not used to estimate undercoverage in the provinces. In the 2021 CUS, non-permanent residents in the territories who had work or study permits and were not already included in health insurance files were added to the territory frames.
None of the first five frames for the provinces covered persons who had emigrated or were outside Canada during the 2016 Census, did not complete a 2016 Census questionnaire, and returned during the intercensal period (“returning Canadians within a province”). According to the 2021 Census long-form questionnaire, the number of persons in this group was estimated at 252,089. In addition, the number of persons returning from a territory to a province was estimated at 13,426. Added to this number were 120 persons from reserves and settlements that were incompletely enumerated in 2016 and enumerated in 2021, and 8,489 persons from reserves or settlements who had returned in 2016 and were enumerated in 2021, but who were excluded from the 2016 Census frame. Also, persons born after the 2016 Census outside Canada or in the territories who have Canadian citizenship and who returned to one of Canada’s 10 provinces by Census Day in 2021 were not covered by the first five CUS frames. According to the 2021 Census long-form questionnaire, the number of persons in this group was estimated at 16,925. Coverage error estimates do not include these populations, estimated at a total of 291,049 persons.
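The total quoted above can be checked by summing the five components given in the paragraph. The following is a minimal sketch; the dictionary keys are illustrative labels chosen here, not official category names.

```python
# Populations not covered by the first five provincial frames, as quoted in the text.
# The keys are illustrative labels, not official Statistics Canada category names.
excluded = {
    "returning_canadians_within_province": 252_089,
    "returned_from_territory_to_province": 13_426,
    "reserves_incompletely_enumerated_in_2016": 120,
    "reserves_excluded_from_2016_census_frame": 8_489,
    "born_after_2016_census_outside_provinces": 16_925,
}

# Sum of all groups excluded from the coverage error estimates.
total_excluded = sum(excluded.values())
print(f"{total_excluded:,}")  # 291,049
```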
One problem with using multiple sampling frames is the possibility that the same person could be included in more than one frame. For example, a person in the immigrant frame may have been in Canada on a work permit in May 2016 and therefore may have been enumerable in the 2016 Census. That person would then be in both the immigrant frame and the census frame if they were enumerated, or in the missed frame if they were not enumerated. Consequently, it is important to identify all cases of frame overlap. Otherwise, estimates may be too high because some people are included twice in the frames. Whenever possible, this overlap is identified when the sampling frames are constructed, but some overlap is also identified later using information provided by respondents.
The sample design varied by frame depending on the type of list used. A one-stage stratified design was used for the 2016 Census frame. The stratification methodology was significantly changed for the 2021 CUS. Prior to stratification, several deterministic linkages were done. First, the frame was linked with the tax data, and over 96% of the persons were linked. Then the frame was linked with the vital statistics death files. It was also linked with IRCC files to find non‑permanent residents in the frame. Finally, it was linked with the 2021 RDB using the monster match program, which is also used for the processing of the CUS sample. This process provides suggestions for potential enumeration and an indicator of the strength of each suggestion. Some suggestions are strong enough for the person to be considered enumerated without the suggestion having to be checked. These cases are called self-enumerations (“auto-enumerated” in Table 7.1.1). Following these linkages, the frame was stratified. Two take-all strata were created: the deceased stratum and the self-enumerated stratum. Next, six take-some strata were created, taking into account the probability of enumeration of persons (strength of the suggestion in the 2021 RDB), the tax situation and the likelihood of being out of scope of the census. However, persons enumerated on reserves and settlements in the 2016 Census were placed in separate strata using the same criteria, with some strata grouped together because the population is smaller and more homogeneous.
The take-some strata were then stratified by province. For those residing in the six smallest provinces in 2016, the stratification province was the province of residence in 2016 (in the 2016 RDB). For persons in the four largest provinces in 2016, the derivation of the stratification province varied by stratum. In the strata with high probability of enumeration in the 2021 RDB, the province of potential enumeration in the 2021 RDB was used. Otherwise, where the person was linked to the tax data, the most recent province of residence based on these data was used. As a last resort, the province listed in the 2016 RDB was used.
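The order of precedence for deriving the stratification province can be written as a small decision function. This is a sketch only: the function name, argument names and the fall-through behaviour when a 2021 RDB province is unavailable are assumptions made for illustration, not part of the documented CUS procedure.

```python
from typing import Optional

def stratification_province(
    in_six_smallest_2016: bool,          # person resided in one of the six smallest provinces in 2016
    high_enumeration_probability: bool,  # stratum with high probability of enumeration in the 2021 RDB
    province_2016_rdb: str,              # province recorded in the 2016 RDB
    province_2021_rdb: Optional[str] = None,  # province of potential enumeration in the 2021 RDB
    province_tax: Optional[str] = None,       # most recent province from tax data, if linked
) -> str:
    """Return the province used for stratification (hypothetical helper)."""
    if in_six_smallest_2016:
        return province_2016_rdb
    # Assumption: if no 2021 RDB province is available, fall through to the next source.
    if high_enumeration_probability and province_2021_rdb is not None:
        return province_2021_rdb
    if province_tax is not None:
        return province_tax
    return province_2016_rdb  # last resort

print(stratification_province(False, False, "ON", None, "AB"))  # AB
```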
The missed frame is a sample-based frame because there is no list of all persons missed in the 2016 Census. The sample for this frame consists of all cases classified as “missed” in the 2016 CUS. Although the sample was not stratified as such, implicit stratification was inevitable because the 2016 missed cases were from different frames and strata.
To construct the birth frame, copies of intercensal birth registrations were obtained from vital statistics through the National Routing System, which provides faster access to these data. The frame contains all births between May 10, 2016, and
The immigrant frame was constructed with records from IRCC. The immigrant frame contains all persons who immigrated to Canada between May 10, 2016, and May 10, 2021, inclusively. Those who were non-permanent residents on Census Day in 2016 were removed from the 2016 immigrant frame because they were already covered by the 2016 Census frame or by the 2016 missed frame. The immigrant frame was stratified by province. The province was derived based on information available in an address file provided by IRCC and in the IRCC immigration file. The most likely province of residence on Census Day in 2021 was selected. Then, immigrants from all provinces were separated into two strata by their immigration date. The first stratum consisted of immigrants who arrived between May 10, 2016, and April 30, 2020, and the second consisted of immigrants who arrived between May 1, 2020, and May 10, 2021, because newer immigrants are usually more likely to be missed in the census.
The non-permanent resident frame (persons who hold a work or study permit and refugee claimants) was constructed with IRCC records. Non-permanent residents as of Census Day in 2016 and intercensal immigrants were removed from the 2021 non‑permanent resident frame. The frame was stratified by province, according to the most likely province of residence on Census Day in 2021. To this end, a deterministic linkage of the frame with the tax data was done. The IRCC address file and the various IRCC non-permanent resident files were also used. At the end of the process, a number of non-permanent residents had no associated provinces of residence (residents with an open permit), so they were placed in a national stratum.
In the provinces, the total size of the 2021 sample was determined to achieve two main objectives. First, the 2021 CUS collection budget was to remain the same as the 2016 CUS collection budget (but adjusted for unit cost increases between 2016 and 2021). Only a portion of the persons in the sample required collection, and proportions varied by frame and stratum. Second, the CUS sought to obtain similar standard errors for the undercoverage rate among provinces of comparable size. The aim was to produce smaller standard errors for the larger provinces than for the smaller provinces, as this would help to obtain a small standard error at the national level. Where possible, standard errors were not to be higher than those obtained in 2016.
Starting in 2020, by constantly updating the parameters used to calculate the standard error of undercoverage and the number of persons requiring collection, sample size simulations by frame and stratum were done to calculate the appropriate standard errors at all levels (national, provincial, age and gender). The frames and results of the 2016 CUS were used to make these simulations. Since some survey frames were ready before others, sample sizes were determined for these frames before establishing sizes for other frames and strata. Among other things, the sample size of the stratum for the 2016 missed frame was already set because everyone who was classified as “missed” in the 2016 CUS was selected. Then, the size of the first stratum of the immigrant frame was determined in the summer of 2020, and so on for the other strata and frames (births and non‑permanent residents). The sample allocation was completed in November 2021 with the stratification of the 2016 Census frame as described above.
In several strata, a total size was determined for all ten provinces, and then a power-allocation scheme was used to allocate the total sample among the provinces. Minimum sample sizes were also set in the smallest provinces.
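A power allocation assigns each province a share of the sample proportional to its frame size raised to a power between 0 and 1, which shifts sample toward smaller provinces compared with proportional allocation. The sketch below is illustrative: the exponent, the frame sizes and the minimum are hypothetical, since the text does not give the values used in the CUS.

```python
def power_allocation(frame_sizes, total_sample, power=0.5, minimum=0):
    """Allocate total_sample in proportion to N_h ** power, then apply a floor.

    power=1 gives proportional allocation; power=0 gives equal allocation.
    Rounding and the floor mean the allocated sizes may not sum exactly to total_sample.
    """
    weights = {h: n ** power for h, n in frame_sizes.items()}
    weight_sum = sum(weights.values())
    raw = {h: total_sample * w / weight_sum for h, w in weights.items()}
    return {h: max(minimum, round(x)) for h, x in raw.items()}

# Hypothetical frame sizes for two provinces.
sizes = {"small_province": 100, "large_province": 300}
print(power_allocation(sizes, 40, power=1))              # proportional
print(power_allocation(sizes, 40, power=0.5))            # compromise allocation
print(power_allocation(sizes, 40, power=1, minimum=15))  # floor applied to the small province
```

With an exponent below 1, the small province receives more than its proportional share, which is the property that makes this scheme useful when provincial estimates are required alongside a national one.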
In addition, for some strata of the sampling frame, sub-stratification by sex and age group was performed to ensure that there were sufficient numbers of missed persons from these domains. Similarly, the sample was allocated to the reserve strata of the census frame so that the precision of the undercoverage estimate for reserves would be at least as good as in the 2016 CUS. The final total allocated sample was 32,534 SPs across the frames in the provinces. Table 7.1.1 shows the final sample allocation by stratum for all provinces. According to this sample allocation, the target standard errors for the undercoverage rate ranged from 0.16% to 0.42% at the provincial level, and the target was 0.09% for the provinces as a whole. It should be noted that the resulting allocation does not guarantee that this level of precision will be achieved, because assumptions were made about several parameters that enter the calculation of the standard error of the undercoverage (strata and frame sizes, missed rate, CUS collection response rates, etc.). In addition, the COVID-19 pandemic may have affected the accuracy of these assumptions, including those on the number of immigrants and non-permanent residents, interprovincial migration and missed rates in the 2021 Census.
Sampling frames | Strata within each province | Number of people |
---|---|---|
Take-all total | ... not applicable | 26,944,027 |
2016 Census | Deceased | 1,239,662 |
2016 Census | Auto-enumerated in a province | 25,704,365 |
Take-some total | ... not applicable | 32,534 |
2016 Census | Off reserves TS_1: Strong suggestions of enumeration | 5,559 |
2016 Census | Off reserves TS_2: Strong suggestions of incomplete enumeration | 369 |
2016 Census | Off reserves TS_3: High probability of being out of scope | 510 |
2016 Census | Off reserves TS_4: Medium suggestions of enumeration | 757 |
2016 Census | Off reserves TS_5: High probability of being missed | 5,041 |
2016 Census | Off reserves TS_6: Others | 1,712 |
2016 Census | Reserves TS_7: Strong or medium suggestions of enumeration | 270 |
2016 Census | Reserves TS_8: High probability of being missed | 505 |
2016 Census | Reserves TS_9: Others | 200 |
2016 Census | Reserves TS_10: Newfoundland and Labrador and Prince Edward Island | 60 |
2016 missed | No further stratification | 4,821 |
Births | No further stratification | 5,978 |
Immigrants | Between May 10, 2016, and April 30, 2020 | 2,593 |
Immigrants | Between May 1, 2020, and May 10, 2021 | 588 |
Non-permanent residents | No further stratification | 3,571 |

Source: Statistics Canada, 2021 Census Undercoverage Study.
The sampling methodology for the territories was similar to that of the census frame for the provinces. The persons included in the sampling frame for each of the territories were linked to the tax data and then to the 2021 RDB, using the monster matching process, which is also used for the processing of the CUS sample (see Section 7.2.1). Following these steps, the frame was stratified, taking into account the strength of the linkage with the 2021 RDB, the location of the enumeration and recent fiscal activity. A take-all self-enumeration stratum in the territory was formed, and six take-some strata were formed (see Table 7.1.2). For the first and sixth strata, a sub-stratification by sex and three age groups (0 to 17 years, 18 to 29 years and 30 years of age and older) was performed.
For sample allocation to the territories, the first step was to determine the total sample to be allocated to each territory in order to achieve similar and adequate precision of the undercoverage. In 2021, the target standard error for the undercoverage rate was approximately 0.40% in Yukon and the Northwest Territories (an improvement from 2016) and 0.60% in Nunavut (similar to 2016). Using the results of the 2016 CUS, assumptions about missed rates, undercoverage rates and other parameters were derived for each stratum. For the first take-some stratum, the sample size was set manually in each territory, as this stratum had very little effect on the accuracy of the undercoverage rate but more impact on the accuracy of the enumeration rate. This is important for the calculation of a calibration factor at the time of weighting. In addition, the workload of the employees who had to check the sample of this stratum had to be taken into account. Similarly, a sample was manually set for the fourth stratum, as it represented persons who are almost certainly out of scope but who are subject to some research work by CUS staff. Then, iteratively, the total sample was optimally allocated among the other take-some strata, including the six substrata of the last stratum. An approximate total size was initially set, then the accuracy of the optimal allocation was calculated, and this was repeated by increasing or decreasing the total size until the desired precision for the undercoverage rate in each territory was obtained. The final total allocated sample was 4,285 SPs across the frames in the territories.
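The iterative search described above can be sketched as follows. The text does not name the optimal allocation method, so this example assumes Neyman allocation for a stratified rate and ignores the finite population correction; the stratum names, frame sizes, assumed missed rates and target are all hypothetical.

```python
import math

# Hypothetical planning inputs: stratum -> (frame size N_h, assumed missed rate p_h).
# Rates of exactly 0 or 1 are not supported (they would give a zero allocation).
STRATA = {
    "likely_enumerated": (20_000, 0.02),
    "likely_missed": (3_000, 0.40),
}

def neyman_allocation(strata, total):
    """Split `total` across strata in proportion to N_h * sqrt(p_h * (1 - p_h))."""
    scores = {h: n * math.sqrt(p * (1 - p)) for h, (n, p) in strata.items()}
    score_sum = sum(scores.values())
    return {h: total * s / score_sum for h, s in scores.items()}

def standard_error(strata, allocation):
    """Standard error of the stratified rate estimate (no finite population correction)."""
    frame_total = sum(n for n, _ in strata.values())
    variance = sum(
        (n / frame_total) ** 2 * p * (1 - p) / allocation[h]
        for h, (n, p) in strata.items()
    )
    return math.sqrt(variance)

def find_total(strata, target_se, start, step=10):
    """Raise or lower the total in `step` increments until it just meets target_se."""
    def se(total):
        return standard_error(strata, neyman_allocation(strata, total))

    total = start
    if se(total) > target_se:
        while se(total) > target_se:  # too imprecise: increase the sample
            total += step
    else:
        while total > step and se(total - step) <= target_se:  # larger than needed: decrease
            total -= step
    return total

total = find_total(STRATA, target_se=0.004, start=1_000)
allocation = neyman_allocation(STRATA, total)
print(total, {h: round(n) for h, n in allocation.items()})
```

Starting from a total that is too small or too large leads to the same answer, as long as both starting points lie on the same grid of step sizes.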
Table 7.1.2 shows the allocation by stratum for all territories.
Strata | Yukon | Northwest Territories | Nunavut | Total |
---|---|---|---|---|
Take-all: Auto-enumerated within its territory | 27,881 | 26,696 | 16,981 | 71,558 |
Take-some total | 1,156 | 1,331 | 1,798 | 4,285 |
TS_1: Strong suggestions of enumeration | 530 | 440 | 468 | 1,438 |
TS_2: Medium suggestions of enumeration | 57 | 196 | 356 | 609 |
TS_3: Strong suggestions of incomplete enumeration | 30 | 30 | 44 | 104 |
TS_4: Strong suggestions of enumeration outside its territory | 53 | 78 | 70 | 201 |
TS_5: High probability of being out of scope | 97 | 83 | 96 | 276 |
TS_6: High probability of being missed (substratification) | | | | |
Females, 0 to 17 years | 30 | 59 | 158 | 247 |
Females, 18 to 29 years | 48 | 44 | 69 | 161 |
Females, 30 years and older | 109 | 117 | 157 | 383 |
Males, 0 to 17 years | 33 | 61 | 132 | 226 |
Males, 18 to 29 years | 54 | 61 | 65 | 180 |
Males, 30 years and older | 115 | 162 | 183 | 460 |

TS = take-some
Source: Statistics Canada, 2021 Census Undercoverage Study.
Table 7.1.3 shows the sample allocation for Canada, the provinces and the territories.
Provinces and territories | Take-all strata (number of people) | Take-some strata (number of people) |
---|---|---|
Canada | 27,015,585 | 36,819 |
All provinces | 26,944,027 | 32,534 |
Newfoundland and Labrador | 393,554 | 1,551 |
Prince Edward Island | 106,063 | 1,437 |
Nova Scotia | 696,275 | 1,943 |
New Brunswick | 579,964 | 1,680 |
Quebec | 6,668,208 | 4,298 |
Ontario | 10,415,555 | 7,126 |
Manitoba | 947,750 | 2,579 |
Saskatchewan | 794,538 | 2,540 |
Alberta | 2,940,437 | 4,215 |
British Columbia | 3,401,683 | 5,015 |
NPR-CA | 0 | 150 |
All territories | 71,558 | 4,285 |
Yukon | 27,881 | 1,156 |
Northwest Territories | 26,696 | 1,331 |
Nunavut | 16,981 | 1,798 |

NPR-CA = non-permanent residents without a known province
Source: Statistics Canada, 2021 Census Undercoverage Study.
A systematic sampling method within the strata was used to select samples. Here is the list of sorting variables used to obtain an efficient sample (implicit stratification), classified by sampling frame:
- 2016 Census frame: sex, age, Code M, 2016 geography, tax situation, reason for potentially being out of scope and likely province in 2021 (if stratified in the six smallest provinces);
- Birth frame: age on Census Day, sex, age group of mother and postal code;
- Immigrant frame: age group, sex and country of birth;
- Non-permanent resident frame: type of permit, age group, sex and country of birth;
- Territories frame: sex, age, Code M, tax situation and municipality of residence.
No sampling was required for the 2016 missed frame, as all persons classified as missed in the 2016 CUS were selected for the 2021 CUS sample.
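Sorting a stratum before drawing a systematic sample spreads the selections across the values of the sorting variables, which is what produces the implicit stratification mentioned above. The following is a minimal sketch with a hypothetical frame and sort key; it is not the CUS selection program.

```python
import random

def systematic_sample(frame, n, sort_key, rng=None):
    """Select n records by systematic sampling from a frame sorted by sort_key."""
    rng = rng or random.Random()
    ordered = sorted(frame, key=sort_key)
    interval = len(ordered) / n       # sampling interval (may be fractional)
    start = rng.random() * interval   # random start within the first interval
    last = len(ordered) - 1
    # Clamp the index to guard against floating-point rounding at the upper end.
    return [ordered[min(int(start + i * interval), last)] for i in range(n)]

# Hypothetical stratum of 100 persons, sorted by sex and then age.
frame = [{"id": i, "sex": "F" if i % 2 else "M", "age": i} for i in range(100)]
sample = systematic_sample(frame, 10, sort_key=lambda r: (r["sex"], r["age"]),
                           rng=random.Random(2021))
print([r["id"] for r in sample])
```

Because the frame is sorted by sex first, half of the ten selections fall among females and half among males, without sex being an explicit stratum.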
Following the selection of the provincial and territorial samples, the samples were prepared by checking the quality of information for the different variables of interest (i.e., geographic and demographic variables); for example, the accuracy of names and the validity of birth dates were checked. Addresses were standardized to facilitate subsequent processing activities. To update the geographic information, especially for the census sample and the missed persons whose information was from 2016, the samples were linked with Canada Revenue Agency (CRA) records, including personal income tax records for 2015 to 2021 and Canada Child Tax Benefit records for 2016 to 2022. CRA files and vital statistics data were also used to check whether any selected persons had died. This preparation stage was important because it helped to determine the persons enumerated in the census frames, and to contact and interview persons who were not found.
7.2 Processing and classification
7.2.1 Processing
The objective of processing is to provide information for the classification of SPs for the purposes of non-response adjustment and estimation. Specifically, processing is carried out to:
- determine whether the SPs are enumerated in the Census Response Database
- determine whether the SPs are in the census target population
- provide further information for non-response adjustment.
The processing results were recorded in a classification assigned to each SP for estimation and tabulation purposes (see Section 7.4 and Section 9).
Most of the processing work involved automated and computer-assisted searching of the census coverage studies version of the 2021 Census Response Database (CCS-RDB) to determine whether the SP was enumerated.
Various elements of information were used for searching, including surnames, given names and birth dates. Telephone numbers and addresses associated with the SP or members of their household were also used. Questionnaires in which the SP could have been listed were identified from a variety of sources, including the following:
- matches with the CCS-RDB using the birth date and sex of the SP and members of the household, or the SP’s name, postal code or telephone number;
- selection addresses from the sampling frame;
- address updates from tax records;
- information from the computer-assisted telephone interview (CATI) (see Section 7.3).
The first step after sample preparation was to search the CCS-RDB for each SP by processing all SPs with the addresses available from the sampling frame and tax data. There were two outcomes. When the SP was found, they were usually classified as “enumerated,” and no further processing was required, except for SPs who were later identified through vital statistics information as being deceased before the census. When the SP was not found, the case was sent for collection. While collection was taking place, the CCS-RDB search continued. When CATI data were available, researchers could determine whether each SP was part of the census target population. If so, the CATI data could enable further searching.
Searching for the SP was done both automatically and manually by coding staff guided by subject matter experts. To ensure coding uniformity, coding staff were provided with a highly detailed procedure manual that spelled out the specific steps for coding the search results. Automated searches were conducted first. For addresses obtained from a match with the CCS-RDB, there was a corresponding census questionnaire. A measure of similarity between the census questionnaire and the data available for the survey was calculated. When this measure was above a specified threshold, it was automatically concluded that the SP was enumerated at that address. In these cases, neither this address nor the SP’s other addresses needed to be processed by the coding staff. Computer programs also determined when one address was a duplicate of another. These duplicate addresses also did not need to be processed.
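The automated decision described above compares a similarity measure with a threshold. The text does not specify the measure or the threshold, so the sketch below substitutes a simple stand-in: a name similarity from Python's `difflib.SequenceMatcher` combined with an exact birth date comparison. The field names, weights and threshold are assumptions for illustration only.

```python
from difflib import SequenceMatcher

def similarity(sp, person):
    """Score in [0, 1] combining name similarity and birth date agreement (illustrative weights)."""
    name_score = SequenceMatcher(None, sp["name"].lower(), person["name"].lower()).ratio()
    dob_score = 1.0 if sp["birth_date"] == person["birth_date"] else 0.0
    return 0.5 * name_score + 0.5 * dob_score

def auto_enumerated(sp, questionnaire, threshold=0.9):
    """True if any person listed on the questionnaire matches the SP above the threshold."""
    return any(similarity(sp, person) >= threshold for person in questionnaire)

# Hypothetical selected person and census questionnaire.
sp = {"name": "Marie Tremblay", "birth_date": "1980-03-14"}
questionnaire = [
    {"name": "Jean Roy", "birth_date": "1975-11-02"},
    {"name": "Marie Tremblay", "birth_date": "1980-03-14"},
]
print(auto_enumerated(sp, questionnaire))  # True
```

Cases that fall below the threshold are not rejected outright; in the process described here they go on to manual review by coding staff.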
For other cases, a manual linkage was conducted using DocLink’s Interactive Verification Application (DIVA), an application developed specifically for this operation. The coding staff used a number of tools for this process, such as Geographical Reference Files, electronic telephone directories and the Street Attributes File. There were often suggested census questionnaires or census collection units that matched the address that was used as the first step for searching. Staff could also search the CCS-RDB using flexible parameters further in the process (searching by name, date of birth, etc.). The results of the manual search were then automatically edited via DIVA built-in edits to minimize errors. A file containing the search results was then produced. The data from this file were used to classify SPs.
7.2.2 Classification
Processing provides the information required to determine whether SPs were:
- included in the “census target population” or “out of scope” (not included)
- “classified” or “not classified”
- “listed” or “not listed”
- “identifiable” or “non-identifiable”
- “enumerated”
- “missed.”
Some SPs fit into more than one category, which will be explained in greater detail in this section.
7.2.2.1 “Target population” or “out-of-scope” classification
The “census target population” includes the group of persons mentioned in Section 2.2. An SP is considered “out of scope” if they are not in the census target population. Each SP classified as “out of scope” is assigned one of the following statuses: deceased, emigrated or represented in another frame. For a person to be classified as deceased, they must appear as deceased in at least two administrative sources (vital statistics death files, income tax files, death files), or in the CUS collection interview. Permanent or temporary emigrants were also determined through a collection interview based on certain criteria and the response on their place of residence on Census Day, the amount of time spent outside Canada, their intention to return to live in Canada and the reason they were outside Canada on Census Day. Other SPs were also classified as “listed emigrants,” regardless of whether they were respondents during collection. These are non-permanent residents (from the 2016 Census and missed frames) who no longer had a work or study permit in 2021 or immigrant status since 2016.
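The rule for classifying an SP as deceased can be expressed compactly. This is a sketch of the stated rule only; the function name and source labels are hypothetical.

```python
def is_deceased(admin_sources_reporting_death, reported_in_interview):
    """Apply the stated rule: at least two administrative sources, or the CUS collection interview."""
    distinct_sources = set(admin_sources_reporting_death)  # count each source once
    return len(distinct_sources) >= 2 or bool(reported_in_interview)

print(is_deceased({"vital_statistics", "income_tax"}, False))  # True
print(is_deceased({"income_tax"}, False))                      # False
print(is_deceased(set(), True))                                # True
```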
SPs classified as “represented in another frame” include cases selected in a province but classified in one of the three territories. Cases selected in a territory but classified in a province or another territory are also classified as “represented in another frame.”
SPs classified in the census target population were either “enumerated,” “missed” or “not classified” (see Section 7.2.2.2). An SP was considered “enumerated” if they were in the CCS-RDB. SPs in the census target population were classified as “missed” if they were neither enumerated nor “not classified.”
7.2.2.2 Classification for non-response and non-response adjustment
Whether an SP was classified as “listed” or “not listed” depended on the usefulness of the addresses provided and the CATI information. In many cases, collection provided information and one or more addresses that could not be found from other sources. In other cases, all the addresses and all the information obtained through collection could be found from other sources.
An SP was “listed” if they were classified without using CATI data; even if data were collected, the addresses and information collected through the interview were not required.
A person was considered “not classified” if it was possible to determine whether they were in the target population but not whether they were missed. This occurred when the place of residence on Census Day, as defined in Section 2.4, was known but not identified in the CCS-RDB. Persons whose place of residence on Census Day was not specific enough (e.g., only the name of a large city) and persons without a fixed address were included in this category.
SPs for whom one or more of the characteristics in the list above could not be determined were considered non-respondents. There are three types of non-respondents:
- An SP was “not identified” when it could not be determined whether they were listed. In other words, since the information about the SP was incomplete, it was impossible to link the SP with the CCS-RDB or to collect their information through an interview.
- An SP was “not traced” when it could not be determined whether they were included in the census target population.
- A “not classified” SP was deemed to be partial non-response. It was known that the person was in the target population but not whether they were missed or enumerated.
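The classification outcomes described in this section can be summarized as a simplified decision sequence. This sketch is an interpretation, not the CUS classification program: the argument names are hypothetical, the order of the checks is an assumption, and it leaves out the “listed” or “not listed” dimension and the out-of-scope statuses.

```python
def classify(identified, in_target_population, found_in_ccs_rdb, census_day_residence_usable):
    """Return a simplified classification for a selected person (SP).

    in_target_population is True, False, or None when it could not be determined.
    """
    if not identified:
        return "not identified"   # information too incomplete to link or interview
    if in_target_population is None:
        return "not traced"       # target population membership unknown
    if not in_target_population:
        return "out of scope"     # deceased, emigrated or represented in another frame
    if found_in_ccs_rdb:
        return "enumerated"
    if not census_day_residence_usable:
        return "not classified"   # in scope, but enumeration status cannot be settled
    return "missed"

# Outcomes treated as non-response in this section.
NON_RESPONSE = {"not identified", "not traced", "not classified"}

print(classify(True, True, False, True))  # missed
```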
7.2.2.3 Distribution of the sample by classification
Table 7.2 shows the distribution of the sample by classification and sampling frame. This table excludes persons in the take-all strata, as these persons were classified (enumerated or deceased) prior to sample selection. Classification is determined from specific combinations of the characteristics listed above. Initially, a total sample of 36,819 SPs was selected in the provinces and territories. Of that number, 22,083 SPs were classified as “enumerated,” 7,453 as “missed,” and 5,171 as non‑respondents, of which 169 were “not classified.” The other 2,112 SPs were classified as “out of scope”: 583 “deceased,” 938 “emigrants” (permanent or temporary), 405 persons outside the universe of the territories or provinces, and 186 persons out of scope for other reasons. A non-response adjustment was made during estimation (see Section 7.4). It is important to note that, for the purposes of classification and therefore estimation, the definition of a non-respondent differs from the usual one, in which data collection is attempted but not completed. This is because classification is based on data from several sources, including collection. To prevent any confusion, Section 7.3 on collection refers to “completed collection” rather than “response.”
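The counts quoted above can be cross-checked against the total sample and against the out-of-scope breakdown. The short sketch below uses only figures from the paragraph; the dictionary keys are illustrative labels.

```python
# Classification counts for the take-some sample, as quoted in the text.
counts = {
    "enumerated": 22_083,
    "missed": 7_453,
    "non_response": 5_171,
    "out_of_scope": 2_112,
}
# Breakdown of the out-of-scope cases, as quoted in the text.
out_of_scope = {
    "deceased": 583,
    "emigrants": 938,
    "outside_universe": 405,
    "other_reasons": 186,
}

total = sum(counts.values())
shares = {k: round(100 * v / total, 1) for k, v in counts.items()}

print(total)                       # 36819
print(sum(out_of_scope.values()))  # 2112
print(shares)
```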
7.2.2.4 Implications of the classification
“Traced” SPs are SPs for whom it was possible to determine whether they were included in the census target population. For purposes of estimation and tabulation, traced SPs who were also classified were the respondents. Since names, including those of household members, and addresses were available in the CCS-RDB, and since the tools for consulting the database were sufficiently powerful, it was possible to verify whether an SP was enumerated at an address even if the address provided was vague.
The usefulness of knowing whether an SP was enumerated is self-evident. SPs who were in the census target population but who were not enumerated and were therefore classified as “missed” formed the basis for the undercoverage estimate. We also wanted to classify SPs according to the above-mentioned characteristics so that the most appropriate respondents could be chosen to represent non-respondents.
Lastly, except for SPs who were not classified, the Census Day address (usual place of residence) of each SP in the census target population was determined. This is the address where, according to census instructions, the SP should have been enumerated. If the SP was enumerated, the enumeration address was considered to be the Census Day address, despite other information provided that may suggest that the census instructions were not well understood.
For more information on processing and classification, see Parenteau (2023).
Classification | 2016 Census (number) | 2016 Census (%) | 2016 missed (number) | 2016 missed (%) | Births (number) | Births (%) | Immigrants (number) | Immigrants (%) | Non-permanent residents (number) | Non-permanent residents (%) | Territorial frames (number) | Territorial frames (%) | Total (number) | Total (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Total | 14,983 | 100.0 | 4,821 | 100.0 | 5,978 | 100.0 | 3,181 | 100.0 | 3,571 | 100.0 | 4,285 | 100.0 | 36,819 | 100.0 |
Enumerated | 7,354 | 49.1 | 3,122 | 64.8 | 5,210 | 87.2 | 2,610 | 82.0 | 2,015 | 56.4 | 1,772 | 41.4 | 22,083 | 60.0 |
Enumerated: listed | 7,201 | 48.1 | 3,112 | 64.6 | 5,206 | 87.1 | 2,604 | 81.9 | 1,997 | 55.9 | 1,760 | 41.1 | 21,880 | 59.4 |
Enumerated: not listed | 153 | 1.0 | 10 | 0.2 | 4 | 0.1 | 6 | 0.2 | 18 | 0.5 | 12 | 0.3 | 203 | 0.6 |
Missed | 4,156 | 27.7 | 710 | 14.7 | 432 | 7.2 | 284 | 8.9 | 630 | 17.6 | 1,241 | 29.0 | 7,453 | 20.2 |
Missed: listed | 821 | 5.5 | 86 | 1.8 | 68 | 1.1 | 22 | 0.7 | 49 | 1.4 | 238 | 5.6 | 1,284 | 3.5 |
Missed: not listed | 3,335 | 22.3 | 624 | 12.9 | 364 | 6.1 | 262 | 8.2 | 581 | 16.3 | 1,003 | 23.4 | 6,169 | 16.8 |
Out of scope | 882 | 5.9 | 433 | 9.0 | 102 | 1.7 | 100 | 3.1 | 188 | 5.3 | 407 | 9.5 | 2,112 | 5.7 |
Out of scope: listed | 505 | 3.4 | 327 | 6.8 | 79 | 1.3 | 10 | 0.3 | 104 | 2.9 | 293 | 6.8 | 1,318 | 3.6 |
Out of scope: not listed | 377 | 2.5 | 106 | 2.2 | 23 | 0.4 | 90 | 2.8 | 84 | 2.4 | 114 | 2.7 | 794 | 2.2 |
Non-response | 2,591 | 17.3 | 556 | 11.5 | 234 | 3.9 | 187 | 5.9 | 738 | 20.7 | 865 | 20.2 | 5,171 | 14.0 |
Non-response: traced not classified | 87 | 0.6 | 17 | 0.4 | 17 | 0.3 | 2 | 0.1 | 10 | 0.3 | 36 | 0.8 | 169 | 0.5 |
Non-response: identified not traced | 2,492 | 16.6 | 539 | 11.2 | 217 | 3.6 | 185 | 5.8 | 728 | 20.4 | 829 | 19.3 | 4,990 | 13.6 |
Non-response: not identified | 12 | 0.1 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 0 | 0.0 | 12 | 0.0 |

Note: The 2016 Census, 2016 missed, births, immigrants and non-permanent residents columns are provincial strata; the territorial frames column covers the territorial strata.
7.3 Collection
7.3.1 Overview
Head office staff in Ottawa worked closely with staff in the Statistics Canada regional offices (ROs) to collect data during the survey phase of the Census Undercoverage Study (CUS). The suggestions and recommendations made by the ROs as a result of conducting the 2016 CUS were incorporated into the design and operations of the 2021 survey.
The main purpose of the CUS is to find (trace) the correct selected persons (SPs) and collect demographic and address information so they can be classified as enumerated, missed or out of scope for the census. The classification results are used to estimate the number of persons who were missed, or undercovered, in the census. To help find and classify the SPs, the Census Day address and household composition were collected, as well as any other address where the SP may have been enumerated. Other information, such as the SP’s mother tongue, was also collected for the coverage study tables.
The CUS take-some sample size was 36,819 (Section 7.1 describes the sample design). Pre-collection processing attempted to find these cases on the CCS-RDB, in vital statistics and in other administrative files. The cases that were matched or found in those files, and that could thus be classified as either enumerated or deceased before Census Day, were not sent to collection. All other cases that were not classified were sent to collection. The total number of cases sent to collection (the collection sample size) was 13,096. During the collection period, the processing team continued to try to match some of the cases, and those that could be classified were removed from collection (see Table 7.3.2 for these counts).
By design, collection was by proxy for SPs who were younger than 18 years. Proxy respondents were also used when the SP was not available during the collection period or was difficult to reach. Overall, 34% of the completed cases were by proxy, and a higher percentage of proxy cases were completed by interviewers than by self-response.
For deceased SPs, it was important to determine whether they had died before, on or after Census Day, since different questionnaire flows were used, depending on the date of death. In some cases—for example, by matching tax records and vital statistics—SPs were determined to be deceased before Census Day, prior to collection. These cases were not sent for collection. However, when in doubt, cases were sent for collection with a note indicating that the SP may be deceased.
It was imperative that the correct SP (or a proxy for the correct SP) be interviewed. If data were collected about the wrong person, the matching and resulting classification would be incorrect. The computer-assisted telephone interview (CATI) system was designed to instruct interviewers to verify that the person they were interviewing was the correct SP at the beginning of the interview. If an interview was completed with someone other than the SP (e.g., someone with a similar name and date of birth), the case was sent back to collection to be completed with the correct person.
The CUS is a mandatory multi-mode survey. The main data collection mode is CATI, and the secondary mode is self-enumeration. For 2021, the CUS used web-based electronic questionnaires for both modes as it transitioned to the Integrated Collection and Operation System, a standardized collection application developed at Statistics Canada. Previously, the CUS self-response mode used paper questionnaires. The transition to an electronic questionnaire decreased respondent burden and reduced the operating time and costs associated with mailing out paper questionnaires and manually entering the returned data.
The third collection mode was personal visits by field interviewers. The plan for the 2021 CUS was to continue to use field interviews in a limited scope, as in previous cycles (in the 2016 CUS, only 0.5% of cases were completed by field interviewers), but instead of the paper questionnaires that were used in the past, field interviewers would have used a laptop and the same application as telephone interviewers. However, all in-person interviews were cancelled at the collection planning stage because of the COVID-19 pandemic.
7.3.2 Operations
Data collection for the CUS began in all ROs on March 28, 2022. The last day of active collection was November 4, 2022. Table 7.3.2 shows the distribution of cases loaded into CATI from head office over time. The majority of cases were sent at the start of collection on March 28 and consisted of adult cases from all frames except Nunavut. The adjusted total represents the number of cases sent to collection, excluding the cases removed from collection.
Source: Statistics Canada, 2021 Census Undercoverage Survey.

Description | Count |
---|---|
Cases started March 28, 2022: Adults in all frames except Nunavut | 9,922 |
Cases started April 27, 2022: Minors in all frames (including most of the birth frame) except Nunavut | 1,822 |
Cases started June 6, 2022: Nunavut frame and remaining birth frame cases | 1,352 |
Total cases sent | 13,096 |
Cases dropped by head office: Collection no longer required (classified in processing as either enumerated or out of scope) | 309 |
Adjusted total | 12,787 |
Introductory letters explaining the CUS and advising the SP (or proxy) that they had been selected for the survey were sent for all cases that started collection in March and April and that had a valid mailing address. A phone number was provided if they had any questions or if they wanted to call the RO to complete the survey. Cases without a contact phone number (requiring tracing) were also provided with a secure access code and a link to the self-response questionnaire. Introductory letters were not sent for the cases starting in June; instead, they received the reminder letters sent in July. These reminder letters were sent for all cases not yet completed near the midway point of collection. A second reminder letter was sent one month later. All reminder letters contained secure access codes and links to the self-response questionnaire. New for the 2021 CUS, near the end of collection, email reminders were sent for all incomplete cases that had a valid email address.
Near the end of collection, in an effort to boost response rates, the Toronto and Western ROs began a process similar to the field interview visits done in the past. If there was an address for an SP close to where an interviewer was visiting for another survey, they would visit the address to try to find the SP. If they located the SP or confirmed that the address was the SP’s residence, they requested a phone number and time for the RO to call back to complete the interview. If they were speaking to the SP, they could also provide a secure access code to complete the questionnaire online. If the SP was not there, the interviewer tried to collect any contact information that could be useful for tracing.
Data quality analysis was performed to verify the completeness and accuracy of each case. Cases with missing or ambiguous data in key fields, or where the data collected were for someone other than the SP, were reactivated and sent back to collection for follow-up. There were 41 reactivated cases in the 2021 CUS. Cases that passed the data quality analysis were compiled into batches for processing, as described in Section 7.2.1.
Quality management of the collection operation involved a two-day virtual training session for regional data collection managers, who in turn trained their interviewers. Weekly meetings between head office and ROs were held during collection to discuss progress and address any issues that arose. A ticket-based communication tool was used to centralize and facilitate communication between head office and ROs. It tracked all questions and issues and ensured that each one was resolved in a timely manner. RO managers allocated resources to the survey while balancing the needs of other surveys taking place in their region. Sustained efforts to interview persons who initially refused to participate in the survey improved response rates.
Detailed management reports were created at head office on a daily and weekly basis to document survey collection progress. The reports presented the number of cases collected and response rates by province of selection and sampling frame.
7.3.3 Tracing
As part of the sample preparation, cases were linked to tax and other administrative data to provide updated contact information for the SP and their household members. In some cases, initial CATI data were outdated or incomplete, and tracing was required. Tracing is the process of searching for contact information for either an SP or a suitable proxy, and it is a major part of the CUS.
Tracing leads were loaded into the CATI application as alternate contacts prior to collection, and additional leads were sent to the ROs as they were found in processing during the collection period. More tracing source files were sent to collection for the 2021 CUS (29 files, compared with 13 in 2016), and an improvement in processing meant that only new phone numbers and addresses were sent to the ROs, with no duplication of previous sources.
The CUS had agreements with and received tracing information from 11 provinces and territories, 9 of which used deemed employees. Head office sent files containing names of SPs, which were matched with health care files and sent back with updated contact information. Having a deemed employee meant that both the name and date of birth of the SP could be supplied, making it easier to match the files.
At the start of data collection, only 2.1% of the cases had insufficient contact information and needed to be traced. Because of the quality and quantity of tracing sources provided by head office, 90.6% of the completed cases used phone numbers that were provided by head office. Another 8.6% of the completed cases were contacted with a new phone number that was found by the RO tracing efforts, and a final 0.8% were completed when respondents called in to the RO.
7.3.4 Collection statistics
Many statistics were monitored throughout the data collection period, and they were analyzed after collection was completed.
Table 7.3.4.1 shows the provincial and territorial completion rates by collection method. Of the 7,702 completed cases, 87.6% were completed by CATI and 12.4% by online self-response.
Note: NPR-CA = non-permanent residents without a known province.
Source: Statistics Canada, 2021 Census Undercoverage Survey.

Provinces and territories | Cases sent | Interviewer: cases completed | Interviewer: completion rate (%) | Self-response: cases completed | Self-response: completion rate (%) | Total: cases completed | Total: completion rate (%) |
---|---|---|---|---|---|---|---|
Canada | 12,787 | 6,745 | 52.7 | 957 | 7.5 | 7,702 | 60.2 |
Newfoundland and Labrador | 503 | 291 | 57.9 | 33 | 6.6 | 324 | 64.4 |
Prince Edward Island | 487 | 278 | 57.1 | 52 | 10.7 | 330 | 67.8 |
Nova Scotia | 620 | 374 | 60.3 | 37 | 6.0 | 411 | 66.3 |
New Brunswick | 522 | 286 | 54.8 | 34 | 6.5 | 320 | 61.3 |
Quebec | 1,315 | 769 | 58.5 | 93 | 7.1 | 862 | 65.6 |
Ontario | 2,406 | 1,214 | 50.5 | 235 | 9.8 | 1,449 | 60.2 |
Manitoba | 852 | 451 | 52.9 | 47 | 5.5 | 498 | 58.5 |
Saskatchewan | 832 | 425 | 51.1 | 51 | 6.1 | 476 | 57.2 |
Alberta | 1,375 | 703 | 51.1 | 100 | 7.3 | 803 | 58.4 |
British Columbia | 1,746 | 828 | 47.4 | 152 | 8.7 | 980 | 56.1 |
Yukon | 460 | 239 | 52.0 | 29 | 6.3 | 268 | 58.3 |
Northwest Territories | 632 | 345 | 54.6 | 33 | 5.2 | 378 | 59.8 |
Nunavut | 950 | 529 | 55.7 | 56 | 5.9 | 585 | 61.6 |
NPR-CA | 87 | 13 | 14.9 | 5 | 5.7 | 18 | 20.7 |
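The completion rates in Table 7.3.4.1 are the number of completed cases divided by the number of cases sent to collection. The following minimal sketch reproduces the Canada row; the function and variable names are illustrative only and are not taken from the CUS systems.

```python
def completion_rate(completed: int, sent: int) -> float:
    """Completion rate in percent, rounded to one decimal place."""
    return round(100 * completed / sent, 1)


# Canada row of Table 7.3.4.1
sent = 12_787
by_interviewer = 6_745
by_self_response = 957
completed = by_interviewer + by_self_response

rates = {
    "interviewer": completion_rate(by_interviewer, sent),
    "self_response": completion_rate(by_self_response, sent),
    "total": completion_rate(completed, sent),
}

# Share of completed cases that were completed by CATI
share_cati = round(100 * by_interviewer / completed, 1)
```

Note that the share of completed cases by mode (87.6% by CATI) uses completed cases as the denominator, whereas the completion rates use cases sent.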
Table 7.3.4.2 shows the completion rates by sampling frame and collection method. As expected from previous cycles, the non-permanent resident frame had the lowest completion rate (49.4%), as SPs in this frame tend to be more mobile and have less contact information, making tracing more difficult.
Source: Statistics Canada, 2021 Census Undercoverage Survey.

Sampling frames | Cases sent | Interviewer: cases completed | Interviewer: completion rate (%) | Self-response: cases completed | Self-response: completion rate (%) | Total: cases completed | Total: completion rate (%) |
---|---|---|---|---|---|---|---|
Total | 12,787 | 6,745 | 52.7 | 957 | 7.5 | 7,702 | 60.2 |
2016 Census | 6,773 | 3,720 | 54.9 | 482 | 7.1 | 4,202 | 62.0 |
2016 missed | 1,310 | 691 | 52.7 | 84 | 6.4 | 775 | 59.2 |
Births | 671 | 377 | 56.2 | 40 | 6.0 | 417 | 62.1 |
Immigrants | 553 | 280 | 50.6 | 87 | 15.7 | 367 | 66.4 |
Non-permanent residents | 1,438 | 564 | 39.2 | 146 | 10.2 | 710 | 49.4 |
Yukon | 460 | 239 | 52.0 | 29 | 6.3 | 268 | 58.3 |
Northwest Territories | 632 | 345 | 54.6 | 33 | 5.2 | 378 | 59.8 |
Nunavut | 950 | 529 | 55.7 | 56 | 5.9 | 585 | 61.6 |
Table 7.3.4.3 shows the completion rates by sex and age group. Completion rates were lowest for persons aged 20 to 44 years, for both males and females, and highest for females aged 45 years and older (68.7%).
Note: This table excludes four cases for which the sex was unknown.
Source: Statistics Canada, 2021 Census Undercoverage Survey.

Sex and age groups | Cases sent | Interviewer: cases completed | Interviewer: completion rate (%) | Self-response: cases completed | Self-response: completion rate (%) | Total: cases completed | Total: completion rate (%) |
---|---|---|---|---|---|---|---|
Both sexes | 12,783 | 6,745 | 52.8 | 957 | 7.5 | 7,702 | 60.3 |
0 to 19 years | 1,930 | 1,062 | 55.0 | 140 | 7.3 | 1,202 | 62.3 |
20 to 29 years | 2,420 | 1,198 | 49.5 | 169 | 7.0 | 1,367 | 56.5 |
30 to 44 years | 4,697 | 2,303 | 49.0 | 389 | 8.3 | 2,692 | 57.3 |
45 years and older | 3,736 | 2,182 | 58.4 | 259 | 6.9 | 2,441 | 65.3 |
Males | 6,952 | 3,609 | 51.9 | 496 | 7.1 | 4,105 | 59.0 |
0 to 19 years | 963 | 530 | 55.0 | 74 | 7.7 | 604 | 62.7 |
20 to 29 years | 1,273 | 623 | 48.9 | 99 | 7.8 | 722 | 56.7 |
30 to 44 years | 2,678 | 1,305 | 48.7 | 199 | 7.4 | 1,504 | 56.2 |
45 years and older | 2,038 | 1,151 | 56.5 | 124 | 6.1 | 1,275 | 62.6 |
Females | 5,831 | 3,136 | 53.8 | 461 | 7.9 | 3,597 | 61.7 |
0 to 19 years | 967 | 532 | 55.0 | 66 | 6.8 | 598 | 61.8 |
20 to 29 years | 1,147 | 575 | 50.1 | 70 | 6.1 | 645 | 56.2 |
30 to 44 years | 2,019 | 998 | 49.4 | 190 | 9.4 | 1,188 | 58.8 |
45 years and older | 1,698 | 1,031 | 60.7 | 135 | 8.0 | 1,166 | 68.7 |
7.4 Estimation
Estimation for the CUS had two parts. First, the SPs were weighted; then, census undercoverage was calculated. Weighting involves determining the initial sampling weights of SPs and applying all the adjustments to these initial weights that produce the SPs’ final weights. The weighting steps are described in Sections 7.4.1 to 7.4.4, and the methodology for calculating census undercoverage is described in Section 7.4.6.
7.4.1 Calculating the initial weights
For SPs of all sampling frames except the 2016 missed frame, initial weights were based on the inverse of the probability of being selected in the sample. However, the initial weight of an SP from the 2016 missed frame corresponds to the final weight assigned to it during the 2016 CUS when the SP was classified as “missed.”
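As an illustration of the rule above, the sketch below assigns an initial weight by frame. The stratum sizes, the 2016 final weight and the function name are hypothetical, and the example assumes simple random sampling within a stratum purely for illustration; the actual sample design is described in Section 7.1.

```python
def initial_weight(frame, selection_probability=None, final_weight_2016=None):
    """Initial CUS weight for a selected person (illustrative sketch).

    For the 2016 missed frame, the initial weight is the final weight the
    person received in the 2016 CUS. For every other frame, it is the
    inverse of the probability of selection.
    """
    if frame == "2016 missed":
        if final_weight_2016 is None:
            raise ValueError("2016 missed frame requires the 2016 final weight")
        return float(final_weight_2016)
    if selection_probability is None or not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


# Hypothetical stratum: 50 persons selected out of 10,000
w_births = initial_weight("births", selection_probability=50 / 10_000)

# Hypothetical person classified as missed in 2016 with a final weight of 431.7
w_missed = initial_weight("2016 missed", final_weight_2016=431.7)
```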
7.4.2 Initial weight adjustments
The weights of SPs from the 2016 Census frame who were enumerated more than once in 2016 were adjusted downward to account for the fact that these individuals had more than one chance of being selected.
Next, influential initial weights in the 2016 missed frame were adjusted. The objective was to reduce the effect of high and influential weights on estimates and standard errors by trimming those weights. Some of the 4,821 people in the 2016 missed frame had a very high initial weight. Weights were truncated at a threshold equal to a multiple of the median weight in each trimming group. The trimming groups were formed by the province of selection and five age groups. Any weight above the threshold was reduced to the threshold, and the amount removed was redistributed evenly among the other persons in the trimming group.
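The trimming step can be sketched as follows for a single trimming group. This is a minimal illustration, not the production method: the multiplier of 4 and the weights are hypothetical, and "redistributed evenly" is interpreted here as giving each untrimmed person the same additive share of the amount removed, in a single pass.

```python
from statistics import median


def trim_weights(weights, multiplier):
    """Truncate weights at multiplier * median and spread the excess evenly.

    Assumption: each person whose weight was not truncated receives an equal
    additive share of the total amount removed. No iteration is performed.
    """
    threshold = multiplier * median(weights)
    over = [w > threshold for w in weights]
    excess = sum(w - threshold for w in weights if w > threshold)
    n_receivers = over.count(False)
    share = excess / n_receivers if n_receivers else 0.0
    return [threshold if is_over else w + share
            for w, is_over in zip(weights, over)]


# Hypothetical trimming group (one province of selection, one age group)
group = [10.0, 12.0, 14.0, 16.0, 100.0]
trimmed = trim_weights(group, multiplier=4)
```

Because the amount removed is given back to the rest of the group, the sum of the weights in the trimming group is unchanged; only the influence of the largest weight is reduced.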
7.4.3 Non-response adjustment
To reduce statistical bias, the initial weights of respondents had to be adjusted to account for non-response. The weight of persons who could not be classified (non-respondents) was redistributed among persons who were classified (respondents). There are three types of non-response. First, there are the unidentified persons (only 12 SPs). The initial weights of these persons were transferred to identified persons in each sampling stratum.
The second type of non-response involves untraced persons (4,990 SPs). The adjustment involved forming response homogeneity groups (RHGs) among unlisted persons (listed persons being the persons classified without the help of CUS collection) and transferring the weight of untraced persons to unlisted traced persons within the RHGs.
The first step in the creation of the RHGs was to group unlisted persons (12,337 SPs) into main groups based on their estimated propensity to be in the target population. The groups were formed based on an analysis of the correlation between several tax indicators, particularly those for 2020 and 2021, and the final classification for unlisted traced persons. Up to seven main groups were created based on the sampling frame. These main groups were also strongly correlated with the likelihood to respond.
The second step in creating RHGs was to group unlisted persons based on their likelihood to respond in each domain, with a domain being defined by crossing a sampling frame with a main group. In each domain, the likelihood to respond was analyzed using a national logistic regression model (and a regional model, when the data allowed it) and an analysis of multi-level, cross-frequency tables. For the models, several auxiliary variables available for both traced and untraced persons were used:
- variables available in the sampling frames (e.g., age, sex, relationship to other household members, country of origin, and type of non-permanent resident)
- variables available in the tax data for related persons (e.g., whether they were in certain files, frequency of address changes since 2016, and type of address)
- variables related to contact information (e.g., number and sources of telephone numbers, address availability and link of last known address with the 2021 Census)
- a few other variables.
Thus, the auxiliary variables that were significantly correlated with the likelihood to respond were determined and used to form the RHGs. In most domains, the RHGs were formed within the province or territory of selection. Therefore, the adjustment consisted of transferring the weight of untraced persons to unlisted traced persons within each RHG.
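The weight transfer within RHGs can be sketched as a ratio adjustment. This sketch assumes that the transfer is proportional to each traced person's weight, which is a common choice but is not stated in this report; the RHG labels and weights are hypothetical.

```python
from collections import defaultdict


def transfer_untraced_weight(persons):
    """Move the weight of untraced persons to traced persons in the same RHG.

    persons: list of dicts with keys "rhg", "weight" and "traced" (bool).
    Assumption: each traced weight is multiplied by
    (total weight in RHG) / (traced weight in RHG).
    """
    total = defaultdict(float)
    traced_total = defaultdict(float)
    for p in persons:
        total[p["rhg"]] += p["weight"]
        if p["traced"]:
            traced_total[p["rhg"]] += p["weight"]

    adjusted = []
    for p in persons:
        if not p["traced"]:
            adjusted.append({**p, "weight": 0.0})
            continue
        if traced_total[p["rhg"]] == 0:
            raise ValueError(f"RHG {p['rhg']} has no traced persons")
        factor = total[p["rhg"]] / traced_total[p["rhg"]]
        adjusted.append({**p, "weight": p["weight"] * factor})
    return adjusted


# Hypothetical unlisted persons in two RHGs
sample = [
    {"rhg": "A", "weight": 100.0, "traced": True},
    {"rhg": "A", "weight": 50.0, "traced": True},
    {"rhg": "A", "weight": 30.0, "traced": False},
    {"rhg": "B", "weight": 80.0, "traced": True},
    {"rhg": "B", "weight": 20.0, "traced": False},
]
adjusted = transfer_untraced_weight(sample)
new_weights = [p["weight"] for p in adjusted]
```

The total weight in each RHG is preserved, so the adjustment changes who carries the weight but not how much weight each group represents.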
The third non-response adjustment was the adjustment for unclassified persons (169 SPs). An unclassified person is a person who had their primary residence in a given province or territory on Census Day (thus in the census target population), but for whom it was not certain whether they were missed or enumerated. Using the same principle as with untraced persons, homogeneous groups of classified persons were formed within each sampling frame and province of classification. The adjustment consisted of transferring the weight of unclassified persons to unlisted classified persons within each homogeneous group.
7.4.4 Final adjustments to the weights for classified persons
7.4.4.1 Adjustment for influential weights
At this stage, some SPs have a weight that is high and considered influential in their province of classification. To reduce the effect of high and influential weights on provincial estimates and their standard errors, an adjustment to influential weights was made in the five frames for the provinces. The method used was to trim weights by a multiplier of the median of weights in each trimming group formed. There are two types of influential weights at this stage.
First, there are SPs whose province of classification is different from the province of selection. As a result, their weight can be very high compared with that of other SPs in the province of classification. Consider, for example, an SP selected in Ontario with a large weight, who is classified in Prince Edward Island. In this situation, the weight is truncated at the threshold established for the trimming group. A threshold of between four and six times the median weight of each group was used. The trimming groups were formed according to the province of classification and five age groups. The amount removed from the weight of an SP was redistributed evenly to the other SPs in the same province of selection, sampling frame, classification (enumerated, missed or out of scope), status (listed or unlisted) and age group. Therefore, the influential weight of a missed SP in a given province of classification was allocated to other missed persons, but in the province of selection of the SP. For this first type of influential weight, there were 49 SPs whose weight was truncated (33 enumerated persons and 16 missed persons).
The second type of influential weight relates to the SPs from the 2016 missed frame only, who still had a high and influential weight within their province of classification even though it was identical to the province of selection (which is, in fact, the province of classification in 2016). For this type of influential weight, the threshold was set at four times the median weight in the trimming group. The amount removed from the weights of SPs was redistributed evenly to the other SPs in the same province of classification and the same classification, thus having no effect on the estimate of provincial undercoverage. For this second type of influential weight, there were 95 SPs whose weight was truncated (10 enumerated persons, 55 missed persons and 30 out-of-scope persons).
7.4.4.2 Weight calibration for the birth frame
For the birth frame sample, enumerated persons were calibrated to take into account cases where a provincial sample would contain too many or too few enumerated persons. An automated deterministic linkage applied to the 2021 CCS-RDB helped to determine the control totals per province for the enumerated persons calibration group. Then, for the other persons in the frame, a linkage to the tax data determined their province of residence on Census Day (otherwise, the province of selection was used) to determine the control totals per province for the non-enumerated persons calibration group. In addition, control totals by single year of age (0 to 4 years) were calculated. The calibration was carried out by raking over two margins: the 20 control totals described above (first margin) and the 5 calibration groups by age (second margin). Statistics Canada’s Generalized Estimation System (G-EST) was used for this purpose.
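Raking over two margins can be sketched as iterative proportional fitting. The sketch below is a plain-Python stand-in for illustration only; it does not use or reproduce G-EST, and the labels, weights and control totals are hypothetical.

```python
def margin_sums(weights, labels):
    """Sum of weights for each label of one margin."""
    sums = {}
    for w, lab in zip(weights, labels):
        sums[lab] = sums.get(lab, 0.0) + w
    return sums


def rake(weights, labels_1, totals_1, labels_2, totals_2,
         tol=1e-9, max_iter=200):
    """Alternately scale weights to match two sets of control totals."""
    w = list(weights)
    for _ in range(max_iter):
        for labels, totals in ((labels_1, totals_1), (labels_2, totals_2)):
            sums = margin_sums(w, labels)
            w = [wi * totals[lab] / sums[lab] for wi, lab in zip(w, labels)]
        # The second margin now matches exactly; stop once the first does too.
        sums_1 = margin_sums(w, labels_1)
        if all(abs(sums_1[k] - t) <= tol * t for k, t in totals_1.items()):
            break
    return w


# Hypothetical mini-sample: first margin = province x calibration group,
# second margin = single year of age.
weights = [10.0, 10.0, 10.0, 10.0]
cell = ["ON-enumerated", "ON-enumerated", "ON-other", "ON-other"]
age = [0, 1, 0, 1]
cell_totals = {"ON-enumerated": 30.0, "ON-other": 20.0}
age_totals = {0: 24.0, 1: 26.0}

raked = rake(weights, cell, cell_totals, age, age_totals)
```

The two sets of control totals must add up to the same overall total (50 in this example) for the procedure to converge on both margins.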
7.4.4.3 Weight calibration for the immigrant frame
For the immigrant frame sample, a calibration of the number of persons in certain calibration groups was carried out to take into account cases where a provincial sample would contain too many or too few enumerated persons or persons in other groups. An automated deterministic linkage applied to the 2021 CCS-RDB helped to determine the control totals per province for the enumerated persons calibration group. Then, for the other persons in the frame, a linkage to the tax data determined their tax status (active or non-active) and their province of residence on Census Day (otherwise, the province of selection was used) to determine control totals by province for the other non-enumerated persons calibration groups. In the four largest provinces, three control totals were determined: for enumerated persons, for persons with recent fiscal activities, and for other persons. However, in the other six provinces, only two control totals were determined: for enumerated persons and for other persons. Thus, 24 control totals were formed. A simple poststratification method was then used to calibrate the immigrant frame.
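Simple poststratification scales the weights in each calibration group so that they sum to the group's control total. The sketch below first confirms the count of 24 control totals and then applies the adjustment to a hypothetical mini-sample for one large province; the weights and control totals are illustrative, not CUS values.

```python
# Four largest provinces with three groups, six other provinces with two groups
n_control_totals = 4 * 3 + 6 * 2


def poststratify(weights, groups, control_totals):
    """Scale weights so each group's weighted sum equals its control total."""
    sums = {}
    for w, g in zip(weights, groups):
        sums[g] = sums.get(g, 0.0) + w
    return [w * control_totals[g] / sums[g] for w, g in zip(weights, groups)]


# Hypothetical sample and control totals for one of the four largest provinces
weights = [100.0, 150.0, 50.0, 200.0]
groups = ["enumerated", "enumerated", "recent tax activity", "other"]
control_totals = {
    "enumerated": 300.0,
    "recent tax activity": 40.0,
    "other": 220.0,
}

calibrated = poststratify(weights, groups, control_totals)
```

Unlike raking, this adjustment needs a single pass because each person belongs to exactly one calibration group.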
7.4.4.4 Post-stratification adjustment for the territories
After the initial weight adjustment, the estimated number of enumerated persons in the territories was observed to be lower than the comparable census count, as has traditionally been the case. This was due to undercoverage of the census target population in health insurance files. To address this undercoverage, the weights of the SPs selected in each territory were adjusted so that the estimated number of enumerated persons equalled the comparable census count for that territory. The adjustments were made for six calibration groups (by age and gender) in each territory.
7.4.4.5 Adjustment for overlap of frames or strata
For a small number of SPs in the five provincial frames, one further adjustment was required before the weight became final. This adjustment takes into account the overlap between the sampling frames or, in some cases, the overlap between the census frame strata (i.e., overcoverage in 2016), which was noted only after the CUS collection in 2021. The few SPs who overlapped frames were mostly SPs from the immigrant frame or the non-permanent resident frame who turned out to be covered by the 2016 Census frame as well (i.e., they were enumerated in 2016). This information was not known when these sampling frames were prepared. Therefore, an adjustment factor was calculated that takes into account the probability of selection in both sampling frames.
7.4.5 Weighted distribution by classification
Table 7.4.5 shows the weighted distribution of SPs by classification and sampling frame. For a reminder of the definitions, see Section 7.2. Only SPs found in the CCS-RDB were classified as “enumerated.” Persons who were in the target population but not in the CCS-RDB were classified as “missed.” The remaining SPs were classified as “out of scope” (e.g., deceased or emigrated).
Note: In Table 7.4.5, the provincial strata comprise the 2016 Census, 2016 missed, births, immigrants and non-permanent residents frames; the territorial strata comprise the territorial frames.
Source: Statistics Canada, 2021 Census Undercoverage Study.

Classification | 2016 Census: number | 2016 Census: % | 2016 missed: number | 2016 missed: % | Births: number | Births: % | Immigrants: number | Immigrants: % | Non-permanent residents: number | Non-permanent residents: % | Territorial frames: number | Territorial frames: % | Total: number | Total: % |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Total | 32,933,387 | 100.0 | 2,830,944 | 100.0 | 1,855,111 | 100.0 | 1,072,833 | 100.0 | 1,140,539 | 100.0 | 137,867 | 100.0 | 39,970,681 | 100.0 |
Enumerated | 29,127,257 | 88.4 | 1,784,797 | 63.0 | 1,646,438 | 88.8 | 874,651 | 81.5 | 639,236 | 56.0 | 94,583 | 68.6 | 34,166,962 | 85.5 |
Listed | 29,023,031 | 88.1 | 1,773,009 | 62.6 | 1,643,876 | 88.6 | 871,537 | 81.2 | 626,405 | 54.9 | 94,272 | 68.4 | 34,032,130 | 85.1 |
Not listed | 104,226 | 0.3 | 11,788 | 0.4 | 2,562 | 0.1 | 3,114 | 0.3 | 12,831 | 1.1 | 311 | 0.2 | 134,832 | 0.3 |
Missed | 2,083,885 | 6.3 | 662,494 | 23.4 | 164,767 | 8.9 | 130,942 | 12.2 | 387,586 | 34.0 | 32,760 | 23.8 | 3,462,434 | 8.7 |
Listed | 243,914 | 0.7 | 41,300 | 1.5 | 16,954 | 0.9 | 7,987 | 0.7 | 14,693 | 1.3 | 5,567 | 4.0 | 330,415 | 0.8 |
Not listed | 1,839,971 | 5.6 | 621,194 | 21.9 | 147,813 | 8.0 | 122,955 | 11.5 | 372,893 | 32.7 | 27,193 | 19.7 | 3,132,019 | 7.8 |
Out of scope | 1,722,245 | 5.2 | 383,653 | 13.6 | 43,906 | 2.4 | 67,240 | 6.3 | 113,717 | 10.0 | 10,524 | 7.6 | 2,341,285 | 5.9 |
Listed | 1,402,710 | 4.3 | 206,632 | 7.3 | 25,675 | 1.4 | 3,964 | 0.4 | 32,613 | 2.9 | 7,768 | 5.6 | 1,679,362 | 4.2 |
Not listed | 319,535 | 1.0 | 177,021 | 6.3 | 18,231 | 1.0 | 63,276 | 5.9 | 81,104 | 7.1 | 2,756 | 2.0 | 661,923 | 1.7 |
7.4.6 Calculating census undercoverage
Note the following definitions:
- $C$ = published census count of the number of persons in the target population
- $\hat{U}$ = undercoverage estimate, that is, the estimate of the number of persons not included in $C$ who should have been
- $\hat{M}$ = estimate of the number of persons in the CUS target population who were not enumerated, that is, the sum of the final weights of persons considered to be missed
- $X$ = the number of persons included in $C$ who could not be identified with certainty as enumerated in the CUS.
Census population undercoverage was estimated by the (weighted) number of missed persons less the number of persons counted in the census (included in $C$) but excluded from the CCS-RDB:
$$\hat{U} = \hat{M} - X$$
$X$ has three components: imputations, incomplete enumerations and late enumerations.
Imputations arise when the SP’s address on Census Day refers to a dwelling for which an enumeration was imputed. This was the case in particular for non-response dwellings for which another household’s data were used in whole household imputation (WHI).
Some enumerations in the census database were deemed too incomplete to be used by the CUS to determine whether an SP was enumerated. Incomplete enumerations in this context usually involve missing or invalid date of birth or name data (e.g., “?”, “Mr.”, “Unknown” or “Person 1”). An SP enumerated in this manner was classified as “missed.” This was referred to as a “CUS incomplete enumeration.” This category of enumeration also includes certain types of collective dwellings for which only the number of usual residents was collected in the census (no names or dates of birth). Data for people living in these collective dwellings were imputed from the RDB.
At the national level, $X$ made up slightly less than half of $\hat{M}$. The value of $X$ increased from 2016 because of an increase in the number of persons imputed as part of the WHI and the increase in imputations in certain types of collective dwellings (incomplete enumerations).
Table 7.4.6 shows the national numbers for the various components of the population undercoverage estimate, namely the numbers for the three components of the term $X$.
Notes: CUS = Census Undercoverage Study; M = number of people in the CUS target population who were not enumerated; X = number of people included in the published census count but who could not be identified with certainty as enumerated in the CUS; U = undercoverage.
Source: Statistics Canada, 2021 Census Undercoverage Study.

Components | Number of people |
---|---|
Estimate of M | 3,462,434 |
Total X | 1,564,558 |
X for imputed people | 931,346 |
X for late enumerations | 0 |
X for CUS incomplete enumerations | 633,212 |
Estimate of U | 1,897,876 |
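The figures in Table 7.4.6 can be checked with a few lines of arithmetic: undercoverage is the weighted number of missed persons less the persons counted in the census who could not be identified as enumerated. The variable names below are illustrative only.

```python
# National figures from Table 7.4.6
m_hat = 3_462_434  # estimate of M: weighted number of missed persons

x_components = {
    "imputed people": 931_346,
    "late enumerations": 0,
    "CUS incomplete enumerations": 633_212,
}
x_total = sum(x_components.values())

# Undercoverage estimate: missed persons less X
u_hat = m_hat - x_total

# Share of the missed estimate that is offset by X
x_share_of_m = x_total / m_hat
```

The share works out to about 45%, which is consistent with the statement that the offsetting term made up slightly less than half of the estimate of missed persons.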
Lastly, the variance of the undercoverage estimates was calculated as follows:
$$v(\hat{U}) = v(\hat{M})$$
where $v(\hat{M})$ = estimated variance of $\hat{M}$ as determined by the CUS design.
The variance was calculated using the classic bootstrap resampling method. To that end, weights of 500 bootstrap replicates were produced.
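A bootstrap variance computed from replicate weights can be sketched as follows. This is a minimal illustration under stated assumptions: the variance is taken as the mean squared deviation of the replicate estimates from the full-sample estimate, which is one common convention and is not confirmed by this report; the weights are hypothetical, and only 4 replicates are used instead of 500.

```python
def weighted_missed_total(weights, is_missed):
    """Sum of the weights of persons classified as missed."""
    return sum(w for w, missed in zip(weights, is_missed) if missed)


def bootstrap_variance(full_estimate, replicate_estimates):
    """Mean squared deviation of replicate estimates from the full estimate."""
    n_replicates = len(replicate_estimates)
    return sum((r - full_estimate) ** 2
               for r in replicate_estimates) / n_replicates


# Hypothetical sample of three persons, two of them missed
is_missed = [True, False, True]
final_weights = [100.0, 200.0, 50.0]
replicate_weights = [
    [120.0, 190.0, 40.0],
    [90.0, 210.0, 50.0],
    [100.0, 200.0, 70.0],
    [80.0, 180.0, 50.0],
]

m_hat = weighted_missed_total(final_weights, is_missed)
m_hat_replicates = [weighted_missed_total(w, is_missed)
                    for w in replicate_weights]
variance_m_hat = bootstrap_variance(m_hat, m_hat_replicates)
```

In this sketch only the estimate of missed persons varies across replicates; the census-based count subtracted from it is treated as a fixed quantity, so it contributes nothing to the variance.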