Coverage Technical Report, Census of Population, 2021
10. Evaluation of coverage studies
10.1 Census Undercoverage Study
10.1.1 Introduction
The results of the largest coverage study, the Census Undercoverage Study (CUS), can be assessed by comparing its estimates with data on the same characteristics from other sources, such as the 2021 Census database and administrative data used by the Demographic Estimates Program (DEP). These comparisons serve both to evaluate the CUS estimates and to quantify conceptual and measurement differences between the sources.
Despite some conceptual differences between the CUS and the 2021 Census, the CUS estimates of persons enumerated in the 2021 Census can be compared with the census counts. To make the two numbers comparable, certain adjustments were first made to the census counts.
Estimates of the components of intercensal demographic growth can be compared with estimates from other sources. The CUS estimates of the number of persons who died between the 2016 Census and the 2021 Census can be compared with the counts from vital statistics files. Estimates of net interprovincial migration calculated by the DEP based on Canada Revenue Agency data can be compared with CUS estimates. Lastly, CUS estimates of the components of demographic growth can be compared with similar estimates from administrative data.
10.1.2 Comparisons with census counts
Since the CUS’s single-stage stratified sampling design produces unbiased estimates, differences between CUS estimates and census counts are mainly attributable to sampling error in the CUS estimates; conceptual differences between the two sources; or systematic biases that have impacts on the two sources, resulting in an underestimate or overestimate of the characteristic being studied.
Enumerated persons
Provincial and national comparisons are presented in Table 10.1.2.1, along with the standard error of the CUS estimate and the t‑value used to test the hypothesis that there is no difference between the CUS estimate and the comparable census count. The adjustments below were made to the published census counts to account for conceptual differences between the two sources:
- Adjustments based on whole household imputation were excluded because, while they were included in the census counts, they were not part of the CUS estimate of enumerated persons.
- The 2021 Census overcoverage estimate was subtracted because the census database contained overcovered persons, whereas the CUS estimate was based on the number of unique persons enumerated (and not on the number of enumerations).
- The estimate of the number of persons living outside Canada five years earlier (excluding intercensal immigrants and non‑permanent residents [NPRs]) from the 2021 Census long-form questionnaire was also subtracted because the CUS estimates did not include the majority of these persons. For the same reason, the estimated number of children aged 0 to 4 years who were born outside Canada but had Canadian citizenship was also subtracted.
- Similarly, for the provinces, the number of persons living in a territory five years earlier was subtracted because they were not covered by the CUS provincial sampling frames.
- The number of persons from reserves (who participated in the 2021 Census, but not in the 2016 Census) was also subtracted because the CUS estimates did not include the majority of these persons.
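
The adjustments listed above can be summarized in a single expression. This is only a notational sketch: the symbols are introduced here for illustration and do not appear in the report.

$$
C^{\text{comp}} = C^{\text{pub}} - W - O - A - K - T - V
$$

Here $C^{\text{pub}}$ is the published census count, $W$ the adjustments from whole household imputation, $O$ the 2021 Census overcoverage estimate, $A$ the estimated number of persons living outside Canada five years earlier (excluding intercensal immigrants and NPRs), $K$ the estimated number of children aged 0 to 4 years born outside Canada with Canadian citizenship, $T$ the number of persons living in a territory five years earlier (provincial counts only), and $V$ the number of persons from reserves that participated in the 2021 Census but not in the 2016 Census.
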

Table 10.1.2.1
Sources: Statistics Canada, 2021 Census coverage studies and 2021 Census.

Provinces and territories | CUS estimated number of enumerated people | CUS standard error | Comparable census count | Difference (number) | Difference (t-value) |
---|---|---|---|---|---|
Canada | 34,166,964 | 40,467 | 34,380,739 | -213,775 | -5.28 |
Newfoundland and Labrador | 478,988 | 2,432 | 480,370 | -1,382 | -0.57 |
Prince Edward Island | 142,075 | 1,033 | 144,380 | -2,305 | -2.23 |
Nova Scotia | 899,614 | 4,153 | 907,675 | -8,061 | -1.94 |
New Brunswick | 724,776 | 3,361 | 725,234 | -458 | -0.14 |
Quebec | 7,912,911 | 18,532 | 7,976,614 | -63,703 | -3.44 |
Ontario | 13,243,488 | 29,802 | 13,306,050 | -62,562 | -2.10 |
Manitoba | 1,217,061 | 5,621 | 1,231,102 | -14,041 | -2.50 |
Saskatchewan | 1,033,774 | 5,457 | 1,031,302 | 2,472 | 0.45 |
Alberta | 3,894,128 | 13,778 | 3,932,868 | -38,740 | -2.81 |
British Columbia | 4,525,565 | 15,461 | 4,550,560 | -24,995 | -1.62 |
Yukon | 34,815 | 0 | 34,815 | 0 | not applicable |
Northwest Territories | 34,360 | 0 | 34,360 | 0 | not applicable |
Nunavut | 25,409 | 0 | 25,409 | 0 | not applicable |
Nationally, the CUS estimate of the number of persons enumerated in the 2021 Census was lower than the comparable census count (-0.62%). For the 1996 to 2016 censuses, the national difference between the CUS estimate and the comparable census count was between -0.09% and 0.12%. This is the first time since the comparison was introduced that the difference is statistically significant at the national level. At the provincial level, the CUS estimate is lower than the comparable census count for every province except Saskatchewan, and the difference is statistically significant for five provinces: Prince Edward Island, Quebec, Ontario, Manitoba and Alberta.
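
The t‑values in Table 10.1.2.1 can be reproduced from the other columns. The sketch below assumes that the t‑value is the difference divided by the CUS standard error and that significance is judged against a two-sided normal critical value of 1.96; neither detail is stated explicitly in this section. The territories are omitted because their standard error is 0.

```python
# Rows from Table 10.1.2.1: (CUS estimate, CUS standard error, comparable census count).
ROWS = {
    "Canada": (34_166_964, 40_467, 34_380_739),
    "Newfoundland and Labrador": (478_988, 2_432, 480_370),
    "Prince Edward Island": (142_075, 1_033, 144_380),
    "Nova Scotia": (899_614, 4_153, 907_675),
    "New Brunswick": (724_776, 3_361, 725_234),
    "Quebec": (7_912_911, 18_532, 7_976_614),
    "Ontario": (13_243_488, 29_802, 13_306_050),
    "Manitoba": (1_217_061, 5_621, 1_231_102),
    "Saskatchewan": (1_033_774, 5_457, 1_031_302),
    "Alberta": (3_894_128, 13_778, 3_932_868),
    "British Columbia": (4_525_565, 15_461, 4_550_560),
}

CRITICAL_VALUE = 1.96  # assumption: two-sided 5% level, normal approximation


def t_value(estimate, std_error, census_count):
    """Difference between the CUS estimate and the census count, in standard errors."""
    return (estimate - census_count) / std_error


# Jurisdictions whose difference exceeds the assumed critical value.
significant = [
    name
    for name, (est, se, count) in ROWS.items()
    if abs(t_value(est, se, count)) > CRITICAL_VALUE
]

for name, (est, se, count) in ROWS.items():
    print(f"{name:26s} diff={est - count:>9,d}  t={t_value(est, se, count):6.2f}")
print("Significant at the assumed 5% level:", significant)
```

Under these assumptions, the flagged jurisdictions are Canada and the five provinces named in the text.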
In previous cycles, significant differences were also observed. The differences were investigated to make sure that there was no bias in the CUS classification (including, for example, province of residence on Census Day). Other factors may also play an important role in the observed differences. Apart from sampling error, biases in the adjustments (e.g., returning Canadians) applied to the published census counts to obtain conceptually comparable figures may be responsible for the differences. CUS non-response bias may also have played a role since the non-response adjustment was designed to obtain the best result for estimating missed persons rather than enumerated persons. Regular checks and quality controls were performed for all steps in the CUS.
In view of the more significant differences than typically observed, a thorough investigation was conducted. In past censuses, there have always been persons enumerated in the census who were not covered by the CUS frames, either because of limitations with those frames or because they were not part of the census target population. On top of the returning Canadians already mentioned above, the first category also includes dependants (children and spouses) of NPRs. The second category includes persons deceased before Census Day, postcensal births, foreign visitors, immigrants who arrived in Canada after Census Day, and NPRs and their dependants who arrived after Census Day or did not have a valid permit. In past cycles, the number of such persons was deemed small enough to not have a significant impact on the comparison between the CUS estimate of enumerated persons and the comparable census count. With the large increase in the number of NPRs in recent years, it seems that this is no longer the case. A large portion of the difference between the two numbers can be explained by the enumeration of dependants of NPRs or out-of-scope census records.
An analysis of the differences by age group showed that the negative differences were concentrated among children
None of the investigative work raised any concern with the CUS classification, weighting or estimation steps.
10.1.3 Comparison with demographic estimates
Deceased persons
Table 10.1.3.1a provides a comparison of the estimated number of persons who died during the intercensal period (May 10, 2016, to May 10, 2021) by CUS province of classification with counts from vital statistics files. The CUS estimate excludes persons who died outside Canada when the country of death is known. At the national level, the CUS estimate exceeded the vital statistics count by 6,492 persons (0.5%), and this difference was not statistically significant. At the provincial level, the greatest percentage differences were noted in Prince Edward Island (648, or 9.7%) and Quebec (12,731, or 3.7%), but only the first one was statistically significant (t‑value of 2.28). In the other provinces, the relative differences were between -1.7% and 2.1%. They were not statistically significant, and most differences were smaller than what was observed in 2016.
Table 10.1.3.1a
Note: Coverage estimates may not necessarily add up to the totals because of rounding.
Sources: Statistics Canada, 2021 Census Undercoverage Study, Vital Statistics Program and Demographic Estimates Program.

Provinces | CUS estimated number of people deceased, May 10, 2016, to May 10, 2021 | CUS standard error | Vital statistics count | Difference (number) | Difference (t-value) |
---|---|---|---|---|---|
Total | 1,442,017 | 13,967 | 1,435,525 | 6,492 | 0.46 |
Newfoundland and Labrador | 25,999 | 626 | 26,255 | -256 | -0.41 |
Prince Edward Island | 7,317 | 285 | 6,669 | 648 | 2.28 |
Nova Scotia | 47,877 | 933 | 48,349 | -472 | -0.51 |
New Brunswick | 38,084 | 918 | 37,748 | 336 | 0.37 |
Quebec | 355,031 | 7,366 | 342,300 | 12,731 | 1.73 |
Ontario | 531,134 | 9,193 | 540,387 | -9,253 | -1.01 |
Manitoba | 55,461 | 1,656 | 56,058 | -597 | -0.36 |
Saskatchewan | 48,750 | 1,419 | 48,835 | -85 | -0.06 |
Alberta | 132,746 | 3,842 | 133,471 | -725 | -0.19 |
British Columbia | 199,619 | 5,001 | 195,453 | 4,166 | 0.83 |
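
The percentage differences quoted for Table 10.1.3.1a can be checked directly from the table. The sketch below assumes that each percentage is taken relative to the vital statistics count; this is not stated in the text, but it reproduces the quoted figures.

```python
# Rows from Table 10.1.3.1a: (CUS estimate, vital statistics count).
DEATHS = {
    "Newfoundland and Labrador": (25_999, 26_255),
    "Prince Edward Island": (7_317, 6_669),
    "Nova Scotia": (47_877, 48_349),
    "New Brunswick": (38_084, 37_748),
    "Quebec": (355_031, 342_300),
    "Ontario": (531_134, 540_387),
    "Manitoba": (55_461, 56_058),
    "Saskatchewan": (48_750, 48_835),
    "Alberta": (132_746, 133_471),
    "British Columbia": (199_619, 195_453),
}


def relative_difference(cus, vital):
    """Percentage difference of the CUS estimate relative to the vital statistics count
    (the choice of denominator is an assumption)."""
    return 100.0 * (cus - vital) / vital


rel = {prov: relative_difference(c, v) for prov, (c, v) in DEATHS.items()}

# Range of relative differences once the two largest cases are set aside.
others = {p: r for p, r in rel.items() if p not in ("Prince Edward Island", "Quebec")}
low, high = min(others.values()), max(others.values())

for prov, r in rel.items():
    print(f"{prov:26s} {r:6.1f}%")
print(f"Other provinces range from {low:.1f}% to {high:.1f}%")
```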
Several factors may explain a significant difference. First, the CUS estimate may include deaths that occurred abroad, which are not included in vital statistics. In the CUS, if the country of death is known and is abroad, the death is excluded from the comparison of deceased persons in Table 10.1.3.1a. However, if the person is not found in the vital statistics files and the country of death is unknown, the death is assigned by default to the person’s most recent province of residence in Canada. Only 12 selected persons were in this situation, and none were in Prince Edward Island; therefore, this is not the main reason for the significant difference in that province. Second, a person may not be found in the vital statistics files because of differences in personal information. Third, deaths may be underreported in vital statistics. For 34 selected persons, the death was confirmed to have occurred in Canada but was not found in the vital statistics files; 10 of them were in Prince Edward Island.
Table 10.1.3.1b provides a comparison of the CUS estimate of the number of persons who died during the intercensal period (May 10, 2016, to May 10, 2021) by province of residence indicated in the vital statistics files (therefore, only for persons found in these files) with vital statistics counts. The Prince Edward Island difference is no longer significant, with a t‑value of 0.45. However, the difference becomes significant in Ontario (-19,386, with a t‑value of -2.47). Although these results do not seem indicative of issues with the CUS estimates of the number of deceased persons, a more detailed investigation was conducted to confirm that no classification or other error was involved in the operations or estimates. No such errors or problems were detected.
Table 10.1.3.1b
Note: Coverage estimates may not necessarily add up to the totals because of rounding.
Sources: Statistics Canada, 2021 Census Undercoverage Study, Vital Statistics Program and Demographic Estimates Program.

Provinces | CUS estimated number of people deceased, May 10, 2016, to May 10, 2021 | CUS standard error | Vital statistics count | Difference (number) | Difference (t-value) |
---|---|---|---|---|---|
Total | 1,419,908 | 12,750 | 1,435,525 | -15,617 | -1.22 |
Newfoundland and Labrador | 25,966 | 626 | 26,255 | -289 | -0.46 |
Prince Edward Island | 6,773 | 232 | 6,669 | 104 | 0.45 |
Nova Scotia | 47,877 | 933 | 48,349 | -472 | -0.51 |
New Brunswick | 37,123 | 887 | 37,748 | -625 | -0.70 |
Quebec | 347,817 | 7,081 | 342,300 | 5,517 | 0.78 |
Ontario | 521,001 | 7,859 | 540,387 | -19,386 | -2.47 |
Manitoba | 54,605 | 1,628 | 56,058 | -1,453 | -0.89 |
Saskatchewan | 48,750 | 1,419 | 48,835 | -85 | -0.06 |
Alberta | 131,732 | 3,823 | 133,471 | -1,739 | -0.45 |
British Columbia | 198,263 | 4,869 | 195,453 | 2,810 | 0.58 |
Interprovincial migration
Table 10.1.3.2 compares CUS estimates of net interprovincial migration for the intercensal period with corresponding figures calculated by the DEP based on Canada Revenue Agency files. In general, data on interprovincial migrants were not comparable because the CUS only took into account migration flows that occurred between the sampling frame reference date (e.g., May 10, 2016, for the census frame) and Census Day in 2021, whereas the DEP estimates took annual migration into account. For this reason, only net interprovincial migration estimates are presented.
Although the estimates differ, the CUS and demographic estimates of net interprovincial migration go in the same direction (positive or negative net migration) for every province. The difference between the CUS and demographic estimates is smaller than it was in 2016 for 6 of the 10 provinces, and roughly the same for another one, but the standard errors of the CUS estimates are also much smaller than they were in 2016. As a result, two provinces show a statistically significant difference between the CUS and demographic estimates of net migration—Ontario (with a t‑value of 3.27) and Manitoba (-2.27). A significant difference was observed for one province in 2016.
Table 10.1.3.2

Provinces | CUS sample size | CUS estimate of net interprovincial migration | CUS standard error | Demographic estimate of net interprovincial migration | Difference (number) | Difference (t-value) |
---|---|---|---|---|---|---|
Newfoundland and Labrador | 27,245 | -6,477 | 2,405 | -8,555 | 2,078 | 0.86 |
Prince Edward Island | 13,650 | 1,616 | 1,093 | 2,993 | -1,377 | -1.26 |
Nova Scotia | 69,283 | 14,315 | 4,119 | 21,720 | -7,406 | -1.80 |
New Brunswick | 50,933 | 6,938 | 2,918 | 7,491 | -553 | -0.19 |
Quebec | 111,790 | -39,080 | 9,831 | -26,852 | -12,227 | -1.24 |
Ontario | 291,826 | 73,340 | 15,334 | 23,157 | 50,183 | 3.27 |
Manitoba | 57,416 | -41,156 | 4,598 | -30,715 | -10,441 | -2.27 |
Saskatchewan | 65,324 | -42,617 | 5,066 | -42,019 | -599 | -0.12 |
Alberta | 232,028 | -58,339 | 13,084 | -35,046 | -23,293 | -1.78 |
British Columbia | 211,119 | 91,460 | 13,209 | 87,826 | 3,634 | 0.28 |
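
The statements about Table 10.1.3.2 can be checked with a short script. This sketch assumes that the t‑value is the difference (CUS minus demographic estimate) divided by the CUS standard error, tested against an assumed critical value of 1.96. It also confirms an arithmetic property of the published rows: both net migration columns sum to zero across the ten provinces.

```python
# Rows from Table 10.1.3.2: (CUS estimate, CUS standard error, demographic estimate).
NET_MIGRATION = {
    "Newfoundland and Labrador": (-6_477, 2_405, -8_555),
    "Prince Edward Island": (1_616, 1_093, 2_993),
    "Nova Scotia": (14_315, 4_119, 21_720),
    "New Brunswick": (6_938, 2_918, 7_491),
    "Quebec": (-39_080, 9_831, -26_852),
    "Ontario": (73_340, 15_334, 23_157),
    "Manitoba": (-41_156, 4_598, -30_715),
    "Saskatchewan": (-42_617, 5_066, -42_019),
    "Alberta": (-58_339, 13_084, -35_046),
    "British Columbia": (91_460, 13_209, 87_826),
}

CRITICAL_VALUE = 1.96  # assumption: two-sided 5% level, normal approximation

# Do both sources agree on the sign of net migration in every province?
same_direction = all(
    (cus > 0) == (dep > 0) for cus, _, dep in NET_MIGRATION.values()
)

# Provinces where the two estimates differ by more than the assumed critical value.
significant = [
    prov
    for prov, (cus, se, dep) in NET_MIGRATION.items()
    if abs((cus - dep) / se) > CRITICAL_VALUE
]

# Column totals over the ten provinces.
sum_cus = sum(cus for cus, _, _ in NET_MIGRATION.values())
sum_dep = sum(dep for _, _, dep in NET_MIGRATION.values())

print("Same direction in every province:", same_direction)
print("Significant differences:", significant)
print("Column sums (CUS, demographic):", sum_cus, sum_dep)
```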
10.2 Census Overcoverage Study
The validation of the results of the two coverage studies was guided by Statistics Canada’s Directive for the Validation of Statistical Outputs. Among various validation steps, it was possible to evaluate the results from the 2021 Census Overcoverage Study (COS) by assessing how each of the components that led to the construction of its sampling frame contributed to the overall estimation of census overcoverage. It was also possible to look at the potential reasons why persons were counted more than once in the census. Refer to Section 8.7 for more information.
During the validation process, another step was to compare the COS frame and estimates with other available sources, to analyze what might be missing from the COS frame and identify systemic issues, if any. One such source is the Social Data Linkage Environment (SDLE). The SDLE team at Statistics Canada is in charge of all census linkages used to derive information from administrative data sources (to gather income information, for example). As a first step to its linkages, the SDLE team does an internal probabilistic linkage of the entire Census Response Database (RDB) to itself to identify potential duplicate persons. There are some important differences between the objective of the SDLE’s internal record linkage of the RDB and the objective of the COS. The objective of the SDLE is to identify with certainty duplicate records, and, therefore, it uses a conservative approach in duplicate identification. The objective of the COS is to construct a frame of all possible duplicate pairs from the internal linkage of the RDB.
The SDLE list of duplicated persons was compared with the COS frame, and all SDLE duplicates were on the 2021 COS frame. COS estimates were also compared with the SDLE potential duplicate persons to investigate trends between 2016 and 2021. The two sources differ in nature: the COS results are estimates of overcoverage and contain sampling error, whereas the SDLE results are counts and are expected to represent a subset of the COS estimate. Comparing the percentage change between 2016 and 2021 for both sources was nonetheless a useful exercise. The two sources showed consistent results, with an increase in overcoverage from 2016 to 2021, and a similar pattern was observed for each province and territory.
Another evaluation was done by examining the correlation between the overcoverage status of a pair of potentially duplicate RDB records and the linkage weight that was derived for this pair, for pairs that had been identified by the probabilistic linkage of the RDB to itself (refer to Section 8.2.3 for more information on this probabilistic linkage step). In a probabilistic linkage, a linkage weight is calculated for each pair of linked records, based on the strength of that link. As described in Section 8, in the COS, a sample of the pairs identified as potential duplicates by the probabilistic linkage of the RDB to itself was subject to a manual verification process, where coders had to determine whether each pair was a true duplicate (verified overcoverage) or not. The distribution of linkage weights for the overcoverage cases and the non-overcoverage cases was compared. If the linkage performed as expected, there should be a difference between the weights of the two groups. As expected, the linkage weights were much larger on average for the overcoverage cases, and this was true for every province and territory.
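To illustrate what a linkage weight measures, the sketch below uses the Fellegi-Sunter formulation that is common in probabilistic record linkage. The report does not say which formulation was used for the RDB linkage, so this is an assumption; the field names and the m- and u-probabilities are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical comparison fields with illustrative probabilities (not from the report):
# m = P(field agrees | same person), u = P(field agrees | different persons).
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "given_name": (0.90, 0.02),
    "date_of_birth": (0.97, 0.003),
    "sex": (0.99, 0.5),
}


def linkage_weight(rec_a, rec_b):
    """Sum of log2 agreement or disagreement ratios over all comparison fields."""
    weight = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            weight += math.log2(m / u)              # agreement raises the weight
        else:
            weight += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return weight


# Made-up records: a likely duplicate pair and a clearly different pair.
record = {"surname": "Tremblay", "given_name": "Anne", "date_of_birth": "1980-04-02", "sex": "F"}
likely_duplicate = dict(record)
different_person = {"surname": "Singh", "given_name": "Maya", "date_of_birth": "1975-11-19", "sex": "F"}

w_dup = linkage_weight(record, likely_duplicate)
w_diff = linkage_weight(record, different_person)
print(f"likely duplicate: {w_dup:.1f}, different person: {w_diff:.1f}")
```

Under this formulation, pairs that are true duplicates tend to agree on more fields and therefore receive larger weights, which is the pattern the evaluation looked for.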
Also, the additional cases of overcoverage that were identified during manual verification operations but were not on the final COS frame were evaluated to understand why they were not captured. There does not appear to be a systemic reason why the additional pairs of overcoverage were not on the COS frame. In general, these pairs were too different, meaning they had multiple typos, errors or too many differences in the fields used during the linkage processes, resulting in them not being on the COS frame. As is done for every cycle, the initial selection criteria and linkage rules will be reviewed, revised and tested before the next cycle.
The 2021 COS validation activities did not raise any concern about the methodology of the study, and the evaluation of the COS estimates showed results that were consistent with past results and with what was to be expected.
10.3 Population estimates
10.3.1 Error of closure
Statistics Canada’s DEP determines provincial and territorial population counts on Census Day by summing census population counts, estimates of census net undercoverage (CNU), and the population estimate for incompletely enumerated reserves and settlements. The DEP then extends these adjusted counts to July 1, 2021, and they become the base for postcensal population estimates.
When determining these adjusted counts, the DEP evaluates the quality of the postcensal estimates that it produced in the five‑year period preceding the census. The evaluation focuses on the difference between the postcensal estimates for Census Day and the adjusted population counts for this census. This difference is referred to as the error of closure. The detailed examination of this error is the main quality measure of the postcensal estimates.
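The two definitions above can be written compactly. The notation is introduced here for illustration only and is not used in the report.

$$
P^{\text{adj}}_j = C_j + \mathrm{CNU}_j + R_j, \qquad E_j = \hat{P}^{\text{post}}_j - P^{\text{adj}}_j
$$

For jurisdiction $j$, $C_j$ is the census population count, $\mathrm{CNU}_j$ the estimate of census net undercoverage, $R_j$ the population estimate for incompletely enumerated reserves and settlements, $\hat{P}^{\text{post}}_j$ the postcensal estimate for Census Day, and $E_j$ the error of closure. With this sign convention, $E_j > 0$ when the postcensal estimate exceeds the adjusted census count.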
Table 10.3.1 shows the errors of closure for 2006, 2011, 2016 and 2021 by province and territory, and for Canada. Note that a positive error of closure means that the postcensal population estimate is higher than the adjusted census count. At the national level, the error of closure for 2021 was -41,269 persons, for an error rate of -0.11%. The national population estimates therefore underestimated Canada’s population. The error rate in 2021 was lower than the rates for 2006 to 2016. Four provinces and one territory had errors of closure greater than 1% or less than -1% in 2021: Newfoundland and Labrador (-1.24%), Prince Edward Island (1.59%), Nova Scotia (-1.00%), Saskatchewan (1.06%), and the Northwest Territories (2.57%). By comparison, in 2016, five provinces and one territory had similar errors of closure. In 2021, eight provinces and one territory had smaller errors of closure (in absolute value terms) than in 2016.
Table 10.3.1
Source: Statistics Canada, Centre for Demography.

Provinces and territories | 2006 (number) | 2006 rate (%) | 2011 (number) | 2011 rate (%) | 2016 (number) | 2016 rate (%) | 2021 (number) | 2021 rate (%) |
---|---|---|---|---|---|---|---|---|
Canada | 39,409 | 0.12 | 158,558 | 0.46 | 120,044 | 0.33 | -41,269 | -0.11 |
Newfoundland and Labrador | -1,821 | -0.36 | -11,121 | -2.12 | 1,097 | 0.21 | -6,540 | -1.24 |
Prince Edward Island | -31 | -0.02 | 2,096 | 1.46 | 2,906 | 1.99 | 2,564 | 1.59 |
Nova Scotia | -3,997 | -0.43 | 5,075 | 0.54 | 7,395 | 0.79 | -9,944 | -1.00 |
New Brunswick | 2,673 | 0.36 | 1,432 | 0.19 | -5,992 | -0.79 | -317 | -0.04 |
Quebec | 19,776 | 0.26 | -23,207 | -0.29 | 89,035 | 1.08 | 33,890 | 0.40 |
Ontario | 24,532 | 0.19 | 121,217 | 0.92 | 68,329 | 0.49 | -43,978 | -0.30 |
Manitoba | -5,977 | -0.51 | 21,464 | 1.74 | 5,358 | 0.41 | -3,084 | -0.22 |
Saskatchewan | -3,691 | -0.37 | -7,779 | -0.73 | 12,492 | 1.10 | 12,402 | 1.06 |
Alberta | -50,869 | -1.49 | -3,345 | -0.09 | 43,891 | 1.05 | 2,013 | 0.05 |
British Columbia | 61,120 | 1.44 | 52,325 | 1.16 | -104,201 | -2.15 | -29,372 | -0.56 |
Yukon | -1,027 | -3.19 | 103 | 0.29 | -391 | -1.02 | 150 | 0.35 |
Northwest Territories | -857 | -1.99 | 758 | 1.74 | -47 | -0.11 | 1,146 | 2.57 |
Nunavut | -422 | -1.37 | -460 | -1.35 | 172 | 0.47 | -199 | -0.50 |
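
The counts of jurisdictions cited for Table 10.3.1 can be reproduced from the published rates. The sketch below works from the rounded rates in the table, so a rate shown as -1.00% (Nova Scotia) is treated as meeting the 1% threshold; that treatment is an assumption made here so the counts match the text.

```python
# Error-of-closure rates (%) from Table 10.3.1: (2016 rate, 2021 rate).
RATES = {
    "Newfoundland and Labrador": (0.21, -1.24),
    "Prince Edward Island": (1.99, 1.59),
    "Nova Scotia": (0.79, -1.00),
    "New Brunswick": (-0.79, -0.04),
    "Quebec": (1.08, 0.40),
    "Ontario": (0.49, -0.30),
    "Manitoba": (0.41, -0.22),
    "Saskatchewan": (1.10, 1.06),
    "Alberta": (1.05, 0.05),
    "British Columbia": (-2.15, -0.56),
    "Yukon": (-1.02, 0.35),
    "Northwest Territories": (-0.11, 2.57),
    "Nunavut": (0.47, -0.50),
}
TERRITORIES = {"Yukon", "Northwest Territories", "Nunavut"}


def split(names):
    """Return (number of provinces, number of territories) among the given names."""
    n_terr = sum(1 for n in names if n in TERRITORIES)
    return len(names) - n_terr, n_terr


# Jurisdictions at or beyond 1% in absolute value, using the rounded published rates.
large_2016 = [n for n, (r16, _) in RATES.items() if abs(r16) >= 1.0]
large_2021 = [n for n, (_, r21) in RATES.items() if abs(r21) >= 1.0]

# Jurisdictions whose 2021 rate is smaller in absolute value than their 2016 rate.
improved = [n for n, (r16, r21) in RATES.items() if abs(r21) < abs(r16)]

print("At least 1% in 2016 (provinces, territories):", split(large_2016))
print("At least 1% in 2021 (provinces, territories):", split(large_2021))
print("Smaller in 2021 than in 2016 (provinces, territories):", split(improved))
```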
10.3.2 Accuracy of postcensal estimates
For the purposes of producing the DEP estimates, the census coverage studies are used to adjust census counts for CNU. However, since these studies are based in part on sample surveys, the CNU results contain some statistical variability attributable to sampling. To determine whether the errors of closure discussed above are statistically significant, the standard error of the adjusted census counts must be taken into account. Moreover, since the 2016 adjusted census counts were used as the base population for the 2016 to 2021 postcensal estimates, a standard error that combines the statistical variability of the adjusted census counts for 2016 and 2021 was calculated for Canada and for each province and territory.
Table 10.3.2 shows the 2021 error of closure for Canada and the provinces and territories, the combined standard error of the 2016 and 2021 adjusted census counts, and the t‑value. The error of closure is statistically significant at a 95% confidence level for Newfoundland and Labrador, Prince Edward Island, Nova Scotia, Saskatchewan, and the Northwest Territories. For these jurisdictions, the variability attributable to sampling of the 2016 and 2021 adjusted census counts therefore does not explain the majority of the error of closure.
Table 10.3.2

Provinces and territories | Error of closure (number) | Combined standard error of the 2016 and 2021 adjusted censuses (number) | t-value |
---|---|---|---|
Canada | -41,269 | 49,096 | -0.84 |
Newfoundland and Labrador | -6,540 | 2,456 | -2.66 |
Prince Edward Island | 2,564 | 1,238 | 2.07 |
Nova Scotia | -9,944 | 4,372 | -2.27 |
New Brunswick | -317 | 3,542 | -0.09 |
Quebec | 33,890 | 23,487 | 1.44 |
Ontario | -43,978 | 38,965 | -1.13 |
Manitoba | -3,084 | 6,420 | -0.48 |
Saskatchewan | 12,402 | 5,646 | 2.20 |
Alberta | 2,013 | 16,334 | 0.12 |
British Columbia | -29,372 | 20,052 | -1.46 |
Yukon | 150 | 262 | 0.57 |
Northwest Territories | 1,146 | 320 | 3.58 |
Nunavut | -199 | 331 | -0.60 |
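
The t‑values in Table 10.3.2 are consistent with dividing the error of closure by the combined standard error. The report does not give the formula for the combined standard error; the second expression below is one plausible form, valid only under the assumption that the sampling errors of the 2016 and 2021 adjusted counts are independent. The notation is introduced here for illustration.

$$
t_j = \frac{E_j}{\mathrm{SE}^{\text{comb}}_j}, \qquad \mathrm{SE}^{\text{comb}}_j = \sqrt{\mathrm{SE}_{j,2016}^2 + \mathrm{SE}_{j,2021}^2} \quad \text{(assuming independence)}
$$

For example, for Newfoundland and Labrador, $t = -6{,}540 / 2{,}456 \approx -2.66$, which exceeds 1.96 in absolute value and is therefore significant at the 95% confidence level.
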
The components of demographic growth estimated by the DEP were compared with those from other sources, notably the CUS, to determine the components that could be more closely linked to the error of closure. This analysis focused on the five jurisdictions for which the error was statistically significant. Interprovincial migration, particularly that of recent immigrants, could explain part of the error of closure calculated for Prince Edward Island, Saskatchewan and the Northwest Territories. The impacts of the CUS sampling and several components of demographic growth could help to explain the error calculated for Newfoundland and Labrador, as well as for Nova Scotia. However, it is difficult to identify a primary factor for these two provinces. Lastly, emigration and the number of non-permanent residents generally remain demographic phenomena that are particularly difficult to measure.