Canadian population sample in 1911

CCRI Microdata

Microdata

The CCRI microdata were drawn by sampling the 1911 Canadian manuscript census, which was made public a few years ago. Confidential data from subsequent census enumerations are available through Statistics Canada's Research Data Centres.

The CCRI microdata facilitate research on individuals, families, households, and communities caught up in the complex transformation of Canadian society that took place during the first half of the twentieth century. Ultimately, these data are the raw materials from which census statistics can be produced. For example, it is possible to calculate the average number of months that Canadian children spent in school in 1910 (1911 census, Schedule 1, Question 33). Furthermore, this information can be obtained for a particular community or region and compared with information on children in other locations or from other social groups. More generally, the microdata make it possible to compare any population characteristic or variable at the household or individual level.
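As a rough illustration of the schooling example, the sketch below computes the average months in school for each region from a list of person records. The field names (`province`, `age`, `months_in_school`), the age range, and the records themselves are assumptions made for this sketch; they are not the CCRI variable names or data, and the mean is unweighted.

```python
from collections import defaultdict

def mean_months_in_school(records, group_key, min_age=5, max_age=14):
    """Return the unweighted mean of months in school for each group.

    The age range is an illustrative assumption, not a CCRI definition.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of months, count]
    for person in records:
        months = person.get("months_in_school")
        if months is None or not (min_age <= person["age"] <= max_age):
            continue  # skip blank answers and people outside the age range
        totals[person[group_key]][0] += months
        totals[person[group_key]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

# Made-up records for illustration only; these are not CCRI data.
records = [
    {"province": "Ontario", "age": 9, "months_in_school": 10},
    {"province": "Ontario", "age": 12, "months_in_school": 8},
    {"province": "Quebec", "age": 10, "months_in_school": 6},
    {"province": "Quebec", "age": 40, "months_in_school": None},
]

print(mean_months_in_school(records, "province"))
# {'Ontario': 9.0, 'Quebec': 6.0}
```

Passing a different `group_key` (for example a hypothetical community or occupation field) gives the same comparison across other locations or social groups.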

The Sample

The CCRI microdata are based on a five percent sample of the Canadian population as recorded in the 1911 census. The basic sample unit is the dwelling, as defined by the census. The sample includes all responses recorded on the population schedule for all individuals residing in each sampled dwelling. This approach allows for information to be analyzed at three different levels: (1) data on individuals, which are used by most researchers; (2) data on families or households; and (3) data on dwellings. Some dwellings housed more than one family or household; others (such as hospitals, asylums, orphanages, and work camps) housed a variety of mostly unrelated individuals.
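The three levels of analysis can be pictured as three groupings of the same person records. The following is a minimal sketch, assuming each record carries a dwelling identifier and a household identifier; the names `dwelling_id` and `household_id` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def group_by_level(records):
    """Group person records by dwelling and by household within dwelling."""
    dwellings = defaultdict(list)
    households = defaultdict(list)
    for person in records:
        dwellings[person["dwelling_id"]].append(person)
        # A household is identified within its dwelling, since one
        # dwelling may contain several families or households.
        households[(person["dwelling_id"], person["household_id"])].append(person)
    return dict(dwellings), dict(households)

# Made-up records: dwelling 1 holds two households, dwelling 2 holds one.
people = [
    {"dwelling_id": 1, "household_id": 1, "name": "A"},
    {"dwelling_id": 1, "household_id": 1, "name": "B"},
    {"dwelling_id": 1, "household_id": 2, "name": "C"},
    {"dwelling_id": 2, "household_id": 1, "name": "D"},
]

dwellings, households = group_by_level(people)
print(len(people), len(households), len(dwellings))
# 4 3 2
```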

The CCRI sample is made up of cluster samples of individual records, with the dwelling serving as the cluster. All individual records in each sampled dwelling were retained. For the purposes of each census, the Canadian population was subdivided into relatively small geographic areas. Each of these subdivisions was sampled separately, retaining the dwelling as the basic unit and ensuring that the number of dwellings in the sample remained proportional to the number of dwellings in the subdivision being sampled. This approach makes the sample more geographically representative than a simple random sample of the same size.
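The idea of sampling each subdivision separately, in proportion to its number of dwellings, can be sketched as follows. This is an illustration under stated assumptions rather than the CCRI procedure: the function name, the rounding rule, and the use of a simple random draw within each subdivision are all choices made for the example.

```python
import random

def sample_dwellings(dwellings_by_subdivision, rate=0.05, seed=None):
    """Draw a proportional sample of dwellings from each subdivision.

    Each subdivision is sampled on its own, so its share of the sample
    matches its share of all dwellings (up to rounding, an assumption here).
    """
    rng = random.Random(seed)
    sample = {}
    for subdivision, dwellings in dwellings_by_subdivision.items():
        n = round(rate * len(dwellings))
        # Sampling without replacement; a sampled dwelling would bring
        # all of its individual records with it.
        sample[subdivision] = rng.sample(dwellings, n)
    return sample

# Made-up subdivisions identified only by dwelling numbers.
frame = {
    "subdivision_a": list(range(40)),
    "subdivision_b": list(range(100)),
}

drawn = sample_dwellings(frame, rate=0.05, seed=1911)
print({name: len(chosen) for name, chosen in drawn.items()})
# {'subdivision_a': 2, 'subdivision_b': 5}
```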

For each census, the main sample covers smaller dwellings with no more than thirty residents. This number, chosen following a preliminary examination of the 1911 census, reflects the methodology used in producing other public use samples, including those of the IPUMS project in the United States. This cutoff also facilitates international comparability between Canadian and American databases.

Larger dwellings with thirty-one or more residents, including institutions and a variety of other types of collective dwellings, were oversampled. A complete list of these larger dwellings was prepared, and they were separated into two main categories: larger dwellings with multiple units, such as apartment buildings, and larger dwellings that operated as a single unit (mainly institutions, hospitals, shelters, orphanages, and work camps). Larger multifamily dwellings were sampled at a rate of twenty-five percent: within each such dwelling, all individuals in every fourth randomly selected family (or household) were included in the sample. In larger single-unit dwellings, every tenth unrelated resident was included in the sample after a random start, for a sampling rate of ten percent.
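The two within-dwelling rules can be sketched as systematic selection with a random start. The text states a random start only for single-unit dwellings; applying the same mechanism to families in multifamily dwellings is an assumption of this sketch, as are the function names and the made-up data.

```python
import random

def systematic_sample(units, interval, rng):
    """Keep every `interval`-th unit after a random start."""
    start = rng.randrange(interval)  # random start in 0..interval-1
    return units[start::interval]

def sample_multifamily(households, rng):
    """Keep all individuals in every fourth household (25 percent)."""
    chosen = systematic_sample(households, 4, rng)
    return [person for household in chosen for person in household]

def sample_single_unit(residents, rng):
    """Keep every tenth resident (10 percent)."""
    return systematic_sample(residents, 10, rng)

rng = random.Random(1911)

# Made-up apartment building: 8 households of 5 people each.
apartment = [[f"h{h}p{p}" for p in range(5)] for h in range(8)]
# Made-up institution: 120 unrelated residents.
institution = [f"resident{i}" for i in range(120)]

print(len(sample_multifamily(apartment, rng)))    # 10 people (2 households)
print(len(sample_single_unit(institution, rng)))  # 12 residents
```

When the number of units is an exact multiple of the interval, as in these examples, the sample size does not depend on the random start; otherwise it can differ by one.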