Geographic Data
The Geographic Context
Given Canada's highly differentiated geography, the CCRI aims to provide the data, information, and tools that users need to grasp the spatial dimension of their research. Special efforts have been made to reconstitute the 1911 census geography, to locate (georeference) each sampled dwelling or household, to create the geographic files needed to map and process the sampled microdata, and to allow data to be moved between statistical and mapping software.
Census Geography: Two Types of Boundaries
There are two types of census boundaries: those used for census-taking and those used for compiling and disseminating results. Users need to understand both types in order to interpret and analyze the microdata correctly.
Census-taking Boundaries
Census-taking boundaries were based on electoral boundaries. Each province was divided into “Census Districts” (CDs), which corresponded to federal electoral districts. The CDs were divided into “Enumeration Areas” (EAs), which corresponded to polling districts. The EA was the smallest spatial unit used during the enumeration itself; it was the area “walked” by an enumerator, encompassing a population of between 800 and 1200 individuals. These two units (CDs and EAs) were noted at the top of the forms used to collect information on individuals. Census-taking boundaries were also used during the archival preservation and microfilming of the manuscript census.
Compilation and Dissemination Boundaries
Census returns were compiled and disseminated (through published documents) using somewhat different boundaries. In this case, provinces were divided into “Census Divisions” (CDs), which corresponded to supra-municipal entities (such as counties). In turn, each CD was divided into multiple “Census Subdivisions” (CSDs), which corresponded to units of local government (such as municipalities and local improvement districts). In regions where no such units of local government existed, cadastral units (such as townships) were used. Note that the abbreviation CD denotes a Census District in the census-taking context and a Census Division in the compilation and dissemination context. Users tend to be more familiar with compilation and dissemination boundaries than with census-taking boundaries. Furthermore, compilation and dissemination boundaries are the ones used by the CCRI to situate the microdata and aggregate data, as well as to produce the geographic files necessary to map and process those data.

When considering the geographic component of CCRI data, users must therefore keep these two sets of boundaries in mind. When analyzing the infrastructure's microdata and aggregate data, users should use only the geographic descriptors provided by the CCRI.
Reconstituting the Geographic Boundaries of Census Divisions (CDs) and Census Subdivisions (CSDs)
The boundaries of CDs and CSDs were reconstituted using GIS software. Statistics Canada’s geographic files from the 2001 census were used as a starting point. The boundaries of these present-day polygons were altered to create polygons that correspond to the census boundaries used during each individual census. Each of the resulting historical census polygons was assigned a “CCRI Geography Unique Identifier” (CCRIUID).

Since the available census records do not include detailed maps of CSDs (with a few exceptions, dating from the end of the period of study), electoral maps were used as the primary geographic data source. However, because local administrative units were established at different times in different provinces, other sources often had to be used. This is especially true of the Prairie Provinces, British Columbia, and Nova Scotia. When the available sources did not provide reliable information, hexagonal placeholder polygons were used to represent CSDs whose boundaries are unknown.
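As an illustration of what a hexagonal placeholder polygon amounts to geometrically, the following minimal Python sketch computes the six vertices of a regular hexagon around a point. The function name, the centre point, and the radius are all assumptions made for the example; the source does not describe how the CCRI placeholders were sized or positioned.

```python
import math


def hexagon(cx, cy, radius):
    """Return the six vertices of a regular hexagon centred on (cx, cy).

    Hypothetical helper: the centre and radius are illustrative inputs,
    not values documented by the CCRI.
    """
    return [
        (cx + radius * math.cos(math.radians(60 * i)),
         cy + radius * math.sin(math.radians(60 * i)))
        for i in range(6)
    ]


# Example: a placeholder of radius 5 around an arbitrary point.
placeholder = hexagon(100.0, 200.0, 5.0)
```

The vertices are returned in counter-clockwise order starting from the point directly east of the centre; a GIS package would typically close the ring by repeating the first vertex.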
Georeferencing Individuals in the Microdata Samples
Every individual appearing in the microdata samples has been georeferenced (geocoded): each has been assigned a unique geographic identifier (CCRIUID), which situates them within their CSD. In order to do this, correspondence tables were created linking the geographic units used for the collection of census data (CDs and EAs) with the geographic units used in census reports and the published aggregate tables (CDs and CSDs).

Geocoding dwellings was sometimes complicated, especially when dwellings located within a single EA had to be matched with different CSDs. In such cases, each dwelling identified in the manuscript census was matched individually to a CSD. The process was further complicated by the fact that the political boundaries used to establish census boundaries differed from province to province and changed over time.
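The matching logic described above can be sketched as a two-level lookup: an EA-level correspondence table, plus dwelling-level entries for EAs that span more than one CSD. This is only an illustrative Python sketch. The table layout, the function name, and every code and identifier in it are invented, and the actual structure of the CCRI correspondence tables may differ.

```python
# Hypothetical correspondence table:
# (census district, enumeration area) -> CCRIUID of the matching CSD.
# All codes and identifiers are invented for illustration.
EA_TO_CSD = {
    ("D100", "EA01"): "UID_A",
    ("D100", "EA02"): "UID_B",
}

# Hypothetical dwelling-level matches for an EA that straddles two CSDs:
# (census district, enumeration area, dwelling number) -> CCRIUID.
DWELLING_TO_CSD = {
    ("D100", "EA03", 1): "UID_B",
    ("D100", "EA03", 2): "UID_C",
}


def geocode(district, ea, dwelling):
    """Return the CCRIUID for a dwelling, or None if no match is recorded."""
    # A dwelling-level match takes precedence over the EA-level match.
    match = DWELLING_TO_CSD.get((district, ea, dwelling))
    if match is not None:
        return match
    return EA_TO_CSD.get((district, ea))
```

For instance, `geocode("D100", "EA01", 5)` resolves through the EA-level table, while `geocode("D100", "EA03", 2)` resolves through the dwelling-level table.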

This way of organizing the data and the corresponding files provides CCRI users with a geographic approach to census information. Users can select microdata according to spatial criteria and generate new aggregated data. Moreover, the infrastructure's geographic component allows for spatial analysis of data using GIS software. For example, it is possible to measure the representativeness of the sample at various levels, or to organize data by census subdivisions according to different criteria: census geographic units, urban/rural classification, or community type. It is even possible to select a CSD and then select individuals, households, or dwellings within it based on other geographic criteria. Users can then analyze the data within these aggregations or selections in order to produce a cartographic rendering (thematic map) of a particular variable or of the results of their analysis.
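As a rough illustration of selecting microdata by spatial criteria and generating new aggregated data, the Python sketch below counts sampled individuals per CSD, first without a filter and then restricted to rural adults. The record layout, field names, and values are hypothetical and are not taken from the CCRI files.

```python
from collections import Counter

# Hypothetical sampled records; field names and values are invented.
records = [
    {"ccriuid": "UID_A", "urban_rural": "urban", "age": 34},
    {"ccriuid": "UID_A", "urban_rural": "urban", "age": 7},
    {"ccriuid": "UID_B", "urban_rural": "rural", "age": 52},
    {"ccriuid": "UID_C", "urban_rural": "rural", "age": 19},
]


def count_by(rows, key, predicate=lambda row: True):
    """Count the rows that satisfy predicate, grouped by the value of key."""
    return Counter(row[key] for row in rows if predicate(row))


# Sampled individuals per CSD.
per_csd = count_by(records, "ccriuid")

# Sampled rural adults (18 or older) per CSD.
rural_adults = count_by(
    records,
    "ccriuid",
    lambda row: row["urban_rural"] == "rural" and row["age"] >= 18,
)
```

Counts keyed by CCRIUID in this way could then be joined to the CSD polygons in GIS software to produce a thematic map.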