Recruiting Footprints 2: Scattered Thoughts

In my previous post, I talked about some maps that I made using Leaflet, a JavaScript library, to map the enlistment footprints of four battalions in the Canadian Expeditionary Force. The goal of the maps was to compare the enlistment patterns of ethnically-defined battalions, such as the 223rd (Canadian Scandinavian) Battalion and the 233rd (Canadiens-Français du Nord-Ouest) Battalion, against the enlistment patterns of local battalions raised in the same city at the same time. While the maps produced some very pleasing visualisations that showed how much further recruits travelled to join an ethnically-distinct battalion, I cautioned that the lines in these maps only approximate soldiers’ residences: they do not lead back to anyone’s historical address. I should explain why that is.

History is Written by the Vectors

As I thought about how to create a map that would accurately display the recruiting footprints of different units, I was confronted by the limitations of displaying data on a map. Generally, maps convey information using three kinds of vectors: points, lines, or polygons. Previous blog posts explored mapping variations in patriotic contributions using different-sized points and mapping the disparities in provincial enlistment rates using polygons. Mapping with polygons did not seem appropriate for this project, so that left me with lines and points.

My first attempt was to map individual enlistments using points, varying the size of the point to reflect the number of soldiers who enlisted from a particular town or city. I used QGIS, the old open-source go-to for rudimentary GIS work. The results looked ok:

Enlistments of the 233rd (Canadiens-Français du Nord-Ouest) Battalion in QGIS.

Certainly, the above map shows that the 233rd drew recruits from all over Western Canada. We can see that Edmonton – where the largest point is located – provided the greatest number of recruits, which makes sense because the battalion was headquartered in Edmonton. But the map lacks dynamism. Lines would be nicer, because they connote the movement and distance recruits travelled to reach their place of enlistment. To create a map using lines, I turned to Gephi, which is usually used for network visualisation but can display relational data and draw lines between georeferenced points. This map turned out a little better:

233rd (Canadiens-Français du Nord-Ouest) Battalion in Gephi.

Like the first map, this one shows where the 233rd drew its recruits from. The lines give a nice sense of where they travelled, and the thickness of each line reflects the number of recruits who originated from the same town or city. This map does not display soldiers who lived in Edmonton before enlistment, however, which is a problem because we know from the first map that the largest proportion of recruits who enlisted with the 233rd lived in Edmonton. This group of recruits disappeared because Gephi drew a line between a soldier’s place of residence and their place of enlistment. Soldiers who both lived and enlisted in Edmonton are not represented because a line cannot appear between two overlapping points.

This was the challenge of the data collected from attestation papers: they record a city or a town as a soldier’s place of residence, not their personal address. So if a few dozen soldiers all resided in Prince Albert, Saskatchewan, their points of origin are all represented by the same set of geographical coordinates.

233rd-coords

I figured the best thing to do would be to use a random number generator to alter the coordinates representing identical locations. Rather than have one point at the geographical centre of Prince Albert to represent all the soldiers who originated from that town, there would be a few dozen points scattered within a 10-kilometre radius of Prince Albert. The next step would be to draw a line from each of these scattered points to the soldiers’ common place of enlistment – Edmonton, in this case.
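For anyone who wants to try the scattering step, here is a minimal sketch in Python. It is not the script I used. The `scatter` function, the kilometres-per-degree conversion, the seed, and the rounded coordinates for Prince Albert are all assumptions made for illustration.

```python
import math
import random

KM_PER_DEG_LAT = 111.32  # rough length of one degree of latitude, in km


def scatter(lat, lon, radius_km=10.0, rng=random):
    """Return a random point within radius_km of (lat, lon)."""
    # Taking the square root spreads points evenly over the disc
    # instead of bunching them near the centre.
    distance = radius_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = distance * math.cos(bearing) / KM_PER_DEG_LAT
    # A degree of longitude shrinks with latitude, so scale by cos(lat).
    dlon = distance * math.sin(bearing) / (
        KM_PER_DEG_LAT * math.cos(math.radians(lat))
    )
    return lat + dlat, lon + dlon


# Approximate coordinates for Prince Albert, Saskatchewan (an assumption).
PRINCE_ALBERT = (53.20, -105.75)

rng = random.Random(233)  # fixed seed so the scatter is repeatable
residences = [scatter(*PRINCE_ALBERT, rng=rng) for _ in range(36)]

for lat, lon in residences[:3]:
    print(f"{lat:.4f}, {lon:.4f}")
```

The flat-earth conversion from kilometres to degrees is crude, but over a 10-kilometre radius the error is far smaller than the uncertainty already baked into a town-level place of residence.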

Dropping Leaflet

Plotting the coordinates to draw hundreds of individual lines seemed tedious, which is one of the reasons I chose to make these maps in Leaflet. Leaflet is designed to create interactive, mobile-friendly maps – which are appealing qualities – but most importantly, a Leaflet map is written into a simple HTML file, which is easy to manipulate from the command line. Once I figured out how to use a random number generator to “scatter” identical coordinates, I used sed (a stream editor) to paste these coordinates into an HTML file that would plot the hundreds of lines that produced the final version of the maps:

cef-data-scattered

The two sets of coordinates represent the approximate place of residence (on the left) and the place of enlistment (on the right). Notice how each set of coordinates for the place of residence varies by a few one-hundredths of a degree.
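The same paste-the-coordinates-into-HTML idea can be sketched in Python rather than sed. This swaps in string formatting for the stream editor, so treat it as an illustration and not a record of my workflow. `L.polyline(...).addTo(...)` is Leaflet's standard way of drawing a line; the helper name `polyline_js`, the JavaScript variable name `map`, and the sample coordinates are made up for the example.

```python
def polyline_js(residence, enlistment, map_var="map"):
    """Format one residence-to-enlistment line as a Leaflet statement."""
    (r_lat, r_lon), (e_lat, e_lon) = residence, enlistment
    return (
        f"L.polyline([[{r_lat:.4f}, {r_lon:.4f}], "
        f"[{e_lat:.4f}, {e_lon:.4f}]]).addTo({map_var});"
    )


# Approximate coordinates for Edmonton (an assumption).
EDMONTON = (53.5461, -113.4938)

# Two made-up scattered residences near Prince Albert, for illustration only.
scattered = [(53.2312, -105.7104), (53.1789, -105.8020)]

# One line of JavaScript per soldier, ready to drop into the map's <script>.
statements = [polyline_js(point, EDMONTON) for point in scattered]
print("\n".join(statements))
```

Each printed statement would sit inside the script block of the HTML file, after the map object has been created.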

Scattered Plots

The resulting maps were pretty striking, much more so than the first two attempts that just used points or weighted lines. By scattering the points and drawing lines between all of the scattered points, I managed to create the desired effect: showing how enlistment rates varied by place while also preserving the movement implied by a line. The maps look great and certainly make a point about the origins of soldiers in different battalions, but these visualisations rely on approximated data. None of the soldiers actually lived where the points are located (probably). But then again, if I hadn’t scattered these points, the soldiers’ place of residence would be represented by a dot located in the geographical centre of their town or city – also a rough approximation of where the soldiers lived. So it all comes down to the same challenge of working with approximate data. But the point of these visualisations was to illustrate the broader patterns of enlistment, not the individual places of residence of each soldier.

Because the original data does not include individual addresses, scattering points doesn’t seem like a big deal. It’s not a big deal when plotting addresses in the Canadian prairies, but the maps get a little less credible when repeating this in coastal communities. When I tried this with a battalion raised in Halifax, some of the points ended up in the Atlantic Ocean.

Still some work to do…
