Playing with Fusion

PLEASE NOTE: Google has announced that it will shut down Fusion Tables in December 2019. Errors may appear in the embedded visualizations as early as August 2019.

At the moment, I’m writing a chapter on voluntary contributions from Indigenous communities in Australia, Canada, and New Zealand during the First World War. Any discussion of First Nations enlistments in Canada is bound to include a discussion of the 114th (Brock’s Rangers) Battalion, raised in Haldimand County in Southwestern Ontario. The 114th included two companies of First Nations soldiers, recruited mainly from the Six Nations reserve that neighboured Haldimand County. The 114th had to recruit more broadly to fill the ranks of its two First Nations companies, and sent recruiting missions as far away as Saskatchewan and Quebec. I thought it might be possible to map the origins of each member of the 114th Battalion using the information on the CEF embarkation rolls, and then use that map to determine how many First Nations soldiers were drawn from the Six Nations and how many were recruited elsewhere.

I recently discovered the CEF Study Group has built a database of digitized CEF embarkation rolls which is freely accessible. I found the embarkation roll for the 114th, which lists every member of the unit before it sailed overseas and includes such vital information as the address of their Next of Kin, their country of birth, and the place where they joined the battalion. I figured that it would be relatively easy to determine where a soldier was from based on the address of their next of kin, thinking that in most cases this would be the address of a parent, spouse, or sibling.

The embarkation rolls can be downloaded as PDF files, but these have not been OCR’d, so it took some work to make them machine-readable. I used Adobe Pro to OCR the text, then used the Linux command line to separate the text layer from the PDF into a .txt file so I could clean up the OCR. One of the difficulties of these documents is the use of dots between each column, which creates a pretty messy text file. After a number of ‘sed’ commands, and much less time than it would take a human being to type out all 672 rows of the embarkation roll, I had a .csv file that was in relatively good shape.
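For anyone who would rather script this step in Python than in sed, the dot-leader cleanup might look something like the sketch below. It is not the set of commands I actually ran: the sample lines and the column order are invented for illustration, and it assumes that columns are separated by runs of two or more consecutive dots and that real values never contain such a run.

```python
import csv
import io
import re

# A run of two or more consecutive dots, with any surrounding spaces,
# is treated as a column break. Assumption: real values never contain "..".
DOT_LEADER = re.compile(r"\s*\.{2,}\s*")


def split_row(line):
    """Split one OCR'd line on its dot leaders and drop empty fields."""
    fields = DOT_LEADER.split(line.strip())
    return [f.strip() for f in fields if f.strip()]


def to_csv(lines):
    """Turn an iterable of OCR'd lines into CSV text, skipping blank lines."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        row = split_row(line)
        if row:
            writer.writerow(row)
    return out.getvalue()


# Hypothetical sample lines, not taken from the real embarkation roll.
sample = [
    "Private .... Smith, John ...... Mrs. Mary Smith .. Ohsweken, Ont. ..... Canada",
    "",
    "Captain ... Doe, Richard .... Mr. A. Doe ... Dunnville, Ont. .. England",
]
print(to_csv(sample))
```

A single full stop, as in ‘Mrs.’ or ‘Ont.’, is left alone because the pattern only fires on two or more dots in a row. OCR output with spaced-out dots (‘. . . .’) would need a looser pattern.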

When I opened up the .csv file in Excel, it was clear that the OCR still wasn’t perfect and that there was a lot of work still to be done in cleaning up the data, so I used OpenRefine to clean it up further. OpenRefine is a great tool for making sure that identical content is labelled consistently throughout a spreadsheet. Because of some dicey OCR, the rank of ‘Captain,’ for example, appeared as ‘Capn,’ ‘Captan,’ ‘Captam,’ and ‘Captmn.’ OpenRefine clusters similar items in the same column so that any unintentional variations or errors can be fixed.
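To give a rough idea of what this kind of clean-up involves, here is a small Python sketch that snaps misspelled ranks to the nearest entry in a list of known ranks. It uses the standard library’s difflib as a stand-in and is not OpenRefine’s own clustering algorithm; the list of canonical ranks and the 0.7 similarity cutoff are my assumptions.

```python
import difflib

# Assumed list of "correct" ranks; extend it to suit the roll being cleaned.
CANONICAL_RANKS = ["Private", "Corporal", "Sergeant", "Lieutenant", "Captain", "Major"]


def normalise_rank(value, cutoff=0.7):
    """Return the closest canonical rank, or the value unchanged if none is close."""
    matches = difflib.get_close_matches(value, CANONICAL_RANKS, n=1, cutoff=cutoff)
    return matches[0] if matches else value


# The OCR variants mentioned above, plus one value that should be left alone.
for raw in ["Capn", "Captan", "Captam", "Captmn", "Xyz"]:
    print(raw, "->", normalise_rank(raw))
```

Unlike OpenRefine, which shows the proposed clusters and lets a person approve each merge, this sketch rewrites values automatically, so a cutoff that is too low could silently merge ranks that really are different.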

Once the embarkation roll had been cleaned up, I imported it into Google Fusion Tables. Fusion Tables combines a number of tools in the Google suite so that data can be easily turned into maps and charts. Because I wanted to map out the origin of each soldier in the 114th, I tried mapping the values in the ‘Next of Kin’s Address’ column.

The result showed that the soldiers of the 114th came from all over, or at least that a lot of them had next of kin located all over the place. The trouble with Google Maps is that when it geocodes individual locations from a spreadsheet, such as the dots above, each location is represented by only one dot, regardless of how many times it appears in the spreadsheet. Fifty soldiers in this spreadsheet reported the address of their next of kin only as ‘Ohsweken,’ but Google places only one dot over the town of Ohsweken. Ohsweken and Erzerum, Armenia, where one soldier’s next of kin resided, receive equal representation. The resulting map shows that a few soldiers’ next of kin resided in Saskatchewan, around the border between Ontario and Quebec, and on Manitoulin Island. These are all regions visited by Charles Cooke, a member of the Six Nations who worked as a clerk for the Department of Indian Affairs and was dispatched as a recruiting officer for the 114th in early 1916. So the map offers a few dots that may represent First Nations soldiers who were not members of the Six Nations but travelled to Southwestern Ontario to enlist with the 114th. That’s a start, but without additional information it will be difficult to determine with certainty whether any of these recruits were actually First Nations, and whether any of the other dots on the map represent First Nations recruits who were not from the Six Nations.
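One way around the one-dot-per-place problem is to count how often each address appears before mapping anything. The Python sketch below shows the idea; the address list is a made-up sample rather than the real roll, and the function name is my own.

```python
from collections import Counter

# Hypothetical next-of-kin addresses, one entry per soldier (not the real roll).
nok_addresses = [
    "Ohsweken",
    "Ohsweken",
    "Dunnville",
    "Ohsweken ",  # stray whitespace, as often left behind by OCR
    "Erzerum, Armenia",
    "Dunnville",
    "",  # blank cell
]


def soldiers_per_location(addresses):
    """Count how many rows share each address, ignoring blanks and stray spaces."""
    cleaned = (a.strip() for a in addresses if a and a.strip())
    return Counter(cleaned)


counts = soldiers_per_location(nok_addresses)
for place, n in counts.most_common():
    print(f"{place}: {n}")
```

The resulting counts could then be used as weights, so that a place named fifty times does not look the same as a place named once.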

The embarkation roll also lists where a soldier was taken on strength. This data can also be mapped, but there are far fewer locations. Again, each dot can represent one or one thousand soldiers, but Fusion Tables can make a heat map that reflects the frequency with which each location appears in the spreadsheet.

TOS heatmap

The heat map on the right (which cannot be embedded from Fusion Tables) shows glowing red dots over Cayuga and Dunnville, the two locations where most soldiers joined the 114th. 171 soldiers were taken on strength in Cayuga and 144 at Dunnville. Fusion Tables can also build charts, particularly based on relational data, which is useful for seeing larger patterns. Because the embarkation roll tells us where a soldier (or his family) is from and where he enlisted, it is possible to chart the relationship between the two.

The resulting chart is difficult to read when all 353 NOK addresses are visible, so I scaled it back to show about half of those addresses. Blue dots represent the next of kin’s address and yellow dots represent the location where a soldier enlisted. The chart shows, not surprisingly, that most soldiers enlisted close to where their next of kin lived. There is a strong correlation between soldiers whose next of kin lived in Dunnville and soldiers who enlisted in Dunnville, and the same can be seen for Ohsweken and Caledonia. These charts are best for identifying broad patterns, but it can be difficult to find the exceptions to the rule. On the far right, we see a dot representing a soldier whose next of kin lived in Oka, Quebec and who was taken on strength with the 114th at Camp Borden. Near the top right corner, we see at least one recruit whose next of kin lived in Caughnawaga, Quebec and who also joined the 114th at Camp Borden. Oka and Caughnawaga are both reserves visited by Charles Cooke during his recruiting missions, and because these soldiers joined the battalion after it had arrived in camp, it suggests that these were indeed late additions meant to bolster the still understrength First Nations companies of the 114th. This approach didn’t provide an exact number of First Nations recruits who were not of the Six Nations, but at least it confirmed the accounts of the unit’s recruitment and may provide a method for examining other battalions with less detailed records of their recruiting patterns.
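Since the exceptions are hard to pick out by eye, they could also be pulled straight from the spreadsheet. The Python sketch below counts each pairing of next-of-kin place and enlistment place, then lists the pairings where the two differ. The rows are invented examples, and the sketch assumes both columns have already been reduced to comparable place names.

```python
from collections import Counter

# Hypothetical (next-of-kin place, taken-on-strength place) pairs, one per soldier.
rows = [
    ("Dunnville", "Dunnville"),
    ("Dunnville", "Dunnville"),
    ("Ohsweken", "Ohsweken"),
    ("Oka, Quebec", "Camp Borden"),
    ("Caughnawaga, Quebec", "Camp Borden"),
    ("Caledonia", "Caledonia"),
]


def pair_counts(rows):
    """Count how many soldiers share each (next of kin, enlistment) pairing."""
    return Counter(rows)


def exceptions(rows):
    """Return the distinct pairings where the two places differ, sorted by name."""
    return sorted(
        pair for pair in set(rows) if pair[0].casefold() != pair[1].casefold()
    )


print(pair_counts(rows).most_common(1))
for nok, tos in exceptions(rows):
    print(f"{nok} -> {tos}")
```

A plain equality check like this would over-report on raw data, where a next-of-kin address such as ‘Dunnville, Ont.’ would not match an enlistment place recorded as ‘Dunnville’, so the place names would need to be normalised first.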

After that, I just played around with the data a little more.

Because the embarkation rolls provide the date on which a soldier was taken on strength, it’s possible to show the relationship between location and time of enlistment. This didn’t appear very clearly on the map, but showed up better on a chart.

Yellow dots represent the date a soldier was taken on strength and blue dots represent the location. The chart shows that a significant number of soldiers were taken on strength in late March 1916. The largest yellow dot represents March 27, 1916, when 102 soldiers were taken on strength, mostly in Dunnville, while another sizeable dot shows 81 soldiers taken on strength in Hagersville. 79 soldiers were sworn in on March 24 (although you can’t read it on the above chart), mostly at Ohsweken on the Six Nations reserve. While the last chart showed us that a lot of soldiers enlisted close to where they lived, this one shows us that a lot of soldiers were taken on strength on the same day. This could suggest that recruiting intensified in late March and that successive recruiting drives in neighbouring towns were able to bring the battalion closer to its authorized strength.

The last piece of data I played with was the soldiers’ country of birth. While it’s easy to present this in percentages or turn those percentages into a pie graph, I tried to display this information on the map. After a lot of googling, I was able to find KML files for national boundaries and use those to create an intensity map to show where the soldiers of the 114th were born.
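The step that matters for an intensity map is matching each country name in the spreadsheet to a named boundary in the KML file. The Python sketch below shows only that join, not how Fusion Tables handles it internally. The tiny KML document stands in for a real boundaries file (which would also carry polygon coordinates), and the birthplace list and function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by KML 2.2 documents.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# A tiny stand-in for a national-boundaries KML file (polygons omitted).
SAMPLE_KML = """
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark><name>Canada</name></Placemark>
    <Placemark><name>United Kingdom</name></Placemark>
    <Placemark><name>Italy</name></Placemark>
  </Document>
</kml>
"""


def placemark_names(kml_text):
    """Return the name of every Placemark in the KML text."""
    root = ET.fromstring(kml_text.strip())
    names = root.iterfind(".//kml:Placemark/kml:name", KML_NS)
    return [el.text.strip() for el in names if el.text]


def birth_counts_by_boundary(kml_text, birthplaces):
    """Pair each boundary with a soldier count; report labels with no boundary."""
    counts = Counter(birthplaces)
    names = placemark_names(kml_text)
    matched = {name: counts.get(name, 0) for name in names}
    unmatched = sorted(set(counts) - set(names))
    return matched, unmatched


# Hypothetical country-of-birth column, not the real roll.
births = ["Canada", "Canada", "United Kingdom", "Italy", "England"]
matched, unmatched = birth_counts_by_boundary(SAMPLE_KML, births)
print(matched)
print(unmatched)
```

The unmatched list is the useful part: a label such as ‘England’ will not line up with a boundary named ‘United Kingdom’ until one of the two is relabelled.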

The majority of soldiers in the 114th were born in Canada (68%), more than a quarter were from the UK (28%), a handful were from the US, with a few from Italy, one from Russia, and one from Armenia. I’ll probably try this again with a battalion from the prairies, where there was a higher proportion of immigrants, to see how the distribution changes.

All this playing around didn’t answer my original question, but having written more of my chapter since setting out on this tangent, I found that I don’t really need the answer anyway. Cleaning and crunching this data only took up a few hours of my time, which shows how much can be done with these tools. This has been a good opportunity to learn some new tricks with Fusion Tables. Making intensity maps is definitely something I want to do more of in the future, but Fusion Tables might not be the most efficient way to do it. Still, it’s a good introduction.

Further reading:
The 114th Battalion features prominently in Tim Winegard’s two studies of Indigenous Peoples in the First World War: Indigenous Peoples of the British Dominions and the First World War and For King and Kanata. Katherine McGowan and Whitney Lackenbauer wrote a chapter about recruiting for the 114th in Aboriginal Peoples and the Canadian Military. Robert J. Talbot discusses Charles Cooke’s recruiting efforts as part of his article on First Nations’ ambivalence to the war effort. Alison Norman has written a chapter on women’s patriotic work on the Six Nations in A Sisterhood of Suffering and Service. John Moses and Evan Habkirk have written Honours and Master’s dissertations on the Six Nations, which can be found on ProQuest. The Great War Centenary Association has created a website commemorating the participation of Brantford, Brant County, and the Six Nations in the First World War.
