This tutorial will walk you through some basics of using QGIS for mapping and spatial analysis. It was last updated in the Fall of 2020 for my HIST 696 Clio Wired graduate class.
Download QGIS from https://qgis.org – you’ll want the most recent “long term release” which is more stable than the “latest available” release. Currently, that’s 3.10.9 (though I am very much looking forward to eventually installing 3.14 Pi!)
Follow the download instructions to install the software. (NOTE: Mac users may have to go to System Preferences and open the Security & Privacy settings to open the file after download if their security settings won’t allow them to open applications from “unknown” AKA open-source developers.)
Start by downloading the sample datasets from Basecamp. Expand the zip file and open the resulting folder to find three subfolders, each with a series of linked files including a .shp or shapefile, and a csv. Put them somewhere they can live for the duration of the project – if you load them into QGIS and then change their location on your computer, you’ll break the connection between the files and QGIS and have to relink them to continue working on the project.
NOTE: Windows users can sometimes “access” files inside a zip file without expanding the zip file. Don’t do this; it will break things. Make sure you expand/uncompress the zip file first and work with the extracted folders/files.
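If you prefer to script the extraction rather than do it by hand, here is a minimal sketch using only Python's standard library. The file and folder names (sample_datasets.zip, qgis_project_data) are placeholders I made up, not the actual names of the Basecamp download.

```python
import zipfile
from pathlib import Path


def extract_datasets(zip_path, dest_dir):
    """Fully extract a zip archive and return the shapefiles found inside."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # The shapefiles sit in subfolders, so search recursively.
    return sorted(dest.rglob("*.shp"))


if __name__ == "__main__":
    # Hypothetical paths: replace with wherever you saved the download.
    zip_file = Path("sample_datasets.zip")
    if zip_file.exists():
        for shp in extract_datasets(zip_file, "qgis_project_data"):
            print(shp)
```

Pick the destination folder carefully: it is where the files will need to stay for the life of the project.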
Open QGIS and start a new project by clicking on the “New Empty Project” template. The standard coordinate system used in the default project template is EPSG:4326 – WGS 84 but there are numerous other coordinate systems you can use. If you are working in a geographically limited area, you will likely want to use a coordinate system specifically designed for that part of the world. For example, when working on the British Isles, you might use the British National Grid. QGIS will help you translate between coordinate systems as needed but if you ever load two sets of shapefiles (or other data) and find what should be the same spot is in two different places, it’s likely a coordinate system issue.
In this case, if your shapefiles load but then vanish, try using View > Zoom to Layer to find where they were placed on your map. This is another instance where you may accidentally be using a different coordinate system. You can check coordinate systems by using Layer > Layer Properties > Source for each layer and Project > Properties > CRS for the entire QGIS file.
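To see why a coordinate system mismatch moves things around, it helps to look at the numbers. The sketch below converts a longitude/latitude pair (EPSG:4326) into Web Mercator metres (EPSG:3857) using the standard spherical Mercator formula. Web Mercator is my choice for illustration only; it is not used elsewhere in this tutorial, and in practice QGIS does this conversion for you.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by Web Mercator


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Convert EPSG:4326 degrees to EPSG:3857 metres (spherical Mercator)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Approximate coordinates for central London, in degrees.
lon, lat = -0.1276, 51.5072
x, y = wgs84_to_web_mercator(lon, lat)
print(f"EPSG:4326 -> ({lon}, {lat})")
print(f"EPSG:3857 -> ({x:.1f}, {y:.1f})")
```

The same spot is described by small numbers in degrees and by numbers in the thousands or millions in metres. If a layer's values are read as the wrong system, it is drawn far from where it belongs, which is why it seems to vanish.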
Uploading Pre-Existing Data
Using the menu at the top bar of your computer, navigate to Layer > Add Layer > Add Vector Layer. (You can add points, lines, or polygons this way, but we’ll be working with polygons. Rasterized/continuous data is different/more complicated so we’re sticking with vector layers here.) This will open the Data Source Manager where you will see the words “Vector Dataset(s)”, a blank menu bar, and a small button with “…” on it. Click the “…” button to bring up a window that will let you navigate to the location of the files you downloaded to your computer. For this exercise, find the Early Modern London Parishes folder and look inside that folder. You should see something like the image below. The six Lon_Par files are linked and need to be kept in the same folder together to function correctly. The one you want to select for upload is the .shp (shapefile).
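If a shapefile refuses to load, one quick check is whether its linked files are all still sitting next to it. Here is a small sketch that lists the files sharing a shapefile's base name and reports any of the three companions the shapefile format requires (.shp, .shx, .dbf). The path in the example is a placeholder; adjust it to wherever you put the folder.

```python
from pathlib import Path

# The shapefile format needs these three files; others (.prj, .cpg, ...) are optional.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def companion_files(shp_path):
    """Return every file in the same folder that shares the shapefile's base name."""
    shp = Path(shp_path)
    return sorted(
        p for p in shp.parent.iterdir() if p.is_file() and p.stem == shp.stem
    )


def missing_required(shp_path):
    """Return the required extensions that are not present next to the .shp."""
    present = {p.suffix.lower() for p in companion_files(shp_path)}
    return sorted(REQUIRED_EXTENSIONS - present)


if __name__ == "__main__":
    # Hypothetical location of the London parishes shapefile.
    shapefile = Path("qgis_project_data/Lon_Par.shp")
    if shapefile.exists():
        for f in companion_files(shapefile):
            print(f.name)
        print("Missing:", missing_required(shapefile) or "nothing")
```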
Take a minute to examine the results. You should see a series of colored polygons, one for each of early modern London’s parishes, with a blank curvy line running through them (the River Thames). If you’ve accidentally uploaded it more than once, you’ll see multiple layers, each with their own color, and can check and uncheck the boxes next to their names in the layer list to see them. You can also drag-and-drop them to reorder them and put different layers on top. No matter how many layers you’ve got, hover your mouse over the icons on the top of the map to see what each one does. Try to zoom in/out and pan around the map.
Next navigate to Layer > Open Attribute Table to look at the spreadsheet (in database terms this is a “table”) of “attribute” data attached to each shape in the layer. It is possible to edit this data directly in QGIS but we’re not going to attempt that here. Instead, just note the column names and the type of information about each of the London parishes that appears to be in each column.
Next we’re going to add a different type of data to QGIS – a spreadsheet or CSV. Navigate to Layer > Add Layer > Add Delimited Text Layer to upload the CSV. If this data had x/y coordinates, you would click the button that says “Point coordinates” and QGIS would put a series of dots (points) on the map. In this case, you need to select “No geometry (attribute only table)” because the CSV is just a list of London parishes with the order in which the parishes were first infected with plague during the 1665 Great Plague of London. You won’t see anything change on the map once it’s uploaded, though you will see it on the bottom left-hand side of the screen in the list of layers you’ve added to QGIS.
NOTE for Mac users: if, when you are trying to add a delimited text layer, it previews with all the data as “headers,” that is because QGIS does not recognize Mac line breaks. While the file provided has been set with proper line breaks, opening it with Excel or another program on your Mac may change the line breaks. The easiest way to fix the line break problem is to download the free text editor BBEdit (don’t worry about buying access to the premium features; the free version has everything you’ll want/need even if you want to use it for programming). Open the file you want to add to QGIS in BBEdit and look at the menu bar at the bottom of the page, where it will say either Unix (LF) or Legacy Mac OS (CR). Click on that menu item and change it to Windows (CRLF). Save and it should be good to go.
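If you would rather not install another editor, the same line break fix can be sketched in a few lines of Python. This is an alternative I am offering, not part of the BBEdit route above, and the function names are my own. It rewrites the file in place, so work on a copy.

```python
from pathlib import Path


def to_crlf(data: bytes) -> bytes:
    """Convert CR, LF, or mixed line breaks to Windows-style CRLF."""
    # Collapse every style to LF first so existing CRLF pairs are not doubled.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def convert_file(path):
    """Rewrite a file in place with CRLF line breaks."""
    target = Path(path)
    target.write_bytes(to_crlf(target.read_bytes()))
```

Working on raw bytes leaves the rest of the file, including any accented characters in parish names, untouched.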
Next we’re going to “join” the delimited text layer to our vector layer. “Join” is a database term for combining data from two tables by matching values in a column they share. There are databases underlying pretty much everything we’re doing here with QGIS.
When you have a shapefile layer selected (don’t do this with the text layer selected!), navigate to Layer > Layer Properties. This will give you numerous visualization options, including the option to join your spreadsheet to your shapefile. Click the sidebar option for Joins then click the plus button to create a new join. “Join layer” is the name of the spreadsheet you want to join. “Join field” is the name of the column in the spreadsheet that corresponds to a column in the shapefile’s attribute table/spreadsheet. “Target field” is that matching column in the shapefile’s attribute table/spreadsheet. Click OK to finish creating the join. Then in the Layer Properties window click Apply to apply your changes and OK to exit back to the map.
Navigate back to the attribute table. How is it different from before you did the join?
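If it helps to see the idea outside of QGIS, here is a rough sketch of what a join does, written in plain Python. The column names and values are invented for illustration and are not taken from the actual datasets; the function is my own simplification, not QGIS's implementation.

```python
def join_tables(target_rows, join_rows, target_field, join_field, prefix=""):
    """Add the join table's columns to each target row with a matching key.

    Target rows without a match keep their own data and get None in the
    added columns.
    """
    lookup = {row[join_field]: row for row in join_rows}
    first_row = join_rows[0] if join_rows else {}
    extra_columns = [c for c in first_row if c != join_field]
    joined = []
    for row in target_rows:
        match = lookup.get(row[target_field], {})
        merged = dict(row)
        for column in extra_columns:
            merged[prefix + column] = match.get(column)
        joined.append(merged)
    return joined


# Made-up rows standing in for the shapefile's attribute table ...
parishes = [
    {"NAME": "Parish A", "AREA": 12.5},
    {"NAME": "Parish B", "AREA": 3.1},
]
# ... and for the attribute-only CSV layer.
plague = [{"parish": "Parish A", "week": 4}]

for row in join_tables(parishes, plague, "NAME", "parish", prefix="plague_"):
    print(row)
```

Here "NAME" plays the role of the target field and "parish" the join field. Every shape keeps its row, and the CSV's columns are added alongside the original ones wherever the two fields match.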
More Layer Properties
Now that we’ve added some additional data to our basic shapefile, we’re going to make a few stylistic changes to the map. These are all done through Layer > Layer Properties.
First, go to Labels and select “Single labels” to bring up the label options. Use the value field to select an attribute to “label with.” I recommend the parish names as the most logical labeling for this map, but you can choose any attribute that makes sense to you.
Next, and more interestingly, go to Symbology. This is where you can change the color of the shapes on your map from being a single color to being different colors based on attributes. Here, I’ve changed the option to Graduated and asked it to sort the numbers in the infection spread week column into five buckets or “classes.” You can change the number of classes in the lower right-hand corner. You have to click the Classify button to update the number of classes. You can also use the menu right above the Classify button to change how the options are sorted into those classes.
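To make the idea of “classes” concrete, here is a minimal sketch of one way to sort numbers into buckets: equal intervals, where each class covers the same span of values. This is only one of the modes available in that menu, the function names are my own, and QGIS may treat values that fall exactly on a boundary differently than this sketch does.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class when the range is split evenly."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes
    return [lowest + width * i for i in range(1, n_classes + 1)]


def classify(value, breaks):
    """Return the index of the first class whose upper bound covers the value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    # Guard against floating-point rounding at the very top of the range.
    return len(breaks) - 1


# Made-up week numbers, not the real plague data.
weeks = list(range(1, 22))
breaks = equal_interval_breaks(weeks, 5)
print(breaks)
print([classify(w, breaks) for w in weeks])
```

Changing the number of classes or the method moves the boundaries, and so changes which parishes share a color. That is exactly the kind of decision that shapes the argument your map makes.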
If you don’t like the colors chosen, click on the triangle next to the color ramp and change colors, invert the colors, etc. You can also double-click on a color square to manually change an individual color. You can also change how the legend displays (though adding a legend is a separate step, under the Legend item in the sidebar). Once you like your options, click Apply to apply them to the map, then OK to return to the map.
Take a few minutes to experiment with the different ways you can color the map. How do your decisions change the “argument” being conveyed by the map?
We’ve only just scratched the surface of what QGIS can do in terms of visualizing and analyzing spatial data, but I’m going to stop here to avoid overwhelming you. Next, I want you to start a new QGIS project, upload the Fairfax County elections shapefiles, and try out some options until you decide how you want to visualize the data. Now stop to consider: what choices did you make? And what argument are you (visually) making with those choices?
- Programming Historian: https://programminghistorian.org/en/lessons/?topic=mapping
- QGIS documentation: https://www.qgis.org/en/docs/index.html including a training manual: https://docs.qgis.org/3.10/en/docs/