[This post was originally published on my blog]
Today I’ve decided to expand my murder rate map to include every city with 100,000+ people.
In order to do that using the FBI data (which only includes the names of the cities), I need to find the longitude and latitude for each city in my data set and add them as new columns. This was not a big deal for the previous case, when I had 35 cities, but now my data set includes over 400, so I obviously won’t be looking them up by hand.
Here is one way of doing it using Python:
First, you need to create a free account on OpenCage Geocoder, an API that can be used to look up the coordinates of a place, and also to find out which place a set of coordinates corresponds to. You can use any API you want, really; I just picked this one for simplicity and convenience. You will then get YOUR_API_KEY, which you need to use every time you make a request for a location. You also need to install and import the corresponding Python package, opencage (here is a tutorial in case you want more info).
Let’s start simple, by looking up the coordinates of one single place. As an example, I’m gonna use Bijuesca, the village in Spain where I grew up, because it is awesome.
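A single lookup could look roughly like the sketch below. It uses the `OpenCageGeocode` class and its `geocode` method from the opencage package; the helper names `make_geocoder` and `look_up` are my own, made up for illustration, and the query string "Bijuesca, Spain" is just one reasonable way to phrase the search.

```python
def make_geocoder(api_key):
    """Build an OpenCage client from your API key."""
    # Imported inside the function so the rest of the snippet still loads
    # on a machine where the opencage package is not installed yet.
    from opencage.geocoder import OpenCageGeocode
    return OpenCageGeocode(api_key)


def look_up(geocoder, query):
    """Return the list of candidate matches for a free-text place query."""
    return geocoder.geocode(query)


# Usage (needs a real key and network access, so it is left commented out):
# geocoder = make_geocoder("YOUR_API_KEY")
# results = look_up(geocoder, "Bijuesca, Spain")
```

Keeping the client separate from the lookup means the same `geocoder` object can be reused for every request later on.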
The ‘results’ variable has a lot more information than we need right now:
but you can access the fields that hold the coordinates the same way you would with a Python dictionary:
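As a minimal sketch: the geocoder returns a list of matches, each one a dictionary, and the coordinates sit under the `geometry` key as `lat` and `lng`. The helper name `first_coordinates` is hypothetical, and the `sample` response is made up with placeholder numbers purely to show the shape, not real data.

```python
def first_coordinates(results):
    """Return (lat, lng) of the top match, or None if nothing was found."""
    if not results:
        return None
    geometry = results[0]["geometry"]
    return geometry["lat"], geometry["lng"]


# Made-up response with placeholder numbers, only to show the structure
sample = [{"formatted": "Some place", "geometry": {"lat": 1.5, "lng": -2.5}}]
print(first_coordinates(sample))
```

With a real response you would call `first_coordinates(results)` in exactly the same way.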
Which are Bijuesca’s coordinates!
Ok, so now we are ready to get the coordinates for all the cities in my data set, which looks like this:
As the simplest, not-most-efficient approach, I am going to iterate over each row to get the city and state, then use the API to get the corresponding coordinates. I’ll save longitudes and latitudes in two separate lists. Then I can add these two lists as new columns once I’m done:
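Here is a rough sketch of that loop. The column names `City` and `State`, the new column names `Latitude` and `Longitude`, the ", USA" suffix on the query, and the `pause` between requests are all assumptions of mine; adjust them to match your own data set and API plan. Cities with no match get `None` so the lists stay aligned with the rows.

```python
import time


def add_coordinates(df, geocoder, city_col="City", state_col="State", pause=1.0):
    """Geocode every row of df and attach Latitude/Longitude columns."""
    latitudes, longitudes = [], []
    for _, row in df.iterrows():
        # Build a query such as "Some City, Some State, USA"
        query = f"{row[city_col]}, {row[state_col]}, USA"
        results = geocoder.geocode(query)
        if results:
            geometry = results[0]["geometry"]
            latitudes.append(geometry["lat"])
            longitudes.append(geometry["lng"])
        else:
            # Keep a placeholder so both lists stay as long as the dataframe
            latitudes.append(None)
            longitudes.append(None)
        # Wait between requests in case your plan limits the request rate
        time.sleep(pause)
    df["Latitude"] = latitudes
    df["Longitude"] = longitudes
    return df


# Usage (assuming df is your pandas dataframe and geocoder an OpenCage client):
# df = add_coordinates(df, geocoder)
```

Iterating row by row is slow for big tables, but for a few hundred cities the API calls dominate the running time anyway.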
Here we have our dataframe with the newly added columns: