William & Mary undergrads create one of world’s largest open, free databases of political administrative boundaries
Editor’s Note: This article originally appeared on the Cloudera Foundation website. In April 2021, the Cloudera Foundation merged with the Patrick J. McGovern Foundation.
by Sydney Fuhrig, Joshua Panganiban, Sylvia Shea
June 22, 2020
Students track boundaries of 199 entities
How does an NGO best allocate limited vaccines in the Democratic Republic of the Congo? What is the best way to distribute aid at the local level after a natural disaster? Subnational administrative boundaries are essential to answering these questions. While geospatial data acts as the backbone of various disciplines and fields of research, there have been few groups that seek to compile and clean global subnational boundaries for free and open use.
A database of important lines
The geoBoundaries Global Administrative Database is an online, openly licensed resource of the geographic boundaries of political administrative divisions (e.g., state, parish, barangay). In contrast to other resources, geoBoundaries:
- Provides detailed information on the legal open license for every boundary in the repository, and
- Focuses on provisioning highly precise boundary data to support accurate, replicable scientific inquiry.
The geoBoundaries team currently tracks and updates the boundaries of 199 total entities, including all 195 United Nations member states, Greenland, Taiwan, San Marino, and Kosovo. All boundaries are available to view or download in common file formats (e.g., shapefiles, GeoJSON), allowing geoBoundaries to be integrated into large-scale computational workflows.
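As a rough illustration of what integrating a boundary file into a computational workflow can look like, the sketch below parses a small GeoJSON document with Python's standard library and computes a bounding box for each polygon. The sample data, the `name` property, and the helper functions are all hypothetical stand-ins; actual geoBoundaries files may use different property names and geometry types.

```python
import json

# Hypothetical, hand-made sample standing in for a downloaded boundary file.
# Real files may use other property names and MultiPolygon geometries.
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "District A"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0.0, 0.0], [2.0, 0.0], [2.0, 1.0],
                                   [0.0, 1.0], [0.0, 0.0]]]}},
    {"type": "Feature",
     "properties": {"name": "District B"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[2.0, 0.0], [3.0, 0.0], [3.0, 2.0],
                                   [2.0, 2.0], [2.0, 0.0]]]}}
  ]
}
"""


def bounding_box(polygon_coords):
    """Return (min_x, min_y, max_x, max_y) for a Polygon's exterior ring."""
    exterior = polygon_coords[0]  # first ring is the exterior in GeoJSON
    xs = [point[0] for point in exterior]
    ys = [point[1] for point in exterior]
    return (min(xs), min(ys), max(xs), max(ys))


def summarize(geojson_text):
    """Map each Polygon feature's name to its bounding box."""
    collection = json.loads(geojson_text)
    return {
        feature["properties"]["name"]: bounding_box(feature["geometry"]["coordinates"])
        for feature in collection["features"]
        if feature["geometry"]["type"] == "Polygon"
    }


boxes = summarize(SAMPLE_GEOJSON)
for name, box in boxes.items():
    print(name, box)
```

In practice, a dedicated geospatial library would likely replace the hand-written helpers, but the structure of the file (a collection of features, each with properties and a geometry) stays the same.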
The first major challenge for this project lies in the data collection. A majority of countries do not provide an open digital source for their boundaries, so researchers scour the internet for resources in unconventional places, contact owners of maps, or digitize the boundaries themselves. Furthermore, some boundaries are changeable and a bit messy. Nations may share a contested border or decide to update their administrative hierarchy.
To keep up with the dynamic nature of these seemingly static lines, geoBoundaries uses each country’s own definition of its borders, even if that results in cross-country overlap, so that every boundary reflects how the individual nation represents itself. The boundaries are updated annually, and previous versions of the repository remain available online.
Students, engagement, teamwork
Nearly all geoBoundaries work is completed by a diverse team of undergraduate students. The project began in 2017 with Leigh Seitz, advised by Dan Runfola, assistant professor of Applied Science at William & Mary. In fall 2017, she and another researcher recruited eight other students to the Geospatial Evaluation and Observation lab, or geoLab. She built an initial methodology and developed the first training. After Seitz graduated, leadership transitioned to Lauren Hobbs, who solidified and enhanced geoBoundaries methodologies and trainings while pioneering external partnerships with the International Fund for Agricultural Development (IFAD) and the Center for International Earth Science Information Network (CIESIN). Leadership later passed to Joshua Panganiban, who has continued to expand external partnerships, improve team structure, and grow the team from six to nearly 20 undergraduate researchers. The geoBoundaries Team will continue in fall 2020 under new directors, Sydney Fuhrig and Sylvia Shea.
GeoBoundaries’ researchers come from a variety of disciplines ranging from data science to geology, history, anthropology, and accounting. They are united by their passion for developing geospatial skills and contributing to research. With the release of geoBoundaries 2.0, nearly half of the team’s members are published undergraduates, and students are motivated by seeing the impact and results of their work every year.
Teamwork is essential to fostering exchanges of ideas and skills. The students collaborate in finding boundaries, and older members are ready to help newer members at any time. Students are also encouraged to get creative and come up with their own projects or lead a small working group with other interested students within geoLab. Most recently, a COVID response group was put together after the outbreak to build data visualization dashboards and conduct other research on mobility. All while working remotely!
In addition to geoBoundaries, there are three other teams within geoLab: geoData, geoDev, and geoParsing.
While geoBoundaries’ central mission is to grow and refine the database, geoLab partners with external organizations whose values and goals align with geoLab’s. These collaborations are rooted in geospatial data but span a variety of fields. A collaboration with CIESIN and GRID3 (Geo-Referenced Infrastructure and Demographic Data for Development) yielded health care boundaries in the Democratic Republic of the Congo. Other partners include the Global Environment Facility, IFAD, World Resources Institute, and Global Forest Watch, with more to come. The push for diverse partners has expanded access to open-source industry-specific boundaries.
Moving forward, geoBoundaries faces two key challenges: refining its database and expanding its network of external partners.
Currently, the boundaries within the database carry various types of licenses, which can be confusing and, in some cases, can hinder the use of geoBoundaries data. The goal for the upcoming release of the geoBoundaries 4.0 dataset is to bring all of the data under a single, harmonized open license. With all boundaries under a uniform and open license, any individual or organization would be able to use this data within any field for any purpose. To reach this ambitious goal, the team will update existing boundary licenses when possible or find entirely new boundary files if needed. The team has grown significantly and further refined its methodologies to reach this goal.
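To make the harmonization task concrete, here is a minimal sketch of how per-boundary license metadata could be tallied and checked against a target license. Everything in it is assumed for illustration: the records, the field names, the level codes, and the target license are made up and do not describe the actual geoBoundaries metadata or the team's chosen license.

```python
from collections import Counter

# Made-up metadata records for illustration only; real field names,
# country codes, level codes, and license strings may differ.
records = [
    {"country": "AAA", "level": "ADM1", "license": "CC BY 4.0"},
    {"country": "BBB", "level": "ADM1", "license": "ODbL 1.0"},
    {"country": "CCC", "level": "ADM2", "license": "CC BY 4.0"},
    {"country": "DDD", "level": "ADM1", "license": "Custom, non-commercial"},
]

# Hypothetical target; the article does not name the harmonized license.
TARGET_LICENSE = "CC BY 4.0"


def license_counts(rows):
    """Count how many boundary files fall under each license."""
    return Counter(row["license"] for row in rows)


def needs_attention(rows, target):
    """List (country, level) pairs whose license differs from the target."""
    return [(row["country"], row["level"]) for row in rows if row["license"] != target]


counts = license_counts(records)
todo = needs_attention(records, TARGET_LICENSE)
print(counts)
print(todo)
```

Each entry in `todo` corresponds to a boundary that would need either a relicensed source or a replacement file, mirroring the two options described above.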
The geoBoundaries team is eagerly seeking additional external partnerships. To ensure that students are continuously engaged and gain real-world experience, geoBoundaries researchers provide GIS services for a wide variety of partners. Companies and organizations interested in collaborating are encouraged to contact email@example.com.
Click here to access our data. For more detailed information about the geoBoundaries methodology, visit our recent publication. Feel free to email us at firstname.lastname@example.org. The project is produced and maintained by the geoBoundaries Team within William & Mary’s Data Science research lab, geoLab. Find the geoLab on Facebook, Twitter, Instagram, and LinkedIn.
Sydney Fuhrig is an incoming geoBoundaries Team Director who plans to graduate from the College of William & Mary in May 2021 with a double major in history and environmental science.
Joshua Panganiban is the current geoBoundaries Team Director and a recent graduate from William & Mary, with a Bachelor’s in Business Administration in accounting and a secondary major in environmental data science.
Sylvia Shea is an incoming geoBoundaries Team Director and is an undergraduate student at William & Mary, with a major in data science and GIS and a minor in marketing.