I recently became interested in preventing genocide with data. I think this is not an easy thing to do. I undertook to identify data sources that might be relevant, and thanks to many wonderful people, I can present the following results!
#1. Karen Payne’s “GIS Data Repositories”, “Conflict” section.
Karen has assembled a phenomenal collection of references to datasets, curated within a publicly accessible Google spreadsheet. I’m sure many of the other things I’ll mention are also included in her excellent collection!
This list of data repositories was compiled by Karen Payne of the University of Georgia’s Information Technologies Outreach Services, with funding provided by USAID, to point to free downloadable primary geographic datasets that may be useful in international humanitarian response. The repositories are grouped according to the tabs at the bottom.
#2. GDELT
Kalev Leetaru of Georgetown University is super helpful and runs this neat data munging effort. There is a lot of data available. The GDELT Event Database uses CAMEO codes; in this scheme, there is a code “203: Engage in ethnic cleansing”. There’s also the Global Knowledge Graph (GKG), which may be better for identifying genocide, because one can identify “Material Conflict events that are connected to the genocide theme in the GKG.”
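To make the CAMEO idea concrete, here is a minimal sketch of pulling out the “203” events from a tab-delimited table of events. It is only an illustration under some assumptions: the sample rows are made up, the helper name `filter_by_event_code` is hypothetical, and the sample carries a header row with an `EventCode` column for readability (a real export may need its column names supplied separately).

```python
import csv
import io

# CAMEO code quoted above: "203: Engage in ethnic cleansing".
ETHNIC_CLEANSING = "203"


def filter_by_event_code(rows, code=ETHNIC_CLEANSING):
    """Return the rows whose EventCode equals `code`.

    Codes are compared as strings, not integers, so that codes with
    leading zeros (for example "042") are never mangled.
    """
    return [row for row in rows if row.get("EventCode", "").strip() == code]


# Hypothetical three-row sample, NOT real GDELT data.
SAMPLE = (
    "GLOBALEVENTID\tSQLDATE\tEventCode\n"
    "1\t20140701\t203\n"
    "2\t20140701\t042\n"
    "3\t20140702\t190\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
matches = filter_by_event_code(rows)
match_ids = [row["GLOBALEVENTID"] for row in matches]
print(match_ids)
```

Keeping the codes as strings is the one design choice worth copying; everything else here would change depending on how you actually load the data.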
#3. The Humanitarian Data Exchange
This new project seems very promising – Javier Teran was very helpful in describing what’s currently available: “datasets on refugees, asylum seekers and other people of concern in our HDX repository that may be useful for your research”. By the time you read this, there may be even more genocide-related data!
#4. Uppsala Conflict Data Program (UCDP)
The Uppsala Conflict Data Program (UCDP) offers a number of datasets on organised violence and peacemaking, all of which can be downloaded for free.
#5. USHMM / Crisis in Darfur
#6. AAAS Geospatial Technology and Human Rights
The American Association for the Advancement of Science has a collection of research related to Geospatial Technology and Human Rights. Start reading!
I haven’t looked into what data they might have and make available, but it seems like a relevant organization.
USAID and Humanity United ran a group of competitions in 2013 broadly around fighting atrocities against civilians. You can read about it via PR Newswire and Fast Company. I found the modeling challenge particularly interesting – it was hosted by TopCoder, as I understand it, and the winners came up with some interesting approaches for predicting atrocities with existing data.
This is a tip I haven’t followed up on, but it could be good:
Hi, I would reach out to Jonne Catshoek of elva.org, they have an awesome platform and body of work that is really unappreciated. They also have a very unique working relationship with the nation of Georgia that could serve as a model for other work.
#10. The CrisisMappers community
“The humanitarian technology network” – this group is full of experts in the field, organizes the International Conference of Crisis Mappers, and has an active and helpful Google Group. The group is closed membership but welcoming; connecting there is how I found many of the resources here. Thanks CrisisMappers!
#11. CrisisNET
CrisisNET finds, formats and exposes crisis data in a simple, intuitive structure that’s accessible anywhere. Now developers, journalists and analysts can skip the days of tedious data processing and get to work in minutes with only a few lines of code.
Examples of what you can do with it:
- Tracking War in Gaza using Facebook
- Mapping Refugees and Fighting in Iraq with UNHCR and Social Media
- Tracking Syrian Barrel Bombs
Tutorials and documentation on how to do things with it:
- Get Crisis Data with Python and pandas
- Making large CrisisNET requests with Python
- Export CrisisNET data to CSV with Python
- Functional Programming in Python for Crisis Data Bliss
- Choropleth Maps with D3
- API documentation
#12. PITF
All I know is that PITF could be some sort of relevant dataset; I haven’t had time to investigate.
I’ll post this on my blog, where it’s easy to leave comments with additions, corrections, and so on without knowing git/GitHub, but the “official” version of this document will live on GitHub and any updates will be made there. Document license is Creative Commons Share-Alike, let’s say.
- Thanks of course to everyone who provided help with the resources they’re involved with providing and curating – I tried to give this kind of credit as much as possible above!
- Special thanks to Sandra Moscoso and Johannes Kiess of the World Bank for providing pointers to number 2 and more!
- Special thanks to Max Richman of GeoPoll for providing numbers 4, 5, 6, and 7.
- Special thanks to Minhchau “MC” Dinh of USAID for extra help with number 8!
- Number 9 was provided via email; all I have is an email address, and I thought people probably wouldn’t want their email addresses listed. Thanks, person!
- Special thanks to Patrick Meier of iRevolution for connecting me first to number 10!