London – As we approach a year since the global outpouring of support for Black Lives Matter that followed the death of George Floyd in the United States, it bears remembering the form that racism takes in the modern world. It is a racism that is not only a matter of individual prejudice, but of systems of power designed—intentionally or otherwise—to exploit, suppress and dehumanize.
It is a racism that is not an abstract, but a tangible, material reality for large parts of humanity. And it is a racism that permeates human creation, from borders to cities to algorithms.
To many, the advent of sophisticated data science methods like artificial intelligence (AI) seems to hold the promise of removing the harmful racial biases that humans bring to decision-making and analysis. If race is a social construct, shouldn’t a computer be able to see past prejudice and make fair decisions?
The assumption that technology will naturally reduce discrimination and identify inequalities that humans would otherwise miss is a common one, but it carries dangerous consequences for those working with data science tools.
Without conscious intervention, it should be assumed that bias—racial or otherwise—has been introduced in the development process by humans subject to their own social and individual constructs. As such, combatting racial discrimination in emerging humanitarian technologies requires planning, proactive identification and active monitoring.
For IOM’s Displacement Tracking Matrix (DTM), data science tools used responsibly hold the promise of improving aid delivery. These tools are useful in reducing the burden of repetitive, labour-intensive decisions with low levels of risk. If well planned to reduce human biases, advanced data science can expand the capacity of humanitarian organizations in ways that align with or even heighten the principles of neutrality and universality.
Some of these tools are already established in the humanitarian sector. AI systems can be used to predict drought and food insecurity, for example, or reunite separated families using facial recognition tools. These technologies can also help combat discrimination by identifying unequal outcomes that human analysts cannot see.
Yet the data science methods that form the foundation of technological humanitarian advances can still introduce bias into projects. In recent years, AI systems have been developed in unregulated contexts and used in experimental cases. The possibilities of technological solutions are explored and sometimes even exploited in advance of regulatory legal frameworks, impacting decisions as serious as the determination of refugee status in national immigration systems.
These issues are disquieting not only for national or local authorities, but also for humanitarians who do not consider in advance the possibly biased effects—individual and social—of the profound decisions they make. The vulnerability of people experiencing humanitarian crises leaves them uniquely unable to evaluate and appeal decisions that affect them.
AI tools, for example, are often trained on data that is non-representative—both of use-case populations and of humanitarian contexts—leading to unexpected limitations. Mis-identification of non-white faces, for instance, has been widely reported in AI facial recognition systems. Perhaps more fundamentally, the information that any data science tool uses in its development reflects conditions in a world full of prejudice and the real material inequalities that prejudice has created. If not used carefully, data science can reinforce biases and, what’s worse, wrongfully suggest that biases are natural.
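To make the representation problem concrete, here is a minimal, self-contained sketch with entirely invented data: a deliberately simple threshold "model" is fitted on a training set in which one group is heavily over-represented, and its error rate is then measured per group. The group names, distributions and decision rule are all illustrative assumptions, not any real system.

```python
import random

random.seed(42)

def make_samples(group, n):
    # Hypothetical data: the true decision boundary differs by group
    # (x > 0.0 for group A, x > 1.0 for group B), standing in for the
    # distribution shift between well- and under-represented populations.
    boundary = 0.0 if group == "A" else 1.0
    return [(x, int(x > boundary), group)
            for x in (random.gauss(boundary, 1.0) for _ in range(n))]

# Training set mirrors a skewed collection effort: 90% group A, 10% group B.
train = make_samples("A", 900) + make_samples("B", 100)

# "Model": the single threshold that minimises training error.
# Because group A dominates the data, it lands near group A's boundary.
candidates = [x for x, _, _ in train]
threshold = min(
    candidates,
    key=lambda t: sum(int(x > t) != y for x, y, _ in train),
)

def error_rate(samples):
    return sum(int(x > threshold) != y for x, y, _ in samples) / len(samples)

test_a = make_samples("A", 1000)
test_b = make_samples("B", 1000)
print(f"learned threshold: {threshold:.2f}")
print(f"error rate, group A: {error_rate(test_a):.1%}")
print(f"error rate, group B: {error_rate(test_b):.1%}")
```

The model performs well overall and well on the majority group, while the under-represented group absorbs nearly all of the errors—an aggregate accuracy figure alone would hide this.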
AI poses some of the most pressing challenges in identifying and eliminating bias. Where traditional algorithms reflect human-created rules, AI relies on the ability of computers to recognize patterns and create their own expansive and ever-changing sets of rules. After studying a training data set, an AI may find millions of factors that minutely improve its analytical power.
As a result, AI systems may suffer from the so-called “black-box” problem, wherein the inputs and processes of the system are opaque to human observers. A person affected by the output of such a system may be entirely unable to appeal because it is impossible—even to the developer—to identify the process that led to it. So, while the effect of racialized action might be the same, it becomes far more difficult to identify the bias.
Simply eliminating data on race from data sets is no guarantee that the outcome of data processing won’t be biased, and, in some cases, might make identification of bias more difficult. An AI will in most cases be able to reach similar conclusions using proxies for race, such as address, religion or other culturally determined identifying factors. As such, eliminating data on race can simply obscure bias without reducing it at all, making it more likely to pass undetected.
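A toy illustration of this proxy effect, again on invented data: the "race" field is dropped before modelling, and a simple score is learned from residential district alone. Because district correlates with race in this simulated world (a stand-in for historical segregation), the "race-blind" scores still split along racial lines. All field names, groups and rates here are assumptions for illustration.

```python
import random
from collections import defaultdict

random.seed(7)

def make_records(n):
    records = []
    for _ in range(n):
        race = random.choice(["X", "Y"])
        # Segregated districts: group X mostly in D1, group Y mostly in D2.
        if race == "X":
            district = "D1" if random.random() < 0.9 else "D2"
        else:
            district = "D2" if random.random() < 0.9 else "D1"
        # Historical outcomes already encode discrimination against group Y.
        approved = random.random() < (0.8 if race == "X" else 0.3)
        records.append({"race": race, "district": district, "approved": approved})
    return records

train = make_records(10_000)

# "Race-blind" model: approval rate per district, learned without the race field.
by_district = defaultdict(list)
for r in train:
    by_district[r["district"]].append(r["approved"])
model = {d: sum(v) / len(v) for d, v in by_district.items()}

def mean_score(records, race):
    scores = [model[r["district"]] for r in records if r["race"] == race]
    return sum(scores) / len(scores)

print(f"mean predicted approval, group X: {mean_score(train, 'X'):.2f}")
print(f"mean predicted approval, group Y: {mean_score(train, 'Y'):.2f}")
```

The model never sees race, yet its scores reproduce the historical disparity—and, with the race field gone, an auditor has no direct way to measure that gap.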
Using race data carries risks, and so does omitting it—this paradox explains why no simple technical guidance can suffice to eliminate racial discrimination from technology. Race and other discriminatory factors should not be relevant in society, but racism and its cumulative consequences are real. Addressing racism through unbiased technological improvement is not possible without actively considering and working to redress the reality of racism more broadly.
Fortunately, as data science tools are developed and adopted by humanitarian actors, useful protection frameworks have begun to emerge, including Harvard’s Signal Code, the IASC Operational Guidance on Data Responsibility in Humanitarian Action, and UNDP and Global Pulse’s Guide to Data Innovation for Development.
For bold, new data science projects to work, they must be guided by a robust process for assessing and minimizing biases. Because bias can be introduced into data science projects in innumerable ways, the Humanitarian Data Science and Ethics Group (DSEG) — of which DTM is a coordinator — has concluded that static guidance cannot sufficiently address all forms of bias.
Rather, a dynamic, interactive process for assessing all the ways that bias could be introduced allows project designers at DTM and throughout the humanitarian sector to ensure that they have considered and made their best conscious efforts to eliminate bias.
This process is codified in DSEG’s Framework for the Ethical Use of Advanced Data Science in the Humanitarian Sector and the resulting Decision Tree, which challenge project designers to consider whether they have identified potentially discriminatory issues, such as historical bias in data collection, lack of representation in training data, and the diversity of the project’s team. Both formalize the process—rather than the outcome—of bias checks, so the system is appropriate for new and emerging tools. They rely on an entire project team’s knowledge of their project, the context in which it will be conducted, and the technology it will use to ensure that the guidance is always relevant.
Applying these frameworks to the development of data science projects can help practitioners decide on an appropriate accountability framework, test their model for bias, or even decide that, in certain cases, a process cannot be safely automated.
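As one concrete shape such a bias test can take, here is a sketch of a simple disparity check: the selection-rate ratio across groups, sometimes screened against the "four-fifths rule" threshold of 0.8. The function name and the audit data are hypothetical, and a low ratio is a prompt to investigate, not a verdict of discrimination.

```python
def disparate_impact(outcomes):
    """Selection rate per group and the ratio of the lowest to the highest.

    `outcomes` maps a group label to a list of binary decisions
    (1 = favourable). A ratio well below 1.0 (the "four-fifths rule"
    uses 0.8 as a common screening threshold) signals that the
    process deserves closer scrutiny.
    """
    rates = {g: sum(d) / len(d) for g, d in outcomes.items()}
    ratio = min(rates.values()) / max(rates.values())
    return rates, ratio

# Hypothetical audit of an automated eligibility screen:
rates, ratio = disparate_impact({
    "group_A": [1, 1, 1, 0, 1, 1, 0, 1],  # 6/8 favourable
    "group_B": [1, 0, 0, 1, 0, 0, 1, 0],  # 3/8 favourable
})
print(rates)
print(f"ratio: {ratio:.2f}")  # 0.375 / 0.75 = 0.50, below the 0.8 screen
```

A check like this is cheap to run at every stage of a project, which is exactly the kind of proactive, repeated monitoring the frameworks above call for.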
Racial discrimination and bias do not result from data science, but insofar as data science tools reflect the socio-economic and political realities in which they were created, they can formalize, obscure and deepen discrimination. It is therefore the responsibility of all humanitarian data science users to ensure that they have proactively identified and mitigated all biases that may impact their work.
It is a universal responsibility—of data scientists, decision makers and everyone else—to eliminate the hate and exploitation that creates conditions of bias, inequality and discrimination to begin with.
This article was written by Jack Bahn, a consultant with DTM London. The views expressed on this blog are those of the author(s) and do not necessarily represent those of IOM.