The Hippocratic License


  Greg Wilson

Over the past few years, more and more people have become concerned with the ethical implications of work in data science and machine learning. From algorithmic bias to facial recognition, the tools we teach people to build and use have the potential to do great harm; as teachers, we have a responsibility to make our students aware of the issues in the same way that the medical profession teaches nurses and doctors to think about the human implications of what they do.

One example of how we can do this comes up when we discuss software licensing and intellectual property. Until recently, researchers had three principal options:

  • Make their work closed source, so that others could not use it without permission.

  • Use the MIT License or something equivalent, which allows users to do almost anything they want as long as they keep the copyright and license notice.

  • Use the GNU General Public License (GPL), which allows users to do what they want but also requires them to share the source of any project they distribute that modifies or incorporates GPL’d software.

A fourth option has recently been developed by Coraline Ada Ehmke (best known until now for creating the Contributor Covenant used by many open source projects). Like other open licenses, the Hippocratic License allows people to use and share the software, but where the GPL requires them to share their own work, the Hippocratic License prohibits anyone from using the software to do harm. To avoid wrangling over what exactly that means, the license specifically forbids using the software in ways that violate the United Nations Universal Declaration of Human Rights and the United Nations Global Compact. These are regarded as landmarks in the history of human rights and, more practically, have been endorsed by many countries and argued over by lawyers and scholars at length, so their scope and meaning are well understood.

Making students aware of the Hippocratic License and adopting it for our own projects is a small step toward a better world, but it is a step. From a teaching point of view, discussing it and its implications can turn an otherwise abstract lecture on ethics into a lively debate, and can give students practice discussing what they should do rather than what they could do.

If you’d like to read more, Dr. Nick Horton wrote a post about this license on the “Teach Data Science” blog. And here is a list of adopters as of the time of this post, with links to their projects: