How to change the world with data science

Things get done only if the data we gather can inform and inspire those in a position to make a difference — Michael J. Schmoker

Today, almost a million people worldwide work in a data science-related field.

Data-driven industries have grown massively in the past few years, and a majority of the world’s data has been generated in the last two years.

Applications of data science have made life a lot easier. From finding the best movies to choosing the best restaurant for dinner, data has provided us with more convenience than we could possibly need.

Almost every application of data science in today’s world is focused on making already comfortable lives even more comfortable.

However, there is a lot more that can be done with the data available to us. The same algorithms being used to boost sales can be written to instead boost social impact.

How can data science be used for social good?

Photo by Micheile Henderson on Unsplash

Some examples of using data science for social good include:

  • Creating machine learning algorithms that predict a household’s poverty status.
  • Identifying links between online bullying and suicide.
  • Examining the impact that different street characteristics have on pedestrian deaths.
  • Identifying whether personalization algorithms used by social media reinforce negative body images.

There has never been a better time to create data-driven solutions to social problems. It is now possible to collect large amounts of data related to all the topics mentioned above.

Non-profit organizations can work with this data to understand social issues better. If the work they do is driven by data, they will be able to take larger strides towards achieving solutions to social problems.

For example, machine learning models can be implemented to identify communities that need immediate attention and prioritize them.

However, it is difficult to make progress in the area of data science for social good. While large tech companies can afford to hire data scientists and develop large data ecosystems, non-profit organizations cannot afford to do so.

This means that even though NGOs have an abundance of data available to them, they are unable to make use of it because they lack the technical skills.

Fortunately, there are a few organizations that help NGOs gain access to technical talent.

These organizations pair data scientists and analysts with NGOs all around the world.

In fact, if you are an aspiring data scientist, you can volunteer at one of these organizations. You will be able to work with other industry professionals to analyze social data and answer pressing questions.

How you can get involved

Photo by Etty Fidele on Unsplash
“using data to not only make decisions about what kind of movie we want to see, but what kind of world we want to see” - DataKind

DataKind is an organization that connects data scientists with NGOs. They even provide mentoring for data scientists who want to get involved in the social sciences.

Their motto is “harnessing the power of data science in the service of humanity.”

If you are a data scientist who wants to give back to society, you can join DataKind as a volunteer. They will assign you to an existing project, and you will work with a group of like-minded people to answer a social data science question.

As of now, one of the most popular projects they are working on is called Vision Zero. This project aims to bring traffic-related deaths and injuries down to zero.

Most of their events are organized in the evening or on weekends so that people working full-time jobs can also participate. You can learn more about the work DataKind does here.

The University of Warwick also runs a DSSG (Data Science for Social Good) summer fellowship, which does similar work to DataKind. They teach students to collaborate with NGOs and develop data science products for social good. You can take a look at some of the work they do here.

There are many more fellowships and volunteer opportunities available in social data science, so it is worth doing some research if it’s something you’re interested in.

Projects you can work on

If you want to do some research in the area of social data science, here are some project ideas for you:

Analyzing the BLM Movement

Photo by Gabe Pierce on Unsplash

In response to systemic racism, a popular movement has emerged called Black Lives Matter. This movement protests police brutality and other forms of violence that black people face in their everyday lives.

However, the BLM movement has received a lot of backlash from other communities. Many people are unwilling to believe that systemic racism exists. They say that movements such as BLM are unnecessary.

Movements such as All Lives Matter have emerged as a response to the BLM movement, insisting that the United States is a post-racial society and that racism no longer exists in the country.

Data can be used to test whether or not black communities are at a disadvantage. Traffic stop data is being released every day by the New York Police Department.

The details of every traffic stop are recorded, and this data can be used to look for evidence of systemic racism. It can be used to answer questions like “are racial minorities stopped by police officers more often with less evidence of wrongdoing?”
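One common way to approach a question like this is to compare, for each group, how often a stop leads to a search and how often a search actually turns up contraband. The sketch below is a minimal illustration using only the Python standard library. The records and the field names (`race`, `searched`, `contraband_found`) are invented for illustration and are not the schema of any real dataset.

```python
from collections import defaultdict

# Hypothetical, hand-made records; a real dataset will have its own schema.
stops = [
    {"race": "A", "searched": True, "contraband_found": True},
    {"race": "A", "searched": False, "contraband_found": False},
    {"race": "A", "searched": False, "contraband_found": False},
    {"race": "A", "searched": True, "contraband_found": True},
    {"race": "B", "searched": True, "contraband_found": False},
    {"race": "B", "searched": True, "contraband_found": True},
    {"race": "B", "searched": True, "contraband_found": False},
    {"race": "B", "searched": False, "contraband_found": False},
]


def search_and_hit_rates(records):
    """Return {group: (search_rate, hit_rate)}.

    search_rate = searches / stops
    hit_rate    = searches that found contraband / searches
                  (None when the group has no searches)
    """
    counts = defaultdict(lambda: {"stops": 0, "searches": 0, "hits": 0})
    for record in records:
        group_counts = counts[record["race"]]
        group_counts["stops"] += 1
        if record["searched"]:
            group_counts["searches"] += 1
            if record["contraband_found"]:
                group_counts["hits"] += 1

    rates = {}
    for group, c in counts.items():
        search_rate = c["searches"] / c["stops"]
        hit_rate = c["hits"] / c["searches"] if c["searches"] else None
        rates[group] = (search_rate, hit_rate)
    return rates


rates = search_and_hit_rates(stops)
for group, (search_rate, hit_rate) in sorted(rates.items()):
    print(group, search_rate, hit_rate)
```

If one group is searched more often but its searches find contraband less often, that is one signal that the group may be searched on weaker evidence. A real analysis would also need to account for factors such as location and time of day before drawing conclusions.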

If this is an analysis you’d be interested in, you should check out the Stanford Open Policing Project.

Poverty Prediction

Photo by Jordan Opel on Unsplash

To end extreme poverty, it is important to measure it regularly. NGOs can only tell whether their poverty reduction strategies are working if they measure poverty from time to time.

To identify poverty, we first need to collect household consumption data. We can then train machine learning models on labelled poverty datasets and make predictions on new data.

With a good algorithm, we can quickly identify low-income households. These predictions can help social organizations prioritize low-income households and put poverty reduction strategies in place.
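To make the idea concrete, here is a minimal sketch of training a classifier on labelled households and then ranking new households by predicted risk. It uses a small logistic regression written with the standard library only, which is just one possible choice of model. The features (consumption per person and household size), the numbers, and the household IDs are all invented for illustration.

```python
import math

# Hypothetical labelled data:
# (monthly consumption per person, household size) -> 1 if poor, 0 otherwise.
households = [
    ((20.0, 7), 1), ((25.0, 6), 1), ((30.0, 8), 1), ((35.0, 5), 1),
    ((90.0, 3), 0), ((110.0, 4), 0), ((130.0, 2), 0), ((150.0, 3), 0),
]


def make_scaler(rows):
    """Build a function that scales each feature to zero mean, unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, means, stds)]

    return scale


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, lr=0.5, epochs=500):
    """Fit logistic regression weights with plain batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + error * xj for g, xj in zip(grad_w, xi)]
            grad_b += error
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


scale = make_scaler([features for features, _ in households])
X = [scale(features) for features, _ in households]
y = [label for _, label in households]
w, b = train(X, y)


def poverty_probability(features):
    """Predicted probability that a household is poor."""
    x = scale(features)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Rank new, unlabelled households so the most at-risk come first.
new_households = {"h1": (140.0, 2), "h2": (28.0, 7), "h3": (70.0, 5)}
ranked = sorted(
    new_households,
    key=lambda h: poverty_probability(new_households[h]),
    reverse=True,
)
print(ranked)
```

In practice you would use far more data, hold out a test set to check accuracy, and probably reach for an established library rather than hand-written gradient descent. The ranking step at the end is the part that supports prioritization: households with the highest predicted probability are reviewed first.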

If this is a project you’d be interested in, you should check out Kaggle’s Poverty Prediction Dataset.

Conclusion

Most existing data-driven solutions are focused on improving the level of comfort in people’s everyday lives. While this is the application of data science that generates the most revenue, it certainly isn’t the most meaningful one.

Data has the potential to create better lives for people and can be used to combat issues like poverty, drug abuse, and racism.

The application of data science to combat social issues is called social data science, and it can have a world-changing impact.