Environmentalists should learn data science too!

 

Photo by Imat Bagja Gumilar on Unsplash

I started learning data science as an environmentalist. Statistics has always been, and will always be, my go-to tool for organizing data to solve real-life problems.

I studied a branch of environmental science that hardly anyone thinks of as their first choice when entering university: forest and agricultural science. It is an interdisciplinary subject because I could focus not only on the forest, but also on plant physiology, genetics, ecology and landscape science, environmental science, epidemiology, and much more. I would love to talk about forestry and how broad the topic actually is, despite the narrow impression its name may give, but in this post I would like to talk about why environmentalists, and everyone else, should learn to program.

Personal documentation (Nancy, France, 2019). The MRI scan of wood, taken in the Wood Material Lab at INRAe Champenoux, Nancy


I started learning to program because I had to: I needed it to analyze the data for my thesis project. I was dealing with 400 trees spread across 10 repetition blocks and collecting data on at least 3 growth parameters, so the observations added up to at least 1,200 data points. I could have analyzed them in Excel, but people told me it might be interesting to try other software. So I talked to the doctoral students and supervisors I was working with, and they suggested SAS. SAS turned out to be my first programming environment.
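To make the arithmetic above concrete, here is a minimal Python sketch that lays out such a design as one record per expected measurement. The even split of 40 trees per block and the three parameter names are assumptions made only for illustration; they are not taken from the thesis data.

```python
from itertools import product

N_BLOCKS = 10
TREES_PER_BLOCK = 40  # assumption: 400 trees split evenly over 10 blocks
# Hypothetical parameter names, used only for illustration.
PARAMETERS = ("height_cm", "diameter_mm", "leaf_count")


def build_observation_index(n_blocks, trees_per_block, parameters):
    """Return one (block, tree, parameter) key per expected measurement."""
    blocks = range(1, n_blocks + 1)
    trees = range(1, trees_per_block + 1)
    return list(product(blocks, trees, parameters))


index = build_observation_index(N_BLOCKS, TREES_PER_BLOCK, PARAMETERS)
n_trees = N_BLOCKS * TREES_PER_BLOCK

# Number of trees and number of expected data points in the design.
print(n_trees, len(index))
```

Listing the expected measurements up front like this also makes it easy to spot which ones are missing once the field data comes in.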

I had never coded before, and I dived right into programming. It was hard, but I knew it was worth my time. I have since moved to R, and once I became fluent in it, I moved on to Python. It was hard indeed: it took me 7 months to learn SAS, with the help of my supervisors’ old code and at the cost of my own sanity, but afterward I learned R in only about 2 months. By now I’m writing code from scratch and studying another language. The takeaway from my own experience is that learning to program can improve my emotional intelligence. It has taught me to be patient, to tackle impostor syndrome, and to understand my values. There are other reasons why you should learn too!

Data is generated fast: an environmentalist should know how to make sense of it.

Many aspects of environmental science now require data. Solving the issues in land-use management is one example.

Land management is now a multi-disciplinary field. Land management that ignores biodiversity, carbon emissions, and the needs of local people can contribute to climate change. In turn, climate change affects the land and all the people involved in it, regardless of their specialties. This is why solving the issues in land management requires collaboration between people from different disciplines. And involving people requires data-driven evidence.

What makes one management approach better than another? How can a management approach protect biodiversity while sustaining the country’s economy?

We should speak the same language with different people. We should talk numbers. That is why people are going to great lengths to improve the methods of collecting data.

For example, tools for remote sensing and Earth observation are becoming more integrated, simpler, and cheaper. A great deal of geographical data is already available in open databases, such as GIS and Esri Open Data, OpenTopography, and UNEP Environmental Data Explorer.

Personal documentation (Alpen, Austria, 2019). Simulation on determining a multi-purpose land-use strategy for Alpen forests.

This is true for other topics as well, such as plantation projects. To ensure that a forest plantation project actually leads to the uptake of carbon dioxide from the atmosphere, environmentalists should make sure that the planted trees will grow.

How adaptable is the species to different temperatures? Can we check its genetic potential for adapting to climate change, drought, flood, or even frequent pest attacks?

Plants’ genetic potential is receiving more attention now, especially as the process of putting it to use takes less and less time. Nowadays, you do not have to wait for months to get the DNA of an individual, since sequencing is becoming cheaper and faster. It shows just how incredibly fast data is being generated for our research.

With the rise in technology and in awareness of environmental protection, environmentalists can design their projects and collect the data from online databases.

  1. How many trees should be planted to accelerate carbon uptake from the atmosphere?
  2. How much water should we use to irrigate our agricultural land without wasting any?
  3. How well will a species thrive in its location under climate change?

These questions can feasibly be answered if environmentalists know how to analyze data and interpret the results, all of which requires statistical and mathematical intuition. Moreover, since analysis is now automated and insights are generated from massive amounts of data, flexibility is key. Coding the data processing pipeline gives environmentalists that flexibility, since all the steps can be done in one piece of software.
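As a rough illustration of what coding a data processing pipeline can look like, here is a minimal Python sketch that loads, cleans, and summarizes a small table in one script. The measurements are placeholder values invented for the example, and the column names (`block`, `tree_id`, `height_cm`) and function names are hypothetical; a real project would read its own files and would likely use a dedicated data analysis library.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Placeholder measurements, invented purely for illustration (not real data).
RAW_CSV = """block,tree_id,height_cm
1,T001,152.0
1,T002,
1,T003,148.5
2,T004,160.0
2,T005,171.0
"""


def load_rows(text):
    """Step 1: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows, column):
    """Step 2: drop rows with a missing value and convert the rest to float."""
    cleaned = []
    for row in rows:
        value = (row.get(column) or "").strip()
        if not value:
            continue  # skip trees that were not measured
        cleaned.append({**row, column: float(value)})
    return cleaned


def mean_by_group(rows, group_column, value_column):
    """Step 3: average the value column within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row[value_column])
    return {group: mean(values) for group, values in groups.items()}


def run_pipeline(text):
    """Chain the steps so the whole analysis reruns with one call."""
    rows = load_rows(text)
    rows = clean_rows(rows, "height_cm")
    return mean_by_group(rows, "block", "height_cm")


summary = run_pipeline(RAW_CSV)
print(summary)
```

Because every step is a function, swapping in a new dataset or adding another cleaning rule means changing one place and rerunning the script, rather than repeating a series of manual clicks.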

Programming is an art of communication.

At its core, programming is a way of communicating. It is about how you ask your machine to do complicated things for you. Hence you have to know how to give it orders in a way it can understand. Programming languages such as R and Python are high-level languages: they let us write instructions in a human-readable form that is then translated into machine code. Each high-level language has its own grammatical structure, and this is what makes learning to program hard.

Learning to program means learning an effective way to communicate with your machine. In Excel or other Graphical User Interface software, you click the relevant buttons to give orders. In a high-level language, you give the same orders by typing code. In a way, you have to structure your script to get what you want from the machine. As in communicating with humans, the simpler and clearer you make the code, the easier it is to get right, and often the sooner the analysis is done.

Programming is a transferable skill. You will almost surely apply the way you program to the way you write an article or communicate with the public.

As an environmentalist, you always have to find an effective way to get your point across by translating technical terms into more general words. For example, say you are an environmentalist in Borneo, and you want to rally people against an industry that destroys protection forest for its own gain. You collected and analyzed data, and you found compelling evidence that destroying protection forests only increases local people’s exposure to natural disasters. If you cannot simplify your words, you may not be able to share your insights with these people.

Data science for collaboration.

Barriers between countries matter less and less. As the world is affected by the current pandemic, working virtually has also become possible. Machines are connected, and work can be done on collaborative platforms. These things will only accelerate collaboration between people, industries, and countries.

As an environmentalist, you understand what this means. Global challenges, including climate change, are collective problems that influence and are influenced by all people, irrespective of their backgrounds and social status. Collaboration is mentioned in the Sustainable Development Goals (SDGs) and the Paris Agreement as a way to solve global challenges. Nowadays, collaboration is done virtually and advanced with data-driven evidence.

We are collaborating not only with humans but also with data. When you work in an organization, you may have to work on data collaboratively, so you want your code to be easy to understand to ease everyone’s work. This, too, is the art of communication: when you can simplify your procedures and translate them into code, not only machines but also humans will understand them. And this contributes to the success of the organization’s projects.

As an environmental enthusiast, sharing your code can extend the audience for the insights from your project. It is another way to increase environmental awareness in the data science community.

By diving into the data science community, you show what the environment really means to us. Other people will respond, whether by sharing your code, reading your insights, or reproducing your method. Your method develops, and your perspective on the environment reaches further.

Personal documentation (Alpen, Austria, 2019). How collaboration works in the Alpen forest :)

In conclusion…

Data science is a multi-disciplinary field. Anyone can learn it and benefit from it. Being able to analyze data and program a data processing pipeline can take you far. Statistics is at the heart of every scientific field, and now we should step up our game by learning to program our statistical analyses. The skill is transferable because you will also learn how to communicate and collaborate effectively.


Posted at Analytics Vidhya on Medium: Environmentalists should learn data science too!
