ARGOS Data Science

ARGOS (Automated RecoGnition of Species) is a project to identify invasive plant species in high-resolution drone imagery. My role on ARGOS varied, but my main responsibility was to research and develop the machine learning solution that predicts species distributions on new images. Even after years of work, there is still a lot of room for improvement. Every plant has a different signature and places different requirements on a model, which makes generalizing the process difficult.

For example, one of the plant species that we focused on was Frangula alnus, or glossy buckthorn. This species in particular wreaks havoc on many pristine Michigan ecosystems, so it was very important to develop an accurate and robust model for identifying it. Through many rounds of interviews with the scientists we were working with, and investigation of our data, we developed a convolutional neural network that fit our needs. While this original model was effective on the imagery we collected during one field season, it quickly became evident when we switched to a new drone the next field season that the same model would not be as effective out of the box. Again, we went through a long model development and verification process, and came out with a new model that, so far, has been robust to changes in season, time of day, weather, and camera.

Unfortunately, we learned that having a model that worked for one species did not mean we could swap in different training data sets and get a model that worked for every species. Species vary in growth structure, leaf shape, and ground covered per plant, among many other traits. All of these, but particularly the amount of ground covered per plant, made it difficult to fit the same model, even with different parameters, to different plant species.

It was also important to find a way to predict the distributions efficiently. Running a prediction on every pixel's neighborhood for a single flight would take days to weeks of computation time. We had two options: grid prediction or Markov chain Monte Carlo (MCMC) sampling. There are clear benefits to both. With grid prediction, we would be able to truly survey the entire map and understand where both low- and high-density regions existed. We could also tune the granularity of the grid to fit our computational requirements. MCMC presented a tempting alternative, as it samples locations in proportion to the underlying distribution and can be run within a fixed computational budget. MCMC could also potentially be used by the drone itself to survey larger natural areas. We ended up trying both methods, and the results were inconclusive, as we got important and distinct information from each. ARGOS currently uses grid prediction, but work is still being done with MCMC.
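To make the trade-off between the two sampling strategies concrete, here is a minimal Python sketch. It is not the ARGOS code: `toy_density` is a hypothetical stand-in for the model's per-pixel prediction, and the function names, map size, grid step, and proposal width are all assumptions chosen for illustration. The MCMC variant shown is a plain Metropolis random walk, which may differ from what ARGOS actually experimented with.

```python
import math
import random


def toy_density(x, y):
    """Hypothetical stand-in for a model's predicted plant density at pixel (x, y).

    A single Gaussian blob centered at (30, 40); a real system would run
    the trained network on the pixel's neighborhood instead.
    """
    return math.exp(-((x - 30) ** 2 + (y - 40) ** 2) / (2 * 8.0 ** 2))


def grid_predict(predict, width, height, step):
    """Grid prediction: evaluate `predict` at every `step`-th pixel.

    A larger `step` means fewer model calls and a coarser map.
    """
    return {
        (x, y): predict(x, y)
        for x in range(0, width, step)
        for y in range(0, height, step)
    }


def metropolis_sample(predict, width, height, n_steps, proposal_sd=5.0, seed=0):
    """Metropolis random walk over pixel coordinates (illustrative only).

    Visited locations concentrate where `predict` is high, so a fixed
    budget of model calls is spent mostly in dense regions.
    """
    rng = random.Random(seed)
    x, y = rng.randrange(width), rng.randrange(height)
    p = predict(x, y)
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, rounded to the nearest pixel.
        nx = int(round(x + rng.gauss(0, proposal_sd)))
        ny = int(round(y + rng.gauss(0, proposal_sd)))
        # Proposals that fall off the map are rejected outright.
        if 0 <= nx < width and 0 <= ny < height:
            q = predict(nx, ny)
            # Accept with probability min(1, q / p).
            if p == 0 or rng.random() < q / p:
                x, y, p = nx, ny, q
        samples.append((x, y))
    return samples


if __name__ == "__main__":
    grid = grid_predict(toy_density, width=100, height=100, step=10)
    print("grid evaluations:", len(grid))
    print("densest grid cell:", max(grid, key=grid.get))

    chain = metropolis_sample(toy_density, width=100, height=100, n_steps=5000)
    kept = chain[1000:]  # discard early steps as burn-in
    mean_x = sum(s[0] for s in kept) / len(kept)
    mean_y = sum(s[1] for s in kept) / len(kept)
    print("MCMC sample mean:", (mean_x, mean_y))
```

In this sketch the grid covers the whole map at a resolution set by `step`, including the empty regions, while the chain spends its budget near the dense patch and says little about where plants are absent. That mirrors the observation above that each method gave distinct information.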

contact@oliverhill.io — Resume