Earth observation data covers the entire surface of the planet on a daily basis, but the resolutions available from near-Earth orbit are not high enough to see wildlife smaller than most trees. Because animals are leading indicators of ecosystem health, it is important to develop more complete measures of animal abundance on the ground. Despite their importance, abundance surveys are rarely carried out continuously or at any significant scale: collecting and cleaning the data is an expensive, time-consuming process performed by a small number of experts in particular ecosystems. With an inexpensive, easily dispersible camera connected to a neural network in the field, it becomes possible to systematize and scale visual observation to cover entire forests at a low hardware and maintenance cost.
For those who are not familiar with neural networks, they are a means of doing machine learning: a computer analyzes training examples (in our case, images labeled "animal" or "not animal") and finds visual patterns that align with the labels provided. If you own a smartphone, you may have noticed that its photos application can group pictures by the people in them without knowing who those people are. That grouping relies on the same concepts: it is image recognition powered by neural network techniques, running on the chips in your phone without you ever realizing it.
Our capstone team has partnered with Dr. Sean McGregor, on behalf of the company Syntiant, to demonstrate the capability of their newest energy-efficient neural accelerator chip. The chip can be built en masse, paired with existing camera sensors, and deployed in forests, enabling large-scale biodiversity monitoring. Without an energy-efficient chip, the cost of buying and servicing devices becomes prohibitive at any significant scale.
Our capstone project covers the first step of bringing species monitoring to whole-forest scales: preparing a dataset and training the neural network model that will be loaded onto a physical device.
In the data preparation phase we gathered a large dataset of about 2.6 terabytes, containing a mix of photos with and without animals. To train our model, we labeled each photo with one of two classifications: "animal" or "not animal." We then packaged the image and label data together into TFRecords, and used this file format to train our model.
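The labeling step can be sketched as a simple manifest builder. This is an illustrative sketch, not our actual pipeline: it assumes the photos have been pre-sorted into hypothetical `animal/` and `not_animal/` folders, and writes one CSV row of path and label per image.

```python
import csv
from pathlib import Path

# Hypothetical folder-name-to-label mapping; the real dataset's
# directory layout may differ.
LABELS = {"animal": "animal", "not_animal": "not animal"}

def build_manifest(root, out_csv):
    """Walk root/<folder>/*.jpg and write a CSV of (path, label) rows."""
    root = Path(root)
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "label"])
        for folder, label in LABELS.items():
            # Sort for a deterministic manifest order.
            for img in sorted((root / folder).glob("*.jpg")):
                writer.writerow([str(img), label])
```

A manifest like this is a convenient intermediate: the image files stay untouched on disk, and downstream steps only need to read one small CSV to know every path and its label.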
TFRecords, short for "TensorFlow records," are a file format that let us merge image and label data into a single file readable by our model-training script. Using this format helps speed up training because TensorFlow's data libraries can read TFRecords in parallel. It is also convenient for packaging and moving files around, since a single TFRecord can hold the data of many image/label pairs; for example, we were able to fit 3 million resized photos into 338 TFRecords.
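The core of the format is the `tf.train.Example` protocol buffer, which bundles the raw image bytes and an integer label into one serialized record. The sketch below shows the write and parse sides under one assumed encoding (1 = "animal", 0 = "not animal"); our actual feature names and preprocessing may differ.

```python
import tensorflow as tf

def make_example(image_bytes, label):
    """Bundle raw image bytes and an int label into one tf.train.Example.

    Assumed label encoding: 1 = "animal", 0 = "not animal".
    """
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

def write_tfrecord(path, examples):
    """Serialize many examples into a single TFRecord shard."""
    with tf.io.TFRecordWriter(path) as writer:
        for ex in examples:
            writer.write(ex.SerializeToString())

# Schema used to decode records back into tensors at training time.
FEATURE_DESCRIPTION = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
}

def parse(record):
    return tf.io.parse_single_example(record, FEATURE_DESCRIPTION)
```

At training time, `tf.data.TFRecordDataset(paths).map(parse)` turns a set of shards back into a stream of image/label pairs, and the `tf.data` pipeline can interleave reads across shards in parallel.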
During the training phase we used convolutional neural networks, making significant changes to an existing person-detection Python script, provided by Syntiant, to fit our use case. Once the script fit our animal/not-animal parameters, we made iterative changes to improve the model's accuracy. The result demonstrates an accurate animal detection model that can, in principle, run on an energy-efficient production device, which researchers could then use to gather large amounts of ecological data more effectively than ever before.
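To give a sense of the model's shape without reproducing Syntiant's script, here is a minimal sketch of a small convolutional binary classifier in Keras. The 96x96 grayscale input and the layer sizes are illustrative assumptions, not the parameters of our deployed model.

```python
import tensorflow as tf

def build_model(input_shape=(96, 96, 1)):
    """A tiny CNN sketch: strided convolutions, then a sigmoid head
    that outputs P(animal). Illustrative only; hardware-targeted
    models carry additional size and quantization constraints."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(8, 3, strides=2, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(16, 3, strides=2, padding="same", activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(animal)
    ])

model = build_model()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Training is then a matter of feeding the parsed TFRecord pipeline to `model.fit` and iterating: adjusting the data mix, the augmentation, and the architecture between runs to push accuracy up while keeping the model small enough for an embedded accelerator.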