Leveraging Supervised and Unsupervised Machine Learning to Study Shapes
Published:
As sensor technology improves, data volumes grow. We now live in a sea of data collected by our phones, smartwatches, and home assistants like Alexa. Science is no different: new sensors are enabling the collection of large datasets that can be mined for new scientific discoveries. In plant science, sensor technology is being applied to study how plants grow under drought conditions.
NOTE: Access the workshop notebook here.
Research
We will be using data collected by the Field Scanalyzer at the University of Arizona Maricopa Agricultural Center. The Field Scanalyzer covers over a hectare of land, capturing data from over 20,000 plants over a growing season. The Field Scanalyzer is equipped with stereo RGB and thermal cameras, a PSII chlorophyll fluorescence imager, and a pair of 3D laser scanners (pictured below).
Collectively, these sensors capture 20 terabytes (TB) of data in a three-month period, which makes converting the raw data into information a difficult task. Extracting that information requires machine learning, high-performance computers, and distributed computing.
These data enable me and other scientists to study how plants respond to drought stress under real-world field conditions, and they will contribute to efforts aimed at improving the resilience of plants to drought stress.
Data
Today we will be working with 3D point cloud data collected by the Field Scanalyzer. These data capture plant shapes at fine-scale resolution. We will: (i) extract topological data analysis (TDA) shape descriptors, (ii) run principal component analysis (PCA) on these descriptors, and (iii) classify plants by variety.
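To make the three steps concrete, here is a minimal sketch of the same pipeline using only NumPy. It is not the workshop notebook's code, and it rests on several assumptions: the point clouds are synthetic stand-ins for Field Scanalyzer data, the TDA descriptor is a simple summary of 0-dimensional persistent homology (the death times of connected components in a Vietoris-Rips filtration, which equal the edge lengths of the minimum spanning tree), and a nearest-centroid rule stands in for whatever classifier the workshop uses. All function names (`h0_death_times`, `shape_descriptor`, `synthetic_cloud`, and so on) are hypothetical.

```python
import numpy as np

def h0_death_times(points):
    """Death times of 0-dim persistent homology for a point cloud.

    For a Vietoris-Rips filtration these equal the minimum spanning
    tree edge lengths, computed here with Prim's algorithm.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()          # shortest distance from the tree to each point
    deaths = []
    for _ in range(n - 1):
        best[in_tree] = np.inf     # never re-select points already in the tree
        j = int(np.argmin(best))
        deaths.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))

def shape_descriptor(points, quantiles=(0.1, 0.25, 0.5, 0.75, 0.9, 1.0)):
    # Assumed summary: fixed-length vector of death-time quantiles.
    return np.quantile(h0_death_times(points), quantiles)

def pca_fit(X, n_components=2):
    # PCA via SVD of the mean-centered data matrix.
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def nearest_centroid_predict(train_scores, train_labels, test_scores):
    # Assign each test sample to the class with the closest mean score.
    classes = np.unique(train_labels)
    centroids = np.stack(
        [train_scores[train_labels == c].mean(axis=0) for c in classes]
    )
    d = np.linalg.norm(test_scores[:, None, :] - centroids[None, :, :], axis=-1)
    return classes[np.argmin(d, axis=1)]

def synthetic_cloud(rng, variety, n_points=60):
    # Hypothetical data: variety "A" is one blob, variety "B" is two blobs.
    pts = rng.normal(size=(n_points, 3))
    if variety == "B":
        pts[n_points // 2:, 0] += 20.0
    return pts

rng = np.random.default_rng(0)
labels = np.array(["A", "B"] * 20)

# (i) one TDA descriptor per plant
X = np.stack([shape_descriptor(synthetic_cloud(rng, v)) for v in labels])

# (ii) PCA fitted on the training plants only, then applied to held-out plants
train, test = slice(0, 30), slice(30, 40)
mean, components = pca_fit(X[train])
train_scores = pca_transform(X[train], mean, components)
test_scores = pca_transform(X[test], mean, components)

# (iii) classify held-out plants by variety
predicted = nearest_centroid_predict(train_scores, labels[train], test_scores)
accuracy = float(np.mean(predicted == labels[test]))
print(f"held-out accuracy: {accuracy:.2f}")
```

Fitting the PCA on the training plants alone keeps information from the held-out plants out of the projection. With real point clouds, a dedicated TDA library and a stronger classifier would likely replace the hand-rolled pieces here.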
Workshop materials
Acknowledgements
With special thanks to:
- Dr. Duke Pauli & lab members
- Dr. Eric Lyons & lab members