Using interactive data visualization to make sense of large datasets


As sensor technology improves, data volumes grow. We now live in a sea of data collected by our phones, smartwatches, and home assistants like Alexa. Science is no different: new sensors are enabling the collection of large datasets that can be mined for new scientific discoveries. In plant science, sensor technology is being applied to study how plants grow under drought conditions.


NOTE: Access the workshop notebook here.

Phenomics: A case study in big data

We will be using data collected by the Field Scanalyzer at the University of Arizona Maricopa Agricultural Center. The Field Scanalyzer covers over a hectare of land, capturing data from over 20,000 plants over a growing season. The Field Scanalyzer is equipped with stereo RGB and thermal cameras, a PSII chlorophyll fluorescence imager, and a pair of 3D laser scanners (pictured below).

Collectively, these sensors capture 20 terabytes (TB) of data over a three-month period, which makes converting these raw data into information a difficult task. Extracting that information requires machine learning, high-performance computing, and distributed computing.

These multiple sources of data provide fine-scale information about plant growth under drought (decreased water) conditions. Today, we will use some of these data to learn interactive visualization with Python!


Workshop materials


Survey

Please provide your feedback to improve future workshops here: https://bit.ly/2022-ds2f.


Additional materials


Acknowledgements

This program is funded by the University of Arizona Libraries: https://data.library.arizona.edu/ds2f.

With special thanks to:

  • Jeffrey Oliver
  • Megan Senseney
  • Jim Martin
  • Yvonne Mery
  • Leslie Sult
  • Cheryl Casey