Imageomics: Images as a Source of Information about Life

Sam Stevens, The Ohio State University

Data Deluge: The Problem

Imageomics: The Solution

What is Imageomics?

imageomics vs Imageomics

Projects in the Imageomics Institute

  1. Beetle wing lengths
  2. Species identification on social media
  3. Individual zebra identification
  4. Counting plankton

...all from images!

Traits from Beetles

How To Extract Traits:

  1. Take pictures.
  2. Measure traits.
  3. Share data.

Is It Possible? What's Fastest?

Trait Data at Scale: Who Cares?

Zooming Out

Embedding Spaces

Making Embedding Spaces

Making Embedding Spaces in Biology

BioCLIP

Making BioCLIP

  1. Lots of GPUs.
  2. Moderate technical expertise.
  3. Large, clean, diverse data.

Large, Clean, Diverse Data

TreeOfLife-10M

Large

Diverse

What Can You Do With BioCLIP?

Develop efficient classifiers.

Skip the labeling phase.

Iterate in real time.

Improve your study design.
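
One way to picture "develop efficient classifiers" is a nearest-centroid classifier built on top of image embeddings from a pretrained model. The sketch below is illustrative only: the 3-dimensional vectors are made-up stand-ins for real embeddings, and the helper names (`fit_centroids`, `predict`) are hypothetical, not part of BioCLIP or any library.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def centroid(vectors):
    """Average a list of vectors and re-normalize the mean."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return normalize(mean)


def fit_centroids(labeled):
    """labeled maps each class name to a handful of example embeddings."""
    return {
        label: centroid([normalize(v) for v in vecs])
        for label, vecs in labeled.items()
    }


def predict(centroids, embedding):
    """Return the class whose centroid is most cosine-similar to the query."""
    q = normalize(embedding)
    return max(
        centroids,
        key=lambda label: sum(a * b for a, b in zip(centroids[label], q)),
    )


# Toy embeddings (assumed values); a real pipeline would get these from an
# image encoder instead of writing them by hand.
few_shot = {
    "beetle": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "moth": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
centroids = fit_centroids(few_shot)
prediction = predict(centroids, [0.85, 0.15, 0.05])
```

Because only the centroids are computed, adding a class or a few more labeled examples means re-averaging a handful of vectors rather than retraining a network.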

Understanding BioCLIP

Look at the data or the model?

What's in the data?

Garbage in, garbage out.

xkcd #1833

What's in our models?

Interpreting Predictions

Interpreting Models

Models don't learn like us.

Sparse Autoencoders

Sparse Vectors
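
As a rough illustration of the idea, a sparse autoencoder maps a dense vector to a wider vector in which most entries are zero, then reconstructs the input from it. The sketch below is a minimal forward pass with a common loss (reconstruction error plus an L1 sparsity penalty). The weights, sizes, and `l1_coeff` value are hand-picked assumptions for the example, not taken from any trained model.

```python
def relu(v):
    """Zero out negative entries; this is what makes the code sparse."""
    return [max(0.0, x) for x in v]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def encode(x, w_enc, b_enc):
    """Dense input -> wider, mostly-zero feature vector."""
    pre = [p + b for p, b in zip(matvec(w_enc, x), b_enc)]
    return relu(pre)


def decode(f, w_dec, b_dec):
    """Sparse feature vector -> reconstruction of the dense input."""
    return [r + b for r, b in zip(matvec(w_dec, f), b_dec)]


def sae_loss(x, x_hat, f, l1_coeff):
    """Mean squared reconstruction error plus an L1 penalty on activations."""
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return mse + l1_coeff * sum(abs(a) for a in f)


# Hand-picked toy weights (assumed): 2 input dimensions, 4 features.
w_enc = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b_enc = [0.0, 0.0, 0.0, 0.0]
w_dec = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
b_dec = [0.0, 0.0]

x = [0.5, -2.0]
f = encode(x, w_enc, b_enc)      # only 2 of the 4 features are active
x_hat = decode(f, w_dec, b_dec)  # reconstructs x from the sparse code
loss = sae_loss(x, x_hat, f, l1_coeff=0.1)
active = sum(1 for a in f if a > 0.0)
```

In this toy setup each feature fires for one direction of one input axis, so any input activates at most half of the features.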

"Striped Beetle Thoraxes"

"Moth Antennae"

Implications of Disentangled Representations

Better Search

Better Integration

The Road Ahead

Data Bias

Naming Things

"There are only two hard problems in computer science: naming things and cache invalidation."

  • Phil Karlton

Computational Framing

Wrapping Up