Satellite and Population Embedding

Powerful GeoAI Tools or Overrated?

Zhanchao Yang

University of Pennsylvania

About Me

Zhanchao Yang

  • Master of City Planning (MCP)
  • Master of Urban Spatial Analytics (MUSA)
  • Bachelor of Arts in Geography, SUNY Binghamton
  • Research Assistant, Weitzman School of Design

What Are Embeddings?


In general terms, embeddings are numerical representations of complex information that capture the most important features or relationships in a simplified, machine-readable form.

Blackbox Nature of Embeddings

  • Generated by neural network models (e.g., GNNs or CNNs) trained on large datasets.
  • The exact architecture and training process vary widely.
  • Companies and research institutions may not disclose the full details of their models.

Population Dynamics Foundation Model (PDFM)

  • Developed by Google Research in 2024.
  • Input:
    • Census data
    • Search trends
    • Other geospatial data
  • Output:
    • Multi-dimensional vectors (represent population characteristics and dynamics).

What Do PDFM Embeddings Look Like?

  • 330-dimensional vectors for each county or ZIP code
  • Each dimension captures different aspects of population dynamics
  • Examples of captured features:
    • Mobility patterns
    • Search behavior trends
    • Local economic activity
    • Environmental conditions

Applications of PDFM Embeddings

Health & Social Services:

  • Disease prevalence prediction
  • Healthcare resource allocation

Economic Analysis:

  • Socioeconomic indicators
  • Income and poverty estimation

Urban Planning:

  • Population growth forecasting
  • POI and hotspot prediction

Additional Applications

Demo 1: Using PDFM Embeddings to predict housing prices

Dataset:

  • Zillow Home Value Index (ZHVI) by ZIP code
  • PDFM embeddings (330 features)

Model:

  • Linear Regression (Stepwise regression)
  • Data mining approaches
  • Output: Predicted home prices
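A minimal sketch of the stepwise idea, under stated assumptions: it runs forward selection with ordinary least squares on synthetic data, not on the ZHVI or PDFM data used in the demo. The function name `forward_stepwise`, the 20 stand-in features, and the fixed stopping rule (`max_features`) are all hypothetical choices for illustration.

```python
import numpy as np


def forward_stepwise(X, y, max_features):
    """Greedy forward selection: at each step, add the column that most
    reduces the residual sum of squares of an OLS fit with intercept."""
    n_samples, n_features = X.shape
    intercept = np.ones((n_samples, 1))
    selected = []
    remaining = list(range(n_features))
    for _ in range(max_features):
        best_rss, best_j = None, None
        for j in remaining:
            design = np.hstack([intercept, X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            rss = float(np.sum((y - design @ coef) ** 2))
            if best_rss is None or rss < best_rss:
                best_rss, best_j = rss, j
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


# Synthetic stand-in: 200 "ZIP codes" with 20 embedding-like features.
# Only columns 2 and 7 actually drive the (made-up) target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + rng.normal(scale=0.1, size=200)

selected = forward_stepwise(X, y, max_features=2)
print("Selected feature columns:", selected)
```

With real embeddings the selected columns have no built-in meaning, so any interpretation of which dimensions matter should be treated with care.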

Satellite Foundation Model (SFM) Embeddings

  • Developed by Google DeepMind (AlphaEarth Foundations)
  • Based on Sentinel-2 satellite imagery
  • Global coverage (2017-2023)
  • Spatial resolution: 10 meters
  • Available through Google Earth Engine (GEE)

What Do Satellite Embeddings Capture?

  • Land cover patterns
  • Vegetation characteristics
  • Urban development
  • Environmental changes
  • Temporal dynamics

Trained on a massive satellite imagery corpus using self-supervised learning

Satellite Embeddings with Google Earth Engine (GEE)

Dataset:

  • GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL
  • 64 bands

Key Operations:

  • Filter by date and location
  • Extract embeddings for points/regions
  • Compare temporal changes
  • Calculate similarity (dot product)
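The similarity and temporal-comparison steps can be illustrated offline. The sketch below uses short, made-up vectors in place of real embeddings and does not call Earth Engine. It assumes the vectors are scaled to unit length, in which case the dot product equals cosine similarity. The variable names and values are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical 4-dimensional stand-ins for one pixel's embedding in two years.
pixel_2020 = normalize([0.9, 0.1, 0.3, 0.2])
pixel_2023_stable = normalize([0.88, 0.12, 0.31, 0.19])   # little change
pixel_2023_changed = normalize([0.1, 0.9, 0.2, 0.4])      # strong change

sim_stable = dot(pixel_2020, pixel_2023_stable)
sim_changed = dot(pixel_2020, pixel_2023_changed)

# A low similarity between years flags a candidate change; the threshold
# here is an arbitrary illustration, not a recommended value.
CHANGE_THRESHOLD = 0.8
for name, sim in [("stable", sim_stable), ("changed", sim_changed)]:
    flag = "change" if sim < CHANGE_THRESHOLD else "no change"
    print(f"{name}: similarity={sim:.3f} -> {flag}")
```

In practice a threshold like this would need to be tuned against known change sites for the study area.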

Applications of SFM Embeddings

Environmental and Land Use Planning:

  • Urban sprawl analysis
  • Land cover classification

Temporal Analysis:

  • Seasonal pattern detection
  • Change detection
  • Similarity searches across time

Demo 2: Satellite Embeddings for Land Cover Classification

Approach:

  • Use embeddings as features
  • Train classification model
  • Predict land cover types

Advantages:

  • No need to process raw imagery
  • Pre-trained representations
  • Faster than traditional methods

Workflow

  1. Define area of interest
  2. Load embedding ImageCollection
  3. Extract features for labeled samples
  4. Train classifier (Random Forest, SVM)
  5. Apply to entire region
  6. Validate results

Uses Earth Engine’s cloud computing for scalability
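Steps 3 to 6 of the workflow can be sketched without Earth Engine. This example is only an illustration under stated assumptions: the "embeddings" are synthetic 3-dimensional vectors, the class centers are invented, and a simple nearest-centroid classifier stands in for Random Forest or SVM to keep the code dependency-free.

```python
import random

random.seed(42)

# Hypothetical class centers standing in for embedding signatures.
CENTERS = {
    "water": [0.1, 0.8, 0.2],
    "forest": [0.7, 0.2, 0.6],
    "urban": [0.9, 0.9, 0.1],
}


def make_samples(center, n, spread=0.05):
    """Draw n noisy vectors around a class center."""
    return [[c + random.gauss(0, spread) for c in center] for _ in range(n)]


# Step 3: labeled samples with their (synthetic) embedding features.
samples = [(vec, label)
           for label, center in CENTERS.items()
           for vec in make_samples(center, 40)]
random.shuffle(samples)
train, test = samples[:90], samples[90:]


def fit_centroids(labeled):
    """Step 4: 'train' by averaging the vectors of each class."""
    sums, counts = {}, {}
    for vec, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, vec):
    """Step 5: assign the class whose centroid is closest."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(centroids[lab], vec)))


centroids = fit_centroids(train)

# Step 6: validate on held-out samples.
accuracy = sum(predict(centroids, v) == lab for v, lab in test) / len(test)
print(f"Hold-out accuracy: {accuracy:.2f}")
```

Real land cover classes overlap far more than these synthetic clusters, so hold-out accuracy on actual embeddings should be expected to be lower and checked per class.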

Limitations and Concerns

Interpretability Issues:

  • Black box representations
  • Limited transparency in training process

Bias and Fairness:

  • May encode historical biases
  • Risk of perpetuating spatial inequalities

Technical Limitations:

  • Resolution constraints (10m for satellite)
  • Not updated in real time

Summary and Conclusions

The Promise:

  • Powerful tools for spatial analysis
  • Capture complex patterns

The Reality:

  • Require careful validation
  • Complement, rather than replace, traditional methods

Best Practices:

  • Validate predictions thoroughly
  • Be aware of limitations and biases




Final Verdict



Powerful GeoAI tools, but use with caution and critical thinking!

“All models are wrong, but some are useful” (George E. P. Box)
