Satellite and Population Embedding
Powerful GeoAI Tools or Overrated?
Zhanchao Yang
University of Pennsylvania
About Me
Zhanchao Yang
- Master of City Planning (MCP)
- Master of Urban Spatial Analytics (MUSA)
- Bachelor of Arts in Geography, SUNY Binghamton
- Research Assistant, Weitzman School of Design
What Are Embeddings?
In general terms, embeddings are numerical representations of complex information that capture the most important features or relationships in a simplified, machine-readable form.
Blackbox Nature of Embeddings
- Generated by neural network models (e.g., GNNs or CNNs) trained on large datasets.
- The exact architecture and training process vary widely.
- Companies and research institutions may not disclose the full details of their models.
Population Dynamics Foundation Model (PDFM)
- Developed by Google Research (Google DeepMind) in 2024.
- Input:
  - Census data
  - Search trends
  - Other geospatial data
- Output:
  - Multi-dimensional vectors that represent population characteristics and dynamics.
What Do PDFM Embeddings Look Like?
- 330-dimensional vectors for each census tract or ZIP code
- Each dimension captures different aspects of population dynamics
- Examples of captured features:
  - Mobility patterns
  - Search behavior trends
  - Local economic activity
  - Environmental conditions
Applications of PDFM Embeddings
Health & Social Services:
- Disease prevalence prediction
- Healthcare resource allocation
Economic Analysis:
- Socioeconomic indicators
- Income and poverty estimation
Urban Planning:
- Population growth forecasting
- POI and hotspot prediction
Additional Applications
Demo 1: Using PDFM Embeddings to predict housing prices
Dataset:
- Zillow Home Value Index (ZHVI) by ZIP code
- PDFM embeddings (326 features)
Model:
- Linear Regression (Stepwise regression)
- Data mining approaches
- Output: Predicted home prices
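The stepwise idea in Demo 1 can be illustrated with a minimal sketch. This is not the demo's actual code: the data below is synthetic (8 made-up features standing in for the embedding columns, a made-up price formula), and `forward_stepwise` is a hypothetical helper that greedily adds the feature that most reduces the residual sum of squares.

```python
import random


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def forward_stepwise(X, y, n_select):
    """Greedy forward selection for linear regression with an intercept.

    X is a list of rows (one per ZIP code), y a list of target values.
    Returns (selected column indices, R^2 of the fitted model).
    """
    n, p = len(X), len(X[0])
    # Centre the target and every column so the intercept is handled implicitly.
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        cols.append([v - mean for v in col])
    tss = dot(resid, resid)

    selected = []
    for _ in range(n_select):
        # Pick the column that removes the most residual sum of squares.
        best_j, best_gain = None, 0.0
        for j in range(p):
            if j in selected:
                continue
            norm = dot(cols[j], cols[j])
            if norm < 1e-12:
                continue
            gain = dot(cols[j], resid) ** 2 / norm
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:
            break
        selected.append(best_j)
        q = cols[best_j]
        qq = dot(q, q)
        # Remove the chosen direction from the residual and the other columns.
        beta = dot(q, resid) / qq
        resid = [r - beta * v for r, v in zip(resid, q)]
        for j in range(p):
            if j not in selected:
                g = dot(cols[j], q) / qq
                cols[j] = [c - g * v for c, v in zip(cols[j], q)]
    r2 = 1.0 - dot(resid, resid) / tss
    return selected, r2


# Synthetic stand-in data: only features 2 and 5 drive the "price".
random.seed(0)
n_zip, n_feat = 200, 8
X = [[random.gauss(0, 1) for _ in range(n_feat)] for _ in range(n_zip)]
y = [300.0 + 30.0 * row[2] + 20.0 * row[5] + random.gauss(0, 1) for row in X]

selected, r2 = forward_stepwise(X, y, n_select=2)
print(selected, round(r2, 3))
```

With real embeddings the selected dimensions have no obvious meaning, so the fit should be checked on held-out ZIP codes rather than read as an explanation.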
Satellite Foundation Model (SFM) Embeddings
- Developed by Google Research
- Based on Sentinel-2 satellite imagery
- Global coverage (2017-2023)
- Spatial resolution: 10 meters
- Available through Google Earth Engine (GEE)
What Do Satellite Embeddings Capture?
- Land cover patterns
- Vegetation characteristics
- Urban development
- Environmental changes
- Temporal dynamics
Trained on a massive satellite imagery corpus using self-supervised learning.
Satellite Embeddings with Google Earth Engine (GEE)
Dataset:
GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL
- 64 bands (A00 to A63), one per embedding dimension
Key Operations:
- Filter by date and location
- Extract embeddings for points/regions
- Compare temporal changes
- Calculate similarity (dot product)
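As a rough illustration of the similarity operation, the sketch below ranks a few candidate locations by their dot product with a query vector. The 4-dimensional vectors and place names are invented for the example. Vectors are normalized first, so the dot product equals cosine similarity; if the embeddings are already unit length, the normalization step changes nothing.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    q = normalize(query)
    scores = [(name, dot(q, normalize(vec))) for name, vec in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors; real embeddings would be extracted at points or regions.
query = [0.9, 0.1, 0.0, 0.1]
candidates = {
    "park": [0.8, 0.2, 0.1, 0.1],
    "parking_lot": [0.1, 0.9, 0.2, 0.0],
    "lake": [0.0, 0.1, 0.9, 0.3],
}

ranking = rank_by_similarity(query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```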
Applications of SFM Embeddings
Environmental and Land Use Planning:
- Urban sprawl analysis
- Land cover classification
Temporal Analysis:
- Seasonal pattern detection
- Change detection
- Similarity searches across time
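One simple way to sketch change detection is to compare each pixel's embedding in two years and flag pixels whose vectors have drifted apart. The pixel IDs, 3-dimensional vectors, and the 0.2 threshold below are all hypothetical; a real threshold would need tuning against known changes.

```python
import math


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def detect_change(before, after, threshold):
    """Return {pixel_id: score} for pixels whose change score exceeds threshold.

    The score is 1 - cosine similarity: 0 means unchanged, larger means more change.
    """
    changed = {}
    for pixel_id, vec_before in before.items():
        score = 1.0 - cosine(vec_before, after[pixel_id])
        if score > threshold:
            changed[pixel_id] = score
    return changed


# Toy embeddings for three pixels in two years; only p2 changes.
emb_2017 = {"p1": [1.0, 0.0, 0.0], "p2": [0.0, 1.0, 0.0], "p3": [0.6, 0.8, 0.0]}
emb_2023 = {"p1": [1.0, 0.0, 0.0], "p2": [0.0, 0.6, 0.8], "p3": [0.6, 0.8, 0.0]}

changed = detect_change(emb_2017, emb_2023, threshold=0.2)
print(changed)
```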
Demo 2: Satellite Embeddings for Land Cover Classification
Approach:
- Use embeddings as features
- Train classification model
- Predict land cover types
Advantages:
- No need to process raw imagery
- Pre-trained representations
- Faster than traditional methods
Workflow
- Define area of interest
- Load embedding ImageCollection
- Extract features for labeled samples
- Train classifier (Random Forest, SVM)
- Apply to entire region
- Validate results
Uses Earth Engine’s cloud computing for scalability
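The train, apply, and validate steps of this workflow can be sketched locally without Earth Engine. The example below is a stand-in, not the demo's code: it uses synthetic 3-dimensional "embeddings", invented class prototypes, and a nearest-centroid classifier in place of Random Forest or SVM, purely to show embeddings used as classification features.

```python
import random


def train_centroids(samples):
    """samples: list of (vector, label). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for vec, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, vec):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))


# Synthetic labeled samples: each class is a noisy copy of a made-up prototype.
random.seed(42)
prototypes = {
    "water": [1.0, 0.0, 0.0],
    "forest": [0.0, 1.0, 0.0],
    "urban": [0.0, 0.0, 1.0],
}


def make_samples(n_per_class):
    return [
        ([p + random.gauss(0, 0.1) for p in proto], label)
        for label, proto in prototypes.items()
        for _ in range(n_per_class)
    ]


train, holdout = make_samples(30), make_samples(10)

centroids = train_centroids(train)                      # train classifier
accuracy = sum(predict(centroids, v) == y for v, y in holdout) / len(holdout)  # validate
print(f"hold-out accuracy: {accuracy:.2f}")
```

Holding out labeled samples, as done here, is the minimum check; samples that sit next to each other in space tend to inflate accuracy, so spatially separated validation areas are a safer test.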
Limitations and Concerns
Interpretability Issues:
- Black box representations
- Limited transparency in training process
Bias and Fairness:
- May encode historical biases
- Risk of perpetuating spatial inequalities
Technical Limitations:
- Resolution constraints (10m for satellite)
- Not updated in real time
…
Summary and Conclusions
The Promise:
- Powerful tools for spatial analysis
- Capture complex patterns
The Reality:
- Require careful validation
- Complement, not replace, traditional methods
Best Practices:
- Validate predictions thoroughly
- Be aware of limitations and biases
Final Verdict
Powerful GeoAI tools, but use with caution and critical thinking!
“All models are wrong, but some are useful” (George Box)