Lab 2¶
Onboarding Pre-computed Examples
⏱️ Estimated Duration: 20 minutes
📊 Difficulty Level: Beginner
📥 Getting the Lab Materials
Clone the repository and open the lab notebook:
git clone https://github.com/terrastackai/geospatial-studio.git
cd geospatial-studio/workshop/docs/notebooks
jupyter notebook lab2-onboarding-examples.ipynb
🎯 Learning Objectives¶
By the end of this lab, you will be able to:
- Understand what "onboarding" means in Geospatial Studio
- Load pre-computed inference examples into your Studio
- Visualize inference results in the UI
- Understand the structure of inference payloads
- Know how to adapt the process for your own data
📖 What is Onboarding?¶
Onboarding is the process of loading data into Geospatial Studio so you can:
- View it in the UI
- Use it for training models
- Run inference on it
- Share it with your team
There are two main types of onboarding:
- Inference Results (this lab) - Pre-computed model outputs
- Training Datasets (Lab 4) - Raw data for model training
🌍 About This Lab's Example¶
We'll onboard a pre-computed Above Ground Biomass (AGB) example from Karen, Nairobi, Kenya. This example shows:
- Input: RGB satellite imagery
- Output: Predicted biomass values (in MgC/ha - Megagrams of Carbon per hectare)
- Use case: Forest carbon stock estimation
This is a real-world application used for:
- Carbon credit verification
- Forest management
- Climate change monitoring
🔧 Setup¶
First, let's set up our environment and connect to the Studio.
%load_ext autoreload
%autoreload 2
# Import required packages
import json
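import urllib3
# Silence the warning urllib3 emits for unverified HTTPS requests
# (e.g. a local deployment that uses a self-signed certificate)
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)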
from geostudio import Client
Connect to Studio¶
Use the same connection method from Lab 1:
# Initialize the client
client = Client(geostudio_config_file=".geostudio_config_file")
print("✅ Connected to Geospatial Studio!")
📦 Understanding the Inference Payload¶
Before we onboard the example, let's understand what we're loading. An inference payload contains:
1. Data Location¶
Where the files are stored (URLs to pre-computed results)
2. Visualization Settings¶
How to display the data in the UI:
- Layer names
- Color schemes
- Display order
3. Metadata¶
Information about the inference:
- Location
- Description
- Model used
Let's look at the structure:
# This is the structure of an inference payload
example_structure = {
    "model_display_name": "Name of the model",
    "description": "What this inference shows",
    "location": "Geographic location",
    "spatial_domain": {
        "urls": [
            "URL to input imagery",
            "URL to model predictions"
        ]
    },
    "geoserver_push": [
        {
            "layer_name": "unique_layer_id",
            "display_name": "Human-readable name",
            "geoserver_style": {
                "rgb": "[band indices]"  # or "regression": "[config]" or "segmentation": "[config]"
            }
        }
    ],
    "demo": {
        "demo": True,
        "section_name": "Category in UI"
    }
}
print("📋 Inference Payload Structure")
print(json.dumps(example_structure, indent=2))
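Before submitting anything, it can help to confirm that a payload has the pieces described above. The following is a minimal sketch, not the Studio's own validation: the `REQUIRED_KEYS` tuple and the helper name `find_payload_problems` are assumptions for illustration, derived only from the structure shown in this lab.

```python
# Hypothetical pre-flight check. The key lists are assumptions based on the
# example structure in this lab, not the Studio's official schema.
REQUIRED_KEYS = ("model_display_name", "description", "location",
                 "spatial_domain", "geoserver_push")
REQUIRED_LAYER_KEYS = ("layer_name", "display_name", "geoserver_style")


def find_payload_problems(payload):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in payload]
    if not payload.get("spatial_domain", {}).get("urls"):
        problems.append("spatial_domain.urls is empty")
    for i, layer in enumerate(payload.get("geoserver_push", [])):
        for key in REQUIRED_LAYER_KEYS:
            if key not in layer:
                problems.append(f"geoserver_push[{i}] missing key: {key}")
    return problems


# A deliberately incomplete payload to show what the check reports
sample = {
    "model_display_name": "Name of the model",
    "description": "What this inference shows",
    "location": "Geographic location",
    "spatial_domain": {"urls": ["URL to input imagery"]},
    "geoserver_push": [{"layer_name": "unique_layer_id"}],
}
for problem in find_payload_problems(sample):
    print("⚠️", problem)
```

An empty list only means the payload matches the shape used in this lab; the Studio may still reject it for other reasons.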
🌳 Onboard the AGB Karen Example¶
Now let's create the actual payload for our Above Ground Biomass example.
What This Example Contains:¶
- RGB Layer - True color satellite image of Karen, Nairobi
- AGB Prediction Layer - Estimated biomass (0-300 MgC/ha)
The data is hosted on IBM Cloud Object Storage and ready to load.
# Create the inference payload for AGB Karen example
agb_karen_payload = {
    # Model information
    "fine_tuning_id": "sandbox",
    "model_display_name": "add-layer-sandbox-model",
    # Metadata
    "description": "Above Ground Biomass (AGB) Estimation",
    "location": "Karen, Nairobi, Kenya",
    # Data URLs - Pre-computed results stored in cloud
    "spatial_domain": {
        "urls": [
            # RGB imagery
            "https://geospatial-studio-example-data.s3.us-east.cloud-object-storage.appdomain.cloud/test-add-layer/d5c33eb4-635d-4070-b72c-d57351ab2586_hls-agb_rgb.zip",
            # AGB predictions
            "https://geospatial-studio-example-data.s3.us-east.cloud-object-storage.appdomain.cloud/test-add-layer/d5c33eb4-635d-4070-b72c-d57351ab2586_hls-agb_pred_postprocessed.zip"
        ]
    },
    "temporal_domain": [],
    # Visualization configuration
    "geoserver_push": [
        # Layer 1: RGB Image
        {
            "workspace": "geofm",
            "layer_name": "karen_agb_rgb",
            "display_name": "2024 Karen AGB RGB",
            "filepath_key": "original_input_image",
            "file_suffix": "",
            "z_index": 0,  # Bottom layer
            "visible_by_default": "True",
            "geoserver_style": {
                "rgb": [
                    {"minValue": 0, "maxValue": 255, "channel": 1, "label": "RedChannel"},
                    {"minValue": 0, "maxValue": 255, "channel": 2, "label": "GreenChannel"},
                    {"minValue": 0, "maxValue": 255, "channel": 3, "label": "BlueChannel"}
                ]
            }
        },
        # Layer 2: AGB Predictions
        {
            "workspace": "geofm",
            "layer_name": "karen_agb_pred",
            "display_name": "2024 Karen AGB Prediction",
            "filepath_key": "original_input_image",
            "file_suffix": "",
            "z_index": 1,  # Top layer
            "visible_by_default": "True",
            "geoserver_style": {
                "regression": [
                    {"color": "#d0ffc9", "quantity": "0", "opacity": 1, "label": "0 MgC/ha"},
                    {"color": "#2dba18", "quantity": "300", "opacity": 1, "label": "300 MgC/ha"}
                ]
            }
        }
    ],
    # Mark as demo example
    "demo": {
        "demo": True,
        "section_name": "My Examples"
    }
}
print("✅ Payload created successfully!")
print(f"\n📍 Location: {agb_karen_payload['location']}")
print(f"📊 Description: {agb_karen_payload['description']}")
print(f"🗺️ Layers: {len(agb_karen_payload['geoserver_push'])}")
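With more than one layer, two easy mistakes are reusing a `layer_name` and mixing up the stacking order. Here is a small sketch that checks both. It assumes that a higher `z_index` is drawn on top (as the "Bottom layer" / "Top layer" comments in the payload suggest) and that layer names should be unique; `describe_layers` is a hypothetical helper, not part of the SDK.

```python
from collections import Counter


def describe_layers(layers):
    """Return (duplicate layer names, display names ordered bottom to top)."""
    counts = Counter(layer["layer_name"] for layer in layers)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    ordered = sorted(layers, key=lambda layer: layer["z_index"])
    stacking = [layer["display_name"] for layer in ordered]
    return duplicates, stacking


# Trimmed-down copies of the two layers from the AGB Karen payload
layers = [
    {"layer_name": "karen_agb_pred", "display_name": "2024 Karen AGB Prediction", "z_index": 1},
    {"layer_name": "karen_agb_rgb", "display_name": "2024 Karen AGB RGB", "z_index": 0},
]
duplicates, stacking = describe_layers(layers)
print("Duplicate layer names:", duplicates or "none")
print("Bottom to top:", " -> ".join(stacking))
```

In the notebook you could pass `agb_karen_payload["geoserver_push"]` directly instead of the trimmed copies.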
Submit the Inference¶
Now we'll submit this payload to Studio. This will:
- Download the data from cloud storage
- Process and validate the files
- Publish layers to GeoServer
- Make it available in the UI
Note: This process takes 2-5 minutes depending on your network speed.
# Submit the inference
print("🚀 Submitting inference to Studio...")
print("This will download and process the data.\n")
response = client.submit_inference(data=agb_karen_payload)
print("✅ Inference submitted successfully!")
print(f"\n📋 Inference ID: {response['id']}")
print(f"📊 Status: {response['status']}")
print(f"📍 Location: {response['location']}")
# Save the inference ID for later
inference_id = response['id']
⏳ Monitor Processing¶
Let's wait for the inference to complete. The SDK will poll the status automatically.
print("⏳ Waiting for inference to complete...")
print("This typically takes 2-5 minutes.\n")
# Poll until finished
client.poll_inference_until_finished(inference_id=inference_id)
print("\n✅ Inference completed!")
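If you are curious what "polling" involves, the idea can be sketched as a loop that asks for the status, stops on a terminal state, and gives up after a timeout. This is an illustration of the general pattern only, not the SDK's implementation: the status strings, the default intervals, and the helper name `poll_until_finished` are all assumptions.

```python
import time


def poll_until_finished(get_status, interval_s=10.0, timeout_s=600.0,
                        done_states=("COMPLETED", "FAILED"), sleep=time.sleep):
    """Call get_status() until it returns a terminal state or the timeout expires."""
    waited = 0.0
    while True:
        status = get_status()
        print(f"status: {status}")
        if status in done_states:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Simulated status sequence standing in for repeated requests to the Studio.
# The state names are hypothetical.
fake_statuses = iter(["PENDING", "RUNNING", "COMPLETED"])
final = poll_until_finished(lambda: next(fake_statuses), sleep=lambda seconds: None)
print("final:", final)
```

Against a real Studio, `get_status` could be `lambda: client.get_inference(inference_id=inference_id)["status"]`, with `done_states` set to whatever terminal values your deployment actually reports.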
Check Final Status¶
Let's verify everything completed successfully:
# Get final inference details
final_status = client.get_inference(inference_id=inference_id)
print("📊 Final Inference Status")
print("=" * 50)
print(f"Status: {final_status['status']}")
print(f"Location: {final_status['location']}")
print(f"Description: {final_status['description']}")
print(f"Created: {final_status['created_at']}")
print(f"Completed: {final_status.get('completed_at', 'N/A')}")
🎨 View in the UI¶
Now that the inference is loaded, you can view it in the Geospatial Studio UI!
Steps to View:¶
Open the Studio UI in your browser:
- Local: https://localhost:4180
- Or your deployed URL
Navigate to the Inference Lab:
- Click on "Inference Lab" in the left sidebar
- Or go directly to: https://localhost:4180/inference
Find Your Example:
- Look for "My Examples" section
- Click on "2024 Karen AGB RGB"
Explore the Layers:
- Toggle between RGB and AGB Prediction layers
- Use the layer controls to adjust opacity
- Zoom in to see details
What You'll See:¶
- RGB Layer: Satellite image showing vegetation, buildings, and terrain
- AGB Layer: Color-coded biomass estimates:
- Light green: Low biomass (0-100 MgC/ha)
- Dark green: High biomass (200-300 MgC/ha)
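To get a feel for how a biomass value turns into a shade of green, here is a rough sketch that blends the two colors from the payload's `regression` style. It assumes the ramp is interpolated linearly in RGB between the two stops, which is how GeoServer "ramp" color maps generally behave; the exact colors rendered in the UI may differ slightly. `ramp_color` is a hypothetical helper.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def ramp_color(value, low=(0.0, "#d0ffc9"), high=(300.0, "#2dba18")):
    """Linearly interpolate between two color stops, clamping outside the range."""
    (q0, c0), (q1, c1) = low, high
    t = min(max((value - q0) / (q1 - q0), 0.0), 1.0)
    rgb0, rgb1 = hex_to_rgb(c0), hex_to_rgb(c1)
    mixed = [round(a + (b - a) * t) for a, b in zip(rgb0, rgb1)]
    return "#{:02x}{:02x}{:02x}".format(*mixed)


for biomass in (0, 100, 200, 300):
    print(f"{biomass:>3} MgC/ha -> {ramp_color(biomass)}")
```

Values below 0 or above 300 are clamped to the end colors in this sketch.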
🎓 Understanding What Happened¶
Let's break down what the Studio did behind the scenes:
1. Data Download¶
Studio downloaded ZIP files from cloud storage:
├── RGB imagery (GeoTIFF)
└── AGB predictions (GeoTIFF)
2. Data Validation¶
Checked:
✓ File format (GeoTIFF)
✓ Coordinate system
✓ Spatial extent
✓ Data integrity
3. GeoServer Publishing¶
Created map layers:
├── karen_agb_rgb (RGB visualization)
└── karen_agb_pred (Regression color ramp)
4. Database Registration¶
Stored metadata:
├── Inference ID
├── Location
├── Timestamps
└── Layer configurations
🔧 How to Onboard Your Own Data¶
Now that you've seen how to onboard a pre-computed example, here's how you would onboard your own inference results:
Step 1: Prepare Your Data¶
You need:
- Input imagery (GeoTIFF format)
- Model predictions (GeoTIFF format)
- Both files in ZIP archives
- Files uploaded to accessible URLs (S3, HTTP, etc.)
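Packaging a GeoTIFF into a ZIP archive can be done with Python's standard library. The sketch below assumes one file per archive, stored under its bare filename; whether the Studio requires a particular archive layout or naming is not stated in this lab, so treat that as an assumption. `zip_geotiff` is a hypothetical helper, and the demo uses a placeholder file rather than real imagery.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_geotiff(tif_path, zip_path=None):
    """Write a ZIP archive containing one file, stored under its bare filename."""
    tif_path = Path(tif_path)
    zip_path = Path(zip_path) if zip_path else tif_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(tif_path, arcname=tif_path.name)
    return zip_path


# Demo with a placeholder file (not a real GeoTIFF)
with tempfile.TemporaryDirectory() as tmp:
    demo_tif = Path(tmp) / "predictions.tif"
    demo_tif.write_bytes(b"placeholder bytes, not a real GeoTIFF")
    archive_path = zip_geotiff(demo_tif)
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
print("Archive contents:", names)
```

After zipping, upload the archive to storage that the Studio can reach and use its direct download URL in `spatial_domain.urls`.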
Step 2: Create Your Payload¶
Modify the payload structure:
# Template for your own data
your_payload_template = {
    "model_display_name": "your-model-name",
    "description": "Your inference description",
    "location": "Your location",
    "spatial_domain": {
        "urls": [
            "https://your-storage.com/input-imagery.zip",
            "https://your-storage.com/predictions.zip"
        ]
    },
    "temporal_domain": [],
    "geoserver_push": [
        {
            "workspace": "geofm",
            "layer_name": "your_unique_layer_name",
            "display_name": "Your Display Name",
            "filepath_key": "original_input_image",
            "file_suffix": "",
            "z_index": 0,
            "visible_by_default": "True",
            "geoserver_style": {
                # Choose one:
                # "rgb": [...] for RGB imagery
                # "regression": [...] for continuous values
                # "segmentation": [...] for classes
            }
        }
    ],
    "demo": {
        "demo": True,
        "section_name": "My Custom Examples"
    }
}
print("📝 Template for your own data:")
print(json.dumps(your_payload_template, indent=2))
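If you have several layers, a small builder can save you from copy-pasting the same fields. This is a convenience sketch only: `make_layer` is a hypothetical helper, and its defaults simply mirror the values used in this lab's payloads (including passing `visible_by_default` as a string), which may not suit every deployment.

```python
def make_layer(layer_name, display_name, style, z_index=0, visible=True):
    """Build one geoserver_push entry using the field values from this lab."""
    return {
        "workspace": "geofm",
        "layer_name": layer_name,
        "display_name": display_name,
        "filepath_key": "original_input_image",
        "file_suffix": "",
        "z_index": z_index,
        # The lab's payloads pass this flag as the string "True";
        # using "False" for hidden layers is an assumption.
        "visible_by_default": str(visible),
        "geoserver_style": style,
    }


rgb_style = {
    "rgb": [
        {"minValue": 0, "maxValue": 255, "channel": 1, "label": "Red"},
        {"minValue": 0, "maxValue": 255, "channel": 2, "label": "Green"},
        {"minValue": 0, "maxValue": 255, "channel": 3, "label": "Blue"},
    ]
}
layer = make_layer("your_unique_layer_name", "Your Display Name", rgb_style)
print(layer["layer_name"], "| z_index:", layer["z_index"],
      "| visible:", layer["visible_by_default"])
```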
Step 3: Choose Visualization Style¶
For RGB Imagery:¶
"geoserver_style": {
    "rgb": [
        {"minValue": 0, "maxValue": 255, "channel": 1, "label": "Red"},
        {"minValue": 0, "maxValue": 255, "channel": 2, "label": "Green"},
        {"minValue": 0, "maxValue": 255, "channel": 3, "label": "Blue"}
    ]
}
For Segmentation (Classes):¶
"geoserver_style": {
    "segmentation": [
        {"quantity": "0", "label": "Background", "color": "#000000", "opacity": 0},
        {"quantity": "1", "label": "Water", "color": "#0000ff", "opacity": 1},
        {"quantity": "2", "label": "Vegetation", "color": "#00ff00", "opacity": 1}
    ]
}
For Regression (Continuous Values):¶
"geoserver_style": {
    "regression": [
        {"color": "#ffffcc", "quantity": "0", "opacity": 1, "label": "Low"},
        {"color": "#006837", "quantity": "100", "opacity": 1, "label": "High"}
    ]
}
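The style dictionaries above follow a regular pattern, so they can also be generated from a short description of your data. The two functions below are hypothetical helpers that reproduce the shapes shown in this lab; they are not part of the SDK, and the Studio may accept options that this sketch does not cover.

```python
def segmentation_style(classes):
    """classes: iterable of (pixel_value, label, hex_color, opacity) tuples."""
    return {"segmentation": [
        {"quantity": str(value), "label": label, "color": color, "opacity": opacity}
        for value, label, color, opacity in classes
    ]}


def regression_style(low, high, low_color="#ffffcc", high_color="#006837", unit=""):
    """Two-stop color ramp between the values `low` and `high`."""
    suffix = f" {unit}" if unit else ""
    return {"regression": [
        {"color": low_color, "quantity": str(low), "opacity": 1, "label": f"{low}{suffix}"},
        {"color": high_color, "quantity": str(high), "opacity": 1, "label": f"{high}{suffix}"},
    ]}


land_cover = segmentation_style([
    (0, "Background", "#000000", 0),
    (1, "Water", "#0000ff", 1),
    (2, "Vegetation", "#00ff00", 1),
])
biomass = regression_style(0, 300, "#d0ffc9", "#2dba18", unit="MgC/ha")
print(land_cover)
print(biomass)
```

Either result can be used as the value of a layer's `geoserver_style` key.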
Step 4: Submit Your Data¶
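# Submit your custom inference
# Start from the Step 2 template; replace its placeholder URLs, names, and style first
your_payload = your_payload_template
response = client.submit_inference(data=your_payload)
client.poll_inference_until_finished(inference_id=response['id'])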
⚠️ Common Issues and Solutions¶
Issue 1: "URL not accessible"¶
Solution: Ensure your URLs are:
- Publicly accessible or have proper credentials
- Direct download links (not web pages)
- Valid and not expired
Issue 2: "Invalid GeoTIFF format"¶
Solution: Check that your files:
- Are valid GeoTIFF format
- Have proper georeferencing
- Are not corrupted
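A quick first check is whether the file is a TIFF at all, which you can tell from its first four bytes. The sketch below recognizes classic TIFF and BigTIFF headers. It does not verify georeferencing or detect corruption further into the file; for that you would need a geospatial tool such as `gdalinfo`. `tiff_kind` is a hypothetical helper, and the demo writes tiny placeholder files.

```python
import os
import tempfile

# Byte-order mark ("II" or "MM") followed by the version number 42 or 43
TIFF_MAGIC = {
    b"II*\x00": "classic TIFF, little-endian",
    b"MM\x00*": "classic TIFF, big-endian",
    b"II+\x00": "BigTIFF, little-endian",
    b"MM\x00+": "BigTIFF, big-endian",
}


def tiff_kind(path):
    """Return a description of the TIFF header, or None if it is not a TIFF."""
    with open(path, "rb") as handle:
        return TIFF_MAGIC.get(handle.read(4))


# Demo with placeholder files: one with a TIFF header, one without
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tif")
    bad = os.path.join(tmp, "bad.tif")
    with open(good, "wb") as handle:
        handle.write(b"II*\x00" + b"\x00" * 4)
    with open(bad, "wb") as handle:
        handle.write(b"<html>Access denied</html>")
    good_kind = tiff_kind(good)
    bad_kind = tiff_kind(bad)
print("good.tif:", good_kind)
print("bad.tif:", bad_kind)
```

A `None` result often means the download returned an error page instead of the image.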
Issue 3: "Layer not appearing in UI"¶
Solution:
- Wait for processing to complete
- Refresh the UI page
- Check the inference status
- Verify GeoServer is running
Issue 4: "Processing takes too long"¶
Solution:
- Large files take longer (5-10 minutes)
- Check network connection
- Monitor backend logs if needed
📚 Summary¶
In this lab, you learned:
✅ What onboarding means - Loading data into Studio
✅ Inference payload structure - How to describe your data
✅ Onboarding pre-computed examples - Using the AGB Karen example
✅ Monitoring progress - Tracking inference status
✅ Viewing results - Exploring data in the UI
✅ Adapting for your data - How to onboard custom results
Key Takeaways:¶
- Onboarding is flexible - Works with various data types and sources
- Visualization is configurable - Choose RGB, segmentation, or regression styles
- Process is automated - Studio handles download, validation, and publishing
- Results are shareable - Team members can view in the UI
Next Steps:¶
- Lab 3: Learn to run inference with existing models
- Lab 4: Complete end-to-end workflow with model training
- Try your own data: Adapt the payload for your use case
Ready for Lab 3? Let's learn how to run inference with pre-trained models! 🚀