002-Upload-Completed-Tune-Artifacts¶
📥 Download 002-Upload-Complete-Tune-Artifacts.ipynb and try it out
Introduction¶
This notebook guides you through the complete workflow of deploying a fine-tuned TerraTorch model to the GeoStudio platform. The process involves three steps:
- Preparation: Setting up checkpoint and configuration files from your fine-tuned TerraTorch model
- Upload Process: Using the GeoStudio SDK to transfer your model artifacts to the GeoStudio cloud infrastructure
- Inference: Running geospatial inference tasks using your uploaded model/tune.
Prerequisites¶
Before proceeding with this notebook, ensure you have:
- Active GeoStudio Service Access: Valid credentials and permissions for the GeoStudio inference service
- SDK Installation: The GeoStudio SDK installed in your environment
- Authentication Setup: API keys configured (either via environment variables or key files)
- Model Artifacts: A completed fine-tuned TerraTorch model with both checkpoint (.ckpt) and configuration (.yaml) files
- Cloud Storage Access: Valid credentials for your object storage bucket (AWS S3, IBM Cloud Object Storage, etc.)
Note: This workflow assumes you have already completed the model training process and possess both the trained checkpoint file and its corresponding configuration file. If you need guidance on fine-tuning TerraTorch models, refer to the TerraTorch documentation first.
%load_ext autoreload
%autoreload 2
Imports & Setup¶
from IPython.display import JSON
from geostudio import Client
from geostudio import gswidgets
Connecting to Geospatial Studio¶
Connecting to the platform¶
First, we set up the connection to the platform backend. To do this, we need the base URL of the Studio UI and an API key.
To get an API Key:
- Go to the Geospatial Studio UI page and navigate to the Manage your API keys link.
- This opens a window where you can generate, view, and delete your API keys. NB: every user is limited to a maximum of two active API keys at any one time.
Store the API key and GeoStudio UI base URL in a local credentials file, for example /Users/bob/.geostudio_config_file. You can do this by:
echo "GEOSTUDIO_API_KEY=<paste_api_key_here>" > .geostudio_config_file
echo "BASE_STUDIO_UI_URL=<paste_ui_base_url_here>" >> .geostudio_config_file
Copy and paste the path to this credentials file into the call below.
#############################################################
# Initialize Geostudio client using a geostudio config file
#############################################################
geostudio_client = Client(geostudio_config_file=".geostudio_config_file")
Prepare your tune artifacts with Presigned URLs¶
Presigned URLs are temporary, signed links that let you securely access objects in your storage bucket without exposing your credentials.
There are two main kinds of presigned URLs:
- PUT presigned URL → lets you upload a file to your bucket. Think of it as a one-time signed permission slip that says: “For the next hour, anyone with this link may PUT (write) an object here.”
- GET presigned URL → lets you download the file, or allow another service to fetch it. This is what you’ll hand off to the Geospatial Studio service, since it needs to read the checkpoint and config.
Workflow:
When you use the SDK upload_file function, it:
- Generates both a PUT and a GET URL for each file.
- Uses the PUT URL locally to upload the files.
- Passes the GET URL to the service so it can later retrieve them.
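To illustrate the PUT step of this pattern, here is a minimal sketch (not part of the SDK) with an injectable HTTP function, so the upload logic can be exercised without a live bucket. The function name and signature are illustrative assumptions:

```python
from pathlib import Path
from typing import Callable

def upload_via_presigned(put_url: str, file_path: str,
                         http_put: Callable[[str, bytes], int]) -> int:
    """Upload a local file to a PUT presigned URL.

    `http_put` is any callable that issues an HTTP PUT and returns the
    status code; injecting it keeps this sketch testable offline.
    """
    data = Path(file_path).read_bytes()
    status = http_put(put_url, data)
    if status not in (200, 201, 204):
        raise RuntimeError(f"upload failed with HTTP {status}")
    return status
```

With the `requests` package installed, `http_put` could be `lambda url, data: requests.put(url, data=data).status_code`; the curl commands below achieve the same thing from the shell.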
object_name = "test-checkpoint.ckpt"  # Must be a valid string, not a path
checkpoint_file = "../sample_files/best-state_dict.ckpt"
checkpoint_urls = geostudio_client.get_fileshare_links(object_name)
checkpoint_upload_url = checkpoint_urls["upload_url"]
checkpoint_download_url = checkpoint_urls["download_url"]
!curl --progress-bar -T "$checkpoint_file" "$checkpoint_upload_url"
Upload your config file¶
object_name = "test-config.yaml"  # Must be a valid string, not a path
config_file = "../sample_files/config_deploy.yaml"
config_urls = geostudio_client.get_fileshare_links(object_name)
config_upload_url = config_urls["upload_url"]
config_download_url = config_urls["download_url"]
!curl --progress-bar -T "$config_file" "$config_upload_url"
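The comments above note that object_name must be a bare string rather than a path. A small hypothetical pre-flight check (not an SDK function) can catch that, plus a missing or empty artifact, before any bytes are transferred:

```python
from pathlib import Path

def validate_upload(object_name: str, file_path: str) -> None:
    """Fail fast before requesting presigned links.

    Checks that the object name is a plain name (no path separators)
    and that the local artifact exists and is non-empty.
    """
    if "/" in object_name or "\\" in object_name:
        raise ValueError(f"object_name must not contain path separators: {object_name!r}")
    p = Path(file_path)
    if not p.is_file():
        raise FileNotFoundError(f"artifact not found: {file_path}")
    if p.stat().st_size == 0:
        raise ValueError(f"artifact is empty: {file_path}")
```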
Register your tuning artifacts¶
Once your model artifacts are successfully uploaded to cloud storage, register them with the GeoStudio platform. This registration process:
- Creates Platform Records: Establishes your tune as a recognized model within the system
- Validates Artifacts: Confirms that uploaded files are accessible and properly formatted
- Establishes Metadata: Associates descriptive information with your model for easy identification
tune = geostudio_client.upload_completed_tunes(
data={
"name": "flood-test-001",
"description": "Fine-tuned model for flooding detection",
"tune_checkpoint_url": checkpoint_download_url,
"tune_config_url": config_download_url,
}
)
JSON(tune)
created_tune = geostudio_client.get_tune(tune_id=tune["tune_id"])
JSON(created_tune)
Run inference with uploaded tune¶
Once your tune is successfully registered and available, you can run inference by trying out the uploaded tune with a payload that includes the spatial and temporal domains for your inference.
If the inference needs to download input data from data sources such as sentinelhub, update the payload with model_input_data_spec and geoserver_push.
The model_input_data_spec section specifies which collection, connector and bands to download.
"model_input_data_spec": [
{
"bands": [
{
"band_name": "B01",
"resolution": "60m",
"description": "Coastal aerosol, 442.7 nm (S2A), 442.3 nm (S2B)"
},
.... # List all your bands here
],
"connector": "sentinelhub",
"collection": "s2_l2a",
"file_suffix": "S2L2A.tif",
"modality_tag": "S2L2A",
}
]
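Because malformed specs only fail once the run is submitted, a small hypothetical helper (not an SDK function; expected keys are taken from the example above) can check the structure locally first:

```python
def check_input_data_spec(spec: list[dict]) -> list[str]:
    """Return a list of problems found in a model_input_data_spec.

    An empty list means the spec looks structurally sound.
    """
    problems = []
    for i, entry in enumerate(spec):
        # Top-level keys shown in the example payload above.
        for key in ("bands", "connector", "collection", "file_suffix", "modality_tag"):
            if key not in entry:
                problems.append(f"entry {i}: missing {key!r}")
        # Every band entry needs at least a band_name.
        for j, band in enumerate(entry.get("bands", [])):
            if "band_name" not in band:
                problems.append(f"entry {i}, band {j}: missing 'band_name'")
    return problems
```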
The geoserver_push section specifies which layers to push after running inference. By default, the rgb and model_output layers are pushed.
"geoserver_push": [
{
"z_index": 0,
"workspace": "geofm",
"layer_name": "input_rgb",
"file_suffix": "",
"display_name": "Input image (RGB)",
"filepath_key": "model_input_original_image_rgb",
"geoserver_style": {
"rgb": [
{
"label": "RedChannel",
"channel": 1,
"maxValue": 2000,
"minValue": 0
},
{
"label": "GreenChannel",
"channel": 2,
"maxValue": 2000,
"minValue": 0
},
{
"label": "BlueChannel",
"channel": 3,
"maxValue": 2000,
"minValue": 0
}
]
},
"visible_by_default": "True"
},
{
"z_index": 1,
"workspace": "geofm",
"layer_name": "pred",
"file_suffix": "",
"display_name": "Model prediction",
"filepath_key": "model_output_image",
"geoserver_style": {
"segmentation": [
{
"color": "#7d7247",
"label": "no-ships",
"opacity": 0,
"quantity": "0"
},
{
"color": "#c1121f",
"label": "ships",
"opacity": 1,
"quantity": "1"
}
]
},
"visible_by_default": "True"
}
],
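In the example above, z_index controls stacking order (the RGB input at 0 sits beneath the prediction at 1). A minimal sketch, assuming only the fields shown above, of how a client might derive the bottom-to-top draw order:

```python
def layer_draw_order(geoserver_push: list[dict]) -> list[str]:
    """Return layer_name values sorted bottom-to-top by z_index."""
    return [layer["layer_name"]
            for layer in sorted(geoserver_push, key=lambda layer: layer["z_index"])]
```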
inference = geostudio_client.try_out_tune(
tune_id=created_tune["id"],
data={
"spatial_domain": {
"bbox": [
[
92.5290608449473,
26.185925522799945,
92.80352715571134,
26.419756619674683
]
]
},
"temporal_domain": [
"2024-07-25_2024-07-27"
],
"model_display_name": "geofm-sandbox-models",
"description": "test",
"location": "Jarani, Nagaon, Nagaon, Assam, India",
# "model_input_data_spec": [{}],
# "geoserver_push":[{}],
"post_processing": {
"cloud-masking": "False",
"cloud_masking": "False",
"ocean-masking": "False",
"ocean_masking": "False",
"snow_ice_masking": "False",
"permanent_water_masking": "False"
},
}
)
JSON(inference)
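The temporal_domain entries above are a start/end date pair joined by an underscore. A small hypothetical helper (mirroring the format shown, not an SDK function) to build and sanity-check such ranges:

```python
from datetime import date

def temporal_range(start: date, end: date) -> str:
    """Format a date pair as 'YYYY-MM-DD_YYYY-MM-DD', as in the payload above."""
    if end < start:
        raise ValueError("end date precedes start date")
    return f"{start.isoformat()}_{end.isoformat()}"
```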
Fetch and View inference results¶
Accessing inference outputs¶
Once an inference run's status is completed, the inputs and outputs of each task within the inference are packaged into a zip file and uploaded to a URL you can use to download the files.
To access and view the inference task files:
- Get the inference tasks list
- Identify the specific inference task you want to view
- Use the SDK to visualize the inference task results
# Check status of the submitted inference
inference = geostudio_client.get_inference(inference["id"])
JSON(inference)
display(f"Inference Status: {inference['status']}")
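Inference runs take time, so the one-shot status check above may need repeating. A generic polling sketch (the status fetcher is injectable so it can be tested offline; the interval, timeout, and "failed" terminal status are illustrative assumptions):

```python
import time
from typing import Callable

def wait_for_status(fetch: Callable[[], str], target: str = "completed",
                    interval_s: float = 5.0, timeout_s: float = 600.0,
                    sleep=time.sleep) -> str:
    """Poll `fetch()` until it returns `target` or a terminal failure.

    `fetch` would typically wrap something like
    geostudio_client.get_inference(inference_id)["status"].
    """
    waited = 0.0
    while True:
        status = fetch()
        if status in (target, "failed"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still {status!r} after {timeout_s}s")
        sleep(interval_s)
        waited += interval_s
```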
# Get the inference tasks list
inf_tasks_res = geostudio_client.get_inference_tasks(inference["id"])
inf_tasks_res
# Select a task to view
selected_task = 0
selected_task_id = f"{inference['id']}-task_{selected_task}"
# Visualize output files with the SDK
gswidgets.inferenceTaskViewer(client=geostudio_client, task_id=selected_task_id)