Intro to the studio

Dataset Factory

The Geospatial Exploration and Orchestration Studio allows users to onboard their curated data for fine-tuning. The Studio uses Terrakit, a Python package for finding, retrieving and processing geospatial information, to find and fetch data from a range of data connectors. In Terrakit, a data connector is a platform (data source) that provides access to geospatial data.

Terrakit currently supports the following data connectors:

  • Sentinel Hub
  • NASA Earthdata
  • Sentinel AWS
  • IBM Research STAC
  • The Weather Company

Each data connector has different access requirements. Check the Terrakit documentation for the specific access requirements of each data connector.

When onboarding your curated dataset to the Studio, you need to define:

  • The data connectors to use and information about the data sources, such as the specific collection and bands.
  • Information about your dataset, such as its name, description, and data and label file suffixes.
  • Specific configurations that aid in the data onboarding and data fetching processes in the Studio.
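
The three groups of information above can be pictured as a single request body. The sketch below is illustrative only: every key name (`dataset_name`, `data_sources`, `connector`, `onboarding_options` and so on), the connector identifier and all values are hypothetical placeholders rather than the Studio's actual schema. The Dataset factory API reference remains the authority on field names.

```python
# Hypothetical onboarding definition. Key names and values are illustrative
# assumptions, not the Studio's real schema.
onboarding_request = {
    # Information about the dataset itself
    "dataset_name": "example-flood-dataset",
    "description": "Example curated dataset for segmentation",
    "data_file_suffix": "_data.tif",
    "label_file_suffix": "_label.tif",
    # One entry per data source: connector, collection and bands
    "data_sources": [
        {
            "connector": "sentinelhub",  # hypothetical connector identifier
            "collection": "s2_l2a",
            "bands": ["B02", "B03", "B04"],
        }
    ],
    # Configuration that aids the onboarding and data fetching processes
    "onboarding_options": {},
}

# Keys this sketch treats as mandatory (an assumption made for illustration).
REQUIRED_KEYS = ("dataset_name", "description", "data_sources")


def missing_keys(request: dict) -> list[str]:
    """Return the required keys that are absent or empty in the request."""
    return [key for key in REQUIRED_KEYS if not request.get(key)]


print(missing_keys(onboarding_request))  # prints []
```

Checking a definition locally like this, before sending it, is one way to catch an incomplete payload early.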

Check out the Dataset factory API reference page for a full list and description of what you need to define when onboarding a dataset to the GEOStudio.

The Geospatial Studio allows users to onboard either multi-modal or uni-modal data. For multi-modal data, users must provide a list containing a separate data source for each input modality of the dataset. The Studio lets users define a modality tag parameter for each collection, which the Terramind model uses to identify the specific collection to use.

The table below shows the modality tags associated with each collection across the different data connectors:

| Modality tag | Sentinel Hub collections | NASA Earthdata collections | Sentinel AWS collections | IBM Research STAC collections | The Weather Company collections |
|---|---|---|---|---|---|
| S2L1C | s2_l1c | | | | |
| DEM | dem | | | | |
| S1GRD | s1_grd | | | | |
| HLS_L30 | hls_l30 | HLSL30_2.0 | | | weathercompany-daily-forecast |
| HLS_S30 | hls_s30 | HLSS30_2.0 | | | |
| S2L2A | s2_l2a | | sentinel-2-l2a | | |
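
As a rough illustration of how the modality tags relate to collections, the sketch below transcribes the rows of the table into a Python dictionary and looks up the tag for a given collection name. The helper `modality_tag_for` and the `collection` and `modality_tag` key names are hypothetical, introduced only for this example; they are not part of Terrakit or the Studio API.

```python
# Modality tag -> collection names, transcribed row by row from the table above.
MODALITY_COLLECTIONS = {
    "S2L1C": ["s2_l1c"],
    "DEM": ["dem"],
    "S1GRD": ["s1_grd"],
    "HLS_L30": ["hls_l30", "HLSL30_2.0", "weathercompany-daily-forecast"],
    "HLS_S30": ["hls_s30", "HLSS30_2.0"],
    "S2L2A": ["s2_l2a", "sentinel-2-l2a"],
}


def modality_tag_for(collection: str) -> str:
    """Return the modality tag whose table row lists the given collection."""
    for tag, collections in MODALITY_COLLECTIONS.items():
        if collection in collections:
            return tag
    raise KeyError(f"no modality tag listed for collection {collection!r}")


# Hypothetical multi-modal definition: one data source per input modality.
# The "collection" and "modality_tag" key names are assumptions.
data_sources = [
    {"collection": name, "modality_tag": modality_tag_for(name)}
    for name in ("s1_grd", "s2_l2a", "dem")
]
print([source["modality_tag"] for source in data_sources])
# prints ['S1GRD', 'S2L2A', 'DEM']
```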

Check out these example payloads, which define most of the values you will need to onboard the sample datasets available in the Studio.

Fine-tuning

To run a fine-tuning task in the Studio, you need to select the following items:

  • tuning task type - The type of learning task you are attempting. Based on the task selected, the Studio provides a configuration template (tuning template) that is used for fine-tuning. The GEOStudio currently offers the following fine-tuning tasks/templates:

| Tuning task type | Description |
|---|---|
| Segmentation | Generic template for v1 and v2 models: Segmentation |
| Regression | Generic template for v1 and v2 models: Regression |
| terramind: Segmentation | Terramind multimodal task for Segmentation |
| clay_v1 : Segmentation | Segmentation for the Clay backbone models |
| timm_resnet : Segmentation | Segmentation for the ResNet backbone models |
| timm_convnext : Segmentation | Segmentation for the ConvNeXt backbone models |

You can also create and manage your own tuning templates. See the SDK guide for the parameters you need to define to create your own template, as well as example payloads.

  • fine-tuning dataset - The dataset you will use to train the model for your particular application.
  • base foundation model (backbone model) - The geospatial foundation model you will use as the starting point for your tuning task.

In the Studio, specific tuning templates are compatible with specific backbone models. Below is the current list of available templates and their compatible backbone models. Fine-tuning in the GEOStudio leverages TerraTorch, a flexible fine-tuning framework for Geospatial Foundation Models (GFMs) based on TorchGeo and Lightning. In theory, any model available in TerraTorch can be supported with an appropriate config/template, and more will be made available in the future.

| Model family | Backbone model | Tuning template |
|---|---|---|
| Prithvi | | Segmentation, Regression |
| Terramind | | terramind: Segmentation |
| Clay | clay_v1_base | clay_v1: Segmentation |
| ResNet | timm_resnet152, timm_resnet101, timm_resnet50, timm_resnet18, timm_resnet34 | timm_resnet : Segmentation, clay_v1: Segmentation |
| Convnext | | timm_convnext : Segmentation |
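
The compatibility rule can be sketched as a simple lookup. The example below is a minimal illustration, not Studio or TerraTorch code: it covers only the Clay and ResNet families, since those are the ones whose backbone names appear in the list above, and it copies their template entries as listed there. The names `COMPATIBILITY`, `normalize` and `is_compatible` are hypothetical helpers. Because the spacing around the colon in template names varies in this page, the sketch normalizes it before comparing.

```python
# Family -> backbones and tuning templates, as listed above for the two
# families whose backbone names are given. Template names are stored in
# normalized form ("family: Task").
COMPATIBILITY = {
    "Clay": {
        "backbones": {"clay_v1_base"},
        "templates": {"clay_v1: Segmentation"},
    },
    "ResNet": {
        "backbones": {
            "timm_resnet152",
            "timm_resnet101",
            "timm_resnet50",
            "timm_resnet18",
            "timm_resnet34",
        },
        # Both entries appear under ResNet in the list above.
        "templates": {"timm_resnet: Segmentation", "clay_v1: Segmentation"},
    },
}


def normalize(template: str) -> str:
    """Collapse spacing around the colon, e.g. 'timm_resnet : Segmentation'."""
    family, sep, task = template.partition(":")
    if not sep:
        return template.strip()
    return f"{family.strip()}: {task.strip()}"


def is_compatible(backbone: str, template: str) -> bool:
    """Return True if some family lists both the backbone and the template."""
    wanted = normalize(template)
    return any(
        backbone in entry["backbones"] and wanted in entry["templates"]
        for entry in COMPATIBILITY.values()
    )


print(is_compatible("timm_resnet50", "timm_resnet : Segmentation"))  # prints True
print(is_compatible("clay_v1_base", "timm_resnet: Segmentation"))  # prints False
```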

Check out these example template configs for sample JSON configs for each of these tuning templates, as well as these fine-tuning configs for the datasets available in the Studio.

Inference

The GEOStudio platform provides a no-code portal for running inference with different fine-tuned models and visualizing the results. A user selects a model, a spatial domain and a temporal range, and the Studio backend does the rest. Check out the detailed UI and SDK user guides on how to run inference in the Studio.
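
For readers using the SDK rather than the portal, the three selections (model, spatial domain, temporal range) can be thought of as one request. The sketch below is a hypothetical illustration: the function `build_inference_request`, the key names and the example model identifier are assumptions made for this example and do not reflect the Studio's actual inference schema, which the SDK user guide documents.

```python
from datetime import date


def build_inference_request(model_id: str, bbox: tuple, start: date, end: date) -> dict:
    """Assemble a hypothetical inference request after basic sanity checks.

    bbox is (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon < max_lon <= 180 and -90 <= min_lat < max_lat <= 90):
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat) in degrees")
    if start > end:
        raise ValueError("start date must not be after end date")
    # Key names below are illustrative assumptions, not the real schema.
    return {
        "model_id": model_id,
        "spatial_domain": {"bbox": list(bbox)},
        "temporal_range": {"start": start.isoformat(), "end": end.isoformat()},
    }


request = build_inference_request(
    "example-flood-model",  # hypothetical fine-tuned model identifier
    (36.6, -1.5, 37.1, -1.1),
    date(2024, 1, 1),
    date(2024, 1, 31),
)
print(request["temporal_range"])
# prints {'start': '2024-01-01', 'end': '2024-01-31'}
```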

You can use these inference payload examples for testing inference in the Studio.