002-Add-Precomputed-Examples¶
📥 Download 002-Add-Precomputed-Examples.ipynb and try it out
Introduction¶
This notebook is meant to help someone with minimal prior knowledge make meaningful use of the most important functions of the Geospatial SDK.
For more information about the Geospatial Studio see the docs page: Geospatial Studio Docs
For more information about the Geospatial Studio SDK and all the functions available through it, see the SDK docs page: Geospatial Studio SDK Docs
%load_ext autoreload
%autoreload 2
# import the required packages
from geostudio import Client
from geostudio import gswidgets
Connecting to the platform¶
First, we set up the connection to the platform backend. To do this we need the base URL for the Studio UI and an API key.
To get an API Key:
- Go to the Geospatial Studio UI page and navigate to the Manage your API keys link.
- This should pop up a window where you can generate, view, and delete your API keys. NB: every user is limited to a maximum of two active API keys at any one time.
Store the API key and the Geostudio UI base URL in a local credentials file, for example in /User/bob/.geostudio_config_file. You can do this with:
echo "GEOSTUDIO_API_KEY=<paste_api_key_here>" > .geostudio_config_file
echo "BASE_STUDIO_UI_URL=<paste_ui_base_url_here>" >> .geostudio_config_file
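Equivalently, you can write the credentials file from Python (the values below are placeholders; substitute your real API key and UI base URL):

```python
from pathlib import Path

# Placeholders only -- paste your real API key and UI base URL here.
Path(".geostudio_config_file").write_text(
    "GEOSTUDIO_API_KEY=<paste_api_key_here>\n"
    "BASE_STUDIO_UI_URL=<paste_ui_base_url_here>\n"
)
```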
Copy and paste the file path to this credentials file into the call below.
#############################################################
# Initialize Geostudio client using a geostudio config file
#############################################################
gfm_client = Client(geostudio_config_file=".geostudio_config_file")
Preparing layers¶
Now that we are connected to the Geospatial Studio, we are ready to start preparing layers to be onboarded. To add a layer to the Studio, a presigned link is used: you upload the files to that link outside of the Studio.
Prerequisites¶
We support onboarding of:
- Raster Data
- GeoTIFF
- NetCDF
- Vector Data
- Shapefile
- GeoPackage
Below are the requirements for each file type.
GeoTIFF¶
For GeoTIFFs, especially where you have the same spatial domain over different temporal domains, each individual .tif file must have its date in the file name in YYYY-MM-DD format, as in this example: somename_2024-04-27_whatever.tif. All the .tif files must then be zipped into a single .zip file.
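As a sketch of this preparation step (the function and file names here are illustrative, not part of the SDK), you could check the date convention and bundle the files like this:

```python
import re
import zipfile
from pathlib import Path

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")  # the expected YYYY-MM-DD token

def bundle_geotiffs(tif_dir, out_zip):
    """Zip every .tif in tif_dir, checking each file name contains a date."""
    tifs = sorted(Path(tif_dir).glob("*.tif"))
    missing = [t.name for t in tifs if not DATE_RE.search(t.name)]
    if missing:
        raise ValueError(f"File names missing a YYYY-MM-DD date: {missing}")
    with zipfile.ZipFile(out_zip, "w") as zf:
        for t in tifs:
            zf.write(t, arcname=t.name)
    return [t.name for t in tifs]
```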
NetCDF¶
Onboard a single NetCDF file with the extension .nc
Shapefile¶
For shapefiles, zip all the mandatory component files of the shapefile into a single .zip file. Include only one shapefile per zip.
GeoPackage¶
Onboard a single GeoPackage file with the extension .gpkg
Pushing the layers to a link¶
If your layers are already in an environment such as S3 or Box, you can simply generate presigned links to them and skip this section. However, ensure the links include the extension of the file being onboarded, e.g. .zip (for both GeoTIFF and shapefile), .nc, or .gpkg.
If you do not have an environment to upload your files to, you can leverage the temporary COS storage that we provide. Generate the upload and download links using the code below for each layer you want to add.
# Unique object name to be used in temporary COS for each layer you want to upload
object_name = "my-test-layer.zip"
gfm_client.get_fileshare_links(object_name)
Use the upload_url to upload each file you want to add, for example with curl:
curl -X PUT -T <your_file.zip|your_file.gpkg|your_file.nc> "<upload_url>"
Use the corresponding download_url in the section below.
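If you prefer to stay in Python, the same PUT can be done with the standard library. This is a minimal sketch under the assumption that the presigned URL accepts a plain HTTP PUT (as the curl call above implies); the function name is ours, not part of the SDK:

```python
import urllib.request

def upload_to_presigned_url(file_path, upload_url):
    """PUT a local file to a presigned upload URL (the curl command, in Python)."""
    with open(file_path, "rb") as f:
        req = urllib.request.Request(upload_url, data=f.read(), method="PUT")
    with urllib.request.urlopen(req) as resp:
        return resp.status  # a 2xx status indicates the upload succeeded

# Hypothetical usage -- substitute the upload_url returned above:
# upload_to_presigned_url("my-test-layer.zip", "<upload_url>")
```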
Defining the Add Layer query¶
Now that our layers are ready, we can set up an add-layer run. To add a layer, you need to define the spatial domain (the URLs) and the temporal domain for the layers to be added. This is done with a JSON payload sent to the Studio.
Below is a template payload with comments on the different fields that need to be defined.
request_payload = {
    "fine_tuning_id": "sandbox",  # DO NOT CHANGE
    "spatial_domain": {
        "urls": [
            "https://download_url.zip"  # PRESIGNED URL WITH IMAGES TO BE ONBOARDED
        ]
    },
    "temporal_domain": [],
    "geoserver_push": [
        {
            "workspace": "geofm",  # DO NOT CHANGE
            "layer_name": "layer_name",  # LAYER NAME USED WHEN PUSHING TO GEOSERVER. USE LOWER-CASE ALPHANUMERIC CHARACTERS JOINED WITH UNDERSCORES, WITHOUT ANY OTHER SPECIAL CHARACTERS
            "display_name": "My Layer",  # THE NAME TO APPEAR IN THE UI
            "filepath_key": "original_input_image",  # DO NOT CHANGE
            "file_suffix": "",  # DO NOT CHANGE
            "z_index": 1,  # Z INDEX OF THE CURRENT LAYER ON THE UI MAP (HIGHEST NUMBER APPEARS ON TOP OF ALL OTHER LAYERS)
            "visible_by_default": "True",  # WHETHER THE LAYER IS SHOWN BY DEFAULT WHEN LOADED IN THE UI
            "coverage_name": "",  # ONLY USED FOR NETCDF, TO POINT TO A PROPERTY OF INTEREST IN THE NETCDF
            "geoserver_style": {
                "regression": [  # RASTER REGRESSION STYLING
                    {
                        "opacity": 1,
                        "quantity": "0",
                        "color": "#000dff",
                        "label": "Min"
                    },
                    {
                        "opacity": 1,
                        "quantity": "300",
                        "color": "#ff00d9",
                        "label": "Max"
                    }
                ],
                "segmentation": [  # RASTER SEGMENTATION STYLING
                    {
                        "opacity": 0,
                        "quantity": "0",
                        "color": "#000dff",
                        "label": "No flood"
                    },
                    {
                        "opacity": 1,
                        "quantity": "1",
                        "color": "#ff00d9",
                        "label": "Flood"
                    }
                ],
                "rgb": [  # RASTER RGB STYLING
                    {
                        "minValue": 0,
                        "maxValue": 255,
                        "channel": 1,
                        "label": "RedChannel"
                    },
                    {
                        "minValue": 0,
                        "maxValue": 255,
                        "channel": 2,
                        "label": "GreenChannel"
                    },
                    {
                        "minValue": 0,
                        "maxValue": 255,
                        "channel": 3,
                        "label": "BlueChannel"
                    }
                ],
                "polygon_style": {  # VECTOR POLYGON STYLING EXAMPLE
                    "fill": "#fffcfd",
                    "fill_opacity": 0.5
                },
                "point_style": {  # VECTOR POINT STYLING EXAMPLE
                    "well_known_name": "circle",
                    "fill": "#b3b3b3",
                    "stroke": "#253c99",
                    "stroke_width": 1,
                    "size": 6
                },
                "line_style": {  # VECTOR LINE STYLING EXAMPLE
                    "stroke": "#b0d6ff",
                    "stroke_width": 1
                }
            }
        }
    ],
    "model_display_name": "add-layer-sandbox-model",  # DO NOT CHANGE (IF YOU GET A 404 ERROR, TRY "add-layer-sandbox-models")
    "description": "Descriptions for my layers",  # TEXTUAL DESCRIPTION OF THE ONBOARDED LAYERS
    "location": "Layers location",  # LOCATION NAME FOR THE LAYERS
    "demo": {
        "demo": True,  # DO NOT CHANGE
        "section_name": "My Examples"  # DO NOT CHANGE
    }
}
You can find example JSON payloads for GeoTIFF, NetCDF, and Vector to guide you.
You can then submit the request using:
response = gfm_client.submit_inference(
data=request_payload,
output="json"
)
The SDK also includes some widgets that can help you browse the available models, define bounding boxes, and more.
With the payload defined above, we send the request to the cluster.
response = gfm_client.submit_inference(data=request_payload)
response
Monitor tuning status and progress¶
After submitting the request, we can poll the inference service to check the progress and get the output details once it's complete (this could take a few minutes depending on the request size and the current service load).
# Poll inference status
gfm_client.poll_inference_until_finished(inference_id=response['id'])
gfm_client.get_inference(inference_id=response['id'])
Accessing inference outputs¶
Once an inference run has completed, the inputs and outputs of each task within the inference are packaged into a zip file, which is uploaded to a URL you can use to download the files.
To access the inference task files:
- Get the inference tasks list
- Identify the specific inference task you want to view
- Download task output files
# Get the inference tasks list
inf_tasks_res = gfm_client.get_inference_tasks(response["id"])
inf_tasks_res
df = gfm_client.inference_task_status_df(response["id"])
display(df.style.map(gswidgets.color_inference_tasks_by_status))
gswidgets.view_inference_process_timeline(gfm_client, inference_id = response["id"])
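Step 3 above (downloading the task output files) can be sketched as below. The exact field holding a task's presigned download URL depends on the deployment, so inspect inf_tasks_res for the real key name; the "download_url" key in the usage comment is an assumption, and the helper function is ours, not part of the SDK:

```python
import io
import urllib.request
import zipfile

def download_task_outputs(download_url, dest_dir="task_outputs"):
    """Fetch a task's zipped outputs from its presigned URL and extract them."""
    with urllib.request.urlopen(download_url) as resp:
        data = resp.read()
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()

# Hypothetical usage -- check inf_tasks_res for the actual URL field name:
# download_task_outputs(inf_tasks_res[0]["download_url"])
```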