001-Introduction-to-Inferencing¶
📥 Download 001-Introduction-to-Inferencing.ipynb and try it out
Introduction¶
This notebook is intended for readers with minimal geospatial background, and walks through the most important functions of the Geospatial Studio SDK needed to run a model.
For more information about the Geospatial Studio see the docs page: Geospatial Studio Docs
For more information about the Geospatial Studio SDK and all the functions available through it, see the SDK docs page: Geospatial Studio SDK Docs
# Install extra requirements
! pip install boto3
%load_ext autoreload
%autoreload 2
# import the required packages
import json
import uuid
import pandas as pd
import wget
import rasterio
import matplotlib.pyplot as plt
from IPython.display import display, HTML
# import seaborn as sns
import getpass # For use in Colab as well
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
from geostudio import Client
from geostudio import gswidgets
Connecting to the platform¶
First, we set up the connection to the platform backend. To do this we need the base URL for the Studio UI and an API key.
To get an API Key:
- Go to the Geospatial Studio UI page and navigate to the Manage your API keys link.
- This should pop up a window where you can generate, view and delete your API keys. NB: every user is limited to a maximum of two active API keys at any one time.
Store the API key and the Geostudio UI base URL in a local credentials file, for example /User/bob/.geostudio_config_file. You can do this by:
echo "GEOSTUDIO_API_KEY=<paste_api_key_here>" > .geostudio_config_file
echo "BASE_STUDIO_UI_URL=<paste_ui_base_url_here>" >> .geostudio_config_file
Copy and paste the path to this credentials file into the call below.
#############################################################
# Initialize Geostudio client using a geostudio config file
#############################################################
gfm_client = Client(geostudio_config_file=".geostudio_config_file")
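If you are working somewhere like Colab and prefer not to edit the credentials file by hand, you can also write it from the notebook itself. The cell below is a minimal sketch using the getpass import from the setup cell so the API key is not echoed to the screen; it writes the same two keys shown above and then initializes the client exactly as before.
#############################################################
# [Optional] Build the config file interactively (e.g. in Colab)
# A minimal sketch: prompt for the values, write them to the
# config file, then initialize the client as before.
#############################################################
api_key = getpass.getpass("Paste your Geostudio API key: ")   # hidden input
base_url = input("Paste the Studio UI base URL: ")
with open(".geostudio_config_file", "w") as f:
    f.write(f"GEOSTUDIO_API_KEY={api_key}\n")
    f.write(f"BASE_STUDIO_UI_URL={base_url}\n")
gfm_client = Client(geostudio_config_file=".geostudio_config_file")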
Defining the Inference query¶
The Studio accepts inference requests defined either by a bounding box or by a downloadable link to the input files.
Using bounding box¶
Now that we have connected to the Geospatial Studio backend, we are ready to set up an inference run. To run inference you need to choose a model, and define the spatial and temporal domain over which to run it. This is done with a JSON payload sent to the inference gateway.
request_payload = {
"model_display_name": "prithvi-eo-flood-blair",
"description": "Jarani, Nagaon, Nagaon, Assam, India",
"location": "Jarani, Nagaon, Nagaon, Assam, India",
"spatial_domain": {
"bbox": [[92.40665153547121, 26.1051042015407,92.92535070071905,26.498933088370826]],
"polygons": [],
"tiles": [],
"urls": []
},
"temporal_domain": ["2024-07-25_2024-07-27"]
}
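The bbox values in these examples follow the [min_lon, min_lat, max_lon, max_lat] ordering; assuming your payload uses the same convention, a quick sanity check like the hypothetical helper below can catch swapped or out-of-range coordinates before you submit.
# Hypothetical helper: sanity-check a bbox assumed to be [min_lon, min_lat, max_lon, max_lat]
def check_bbox(bbox):
    min_lon, min_lat, max_lon, max_lat = bbox
    assert -180 <= min_lon <= max_lon <= 180, "longitudes out of order or out of range"
    assert -90 <= min_lat <= max_lat <= 90, "latitudes out of order or out of range"
    return bbox

for box in request_payload["spatial_domain"]["bbox"]:
    check_bbox(box)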
You can then submit the request using:
response = gfm_client.submit_inference(
data=request_payload,
output="json"
)
The SDK also includes some widgets which can help you browse the available models, define bounding boxes, etc.
# To list the available models deployed on the inference service you are connected to, you can use the following function:
# [Optional] - use output="json" to view full details about the models
models = gfm_client.list_models(output="df")
models
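If the list is long, the returned DataFrame can be filtered with ordinary pandas operations. The snippet below searches every column as text rather than assuming specific column names, since the exact schema of the DataFrame is not shown here.
# Filter the models DataFrame for rows that mention "flood" in any column
# (no specific column names are assumed; every column is searched as text)
mask = models.apply(lambda row: row.astype(str).str.contains("flood", case=False).any(), axis=1)
models[mask]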
# If you want help choosing a bounding box, you can use the widget below and copy and paste the resulting bbox (i.e. [....]) into the payload further down.
gswidgets.bboxSelector()
Now we put that information into the payload below and send the request to the cluster.
bbox = [-51.33225, -30.08903, -51.19011, -29.97489]
# Choose a model by copying and pasting its display name here
request_payload = {
"model_display_name": "prithvi-eo-flood",
"description": "Porto Alegre, Brazil SDK flooding demo",
"location": "Porto Alegre, Brazil",
"spatial_domain": {
"bbox": [bbox],
"polygons": [],
"tiles": [],
"urls": []
},
"temporal_domain": [
"2024-05-06_2024-05-07"
]
}
response = gfm_client.submit_inference(data=request_payload)
response
Using an S3 pre-signed link¶
Use this approach if you have your image locally and would like to upload it via an S3 pre-signed URL.
Personal buckets¶
Use the create_upload_presigned_url function to generate an upload link that you can use to upload the file to the bucket.
This function assumes you have your own storage bucket to upload to.
upload_url = gfm_client.create_upload_presigned_url(
    bucket_name="bucket_name",  # bucket name
    object_key="data/train/austin1_sdk_upload.tiff",  # object path within the bucket
    endpoint_url="https://s3.us-east.cloud-object-storage.appdomain.cloud",  # S3 endpoint URL
    service_name="s3",  # service to use
    region_name="us-east",  # cloud region
    expiration=3600  # expiration time for the link
    # Add any other args to pass to the S3 client
)
upload_url
# Push your file to the bucket using the URL generated above (replace your_file.tif with the path to your .zip/.tif/.tiff file).
!curl -X PUT -T your_file.tif "{upload_url}"
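If you prefer to stay in Python rather than shelling out to curl, the same upload can be done with the requests library. This is just an equivalent sketch; the local file path is a placeholder you should replace.
# Alternative: upload with the requests library instead of curl
import requests

local_file = "your_file.tiff"  # placeholder: path to your .zip/.tif/.tiff file
with open(local_file, "rb") as f:
    r = requests.put(upload_url, data=f)
r.raise_for_status()
print("Upload status:", r.status_code)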
Once the image is uploaded to your S3 bucket, create a download link to use in the inference request.
download_url = gfm_client.create_download_presigned_url(
    bucket_name="geospatial-studio-example-data",  # bucket name
    object_key="data/train/austin1_sdk_upload.tiff",  # object path within the bucket
    endpoint_url="https://s3.us-east.cloud-object-storage.appdomain.cloud",  # S3 endpoint URL
    service_name="s3",  # service to use
    region_name="us-east",  # cloud region
    expiration=7200  # expiration time for the link
    # Add any other args to pass to the S3 client
)
download_url
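Before submitting the inference request, it can be worth confirming that the pre-signed download link resolves. The sketch below uses the already-imported urllib3 to fetch only the first byte; note that pre-signed URLs are signed for a specific HTTP method, so a GET (not HEAD) is used here, and the expected status is an assumption you can verify against your bucket.
# Quick reachability check on the pre-signed download URL (fetches only the first byte)
http = urllib3.PoolManager()
check = http.request("GET", download_url, headers={"Range": "bytes=0-0"})
print("HTTP status:", check.status)  # 200 or 206 typically indicates the link is valid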
Geostudio temporary buckets¶
If you would like to upload to a Geostudio temporary bucket, use the get_fileshare_links function.
# Unique object name to be used in the temporary COS bucket for each layer you want to upload
object_name = "austin1_sdk_upload.tiff"
fileshare_links = gfm_client.get_fileshare_links(object_name)
fileshare_links
# Push your file to the bucket using the upload URL returned above (replace the placeholders with your file path and the URL).
!curl -X PUT -T your_file.tif "<paste_upload_url_here>"
Submit Inference¶
Now you can create the inference payload using the download link.
# grab the download url for use in inference.
download_url_tiff = download_url
# Choose a model by copying and pasting its display name here
request_payload_with_url = {
"model_display_name": "prithvi-eo-flood",
"description": "Your inference description",
"location": "Your tiff location",
"spatial_domain": {
"bbox": [],
"polygons": [],
"tiles": [],
"urls": [download_url_tiff]
},
"temporal_domain": [
"2024-05-06_2024-05-07"
]
}
response = gfm_client.submit_inference(data=request_payload_with_url)
response
Monitor inference status and progress¶
After submitting the request, we can poll the inference service to check the progress, as well as get the output details once it is complete (this could take a few minutes depending on the request size and the current service load).
# Poll inference status
gfm_client.poll_inference_until_finished(inference_id=response['id'])
gfm_client.get_inference(inference_id=response['id'])
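poll_inference_until_finished blocks until the run completes; if you want control over the polling interval, you can loop over get_inference yourself. The sketch below assumes the returned record is a dict exposing a status field with terminal values such as FINISHED or FAILED; confirm the exact field name and values against the actual get_inference response.
# Manual polling sketch (the "status" field name and its values are assumptions;
# check the real get_inference response before relying on this)
import time

while True:
    inf = gfm_client.get_inference(inference_id=response["id"])
    status = inf.get("status", "UNKNOWN")
    print("Current status:", status)
    if status in ("FINISHED", "FAILED"):
        break
    time.sleep(30)  # wait 30 seconds between polls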
Accessing inference outputs¶
Once an inference run is completed, the inputs and outputs of each task within the inference are packaged into a zip file and uploaded to a URL you can use to download the files.
To access the inference task files:
- Get the inference tasks list
- Identify the specific inference task you want to view
- Download task output files
# Get the inference tasks list
inf_tasks_res = gfm_client.get_inference_tasks(response["id"])
inf_tasks_res
Next, identify the task you want to view from the response above, ensure the status of the task is FINISHED, and set the selected_task variable below to the task number at the end of the task ID string. For example, if the task_id is "6d1149fa-302d-4612-82dd-5879fc06081d-task_0", selected_task would be 0.
# Select a task to view
selected_task = 0
selected_task_id = f"{inf_tasks_res['inference_id']}-task_{selected_task}"
# Download task output files
gswidgets.fileDownloaderTasks(client=gfm_client, task_id=selected_task_id)
Visualizing the output of the inference runs¶
You can check out the results visually in the Studio UI, or with the quick widget below. Alternatively, you can use the SDK to download selected files for further analysis; see the documentation.
We have several options for visualising the data:
- load the data with a package like rasterio and plot the images, and/or access the values;
- use the widget from the SDK to visualise the chosen files for an inference run (shown below);
- view the data in the Geospatial Studio Inference Lab UI;
- load the files in external software, such as QGIS.
Load the data with rasterio and plot the images, and/or access the values¶
# Paste the name (+path) of one of the files you downloaded and select the band you want to load and plot
filename = '6d1149fa-302d-4612-82dd-5879fc06081d-task_0_HLS_L30_2024-08-12_imputed__merged_pred_masked.tif'
band_number = 1
# Open the file and read the band and metadata with rasterio
with rasterio.open(filename) as fp:
    data = fp.read(band_number)
    bounds = fp.bounds
print("Image dimensions: " + str(data.shape))
plt.imshow(data, extent=[bounds.left, bounds.right, bounds.bottom, bounds.top])
plt.xlabel('Longitude'); plt.ylabel('Latitude')
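Beyond plotting, you can inspect the band values directly; for a prediction mask it is often useful to count how many pixels fall into each class. The sketch below assumes the band holds a small set of integer class labels (possibly plus a nodata value), which the printed output will confirm.
# Count pixel values in the predicted mask (assumes integer class labels)
import numpy as np

values, counts = np.unique(data, return_counts=True)
for v, c in zip(values, counts):
    print(f"value {v}: {c} pixels ({100 * c / data.size:.2f}% of the image)")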
Visualize through the SDK widgets¶
# Visualize output files with the SDK
gswidgets.inferenceTaskViewer(client=gfm_client, task_id=selected_task_id)
List past inference runs¶
All past inference runs from a user are stored in the Studio database, and the user can access this historical record and retrieve past output data. This can be done using the simple SDK function:
gfm_client.list_inferences()
or similarly using the corresponding SDK widget from gswidgets (see the SDK docs).