TerraKit Data Connectors
Data connectors are classes that let a user search for and query data from a particular data source through a common set of functions. The TerraKit Pipeline uses the data connectors, but they can also be used independently to explore and retrieve Earth observation (EO) data.
Each data connector has the following mandatory methods:
- list_collections()
- find_data()
- get_data()
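The shared interface can be pictured as an abstract base class. The sketch below is illustrative only: the class names `BaseConnector` and `InMemoryConnector`, the parameter names, and the return types are assumptions made for this example and are not TerraKit's actual signatures. Only the three method names come from the list above.

```python
from abc import ABC, abstractmethod


class BaseConnector(ABC):
    """Hypothetical base class; only the three method names come from the docs."""

    @abstractmethod
    def list_collections(self) -> list[str]:
        """Return the names of the collections this connector can serve."""

    @abstractmethod
    def find_data(self, collection: str, bbox: list[float]) -> list[dict]:
        """Search a collection and return metadata for matching items."""

    @abstractmethod
    def get_data(self, collection: str, bbox: list[float]) -> list[dict]:
        """Retrieve the data itself for the matching items."""


class InMemoryConnector(BaseConnector):
    """Toy connector backed by a dict, used only to exercise the interface."""

    def __init__(self, items: dict[str, list[dict]]):
        self._items = items

    def list_collections(self) -> list[str]:
        return sorted(self._items)

    def find_data(self, collection, bbox):
        west, south, east, north = bbox
        return [
            item for item in self._items[collection]
            if west <= item["lon"] <= east and south <= item["lat"] <= north
        ]

    def get_data(self, collection, bbox):
        # A real connector would download data here; the toy version
        # simply returns the matching records.
        return self.find_data(collection, bbox)


connector = InMemoryConnector({"demo": [{"id": "a", "lon": 0.0, "lat": 51.5}]})
print(connector.list_collections())
print(connector.find_data("demo", [-0.5, 51.3, 0.3, 51.7]))
```

Because every connector exposes the same three methods, calling code of this shape would not need to change when the underlying data source does.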
Bounding Box Constraints
All TerraKit data connectors adhere to standard geographic bounding box constraints:
Bounding boxes must be specified in the format: bbox = [West, South, East, North] = [min_lon, min_lat, max_lon, max_lat]
The following constraints are enforced:
- Longitude (West/East): -180 <= west < east <= 180
- Latitude (South/North): -90 <= south < north <= 90
These constraints ensure:
- Valid geographic coordinates within Earth's coordinate system
- Proper ordering (minimum < maximum for both longitude and latitude)
- Consistency across all data connectors regardless of the underlying data source
Examples of valid and invalid bounding boxes:
# Valid: London area
bbox = [-0.5, 51.3, 0.3, 51.7] # [West, South, East, North]
# Valid: Global extent
bbox = [-180, -90, 180, 90]
# Invalid: West >= East
bbox = [0.3, 51.3, -0.5, 51.7] # ❌ West (0.3) must be < East (-0.5)
# Invalid: Longitude out of range
bbox = [-200, 51.3, 0.3, 51.7] # ❌ West (-200) outside valid range [-180, 180]
# Invalid: South >= North
bbox = [-0.5, 51.7, 0.3, 51.3] # ❌ South (51.7) must be < North (51.3)
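The constraints can be checked before a request is sent. The function below is a minimal sketch of such a check; the name `validate_bbox` is hypothetical and this is not the validation code TerraKit itself uses.

```python
def validate_bbox(bbox):
    """Return bbox as a list if it is a valid [west, south, east, north].

    Raises ValueError when a constraint from the section above is violated.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must contain exactly four values")
    west, south, east, north = bbox
    if not (-180 <= west < east <= 180):
        raise ValueError(
            f"longitude must satisfy -180 <= west < east <= 180, got {west}, {east}"
        )
    if not (-90 <= south < north <= 90):
        raise ValueError(
            f"latitude must satisfy -90 <= south < north <= 90, got {south}, {north}"
        )
    return [west, south, east, north]


# Check one valid and one invalid box from the examples above.
for candidate in ([-0.5, 51.3, 0.3, 51.7], [0.3, 51.3, -0.5, 51.7]):
    try:
        validate_bbox(candidate)
        print(candidate, "is valid")
    except ValueError as err:
        print(candidate, "is invalid:", err)
```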
Note: For regions crossing the antimeridian (180°/-180° longitude), split the query into two separate bounding boxes or use data connector-specific handling if available.
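One way to perform the split is sketched below. It assumes the region is described with west greater than east (for example, 177 to -178 around Fiji) and returns boxes that each satisfy the constraints above. The helper name `split_antimeridian` is hypothetical and not part of TerraKit.

```python
def split_antimeridian(west, south, east, north):
    """Split a region into boxes that do not cross the antimeridian.

    A region with west < east is returned unchanged. A region with
    west > east is treated as crossing 180°/-180° and is split in two.
    Zero-width pieces (west == 180 or east == -180) are dropped.
    """
    if west == east:
        raise ValueError("west and east must differ")
    if west < east:
        return [[west, south, east, north]]
    boxes = []
    if west < 180:
        boxes.append([west, south, 180.0, north])  # piece east of `west`
    if east > -180:
        boxes.append([-180.0, south, east, north])  # piece west of `east`
    return boxes


# A region around Fiji, from 177°E across the antimeridian to 178°W.
print(split_antimeridian(177, -20, -178, -15))
```

Each returned box can then be queried separately and the results merged.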
Available data connectors and collections
The following data connectors and associated collections are available:
| Connectors | Collections |
|---|---|
| sentinelhub | s2_l1c, dem, s1_grd, hls_l30, s2_l2a, hls_s30 |
| nasa_earthdata | HLSL30_2.0, HLSS30_2.0 |
| sentinel_aws | sentinel-2-l2a |
| climate_data_store | derived-era5-single-levels-daily-statistics, projections-cordex-domains-single-levels |
| IBMResearchSTAC | HLSS30, esa-sentinel-2A-msil1c, HLS_S30, atmospheric-weather-era5, deforestation-umd, Radar-10min, tasmax-rcp85-land-cpm-uk-2.2km, vector-osm-power, ukcp18-land-cpm-uk-2.2km, treecovermaps-eudr, ch4 + more |
| TheWeatherCompany | weathercompany-daily-forecast |
Data connector access
Each data connector has different access requirements. For example, to connect to SentinelHub and NASA EarthData, you need to obtain credentials from each provider. Once obtained, add them to a .env file in the root directory using the following syntax:
SH_CLIENT_ID="<SentinelHub Client ID>"
SH_CLIENT_SECRET="<SentinelHub Client Secret>"
NASA_EARTH_BEARER_TOKEN="<NASA EarthData Bearer Token>"
CDSAPI_KEY="<Climate Data Store API Key>"
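For illustration, the sketch below reads a file in this format using only the Python standard library and copies the values into the process environment. The helper names `read_env_file` and `load_env_file` are hypothetical; how TerraKit itself loads the file is not described here, and a dedicated package such as python-dotenv handles more edge cases than this parser does.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse KEY="value" lines into a dict, skipping blanks and comments.

    Simplified on purpose: no escape sequences, `export` prefixes,
    or inline comments are handled.
    """
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Copy parsed values into os.environ without overwriting existing ones."""
    for key, value in read_env_file(path).items():
        os.environ.setdefault(key, value)


# Only load when a .env file is present in the current directory.
if Path(".env").exists():
    load_env_file()
```

Using `setdefault` means variables already exported in the shell take precedence over the file.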
NASA Earthdata
To access NASA Earthdata, register for an Earthdata Login profile and request a bearer token: https://urs.earthdata.nasa.gov/profile
Sentinel Hub
To access Sentinel Hub, register for an account and request an OAuth client using the Sentinel Hub dashboard: https://www.planet.com
Sentinel AWS
Access to Sentinel AWS data is open and does not require any credentials.
Climate Data Store
Create an account at https://cds.climate.copernicus.eu/. Once created, find your API key under the Profile section. Some datasets also require you to accept a licence agreement. In that case, the first request returns an error containing the URL to visit to accept the terms.
Available collections include:
- ERA5 post-processed daily statistics on single levels from 1940 to present
- CORDEX regional climate model data on single levels
The Weather Company
To access The Weather Company, register for an account and request an API key at https://www.weathercompany.com/weather-data-apis/. Once you have an API key, set the following environment variable:
IBM Research STAC
Access to IBM Research STAC is currently restricted to IBMers and partners. If you're eligible, register for an IBM AppID account and set the following environment variables:
APPID_ISSUER=<issuer>
APPID_USERNAME=<user-email>
APPID_PASSWORD=<user-password>
CLIENT_ID=<client-id>
CLIENT_SECRET=<client-secret>
Please reach out to the maintainers of this repo.
IBMers don't need credentials to access the internal instance of the STAC service.
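Before creating a connector, it can help to check that the variables it needs are set. The sketch below collects the variable names from the access notes above; the grouping by connector name and the helper `missing_credentials` are assumptions made for this example. The Weather Company is omitted because its variable name is not given here.

```python
import os

# Variable names are taken from the access notes above; the grouping by
# connector name is an assumption made for this sketch.
REQUIRED_ENV = {
    "sentinelhub": ["SH_CLIENT_ID", "SH_CLIENT_SECRET"],
    "nasa_earthdata": ["NASA_EARTH_BEARER_TOKEN"],
    "sentinel_aws": [],  # open data, no credentials needed
    "climate_data_store": ["CDSAPI_KEY"],
    # Not needed by IBMers using the internal STAC instance.
    "IBMResearchSTAC": [
        "APPID_ISSUER", "APPID_USERNAME", "APPID_PASSWORD",
        "CLIENT_ID", "CLIENT_SECRET",
    ],
}


def missing_credentials(connector, env=None):
    """Return the required variables that are unset or empty for a connector."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[connector] if not env.get(name)]


print(missing_credentials("sentinel_aws", env={}))
print(missing_credentials("sentinelhub", env={"SH_CLIENT_ID": "abc"}))
```

Passing `env` explicitly keeps the check easy to test; leaving it out checks the real process environment.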
Try out
Data Connectors can be used outside the TerraKit Pipeline. Take a look at the TerraKit: Easy geospatial data search and query notebook for more help getting started with TerraKit Data Connectors.