Troubleshooting Guide

Common issues and solutions when working with IBM Geospatial Studio.

Deployment-Specific Commands

This guide provides commands for both Local (Lima VM) and Cluster (Kubernetes/OpenShift) deployments. Where a step differs by deployment type, the Local Deployment commands are listed first, followed by the Cluster Deployment commands.

Local Deployment: Uses Lima VM running Kubernetes on your laptop/workstation
Cluster Deployment: Uses production Kubernetes or OpenShift clusters

🚀 Deployment Issues

Services Fail to Start

Problem: Services fail to start or pods/containers exit immediately.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Check Lima VM status:
```
limactl list
limactl shell studio
```

Check pod status in Lima VM:

# Set kubeconfig
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"

# Check pods
kubectl get pods -n default

View pod logs:

kubectl logs <pod-name> -n default
kubectl describe pod <pod-name> -n default

Ensure sufficient resources:
Minimum 16GB RAM
100GB free disk space
Check Lima VM disk space: limactl shell studio df -h

Restart Lima VM if needed:

limactl stop studio
limactl start studio

# Re-export kubeconfig
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"

Cluster Deployment

Check pod status:

kubectl get pods -n <namespace>
# Or for OpenShift
oc get pods -n <namespace>

View pod logs:

kubectl logs <pod-name> -n <namespace>
# Or for OpenShift
oc logs <pod-name> -n <namespace>

# For all containers in a pod
kubectl logs <pod-name> -n <namespace> --all-containers=true

Describe pod for details:

kubectl describe pod <pod-name> -n <namespace>
# Or for OpenShift
oc describe pod <pod-name> -n <namespace>

Check resource quotas:

kubectl get resourcequota -n <namespace>
kubectl describe resourcequota -n <namespace>

Verify node resources:

kubectl top nodes
kubectl describe node <node-name>

Check for ImagePullBackOff errors:

# If pods are stuck pulling images
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# Verify image pull secret
kubectl get secret -n <namespace> | grep image-pull
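
When many pods are failing, it can help to summarize the waiting reasons in one pass. The sketch below is a hypothetical helper (not part of Geospatial Studio) that reads the JSON printed by `kubectl get pods -n <namespace> -o json` and lists containers stuck in a waiting state such as ImagePullBackOff or CrashLoopBackOff. It relies only on the standard Pod status fields; the pod names in the sample are made up for illustration.

```python
import json


def waiting_containers(pod_list):
    """Return (pod, container, reason) tuples for containers in a waiting state.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    findings = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "<unknown>")
        status = pod.get("status", {})
        # Init containers can block a pod too, so inspect both lists.
        statuses = (status.get("initContainerStatuses") or []) + (
            status.get("containerStatuses") or []
        )
        for cs in statuses:
            waiting = (cs.get("state") or {}).get("waiting")
            if waiting:
                findings.append((pod_name, cs.get("name"), waiting.get("reason")))
    return findings


# Illustrative input; in practice load it from a file created with:
#   kubectl get pods -n <namespace> -o json > pods.json
sample = json.loads("""
{"items": [
  {"metadata": {"name": "geofm-ui-abc"},
   "status": {"containerStatuses": [
     {"name": "ui", "state": {"waiting": {"reason": "ImagePullBackOff"}}}]}},
  {"metadata": {"name": "minio-0"},
   "status": {"containerStatuses": [
     {"name": "minio", "state": {"running": {}}}]}}
]}
""")

for pod_name, container, reason in waiting_containers(sample):
    print(f"{pod_name}/{container}: {reason}")
```

Any pod reported here is a good candidate for `kubectl describe pod` and `kubectl logs`.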

Port Forwarding Issues

Problem: Port forwarding fails or disconnects frequently.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Check if port forwarding is active:

# List all port-forward processes
ps aux | grep "port-forward"

# Check studio-pf.log for errors
tail -f studio-pf.log

Restart port forwarding:

# Kill existing port-forwards
pkill -f "kubectl port-forward"

# Set kubeconfig and namespace
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
export OC_PROJECT=default

# Restart all port-forwards
kubectl port-forward -n $OC_PROJECT svc/keycloak 8080:8080 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT svc/postgresql 54320:5432 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT svc/geofm-geoserver 3000:3000 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT deployment/geofm-ui 4180:4180 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT deployment/geofm-gateway 4181:4180 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT deployment/geofm-mlflow 5000:5000 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT svc/minio 9001:9001 >> studio-pf.log 2>&1 &
kubectl port-forward -n $OC_PROJECT svc/minio 9000:9000 >> studio-pf.log 2>&1 &
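
To see at a glance which forwards are actually up, a small script can probe the local ports. This is a minimal sketch, not part of the Studio tooling; the port numbers are copied from the port-forward commands above and the names are only labels.

```python
import socket

# Local ports used by the port-forward commands in this guide.
FORWARDED_PORTS = [
    ("keycloak", 8080),
    ("postgresql", 54320),
    ("geoserver", 3000),
    ("ui", 4180),
    ("gateway", 4181),
    ("mlflow", 5000),
    ("minio", 9000),
    ("minio", 9001),
]


def is_port_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in FORWARDED_PORTS:
        state = "listening" if is_port_open(port) else "NOT listening"
        print(f"{name:12s} localhost:{port:<6d} {state}")
```

A port reported as not listening usually means the matching `kubectl port-forward` process has exited; check studio-pf.log and restart it.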

Check Lima VM network:

# Test connectivity to Lima VM
limactl shell studio

# Inside VM, check services
kubectl get svc -n default

Cluster Deployment

Check if port forwarding is active:

# List all port-forward processes
ps aux | grep "port-forward"

# Check studio-pf.log for errors
tail -f studio-pf.log

Restart port forwarding:

# Kill existing port-forwards
pkill -f "port-forward"

# Restart required port-forwards
kubectl port-forward -n <namespace> svc/minio 9000:9000 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> svc/minio 9001:9001 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> svc/postgresql 54320:5432 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> svc/keycloak 8080:8080 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> svc/geofm-geoserver 3000:3000 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> deployment/geofm-ui 4180:4180 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> deployment/geofm-gateway 4181:4180 >> studio-pf.log 2>&1 &
kubectl port-forward -n <namespace> deployment/geofm-mlflow 5000:5000 >> studio-pf.log 2>&1 &

Use kubectl proxy as alternative:

kubectl proxy --port=8001
# Access services via proxy

Permission Denied Errors

Problem: Permission errors when running scripts or accessing files.

Solutions:

Local Deployment | Cluster Deployment

# Make scripts executable
chmod +x deploy_studio_k8s.sh
chmod +x deploy_studio_ocp.sh
chmod +x deploy_studio_lima.sh
chmod +x deployment-scripts/*.sh

# Fix ownership issues (uses your primary group, which also works on macOS)
sudo chown -R "$(id -un):$(id -gn)" .

Cluster Deployment

Check service account permissions:

kubectl get serviceaccount -n <namespace>
kubectl describe serviceaccount default -n <namespace>

Check RBAC permissions:

kubectl get rolebinding -n <namespace>
kubectl describe rolebinding <binding-name> -n <namespace>

# Check cluster-wide permissions
kubectl get clusterrolebinding | grep <namespace>

For OpenShift, check Security Context Constraints (SCC):

oc get scc
oc describe scc anyuid

# Add SCC to service account if needed (requires admin)
oc adm policy add-scc-to-user anyuid -n <namespace> -z default

Check pod security policies (only on clusters older than Kubernetes v1.25, where PodSecurityPolicy was removed):

kubectl get psp
kubectl describe psp <policy-name>

Configuration Not Loading

Problem: Services can't find configuration or environment variables.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Verify workspace env files exist:

# For Lima deployment, check lima workspace
ls -la workspace/lima/env/.env
ls -la workspace/lima/env/env.sh

Check environment variable format:

# Correct format (no spaces around =)
GEOSTUDIO_API_KEY=your-key-here

# Incorrect format
GEOSTUDIO_API_KEY = your-key-here
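
The format rule above can be checked mechanically before deploying. The following is a minimal sketch under stated assumptions: it treats every non-comment line as a `KEY=value` assignment (optionally prefixed with `export`) and flags whitespace around `=`. It is a hypothetical helper and does not reproduce what `validate-env-files.py` actually checks.

```python
import re

# KEY=value with no whitespace around "=", optionally prefixed with "export ".
_ASSIGNMENT = re.compile(r"^(?:export\s+)?[A-Za-z_][A-Za-z0-9_]*=(?!\s).*$")


def malformed_lines(text):
    """Return (line_number, line) pairs that are not well-formed assignments."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are fine
        if not _ASSIGNMENT.match(stripped):
            problems.append((number, line))
    return problems


sample = """\
# Studio settings
GEOSTUDIO_API_KEY=your-key-here
GEOSTUDIO_API_KEY = your-key-here
"""

for number, line in malformed_lines(sample):
    print(f"line {number}: {line!r}")
```

To check a real file, pass its contents, for example `malformed_lines(open("workspace/lima/env/.env").read())`.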

Validate environment variables:

# Use the validation script
python deployment-scripts/validate-env-files.py \
  --env-file workspace/lima/env/.env \
  --env-sh-file workspace/lima/env/env.sh

Source environment files:
```
source workspace/lima/env/env.sh
```

Verify kubeconfig is set:

echo $KUBECONFIG
# Should be: /Users/<username>/.lima/studio/copied-from-guest/kubeconfig.yaml on macOS
# (typically /home/<username>/.lima/studio/copied-from-guest/kubeconfig.yaml on Linux)

# If not set:
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"

Cluster Deployment

Check ConfigMaps:

kubectl get configmap -n <namespace>
kubectl describe configmap <configmap-name> -n <namespace>

# View ConfigMap content
kubectl get configmap <configmap-name> -n <namespace> -o yaml

Check Secrets:

kubectl get secrets -n <namespace>
kubectl describe secret <secret-name> -n <namespace>

# Decode a secret value (replace <key> with a key listed under .data)
kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.<key>}' | base64 -d
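
When a Secret holds several keys, decoding them one at a time gets tedious. As a rough sketch, the hypothetical helper below decodes every entry in the `data` field of a Secret manifest obtained with `kubectl get secret <secret-name> -n <namespace> -o json`. The sample values are invented for illustration. Note that this prints secrets in clear text, so avoid running it in shared terminals or pasting its output into issues.

```python
import base64
import json


def decode_secret(secret):
    """Decode every base64 value in the `data` field of a Secret manifest."""
    return {
        key: base64.b64decode(value).decode("utf-8", errors="replace")
        for key, value in (secret.get("data") or {}).items()
    }


# Illustrative manifest; in practice use json.load() on the kubectl output.
sample = json.loads('{"data": {"username": "YWRtaW4=", "password": "czNjcjN0"}}')

for key, value in decode_secret(sample).items():
    print(f"{key}={value}")
```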

Verify environment variables in pod:

kubectl exec <pod-name> -n <namespace> -- env

# Check specific variable
kubectl exec <pod-name> -n <namespace> -- env | grep STUDIO

Update ConfigMap and restart pods:

kubectl edit configmap <configmap-name> -n <namespace>
kubectl rollout restart deployment/<deployment-name> -n <namespace>

Storage Issues

Problem: PVC not binding or storage errors.

Solutions:

Cluster Deployment

Check PVC status:

kubectl get pvc -n <namespace>
kubectl describe pvc <pvc-name> -n <namespace>

Check storage classes:

kubectl get storageclass
kubectl describe storageclass <storage-class-name>

# Verify COS storage class (for MinIO/S3)
kubectl get storageclass cos-s3-csi-s3fs-sc

Check PV availability:

kubectl get pv
kubectl describe pv <pv-name>

Verify IBM Object CSI Driver (for S3 storage):

kubectl get pods -n kube-system -l app=cos-s3-csi-controller
kubectl get pods -n kube-system -l app=cos-s3-csi-driver

# Check driver logs
kubectl logs -n kube-system -l app=cos-s3-csi-controller

Check node labels (for local storage):

kubectl get nodes --show-labels

# Add topology labels if they are missing (region and zone values are examples)
kubectl label nodes <node-name> topology.kubernetes.io/region=us-east-1
kubectl label nodes <node-name> topology.kubernetes.io/zone=us-east-1a

🔐 Authentication Issues

Cannot Generate API Key

Problem: API key generation fails in UI.

Solutions:

Check if you have existing keys:
Maximum 2 active keys per user
Delete old keys before creating new ones
Verify authentication:
Log out and log back in
Clear browser cache and cookies
Check backend logs:

Local Deployment (Lima VM)

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl logs -l app=geofm-gateway -n default

Cluster Deployment

kubectl logs -l app=geofm-gateway -n <namespace>
# Or for OpenShift
oc logs -l app=geofm-gateway -n <namespace>

Keycloak Authentication Fails

Problem: Cannot log in or Keycloak returns errors.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Check Keycloak pod:

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -l app=keycloak -n default
kubectl logs -l app=keycloak -n default

Verify Keycloak setup:

# Re-run Keycloak setup script
./deployment-scripts/setup-keycloak.sh

Check port forwarding:

# Ensure port-forward is active
ps aux | grep "port-forward.*keycloak"

# Test endpoint
curl http://localhost:8080/realms/geostudio

Restart Keycloak port-forward if needed:

pkill -f "port-forward.*keycloak"
kubectl port-forward -n default svc/keycloak 8080:8080 >> studio-pf.log 2>&1 &

Cluster Deployment

Check Keycloak pod:

kubectl get pods -l app=keycloak -n <namespace>
kubectl logs -l app=keycloak -n <namespace>

Verify Keycloak configuration:

# Check if realm exists
kubectl port-forward -n <namespace> svc/keycloak 8080:8080 &
curl http://localhost:8080/realms/geostudio

Check OAuth environment variables:

# Verify in workspace env files
cat workspace/<deployment-env>/env/env.sh | grep OAUTH

For OpenShift, check routes:

oc get route keycloak -n <namespace>
oc describe route keycloak -n <namespace>

SDK Authentication Fails

Problem: Client() initialization fails with authentication error.

Solutions:

Verify API key format:

# Check your .geostudio_config_file
cat .geostudio_config_file

Ensure correct URL:

# Use the correct base URL
client = Client(
    api_key="your-key",
    base_url="https://localhost:4180"  # Include https://
)

Check SSL certificate:

# For self-signed certificates
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

client = Client(
    api_key="your-key",
    base_url="https://localhost:4180",
    verify_ssl=False
)

Verify API key is valid:

# Test API key with curl
curl -k -H "Authorization: Bearer <your-api-key>" https://localhost:4181/health

📊 Data Issues

MinIO/S3 Connection Fails

Problem: Cannot connect to object storage.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Check MinIO pod:

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -l app=minio -n default
kubectl logs -l app=minio -n default

Verify MinIO credentials:

# Default credentials
# Access Key: minioadmin
# Secret Key: minioadmin

# Check in workspace env
cat workspace/lima/env/.env | grep -E "access_key_id|secret_access_key"

Test MinIO connection:

# Ensure port-forward is active
ps aux | grep "port-forward.*minio"

# Test endpoint
curl -k https://localhost:9000/minio/health/live

Restart MinIO port-forwards if needed:

pkill -f "port-forward.*minio"
kubectl port-forward -n default svc/minio 9000:9000 >> studio-pf.log 2>&1 &
kubectl port-forward -n default svc/minio 9001:9001 >> studio-pf.log 2>&1 &

Verify buckets were created:

# Re-run bucket creation script
python deployment-scripts/create_buckets.py --env-path workspace/lima/env/.env

Cluster Deployment

Check MinIO pod:

kubectl get pods -l app=minio -n <namespace>
kubectl logs -l app=minio -n <namespace>

Verify MinIO service:

kubectl get svc minio -n <namespace>
kubectl describe svc minio -n <namespace>

Check MinIO TLS secret:

kubectl get secret minio-tls-secret -n <namespace>
kubectl describe secret minio-tls-secret -n <namespace>

Test MinIO connectivity:

# Port-forward and test
kubectl port-forward -n <namespace> svc/minio 9000:9000 &
curl -k https://localhost:9000/minio/health/live

Verify buckets were created:

# Re-run bucket creation script
python deployment-scripts/create_buckets.py --env-path workspace/<deployment-env>/env/.env

Dataset Onboarding Fails

Problem: Dataset upload or onboarding process fails.

Solutions:

Check file format:
Must be a ZIP file
Contains matching data and label pairs
Files have correct suffixes

Verify file structure:

dataset.zip
├── tile_001_merged.tif
├── tile_001_mask.tif
├── tile_002_merged.tif
└── tile_002_mask.tif
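
Unmatched data and label files are easy to miss in a large archive. The sketch below checks a ZIP for tiles that lack a partner, assuming the `_merged.tif` and `_mask.tif` suffixes from the example layout above; your dataset may use different suffixes, so treat both constants as placeholders. The helper name is hypothetical and not part of the SDK.

```python
import io
import zipfile

# Suffixes taken from the example layout above; adjust to match your dataset.
DATA_SUFFIX = "_merged.tif"
LABEL_SUFFIX = "_mask.tif"


def unpaired_tiles(zip_file):
    """Return (data_without_label, label_without_data) tile stems in a ZIP."""
    with zipfile.ZipFile(zip_file) as archive:
        names = [n for n in archive.namelist() if not n.endswith("/")]
    data = {n[: -len(DATA_SUFFIX)] for n in names if n.endswith(DATA_SUFFIX)}
    labels = {n[: -len(LABEL_SUFFIX)] for n in names if n.endswith(LABEL_SUFFIX)}
    return sorted(data - labels), sorted(labels - data)


# Build a small in-memory archive for demonstration.
# In practice pass a path such as "dataset.zip".
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ("tile_001_merged.tif", "tile_001_mask.tif", "tile_002_merged.tif"):
        archive.writestr(name, b"")
buffer.seek(0)

missing_labels, missing_data = unpaired_tiles(buffer)
print("data files without a label:", missing_labels)
print("label files without data:", missing_data)
```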

Check file size limits:
Individual files: < 2GB
Total dataset: < 10GB

Validate band configuration:

# Ensure band count matches your data
"bands": [
    {"index": "0", "band_name": "Blue", ...},
    {"index": "1", "band_name": "Green", ...},
    # ... must match actual bands in files
]

Cannot Access Pre-computed Examples

Problem: Example datasets not visible in UI or SDK.

Solutions:

Check if examples are loaded:
```
client.list_datasets()
```
Verify backend is running:

Local Deployment (Lima VM)

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -l app=geofm-gateway -n default
kubectl logs -l app=geofm-gateway -n default

Cluster Deployment

kubectl get pods -l app=geofm-gateway -n <namespace>
kubectl logs -l app=geofm-gateway -n <namespace>

Check database initialization:

Local Deployment (Lima VM)

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -l app.kubernetes.io/name=postgresql -n default
kubectl logs -l app.kubernetes.io/name=postgresql -n default

Cluster Deployment

kubectl get pods -l app.kubernetes.io/name=postgresql -n <namespace>
kubectl logs -l app.kubernetes.io/name=postgresql -n <namespace>

File Not Found Errors in Notebooks

Problem: You see errors like:

FileNotFoundError: [Errno 2] No such file or directory: 'template-seg.json'

Solutions:

Option 1: Clone the repository (Recommended)

git clone https://github.com/terrastackai/geospatial-studio.git
cd geospatial-studio/workshop/docs/notebooks
jupyter notebook

Option 2: Download missing files

If you downloaded notebooks individually, you need to also download the JSON configuration files:

Lab 3 requires:
template-seg.json
tune-prithvi-eo-flood.json
Lab 4 requires:
backbone-Prithvi_EO_V2_300M.json
dataset-burn_scars.json
template-seg.json

Download these files from the notebooks directory and place them in the same directory as your notebook.

Verify files are in the correct location:

# Check current directory
pwd

# List files
ls -la *.json

# Should see the required JSON files
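
The same check can be run from inside a notebook before the first cell that reads a configuration file. This is a small sketch using only the standard library; the file names are the ones listed above for Lab 3 and Lab 4, and the helper itself is hypothetical.

```python
from pathlib import Path

# File names as listed above for each lab.
REQUIRED_FILES = {
    "Lab 3": ["template-seg.json", "tune-prithvi-eo-flood.json"],
    "Lab 4": [
        "backbone-Prithvi_EO_V2_300M.json",
        "dataset-burn_scars.json",
        "template-seg.json",
    ],
}


def missing_files(lab, directory="."):
    """Return the required files for `lab` that are absent from `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES[lab] if not (base / name).is_file()]


for lab in REQUIRED_FILES:
    missing = missing_files(lab)
    if missing:
        print(f"{lab}: missing {', '.join(missing)}")
    else:
        print(f"{lab}: all files present")
```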

🤖 Model Training Issues

Fine-tuning Job Fails

Problem: Training job fails or gets stuck.

Solutions:

Check GPU availability:

Local Deployment

nvidia-smi  # Should show available GPUs

Cluster Deployment

# Check GPU nodes
kubectl get nodes -o json | jq '.items[].status.capacity."nvidia.com/gpu"'

# Check GPU operator
kubectl get pods -n gpu-operator-resources

# Verify node labels
kubectl get nodes --show-labels | grep nvidia

# Check GPU resource allocation
kubectl describe node <node-name> | grep -A 5 "Allocated resources"

Verify dataset is onboarded:

dataset = client.get_dataset(dataset_id)
print(dataset['status'])  # Should be 'COMPLETED'

Check training parameters:

# Reduce batch size if OOM errors
task_params['data']['batch_size'] = 2

# Reduce epochs for testing
task_params['runner']['max_epochs'] = 1

Monitor MLflow logs:

Local Deployment (Lima VM) | Cluster Deployment

Access MLflow UI at http://localhost:5000

Check experiment logs for errors

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl logs -l app=geofm-mlflow -n default

Ensure MLflow port-forward is active:

ps aux | grep "port-forward.*mlflow"
# If not active, restart:
kubectl port-forward -n default deployment/geofm-mlflow 5000:5000 >> studio-pf.log 2>&1 &

Cluster Deployment

Access MLflow UI via port-forward or route

Check experiment logs

kubectl logs -l app=geofm-mlflow -n <namespace>
# Or for OpenShift
oc logs -l app=geofm-mlflow -n <namespace>

# Access MLflow UI
kubectl port-forward -n <namespace> svc/geofm-mlflow 5000:5000

Out of Memory (OOM) Errors

Problem: Training fails with CUDA out of memory.

Solutions:

Reduce batch size:

task_params['data']['batch_size'] = 2  # or 1

Use gradient accumulation:

task_params['trainer']['accumulate_grad_batches'] = 4
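
Gradient accumulation lets you lower the per-step batch size without shrinking the batch the optimizer effectively sees. The arithmetic is sketched below, assuming the usual behavior of an `accumulate_grad_batches` setting, where gradients from that many batches are summed before each optimizer step. The `num_devices` parameter is an illustrative addition and the helper is hypothetical.

```python
def effective_batch_size(batch_size, accumulate_grad_batches, num_devices=1):
    """Samples contributing to each optimizer step."""
    return batch_size * accumulate_grad_batches * num_devices


# Values from the snippets above: batch size 2, accumulate over 4 batches.
task_params = {"data": {"batch_size": 2}, "trainer": {"accumulate_grad_batches": 4}}

print(
    effective_batch_size(
        task_params["data"]["batch_size"],
        task_params["trainer"]["accumulate_grad_batches"],
    )
)  # prints 8
```

With these values, each step holds only 2 samples in GPU memory while the optimizer updates on the equivalent of 8.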

Enable mixed precision:

task_params['trainer']['precision'] = '16-mixed'

Clear GPU cache:
```
import torch
torch.cuda.empty_cache()
```
Check GPU memory:

Cluster Deployment

# Check GPU memory usage on nodes
kubectl exec -it <training-pod> -n <namespace> -- nvidia-smi

🔄 Inference Issues

Inference Request Fails

Problem: Inference submission returns error.

Solutions:

Verify model is deployed:

models = client.list_tunes()
print(models)

Check spatial domain format:

# Correct bbox format: [min_lon, min_lat, max_lon, max_lat]
"bbox": [[-121.84, 39.83, -121.64, 40.04]]

Validate temporal domain:

# Correct format: YYYY-MM-DD_YYYY-MM-DD
"temporal_domain": ["2024-08-12_2024-08-13"]

Check data availability:
Ensure satellite data exists for your date range
Try a different date if no data available
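
The bbox and temporal domain formats above can be validated locally before submitting a request. The helpers below are a minimal sketch with hypothetical names, not SDK functions. They assume a plain `[min_lon, min_lat, max_lon, max_lat]` box in degrees that does not cross the antimeridian, and a `YYYY-MM-DD_YYYY-MM-DD` range string.

```python
from datetime import datetime


def validate_bbox(bbox):
    """Return a list of problems for a [min_lon, min_lat, max_lon, max_lat] box."""
    if len(bbox) != 4:
        return [f"expected 4 numbers, got {len(bbox)}"]
    min_lon, min_lat, max_lon, max_lat = bbox
    problems = []
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        problems.append("longitude outside [-180, 180]")
    if not (-90 <= min_lat <= 90 and -90 <= max_lat <= 90):
        problems.append("latitude outside [-90, 90] (are lon and lat swapped?)")
    if min_lon >= max_lon:
        problems.append("min_lon must be less than max_lon")
    if min_lat >= max_lat:
        problems.append("min_lat must be less than max_lat")
    return problems


def parse_temporal_range(value):
    """Parse 'YYYY-MM-DD_YYYY-MM-DD' into (start, end) dates or raise ValueError."""
    start_text, separator, end_text = value.partition("_")
    if not separator:
        raise ValueError(f"missing '_' between start and end date: {value!r}")
    start = datetime.strptime(start_text, "%Y-%m-%d").date()
    end = datetime.strptime(end_text, "%Y-%m-%d").date()
    if start > end:
        raise ValueError(f"start date {start} is after end date {end}")
    return start, end


print(validate_bbox([-121.84, 39.83, -121.64, 40.04]))  # well-formed box
print(validate_bbox([39.83, -121.84, 40.04, -121.64]))  # lon and lat swapped
print(parse_temporal_range("2024-08-12_2024-08-13"))
```

An empty list from `validate_bbox` means no problems were found; it does not guarantee that satellite data exists for the area or dates.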

Inference Takes Too Long

Problem: Inference job runs for hours without completing.

Solutions:

Reduce spatial extent:

# Use smaller bounding box for testing
bbox = [-121.80, 39.90, -121.70, 40.00]

Check task status:
```
client.get_inference(inference_id)
```
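
Rather than re-running the status call by hand, you can poll it with a timeout. The helper below is a generic sketch that takes any zero-argument function returning a status string. The 'FINISHED' value appears elsewhere in this guide; the 'FAILED' value and the idea that the SDK response exposes a `status` field are assumptions to verify against your SDK version.

```python
import time


def wait_for_status(
    fetch_status,
    done=("FINISHED",),
    failed=("FAILED",),  # assumed failure status; adjust to what your API returns
    interval=30.0,
    timeout=3600.0,
    sleep=time.sleep,
):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"job ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)


# Demonstration with a fake status source instead of a live client.
fake_statuses = iter(["PENDING", "RUNNING", "FINISHED"])
print(wait_for_status(lambda: next(fake_statuses), interval=0, sleep=lambda _: None))

# Hypothetical use with the SDK, assuming the response has a "status" key:
#   wait_for_status(lambda: client.get_inference(inference_id)["status"])
```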
Monitor backend logs:

Local Deployment (Lima VM)

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl logs -l app=inference-service -n default -f

Cluster Deployment

kubectl logs -l app=inference-service -n <namespace> -f
# Or for OpenShift
oc logs -l app=inference-service -n <namespace> -f

Cannot Download Inference Results

Problem: Download links expired or files not found.

Solutions:

Check task completion:

tasks = client.get_inference_tasks(inference_id)
# Ensure status is 'FINISHED'

Regenerate download links:

# Links expire after 24 hours
client.get_inference_tasks(inference_id)  # Gets fresh links

Use SDK download widget:

from geostudio import gswidgets
gswidgets.fileDownloaderTasks(client=client, task_id=task_id)

🌐 Network Issues

Cannot Access UI

Problem: Cannot reach the Studio UI in browser.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Problem: Browser cannot reach https://localhost:4180.

Check Lima VM is running:

limactl list
# Status should be "Running"

Check if pods are running:

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -n default

Verify port forwarding is active:

ps aux | grep "port-forward.*geofm-ui"

# If not active, restart:
kubectl port-forward -n default deployment/geofm-ui 4180:4180 >> studio-pf.log 2>&1 &

Test endpoint:
```
curl -k https://localhost:4180
```

Check firewall settings:

# macOS
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --getglobalstate

# Linux
sudo ufw status

Try different browser:
Clear cache and cookies
Try incognito/private mode
Accept self-signed certificate

Check Lima VM logs:

limactl shell studio
# Inside VM, check system logs
journalctl -xe

Cluster Deployment

Problem: Cannot reach Studio UI via ingress/route.

Check ingress/route status:

# Kubernetes
kubectl get ingress -n <namespace>
kubectl describe ingress geofm-ui -n <namespace>

# OpenShift
oc get routes -n <namespace>
oc describe route geofm-ui -n <namespace>

Verify DNS resolution:

# Get the route URL
export UI_ROUTE_URL=$(oc get route geofm-ui -n <namespace> -o jsonpath='{"https://"}{.spec.host}')
echo $UI_ROUTE_URL

# Test DNS
nslookup <hostname>
dig <hostname>

Test internal connectivity:

# Port-forward to test
kubectl port-forward -n <namespace> deployment/geofm-ui 4180:4180 &
# Then access https://localhost:4180

Check ingress controller:

# Kubernetes
kubectl get pods -n ingress-nginx
kubectl logs -n ingress-nginx <ingress-controller-pod>

# OpenShift (uses built-in router)
oc get pods -n openshift-ingress
oc logs -n openshift-ingress <router-pod>

Verify TLS certificates:

# Check TLS secret
kubectl get secret -n <namespace> | grep tls
kubectl describe secret <tls-secret-name> -n <namespace>

SSL Certificate Errors

Problem: Browser shows SSL/TLS errors.

Solutions:

Accept self-signed certificate:
Click "Advanced" → "Proceed to localhost"
Add exception in browser settings

For cluster deployments, check certificate:

# View certificate details
openssl s_client -connect <hostname>:443 -showcerts

Regenerate certificates if needed:

Cluster Deployment

# For Kubernetes
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=<your-domain>"

kubectl create secret tls <secret-name> \
  --cert=tls.crt --key=tls.key -n <namespace>

Geoserver Connection Issues

Problem: Cannot access Geoserver or layers not loading.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Check Geoserver pod:

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -l app.kubernetes.io/name=gfm-geoserver -n default
kubectl logs -l app.kubernetes.io/name=gfm-geoserver -n default

Verify Geoserver port-forward:

ps aux | grep "port-forward.*geoserver"

# If not active, restart:
kubectl port-forward -n default svc/geofm-geoserver 3000:3000 >> studio-pf.log 2>&1 &

Test Geoserver endpoint:

curl http://localhost:3000/geoserver/web/

Verify Geoserver credentials:

# Default credentials
# Username: admin
# Password: geoserver

# Check in workspace env
cat workspace/lima/env/.env | grep geoserver

Re-run Geoserver setup:

./deployment-scripts/setup_geoserver.sh

Cluster Deployment

Check Geoserver pod:

kubectl get pods -l app.kubernetes.io/name=gfm-geoserver -n <namespace>
kubectl logs -l app.kubernetes.io/name=gfm-geoserver -n <namespace>

Verify Geoserver service:

kubectl get svc geofm-geoserver -n <namespace>
kubectl describe svc geofm-geoserver -n <namespace>

Test Geoserver connectivity:

kubectl port-forward -n <namespace> svc/geofm-geoserver 3000:3000 &
curl http://localhost:3000/geoserver/web/

Re-run Geoserver setup:

./deployment-scripts/setup_geoserver.sh

For OpenShift with SCC issues:

# Check if anyuid SCC is applied
oc describe scc anyuid | grep <namespace>

# Apply if needed (requires admin)
oc adm policy add-scc-to-user anyuid -n <namespace> -z default

🗄️ Database Issues

Database Connection Fails

Problem: Services cannot connect to PostgreSQL.

Solutions:

Local Deployment (Lima VM) | Cluster Deployment

Check PostgreSQL pod:

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl get pods -l app.kubernetes.io/name=postgresql -n default
kubectl logs -l app.kubernetes.io/name=postgresql -n default

Verify database credentials:

# Check workspace env file
cat workspace/lima/env/.env | grep pg_

# Default password: devPostgresql123

Check database port-forward:

ps aux | grep "port-forward.*postgresql"

# If not active, restart:
kubectl port-forward -n default svc/postgresql 54320:5432 >> studio-pf.log 2>&1 &

Test database connection:

# Requires the psql client; install it first if needed
psql -h localhost -p 54320 -U postgres -d geostudio

Check database logs:

kubectl logs -l app.kubernetes.io/name=postgresql -n default

Re-create databases if needed:

python deployment-scripts/create_studio_dbs.py \
  --env-path workspace/lima/env/.env

Cluster Deployment

Check PostgreSQL pod:

kubectl get pods -l app.kubernetes.io/name=postgresql -n <namespace>
kubectl logs -l app.kubernetes.io/name=postgresql -n <namespace>

Verify database credentials:

kubectl get secret postgresql -n <namespace> -o yaml

# Decode password
kubectl get secret postgresql -n <namespace> -o jsonpath='{.data.postgres-password}' | base64 -d

Check database connectivity:

# Port-forward to database
kubectl port-forward -n <namespace> svc/postgresql 54320:5432 &

# Test connection
psql -h localhost -p 54320 -U postgres -d geostudio

Check PVC status:

kubectl get pvc -n <namespace>
kubectl describe pvc postgresql-pvc -n <namespace>

Re-create databases:

python deployment-scripts/create_studio_dbs.py \
  --env-path workspace/<deployment-env>/env/.env

Database Migration Fails

Problem: Database schema migration errors.

Solutions:

Check database logs for errors:

Local Deployment (Lima VM)

export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
# --tail=-1 searches the full log; with a selector, kubectl shows only the last 10 lines per pod by default
kubectl logs -l app.kubernetes.io/name=postgresql -n default --tail=-1 | grep ERROR

Cluster Deployment

kubectl logs -l app.kubernetes.io/name=postgresql -n <namespace> --tail=-1 | grep ERROR

Verify database exists:

# Connect to database
psql -h localhost -p 54320 -U postgres

# List databases
\l

# Check if geostudio database exists
\c geostudio

Re-run database creation:

python deployment-scripts/create_studio_dbs.py \
  --env-path workspace/<deployment-env>/env/.env

🔍 Debugging Tips

Enable Debug Logging

import logging
logging.basicConfig(level=logging.DEBUG)

Check Service Health

Local Deployment (Lima VM) | Cluster Deployment

# Set kubeconfig
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"

# Check all pods
kubectl get pods -n default

# Check specific pod logs
kubectl logs <pod-name> -n default -f

# Check resource usage
kubectl top pods -n default
kubectl top nodes

# Check Lima VM status
limactl list

# Check Lima VM resources
limactl shell studio
# Inside VM:
df -h
free -h
top

Cluster Deployment

# Check all pods
kubectl get pods -n <namespace>

# Check specific pod logs
kubectl logs <pod-name> -n <namespace> -f

# Check resource usage
kubectl top pods -n <namespace>
kubectl top nodes

# Check all resources
kubectl get all -n <namespace>

Inspect Container/Pod

Local Deployment (Lima VM) | Cluster Deployment

# Set kubeconfig
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"

# Enter running pod
kubectl exec -it <pod-name> -n default -- /bin/bash

# Check environment variables
kubectl exec <pod-name> -n default -- env

# Check file system
kubectl exec <pod-name> -n default -- ls -la

# Copy files from pod
kubectl cp default/<pod-name>:/path/to/file ./local-file

# Access Lima VM directly
limactl shell studio

Cluster Deployment

# Enter running pod
kubectl exec -it <pod-name> -n <namespace> -- /bin/bash

# Check environment variables
kubectl exec <pod-name> -n <namespace> -- env

# Check file system
kubectl exec <pod-name> -n <namespace> -- ls -la

# Copy files from pod
kubectl cp <namespace>/<pod-name>:/path/to/file ./local-file

Monitor Events

Cluster Deployment

# Watch events in namespace
kubectl get events -n <namespace> --watch

# Get events for specific pod
kubectl describe pod <pod-name> -n <namespace> | grep Events -A 20

# Sort events by timestamp
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

Check Helm Deployment

Cluster Deployment

# List Helm releases
helm list -n <namespace>

# Get Helm release status
helm status geospatial-studio -n <namespace>

# Get Helm values
helm get values geospatial-studio -n <namespace>

# Check Helm history
helm history geospatial-studio -n <namespace>

Validate Environment Configuration

# Use the validation script
python deployment-scripts/validate-env-files.py \
  --env-file workspace/<deployment-env>/env/.env \
  --env-sh-file workspace/<deployment-env>/env/env.sh \
  --env-variables "studio_api_key,access_key_id,secret_access_key" \
  --env-sh-variables "DEPLOYMENT_ENV,OC_PROJECT,CLUSTER_URL"
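
If the validation script is not at hand (for example when inspecting an env file someone sent you), a rough equivalent of the "required variables" check can be sketched in a few lines. This is an illustrative stand-in with a hypothetical helper name, not the logic of `validate-env-files.py`; the variable names come from the command above.

```python
def missing_variables(env_text, required):
    """Return the required names that are absent or empty in KEY=value text."""
    defined = set()
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        if value.strip():  # treat empty values as not set
            defined.add(key.strip())
    return [name for name in required if name not in defined]


sample = "studio_api_key=abc123\naccess_key_id=\n"
required = ["studio_api_key", "access_key_id", "secret_access_key"]

print(missing_variables(sample, required))
```

Here `access_key_id` is reported because its value is empty, and `secret_access_key` because it is not defined at all.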

📞 Getting Help

If you're still experiencing issues:

Collect logs:

Local Deployment (Lima VM) | Cluster Deployment

# Set kubeconfig
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"

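# Collect all pod logs (--tail=-1 because kubectl returns only the last 10 lines per pod when a selector is used)
kubectl logs -l app=geospatial-studio -n default --all-containers=true --tail=-1 > logs.txt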

# Collect events
kubectl get events -n default --sort-by='.lastTimestamp' > events.txt

# Collect Lima VM info
limactl list > lima-status.txt
limactl shell studio df -h > lima-disk.txt

# Collect port-forward logs
cat studio-pf.log > port-forward-logs.txt

Cluster Deployment

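# Collect all pod logs (--tail=-1 because kubectl returns only the last 10 lines per pod when a selector is used)
kubectl logs -l app=geospatial-studio -n <namespace> --all-containers=true --tail=-1 > logs.txt

# Or use stern for better log aggregation
# (stern follows logs by default; press Ctrl+C once enough has been captured)
stern -n <namespace> . > logs.txt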

# Collect events
kubectl get events -n <namespace> --sort-by='.lastTimestamp' > events.txt

Gather system information:

Local Deployment (Lima VM) | Cluster Deployment

# Host system info
limactl --version
kubectl version --client
helm version
python --version
pip list

# Lima VM info
limactl list
export KUBECONFIG="$HOME/.lima/studio/copied-from-guest/kubeconfig.yaml"
kubectl version
kubectl get nodes -o wide

# Workspace info
ls -la workspace/lima/env/
cat workspace/lima/env/env.sh | grep -E "DEPLOYMENT_ENV|OC_PROJECT"

Cluster Deployment

kubectl version
helm version
python --version
pip list

# Cluster information
kubectl cluster-info
kubectl get nodes -o wide

# For OpenShift
oc version
oc get clusterversion

Check deployment configuration:

Local Deployment (Lima VM)

# Review workspace environment files
cat workspace/lima/env/env.sh
cat workspace/lima/env/.env

# Review Helm values
cat workspace/lima/values/geospatial-studio/values-deploy.yaml

# Check Lima VM configuration
cat deployment-scripts/lima/studio.yaml  # macOS
cat deployment-scripts/lima/studio-linux.yaml  # Linux

Cluster Deployment

# Review workspace environment files
cat workspace/<deployment-env>/env/env.sh
cat workspace/<deployment-env>/env/.env

# Review Helm values
cat workspace/<deployment-env>/values/geospatial-studio/values-deploy.yaml

Search existing issues:
Geospatial Studio Issues
Geospatial Studio Toolkit Issues
Create a new issue:
Include error messages
Provide steps to reproduce
Share relevant logs
Mention your environment (OS, deployment type, cluster version, etc.)
Community support:
Check FAQ for common questions
Review Additional Resources for documentation
