This page is for the team that owns the live integration after implementation is complete. Use it to define:
  • What must be checked before release
  • What must be monitored after release
  • How to respond to common incidents
  • How to rotate credentials and publish content safely

Operating model

Every production integration should have clear ownership for:
  • Runtime configuration
  • API key lifecycle
  • Document rollout
  • Monitoring and alerts
  • Incident response
  • Release approvals
If more than one service calls Uppzy, define one owning team and one shared escalation path.

Environment checklist

Keep these values explicit per environment:
  • UPPZY_BASE_URL
  • UPPZY_API_KEY
  • UPPZY_TENANT_ID
  • UPPZY_SITE_ID
  • Request timeout settings
  • Retry settings
  • Async polling timeout settings
Do not reuse production keys in development or staging.
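One way to keep these values explicit is to fail fast at startup when any of them is missing. The sketch below is a minimal example, assuming the four `UPPZY_*` variables are exported as in the smoke test script. The function name `check_uppzy_env` is hypothetical, and the timeout and retry settings are left out because this page does not name their variables.

```shell
# Hypothetical helper: report which required Uppzy settings are missing.
# Prints each missing name to stderr and returns 1 if any is unset or empty.
check_uppzy_env() {
  local missing=0 name
  for name in UPPZY_BASE_URL UPPZY_API_KEY UPPZY_TENANT_ID UPPZY_SITE_ID; do
    # printenv only sees exported variables, so a value that was set
    # but never exported is also reported as missing.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_uppzy_env || exit 1` at the top of a deploy or smoke script stops the run before any request is sent with incomplete configuration.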

Pre-release checklist

Before a production rollout:
  1. Confirm the correct tenant_id and site_id
  2. Confirm the deployment has the correct API key
  3. Run a tenant connectivity check
  4. Run one document operation if the release affects content sync
  5. Run one sync chat smoke test
  6. Run one async chat smoke test if async mode is used
  7. Run one feedback submission test if feedback is wired
  8. Confirm dashboards and alerts are active

Smoke test script

Use a small smoke flow after deployment and after any key rotation.
# Configuration for the target environment. Replace the placeholders
# with the values for the environment you are testing.
export UPPZY_BASE_URL="https://api.uppzy.com/api/v1"
export UPPZY_API_KEY="<YOUR_API_KEY>"
export UPPZY_TENANT_ID="<YOUR_TENANT_ID>"
export UPPZY_SITE_ID="<YOUR_SITE_ID>"

# Step 1: confirm the key can reach the tenant.
echo "1) Connectivity"
curl --silent --show-error \
  --request GET \
  --url "$UPPZY_BASE_URL/m2m/tenants/$UPPZY_TENANT_ID/limits" \
  --header "X-API-Key: $UPPZY_API_KEY"

# Print a blank line so the two responses are easy to tell apart.
echo
# Step 2: send one synchronous chat message to the site.
echo "2) Sync chat"
curl --silent --show-error \
  --request POST \
  --url "$UPPZY_BASE_URL/m2m/sites/$UPPZY_SITE_ID/chat" \
  --header "X-API-Key: $UPPZY_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "session_id": "ops_smoke_sync",
    "message": "What are your support hours?",
    "response_language": "en"
  }'
Keep smoke questions simple and based on approved, stable content.
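As written, curl exits with status 0 even when the server answers with a 4xx or 5xx, so the script only fails visibly if someone reads the output. If the smoke test runs unattended, it may help to check the HTTP status explicitly. The following is a sketch under assumptions: the function names `is_2xx` and `smoke_connectivity` and the 10-second `--max-time` are illustrative choices, not part of the Uppzy documentation.

```shell
# Hypothetical helper: succeed only when an HTTP status code is 2xx.
is_2xx() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Run the connectivity call and return non-zero on any non-2xx status.
# --write-out '%{http_code}' prints the status code; the body is discarded.
smoke_connectivity() {
  local status
  status=$(curl --silent --show-error --output /dev/null \
    --write-out '%{http_code}' \
    --max-time 10 \
    --request GET \
    --url "$UPPZY_BASE_URL/m2m/tenants/$UPPZY_TENANT_ID/limits" \
    --header "X-API-Key: $UPPZY_API_KEY") || status="000"
  if is_2xx "$status"; then
    echo "connectivity OK ($status)"
  else
    echo "connectivity FAILED ($status)" >&2
    return 1
  fi
}
```

The same pattern can wrap the sync chat call, so a deployment pipeline can stop on the first failed step.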

Dashboard checklist

At minimum, your dashboards should show:
  • Request volume by endpoint
  • Success rate by endpoint
  • 4xx, 429, and 5xx rate
  • P50 and P95 latency
  • Timeout count
  • Async completion latency
  • Feedback trend
Useful dimensions:
  • Environment
  • Integration name
  • Tenant ID
  • Site ID

Alert checklist

Start with simple operational alerts:
  • Repeated 401 over a short period
  • Repeated 403 after a deployment or configuration change
  • Sustained 429 rate
  • Sustained 5xx rate
  • Timeout spike
  • Async polling timeout spike
  • Sudden drop in request volume for a critical integration
Avoid alerting on isolated, one-off failures; they create noise without indicating a real problem.
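One possible way to express "sustained" is to require several consecutive measurement windows above a threshold before firing. The sketch below assumes you already collect a whole-number error percentage per window; the function name `sustained_breach` and the example threshold values are hypothetical.

```shell
# Hypothetical helper: decide whether an error rate counts as sustained.
# Usage: sustained_breach THRESHOLD_PCT MIN_WINDOWS RATE1 RATE2 ...
# Rates are whole-number percentages per window, oldest first.
# Succeeds if the most recent MIN_WINDOWS rates are all above the threshold.
sustained_breach() {
  local threshold=$1 min_windows=$2 streak=0 rate
  shift 2
  for rate in "$@"; do
    if [ "$rate" -gt "$threshold" ]; then
      streak=$((streak + 1))
    else
      streak=0   # a healthy window resets the streak
    fi
  done
  [ "$streak" -ge "$min_windows" ]
}
```

For example, `sustained_breach 5 3 1 9 8 7` succeeds because the last three windows exceed 5%, while a single spike followed by a healthy window does not.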

Content rollout runbook

When publishing a new content batch:
  1. Verify that the source content is approved
  2. Start with a small batch
  3. Record the source IDs in your own sync log
  4. Run smoke questions for the changed topics
  5. Monitor feedback and low-confidence outcomes
  6. Expand the batch only after answer quality is acceptable
Pause the rollout if:
  • Bad feedback rises sharply
  • Expected answers stop matching approved content
  • 5xx or timeout rate increases during bulk sync
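The sync log from step 3 can be as simple as an append-only text file. The following is a minimal sketch, assuming a tab-separated file is acceptable for your team; the function name `record_sync`, the `SYNC_LOG` variable, and the default file name are hypothetical.

```shell
# Hypothetical helper: append one line per synced source to a local sync log.
# Usage: record_sync BATCH_NAME SOURCE_ID
# Each line holds a UTC timestamp, the batch name, and the source ID.
record_sync() {
  printf '%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" >> "${SYNC_LOG:-sync.log}"
}
```

If a rollout has to be paused, a log like this makes it straightforward to list which source IDs were part of the affected batch.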

API key rotation runbook

Treat key rotation as a planned operational change. Recommended sequence:
  1. Create the replacement key
  2. Update runtime configuration in the target environment
  3. Deploy or reload the service
  4. Run the smoke test
  5. Confirm normal traffic and error rate
  6. Revoke the old key after verification
After rotation, watch closely for:
  • 401 spike
  • Retry storm
  • Worker failures using stale configuration
  • Async jobs still using an outdated client instance
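The ordering of steps 4 to 6 can be enforced in a script so that the old key is never revoked before verification passes. This page does not describe how keys are created or revoked, so the sketch below takes both actions as commands you supply; `revoke_after_verify` is a hypothetical name.

```shell
# Hypothetical sketch: revoke the old key only if verification passes.
# Usage: revoke_after_verify VERIFY_CMD REVOKE_CMD
# Both arguments are names of commands or shell functions that you provide.
revoke_after_verify() {
  local verify_cmd=$1 revoke_cmd=$2
  if "$verify_cmd"; then
    "$revoke_cmd"
  else
    echo "verification failed; old key left active" >&2
    return 1
  fi
}
```

Leaving the old key active on failure keeps a rollback path open while the configuration problem is investigated.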

Incident runbook

401 Unauthorized

Check:
  • Wrong key in runtime configuration
  • Old key still used after rotation
  • Missing key in one worker or one deployment target
Immediate action:
  • Stop repeated retries
  • Validate current configuration
  • Re-run connectivity smoke test

403 Forbidden

Check:
  • Wrong tenant or site targeting
  • Plan or capability mismatch
  • Endpoint used by the wrong integration flow
Immediate action:
  • Pause the affected rollout
  • Validate environment ownership and endpoint usage

404 Not Found

Check:
  • Wrong tenant_id
  • Wrong site_id
  • Wrong request_id
  • Resource deleted or outside the expected scope
Immediate action:
  • Inspect the exact identifier values in logs
  • Confirm which service produced the request

429 Too Many Requests

Check:
  • Traffic burst
  • Worker concurrency
  • Polling interval
  • Unexpected loop or duplicate retries
Immediate action:
  • Reduce concurrency
  • Increase backoff
  • Pause non-critical bulk sync jobs
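"Increase backoff" usually means growing the wait between attempts, up to a ceiling. A minimal sketch of capped exponential backoff follows; the function name `backoff_delay` and the example base and cap values are assumptions, and jitter is omitted for brevity.

```shell
# Hypothetical helper: capped exponential backoff delay in seconds.
# Usage: backoff_delay ATTEMPT BASE_SECONDS MAX_SECONDS   (ATTEMPT starts at 1)
backoff_delay() {
  local attempt=$1 base=$2 max=$3 delay
  delay=$base
  # Double the delay once per earlier attempt, stopping at the cap.
  while [ "$attempt" -gt 1 ] && [ "$delay" -lt "$max" ]; do
    delay=$((delay * 2))
    attempt=$((attempt - 1))
  done
  if [ "$delay" -gt "$max" ]; then
    delay=$max
  fi
  echo "$delay"
}
```

A retry loop could then call `sleep "$(backoff_delay "$attempt" 2 60)"` between attempts, giving waits of 2, 4, 8, and so on, never longer than 60 seconds.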

5xx or timeout spike

Check:
  • Current traffic shape
  • Bulk content sync jobs
  • Async worker backlog
  • Recent deployment or configuration change
Immediate action:
  • Keep retries bounded
  • Pause non-essential traffic if needed
  • Run a small smoke flow to confirm recovery
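The immediate actions in this runbook can be condensed into a small lookup that a client wrapper or on-call script prints next to a failing status. This is only a sketch: the function name `triage_status` and the short action labels are hypothetical, and the text after each label paraphrases the runbook entries above.

```shell
# Hypothetical helper: map an HTTP status to the runbook's first response.
triage_status() {
  case "$1" in
    401) echo "stop-retries: validate API key configuration" ;;
    403) echo "pause-rollout: validate tenant, site, and endpoint usage" ;;
    404) echo "inspect-ids: check tenant_id, site_id, and request_id in logs" ;;
    429) echo "back-off: reduce concurrency and pause bulk sync" ;;
    5[0-9][0-9]) echo "bounded-retry: pause non-essential traffic if needed" ;;
    *) echo "no-action" ;;
  esac
}
```

Note that only 429 and 5xx lead to further attempts; for 401, 403, and 404 the request is unlikely to succeed until configuration or identifiers are corrected.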

Deploy checklist for application teams

Use this checklist when a new service or release starts using Uppzy:
  • Shared client layer is in place
  • User-facing fallback messages are safe
  • Metrics emit tenant ID, site ID, status, and latency
  • Retry behavior is bounded
  • Async timeout budget is defined
  • Feedback flow is tested if used
  • One rollback path is documented
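For the metrics item, one lightweight option is to emit a single structured log line per request and let your log pipeline aggregate it. The sketch below assumes a key=value line format is acceptable; `emit_metric` and the field names are hypothetical, not an Uppzy convention.

```shell
# Hypothetical helper: emit one key=value metric line per Uppzy request.
# Usage: emit_metric ENDPOINT STATUS LATENCY_MS
# Tenant and site IDs are read from the environment, defaulting to "unknown".
emit_metric() {
  printf 'integration=uppzy tenant_id=%s site_id=%s endpoint=%s status=%s latency_ms=%s\n' \
    "${UPPZY_TENANT_ID:-unknown}" "${UPPZY_SITE_ID:-unknown}" "$1" "$2" "$3"
}
```

Including the tenant ID and site ID on every line makes it possible to slice dashboards by those dimensions later.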

Change log checklist

Record these changes in your internal release notes:
  • API key rotation date
  • Environment configuration change
  • New content batch release
  • Session strategy change
  • Retry policy change
  • Async polling timeout change
  • Alert threshold change

Weekly review checklist

At least once per week, review:
  • High-error endpoints
  • 429 trend
  • Timeout trend
  • Low-confidence answer trend
  • Bad feedback trend
  • Recent content rollout impact
Use this review to decide whether the next action should be:
  • Content improvement
  • Retry tuning
  • Polling change
  • Alert tuning
  • Rollback of a recent rollout