An SRE agent built on AWS Bedrock AgentCore. It lives in Slack, thinks with Claude Opus 4.6,
and keeps your infrastructure in check — with human-in-the-loop safety for every dangerous action.
@mention Orbit in any Slack channel. It processes your request through a serverless pipeline with built-in safety rails.
1. Slack Trigger
User @mentions Orbit in a Slack thread. The message hits API Gateway, gets signature-verified, deduplicated, and kicks off a Step Functions workflow.
2. Agent Processing
Step Functions invokes the Orbit agent on AgentCore via callback pattern. Claude Opus 4.6 processes the request with access to CloudWatch, Datadog, Jira, Confluence, and more.
3. Safe Response
Every tool call passes through a four-tier permission guard. Structural shell bypasses and catastrophic commands are auto-denied, dangerous actions require Slack approval, and safe commands auto-allow. Responses are chunked and posted back to the thread.
Architecture
Main Request Flow
From @mention to response — follow the path of a Slack message through the entire serverless pipeline.
Slack / API Gateway
Lambda Functions
Step Functions
AgentCore Runtime
DynamoDB
Slack Workspace
@mention Orbit → API Gateway receives events
Approve / Reject → API Gateway receives actions
Two API Gateway HTTP routes receive all Slack traffic. Every request is verified with HMAC-SHA256 before any processing occurs.
The Events route handles @mentions; the Actions route handles interactive button clicks from the HITL approval flow.
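The HMAC-SHA256 check can be sketched as follows, assuming Slack's documented `v0` signing scheme, in which the signature covers `v0:{timestamp}:{body}` and arrives in the `X-Slack-Signature` header. The function name, the five-minute replay window, and the `now` parameter are illustrative choices, not details taken from Orbit.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 60 * 5  # assumed replay window; not stated in the source


def verify_slack_signature(signing_secret, timestamp, body, signature, now=None):
    """Return True if `signature` matches Slack's v0 HMAC-SHA256 scheme."""
    now = time.time() if now is None else now
    try:
        # Reject stale or malformed timestamps before doing any crypto.
        if abs(now - int(timestamp)) > MAX_SKEW_SECONDS:
            return False
    except ValueError:
        return False
    base = f"v0:{timestamp}:{body}".encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    expected = f"v0={digest}"
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The raw request body has to be used exactly as received; re-serializing parsed JSON would change the bytes and break the comparison.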
Verification Lambda
1. HMAC-SHA256 signature check
2. Dedup via DynamoDB (1h TTL)
3. Start Step Functions
4. Return 200 within 3s
Validates the request signature, deduplicates via DynamoDB with a TTL, then starts asynchronous processing via Step Functions. It must acknowledge within Slack's 3-second window, or Slack retries the event.
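A minimal sketch of TTL-based deduplication, assuming a conditional write keyed on the Slack event ID. The attribute names `event_id` and `expires_at` and both function names are hypothetical; `table` is expected to behave like a boto3 DynamoDB `Table` resource.

```python
import time

DEDUP_TTL_SECONDS = 3600  # the 1h TTL mentioned for the dedup table


def build_dedup_put(event_id, now=None):
    """Build put_item arguments that only succeed for an unseen event."""
    now = int(time.time() if now is None else now)
    return {
        # DynamoDB TTL expects a Number attribute holding epoch seconds.
        "Item": {"event_id": event_id, "expires_at": now + DEDUP_TTL_SECONDS},
        "ConditionExpression": "attribute_not_exists(event_id)",
    }


def is_first_delivery(table, event_id, now=None):
    """Return True the first time an event is seen, False on a retry."""
    try:
        table.put_item(**build_dedup_put(event_id, now))
        return True
    except Exception as exc:
        # botocore's ClientError carries the error code in `exc.response`.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "ConditionalCheckFailedException":
            return False
        raise
```

Because the check and the write are a single conditional operation, two concurrent deliveries of the same event cannot both pass.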
A separate interactivity handler processes approval button clicks, using atomic DynamoDB updates to prevent race conditions. It supports both tool-level and workflow-level approval modes.
Step Functions (callback pattern)
▶PostThinking — post "Thinking…" to Slack
▶InvokeAgentWithCallback — waitForTaskToken
▶PostResult — update thread with response
▶Error handlers — 4 catch states
Callback pattern: Step Functions generates a unique task token and pauses, incurring no cost while it waits. The agent processes the request asynchronously and calls SendTaskSuccess when done.
Includes configurable retry with exponential backoff and jitter, execution timeouts, and multiple error handler states that post specific error messages back to the Slack thread.
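For illustration, a callback-pattern Task state might look like the dictionary below, written in Amazon States Language and built in Python so it can be serialized to JSON. The retry values, the heartbeat timeout, the payload keys, and the catch-state names (`PostTimeoutError`, `PostGenericError`) are assumptions; only `waitForTaskToken`, the task token, and `PostResult` come from the description above.

```python
import json


def build_invoke_state(function_name, heartbeat_seconds=3600):
    """Return an ASL Task state that pauses until the task token is returned."""
    return {
        "Type": "Task",
        # The .waitForTaskToken suffix selects the callback pattern.
        "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
        "Parameters": {
            "FunctionName": function_name,
            "Payload": {
                # $$.Task.Token is the context-object path to the token.
                "task_token.$": "$$.Task.Token",
                "slack_event.$": "$.slack_event",
            },
        },
        # Must exceed the agent's heartbeat interval (30 min in this design).
        "HeartbeatSeconds": heartbeat_seconds,
        "Retry": [
            {
                "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
                "IntervalSeconds": 2,
                "BackoffRate": 2.0,
                "MaxAttempts": 3,
                "JitterStrategy": "FULL",
            }
        ],
        "Catch": [
            {"ErrorEquals": ["States.Timeout"], "Next": "PostTimeoutError"},
            {"ErrorEquals": ["States.ALL"], "Next": "PostGenericError"},
        ],
        "Next": "PostResult",
    }


if __name__ == "__main__":
    print(json.dumps(build_invoke_state("invoke_agent"), indent=2))
```

Routing each error class to its own catch state is what lets the workflow post a specific message to the Slack thread rather than a generic failure.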
invoke_agent Lambda
Generate deterministic session ID from Slack thread
Invoke AgentCore with task token + prompt
Generates a deterministic session ID from the Slack thread context, ensuring all messages in the same thread share a session for multi-turn conversation. Fetches thread history for context injection.
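One way to derive such an ID is to hash the fields that identify a thread. This is a sketch under assumptions: the choice of fields (`team_id`, `channel_id`, `thread_ts`), the `slack-` prefix, and SHA-256 are illustrative and may differ from what Orbit actually does.

```python
import hashlib


def session_id_for_thread(team_id, channel_id, thread_ts):
    """Map a Slack thread to a stable session ID.

    The same thread always produces the same ID, so follow-up messages
    land in the same agent session without any lookup table.
    """
    key = f"{team_id}:{channel_id}:{thread_ts}"
    return "slack-" + hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Hashing rather than concatenating keeps the ID at a fixed length (70 characters here) and free of characters such as the dot in `thread_ts`.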
AgentCore Runtime (Orbit)
Spawns background thread, returns ACK
Claude Opus 4.6 processes the request
Sends SFN heartbeats every 30 min
Calls SendTaskSuccess when done
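The four steps above could be sketched like this. The `sfn` argument is assumed to be a boto3 Step Functions client, whose `send_task_heartbeat`, `send_task_success`, and `send_task_failure` methods are used; the function name and the `work` callable (a stand-in for the model run) are hypothetical.

```python
import json
import threading

HEARTBEAT_INTERVAL_SECONDS = 30 * 60  # "every 30 min" in the list above


def run_in_background(sfn, task_token, work, heartbeat_interval=HEARTBEAT_INTERVAL_SECONDS):
    """Start `work` on a background thread and return that thread at once."""
    done = threading.Event()

    def heartbeat_loop():
        # Event.wait returns False on timeout and True once `done` is set.
        while not done.wait(heartbeat_interval):
            sfn.send_task_heartbeat(taskToken=task_token)

    def worker():
        threading.Thread(target=heartbeat_loop, daemon=True).start()
        try:
            result = work()
        except Exception as exc:
            done.set()
            sfn.send_task_failure(
                taskToken=task_token, error=type(exc).__name__, cause=str(exc)
            )
            return
        done.set()
        sfn.send_task_success(taskToken=task_token, output=json.dumps(result))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread  # the caller can ACK the invocation immediately
```

Reporting failures through `send_task_failure` matters as much as the success path: without it, the workflow would sit paused until its heartbeat timeout expires.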
Tool Permission Guard (tool_guard_hook)
SAFE auto-allow — Read, Grep, CloudWatch, Lumigo, etc.
Skills: cloudwatch-guide, datadog-guide, lumigo-guide, jira-guide, confluence-guide, embrace-guide, tacobell-store-api, tacobell-menu-api
MCP servers: CloudWatch, Jira, Confluence, Lumigo, Datadog, Embrace
Session persistence: Claude session ID stored locally for conversation continuity across invocations.
Thread context: Injects prior Slack messages into the prompt (full, missed, or none, based on session freshness). Truncated to 2,000 chars/message, 80,000 chars total.
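The two truncation limits can be applied in a single pass. In this sketch the function name is hypothetical, and dropping the oldest messages first when the total budget is exceeded is an assumption; the source states only the limits themselves.

```python
MAX_CHARS_PER_MESSAGE = 2_000
MAX_CHARS_TOTAL = 80_000


def build_thread_context(messages):
    """Clip each message, then keep as many recent ones as fit the budget.

    `messages` is an oldest-first list of strings; the result is also
    oldest-first.
    """
    kept = []
    total = 0
    for text in reversed(messages):  # walk newest first so recent context wins
        clipped = text[:MAX_CHARS_PER_MESSAGE]
        if total + len(clipped) > MAX_CHARS_TOTAL:
            break
        kept.append(clipped)
        total += len(clipped)
    kept.reverse()
    return kept
```

With both limits in place, a single pasted log cannot crowd out the rest of the thread, and the injected context stays bounded however long the thread grows.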
DynamoDB
event-dedup-table
approval-tokens-table
Event dedup table: Prevents duplicate Slack event processing using TTL-based expiration.
Approval tokens table: Stores HITL approval state and tool context with automatic TTL cleanup.
Safety
Human-in-the-Loop Approval
When the tool guard classifies a command as dangerous, the agent pauses and asks a human reviewer via Slack buttons. Fail-closed on timeout.
Agent detects danger
Tool classified as DANGEROUS tier
The tool_guard_hook runs before every tool call. When a bash command matches dangerous patterns (rm -rf, kill -9, etc.) or a WebFetch targets an untrusted domain, the agent initiates the approval flow.
post_approval_request
Post Slack buttons
Store approval_id in DynamoDB
Generates a unique approval_id, stores the tool call context (command, arguments, reason) in DynamoDB, and posts a Slack message with [Approve] and [Reject] buttons to the thread.
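A sketch of the two artifacts this step produces: the DynamoDB item and the Slack Block Kit payload. The field names, the `approve` and `reject` action IDs, and the message wording are assumptions; the block structure follows Slack's Block Kit format for `section` and `actions` blocks.

```python
import uuid


def build_approval_request(command, arguments, reason, approval_id=None):
    """Return (item, blocks): the record to store and the message to post."""
    approval_id = approval_id or str(uuid.uuid4())
    item = {
        "approval_id": approval_id,
        "status": "PENDING",
        "command": command,
        "arguments": arguments,
        "reason": reason,
    }

    def button(label, action_id, style):
        return {
            "type": "button",
            "style": style,
            "action_id": action_id,
            # The value travels back with the click so the handler can
            # find the matching DynamoDB item.
            "value": approval_id,
            "text": {"type": "plain_text", "text": label},
        }

    blocks = [
        {
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"*Approval needed*\n`{command}`\n{reason}"},
        },
        {
            "type": "actions",
            "elements": [
                button("Approve", "approve", "primary"),
                button("Reject", "reject", "danger"),
            ],
        },
    ]
    return item, blocks
```

The blocks could then be posted to the thread with `slack_sdk`'s `WebClient.chat_postMessage(channel=..., thread_ts=..., text=..., blocks=blocks)`.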
Slack Buttons
Approve / Reject
Reviewer clicks to decide
handle_interactivity
Atomic DynamoDB update
Prevents double-click
Uses DynamoDB ConditionExpression: only succeeds if status = PENDING. If two reviewers click simultaneously, only the first write wins. Updates the Slack message to show who approved/rejected and when.
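The conditional write might be expressed as the following `update_item` arguments. The attribute names `decided_by` and `decided_at` and the function name are hypothetical; `status` is aliased as `#s` because it is a DynamoDB reserved word.

```python
def build_decision_update(approval_id, decision, reviewer, decided_at):
    """Build update_item arguments that succeed only while status is PENDING."""
    if decision not in ("APPROVED", "REJECTED"):
        raise ValueError(f"unexpected decision: {decision!r}")
    return {
        "Key": {"approval_id": approval_id},
        "UpdateExpression": "SET #s = :decision, decided_by = :who, decided_at = :at",
        # The first writer flips the status; every later write fails this check.
        "ConditionExpression": "#s = :pending",
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {
            ":decision": decision,
            ":pending": "PENDING",
            ":who": reviewer,
            ":at": decided_at,
        },
    }
```

A losing click surfaces as a `ConditionalCheckFailedException`, which the handler can treat as "already decided" instead of as an error.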
DynamoDB
approval-tokens-table
Stores approval decision
Agent polls
Periodic polling with timeout
Fail-closed on timeout
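Fail-closed polling can be sketched as a loop that returns an approval only when it sees an explicit `APPROVED` status. The timeout and interval defaults are placeholders, and `get_status` stands in for a DynamoDB read; neither reflects Orbit's actual values.

```python
import time


def wait_for_decision(get_status, timeout_seconds=300, poll_seconds=5,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until a reviewer decides; treat a timeout as a rejection."""
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        status = get_status()
        if status == "APPROVED":
            return True
        if status == "REJECTED":
            return False
        sleep(poll_seconds)
    # Fail closed: no answer in time means the tool call does not run.
    return False
```

The `sleep` and `clock` parameters exist only so the loop can be tested without waiting in real time.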
The four-tier permission guard classifies every bash command before it runs. Structural shell bypasses and catastrophic commands are auto-denied, dangerous commands require HITL approval, and safe commands auto-allow.
Try these examples:
ls -la /var/log
cat /etc/hosts
rm -rf /tmp/cache
kill -9 1234
:(){ :|:& };:
mkfs.ext4 /dev/sda1
chmod 777 /etc/passwd
systemctl stop nginx
dd if=/dev/zero of=/dev/sda
python3 -c "import os"
kubectl get pods
sed -i 's/foo/bar/' config
xargs rm *.log
shutdown -h now
rm -rf /
echo test | bash
eval "rm -rf /"
bash -c "whoami"
nc -l 4444
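To make the tier ordering concrete, here is a deliberately small classifier. Everything in it is an assumption made for illustration: the regular expressions, the safe-command allowlist, the `BYPASS` and `CATASTROPHIC` tier names, and the choice to send unrecognized commands to approval. The rules of the real `tool_guard_hook` are not shown in this document, so its verdict on any given example may differ.

```python
import re
from enum import Enum


class Tier(Enum):
    BYPASS = "auto-deny (structural shell bypass)"
    CATASTROPHIC = "auto-deny (catastrophic)"
    DANGEROUS = "requires Slack approval"
    SAFE = "auto-allow"


# Hypothetical patterns, checked from most to least severe.
BYPASS_PATTERNS = [r"\|\s*(ba|z)?sh\b", r"\beval\b", r"\b(ba|z)?sh\s+-c\b"]
CATASTROPHIC_PATTERNS = [
    r"\brm\s+-rf\s+/(\s|$)",   # rm -rf on the filesystem root
    r"\bmkfs(\.\w+)?\b",
    r"\bdd\s+.*\bof=/dev/",
    r"\bshutdown\b",
    r":\(\)\s*\{",             # fork bomb prologue
]
DANGEROUS_PATTERNS = [r"\brm\s+-rf\b", r"\bkill\s+-9\b"]
SAFE_COMMANDS = {"ls", "cat", "grep", "head", "tail"}  # hypothetical allowlist


def _matches(patterns, command):
    return any(re.search(p, command) for p in patterns)


def classify(command):
    """Return the first tier whose rules match, most severe first."""
    if _matches(BYPASS_PATTERNS, command):
        return Tier.BYPASS
    if _matches(CATASTROPHIC_PATTERNS, command):
        return Tier.CATASTROPHIC
    if _matches(DANGEROUS_PATTERNS, command):
        return Tier.DANGEROUS
    parts = command.split()
    if parts and parts[0] in SAFE_COMMANDS:
        return Tier.SAFE
    # Assumption: anything unrecognized goes to a human rather than running.
    return Tier.DANGEROUS
```

Checking bypass patterns first matters: `eval "rm -rf /"` has to be rejected for being wrapped in `eval`, whatever the wrapped command happens to be.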
Infrastructure
Lambda Functions
Serverless Lambda functions powering the pipeline. Lambdas needing slack_sdk share a Lambda Layer.