
Debugging

This page covers the diagnostic tools and techniques available for troubleshooting Clanker.


Debug Flag

The --debug flag enables verbose output that shows internal progress and diagnostics during execution:

bash
clanker --debug ask "What EC2 instances are running?"

When debug mode is active, Clanker prints:

  • The config file being used
  • Which AI provider and model are selected
  • The keyword inference and routing decisions (which cloud provider was detected)
  • LLM analysis prompt size and response details
  • AWS operations selected and their execution status
  • Context sizes at each stage of the pipeline
  • Tool calling decisions and results

Debug output is printed to stderr so that it does not interfere with the primary response on stdout.
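Because the two streams are separate, you can capture them independently. The sketch below uses a stand-in command to show the split; in practice you would substitute your actual clanker invocation:

```shell
# Stand-in for `clanker --debug ask "..."`: writes the answer to stdout
# and diagnostics to stderr, just as Clanker does.
{ echo "the answer"; echo "Using config file: /home/user/.clanker.yaml" >&2; } \
  > answer.txt 2> debug.log

# answer.txt now holds only the primary response;
# debug.log holds only the diagnostics.
```

This lets you pipe the response elsewhere while keeping the full debug log for later inspection.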


Agent Trace Flag

The --agent-trace flag enables detailed logging of the coordinator agent lifecycle. This is useful when debugging the intelligent investigation pipeline:

bash
clanker ask --agent-trace "Why is the image processing service failing?"

Agent trace output includes:

  • Agent creation and initialization events
  • Decision tree traversal steps
  • Dependency scheduling and parallel execution details
  • Inter-agent communication via the shared data bus
  • Playbook step execution and results

You can also enable agent tracing in the config file:

yaml
agent:
  trace: true

The --agent-trace CLI flag overrides the config file setting when specified.


Local Mode and Rate Limiting

Clanker runs in local mode by default, which applies a delay between consecutive AWS CLI calls to prevent overwhelming your system or hitting rate limits:

bash
# Local mode is on by default (100ms delay between calls)
clanker ask "Show me all S3 buckets"

# Disable local mode for maximum throughput
clanker --local-mode=false ask "Show me all S3 buckets"

# Adjust the delay (in milliseconds)
clanker --local-delay 200 ask "Show me all S3 buckets"

Flag            Default   Description
--local-mode    true      Enable rate limiting between consecutive calls
--local-delay   100       Delay in milliseconds between calls
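Conceptually, local mode simply pauses between consecutive calls. A rough sketch of the effect (illustrative only, not Clanker's actual implementation):

```shell
# Convert the configured delay (milliseconds) to seconds for sleep.
delay_ms=100
delay_s=$(awk "BEGIN { print $delay_ms / 1000 }")

for op in describe-instances describe-vpcs; do
  # aws ec2 "$op" --no-cli-pager   # the real call would go here
  echo "ran $op"
  sleep "$delay_s"                 # throttle before the next call
done
```

With the default 100ms delay, a query that issues 3 operations adds roughly 300ms of total wait time.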

Reading Log Output

When using --debug, watch for these key log lines:

Routing decisions:

Keyword inference: AWS=true, GitHub=false, Terraform=false, K8s=false, GCP=false, Cloudflare=false

This tells you which cloud provider Clanker detected from your question.

Config file and AI provider selection:

Using config file: /home/user/.clanker.yaml

Operation selection and execution:

Stage 1: Analyzing query with dynamic tool selection...
Successfully parsed analysis: 3 operations found
  1. describe-instances - Check running EC2 instances
  2. describe-vpcs - Get VPC information
  3. describe-security-groups - List security groups
Stage 2: Executing AWS operations...
Executing 3 operations with infrastructure profile: my-dev-profile...
AWS operations completed. Results length: 12453 characters
Stage 3: Building final context and getting response...

Common Error Patterns and Solutions

Missing AWS CLI

Error:

failed to create AWS client: aws CLI not found in PATH

Solution: Install AWS CLI v2. Clanker requires v2 because it uses --no-cli-pager. See Prerequisites for installation instructions.

Wrong AWS Profile

Error:

failed to create AWS client with profile my-profile: ...

Solution: Verify the profile exists in ~/.aws/config and that credentials are valid. Run aws sts get-caller-identity --profile my-profile to test.

API Key Not Configured

Error:

OpenAI API key not configured

or

Anthropic API key not configured

Solution: Set the API key in one of three ways:

  1. Config file: Set ai.providers.openai.api_key in ~/.clanker.yaml
  2. Environment variable: Export OPENAI_API_KEY
  3. CLI flag: Pass --openai-key "$OPENAI_API_KEY"
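For the config-file option, the relevant fragment looks like this (the key value is a placeholder):

```yaml
ai:
  providers:
    openai:
      api_key: "sk-your-key-here"
```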

Gemini API Key Missing

Error:

gemini-api provider configured without API key

Solution: Obtain an API key from Google AI Studio and set it:

  1. Config file: Set ai.providers.gemini-api.api_key_env: GEMINI_API_KEY
  2. Environment variable: Export GEMINI_API_KEY
  3. CLI flag: Pass --gemini-key "$GEMINI_API_KEY"
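For the config-file option, the fragment references an environment variable rather than embedding the key directly:

```yaml
ai:
  providers:
    gemini-api:
      api_key_env: GEMINI_API_KEY
```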

Timeout Errors

Error:

context deadline exceeded

Solution: Increase the timeout in your config file:

yaml
timeout: 60

Or verify that your network connection can reach the AI provider's API endpoint.

Rate Limiting from AI Provider

Error:

API request failed with status 429

Solution: Clanker automatically retries rate-limited requests with exponential backoff (up to 5 attempts). If you continue to see this error, wait a moment and try again, or switch to a different AI provider with --ai-profile.
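To illustrate the retry schedule, here is a sketch of exponential backoff over 5 attempts (illustrative only; Clanker's actual retry delays are internal):

```shell
# Double the wait after each rate-limited attempt: 1s, 2s, 4s, 8s, 16s.
attempt=1
delay=1
while [ "$attempt" -le 5 ]; do
  echo "attempt $attempt: backing off ${delay}s after HTTP 429"
  delay=$((delay * 2))
  attempt=$((attempt + 1))
done
```

Doubling the delay gives the provider progressively more time to recover, which is why a brief manual wait usually resolves persistent 429s.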

Cloudflare API Token Missing

Error:

cloudflare api_token is required (set cloudflare.api_token, CLOUDFLARE_API_TOKEN, or CF_API_TOKEN)

Solution: Set the token via config file, CLOUDFLARE_API_TOKEN environment variable, or CF_API_TOKEN.

DigitalOcean Token Missing

Error:

digitalocean api_token is required (set digitalocean.api_token, DO_API_TOKEN, or DIGITALOCEAN_ACCESS_TOKEN)

Solution: Set the token via config file, DO_API_TOKEN environment variable, or DIGITALOCEAN_ACCESS_TOKEN.

Azure Subscription ID Missing

Error:

azure subscription_id is required (set infra.azure.subscription_id, AZURE_SUBSCRIPTION_ID, or use --azure-subscription)

Solution: Set the subscription ID in the config file under infra.azure.subscription_id, export AZURE_SUBSCRIPTION_ID, or pass --azure-subscription.
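In the config file, the subscription ID lives under infra.azure (placeholder value shown):

```yaml
infra:
  azure:
    subscription_id: "00000000-0000-0000-0000-000000000000"
```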


Reporting Issues

If you encounter a bug or unexpected behavior:

  1. Reproduce the issue with --debug enabled to capture full diagnostic output.
  2. If using the agent pipeline, add --agent-trace for additional detail.
  3. Note your Clanker version (clanker version).
  4. Note which AI provider and model you are using.
  5. Open an issue on the Clanker GitHub repository with the debug output and a description of the expected versus actual behavior. Redact any sensitive information (API keys, account IDs, resource names) before sharing.