I’ve been building automation workflows that invoke Claude Code programmatically from Python scripts on Windows. What started as a straightforward subprocess call turned into a solid chunk of troubleshooting time. This post started as a braindump written by Claude Code after we worked through that process together, and I’ve since updated it with everything that went wrong when I scaled it up to a batch pipeline.

I’m documenting this because executing Claude Code from an orchestrator script is genuinely useful, and it’s not the first time I’ve hit these issues. If you’ve seen the Ralph projects doing the rounds on social media (e.g. https://github.com/mikeyobrien/ralph-orchestrator, https://github.com/michaelshimeles/ralphy), this is the same sort of idea.

Basic invocation

The simplest way to call Claude Code non-interactively:

> claude -p "Say hello in exactly 15 words"
Hello there! I hope you are having a wonderful and productive day today, friend!

The -p (or --print) flag runs in non-interactive mode, prints the response, and exits. Combine it with --output-format json to get structured output you can parse, including tool calls and token usage.

> claude -p --output-format json --model sonnet "Say hello in exactly 15 words"
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "duration_ms": 3302,
  "duration_api_ms": 2918,
  "num_turns": 1,
  "result": "Hello! I'm Claude, your AI assistant here to help with your software engineering tasks.",
  "session_id": "3892c079-721c-4e63-b06b-38a0be87809b",
  "total_cost_usd": 0.03988395,
  "usage": {
    "input_tokens": 3,
    "cache_creation_input_tokens": 9443,
    "cache_read_input_tokens": 13829,
    "output_tokens": 21,
    "server_tool_use": {
      "web_search_requests": 0,
      "web_fetch_requests": 0
    },
    "service_tier": "standard",
    "cache_creation": {
      "ephemeral_1h_input_tokens": 9443,
      "ephemeral_5m_input_tokens": 0
    }
  },
  "modelUsage": {
    "claude-sonnet-4-5-20250929": {
      "inputTokens": 3,
      "outputTokens": 21,
      "cacheReadInputTokens": 13829,
      "cacheCreationInputTokens": 9443,
      "webSearchRequests": 0,
      "costUSD": 0.03988395,
      "contextWindow": 200000,
      "maxOutputTokens": 64000
    }
  },
  "permission_denials": [],
  "uuid": "6cf4cfcc-0bb6-463b-8864-2f9bba2dfca2"
}
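For orchestration you usually only need a few of these fields. The following is a minimal sketch, assuming the response is a single JSON object shaped like the one above; `summarize_result` is a hypothetical helper name, and the sample string is trimmed from that output.

```python
import json

def summarize_result(raw: str) -> dict:
    """Pull the commonly needed fields out of a --output-format json response."""
    data = json.loads(raw)
    usage = data.get("usage", {})
    return {
        # Treat the call as successful only if both flags agree
        "ok": data.get("subtype") == "success" and not data.get("is_error", False),
        "text": data.get("result", ""),
        "cost_usd": data.get("total_cost_usd"),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
    }

# Trimmed copy of the sample response shown above
sample = """
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "result": "Hello!",
  "total_cost_usd": 0.03988395,
  "usage": {"input_tokens": 3, "output_tokens": 21}
}
"""
print(summarize_result(sample))
```

Using `.get()` throughout means a missing field yields `None` rather than an exception, which is the safer default if field names change between CLI versions.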

The tool permission gotcha

If you want Claude to use tools like WebSearch, WebFetch, Bash, or Edit in non-interactive mode, you need to pre-approve them with --allowedTools:

claude -p \
  --allowedTools "WebSearch,WebFetch" \
  "Search for Python tutorials"

Without this, the model will try to use the tools and you’ll get a cryptic error: “Claude requested permissions to use X, but you haven’t granted it yet.” Simple enough once you know the fix, but the error message doesn’t exactly spell it out.
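The JSON output shown earlier includes a `permission_denials` list, which suggests a way to catch this from a script rather than by reading the response text. This is a sketch under the assumption that the list is non-empty when a tool was blocked; the post does not document the shape of its entries, so they are treated as opaque here, and `denied_tools` is a hypothetical helper name.

```python
import json

def denied_tools(raw: str) -> list:
    """Return the permission_denials entries from a JSON result, or [] if absent."""
    data = json.loads(raw)
    # `or []` also covers an explicit null in the response
    return data.get("permission_denials") or []

# With the sample response from earlier, nothing was denied
clean = '{"type": "result", "permission_denials": []}'
if denied_tools(clean):
    print("A tool was blocked: check --allowedTools")
else:
    print("No permission denials reported")
```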

Note: In older versions of Claude Code (pre-2.1.x), you needed both --tools and --allowedTools. The --tools flag made tools available, --allowedTools granted permission. As of 2.1.x, --allowedTools alone handles both.

To disable all tools entirely, pass an empty string:

claude -p --tools "" "Your prompt"

System prompts on Windows: the hidden problem

Long or complex system prompts passed via --system-prompt can cause tool permissions to silently fail on Windows, even when --allowedTools is set correctly.

The workaround is to keep your system prompt minimal and put detailed instructions in the user prompt instead:

# This causes issues with long instructions
cmd = [claude_path, "-p", "--system-prompt", very_long_instructions, ...]

# This works reliably: short system prompt, details in the user prompt
system_prompt = "You are a research assistant. Use WebSearch immediately."
user_prompt = f"{detailed_instructions}\n\n---\n\nNow do this: {task}"
cmd = [claude_path, "-p", "--system-prompt", system_prompt, ...]

# The user prompt goes in via stdin rather than as an argument
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, ...)
stdout, stderr = proc.communicate(input=user_prompt)

This might be related to Windows command-line argument length limits or escaping issues. My prompts were under the Windows argument length limit, so escaping seems the likelier culprit, but I haven’t dug into the root cause. The workaround is good enough.

Subprocess pattern

This pattern worked for reliable invocation:

import subprocess
import sys

cmd = [
    claude_path, "-p",
    "--output-format", "json",
    "--model", "sonnet",
    "--system-prompt", short_system_prompt,
    "--allowedTools", "WebSearch,WebFetch",
]

creation_flags = 0
if sys.platform == "win32":
    creation_flags = subprocess.CREATE_NEW_PROCESS_GROUP

proc = subprocess.Popen(
    cmd,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
    encoding="utf-8",
    creationflags=creation_flags,
)

stdout, stderr = proc.communicate(input=user_prompt, timeout=300)

Notes:

  • CREATE_NEW_PROCESS_GROUP on Windows helps with process tree management and avoids some hanging issues (but see the caveat in the batch section below)
  • Piping the user prompt via stdin sidesteps command-line length limits
  • encoding="utf-8" is important on Windows. Without it, Python decodes the output with the locale default (typically cp1252), which can raise UnicodeDecodeError or garble the text when Claude returns non-ASCII characters or emoji
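One detail the pattern above leaves open is what happens when the timeout fires. Python's `communicate(timeout=...)` raises `subprocess.TimeoutExpired` but does not kill the child, so the caller has to do that. Here is a minimal sketch; `communicate_with_timeout` is a hypothetical helper name, and the demo uses a sleeping Python process as a stand-in for `claude` so it runs anywhere.

```python
import subprocess
import sys

def communicate_with_timeout(proc, input_text=None, timeout=300):
    """Wait for proc; on timeout, kill it and collect whatever output exists.

    Returns (stdout, stderr, timed_out).
    """
    try:
        stdout, stderr = proc.communicate(input=input_text, timeout=timeout)
        return stdout, stderr, False
    except subprocess.TimeoutExpired:
        proc.kill()                            # stop the direct child process
        stdout, stderr = proc.communicate()    # reap it and drain the pipes
        return stdout, stderr, True

# Stand-in child that sleeps far longer than the timeout
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, encoding="utf-8",
)
out, err, timed_out = communicate_with_timeout(proc, timeout=1)
print("timed out:", timed_out)
```

Note that `kill()` only targets the process Python started. If that process has spawned children of its own, they may need separate cleanup.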

Parsing the JSON output

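Depending on version and flags, the output is either a single result object, as in the example above, or a JSON array of entries with different types. Normalizing to a list handles both:

import json

data = json.loads(stdout)
# A single result object becomes a one-element list
entries = data if isinstance(data, list) else [data]
for entry in entries: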
    if entry.get("type") == "result":
        # Final result text
        print(entry.get("result"))
        # Token usage
        usage = entry.get("usage", {})
        print(f"Tokens: {usage.get('input_tokens')} in / {usage.get('output_tokens')} out")

    elif entry.get("type") == "assistant":
        # Check for tool calls
        for block in entry.get("message", {}).get("content", []):
            if block.get("type") == "tool_use":
                print(f"Tool called: {block.get('name')}")

    elif entry.get("type") == "user":
        # Tool results come back as user messages
        for block in entry.get("message", {}).get("content", []):
            if block.get("type") == "tool_result":
                print(f"Tool result: {block.get('content')}")

Agentic mode

For tasks requiring multiple tool calls, use --max-turns:

claude -p --max-turns 5 --allowedTools "WebSearch,WebFetch" "Research this topic thoroughly"

The model will loop through tool calls up to the specified limit before returning its final response.

Multi-agent pipelines: hard-won lessons (March 2026 update)

I spent a solid 4+ hours debugging a batch research pipeline that invokes claude -p repeatedly from a Python orchestrator. The pipeline researches AI services using 7 sequential claude -p calls per service (web search, page fetch, compile, risk assessment, jurisdiction, DNS recon, synthesis). Here’s everything that broke and how I fixed it.

Sub-agent permissions don’t propagate

If your claude -p prompt tells Claude to use the Agent tool to dispatch sub-agents, --dangerously-skip-permissions does not propagate to those sub-agents (anthropics/claude-code#29110, #12261). The sub-agents hang on permission prompts with no terminal to answer them.

The fix: Don’t nest agents inside claude -p. Make your Python script the orchestrator and call claude -p once per step.
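The shape of that orchestrator can be sketched as a plain loop. This is an illustration only: the step names come from the pipeline description above, `run_pipeline` and `fake_runner` are hypothetical names, and the runner is injected so the sketch runs without the CLI installed. In real use the runner would make one `claude -p` call per step.

```python
from typing import Callable, Dict, List

# Step names taken from the pipeline described above
STEPS: List[str] = [
    "web search", "page fetch", "compile", "risk assessment",
    "jurisdiction", "DNS recon", "synthesis",
]

def run_pipeline(service: str, run_step: Callable[[str, str], bool]) -> Dict[str, bool]:
    """Run each step in order, one call per step, stopping at the first failure."""
    results: Dict[str, bool] = {}
    for step in STEPS:
        ok = run_step(service, step)
        results[step] = ok
        if not ok:
            break  # later steps depend on earlier output, so don't continue
    return results

def fake_runner(service: str, step: str) -> bool:
    """Stand-in for a function that invokes claude -p once for this step."""
    print(f"[{service}] {step}")
    return True

print(run_pipeline("example-service", fake_runner))
```

Because Python owns the loop, no step ever needs the Agent tool, and each call can carry its own tool list and timeout.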

WebFetch + large structured output = infinite stall

This was the most painful one to track down. A single claude -p call that performs web searches, fetches multiple pages, and then writes a large structured document will stall indefinitely after the fetches complete. The model ingests the fetched page content and then never produces the output. This happens with both Opus and Sonnet, and is consistent across claude -p, the Claude Agent SDK, and any headless execution. The same workflow works fine in interactive Claude Code.

The fix: Break the work into micro-steps. Never combine WebFetch and structured writing in the same claude -p call:

# run_claude() stands for your own wrapper that runs the claude CLI with these arguments

# Step 1: Search only (no fetching)
run_claude("-p", "Search for X, write results to search.md",
           "--allowedTools", "WebSearch,Write")

# Step 2: Fetch only (no writing structured output)
run_claude("-p", "Read search.md, fetch top 3 URLs, append findings",
           "--allowedTools", "WebFetch,Read,Write")

# Step 3: Compile only (no searching, no fetching)
run_claude("-p", "Read search.md, write structured output to final.md",
           "--allowedTools", "Read,Write")

Long prompts as command-line arguments fail silently

Even under the 8191-character cmd.exe command-line limit, long prompts passed as -p <prompt> arguments can fail silently or get truncated. Piping via stdin (-p - with communicate(input=prompt)) also caused issues when combined with CREATE_NEW_PROCESS_GROUP.

The fix: Write your detailed prompt to a temp file, then pass a short instruction:

# Write detailed instructions to a file
prompt_file = work_dir / "_prompt.md"
prompt_file.write_text(detailed_instructions, encoding="utf-8")

# Tell claude to read and follow the file
cmd = [claude, "-p",
       f"Read the instructions at {prompt_file} and follow them exactly.",
       "--allowedTools", "Read,Write,WebSearch"]

This keeps the command line short and lets the prompt be as long as you need.
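If you want the script to decide when a prompt is too long to pass inline, one option is to measure the command line before launching. This sketch uses `subprocess.list2cmdline`, a long-standing helper in the standard library that is not part of the module's documented public API, to approximate how Windows would see the arguments. The `budget` threshold is an arbitrary illustrative value, not something taken from the CLI.

```python
import subprocess

CMD_EXE_LIMIT = 8191  # the Windows figure quoted above

def command_length(cmd):
    """Length of the command line after Windows-style quoting."""
    return len(subprocess.list2cmdline(cmd))

def should_use_prompt_file(cmd, budget=1000):
    """True when the command line exceeds a conservative budget.

    `budget` is an assumed threshold, set well below CMD_EXE_LIMIT
    because failures were seen even under the hard limit.
    """
    return command_length(cmd) > budget

short_cmd = ["claude", "-p", "Say hello"]
long_cmd = ["claude", "-p", "x" * 5000]
print(command_length(short_cmd), should_use_prompt_file(short_cmd))
print(command_length(long_cmd), should_use_prompt_file(long_cmd))
```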

The Claude Agent SDK crashes on Windows

The Python claude-agent-sdk package uses anyio for async operations. On Windows with Python 3.12, sequential query() calls crash with RuntimeError: Attempted to exit cancel scope in a different task than it was entered in. The first call works fine. The second crashes when spawning a new subprocess. This looks like an anyio + Windows asyncio backend incompatibility.

The fix: Skip the SDK on Windows. Use claude -p subprocess calls directly.

CREATE_NEW_PROCESS_GROUP breaks stdin piping

My original recommendation above was to use CREATE_NEW_PROCESS_GROUP for Windows subprocess calls. This works for simple prompts passed as arguments, but it breaks stdin piping. If you’re using communicate(input=prompt) with CREATE_NEW_PROCESS_GROUP, the process may fail to read stdin and time out immediately.

The fix: Don’t use CREATE_NEW_PROCESS_GROUP if you’re piping via stdin. If you’re passing prompts as arguments or via temp files, it’s not needed anyway.
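If one script has to support both styles, the rule can be encoded in a small helper. This is a sketch of that rule only; `creation_flags_for` is a hypothetical name, and the flag constant is only looked up on Windows because it does not exist in `subprocess` on other platforms.

```python
import subprocess
import sys

def creation_flags_for(pipe_stdin: bool) -> int:
    """Return creationflags for Popen: never the process-group flag with stdin piping."""
    if sys.platform != "win32" or pipe_stdin:
        return 0
    # Only reached on Windows, where this constant is defined
    return subprocess.CREATE_NEW_PROCESS_GROUP

print(creation_flags_for(pipe_stdin=True))
```

The result can be passed straight to `subprocess.Popen(..., creationflags=...)`, since `0` is the default value on every platform.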

--dangerously-skip-permissions vs --allowedTools

For headless batch runs, you need one of these since there’s no terminal to answer permission prompts. --allowedTools is the safer option: it pre-approves only the specific tools you need. --dangerously-skip-permissions bypasses the entire safety stack including the command blocklist. Use --allowedTools with least-privilege per step:

# Step that only searches: minimal tools
["--allowedTools", "WebSearch,Write,Read"]

# Step that runs a script: needs Bash
["--allowedTools", "Read,Write,Bash"]
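One way to keep those per-step lists from drifting is to hold them in a single table and build the flag from it. A minimal sketch, where the step names `search` and `run_script` and the helper `allowed_tools_args` are made up for illustration and the tool lists mirror the two examples above:

```python
# Hypothetical step names mapped to the tool lists from the examples above
STEP_TOOLS = {
    "search": ["WebSearch", "Write", "Read"],
    "run_script": ["Read", "Write", "Bash"],
}

def allowed_tools_args(step: str) -> list:
    """Build the --allowedTools arguments for one step.

    An unknown step raises KeyError on purpose: failing loudly is safer
    than silently falling back to a broader tool list.
    """
    tools = STEP_TOOLS[step]
    return ["--allowedTools", ",".join(tools)]

print(allowed_tools_args("search"))
```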

Working batch pattern (as of March 2026)

After all the debugging, here’s the pattern that works reliably for batch automation on Windows:

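import json
import shutil
import subprocess
from pathlib import Path

claude = shutil.which("claude")  # None if claude is not on PATH
repo_root = Path.cwd()  # placeholder: set this to your repository root

def run_step(prompt_text, tools, work_dir, label=""):
    # Write the detailed prompt to a temp file
    prompt_file = Path(work_dir) / f"_prompt-{label}.md"
    prompt_file.write_text(prompt_text, encoding="utf-8")

    # Short -p argument; Read is always allowed so claude can open the prompt file
    cmd = [claude, "-p",
           f"Read the instructions at {prompt_file} and follow them exactly.",
           "--output-format", "json",
           "--model", "sonnet",
           "--allowedTools", ",".join(tools + ["Read"])]

    # No stdin pipe and no CREATE_NEW_PROCESS_GROUP
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        text=True, encoding="utf-8", cwd=str(repo_root))

    try:
        stdout, stderr = proc.communicate(timeout=300)
    except subprocess.TimeoutExpired:
        # communicate() does not kill the child on timeout, so do it here
        proc.kill()
        proc.communicate()
        return None
    finally:
        # Remove the prompt file whether the step succeeded or not
        prompt_file.unlink(missing_ok=True)

    if proc.returncode != 0:
        return None
    return json.loads(stdout).get("result", "")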

Key principles:

  • One step per subprocess. Never combine WebFetch with structured writing.
  • Prompts via temp files. Short -p argument, detailed instructions in a file.
  • Least-privilege tools. Only what each step actually needs.
  • Sonnet for batch work. Fast enough, and Opus stalls on large contexts in headless mode.
  • No SDK, no async. Plain subprocess calls are the most reliable on Windows.
  • No CREATE_NEW_PROCESS_GROUP. Not needed when prompts are in files.
  • State tracking. .done marker files for resume-on-interrupt.
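The state-tracking principle is the only one without code above, so here is a minimal sketch of how .done markers could work. The naming scheme (`<label>.done` inside the work directory) and the helper `run_once` are assumptions for illustration, not the pipeline's actual layout. The demo uses a temporary directory and a stand-in action.

```python
import tempfile
from pathlib import Path

def run_once(work_dir: Path, label: str, action) -> bool:
    """Run action() unless its marker exists. Returns True if the action ran.

    The marker is written only after action() returns, so a crash or
    interrupt leaves no marker and the step is retried on the next run.
    """
    marker = work_dir / f"{label}.done"
    if marker.exists():
        return False
    action()
    marker.write_text("ok", encoding="utf-8")
    return True

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    calls = []
    first = run_once(work, "search", lambda: calls.append("search"))
    second = run_once(work, "search", lambda: calls.append("search"))
    print(first, second, calls)
```

Deleting a marker file by hand is then enough to force a single step to rerun.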

Security note

If you’re using --allowedTools "Bash" or --allowedTools "Edit", you’re giving Claude auto-approved access to run arbitrary commands or modify files. This is fine when you control the prompt, but if any part of the user prompt comes from an untrusted source (a web form, external API, etc.), you’ve created a remote code execution vector. An attacker could inject instructions to wipe the filesystem. Treat --allowedTools the same way you’d treat sudo: don’t grant it to untrusted input.

Use --allowedTools with least-privilege per step rather than --dangerously-skip-permissions, which bypasses the command blocklist, write restrictions, permission prompts, and MCP server trust verification.

Quick reference

| Flag | Example | Purpose |
| --- | --- | --- |
| -p | -p | Non-interactive print mode |
| --model | --model sonnet | Select model (sonnet, opus, haiku, or full IDs) |
| --output-format | --output-format json | Output format (text, json, stream-json) |
| --system-prompt | --system-prompt "Be concise" | Custom system prompt (keep short on Windows!) |
| --allowedTools | --allowedTools "Bash,Edit" | Pre-approve tools without prompting |
| --max-turns | --max-turns 10 | Agentic turn limit |
| --timeout | --timeout 300000 | Timeout in milliseconds |

Troubleshooting

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| “Claude requested permissions but you haven’t granted it” | Missing --allowedTools | Add --allowedTools for the tools you need |
| Tools silently fail with long system prompt | Windows CLI argument handling | Move instructions to a temp file, pass short -p instruction |
| Process hangs after WebFetch calls | Large fetched content + structured output in same call | Break into micro-steps: search, fetch, compile separately |
| Process hangs indefinitely | Sub-agent waiting for permission prompt | Don’t use Agent tool in -p mode; orchestrate from Python |
| claude-agent-sdk crashes on second call | anyio cancel scope bug on Windows Python 3.12 | Use claude -p subprocess calls instead of SDK |
| CREATE_NEW_PROCESS_GROUP + stdin = timeout | Process group flag breaks stdin piping | Don’t use the flag; pass prompts via temp files |
| Unicode errors in output | Missing encoding specification | Add encoding="utf-8" to Popen |
| --dangerously-skip-permissions not propagating | Sub-agents don’t inherit parent permissions | Use --allowedTools instead; make Python the orchestrator |