Claude Opus 4.7 xhigh Effort & Prompt Engineering: Get Peak Performance [2026]
Why xhigh Is the New Default — and What It Unlocks
Claude Opus 4.7 introduced the xhigh effort level as the recommended default for most tasks. This isn't marketing — it's where the benchmark numbers live. The 90.9% BigLaw Bench accuracy, top Finance Agent scores, and the 3× SWE-Bench production task resolution rate are all xhigh results. At high effort, those numbers drop meaningfully.
The intuition: xhigh unlocks deeper reasoning chains, more thorough self-verification, and more careful tool orchestration. For tasks where correctness is the binding constraint (legal analysis, financial modeling, complex debugging, security review), xhigh is not optional; it is where the model's full capability lives.
Effort Level Decision Rule
Use xhigh when correctness matters more than latency. Use high when running concurrent sessions where cost compounds. Skip max for most tasks — it shows diminishing returns vs xhigh at significantly higher cost.
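The decision rule above can be codified as a small helper so it's applied consistently across a team's tooling. A minimal sketch — the function and its flags are our own illustration, not part of any SDK; the effort-level names mirror this article:

```python
def choose_effort(correctness_critical: bool, concurrent_sessions: bool = False) -> str:
    """Pick an effort level per the decision rule (hypothetical helper).

    correctness_critical: legal analysis, financial modeling, complex
        debugging, security review -- latency is worth paying for.
    concurrent_sessions: many parallel runs where cost compounds.
    """
    if correctness_critical:
        return "xhigh"  # correctness beats latency
    if concurrent_sessions:
        return "high"   # cost compounds across sessions
    return "xhigh"      # recommended default for most tasks
```

Note the ordering: correctness wins even when sessions run concurrently, and the fall-through default is xhigh, matching the rule above. Skipping max entirely means it never appears as a return value.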
Working With Literal Instruction Following
The most important behavioral change in Opus 4.7: it follows instructions literally. Where 4.6 would infer intent and fill in gaps, 4.7 does exactly what you ask — no more, no less. This is a feature, not a bug. It makes the model far more predictable and auditable. But it requires higher-quality prompts.
Pattern 1: State What You Want, Then What You Don't Want
# 4.6 style (relied on inference):
"Improve the error handling in this function"
# 4.7 style (explicit scope):
"Add try/catch error handling to the fetchUser() function.
Catch network errors and database timeout errors specifically.
Return a typed ErrorResult object on failure.
Do NOT modify the function signature or the success return type."
Pattern 2: Include Acceptance Criteria
"Refactor the payment processing module.
Acceptance criteria:
- All existing unit tests still pass
- New code has TypeScript strict mode compliance
- No changes to the public API surface
- Payment processor abstraction supports 3 providers minimum"
Pattern 3: Pre-Answer Likely Clarifying Questions
"Migrate the user authentication from sessions to JWT.
If you encounter the legacy cookie handling in middleware/legacy.ts,
preserve it as-is — it's used by the mobile app and can't be changed yet.
If you need to add new dependencies, use existing packages in package.json
where possible. Prefer jsonwebtoken over alternatives."
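The three patterns compose naturally into a single reusable template. A sketch of a prompt builder, assuming you assemble prompts programmatically — the function name and section labels are illustrative, not an official API:

```python
def build_prompt(task, constraints=(), acceptance_criteria=(), preanswered=()):
    """Assemble a 4.7-style prompt: explicit scope (Pattern 1),
    acceptance criteria (Pattern 2), and pre-answered clarifying
    questions (Pattern 3). All section labels are illustrative."""
    parts = [task]
    if constraints:
        parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in constraints))
    if acceptance_criteria:
        parts.append("Acceptance criteria:\n"
                     + "\n".join(f"- {a}" for a in acceptance_criteria))
    if preanswered:
        parts.append("If questions come up:\n"
                     + "\n".join(f"- {q}" for q in preanswered))
    return "\n\n".join(parts)
```

Because 4.7 follows instructions literally, a builder like this pays off: every prompt carries the same explicit scope and done-criteria structure, so output quality doesn't depend on who wrote the prompt that day.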
Steering Adaptive Thinking
Opus 4.7 removed fixed thinking budgets. Instead, the model decides per-step whether extended reasoning is warranted. You can steer this with natural language — and you should, because the default is calibrated for average task complexity, not your specific use case.
Prompts that increase thinking depth:
"This is a tricky concurrency bug — think through all possible race
conditions before suggesting a fix."
"Before writing any code, reason through the security implications
of each approach. Think step-by-step."
"Don't rush this. The problem is harder than it appears."
Prompts that reduce thinking overhead (for speed-sensitive tasks):
"Quick fix only — don't over-engineer. The straightforward solution is correct."
"Just rename the variable. No analysis needed."
"Speed over depth — give me a first pass I can iterate on."
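If you route many requests through the same pipeline, the steering phrases above can be attached automatically. A sketch, assuming a simple two-level scheme — the phrase table and helper are our own, and the phrasing is borrowed from the examples above:

```python
# Hypothetical helper: prepend a steering phrase so adaptive thinking
# spends more (or less) reasoning on a request.
STEERING = {
    "deep": ("Don't rush this. Think through the problem step-by-step "
             "before answering; it is harder than it appears."),
    "fast": ("Speed over depth: give me a straightforward first pass "
             "I can iterate on. No extended analysis needed."),
}

def steer(request: str, depth: str = "deep") -> str:
    """Prefix a request with a thinking-depth steering phrase."""
    return f"{STEERING[depth]}\n\n{request}"
```

The default leans deep, on the logic that an unnecessary steering phrase costs a few tokens while a missing one can cost a wrong answer.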
Precision Prompts for Specific Task Types
For Security Reviews:
"Audit src/api/endpoints.ts for OWASP Top 10 vulnerabilities.
For each finding: severity (Critical/High/Medium/Low), affected line range,
vulnerability class, and a one-sentence fix. Think through injection,
broken auth, and exposure issues specifically. Use xhigh reasoning depth."
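The audit prompt requests a fixed per-finding format, which means the model's output can be parsed into typed records and triaged mechanically. A sketch of that record and a severity sort — the class, field names, and severity ranking are our own conventions, not a standard schema:

```python
from dataclasses import dataclass

# Lower rank = more urgent; mirrors the severity scale in the prompt.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

@dataclass
class Finding:
    severity: str    # Critical / High / Medium / Low
    lines: tuple     # (start, end) affected line range
    vuln_class: str  # e.g. "SQL injection"
    fix: str         # one-sentence remediation

def triage(findings):
    """Sort findings most-severe first, per the audit prompt's scale."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
```

Asking for a machine-parseable format in the prompt, then validating it on the way out, is how literal instruction following becomes an auditability win rather than a burden.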
For Complex Debugging:
"This test fails intermittently: [paste test]. The failure rate is ~15%.
Think through all possible causes of non-determinism: timing, shared state,
external dependencies, floating point, random seeds. List them in order
of likelihood before proposing a fix."
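Of the causes listed, shared state is a frequent culprit, and it's worth recognizing its signature: results that depend on execution order. A deliberately buggy sketch (all names invented for illustration) — a module-level cache leaks between calls, so the "same" call returns different answers depending on what ran first:

```python
_cache = {}  # module-level state shared across calls -- the bug

def get_user_count(db, use_cache=True):
    """Deliberately flaky: the cache persists across 'test' runs."""
    if use_cache and "count" in _cache:
        return _cache["count"]  # may return a stale value
    _cache["count"] = len(db["users"])
    return _cache["count"]

# Order-dependent behavior: the second call sees the first call's state.
first = get_user_count({"users": [1, 2, 3]})             # populates cache
stale = get_user_count({"users": [1]})                   # stale: still 3
fresh = get_user_count({"users": [1]}, use_cache=False)  # correct: 1
```

In a real suite this fails only when tests run in a particular order, which is exactly why listing non-determinism causes before proposing a fix beats pattern-matching on the stack trace.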
For Architecture Reviews:
"Review the attached system diagram for single points of failure,
bottlenecks at 10× current load, and observability gaps.
For each issue: impact (High/Medium/Low), effort to fix (days),
and the specific component affected. Think through failure modes
systematically, not just obvious ones."
Anti-Patterns to Avoid
- Vague verbs: "improve", "fix", "enhance", "clean up" — replace with specific, measurable actions
- Implicit scope: Don't let the model infer what files are in scope — list them
- No acceptance criteria: Without them, the model decides when "done" is; you probably disagree
- Relying on follow-ups: Front-load your constraints; follow-up turns add overhead and context noise
- Using max effort as default: It's more expensive than xhigh with minimal quality difference on most tasks
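The first anti-pattern lends itself to an automated pre-flight check. A sketch that flags vague verbs before a prompt is sent — the word list and function are our own, not any official tooling:

```python
import re

# Vague verbs from the anti-pattern list above; extend to taste.
VAGUE_VERBS = {"improve", "fix", "enhance", "clean up"}

def flag_vague_verbs(prompt: str):
    """Return vague verbs found in a prompt, so they can be replaced
    with specific, measurable actions before sending."""
    lowered = prompt.lower()
    return sorted(v for v in VAGUE_VERBS
                  if re.search(rf"\b{re.escape(v)}\b", lowered))
```

Wiring a check like this into the same pipeline that assembles prompts catches "improve the error handling" before it ships, forcing the rewrite into "add try/catch to fetchUser()" territory.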
Before pasting code into a prompt, run it through our Code Formatter: clean, consistently indented snippets use fewer tokens and make the codebase's structure easier for the model to follow.