Improve read_file tool descriptions to prevent inefficient pagination behavior in non-OpenAI models #11191

@hannesrudolph

Problem

Non-OpenAI models (particularly Claude Opus) consistently use inefficient pagination patterns when reading files with the read_file tool. Instead of using the default 2000-line limit, they specify small limits (100-200 lines) and paginate repeatedly, wasting tool calls and context.
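To make the cost concrete, here is a rough sketch of the call-count arithmetic. The 1500-line file and the 200-line limit are illustrative numbers, and `callsNeeded` is a hypothetical helper, not part of the codebase.

```typescript
// Default stated in this issue.
const DEFAULT_LINE_LIMIT = 2000;

// Hypothetical helper: number of read_file calls needed to read a whole file
// of `totalLines` lines when each call returns at most `limit` lines.
function callsNeeded(totalLines: number, limit: number): number {
  if (totalLines <= 0) return 1; // even an empty file costs one call
  return Math.ceil(totalLines / limit);
}

const fileLines = 1500; // illustrative file size
const paginated = callsNeeded(fileLines, 200);
const withDefault = callsNeeded(fileLines, DEFAULT_LINE_LIMIT);

console.log(`limit=200: ${paginated} calls; default limit: ${withDefault} call(s)`);
// prints "limit=200: 8 calls; default limit: 1 call(s)"
```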

Root Cause

The current tool description uses passive, descriptive language rather than prescriptive, directive language:

  • Current: "By default, returns up to 2000 lines per file."
  • Issue: This describes behavior but doesn't tell the model what to DO

OpenAI models work well because they were trained on Codex conventions and implicitly understand the 2000-line default. Other models lack this training and need explicit guidance.

Current Tool Definition Location

src/core/prompts/tools/native-tools/read_file.ts - specifically the createReadFileTool() function

Solution

Update the tool descriptions to use imperative, directive language that explicitly guides models toward efficient behavior.

Changes Required

1. Update Main Description (limitNote)

Current:

const limitNote = ` By default, returns up to ${DEFAULT_LINE_LIMIT} lines per file. Lines longer than ${MAX_LINE_LENGTH} characters are truncated.`

Change to:

const limitNote = ` Default limit is ${DEFAULT_LINE_LIMIT} lines - use this default by omitting the limit parameter. Only specify a smaller limit if you know the file exceeds ${DEFAULT_LINE_LIMIT} lines and you need pagination. Lines longer than ${MAX_LINE_LENGTH} characters are truncated.`

2. Update limit Parameter Description

Current:

limit: {
    type: "integer",
    description: `Maximum number of lines to return (slice mode, default: ${DEFAULT_LINE_LIMIT})`,
},

Change to:

limit: {
    type: "integer",
    description: `Maximum number of lines to return (slice mode). Default is ${DEFAULT_LINE_LIMIT}. Omit this parameter to use the default. Only specify a value if you need pagination for files larger than ${DEFAULT_LINE_LIMIT} lines.`,
},
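For reference, a minimal sketch of how the two updated strings from changes 1 and 2 could sit together. `buildReadFileStrings` is a hypothetical stand-in, not the real createReadFileTool(), and the `MAX_LINE_LENGTH` value is a placeholder because this issue does not state it.

```typescript
const DEFAULT_LINE_LIMIT = 2000; // default stated in this issue
const MAX_LINE_LENGTH = 2000; // placeholder value: the real constant is not given here

interface ParamSchema {
  type: string;
  description: string;
}

// Hypothetical helper: returns only the two strings this issue proposes to change.
function buildReadFileStrings(): { limitNote: string; limit: ParamSchema } {
  const limitNote = ` Default limit is ${DEFAULT_LINE_LIMIT} lines - use this default by omitting the limit parameter. Only specify a smaller limit if you know the file exceeds ${DEFAULT_LINE_LIMIT} lines and you need pagination. Lines longer than ${MAX_LINE_LENGTH} characters are truncated.`;

  const limit: ParamSchema = {
    type: "integer",
    description: `Maximum number of lines to return (slice mode). Default is ${DEFAULT_LINE_LIMIT}. Omit this parameter to use the default. Only specify a value if you need pagination for files larger than ${DEFAULT_LINE_LIMIT} lines.`,
  };

  return { limitNote, limit };
}

const strings = buildReadFileStrings();
console.log(strings.limitNote.trim());
console.log(strings.limit.description);
```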

3. Update mode Parameter Description (Optional Enhancement)

Current ending:

"'indentation': extract complete semantic code blocks containing anchor_line - PREFERRED when you have a line number because it guarantees complete, valid code blocks. WARNING: Do not use indentation mode without specifying indentation.anchor_line, or you will only get header content."

Change to (adds the note that offset/limit are ignored):

"'indentation': extract complete semantic code blocks containing anchor_line - PREFERRED when you have a line number because it guarantees complete, valid code blocks (ignores offset/limit entirely). WARNING: Do not use indentation mode without specifying indentation.anchor_line, or you will only get header content."

Key Language Principles

  1. Use imperative verbs: "use this default by omitting" instead of "returns up to"
  2. Create conditional barriers: "Only specify... if you know the file exceeds" creates a threshold for parameter usage
  3. Give explicit instructions: "Omit this parameter" is directive, not descriptive
  4. Emphasize defaults: "Default is X - use this by omitting" makes the preferred behavior clear

Testing

After implementing, test with Claude Opus on various file reading scenarios to verify it:

  1. Omits the limit parameter by default
  2. Only specifies limit for genuinely large files
  3. Stops inefficient pagination patterns
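One possible way to score a test run against the first two points is sketched below. It assumes the read_file calls from a session have been recorded along with each file's true line count. The `RecordedReadFileCall` shape and the sample paths are hypothetical, not an existing logging format.

```typescript
const DEFAULT_LINE_LIMIT = 2000; // default stated in this issue

// Hypothetical record of one read_file call made by the model.
interface RecordedReadFileCall {
  path: string;
  limit?: number; // undefined when the model omitted the parameter
  fileLineCount: number; // actual length of the file that was read
}

// Flags calls that set `limit` even though the whole file fits in the default.
function findUnnecessaryLimits(calls: RecordedReadFileCall[]): RecordedReadFileCall[] {
  return calls.filter(
    (call) => call.limit !== undefined && call.fileLineCount <= DEFAULT_LINE_LIMIT,
  );
}

// Illustrative session: only the second call should be flagged.
const session: RecordedReadFileCall[] = [
  { path: "src/a.ts", fileLineCount: 350 },
  { path: "src/b.ts", limit: 150, fileLineCount: 900 },
  { path: "src/big.ts", limit: 1000, fileLineCount: 5200 },
];

const flagged = findUnnecessaryLimits(session);
console.log(flagged.map((call) => call.path).join(", "));
// prints "src/b.ts"
```

A run passes when the flagged list is empty. The third point (pagination) would need the call order as well, which this sketch does not model.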

Implementation Notes

  • This is a tool definition change only - no runtime behavior changes
  • The 2000-line default and constants remain unchanged
  • Only description strings are being updated
  • Changes should be backward compatible (behavior is the same, just better guidance)
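The "description strings only" claim can be checked mechanically. Below is a sketch, assuming the tool schema is available as plain JSON before and after the change. `stripDescriptions` and the two sample schemas are hypothetical and trimmed down to the `limit` parameter.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Recursively drops every "description" key so only the schema shape remains.
// Note: this would also drop a parameter that is itself named "description".
function stripDescriptions(value: Json): Json {
  if (Array.isArray(value)) return value.map((item) => stripDescriptions(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      if (key !== "description") out[key] = stripDescriptions(value[key]);
    }
    return out;
  }
  return value;
}

const before: Json = {
  type: "object",
  properties: {
    limit: {
      type: "integer",
      description: "Maximum number of lines to return (slice mode, default: 2000)",
    },
  },
};

const after: Json = {
  type: "object",
  properties: {
    limit: {
      type: "integer",
      description:
        "Maximum number of lines to return (slice mode). Default is 2000. Omit this parameter to use the default. Only specify a value if you need pagination for files larger than 2000 lines.",
    },
  },
};

// True when the two schemas differ only in their description strings.
const sameShape =
  JSON.stringify(stripDescriptions(before)) === JSON.stringify(stripDescriptions(after));
console.log(sameShape); // prints true
```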

Discovered during: Discussion about why Claude Opus uses inefficient pagination while OpenAI models don't
Impact: Reduces wasted tool calls, improves context efficiency, and speeds up task completion

Labels: Documentation (improvements or additions to documentation)