# Automatic I/O Strategy
## Overview

In v4.0.0, there are no separate streaming tools. The three core tools — `read_file`, `write_file`, and `edit_file` — automatically select the optimal I/O strategy based on file size. You never need to choose between “direct read” and “chunked read” or between “write” and “streaming write.” The engine handles it transparently.
## What Was Consolidated

| v3 Tool | v4 Equivalent | What Happens Now |
|---|---|---|
| `streaming_write_file` | `write_file` | Streaming is automatic for large content |
| `chunked_read_file` | `read_file` | Chunking is automatic for large files |
| `smart_edit_file` | `edit_file` | Line-by-line processing is automatic for large files |
| `intelligent_read` | `read_file` | Intelligence is built-in |
| `intelligent_write` | `write_file` | Intelligence is built-in |
| `streaming_read_file` | `read_file` | Streaming is automatic when needed |
## File Size Thresholds

The engine classifies files into four tiers and selects the I/O strategy accordingly. These thresholds are defined in `core/config.go` and apply to all three core tools.
| Tier | Size Range | Strategy | Description |
|---|---|---|---|
| Small | < 100 KB | Direct I/O | File is read/written entirely in a single operation. Fastest for most source code files. |
| Medium | 100 KB — 500 KB | Streaming I/O | Uses buffered streaming with moderate memory allocation. Suitable for larger source files and small data files. |
| Large | 500 KB — 5 MB | Chunked Processing | File is processed in adaptive chunks. Suitable for large data files, logs, and generated code. |
| Very Large | > 5 MB | Special Handling | Lazy loading, pagination, and progress reporting. Edit operations are rejected above 50 MB to prevent accidental destructive changes. |
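Tier selection reduces to a handful of size comparisons. As a rough Go sketch of the idea (the constant and type names below are assumptions for illustration, not the actual identifiers from `core/config.go`):

```go
package main

import "fmt"

// Illustrative thresholds; the real values live in core/config.go,
// and these exact constant names are assumptions for this sketch.
const (
	smallFileLimit  = 100 << 10 // 100 KB
	mediumFileLimit = 500 << 10 // 500 KB
	largeFileLimit  = 5 << 20   // 5 MB
)

type tier int

const (
	tierSmall     tier = iota // direct I/O
	tierMedium                // streaming I/O
	tierLarge                 // chunked processing
	tierVeryLarge             // lazy loading, pagination, progress reporting
)

// classify maps a file size in bytes to its I/O tier.
func classify(size int64) tier {
	switch {
	case size < smallFileLimit:
		return tierSmall
	case size < mediumFileLimit:
		return tierMedium
	case size < largeFileLimit:
		return tierLarge
	default:
		return tierVeryLarge
	}
}

func main() {
	fmt.Println(classify(3 << 20)) // a 3 MB file lands in tierLarge
}
```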
## Buffer Size

All streaming and chunked operations use a 64 KB buffer (`DefaultBufferSize`), which works well for most disk I/O patterns on modern hardware.
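In Go terms, this is simply a buffered reader or writer sized to 64 KB. A minimal sketch of streaming a file through such a buffer (the `countLines` helper is hypothetical, written only to show the bounded-memory pattern):

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

// defaultBufferSize mirrors the documented 64 KB DefaultBufferSize.
const defaultBufferSize = 64 << 10

// countLines streams a file through a 64 KB buffered reader, so
// memory stays bounded no matter how large the file is.
func countLines(path string) (int, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	r := bufio.NewReaderSize(f, defaultBufferSize)
	lines := 0
	for {
		s, err := r.ReadString('\n')
		if len(s) > 0 {
			lines++ // counts a trailing line without a final newline too
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

func main() {
	n, err := countLines("server.log")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("lines:", n)
}
```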
## How Each Tool Adapts

### read_file

| File Size | Behavior |
|---|---|
| < 100 KB | Reads entire file into memory and returns it |
| 100 KB — 500 KB | Streams content with buffered reader |
| 500 KB — 5 MB | Reads in chunks, assembles result |
| > 5 MB | Uses pagination; consider using `start_line`/`end_line` for specific ranges |
For very large files, the most token-efficient approach is to read only the lines you need:
```js
// Read the entire file (auto-selects strategy)
read_file({ path: "large-dataset.csv" })

// Read only lines 100-150 (most efficient for large files)
read_file({ path: "large-dataset.csv", start_line: 100, end_line: 150 })

// Read the last 50 lines (log tailing)
read_file({ path: "server.log", max_lines: 50, mode: "tail" })
```

### write_file

| Content Size | Behavior |
|---|---|
| < 100 KB | Direct write with atomic rename |
| 100 KB — 500 KB | Streaming write with buffered writer |
| 500 KB — 5 MB | Chunked write with progress tracking |
| > 5 MB | Streaming write with progress reporting |
All writes are atomic — content is written to a temporary file first, then renamed to the target path. This prevents partial writes on failure.
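The temp-file-then-rename pattern looks roughly like this in Go (a simplified sketch, not the engine's actual implementation; fsync and permission handling are omitted):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// atomicWrite writes content to a temp file in the target's
// directory, then renames it into place. Rename within the same
// filesystem is atomic, so readers never observe a partial file.
func atomicWrite(path string, content []byte) error {
	dir := filepath.Dir(path)
	tmp, err := os.CreateTemp(dir, ".write-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // no-op once the rename has succeeded

	if _, err := tmp.Write(content); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	if err := atomicWrite("config.json", []byte(`{"key": "value"}`)); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```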
```js
// Small file (direct write)
write_file({ path: "config.json", content: '{"key": "value"}' })

// Large file (streaming is automatic)
write_file({ path: "data.csv", content: largeCSVContent })
```

### edit_file

| File Size | Behavior |
|---|---|
| < 100 KB | Loads file into memory, applies replacement, writes back |
| 100 KB — 500 KB | Streaming read, in-memory replace, streaming write |
| 500 KB — 5 MB | Line-by-line processing with the LargeFileProcessor |
| > 50 MB | Rejected — file is too large for safe editing |
The 50 MB hard limit on edits exists to prevent accidental massive changes. If you need to modify files larger than 50 MB, split the operation or use external tools.
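The pre-edit guard amounts to a size check before any content is read. A sketch, assuming a hypothetical `checkEditable` helper and constant name:

```go
package main

import (
	"fmt"
	"os"
)

// editHardLimit mirrors the documented 50 MB ceiling; the actual
// constant name in core/config.go may differ.
const editHardLimit = 50 << 20

// checkEditable rejects edits on files above the hard limit,
// before any content is read into memory.
func checkEditable(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if info.Size() > editHardLimit {
		return fmt.Errorf("refusing to edit %s: %d bytes exceeds the %d-byte limit",
			path, info.Size(), editHardLimit)
	}
	return nil
}

func main() {
	if err := checkEditable("large-module.go"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("ok to edit")
}
```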
```js
// Small file edit (direct)
edit_file({ path: "config.go", old_text: "v3.0.0", new_text: "v4.0.0" })

// Large file edit (line-by-line processing is automatic)
edit_file({ path: "large-module.go", old_text: "oldPattern", new_text: "newPattern" })
```

## The Large File Processor

For files in the 500 KB — 5 MB range, the engine uses the `LargeFileProcessor` (defined in `core/large_file_processor.go`). The processor itself supports three processing modes, selected by size:
| Mode | When Used | Description |
|---|---|---|
| In-Memory | File < 500 KB | Loads entire content, applies transformations |
| Line-by-Line | File 500 KB — 5 MB | Processes one line at a time with minimal memory |
| Chunk-Based | File > 5 MB (reads only) | Processes in fixed-size chunks |
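Line-by-line mode is essentially a scanner loop: read one line, transform it, write it out, so only the buffer and the current line are ever resident. A minimal sketch of the idea, using a hypothetical `transformLines` helper rather than the processor's real interface:

```go
package main

import (
	"bufio"
	"os"
	"strings"
)

// transformLines streams src to dst one line at a time, applying
// replace to each line. Memory stays bounded by the buffer plus
// the longest single line. Note: it normalizes the final newline.
func transformLines(src, dst string, replace func(string) string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	scanner := bufio.NewScanner(in)
	scanner.Buffer(make([]byte, 64<<10), 1<<20) // 64 KB buffer, 1 MB max line
	w := bufio.NewWriterSize(out, 64<<10)
	for scanner.Scan() {
		if _, err := w.WriteString(replace(scanner.Text()) + "\n"); err != nil {
			return err
		}
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	return w.Flush()
}

func main() {
	_ = transformLines("large-module.go", "large-module.go.new", func(line string) string {
		return strings.ReplaceAll(line, "oldPattern", "newPattern")
	})
}
```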
The processor is also used by `multi_edit` and the pipeline system’s `regex_transform` action.
## Regex Transformations

The `RegexTransformer` (defined in `core/regex_transformer.go`) handles advanced regex-based edits. It applies to `edit_file` with `mode: "regex"` and the pipeline `regex_transform` action.
For large files, regex transformations use the same adaptive strategy:
```js
// Regex transform (auto-selects strategy based on file size)
edit_file({
  path: "handlers.go",
  mode: "regex",
  patterns_json: JSON.stringify([
    {
      pattern: "func (\\w+)\\(\\)",
      replacement: "func $1(ctx context.Context)",
      limit: -1
    }
  ]),
  dry_run: true
})
```

When multiple regex patterns are applied, the transformer supports both sequential (pattern-by-pattern) and parallel execution modes, depending on whether the patterns are independent.
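Sequential mode means each pattern sees the output of the previous one, so ordering matters when patterns overlap. A rough sketch of that pipeline (the names are illustrative, not the transformer's actual API):

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern pairs a compiled regex with its replacement template.
type pattern struct {
	re          *regexp.Regexp
	replacement string
}

// applySequential runs patterns in order; pattern N operates on
// the output of pattern N-1, so ordering is significant.
func applySequential(content string, patterns []pattern) string {
	for _, p := range patterns {
		content = p.re.ReplaceAllString(content, p.replacement)
	}
	return content
}

func main() {
	patterns := []pattern{
		{regexp.MustCompile(`func (\w+)\(\)`), "func $1(ctx context.Context)"},
	}
	fmt.Println(applySequential("func handler() {}", patterns))
	// Output: func handler(ctx context.Context) {}
}
```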
## Memory Usage

The adaptive I/O strategy keeps memory usage predictable:
| Scenario | Memory Overhead |
|---|---|
| Reading a 50 KB file (direct) | ~50 KB |
| Streaming a 300 KB file | ~64 KB buffer + partial content |
| Chunked read of a 3 MB file | ~64 KB buffer at any point |
| Editing a 2 MB file (line-by-line) | ~64 KB buffer + current line |
The engine’s cache (default 100 MB, configurable via `--cache-size`) stores recently accessed file contents, directory listings, and metadata. Cache eviction is automatic when the limit is reached.
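Conceptually this is a size-bounded LRU: inserts push entries to the front, and entries fall off the back once total stored bytes exceed the limit. A simplified sketch of that eviction policy (not the engine's actual cache implementation):

```go
package main

import "container/list"

// cache is a minimal size-bounded LRU. When total stored bytes
// exceed maxBytes, least-recently-used entries are evicted.
type cache struct {
	maxBytes, curBytes int64
	order              *list.List               // front is most recent
	items              map[string]*list.Element // key to list element
}

type entry struct {
	key  string
	data []byte
}

func newCache(maxBytes int64) *cache {
	return &cache{maxBytes: maxBytes, order: list.New(), items: map[string]*list.Element{}}
}

func (c *cache) put(key string, data []byte) {
	if el, ok := c.items[key]; ok {
		c.curBytes -= int64(len(el.Value.(*entry).data))
		c.order.Remove(el)
		delete(c.items, key)
	}
	c.items[key] = c.order.PushFront(&entry{key, data})
	c.curBytes += int64(len(data))
	// Evict from the back until we are under the limit again.
	for c.curBytes > c.maxBytes && c.order.Len() > 0 {
		el := c.order.Back()
		ev := el.Value.(*entry)
		c.order.Remove(el)
		delete(c.items, ev.key)
		c.curBytes -= int64(len(ev.data))
	}
}

func (c *cache) get(key string) ([]byte, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el) // touch: mark as most recently used
	return el.Value.(*entry).data, true
}

func main() {
	c := newCache(100 << 20) // 100 MB, matching the default --cache-size
	c.put("config.json", []byte(`{"key": "value"}`))
	if data, ok := c.get("config.json"); ok {
		_ = data
	}
}
```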
## Progress and Timeouts

### Operation Timeout

All file operations have a default timeout of 30 seconds (`DefaultOperationTimeout`). For very large files that take longer to process, the engine extends the timeout automatically.
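One plausible way to implement such size-aware timeouts is to scale a context deadline with file size. The scaling rule below is invented for illustration; the engine's actual heuristic is not documented here:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const defaultOperationTimeout = 30 * time.Second

// timeoutFor grows the deadline for very large files: the base
// 30 s plus one extra second per MB over 5 MB. The exact rule is
// illustrative, not the engine's documented behavior.
func timeoutFor(sizeBytes int64) time.Duration {
	const veryLarge = 5 << 20
	if sizeBytes <= veryLarge {
		return defaultOperationTimeout
	}
	extra := time.Duration((sizeBytes-veryLarge)>>20) * time.Second
	return defaultOperationTimeout + extra
}

func main() {
	// A 64 MB file gets roughly 30 s + 59 s under this rule.
	ctx, cancel := context.WithTimeout(context.Background(), timeoutFor(64<<20))
	defer cancel()
	deadline, _ := ctx.Deadline()
	fmt.Println("deadline in:", time.Until(deadline).Round(time.Second))
}
```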
### Audit Logging

When `--log-dir` is configured, each operation logs its duration, file size, and strategy used. This data appears in the dashboard’s Operations page and can be used to identify bottlenecks.
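A per-operation record of that shape might look like the following when JSON-encoded. The field names here are assumptions; consult the actual log files for the real schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// auditEntry is a guess at the shape of a per-operation log
// record; the real field names may differ.
type auditEntry struct {
	Tool       string `json:"tool"`
	Path       string `json:"path"`
	SizeBytes  int64  `json:"size_bytes"`
	Strategy   string `json:"strategy"`
	DurationMS int64  `json:"duration_ms"`
}

func main() {
	e := auditEntry{
		Tool:       "read_file",
		Path:       "large-dataset.csv",
		SizeBytes:  3 << 20,
		Strategy:   "chunked",
		DurationMS: 42,
	}
	b, _ := json.Marshal(e)
	fmt.Println(string(b))
}
```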
```js
// Check performance stats to see I/O strategy distribution
server_info({ action: "stats" })
```

## When to Use Line Ranges Instead

Even though the engine handles large files automatically, reading only the lines you need is always more token-efficient. For files over 100 KB, consider using `start_line`/`end_line` instead of reading the full file:
```js
// Instead of reading a 500 KB file entirely:
read_file({ path: "large-module.go" }) // ~12,500 tokens

// Read only the section you need:
search_files({ path: "large-module.go", pattern: "targetFunction", include_content: true })
// Then:
read_file({ path: "large-module.go", start_line: 200, end_line: 250 }) // ~1,250 tokens
```

This is the single biggest token optimization available — a 90%+ reduction for targeted reads on large files.
## Configuration

The size thresholds are compiled constants and cannot be changed at runtime. However, you can influence I/O behavior with these server flags:
| Flag | Default | Effect on I/O |
|---|---|---|
| `--cache-size` | 100 MB | Larger cache reduces disk reads for repeated access |
| `--parallel-ops` | 2x CPU cores (max 16) | More concurrent operations for batch workloads |
| `--compact-mode` | false | Reduces response size (65-75% token savings) |
See Configuration for the complete CLI reference.
## See Also

- Core Tools API Reference — Complete `read_file`, `write_file`, `edit_file` parameters
- Performance and Tokens — Token optimization strategies and real-world data
- Configuration — Server flags that affect I/O behavior
- Benchmarks — Throughput and latency measurements
Last updated: March 2026 · Version: 4.0.0