Automatic I/O Strategy

In v4.0.0, there are no separate streaming tools. The three core tools — read_file, write_file, and edit_file — automatically select the optimal I/O strategy based on file size. You never need to choose between “direct read” and “chunked read” or between “write” and “streaming write.” The engine handles it transparently.

| v3 Tool | v4 Equivalent | What Happens Now |
|---|---|---|
| streaming_write_file | write_file | Streaming is automatic for large content |
| chunked_read_file | read_file | Chunking is automatic for large files |
| smart_edit_file | edit_file | Line-by-line processing is automatic for large files |
| intelligent_read | read_file | Intelligence is built-in |
| intelligent_write | write_file | Intelligence is built-in |
| streaming_read_file | read_file | Streaming is automatic when needed |

The engine classifies files into four tiers and selects the I/O strategy accordingly. These thresholds are defined in core/config.go and apply to all three core tools.

| Tier | Size Range | Strategy | Description |
|---|---|---|---|
| Small | < 100 KB | Direct I/O | File is read/written entirely in a single operation. Fastest for most source code files. |
| Medium | 100 KB — 500 KB | Streaming I/O | Uses buffered streaming with moderate memory allocation. Suitable for larger source files and small data files. |
| Large | 500 KB — 5 MB | Chunked Processing | File is processed in adaptive chunks. Suitable for large data files, logs, and generated code. |
| Very Large | > 5 MB | Special Handling | Lazy loading, pagination, and progress reporting. Edit operations are rejected above 50 MB to prevent accidental destructive changes. |

All streaming and chunked operations use a 64 KB buffer (DefaultBufferSize), which is optimal for most disk I/O patterns on modern hardware.
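
To make the tiering concrete, the sketch below shows how a classifier over these thresholds might look. DefaultBufferSize and the size boundaries come from the table above; the remaining constant and type names are illustrative assumptions, not the actual contents of core/config.go.

```go
// Illustrative sketch of size-tier classification; only DefaultBufferSize and
// the threshold values are documented, the other names are assumed here.
package core

const (
	SmallFileThreshold  = 100 * 1024       // 100 KB: direct I/O below this
	MediumFileThreshold = 500 * 1024       // 500 KB: streaming I/O below this
	LargeFileThreshold  = 5 * 1024 * 1024  // 5 MB: chunked processing below this
	MaxEditableSize     = 50 * 1024 * 1024 // 50 MB: edits rejected above this
	DefaultBufferSize   = 64 * 1024        // 64 KB buffer for streaming and chunked I/O
)

// Tier identifies which I/O strategy the engine uses for a given file size.
type Tier int

const (
	TierSmall     Tier = iota // direct I/O
	TierMedium                // streaming I/O
	TierLarge                 // chunked processing
	TierVeryLarge             // lazy loading, pagination, progress reporting
)

func classifyTier(size int64) Tier {
	switch {
	case size < SmallFileThreshold:
		return TierSmall
	case size < MediumFileThreshold:
		return TierMedium
	case size < LargeFileThreshold:
		return TierLarge
	default:
		return TierVeryLarge
	}
}
```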


For read_file, the behavior depends on file size:

| File Size | Behavior |
|---|---|
| < 100 KB | Reads the entire file into memory and returns it |
| 100 KB — 500 KB | Streams content with a buffered reader |
| 500 KB — 5 MB | Reads in chunks and assembles the result |
| > 5 MB | Uses pagination; consider start_line/end_line for specific ranges |

For very large files, the most token-efficient approach is to read only the lines you need:

```
// Read the entire file (auto-selects strategy)
read_file({ path: "large-dataset.csv" })

// Read only lines 100-150 (most efficient for large files)
read_file({ path: "large-dataset.csv", start_line: 100, end_line: 150 })

// Read the last 50 lines (log tailing)
read_file({ path: "server.log", max_lines: 50, mode: "tail" })
```

For write_file, the behavior depends on the size of the content being written:

| Content Size | Behavior |
|---|---|
| < 100 KB | Direct write with atomic rename |
| 100 KB — 500 KB | Streaming write with a buffered writer |
| 500 KB — 5 MB | Chunked write with progress tracking |
| > 5 MB | Streaming write with progress reporting |

All writes are atomic — content is written to a temporary file first, then renamed to the target path. This prevents partial writes on failure.
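
The temp-file-then-rename pattern can be sketched as below. This is illustrative only, not the engine's actual code, and the function name atomicWrite is an assumption.

```go
package core

import (
	"os"
	"path/filepath"
)

// atomicWrite writes data to a temporary file in the target directory and then
// renames it over the destination, so readers never observe a partial file.
func atomicWrite(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".write-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; a no-op after a successful rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// On POSIX filesystems the rename replaces the target atomically.
	return os.Rename(tmp.Name(), path)
}
```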

```
// Small file (direct write)
write_file({ path: "config.json", content: '{"key": "value"}' })

// Large file (streaming is automatic)
write_file({ path: "data.csv", content: largeCSVContent })
```

For edit_file, the behavior depends on file size:

| File Size | Behavior |
|---|---|
| < 100 KB | Loads the file into memory, applies the replacement, writes it back |
| 100 KB — 500 KB | Streaming read, in-memory replace, streaming write |
| 500 KB — 5 MB | Line-by-line processing with the LargeFileProcessor |
| > 50 MB | Rejected — file is too large for safe editing |

The 50 MB hard limit on edits exists to prevent accidental massive changes. If you need to modify files larger than 50 MB, split the operation or use external tools.

```
// Small file edit (direct)
edit_file({ path: "config.go", old_text: "v3.0.0", new_text: "v4.0.0" })

// Large file edit (line-by-line processing is automatic)
edit_file({ path: "large-module.go", old_text: "oldPattern", new_text: "newPattern" })
```

For files in the 500 KB — 5 MB range, the engine uses LargeFileProcessor (defined in core/large_file_processor.go), which supports three processing modes:

| Mode | When Used | Description |
|---|---|---|
| In-Memory | File < 500 KB | Loads the entire content and applies transformations |
| Line-by-Line | File 500 KB — 5 MB | Processes one line at a time with minimal memory |
| Chunk-Based | File > 5 MB (reads only) | Processes the file in fixed-size chunks |

The processor is also used by multi_edit and the pipeline system’s regex_transform action.
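
The line-by-line mode can be pictured as the sketch below, which keeps only the 64 KB buffer and the current line in memory. It reuses the documented DefaultBufferSize constant (defined in the earlier sketch); the function name and the transform callback are illustrative assumptions, not the actual core/large_file_processor.go.

```go
package core

import (
	"bufio"
	"io"
)

// processLines streams src to dst, transforming one line at a time, so peak
// memory stays near the buffer size plus the longest single line.
func processLines(src io.Reader, dst io.Writer, transform func(string) string) error {
	scanner := bufio.NewScanner(src)
	scanner.Buffer(make([]byte, DefaultBufferSize), 1024*1024) // tolerate lines up to 1 MB
	w := bufio.NewWriterSize(dst, DefaultBufferSize)

	for scanner.Scan() {
		if _, err := w.WriteString(transform(scanner.Text()) + "\n"); err != nil {
			return err
		}
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	return w.Flush()
}
```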


The RegexTransformer (defined in core/regex_transformer.go) handles advanced regex-based edits. It applies to edit_file with mode:"regex" and the pipeline regex_transform action.

For large files, regex transformations use the same adaptive strategy:

```
// Regex transform (auto-selects strategy based on file size)
edit_file({
  path: "handlers.go",
  mode: "regex",
  patterns_json: JSON.stringify([
    {
      pattern: "func (\\w+)\\(\\)",
      replacement: "func $1(ctx context.Context)",
      limit: -1
    }
  ]),
  dry_run: true
})
```

When multiple regex patterns are applied, the transformer supports both sequential (pattern-by-pattern) and parallel execution modes, depending on whether the patterns are independent.
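
A rough sketch of the sequential mode follows: each compiled pattern runs over the output of the previous one. The struct and function names are assumptions and the limit handling is simplified, so this is not the actual core/regex_transformer.go.

```go
package core

import "regexp"

// regexPattern mirrors one entry of the patterns_json array shown above.
type regexPattern struct {
	Pattern     string
	Replacement string
	Limit       int // -1 means replace every match
}

// applySequential applies each pattern to the output of the previous one.
// A parallel mode would instead run independent patterns concurrently and
// merge their results.
func applySequential(content string, patterns []regexPattern) (string, error) {
	for _, p := range patterns {
		re, err := regexp.Compile(p.Pattern)
		if err != nil {
			return "", err
		}
		// Simplification: Limit is ignored here and every match is replaced.
		content = re.ReplaceAllString(content, p.Replacement)
	}
	return content, nil
}
```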


The adaptive I/O strategy keeps memory usage predictable:

| Scenario | Memory Overhead |
|---|---|
| Reading a 50 KB file (direct) | ~50 KB |
| Streaming a 300 KB file | ~64 KB buffer + partial content |
| Chunked read of a 3 MB file | ~64 KB buffer at any point |
| Editing a 2 MB file (line-by-line) | ~64 KB buffer + current line |

The engine’s cache (default 100 MB, configurable via --cache-size) stores recently accessed file contents, directory listings, and metadata. Cache eviction is automatic when the limit is reached.
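
The eviction behavior can be pictured as a size-bounded, least-recently-used cache that drops old entries until it fits back under its byte budget. The sketch below is purely illustrative; the types and field names are assumptions, not the engine's actual cache.

```go
package core

import "container/list"

// cacheEntry records the key and byte size of one cached item.
type cacheEntry struct {
	key  string
	size int64
}

// sizeBoundedCache tracks total bytes used and evicts the least recently used
// entries once the configured limit (100 MB by default) is exceeded.
type sizeBoundedCache struct {
	limit   int64
	used    int64
	order   *list.List               // front = most recently used
	entries map[string]*list.Element // key -> element holding a cacheEntry
}

func (c *sizeBoundedCache) evictIfNeeded() {
	for c.used > c.limit {
		oldest := c.order.Back()
		if oldest == nil {
			return
		}
		e := oldest.Value.(cacheEntry)
		c.used -= e.size
		delete(c.entries, e.key)
		c.order.Remove(oldest)
	}
}
```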


All file operations have a default timeout of 30 seconds (DefaultOperationTimeout). For very large files that take longer to process, the engine extends the timeout automatically.
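
One way to picture the extension is a deadline that grows with file size on top of the documented 30-second DefaultOperationTimeout, as in the sketch below. The per-megabyte scaling is purely an illustrative assumption; the engine's actual formula is not documented here.

```go
package core

import "time"

const DefaultOperationTimeout = 30 * time.Second

// operationTimeout grows the deadline with input size so very large files are
// not cut off at the default. The one-second-per-megabyte rate is illustrative.
func operationTimeout(fileSize int64) time.Duration {
	extra := time.Duration(fileSize/(1024*1024)) * time.Second
	return DefaultOperationTimeout + extra
}
```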

When --log-dir is configured, each operation logs its duration, file size, and strategy used. This data appears in the dashboard’s Operations page and can be used to identify bottlenecks.

```
// Check performance stats to see I/O strategy distribution
server_info({ action: "stats" })
```

Even though the engine handles large files automatically, reading only the lines you need is always more token-efficient. For files over 100 KB, consider using start_line/end_line instead of reading the full file:

```
// Instead of reading a 500 KB file entirely:
read_file({ path: "large-module.go" }) // ~12,500 tokens

// Read only the section you need:
search_files({ path: "large-module.go", pattern: "targetFunction", include_content: true })

// Then:
read_file({ path: "large-module.go", start_line: 200, end_line: 250 }) // ~1,250 tokens
```

This is the single biggest token optimization available — a 90%+ reduction for targeted reads on large files.


The size thresholds are compiled constants and cannot be changed at runtime. However, you can influence I/O behavior with these server flags:

| Flag | Default | Effect on I/O |
|---|---|---|
| --cache-size | 100 MB | A larger cache reduces disk reads for repeated access |
| --parallel-ops | 2x CPU cores (max 16) | More concurrent operations for batch workloads |
| --compact-mode | false | Reduces response size (65-75% token savings) |

See Configuration for the complete CLI reference.



Last updated: March 2026 · Version: 4.0.0