
Bug #5: Token Efficiency

Status: RESOLVED in v3.5.0 - v3.7.0
Category: Token Optimization
Severity: High (Cost impact)
Resolution Date: 2025

Working with Claude Desktop on enterprise projects consumed enormous amounts of tokens. A typical 2-hour coding session could use 2+ million tokens (~$6-7 USD), making intensive AI-assisted development prohibitively expensive. Four problems drove the waste:

  1. Verbose output: Every file listing, search result, and operation returned formatted, human-readable output with headers, separators, and decorations
  2. Full file operations: No way to read/write portions of files - always entire content
  3. No caching: Repeated reads of the same file consumed tokens each time
  4. Wasteful patterns: AI agents read entire files to make small changes
| Operation | Tokens Used | Cost at Scale |
| --- | --- | --- |
| List directory (50 files) | ~2,500 tokens | Adds up fast |
| Read 5,000-line file | ~125,000 tokens | Expensive |
| Write same file back | ~125,000 tokens | Very expensive |
| Single edit cycle | ~250,000 tokens | Unsustainable |

A single read-modify-write cycle on a large file could cost 250,000+ tokens.
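The arithmetic behind that figure can be sketched as a back-of-the-envelope model. The constants below are assumptions inferred from the numbers above (~125,000 tokens for 5,000 lines implies ~25 tokens/line; ~$6.30 for ~2.1M tokens implies ~$3 per million tokens), not measured values:

```go
package main

import "fmt"

// Rough cost model inferred from the figures above; the constants are
// assumptions derived from the quoted numbers, not measurements.
const (
	tokensPerLine   = 25  // ~125,000 tokens / 5,000 lines
	usdPerMegatoken = 3.0 // ~$6.30 / ~2.1M tokens
)

// readModifyWriteTokens estimates a full read + full write cycle:
// the entire file passes through the model twice.
func readModifyWriteTokens(lines int) int {
	return 2 * lines * tokensPerLine
}

func main() {
	tokens := readModifyWriteTokens(5000)
	fmt.Printf("%d tokens (~$%.2f)\n", tokens, float64(tokens)/1e6*usdPerMegatoken)
	// → 250000 tokens (~$0.75)
}
```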

Added --compact-mode flag that reduces output verbosity dramatically.

Before (verbose mode):

Directory listing for: C:\project\src
Type Name Size Modified
----------------------------------------------------
[DIR] components - 2024-01-15
[FILE] index.ts 2.3 KB 2024-01-14
[FILE] app.tsx 5.1 KB 2024-01-14
...

~50 tokens per entry

After (compact mode):

src: components/, index.ts(2KB), app.tsx(5KB), utils/, config.json

~3 tokens per entry

Result: 65-75% reduction on listing operations

Added tools to read/write specific portions of files.

New tools:

  • read_file_range - Read specific line range
  • chunked_read_file - Read file in chunks
  • smart_edit_file - Edit without loading full file

Before:

read_file({path: "large.go"}) // 125,000 tokens for 5000 lines

After:

read_file_range({path: "large.go", start: 145, end: 160}) // 400 tokens

Result: 99% reduction for targeted reads

Optimized edit operations to avoid full file rewrites.

Before (full rewrite):

1. read_file("large.go") → 75,000 tokens (input)
2. [Claude processes] → thinking tokens
3. write_file("large.go") → 75,000 tokens (output)
Total: ~150,000 tokens

After (surgical edit):

1. smart_search("functionName") → 500 tokens
2. read_file_range(lines 145-160) → 400 tokens
3. edit_file(old_text, new_text) → 300 tokens
Total: ~1,200 tokens

Result: 99% reduction on edit operations
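The targeted-replacement step in that workflow can be sketched as an old_text/new_text match applied server-side, so only the small snippets ever pass through the model. `applyEdit` is a hypothetical name for illustration, not the server's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// applyEdit performs an old_text -> new_text replacement on content that
// stays on disk, so only the snippets cost tokens. It refuses ambiguous
// matches, forcing the caller to supply more surrounding context.
func applyEdit(content, oldText, newText string) (string, error) {
	switch n := strings.Count(content, oldText); {
	case n == 0:
		return "", fmt.Errorf("old_text not found")
	case n > 1:
		return "", fmt.Errorf("old_text matches %d locations; add more context", n)
	}
	return strings.Replace(content, oldText, newText, 1), nil
}

func main() {
	src := "func Add(a, b int) int {\n\treturn a + b\n}"
	out, _ := applyEdit(src, "return a + b", "return a + b // sum")
	fmt.Println(out)
}
```

Requiring a unique match is the key design choice: it keeps the edit surgical while guaranteeing the change lands where the agent intended.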

Implemented 3-tier caching system:

| Cache Tier | Purpose | Hit Rate |
| --- | --- | --- |
| BigCache | File content | 85-95% |
| go-cache | Directory listings | 90%+ |
| go-cache | File metadata | 95%+ |

Repeated operations hit cache instead of re-processing.

Result: 85-95% cache hit rate after warmup

{
  "mcpServers": {
    "filesystem-ultra": {
      "command": "filesystem-ultra.exe",
      "args": [
        "--compact-mode",
        "--cache-size", "200MB"
      ]
    }
  }
}
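On the server side, flags like these can be wired up with Go's standard flag package. This is a sketch, not the binary's actual option handling; the "100MB" default is an assumption:

```go
package main

import (
	"flag"
	"fmt"
)

type config struct {
	CompactMode bool
	CacheSize   string
}

// parseConfig mirrors the flags shown in the JSON config above.
// Go's flag package accepts both -compact-mode and --compact-mode.
func parseConfig(args []string) (config, error) {
	fs := flag.NewFlagSet("filesystem-ultra", flag.ContinueOnError)
	var cfg config
	fs.BoolVar(&cfg.CompactMode, "compact-mode", false, "emit compact, token-efficient output")
	fs.StringVar(&cfg.CacheSize, "cache-size", "100MB", "memory budget for the content cache")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	cfg, _ := parseConfig([]string{"--compact-mode", "--cache-size", "200MB"})
	fmt.Printf("%+v\n", cfg)
	// → {CompactMode:true CacheSize:200MB}
}
```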

Compact formatting (core/engine.go):

func (e *UltraFastEngine) formatDirectoryListing(entries []os.DirEntry) string {
	if e.config.CompactMode {
		// Compact: "file1.go(2KB), file2.go(5KB), dir/"
		return e.formatCompact(entries)
	}
	// Verbose: full table with headers
	return e.formatVerbose(entries)
}
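The compact branch itself might look like the following. This is a sketch against a simplified entry type (os.DirEntry needs an os.FileInfo call for sizes), not the engine's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// dirEntry is a simplified stand-in for os.DirEntry plus its FileInfo.
type dirEntry struct {
	name  string
	size  int64 // bytes; ignored for directories
	isDir bool
}

// formatCompact emits the terse "dir/, file.ts(2KB)" form shown above:
// no headers, no separators, one comma-separated line.
func formatCompact(entries []dirEntry) string {
	parts := make([]string, 0, len(entries))
	for _, e := range entries {
		if e.isDir {
			parts = append(parts, e.name+"/")
			continue
		}
		parts = append(parts, fmt.Sprintf("%s(%dKB)", e.name, e.size/1024))
	}
	return strings.Join(parts, ", ")
}

func main() {
	fmt.Println(formatCompact([]dirEntry{
		{name: "components", isDir: true},
		{name: "index.ts", size: 2355},
		{name: "app.tsx", size: 5222},
	}))
	// → components/, index.ts(2KB), app.tsx(5KB)
}
```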

Range reading (core/engine.go):

func (e *UltraFastEngine) ReadFileRange(path string, start, end int) (string, error) {
	file, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer file.Close()

	var result strings.Builder
	scanner := bufio.NewScanner(file)
	for lineNum := 1; scanner.Scan(); lineNum++ {
		if lineNum > end {
			break // past the requested range; stop scanning
		}
		if lineNum >= start {
			// Only the requested lines reach the output
			result.WriteString(scanner.Text())
			result.WriteByte('\n')
		}
	}
	return result.String(), scanner.Err()
}

From an actual 2-hour coding session on a Go project:

| Metric | Without Optimization | With Optimization |
| --- | --- | --- |
| File reads | 156 operations | 156 operations |
| File edits | 43 operations | 43 operations |
| Searches | 28 operations | 28 operations |
| Total tokens | ~2,100,000 | ~480,000 |
| Cost | ~$6.30 | ~$1.44 |

Savings: 77% tokens, 77% cost

| Operation Type | Before | After | Savings |
| --- | --- | --- | --- |
| File read | 15,000 tokens | 800 tokens | 95% |
| File write | 12,000 tokens | 600 tokens | 95% |
| File edit | 25,000 tokens | 1,200 tokens | 95% |
| Directory list | 8,000 tokens | 400 tokens | 95% |
| Search (10 results) | 5,000 tokens | 800 tokens | 84% |

Added get_edit_telemetry to monitor efficiency:

get_edit_telemetry()

Response:

Edit Telemetry Summary
Total edits: 43
Targeted edits: 38 (88%)
Full rewrites: 5 (12%)
Goal: >80% targeted edits
Status: OPTIMAL

Added get_optimization_suggestion for file-specific advice:

get_optimization_suggestion({path: "large.go"})

Response:

File: large.go
Size: 156 KB
Category: Medium
Recommendation: Use surgical operations
- For reading: read_file_range or intelligent_read
- For editing: edit_file or intelligent_edit
- Avoid: read_file (unnecessary token cost)
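The size-based recommendation can be sketched as a simple classifier. The KB thresholds here are illustrative guesses, chosen only to be consistent with a 156 KB file landing in "Medium":

```go
package main

import "fmt"

// recommend maps file size to a read/edit strategy; the thresholds are
// assumptions, picked so that a 156 KB file is categorized "Medium".
func recommend(sizeKB int) (category, advice string) {
	switch {
	case sizeKB < 50:
		return "Small", "read_file is fine"
	case sizeKB < 500:
		return "Medium", "use read_file_range and edit_file"
	default:
		return "Large", "use chunked_read_file and smart_edit_file"
	}
}

func main() {
	cat, advice := recommend(156)
	fmt.Println(cat, "-", advice)
	// → Medium - use read_file_range and edit_file
}
```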

100% backward compatible:

  • Compact mode is opt-in via --compact-mode flag
  • All original tools work unchanged
  • New tools are additions, not replacements
Release timeline:

  • v3.5.0: Compact mode, range operations
  • v3.6.0: Multi-edit, cache improvements
  • v3.7.0: Telemetry, optimization suggestions

Status: Production Ready
Lessons learned:

  1. Measure before optimizing - Token telemetry revealed the biggest waste areas
  2. Optimize the common case - Most edits are small changes, not full rewrites
  3. Make efficiency the default - Compact mode should be on by default for AI clients
  4. Cache aggressively - AI agents read the same files repeatedly
  5. Provide guidance - Tools like get_optimization_suggestion help agents self-optimize