Repository Analysis¶
Tokuye's repository analysis tools are the foundation of its context-aware assistance. Before writing a single line of code, Tokuye understands your entire project.
repo-summary¶
Inspired by Repomix, repo-summary analyzes your entire repository and converts it to a Claude-friendly XML format.
- Traverses the project directory tree
- Respects
.gitignorepatterns - Respects
.tokuye/summary.ignorefor additional exclusions - Outputs a structured XML summary that the agent uses as context
The summary is stored at .tokuye/repo-summary.xml and regenerated when the agent detects changes.
repo-summary-rag¶
repo-summary-rag pre-indexes your code using FAISS for fast semantic search.
How It Works¶
- Code files are parsed using tree-sitter into function/class-level chunks
- Each chunk is embedded using Amazon Titan Embeddings v2 (512-dimensional vectors)
- Embeddings are stored in a FAISS index at
.tokuye/ - On subsequent runs, only changed files are re-indexed (differential update)
Supported Languages¶
| Language | Parser |
|---|---|
| Python | tree-sitter-python |
| JavaScript | tree-sitter-javascript |
| TypeScript | tree-sitter-typescript |
| Go | tree-sitter-go |
| Ruby | tree-sitter-ruby |
Semantic Chunking¶
Code is split at function and class boundaries, not arbitrary line counts. This means search results are always complete, meaningful units of code — not truncated snippets.
Differential Updates¶
The index tracks file modification times. On each agent invocation, only files that have changed since the last index build are re-embedded. This keeps startup fast even on large projects.
Excluding Files¶
Create a .tokuye/summary.ignore file to exclude specific paths from repository analysis:
# Exclude vendor directory
vendor/
# Exclude generated files
*.generated.ts
dist/
# Exclude large data files
data/*.csv
- One path pattern per line
- Supports glob patterns
- Applied in addition to
.gitignore
See CLI Usage & Exclusions for more details.