Entity-level diffs on top of Git. Instead of "line 43 changed", sem tells you "function validateToken was added in src/auth.ts".
sem diff
┌─ src/auth/login.ts ──────────────────────────────────
│
│ ⊕ function validateToken [added]
│ ∆ function authenticateUser [modified]
│ ⊖ function legacyAuth [deleted]
│
└──────────────────────────────────────────────────────
┌─ config/database.yml ─────────────────────────────────
│
│ ∆ property production.pool_size [modified]
│ - 5
│ + 20
│
└──────────────────────────────────────────────────────
Summary: 1 added, 1 modified, 1 deleted across 2 files
Install
npm install -g @ataraxy-labs/sem
Or run directly:
npx @ataraxy-labs/sem diff
Usage
Works in any Git repo. No setup required.
# Semantic diff of working changes
sem diff
# Staged changes only
sem diff --staged
# Specific commit
sem diff --commit abc1234
# Commit range
sem diff --from HEAD~5 --to HEAD
# JSON output (for AI agents, CI pipelines)
sem diff --format json
# Semantic commit history
sem log -n 5
# SQL queries against stored changes
sem init
sem diff --store
sem query "SELECT entity_type, entity_name, change_type FROM changes"
What It Parses
| Format | Extensions | Entities |
|---|---|---|
| TypeScript | .ts .tsx | functions, classes, interfaces, types, enums |
| JavaScript | .js .jsx .mjs .cjs | functions, classes, variables |
| Python | .py | functions, classes, decorated definitions |
| Go | .go | functions, methods, types, vars, consts |
| Rust | .rs | functions, structs, enums, impls, traits, mods |
| JSON | .json | properties, objects (RFC 6901 paths) |
| YAML | .yml .yaml | sections, properties (dot paths) |
| TOML | .toml | sections, properties |
| CSV | .csv .tsv | rows (first column as identity) |
| Markdown | .md .mdx | heading-based sections |
Everything else falls back to chunk-based diffing.
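To make the path-based entities in the table more concrete, here is a minimal sketch of how nested configuration data could be flattened into dot paths (or RFC 6901 JSON Pointers) and then compared property by property. Every name in it (`ConfigTree`, `flattenConfig`, `diffProperties`, and so on) is hypothetical and chosen for illustration; it is not sem's API, and sem's actual parsers may work differently. Arrays are deliberately left out.

```typescript
// Illustrative sketch only: all names here are hypothetical, not sem's API.
type Scalar = string | number | boolean | null;
interface ConfigTree { [key: string]: ConfigTree | Scalar }

// Dot path: ["production", "pool_size"] -> "production.pool_size"
function dotPath(segments: string[]): string {
  return segments.join(".");
}

// RFC 6901 JSON Pointer: "~" is escaped as "~0", then "/" as "~1".
function jsonPointer(segments: string[]): string {
  return segments
    .map((s) => "/" + s.replace(/~/g, "~0").replace(/\//g, "~1"))
    .join("");
}

function hasKey(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Flatten nested mappings into { path: scalar }. Arrays are not handled.
function flattenConfig(
  tree: ConfigTree,
  format: (segments: string[]) => string,
  prefix: string[] = [],
  out: Record<string, Scalar> = {},
): Record<string, Scalar> {
  for (const key of Object.keys(tree)) {
    const value = tree[key];
    if (value === undefined) continue;
    const segments = prefix.concat(key);
    if (value !== null && typeof value === "object") {
      flattenConfig(value, format, segments, out); // recurse into the section
    } else {
      out[format(segments)] = value; // leaf property
    }
  }
  return out;
}

interface PropertyChange {
  path: string;
  changeType: "added" | "modified" | "deleted";
  before?: Scalar;
  after?: Scalar;
}

// Compare two versions of a config, keyed by property path.
function diffProperties(before: ConfigTree, after: ConfigTree, format = dotPath): PropertyChange[] {
  const a = flattenConfig(before, format);
  const b = flattenConfig(after, format);
  const changes: PropertyChange[] = [];
  for (const path of Object.keys(b)) {
    if (!hasKey(a, path)) {
      changes.push({ path, changeType: "added", after: b[path] });
    } else if (a[path] !== b[path]) {
      changes.push({ path, changeType: "modified", before: a[path], after: b[path] });
    }
  }
  for (const path of Object.keys(a)) {
    if (!hasKey(b, path)) changes.push({ path, changeType: "deleted", before: a[path] });
  }
  return changes;
}

// "adapter" is a made-up key; pool_size mirrors the database.yml example.
const propertyChanges = diffProperties(
  { production: { adapter: "postgresql", pool_size: 5 } },
  { production: { adapter: "postgresql", pool_size: 20 } },
);
console.log(propertyChanges);
// one entry: production.pool_size, modified, before 5, after 20
```

Passing `jsonPointer` as the third argument would report the same change under the path `/production/pool_size`. Keying on the path rather than the line number is what lets a reordered or reindented file produce no spurious changes in this sketch.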
How Matching Works
Three-phase entity matching:
- Exact ID match: same entity in before/after → modified or unchanged
- Content hash match: same content, different name → renamed or moved
- Fuzzy similarity: >80% token overlap → probable rename
This means sem detects renames and moves in addition to adds and deletes.
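The three phases above can be sketched roughly as follows. This is an illustration under stated assumptions, not sem's implementation: the `CodeEntity` shape, the function names, and the result labels are hypothetical; the ID format is borrowed from the `entityId` field in the JSON output; "token overlap" is interpreted here as Jaccard similarity over distinct identifier-like tokens; and content is compared directly where a real implementation would compare hashes.

```typescript
// Illustrative sketch only: shapes and names are hypothetical, not sem's API.
interface CodeEntity {
  id: string;      // e.g. "src/auth.ts::function::validateToken"
  content: string; // entity body, assumed here to exclude the entity's name
}

type EntityMatch =
  | { kind: "unchanged" | "modified" | "renamed-or-moved" | "probable-rename"; before: CodeEntity; after: CodeEntity }
  | { kind: "added"; after: CodeEntity }
  | { kind: "deleted"; before: CodeEntity };

// Distinct identifier-like tokens in a piece of source text.
function distinctTokens(text: string): string[] {
  const all: string[] = text.match(/[A-Za-z0-9_]+/g) || [];
  return all.filter((t, i) => all.indexOf(t) === i);
}

// Jaccard similarity (assumption): shared tokens / tokens in either side.
function tokenOverlap(a: string, b: string): number {
  const ta = distinctTokens(a);
  const tb = distinctTokens(b);
  const shared = ta.filter((t) => tb.indexOf(t) !== -1).length;
  const union = ta.length + tb.length - shared;
  return union === 0 ? 1 : shared / union;
}

function matchEntities(before: CodeEntity[], after: CodeEntity[], threshold = 0.8): EntityMatch[] {
  const results: EntityMatch[] = [];
  const pool = before.slice(); // "before" entities not yet matched

  // Remove and return the first pooled entity that satisfies the predicate.
  // A fuller version might pick the best-scoring candidate instead.
  function take(pred: (e: CodeEntity) => boolean): CodeEntity | undefined {
    for (let i = 0; i < pool.length; i++) {
      const candidate = pool[i];
      if (candidate && pred(candidate)) {
        pool.splice(i, 1);
        return candidate;
      }
    }
    return undefined;
  }

  // Phase 1: exact ID match -> modified or unchanged.
  const afterPhase1: CodeEntity[] = [];
  for (const a of after) {
    const b = take((e) => e.id === a.id);
    if (b) results.push({ kind: b.content === a.content ? "unchanged" : "modified", before: b, after: a });
    else afterPhase1.push(a);
  }

  // Phase 2: identical content under a different ID -> renamed or moved.
  const afterPhase2: CodeEntity[] = [];
  for (const a of afterPhase1) {
    const b = take((e) => e.content === a.content);
    if (b) results.push({ kind: "renamed-or-moved", before: b, after: a });
    else afterPhase2.push(a);
  }

  // Phase 3: token overlap above the threshold -> probable rename; else added.
  for (const a of afterPhase2) {
    const b = take((e) => tokenOverlap(e.content, a.content) > threshold);
    if (b) results.push({ kind: "probable-rename", before: b, after: a });
    else results.push({ kind: "added", after: a });
  }

  // Whatever is left in the pool has no counterpart: deleted.
  for (const b of pool) results.push({ kind: "deleted", before: b });
  return results;
}

// hashPw / hashPassword are made-up entities used to exercise phase 2.
const matches = matchEntities(
  [
    { id: "src/auth.ts::function::authenticateUser", content: "const ok = check(user); return ok;" },
    { id: "src/auth.ts::function::legacyAuth", content: "return oldCheck(user);" },
    { id: "src/auth.ts::function::hashPw", content: "return sha256(salt + password);" },
  ],
  [
    { id: "src/auth.ts::function::authenticateUser", content: "const ok = check(user, token); return ok;" },
    { id: "src/auth.ts::function::validateToken", content: "return verify(token, secret);" },
    { id: "src/auth.ts::function::hashPassword", content: "return sha256(salt + password);" },
  ],
);
console.log(matches.map((m) => m.kind).join(", "));
// modified, renamed-or-moved, added, deleted
```

Running the cheap, precise phases first means the fuzzy comparison only ever sees entities that neither an ID nor an exact content match could explain, which keeps false rename reports down in this sketch.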
JSON Output
sem diff --format json
{
  "summary": {
    "fileCount": 2,
    "added": 1,
    "modified": 1,
    "deleted": 1,
    "total": 3
  },
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
SQL Queries
sem init
sem log --store
sem query "SELECT change_type, count(*) as n FROM changes GROUP BY change_type"
change_type │ n
─────────────────────────────────
added       │ 29
deleted     │ 2
modified    │ 7
Architecture
- tree-sitter (native) for code parsing. Not WASM.
- better-sqlite3 for storage. WAL mode, fast transactions.
- simple-git for Git operations
- Plugin system. Add your own parsers.
MIT License