sem

Semantic version control. Entity-level diffs on top of Git. Instead of 'line 43 changed', sem tells you 'function validateToken was added in src/auth.ts'.

● live · MIT · npm · native tree-sitter

sem diff

┌─ src/auth/login.ts ──────────────────────────────────
│
│  ⊕ function  validateToken          [added]
│  ∆ function  authenticateUser       [modified]
│  ⊖ function  legacyAuth             [deleted]
│
└──────────────────────────────────────────────────────

┌─ config/database.yml ─────────────────────────────────
│
│  ∆ property  production.pool_size   [modified]
│    - 5
│    + 20
│
└──────────────────────────────────────────────────────

Summary: 1 added, 2 modified, 1 deleted across 2 files

Install

npm install -g @ataraxy-labs/sem

Or run directly:

npx @ataraxy-labs/sem diff

Usage

Works in any Git repo. No setup required.

# Semantic diff of working changes
sem diff

# Staged changes only
sem diff --staged

# Specific commit
sem diff --commit abc1234

# Commit range
sem diff --from HEAD~5 --to HEAD

# JSON output (for AI agents, CI pipelines)
sem diff --format json

# Semantic commit history
sem log -n 5

# SQL queries against stored changes
sem init
sem diff --store
sem query "SELECT entity_type, entity_name, change_type FROM changes"

What It Parses

Format      Extensions          Entities
TypeScript  .ts .tsx            functions, classes, interfaces, types, enums
JavaScript  .js .jsx .mjs .cjs  functions, classes, variables
Python      .py                 functions, classes, decorated definitions
Go          .go                 functions, methods, types, vars, consts
Rust        .rs                 functions, structs, enums, impls, traits, mods
JSON        .json               properties, objects (RFC 6901 paths)
YAML        .yml .yaml          sections, properties (dot paths)
TOML        .toml               sections, properties
CSV         .csv .tsv           rows (first column as identity)
Markdown    .md .mdx            heading-based sections

Everything else falls back to chunk-based diffing.
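
To make the two path styles in the table concrete, here is a minimal Python sketch that flattens a nested config into RFC 6901 pointers (the JSON row) and dot paths (the YAML row). The function names json_pointer_paths and dot_paths are hypothetical and are not part of sem; how sem builds its paths internally is not described here, so treat this as an illustration of the notation only.

```python
def json_pointer_paths(value, prefix=""):
    """Yield (pointer, leaf) pairs using RFC 6901 (JSON Pointer) syntax."""
    if isinstance(value, dict):
        for key, child in value.items():
            # RFC 6901 escaping: "~" becomes "~0" first, then "/" becomes "~1".
            token = str(key).replace("~", "~0").replace("/", "~1")
            yield from json_pointer_paths(child, f"{prefix}/{token}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from json_pointer_paths(child, f"{prefix}/{index}")
    else:
        yield prefix, value


def dot_paths(value, prefix=""):
    """Yield (dot.path, leaf) pairs; lists are treated as leaves in this sketch."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from dot_paths(child, path)
    else:
        yield prefix, value


# Hypothetical config mirroring the sample diff above.
config = {"production": {"pool_size": 20, "adapter": "postgresql"}}

print(dict(json_pointer_paths(config)))
print(dict(dot_paths(config)))
```

With paths like these, the pool_size change in the sample diff is identified as production.pool_size in YAML, and the same property in a JSON file would be addressed as /production/pool_size.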

How Matching Works

Three-phase entity matching:

  1. Exact ID match: same entity in before/after → modified or unchanged
  2. Content hash match: same content, different name → renamed or moved
  3. Fuzzy similarity: >80% token overlap → probable rename

This means sem detects renames and moves in addition to adds and deletes.
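
The three phases above can be sketched roughly as follows. This is an illustration under stated assumptions, not sem's implementation: the Entity class, the match_entities function, the SHA-256 hash, and the whitespace-token Jaccard measure are all hypothetical stand-ins. Only the phase order and the 80% threshold come from the description above.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str  # e.g. "src/auth.ts::function::validateToken"
    content: str


def content_hash(entity):
    return hashlib.sha256(entity.content.encode("utf-8")).hexdigest()


def token_overlap(a, b):
    """Jaccard overlap of whitespace-separated tokens (an assumed measure)."""
    ta, tb = set(a.split()), set(b.split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_entities(before, after, threshold=0.8):
    """Return (change, old_id, new_id) tuples for two entity lists."""
    old = {e.entity_id: e for e in before}
    new = {e.entity_id: e for e in after}
    results = []

    # Phase 1: exact ID match -> modified or unchanged.
    for eid in list(old):
        if eid in new:
            same = old[eid].content == new[eid].content
            results.append(("unchanged" if same else "modified", eid, eid))
            del old[eid], new[eid]

    # Phase 2: identical content under a different ID -> renamed or moved.
    by_hash = {}
    for e in old.values():
        by_hash.setdefault(content_hash(e), e)
    for eid, e in list(new.items()):
        prev = by_hash.pop(content_hash(e), None)
        if prev is not None:
            results.append(("renamed_or_moved", prev.entity_id, eid))
            del old[prev.entity_id], new[eid]

    # Phase 3: fuzzy similarity above the threshold -> probable rename.
    for eid, e in list(new.items()):
        best = max(old.values(),
                   key=lambda o: token_overlap(o.content, e.content),
                   default=None)
        if best is not None and token_overlap(best.content, e.content) > threshold:
            results.append(("probable_rename", best.entity_id, eid))
            del old[best.entity_id], new[eid]

    # Whatever is left unmatched is a plain delete or add.
    results += [("deleted", eid, None) for eid in old]
    results += [("added", None, eid) for eid in new]
    return results


before = [Entity("f::function::old_name", "return compute(a, b)")]
after = [Entity("f::function::new_name", "return compute(a, b)")]
print(match_entities(before, after))
```

Running the cheap exact checks first means the fuzzy comparison only has to consider entities that the first two phases could not pair up.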

JSON Output

sem diff --format json
{
  "summary": {
    "fileCount": 2,
    "added": 1,
    "modified": 1,
    "deleted": 1,
    "total": 3
  },
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
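
For agents or CI steps that consume this output, a small Python sketch of grouping the changes by type might look like this. The field names are taken from the sample above; the helper name group_by_change_type is hypothetical, and the JSON is embedded as a string so the example runs without sem installed.

```python
import json
from collections import defaultdict

# The sample output shown above, embedded so the sketch is self-contained.
SAMPLE = """
{
  "summary": {"fileCount": 2, "added": 1, "modified": 1, "deleted": 1, "total": 3},
  "changes": [
    {
      "entityId": "src/auth.ts::function::validateToken",
      "changeType": "added",
      "entityType": "function",
      "entityName": "validateToken",
      "filePath": "src/auth.ts"
    }
  ]
}
"""


def group_by_change_type(report):
    """Map each changeType to a list of human-readable entity descriptions."""
    grouped = defaultdict(list)
    for change in report["changes"]:
        label = f'{change["entityType"]} {change["entityName"]} ({change["filePath"]})'
        grouped[change["changeType"]].append(label)
    return dict(grouped)


report = json.loads(SAMPLE)
for change_type, entities in group_by_change_type(report).items():
    print(change_type, entities)
```

In a pipeline, the same function could read from standard input instead, with json.load(sys.stdin) fed by sem diff --format json.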

SQL Queries

sem init
sem log --store
sem query "SELECT change_type, count(*) as n FROM changes GROUP BY change_type"
change_type          │ n
─────────────────────────────────
added                │ 29
deleted              │ 2
modified             │ 7
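
To show what the GROUP BY query computes, here is a sketch that runs it against an in-memory table using Python's built-in sqlite3 module. This is a stand-in only: the storage engine sem uses is not stated here, and the real changes table may have more columns. The table name and the three columns come from the query in Usage; the rows are the four changes from the sample diff at the top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed minimal schema: only the columns named in the Usage query.
conn.execute(
    "CREATE TABLE changes (entity_type TEXT, entity_name TEXT, change_type TEXT)"
)

# Rows mirroring the sample diff output (illustrative data, not real history).
rows = [
    ("function", "validateToken", "added"),
    ("function", "authenticateUser", "modified"),
    ("function", "legacyAuth", "deleted"),
    ("property", "production.pool_size", "modified"),
]
conn.executemany("INSERT INTO changes VALUES (?, ?, ?)", rows)

counts = conn.execute(
    "SELECT change_type, count(*) AS n FROM changes "
    "GROUP BY change_type ORDER BY change_type"
).fetchall()

for change_type, n in counts:
    print(f"{change_type:<20} │ {n}")
```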

Architecture

View on GitHub · Read the technical deep dive · MIT License
