jscpd

Copy/paste detector for programming source code. Supports 224+ formats. AI-ready with MCP server and token-efficient reporter. Now with a Rust-powered engine — 24-37x faster.

jscpd implements the Rabin-Karp algorithm to find duplicated code blocks across files.
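
To make the idea concrete, here is a minimal sketch of Rabin-Karp style duplicate detection over fixed-size token windows. It is not jscpd's actual implementation: the whitespace tokenizer, the `findDuplicateWindows` name, the window size, and the hash constants are all assumptions chosen for brevity.

```typescript
// Illustrative sketch only; every name here is hypothetical, not jscpd's API.
interface Match {
  first: number;  // token index where the earlier occurrence starts
  second: number; // token index where the later occurrence starts
}

const BASE = 257;
const MOD = 1_000_003; // small prime so every product stays well below 2^53

// Hash a single token into the range [0, MOD).
function tokenHash(token: string): number {
  let h = 0;
  for (let i = 0; i < token.length; i++) {
    h = (h * 31 + token.charCodeAt(i)) % MOD;
  }
  return h;
}

// Compare two windows token by token to rule out hash collisions.
function sameWindow(tokens: string[], a: number, b: number, size: number): boolean {
  for (let i = 0; i < size; i++) {
    if (tokens[a + i] !== tokens[b + i]) return false;
  }
  return true;
}

function findDuplicateWindows(tokens: string[], windowSize: number): Match[] {
  const matches: Match[] = [];
  if (windowSize <= 0 || tokens.length < windowSize) return matches;

  const hashes = tokens.map(tokenHash);

  // BASE^(windowSize - 1) mod MOD, needed to remove the outgoing token.
  let highPow = 1;
  for (let i = 1; i < windowSize; i++) highPow = (highPow * BASE) % MOD;

  // Hash of the first window.
  let rolling = 0;
  for (let i = 0; i < windowSize; i++) rolling = (rolling * BASE + hashes[i]) % MOD;

  const seen = new Map<number, number[]>(); // window hash -> start indices

  for (let start = 0; ; start++) {
    const candidates = seen.get(rolling) ?? [];
    const earlier = candidates.find((c) => sameWindow(tokens, c, start, windowSize));
    if (earlier !== undefined) {
      matches.push({ first: earlier, second: start });
    } else {
      candidates.push(start);
      seen.set(rolling, candidates);
    }

    const next = start + windowSize;
    if (next >= tokens.length) break;

    // Slide the window: drop tokens[start], append tokens[next].
    rolling = (rolling - ((hashes[start] * highPow) % MOD) + MOD) % MOD;
    rolling = (rolling * BASE + hashes[next]) % MOD;
  }
  return matches;
}

const sample = "const a = 1 ; const b = 2 ; const a = 1 ;".split(" ");
console.log(findDuplicateWindows(sample, 5)); // one match: first = 0, second = 10
```

The rolling update is what makes the approach cheap: each new window hash is derived from the previous one in constant time, and the token-by-token comparison only runs when two hashes agree.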

Quick Start

    # Install (all platforms; installs both jscpd and cpd commands)
    curl -fsSL https://jscpd.dev/install.sh | bash

    # TypeScript engine (Node.js, v4.x)
    npm install -g jscpd@4
    jscpd /path/to/code
    # or use without installing
    npx jscpd@4 /path/to/code

    # Rust engine (v5.x, 24-37x faster): both jscpd and cpd commands
    npm install -g jscpd@5
    jscpd /path/to/code
    cpd /path/to/code

    # Rust engine: cpd command only
    npm install -g cpd
    cpd /path/to/code

    # Rust-native install (exposes both jscpd and cpd)
    cargo install jscpd

    # Nix (installs both jscpd and cpd)
    nix run github:kucherenko/jscpd -- /path/to/code
    # or install permanently
    nix profile install github:kucherenko/jscpd

    # Homebrew (macOS/Linux)
    brew install jscpd

Documentation

| Document | Description |
| --- | --- |
| TypeScript (v4.x) | Node.js engine: CLI, reporters, config, detection modes |
| Rust (v5.x) | Rust engine: installation, CLI, reporters, blame, Rust API |
| AI-Ready | AI reporter, agent skills, MCP server |
| Programming API | TypeScript and Rust programmatic APIs |
| CI & Pre-Commit Hooks | GitHub Action, pre-commit hooks |
| Packages | Monorepo package and crate overview |

Two Engines

| | TypeScript (v4) | Rust (v5) |
| --- | --- | --- |
| npm package | jscpd@4 | jscpd@5 or cpd |
| CLI command | jscpd | jscpd and cpd (both available) |
| Speed | Baseline | 24-37x faster |
| Formats | 224 | 223 |
| Node.js required | Yes | No (self-contained binary) |
| Programming API | TypeScript (jscpd(), detectClones()) | Rust (cpd-finder crate) |
| LevelDB store | Yes | No |
| Reporters | 13 | 13 |

jscpd@5 installs both jscpd and cpd commands. The cpd npm package installs only the cpd command. Both contain the same Rust binary.

What's New

v5.0.x — Rust Engine

jscpd v5 is a ground-up Rust rewrite that ships as jscpd@5 (installs both jscpd and cpd commands) or cpd (installs the cpd command only). Self-contained binary — no Node.js runtime required.

Same interface, 24-37x faster:

  • All CLI options from v4 are preserved, so v5 is a drop-in replacement: jscpd → jscpd@5
  • Same .jscpd.json config file, same detection algorithm, same reporters
  • 223 language formats with cross-format detection (Vue SFC, Svelte, Astro, Markdown)
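
Cross-format detection depends on treating each embedded block of a multi-language file as its own format. As a rough sketch of that idea (not jscpd's tokenizer; the `splitSfcBlocks` helper, its regular expression, and the default-format table are assumptions for illustration), a Vue single-file component could be split into per-language blocks like this:

```typescript
// Illustrative sketch only; names and mappings are hypothetical.
interface SfcBlock {
  tag: string;     // "template", "script" or "style"
  format: string;  // format the block would be tokenized as
  content: string; // block body without the surrounding tags
}

// Fallback format when a block has no lang="..." attribute (assumed defaults).
const DEFAULT_FORMAT = new Map<string, string>([
  ["template", "markup"],
  ["script", "javascript"],
  ["style", "css"],
]);

function splitSfcBlocks(source: string): SfcBlock[] {
  // Non-greedy match of a top-level block. Nested <template> tags are not
  // handled, which a real parser would need to do.
  const blockRe = /<(template|script|style)\b([^>]*)>([\s\S]*?)<\/\1>/g;
  const blocks: SfcBlock[] = [];
  let m: RegExpExecArray | null;
  while ((m = blockRe.exec(source)) !== null) {
    const tag = m[1];
    const attrs = m[2];
    const content = m[3];
    const lang = /\blang=["']([^"']+)["']/.exec(attrs);
    blocks.push({
      tag,
      format: lang ? lang[1] : (DEFAULT_FORMAT.get(tag) ?? "unknown"),
      content: content.trim(),
    });
  }
  return blocks;
}

const sfc = [
  "<template><p>{{ msg }}</p></template>",
  '<script lang="ts">export default { data: () => ({ msg: "hi" }) };</script>',
  '<style lang="scss">p { color: red; }</style>',
].join("\n");

// Logs the three pairs: template:markup, script:ts, style:scss
console.log(splitSfcBlocks(sfc).map((b) => `${b.tag}:${b.format}`));
```

Once each block carries its own format, a script block inside a `.vue` file can be compared against a plain `.ts` file, which is what makes detection across file types possible.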

New in v5:

  • 24-37x faster detection on real projects (see benchmark)
    • Small codebases (548 files): 34x faster
    • Medium codebases (9K files): 37x faster
    • Large codebases (17K files, 900 MB): 24x faster
  • Git blame with side-by-side author comparison (--blame --reporters console-full)
  • --workers — control parallelism for file tokenization and detection (default: auto, uses all CPU cores; not available in v4)
  • 13 reporters: console, console-full, json, xml, csv, html, markdown, badge, sarif, ai, xcode, threshold, silent
  • AI reporter — token-efficient output for LLM pipelines (~79% fewer tokens than console)
  • Self-contained binary — prebuilt for 6 platforms (macOS arm64/x64, Linux arm64/x64, Windows x64)

Not yet in v5 (use v4 for these):

  • LevelDB/Redis stores (--store leveldb)
  • Node.js programming API (jscpd(), detectClones())

See Rust docs for the full CLI reference and differences from v4.

v4.2.x — TypeScript Engine

  • Custom tokenizer backend — replaced prismjs with own backend built on reprism. ~11.5% faster tokenization on real projects
  • Cross-format detection — Vue SFC, Svelte, Astro, and Markdown tokenized per-block, enabling detection across file types
  • New formats: Apex, CFML/ColdFusion, GDScript, and 70+ additional formats (224 total, up from 152)
  • Shebang detection — auto-detect language for extensionless scripts
  • --store-path — configure LevelDB cache directory for parallel runs
  • --skipComments — shorthand for --mode weak
  • --formats-names — map filenames (e.g. Makefile, Dockerfile) to formats
  • --noTips — suppress tip output in CI
  • Bug fixes: entire-file duplicates silently dropped (#728), ReDoS on Lisp/Elisp files (#737), process crash on malformed package.json (#739), Vue SFC cross-file detection (#737), Vue SFC column numbers (#737), 50 dependency security vulnerabilities
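
Shebang detection can be pictured as reading the interpreter name from the first line of an extensionless script and mapping it to a format. The sketch below is an assumption about how such a lookup might work, not jscpd's code; `formatFromShebang` and the interpreter table are hypothetical.

```typescript
// Illustrative sketch only; the mapping below is an assumed, partial table.
const INTERPRETER_FORMATS = new Map<string, string>([
  ["sh", "bash"],
  ["bash", "bash"],
  ["zsh", "bash"],
  ["node", "javascript"],
  ["python", "python"],
  ["ruby", "ruby"],
  ["perl", "perl"],
]);

// Last path segment, e.g. "/usr/bin/env" -> "env".
const basename = (path: string): string => path.slice(path.lastIndexOf("/") + 1);

function formatFromShebang(source: string): string | undefined {
  const firstLine = source.split(/\r?\n/, 1)[0];
  if (!firstLine.startsWith("#!")) return undefined;

  const parts = firstLine.slice(2).trim().split(/\s+/);
  let program = basename(parts[0]);

  // "#!/usr/bin/env python3": the interpreter is the first non-flag argument.
  if (program === "env") {
    const arg = parts.slice(1).find((p) => !p.startsWith("-"));
    if (arg === undefined) return undefined;
    program = basename(arg);
  }

  // Strip a trailing version suffix: "python3.11" -> "python".
  const name = program.replace(/[\d.]+$/, "");
  return INTERPRETER_FORMATS.get(name);
}

console.log(formatFromShebang("#!/usr/bin/env python3\nprint('hi')")); // python
console.log(formatFromShebang("#!/bin/bash\necho hi"));                // bash
console.log(formatFromShebang("echo hi"));                             // undefined
```

A lookup like this only runs when the file extension gives no answer, so it complements rather than replaces extension-based and filename-based format mapping.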

See TypeScript docs for the full CLI reference.

Packages

| Package | Description |
| --- | --- |
| jscpd | CLI and Node.js API (v4.x) |
| jscpd-server | REST API + MCP server |
| @jscpd/core | Core detection algorithm |
| @jscpd/finder | File detection, reporters |
| @jscpd/tokenizer | Source code tokenization |
| @jscpd/html-reporter | HTML report |
| @jscpd/badge-reporter | SVG badge |
| jscpd-sarif-reporter | SARIF (GitHub Code Scanning) |
| @jscpd/leveldb-store | LevelDB persistent store |
| @jscpd/redis-store | Redis distributed store |
| cpd (Rust engine) | Rust-powered engine (v5.x), also available as jscpd@5 |

Who Uses jscpd

  • GitHub Super Linter — official GitHub linter aggregator, bundles jscpd as its copy/paste detector
  • Codacy — automated code analysis platform, jscpd powers the duplication engine
  • MegaLinter — 100% open-source linter aggregator for CI, integrates jscpd
  • OpenClaw — personal AI assistant for self-hosted devices
  • Natural — NLP library for Node.js, uses jscpd for code quality
  • Nixpkgs — 140K+ package repo for NixOS, packages jscpd

Performance

Benchmarked on macOS (Apple Silicon), 10 runs per target (3 for CopilotKit). v4 ran with --no-gitignore -i "node_modules" to ensure comparable file scanning.

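| Target | Files | Size | jscpd v4 | jscpd v5 | Speedup |
| --- | --- | --- | --- | --- | --- |
| fixtures | 548 | 1.5 MB | 1.03s | 0.03s | 34.3x |
| svelte | 9K | 38 MB | 15.80s | 0.43s | 36.9x |
| CopilotKit | 17K | 159 MB | 82.89s | 3.44s | 24.1x |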
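
The Speedup column is the ratio of v4 to v5 wall-clock time. As a quick check, the sketch below recomputes it from the timings shown in the table (the `Row` type and `speedup` helper are illustrative names, not part of jscpd). Because the displayed times are rounded to two decimals, a recomputed ratio can differ slightly from the published figure, which was presumably derived from the raw measurements.

```typescript
// Recompute speedups from the rounded timings in the benchmark table.
interface Row {
  target: string;
  v4Seconds: number;
  v5Seconds: number;
}

const rows: Row[] = [
  { target: "fixtures", v4Seconds: 1.03, v5Seconds: 0.03 },
  { target: "svelte", v4Seconds: 15.8, v5Seconds: 0.43 },
  { target: "CopilotKit", v4Seconds: 82.89, v5Seconds: 3.44 },
];

// Ratio of v4 time to v5 time, rounded to one decimal place.
function speedup(row: Row): number {
  return Math.round((row.v4Seconds / row.v5Seconds) * 10) / 10;
}

for (const row of rows) {
  console.log(`${row.target}: ${speedup(row)}x`);
}
// fixtures: 34.3x
// svelte: 36.7x
// CopilotKit: 24.1x
```

The svelte row comes out at 36.7x from the rounded inputs versus 36.9x in the table, a gap small enough to be explained by rounding 0.43s.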

See performance-comparison.md for full methodology and raw data.

AI-Ready Features

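jscpd integrates into AI-powered workflows through three mechanisms: the AI reporter, agent skills, and the MCP server (shipped in jscpd-server). The first two are summarized here.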

AI Reporter

Token-efficient output for LLM pipelines (~79% fewer tokens than the default console reporter):

    jscpd --reporters ai /path/to/source   # v4
    cpd --reporters ai /path/to/source     # v5

Agent Skills

Two installable skills that teach AI coding assistants how to use jscpd and refactor detected duplications:

| Skill | Purpose | Install |
| --- | --- | --- |
| jscpd | Tool reference: CLI options, AI reporter format, config syntax | npx skills add kucherenko/jscpd --skill jscpd |
| dry-refactoring | Guided refactoring workflow: read clones, choose strategy, apply, verify | npx skills add kucherenko/jscpd --skill dry-refactoring |

After installation, ask your agent to "find and fix code duplication" and it will invoke jscpd with the right options and act on the results.

See AI-Ready docs for full details.

Contributing

  1. Fork the repo kucherenko/jscpd
  2. Clone your fork (git clone https://github.com/{your-id}/jscpd)
  3. Install dependencies (pnpm install)
  4. Run in dev mode: pnpm dev
  5. Add your changes
  6. Add tests and check: pnpm test
  7. Build: pnpm build
  8. Create PR

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]

License

MIT © Andrey Kucherenko
