ast-grep: AST-based cross-language structural code search, lint and rewrite tool

ast-grep: AST-powered CLI for structural code search, bulk rewrite and YAML-driven linting.

GitHub: ast-grep/ast-grep | Updated: 2025-09-15 | Branch: main | Stars: 10.4K | Forks: 267

Rust, TypeScript, AST search, code rewrite, CLI tool

💡 Deep Analysis

What concrete code-migration and search problems does ast-grep solve, and how does it do that?

Core Analysis

Project Positioning: ast-grep targets structured search and bulk rewriting in large codebases by matching AST nodes instead of raw text. It replaces fragile text-based grep with syntax-aware patterns and provides a templated rewrite engine for codemods.

Technical Features

AST-based pattern matching: Patterns are written as ordinary code in the target language, with metavariables such as $A (a $ followed by an uppercase name) standing in for arbitrary sub-expressions. This is more intuitive than writing raw AST traversal code.
Built-in rewrite engine and YAML linting: Matches can be mapped to replacement templates; YAML rules can be distributed and integrated into CI for automated checks and fixes.
High-performance multi-language parsing: Implemented in Rust and powered by tree-sitter, enabling parallel parsing across large repositories and multiple languages.
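
To make the difference between text search and structural search concrete, here is a minimal sketch. It is an analogy only: it uses Python's standard `ast` module rather than ast-grep or tree-sitter, and the sample source and the function names `text_hits` and `structural_hits` are hypothetical.

```python
import ast
import re

# Hypothetical sample: `eval(` appears in a comment, a string, and one real call.
SOURCE = '''
def load(cfg):
    # never call eval(cfg) on user input
    note = "eval(cfg) is banned"
    return eval(cfg)
'''


def text_hits(source: str, name: str) -> list[int]:
    """Line numbers where `name(` appears anywhere, comments and strings included."""
    pattern = re.compile(rf"\b{re.escape(name)}\(")
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]


def structural_hits(source: str, name: str) -> list[int]:
    """Line numbers of real call expressions whose callee is the bare name `name`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


print(text_hits(SOURCE, "eval"))        # [3, 4, 5]
print(structural_hits(SOURCE, "eval"))  # [5]
```

The text search reports the comment and the string literal as hits; the structural search reports only the actual call, which is the behavior the bullets above attribute to AST-based matching.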

Usage Recommendations

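Start small: Validate patterns and rewrites on a few files or in the online playground before applying them to the full repo. Review the proposed diff first, and write changes only once it looks right (for example with interactive confirmation via --interactive, or in bulk via --update-all).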
Manage rules declaratively: Store reusable checks/fixes as YAML and run them in CI—first as lint warnings, then as auto-fixes.
Migrate incrementally: Break big migrations into small, verifiable steps to minimize risk and ease code review.

Important Notice: AST-aware matching is not type/semantic-aware. For migrations requiring type updates or cross-file reference adjustments, combine ast-grep with compiler/LS tooling.

Summary: ast-grep is well-suited for syntax-sensitive bulk searches and codemods, offering an approachable pattern language and high-performance parsing that make large-scale structural refactoring practical and repeatable.

Why did ast-grep choose Rust + tree-sitter, and what architectural advantages does this choice provide?

Core Analysis

Project Positioning: Choosing Rust + tree-sitter addresses the dual needs of high throughput for large repos and multi-language parsing with a unified pattern API: Rust gives performance and safe concurrency; tree-sitter provides robust grammar-based parsing across languages.

Technical Features

Performance and concurrency: Rust’s zero-cost abstractions and memory safety let the CLI parse and match thousands of files in parallel with predictable resource usage.
Cross-language parsing: tree-sitter offers stable parsers and incremental parsing, enabling reuse of the same pattern language across multiple programming languages.
Native distribution and ecosystem integration: Rust produces native binaries easily packaged for npm/pip/homebrew, simplifying deployment across platforms.

Usage Recommendations

Tune parallelism: On large repos, adjust the thread count (the CLI's --threads option) to balance CPU and IO.
Confirm grammar coverage: Validate tree-sitter grammar for language-specific features (macros/new syntax) in the playground before running wide changes.
Use published binaries: Prefer official packages to avoid environment and build inconsistencies.
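
The reason this kind of tool parallelizes well is that each file can be parsed and matched independently. The sketch below shows only that fan-out shape, using Python's standard library. It is not ast-grep's scheduler, the in-memory `FILES` mapping and the function names are hypothetical, and Python threads will not give real CPU parallelism for parsing because of the global interpreter lock.

```python
import ast
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "repository": file name -> source text.
FILES = {
    "a.py": "import os\nos.system('ls')\n",
    "b.py": "print('hello')\n",
    "c.py": "import os\nos.system('pwd')\nos.system('id')\n",
}


def count_calls(item: tuple[str, str]) -> tuple[str, int]:
    """Parse one file and count `os.system(...)` call sites in it."""
    name, source = item
    tree = ast.parse(source)
    hits = sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "system"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "os"
    )
    return name, hits


def scan(files: dict[str, str], workers: int = 4) -> dict[str, int]:
    """Fan per-file work out to a pool; `workers` is the tunable concurrency knob."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(count_calls, files.items()))


print(scan(FILES))  # {'a.py': 1, 'b.py': 0, 'c.py': 2}
```

Because no file's result depends on another's, changing `workers` changes only throughput, never the matches. That is what makes the thread count safe to tune.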

Important Notice: Performance is not semantic awareness. Rust and tree-sitter improve parsing and throughput but supply no type or control-flow information; compiler tools are still needed for semantic refactors.

Summary: The Rust + tree-sitter choice gives ast-grep a scalable, cross-language parsing foundation that makes large-scale structural search and codemods both efficient and practical.

What practical risks exist when using ast-grep for automatic rewrites (codemod), and how can you reduce mis-modifications or semantic errors?

Core Analysis

Issue Core: ast-grep’s rewrites operate at the syntax level and lack type/control-flow awareness, creating a risk of semantic breakage. Pattern mistakes (too broad or too narrow) and tree-sitter grammar differences can also cause wrong or missed edits.

Technical Analysis

Semantic blind spots: Rewrites cannot safely perform changes that require type updates or cross-file signature adjustments.
Pattern errors: Overuse of wildcards (e.g., $A) can match unintended sites; overly specific patterns can miss equivalent syntactic forms.
Parsing variance: Different tree-sitter grammars/versions may change node names or structures and thus affect matches.
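
The "too broad / too narrow" failure mode can be sketched with a small example. This is an analogy built on Python's standard `ast` module, not ast-grep's pattern engine; the task (flagging `==` comparisons against `None`) and all function names are hypothetical.

```python
import ast

# Hypothetical sample: two forms of the anti-pattern plus two unrelated lines.
SOURCE = "a == None\nNone == b\nc == 0\nd(None)\n"


def is_none(node: ast.AST) -> bool:
    """True for the literal `None`."""
    return isinstance(node, ast.Constant) and node.value is None


def eq_compares(source: str):
    """Yield (lineno, left, right) for simple two-operand `x == y` comparisons."""
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)
        ):
            yield node.lineno, node.left, node.comparators[0]


def too_narrow(source: str) -> list[int]:
    """Only `<expr> == None`: misses the mirrored form `None == <expr>`."""
    return sorted(ln for ln, _, right in eq_compares(source) if is_none(right))


def too_broad(source: str) -> list[int]:
    """Any `==` comparison: also flags `c == 0`, which is unrelated."""
    return sorted(ln for ln, _, _ in eq_compares(source))


def balanced(source: str) -> list[int]:
    """`==` where either operand is the literal `None`."""
    return sorted(
        ln for ln, left, right in eq_compares(source) if is_none(left) or is_none(right)
    )


print(too_narrow(SOURCE))  # [1]
print(too_broad(SOURCE))   # [1, 2, 3]
print(balanced(SOURCE))    # [1, 2]
```

A rewrite driven by the narrow matcher would silently leave line 2 unmigrated, and one driven by the broad matcher would edit line 3 by mistake. Neither produces a syntax error, so only review or tests would catch them.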

Practical Recommendations (risk mitigation)

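Preview first: Check matches and replacements on sample files or in the playground. Review the proposed diff before anything is written, then apply with interactive confirmation (--interactive) or in bulk (--update-all).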
Incremental migrations: Break large migrations into small, reviewable steps focused on one syntactic equivalence class at a time.
Combine with compiler/type checks: Run builds and tests in CI after auto-fixes; for type-sensitive changes, prefer compiler/LS-based refactors or add a type-aware verification step.
Unit-test patterns: Keep small regression examples for each pattern and test them in the playground before wide application.
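
The preview and regression-example recommendations above can be sketched together. This is an illustrative stand-in written with Python's standard `ast` and `difflib` modules, not ast-grep's rewrite engine; the rule (renaming calls of a hypothetical `fetch` to `fetch_v2`) and the helper names are assumptions made for the example.

```python
import ast
import difflib


def rename_calls(source: str, old: str, new: str) -> str:
    """Rename bare-name calls `old(...)` to `new(...)`, leaving all other text untouched.

    Assumes ASCII source, since `ast` column offsets are UTF-8 byte offsets.
    """
    lines = source.splitlines(keepends=True)
    spans = [
        (node.func.lineno, node.func.col_offset, node.func.end_col_offset)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == old
    ]
    # Apply edits right-to-left so earlier offsets on the same line stay valid.
    for lineno, start, end in sorted(spans, reverse=True):
        line = lines[lineno - 1]
        lines[lineno - 1] = line[:start] + new + line[end:]
    return "".join(lines)


def preview(source: str, old: str, new: str, path: str = "example.py") -> str:
    """Return a unified diff of the proposed rewrite; nothing is written to disk."""
    after = rename_calls(source, old, new)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Regression examples: (input, expected output) pairs kept next to the rule.
CASES = [
    ("fetch(url)\n", "fetch_v2(url)\n"),
    ("x = fetch(fetch(u))\n", "x = fetch_v2(fetch_v2(u))\n"),
    ("s = 'fetch(url)'  # fetch(url)\n", "s = 'fetch(url)'  # fetch(url)\n"),
    ("obj.fetch(url)\n", "obj.fetch(url)\n"),
]
for before, expected in CASES:
    assert rename_calls(before, "fetch", "fetch_v2") == expected

print(preview("data = fetch(url)  # keep comment\n", "fetch", "fetch_v2"))
```

The last two cases matter as much as the first two: they record what the rule must not touch (strings, comments, method calls), which is where overly broad patterns usually go wrong.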

Important Notice: For repo-wide updates of references/signatures, ast-grep is best used as a syntactic pre-filter or helper; semantic consistency must be validated with type-aware tools.

Summary: With careful dry-runs, incremental execution, and CI-enforced testing or type checks, ast-grep’s syntax-level codemods can be used safely and efficiently while minimizing semantic regressions.

For engineers unfamiliar with AST, what is the learning curve of using ast-grep, and what best practices improve efficiency?

Core Analysis

Issue Core: The “pattern-as-code” approach of ast-grep reduces abstraction overhead, but crafting robust match/replace rules requires understanding the target language’s AST representation in tree-sitter and the tool’s wildcard/template syntax.

Technical Analysis

Low-to-moderate initial barrier: Simple searches and replacements are easy to learn because patterns resemble real code.
Moderate proficiency needed: Covering syntactic variants, avoiding wildcard overuse, and handling language-specific nodes require inspecting tree-sitter node names and testing in the playground.
Tooling support: The online playground, examples, YAML rules, and the CLI's diff preview and interactive mode provide a fast experiment-and-verify loop.

Practical Recommendations (getting started & speedups)

Prototype in the playground: Validate each pattern with small examples before running on the repo.
Keep example suites: Store input/output samples for common rules to serve as regression tests and onboarding material.
Phase from search to fix: First identify candidates with AST search, then apply replacements in a small scope, and finally codify as YAML rules in CI.
Inspect tree-sitter grammars: When matching surprises occur, consult the language’s tree-sitter grammar to understand node names/structures.
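
When a pattern matches unexpectedly, the usual cause is that the parsed tree is shaped differently from how the code reads. The sketch below shows what "inspecting node names" looks like in practice. It prints Python's own `ast` node types as an analogy; tree-sitter grammars use different node names, and the helper `outline` is hypothetical.

```python
import ast


def outline(source: str) -> list[str]:
    """Return one indented line per AST node, showing only the node type name."""
    rows: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        rows.append("  " * depth + type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1)

    visit(ast.parse(source), 0)
    return rows


# A guard-then-call expression is a boolean operation wrapping a call,
# not a call with a condition attached.
print("\n".join(outline("user and user.login()")))
```

Reading the outline shows which node a pattern has to target (here a boolean operation whose second operand is a call), which is the same kind of answer the playground's tree view or a grammar file gives for tree-sitter.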

Important Notice: Quick wins are available with intuitive patterns, but production use benefits from investing in a ruleset and example-driven validation to lower operational risk.

Summary: ast-grep is approachable for beginners, but safe, reliable production use requires modest AST literacy; using the playground and maintaining example-driven tests significantly shortens the learning curve and increases trust.

In which scenarios is ast-grep most suitable, and what clear limitations should be evaluated?

Core Analysis

Suitable Scenarios: ast-grep excels at syntax-level bulk modifications and detections, such as:

Large codemods: Rewriting certain call patterns to a new API style (e.g., replacing guard-then-call expressions matching $A && $A() with the optional-chaining form $A?.()).
Declarative lint/fix rules: Defining team rules in YAML and running them in CI for detection or auto-fix of simple anti-patterns.
Cross-language structural search: Using tree-sitter’s multi-language parsing to run consistent structural searches across languages.
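
The guard-then-call codemod relies on one metavariable appearing twice: per ast-grep's pattern documentation, a reused metavariable name must match the same code at every occurrence. The sketch below imitates that constraint for the Python analogue `X and X()`. It uses Python's standard `ast` module rather than ast-grep, and `guarded_calls` and the sample source are hypothetical.

```python
import ast

# Hypothetical sample: two true guard-then-call sites and two near misses.
SOURCE = (
    "cb and cb()\n"
    "obj.on_done and obj.on_done()\n"
    "cb and other()\n"
    "cb and cb(1)\n"
)


def guarded_calls(source: str) -> list[str]:
    """Find `X and X()` where both occurrences of X are the same expression."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not (
            isinstance(node, ast.BoolOp)
            and isinstance(node.op, ast.And)
            and len(node.values) == 2
        ):
            continue
        guard, call = node.values
        if (
            isinstance(call, ast.Call)
            and not call.args
            and not call.keywords
            # Structural equality of the guard and the callee plays the role
            # of the repeated metavariable.
            and ast.dump(guard) == ast.dump(call.func)
        ):
            found.append(ast.unparse(node))
    return found


print(guarded_calls(SOURCE))  # ['cb and cb()', 'obj.on_done and obj.on_done()']
```

`cb and other()` is rejected because the two sides differ, and `cb and cb(1)` because the call takes an argument. Without the equality constraint, both would be rewritten incorrectly.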

Clear Limitations

No type/semantic awareness: Cannot safely handle migrations requiring type updates or cross-file signature/reference adjustments.
Depends on tree-sitter grammars: If a grammar lacks coverage or differs across versions, matches and rewrites can be affected.
Macros/compile-time metaprogramming: Limited handling of complex preprocessor macros or generated code, which can lead to mismatches.

Practical Advice

Use ast-grep for syntactic pre-filtering and replacements; hand off semantic-heavy changes to compiler/LS-based tools.
Combine in CI: Run ast-grep lint rules, then gate auto-fixes behind build/test verification.
Validate grammar coverage: Test parsing reliability for critical language features in the playground before wide application.

Important Notice: Treat ast-grep as a syntax-aware efficiency tool rather than a type-aware refactoring engine; it yields big wins in its domain but should be complemented for semantic tasks.

Summary: For tasks that can be expressed as syntax transformations, ast-grep is a performant and scalable choice. For type-sensitive or cross-module refactors, plan to use it alongside semantic tools.

✨ Highlights

Write AST patterns as code examples—intuitive and reusable
Rust implementation offers compiled performance and parallelism
Uses tree-sitter to parse multiple language grammars
Provides an online playground and multiple installation channels
Limited core maintainers; evaluate long-term enterprise support
AST matches are not semantic equivalence; complex refactors need manual review

🔧 Engineering

Structural matching and search powered by tree-sitter AST
Intuitive code-like pattern and rewrite syntax lowers rule authoring cost
YAML-driven lint rules and a jQuery-like API for AST manipulation

⚠️ Risks

Relatively few contributors; releases and issue response may be intermittent
AST-based approach has limited handling for semantic-level changes and macro expansion
Depends on tree-sitter grammar support; some languages or edge grammars may be under-covered

👥 Who is it for?

Library authors and maintainers: for migrations and fixing breaking changes
Tech leads: enforce custom team rules and automated fixes
Security researchers and engineering teams: fast rule authoring and bulk rewrites