Gitingest is a command-line tool and Ruby library that fetches files from GitHub repositories and generates a consolidated text prompt for AI analysis. It provides file filtering, concurrent processing, custom exclusion patterns, token authentication, and automatic rate-limit handling, which makes it suited to building context-rich prompts from codebases for AI assistants, documentation generation, or codebase analysis.

Gitingest

Gitingest is a Ruby gem that fetches files from a GitHub repository and generates a consolidated text prompt, which can be used as input for large language models, documentation generation, or other purposes.

Installation

From RubyGems

gem install gitingest

From Source

git clone https://github.com/davidesantangelo/gitingest.git
cd gitingest
bundle install
bundle exec rake install

Usage

Command Line

# Basic usage (public repository)
gitingest --repository user/repo

# With GitHub token for private repositories
gitingest --repository user/repo --token YOUR_GITHUB_TOKEN

# Specify a custom output file
gitingest --repository user/repo --output my_prompt.txt

# Specify a different branch
gitingest --repository user/repo --branch develop

# Exclude additional patterns
gitingest --repository user/repo --exclude "*.md,docs/"

# Control the number of threads
gitingest --repository user/repo -T 4

# Set thread pool shutdown timeout
gitingest --repository user/repo -W 120

# Combine threading options
gitingest --repository user/repo -T 8 -W 90

# Quiet mode
gitingest --repository user/repo --quiet

# Verbose mode
gitingest --repository user/repo --verbose

Available Options

  • -r, --repository REPO: GitHub repository (username/repo) [Required]
  • -t, --token TOKEN: GitHub personal access token [Optional but recommended]
  • -o, --output FILE: Output file for the prompt [Default: reponame_prompt.txt]
  • -e, --exclude PATTERN: File patterns to exclude (comma separated)
  • -b, --branch BRANCH: Repository branch [Default: main]
  • -T, --threads COUNT: Number of concurrent threads [Default: auto-detected]
  • -W, --thread-timeout SECONDS: Thread pool shutdown timeout [Default: 60]
  • -q, --quiet: Reduce logging to errors only
  • -v, --verbose: Increase logging verbosity
  • -h, --help: Show help message
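
The flags above line up closely with the keyword arguments shown in the library section. As a rough illustration of that mapping, the sketch below parses the same flags with Ruby's standard `OptionParser`. It is not the gem's actual CLI code: the `parse_options` helper and the choice of hash keys are assumptions made for this example.

```ruby
require "optparse"

# Hypothetical helper: turns CLI-style arguments into a hash whose keys
# mirror the Generator keyword arguments documented in this README.
def parse_options(argv)
  options = {}
  OptionParser.new do |opts|
    opts.on("-r", "--repository REPO") { |v| options[:repository] = v }
    opts.on("-t", "--token TOKEN") { |v| options[:token] = v }
    opts.on("-o", "--output FILE") { |v| options[:output_file] = v }
    # The Array acceptor splits a comma-separated value into a list.
    opts.on("-e", "--exclude PATTERN", Array) { |v| options[:exclude] = v }
    opts.on("-b", "--branch BRANCH") { |v| options[:branch] = v }
    opts.on("-T", "--threads COUNT", Integer) { |v| options[:threads] = v }
    opts.on("-W", "--thread-timeout SECONDS", Integer) { |v| options[:thread_timeout] = v }
    opts.on("-q", "--quiet") { options[:quiet] = true }
    opts.on("-v", "--verbose") { options[:verbose] = true }
  end.parse!(argv.dup) # work on a copy so the caller's array is untouched
  options
end

options = parse_options(%w[--repository user/repo --exclude *.md,docs/ -T 8 -W 90])
puts options.inspect
```

With the gem installed, a hash like this could presumably be passed on as `Gitingest::Generator.new(**options)`, since the keys match the documented keyword arguments.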

As a Library

require "gitingest"

# Basic usage - write to a file
generator = Gitingest::Generator.new(
  repository: "user/repo",
  token: "YOUR_GITHUB_TOKEN" # optional
)

# Run the full workflow (fetch repository and generate file)
generator.run

# OR generate file only (if you need the output path)
output_path = generator.generate_file

# Get content as a string (for in-memory processing)
content = generator.generate_prompt

# With custom options
generator = Gitingest::Generator.new(
  repository: "user/repo",
  token: "YOUR_GITHUB_TOKEN",
  output_file: "my_prompt.txt",
  branch: "develop",
  exclude: ["*.md", "docs/"], 
  threads: 4,              # control concurrency
  thread_timeout: 120,     # custom thread timeout
  quiet: true              # or verbose: true
)

# With custom logger (Logger comes from Ruby's standard library)
require "logger"
custom_logger = Logger.new("gitingest.log")
generator = Gitingest::Generator.new(
  repository: "user/repo",
  logger: custom_logger
)

Features

  • Fetches files from the specified branch of a GitHub repository
  • Automatically excludes common binary files and system files by default
  • Allows custom exclusion patterns for specific file extensions or directories
  • Uses concurrent processing for faster downloads
  • Handles GitHub API rate limiting with automatic retry
  • Generates a clean, formatted output file with file paths and content

Default Exclusion Patterns

By default, the generator excludes files and directories commonly ignored in repositories, such as:

  • Version control files (.git/, .svn/)
  • System files (.DS_Store, Thumbs.db)
  • Log and backup files (*.log, *.bak)
  • Images and media files (*.png, *.jpg, *.mp3)
  • Archives (*.zip, *.tar.gz)
  • Dependency directories (node_modules/, vendor/)
  • Compiled and binary files (*.pyc, *.class, *.exe)
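
As an illustration of how patterns like these can be applied, the sketch below filters a list of paths with `File.fnmatch`. The matching rules are assumptions made for this example (directory patterns end in "/", other patterns are shell globs matched against the basename), and the pattern list contains only the examples named above, not the gem's full default list.

```ruby
# Subset of patterns taken from the examples listed in this README.
EXAMPLE_EXCLUDES = [
  ".git/", ".svn/", ".DS_Store", "Thumbs.db",
  "*.log", "*.bak", "*.png", "*.jpg", "*.mp3",
  "*.zip", "*.tar.gz", "node_modules/", "vendor/",
  "*.pyc", "*.class", "*.exe"
].freeze

# Returns true when `path` matches any pattern.
# Patterns ending in "/" match a directory name anywhere in the path;
# all other patterns are shell globs matched against the file's basename.
def excluded?(path, patterns = EXAMPLE_EXCLUDES)
  directories = path.split("/")[0...-1]
  basename = File.basename(path)
  patterns.any? do |pattern|
    if pattern.end_with?("/")
      directories.include?(pattern.chomp("/"))
    else
      File.fnmatch(pattern, basename, File::FNM_DOTMATCH)
    end
  end
end

paths = [
  "lib/gitingest.rb",
  "node_modules/left-pad/index.js",
  "logo.png",
  "dist/app.tar.gz",
  "README.md"
]
kept = paths.reject { |path| excluded?(path) }
puts kept.inspect
```

Custom patterns such as `"*.md"` or `"docs/"` can be tried the same way by passing `EXAMPLE_EXCLUDES + ["*.md", "docs/"]` as the second argument.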

Limitations

  • To prevent memory overload, only the first 1000 files are processed
  • API requests are subject to GitHub rate limits (60 requests/hour without a token, 5000 requests/hour with a token)
  • Private repositories require a GitHub personal access token
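
The limits above can be combined into a quick back-of-the-envelope check. The sketch below assumes, purely for illustration, that each processed file costs one API request; the gem's real request pattern may differ, and `plan` is a hypothetical helper rather than part of the library.

```ruby
MAX_FILES = 1000 # file cap stated above
RATE_LIMITS = { anonymous: 60, authenticated: 5000 }.freeze # requests per hour

# Estimates how many files would be processed and whether the run fits
# within one hour of rate limit, ASSUMING one API request per file.
def plan(file_count, authenticated:)
  processed = [file_count, MAX_FILES].min
  limit = RATE_LIMITS[authenticated ? :authenticated : :anonymous]
  {
    processed: processed,
    skipped: file_count - processed,
    fits_in_one_hour: processed <= limit
  }
end

puts plan(250, authenticated: false).inspect
puts plan(1500, authenticated: true).inspect
```

Under that assumption, a 250-file repository already exceeds the unauthenticated hourly budget, which is one reason a token is recommended even for public repositories.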

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/davidesantangelo/gitingest.

Acknowledgements

Inspired by cyclotruc/gitingest.

License

The gem is available as open source under the terms of the MIT License.