Welcome to Crawlable!
If you’ve ever tried to feed an entire project to an AI model (like ChatGPT, Claude, or Gemini) to ask for a refactor or bug fix, you know the struggle. You hit token limits, you accidentally upload node_modules or .git folders, and the AI loses context.
Crawlable solves this. It is a fast CLI tool that scans your project directory, uses Google’s Gemini AI to identify which files are noise, and asynchronously extracts only the core, proprietary source code. It outputs a human-readable (and LLM-readable) text artifact that you can drop straight into any AI assistant.
Whether you are a seasoned software architect or a non-technical manager trying to understand a codebase, Crawlable does the heavy lifting for you.
Built on asyncio, it reads hundreds of files concurrently for blazing-fast performance.
Truncates bulky dependency folders (like node_modules or venv) in the project roadmap so you don’t waste valuable AI tokens.
A terminal interface built with the rich library, featuring live spinning progress bars, dynamic tables, and real-time status updates.
Saves each run to a timestamped folder (/crawlable_output/Project_YYYY-MM-DD_HH-MM/), keeping your workspace clean.
Don’t have a Computer Science degree? Never used the terminal before? No problem. Follow these 4 easy steps to get Crawlable running on your machine.
You need Python to run this tool.
Crawlable uses Google’s AI brain, which requires a “key” to access.
Open your computer’s terminal (Command Prompt on Windows, Terminal on Mac) and run these commands one by one:
# 1. Download the code to your computer
git clone https://github.com/Med-Gh-TN/Crawlable.git
# 2. Go into the project folder
cd Crawlable
# 3. Install the required packages
pip install -r requirements.txt
Find src/config.py and open it with any text editor (Notepad, VS Code, TextEdit).
Locate the line API_KEY = "YOUR_API_KEY_HERE".
Replace YOUR_API_KEY_HERE with the key you got from Google in Step 2. (Keep the quotation marks!)
Using Crawlable is simple. Open your terminal and run the main script, followed by the path of the folder you want to analyze.
Command:
python main.py /path/to/your/target/project
(Tip: You can literally drag and drop a folder from your desktop into the terminal window to automatically paste its path!)
Expected Output:
The terminal will display a gorgeous dashboard as it works through the 4 phases. Once complete, look inside the crawlable_output/ folder. You will find:
project_roadmap.txt: A clean, tree-like map of the project.
source_code.txt: The consolidated, purely filtered code ready to be fed to an AI.
prompt.txt: A base prompt template you can use to start your AI conversation.
For the technical folks, Crawlable operates on a highly decoupled, service-oriented architecture:
AsyncFileSystemService: Generates a pre-filtered map of the directory, immediately applying HARDCODED_EXCLUSIONS (entries that are always noise) and the Smart Truncation algorithm.
GeminiFilterService: Passes the roadmap to gemini-2.5-flash with a strict JSON schema prompt to dynamically identify project-specific noise. Includes exponential backoff and retry logic for API resilience.
AsyncCodeExtractorService: Fires off non-blocking asyncio.gather tasks to read all approved files concurrently, updating the rich UI progress bar in real time.
CrawlablePipeline: Saves versioned artifacts to the file system.
We want Crawlable to be the standard for AI code extraction. Contributions from developers of all skill levels are welcome!
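For readers who want a feel for the extraction phases in the architecture overview, here is a minimal sketch in Python. It is not Crawlable's actual code. The names EXCLUDED_DIRS, list_candidate_files, with_backoff, read_file, and extract are hypothetical, and the exclusion set is only an assumed stand-in for HARDCODED_EXCLUSIONS. It illustrates three ideas: skipping known-noise folders, retrying a flaky call with exponential backoff, and reading approved files concurrently with asyncio.gather.

```python
import asyncio
from pathlib import Path

# Assumed stand-in for HARDCODED_EXCLUSIONS; the real list is not shown in this README.
EXCLUDED_DIRS = {"node_modules", ".git", "venv", "__pycache__"}


def list_candidate_files(root: Path) -> list[Path]:
    """Walk root and skip every file that sits inside an excluded directory."""
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and not EXCLUDED_DIRS.intersection(p.relative_to(root).parts[:-1])
    )


async def with_backoff(call, retries: int = 4, base_delay: float = 0.5):
    """Retry an async callable, doubling the wait after each failure."""
    for attempt in range(retries):
        try:
            return await call()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)


async def read_file(path: Path) -> tuple[Path, str]:
    # asyncio.to_thread keeps blocking disk I/O off the event loop.
    text = await asyncio.to_thread(
        path.read_text, encoding="utf-8", errors="replace"
    )
    return path, text


async def extract(paths: list[Path]) -> dict[Path, str]:
    """Read all approved files concurrently with asyncio.gather."""
    results = await asyncio.gather(*(read_file(p) for p in paths))
    return dict(results)
```

In a real run, the AI filtering call would be wrapped in something like with_backoff, and extract would only receive the files that survived filtering. Calling asyncio.run(extract(list_candidate_files(Path("."))) is one way to try the sketch on the current folder.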
How to contribute:
Create your feature branch (git checkout -b feature/AmazingFeature).
Commit your changes (git commit -m 'Add some AmazingFeature').
Push to the branch (git push origin feature/AmazingFeature).
Note: If you find a bug or have a feature request, please open an Issue first so we can discuss it!
Distributed under the Apache 2.0 License. See LICENSE for more information. This grants you the freedom to use, modify, and distribute the software, even commercially, under the terms of the license.
Mouhamed Gharsallah - GitHub: @Med-Gh-TN
Built with ❤️ for the open-source community. Let’s make AI collaboration seamless.