-
surya Public
Forked from datalab-to/surya
OCR, layout analysis, reading order, table recognition in 90+ languages
Python GNU General Public License v3.0 Updated Jul 11, 2025 -
all-rag-techniques Public
Forked from hemmydev/all-rag-techniques
Implementation of all RAG techniques in a simpler way
Jupyter Notebook MIT License Updated Jun 10, 2025 -
llm-course Public
Forked from mlabonne/llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Apache License 2.0 Updated Jun 4, 2025 -
crawl4ai Public
Forked from unclecode/crawl4ai
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Python Apache License 2.0 Updated Apr 6, 2025 -
jtokkit Public
Forked from knuddelsgmbh/jtokkit
JTokkit is a Java tokenizer library designed for use with OpenAI models.
Java MIT License Updated Feb 14, 2025 -
flexmark-java Public
Forked from vsch/flexmark-java
CommonMark/Markdown Java parser with source level AST. CommonMark 0.28, emulation of: pegdown, kramdown, markdown.pl, MultiMarkdown. With HTML to MD, MD to PDF, MD to DOCX conversion modules.
Java BSD 2-Clause "Simplified" License Updated Oct 3, 2024 -
gpt-crawler Public
Forked from BuilderIO/gpt-crawler
Crawl a site to generate knowledge files to create your own custom GPT from a URL
TypeScript ISC License Updated Aug 9, 2024 -
domains Public
Forked from tb0hdan/domains
World’s single largest Internet domains dataset
HTML BSD 3-Clause "New" or "Revised" License Updated Jul 26, 2024 -
rag-from-scratch Public
Forked from langchain-ai/rag-from-scratch
Jupyter Notebook Updated Jul 9, 2024 -
EasySpider Public
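Forked from NaiboWang/EasySpider
A visual no-code/code-free web crawler/spider. EasySpider (易采集): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.
JavaScript Other Updated Jun 19, 2024 -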
awesome-claude-prompts Public
Forked from langgptai/awesome-claude-prompts
This repo includes Claude prompt curation to use Claude better.
Updated Apr 21, 2024 -
LiveLessons Public
Forked from douglascraigschmidt/LiveLessons
This repository contains all the source code examples from my LiveLessons course on "Java Concurrent Programming" and my various LiveTraining courses, as described at http://www.dre.vanderbilt.edu/…
Java Updated Apr 3, 2024 -
google-maps-reviews-scraper Public
Forked from omkarcloud/google-maps-reviews-scraper
✨ Effortlessly extract valuable insights from Google Maps reviews with our Google Maps Reviews Scraper. Uncover customer sentiments, trends, and more! ✨
MIT License Updated Mar 18, 2024 -
Clean-Code---Tieng-Viet Public
Forked from quoctinnguyen8/Clean-Code---Tieng-Viet
Clean Code in Vietnamese: a translation of the first 6 chapters of the book "Clean Code - A Handbook of Agile Software Craftsmanship" by Robert C. Martin et al.
Updated Dec 9, 2023 -
Vietnamese_LLMs Public
Forked from VietnamAIHub/Vietnamese_LLMs
The project includes: 1. Building a Vietnamese instructions dataset (high-quality, large, and diverse). 2. LLM training, fine-tuning, evaluating, and testing on open-source language models: Bloomz, T5, UL2, LLaMA (…
Python Apache License 2.0 Updated Oct 22, 2023 -
python-cheatsheet Public
Forked from gto76/python-cheatsheet
Comprehensive Python Cheatsheet
Python Updated Sep 8, 2023 -
news-crawler Public
Forked from pmphan/news-crawler
News crawlers for some sites.
Python Updated Sep 6, 2023 -
fileshelter Public
Forked from epoupon/fileshelter
FileShelter is a “one-click” file sharing web application
C++ GNU General Public License v3.0 Updated Jul 25, 2023 -
text2vec Public
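Forked from shibing624/text2vec
text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.
Python Apache License 2.0 Updated Jun 28, 2023 -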
WebCollector Public
Forked from CrawlScript/WebCollector
WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.
Java GNU General Public License v3.0 Updated Jun 3, 2023 -
numpy_exercises Public
Forked from Kyubyong/numpy_exercises
Numpy exercises.
Python MIT License Updated May 21, 2023 -
pandas_exercises Public
Forked from guipsamora/pandas_exercises
Practice your pandas skills!
Jupyter Notebook BSD 3-Clause "New" or "Revised" License Updated May 9, 2023 -
ccia_code_samples Public
Forked from anthonywilliams/ccia_code_samples
Code samples for C++ Concurrency in Action
C++ Updated Apr 23, 2023 -
software-architecture-books Public
Forked from mhadidg/software-architecture-books
A comprehensive list of books on Software Architecture.
Updated Mar 15, 2023 -
tiktok-scraper Public
TikTok scraper (main hashtag)
JavaScript MIT License Updated Mar 7, 2023 -
nlp-cheat-sheet-python Public
Forked from janlukasschroeder/nlp-cheat-sheet-python
NLP Cheat Sheet, Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
Jupyter Notebook Updated Feb 11, 2023 -
easy-rules Public
Forked from j-easy/easy-rules
The simple, stupid rules engine for Java
Java MIT License Updated Jan 11, 2023 -
awesome-cto Public
Forked from kuchin/awesome-cto
A curated and opinionated list of resources for Chief Technology Officers, with the emphasis on startups
Creative Commons Zero v1.0 Universal Updated Nov 30, 2022 -
tiktok-downloader-bot Public
Forked from sero01000/tiktok-downloader-bot
A Telegram bot to download videos or images from TikTok without watermark
Python MIT License Updated Nov 1, 2022 -
Low-Level-Design Public
Forked from coding-parrot/Low-Level-Design
Useful Resources for Low Level System Design
Java Updated Oct 9, 2022