
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

Python 8,497 642 Updated Sep 30, 2025

This is a repo with links to everything you'd ever want to learn about data engineering

Jupyter Notebook 38,003 7,294 Updated Sep 25, 2025

VIINA: Violent Incident Information from News Articles on the 2022 Russian Invasion of Ukraine

317 26 Updated Sep 29, 2025

A secure authentication module to manage user access in a Streamlit application.

Python 2,025 290 Updated Sep 1, 2025
Jupyter Notebook 8,255 1,580 Updated Sep 22, 2024

A library for prompt engineering and optimization (SAMMO = Structure-aware Multi-Objective Metaprompt Optimization)

Python 729 42 Updated Jun 23, 2025

Streamlit Annotation Tools is a Streamlit component that gives you access to various annotation tools (labeling, highlighting, etc.) for text data.

TypeScript 94 9 Updated Dec 28, 2023

[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

Python 206 12 Updated Oct 8, 2024

OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)

Python 785 115 Updated Jul 25, 2024

TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24)

Python 348 59 Updated Mar 15, 2025

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 44,506 6,410 Updated Sep 29, 2025

VectorHub is a free, open-source learning website for people (software developers to senior ML architects) interested in adding vector retrieval to their ML stack.

Jupyter Notebook 495 125 Updated Sep 29, 2025

Natural Language Processing toolbox

Python 14 10 Updated Sep 22, 2022
Python 5 Updated Jun 5, 2024

DataComp for Language Models

HTML 1,367 125 Updated Sep 9, 2025

The n-gram Language Model

C 1,444 104 Updated Aug 5, 2024

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…

Python 2,746 301 Updated Jun 24, 2024

Repository for paper Decrypting Cryptic Crosswords

Python 9 3 Updated Jan 15, 2022

The Official Repository of the Cryptonite Dataset

Python 21 2 Updated Feb 19, 2022

ELT pipeline to move CAD data from Fusion360 to Neo4j graph database. This is a rewrite of the version used for my PhD.

Python 3 1 Updated Oct 31, 2024

Feature-rich codebase for pretrained language models research in PyTorch and 🤗

Python 2 Updated Aug 7, 2023

21 Lessons, Get Started Building with Generative AI

Jupyter Notebook 99,405 52,320 Updated Sep 29, 2025
Python 1 1 Updated Apr 10, 2024
Jupyter Notebook 20 7 Updated Mar 1, 2023

A simple, effective calorie counter web app

HTML 99 9 Updated May 30, 2024

🧱 Modula software package

Python 271 21 Updated Aug 18, 2025

Implements Inside-Out SMC^2, a nested sequential Monte Carlo algorithm developed for Bayesian experimental design in dynamical systems.

Julia 6 Updated Aug 19, 2024

Code implementing Integrator Snippets, joint work with Christophe Andrieu and Chang Zhang

Python 3 1 Updated Oct 19, 2024

An efficient, massively parallel probabilistic programming language

Python 9 Updated May 22, 2024