A Docker-powered microservice for intelligent PDF document layout analysis, OCR, and content extraction. This tool provides advanced PDF layout analysis with VGT and LightGBM models, supporting 150...
BentoPDF is a privacy-first, open-source PDF toolkit containing 100+ professional-grade tools for document manipulation—all running entirely in your browser. Designed to replace expensive deskto...
A step-by-step low-level PDF writer for Rust.
Privacy-first, fully client-side PDF toolkit that enables editing, merging, annotation, and Office ↔ PDF conversions directly in the browser. Includes offline cross-platform support for Mac, Wind...
Linux-Intelligent-OCR-Solution (Lios): Lios is a tool for converting printed text into digital text using either a scanner or a camera. It can also extract text from scanned images sourced from PDF files,...
officeParser is a strictly-typed, zero-dependency-core library for Node.js and the browser that transforms complex office documents (DOCX, PDF, XLSX, etc.) into a unified, hierarchical Abstract Syn...
Open data is the most crucial missing link to create truly open-source AI. Without sufficient high-quality open data, open-source AI cannot be competitive with closed-source AI models. We are crea...
MIT-licensed Python framework providing the semantic intelligence layer that AI applications are missing — bridging raw text and trustworthy, explainable AI. Problem: Modern RAG pipelines and ag...
Converts SVG files to PDF.
An AI-powered web scraping and automation engine supporting browser automation, PDF extraction, OCR, database integrations, and workflow generation via a FastAPI backend.