OCR expert VLM powered by Hunyuan's native multimodal architecture
PDF to Markdown with vision models
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Leveraging BERT and c-TF-IDF to create easily interpretable topics
This repository contains the official implementation of FastVLM
Qwen2.5-VL is the multimodal large language model series
Tensor search for humans
Large-language-model & vision-language-model based on Linear Attention
Connect any LLM to your internal knowledge sources
Windows application to search multiple PDFs and chat with them
Generate 3D objects conditioned on text or images
Free, portable desktop Computer-Assisted Translation (CAT) tool
FaceOnLive Open KYC: Streamlining Identity Verification with AI
LangChain Apps in Production with Jina & FastAPI
Open-source framework that gives you AI Agents
AI-powered image classification for nudity and documents / ID cards
A platform for building vector-based applications
Retro Games in Gym
Code for "Improving Language Understanding by Generative Pre-Training"
Convert Scanned PDFs to Selectable Text with OCR