Showing 24 open source projects for "php-simple-html-dom-parser"

View related business solutions
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    MegaParse

    MegaParse

    File Parser optimised for LLM Ingestion with no loss

    MegaParse is a file parser optimized for Large Language Model (LLM) ingestion, ensuring no loss of information. It efficiently parses various document formats, such as PDFs, DOCX, and PPTX, converting them into formats ideal for processing by LLMs. This tool is essential for applications that require accurate and comprehensive data extraction from diverse document types.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    GPT4All

    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. ...
    Downloads: 158 This Week
    Last Update:
    See Project
  • 3
    langrocks

    langrocks

    Tools like web browser, computer access and code runner for LLMs

    Langrocks is a programming language experimentation toolkit that enables developers to create, test, and optimize custom programming languages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    llm.c

    llm.c

    LLM training in simple, raw C/CUDA

    ...By stripping away heavy frameworks, it exposes the core math and memory flows of embeddings, attention, and feed-forward layers. The code illustrates how to wire forward passes, losses, and simple training or inference loops with direct control over arrays and buffers. Its compact design makes it easy to trace execution, profile hotspots, and understand the cost of each operation. Portability is a goal: it aims to compile with common toolchains and run on modest hardware for small experiments. Rather than delivering a production-grade stack, it serves as a reference and learning scaffold for people who want to “see the metal” behind LLMs.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Create and run cloud-based virtual machines. Icon
    Create and run cloud-based virtual machines.

    Secure and customizable compute service that lets you create and run virtual machines.

    Computing infrastructure in predefined or custom machine sizes to accelerate your cloud transformation. General purpose (E2, N1, N2, N2D) machines provide a good balance of price and performance. Compute optimized (C2) machines offer high-end vCPU performance for compute-intensive workloads. Memory optimized (M2) machines offer the highest memory and are great for in-memory databases. Accelerator optimized (A2) machines are based on the A100 GPU, for very demanding applications.
    Try for free
  • 5
    Qwen3-Coder

    Qwen3-Coder

    Qwen3-Coder is the code version of Qwen3

    Qwen3-Coder is the latest and most powerful agentic code model developed by the Qwen team at Alibaba Cloud. Its flagship version, Qwen3-Coder-480B-A35B-Instruct, features a massive 480 billion-parameter Mixture-of-Experts architecture with 35 billion active parameters, delivering top-tier performance on coding and agentic tasks. This model sets new state-of-the-art benchmarks among open models for agentic coding, browser-use, and tool-use, matching performance comparable to leading models...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    LangCheck

    LangCheck

    Simple, Pythonic building blocks to evaluate LLM applications

    Simple, Pythonic building blocks to evaluate LLM applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    OSS-Fuzz Gen

    OSS-Fuzz Gen

    LLM powered fuzzing via OSS-Fuzz

    OSS-Fuzz-Gen is a companion project that helps automatically create or improve fuzz targets for open-source codebases, aiming to increase coverage in OSS-Fuzz with minimal maintainer effort. It analyses a library’s APIs, examples, and tests to propose harnesses that exercise parsers, decoders, or protocol handlers—precisely the code where fuzzing pays off. The system integrates with modern LLM-assisted workflows to draft harness code and then iterates based on build errors or low coverage...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Unstructured.IO

    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. unstructured modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and is efficient in transforming unstructured data into structured outputs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    DB-GPT

    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
    Downloads: 0 This Week
    Last Update:
    See Project
  • G-P - Global EOR Solution Icon
    G-P - Global EOR Solution

    Companies searching for an Employer of Record solution to mitigate risk and manage compliance, taxes, benefits, and payroll anywhere in the world

    With G-P's industry-leading Employer of Record (EOR) and Contractor solutions, you can hire, onboard and manage teams in 180+ countries — quickly and compliantly — without setting up entities.
    Learn More
  • 10
    Controllable-RAG-Agent

    Controllable-RAG-Agent

    This repository provides an advanced RAG

    Controllable-RAG-Agent is an advanced Retrieval-Augmented Generation (RAG) system designed specifically for complex, multi-step question answering over your own documents. Instead of relying solely on simple semantic search, it builds a deterministic control graph that acts as the “brain” of the agent, orchestrating planning, retrieval, reasoning, and verification across many steps. The pipeline ingests PDFs, splits them into chapters, cleans and preprocesses text, then constructs vector stores for fine-grained chunks, chapter summaries, and book quotes to support nuanced queries. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 11
    Deep Lake

    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos. Deep Lake is used by Google, Waymo, Red Cross, Omdena, Yale, & Oxford. Use one API to upload, download, and stream datasets to/from AWS S3/S3-compatible storage, GCP, Activeloop cloud, or local storage. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    AReal

    AReal

    Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible

    AReaL is an open source, fully asynchronous reinforcement learning training system. AReal is designed for large reasoning and agentic models. It works with models that perform reasoning over multiple steps, agents interacting with environments. It is developed by the AReaL Team at Ant Group (inclusionAI) and builds upon the ReaLHF project. Release of training details, datasets, and models for reproducibility. It is intended to facilitate reproducible RL training on reasoning / agentic tasks,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    LlamaIndex

    LlamaIndex

    Central interface to connect your LLM's with external data

    LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLM's with external data. LlamaIndex is a simple, flexible interface between your external data and LLMs. It provides the following tools in an easy-to-use fashion. Provides indices over your unstructured and structured data for use with LLM's. These indices help to abstract away common boilerplate and pain points for in-context learning. Dealing with prompt limitations (e.g. 4096 tokens for Davinci) when the context is too big. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    BertViz

    BertViz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    BertViz is an interactive tool for visualizing attention in Transformer language models such as BERT, GPT2, or T5. It can be run inside a Jupyter or Colab notebook through a simple Python API that supports most Huggingface models. BertViz extends the Tensor2Tensor visualization tool by Llion Jones, providing multiple views that each offer a unique lens into the attention mechanism. The head view visualizes attention for one or more attention heads in the same layer. It is based on the excellent Tensor2Tensor visualization tool. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Kaleidoscope-SDK

    Kaleidoscope-SDK

    User toolkit for analyzing and interfacing with Large Language Models

    kaleidoscope-sdk is a Python module used to interact with large language models hosted via the Kaleidoscope service available at: https://github.com/VectorInstitute/kaleidoscope. It provides a simple interface to launch LLMs on an HPC cluster, asking them to perform basic features like text generation, but also retrieve intermediate information from inside the model, such as log probabilities and activations. Users must authenticate using their Vector Institute cluster credentials. This can be done interactively instantiating a client object. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Bard API

    Bard API

    The unofficial python package that returns response of Google Bard

    The Python package returns a response of Google Bard through the value of the cookie. This package is designed for application to the Python package ExceptNotifier and Co-Coder. Please note that the bardapi is not a free service, but rather a tool provided to assist developers with testing certain functionalities due to the delayed development and release of Google Bard's API. It has been designed with a lightweight structure that can easily adapt to the emergence of an official API....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Gemma

    Gemma

    Gemma open-weight LLM library, from Google DeepMind

    Gemma, developed by Google DeepMind, is a family of open-weights large language models (LLMs) built upon the research and technology behind Gemini. This repository provides the official implementation of the Gemma PyPI package, a JAX-based library that enables users to load, interact with, and fine-tune Gemma models. The framework supports both text and multi-modal input, allowing natural language conversations that incorporate visual content such as images. It includes APIs for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    MedicalGPT

    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT training medical GPT model with ChatGPT training pipeline, implementation of Pretraining, Supervised Finetuning, Reward Modeling and Reinforcement Learning. MedicalGPT trains large medical models, including secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning training.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Doctor Dignity

    Doctor Dignity

    Doctor Dignity is an LLM that can pass the US Medical Licensing Exam

    Doctor Dignity is a prototype project exploring how AI-assisted tooling might support compassionate, accessible health guidance for people who struggle to get timely care. The repository centers on a simple end-to-end pipeline—intake of user-reported symptoms, basic triage logic, and clear, supportive messaging—intended to demonstrate how such systems could be built. It emphasizes a humane UX: plain-language prompts, de-jargonized outputs, and guardrails that nudge users toward professional care when needed. The code is designed to be hackable rather than production-grade, giving learners a chance to experiment with NLP flows and lightweight back-end components. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Qwen2.5-Coder

    Qwen2.5-Coder

    Qwen2.5-Coder is the code version of Qwen2.5, the large language model

    ...The model supports over 92 programming languages and offers exceptional performance in generating code, debugging, and mathematical problem-solving. Qwen2.5-Coder, with its long context length of 128K tokens, is ideal for a variety of use cases, from simple code assistants to complex programming scenarios, matching the capabilities of models like GPT-4o.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 21
    LangChain Apps on Production with Jina

    LangChain Apps on Production with Jina

    Langchain Apps on Production with Jina & FastAPI

    Jina is an open-source framework for building scalable multi-modal AI apps on Production. LangChain is another open-source framework for building applications powered by LLMs. long-chain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    LM Human Preferences

    LM Human Preferences

    Code for the paper Fine-Tuning Language Models from Human Preferences

    ...It was tested on the smallest GPT-2 (124M parameters) under a specific environment (TensorFlow 1.x, specific CUDA / cuDNN combinations). It includes utilities for launching experiments, sampling from policies, and simple experiment orchestration.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Emb-GAM

    Emb-GAM

    An interpretable and efficient predictor using pre-trained models

    Deep learning models have achieved impressive prediction performance but often sacrifice interpretability, a critical consideration in high-stakes domains such as healthcare or policymaking. In contrast, generalized additive models (GAMs) can maintain interpretability but often suffer from poor prediction performance due to their inability to effectively capture feature interactions. In this work, we aim to bridge this gap by using pre-trained neural language models to extract embeddings for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    GPT Neo

    GPT Neo

    An implementation of model parallel GPT-2 and GPT-3-style models

    An implementation of model & data parallel GPT3-like models using the mesh-tensorflow library. If you're just here to play with our pre-trained models, we strongly recommend you try out the HuggingFace Transformer integration. Training and inference is officially supported on TPU and should work on GPU as well. This repository will be (mostly) archived as we move focus to our GPU-specific repo, GPT-NeoX. NB, while neo can technically run a training step at 200B+ parameters, it is very...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next