Showing 3 open source projects for "learning language"

  • 1
    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library that automatically extracts the main content from an HTML document. It strips away navigation bars, ads, footers, and other boilerplate, leaving the body text along with metadata (title, author, date) and other useful fields. It is aimed at content analysis, web scraping, dataset building, and repurposing article text for downstream processing such as machine learning or summarization. The API is simple: you pass in raw HTML and it returns a structured object containing the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. Language support is best for English, and the repository notes that languages such as Chinese, Arabic, and Korean may not be well supported; even so, it is widely used in web-content-processing pipelines. ...
    Downloads: 0 This Week
  • 2
    WikiSQL

    A large annotated semantic parsing corpus for developing NL interfaces

    WikiSQL is a large crowd-sourced dataset for developing natural language interfaces for relational databases. It was released along with our work Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning. A note on tokenization: when WikiSQL was written three years ago, it relied on Stanza, a CoreNLP Python wrapper that has since been deprecated.
    Downloads: 1 This Week
  • 3
    Learning JavaScript Design Patterns

    Repo for my 'Learning JavaScript Design Patterns' book

    Learning JavaScript Design Patterns is released under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported license. It is available for purchase via O'Reilly Media but will remain available both free online and as a physical (or eBook) purchase for readers wishing to support the project. This edition was updated to ES2015+ syntax in 2021.
    Downloads: 0 This Week