Showing 33 open source projects for "document"

View related business solutions
  • Our Free Plans just got better! | Auth0 by Okta Icon
    Our Free Plans just got better! | Auth0 by Okta

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your secuirty. Auth0 now, thank yourself later.
    Try free now
  • Bright Data - All in One Platform for Proxies and Web Scraping Icon
    Bright Data - All in One Platform for Proxies and Web Scraping

    Say goodbye to blocks, restrictions, and CAPTCHAs

    Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.
    Get Started
  • 1
    elasticsearc-php

    elasticsearc-php

    PHP low-level client for Elasticsearch

    Introducing Elasticsearch DSL library to provide objective query builder for Elasticsearch bundle and elasticsearch-php client. You can easily build any Elasticsearch query and transform it to an array. This agnostic package is a lightweight wrapper on top of the Elasticsearch PHP client. Its main goal is to allow for easier structuring of queries and indices in your application. It does not want to hide or replace the functionality of the Elasticsearch PHP client. Feature complete, object...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    DynaQ

    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the searching paradigm 'orienteering'. DynaQ is a (desktop)search engine with enhanced functionality for file, email and blog search. Look at our GitLab homepage for sourcecode and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3

    VecText

    Converting text to a structured representation

    VecText is an application that converts raw text to a structured format suitable for various data mining software. The application is written in interpreted programming language Perl. A part of the functionality is realized by external modules (e.g., Lingua::Stem::Snowball for stemming). The graphical user interface enables user-friendly software employment without requiring specialized technical skills and knowledge of a particular programming language, names of libraries and their...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    AI learning

    AI learning

    AiLearning, data analysis plus machine learning practice

    ..., and online earning community, with a QQ group of more than 10,000 people and at least 10,000 subscribers. The number of Github Stars exceeds 60k, and it ranks in the top 100 of all Github organizations. The daily up of all its websites exceeds 4k, and the peak of Alexa ranking is 20k. Our core members are certified as CSDN blog experts and short-book programmers as excellent authors. We have established ApacheCN, a non-profit document, and tutorial translation project.
    Downloads: 2 This Week
    Last Update:
    See Project
  • JobNimbus Construction Software Icon
    JobNimbus Construction Software

    For Roofers, Remodelers, Contractors, Home Service Industry

    Track leads, jobs, and tasks from one easy to use software. You can access your information wherever you are, get everyone on the same page, and grow your business.
    Learn More
  • 5
    Gamera is a framework for the creation of structured document analysis applications by domain experts. It combines a programming library with GUI tools for the training and interactive development of recognition systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    libcrn is document image processing library written in C++11 for Linux, Windows, Mac OsX and Google Android. It is a toolbox that allows to create easily software such as OCRs and layout analysis tools.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7

    jLDADMM

    A Java package for the LDA and DMM topic models

    The Java package jLDADMM is released to provide alternative choices for topic modeling on normal or short texts. It provides implementations of the Latent Dirichlet Allocation topic model and the one-topic-per-document Dirichlet Multinomial Mixture model (i.e. mixture of unigrams), using collapsed Gibbs sampling. In addition, jLDADMM supplies a document clustering evaluation to compare topic models. See the usage of jLDADMM in its website at http://jldadmm.sourceforge.net/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    DCTFinder

    DCTFinder

    Extract title and creation time from web page.

    Web pages do not offer reliable metadata concerning their creation date and time. However, getting the document creation time is a necessary step for allowing to apply temporal normalization systems to web pages. DCTFinder is a system that parses a web page and extracts from its content the title and the creation date of this web page. DCTFinder combines heuristic title detection, supervised learning with Conditional Random Fields (CRFs) for document date extraction, and rule-based creation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    SCAN
    SCAN (Smart Content Aggregation and Navigation) is a universal semantic content aggregator. It combines search, text analysis, tagging and metadata functions to provide new user experience of desktop navigation and document management.
    Downloads: 0 This Week
    Last Update:
    See Project
  • SKUDONET Open Source Load Balancer Icon
    SKUDONET Open Source Load Balancer

    For companies that need a load balancing solution

    SKUDONET is designed to enhance service quality with advanced load balancing capabilities. Allowing scale your infrastructure effortlessly while maintaining unwavering data security, ensuring the continuity of your operations.
    Learn More
  • 10
    FALCON - Text Search Java Project

    FALCON - Text Search Java Project

    JSON based text search Java Project

    ----------------- - What is it? - ----------------- The "Falcon Search" is a JAVA API and tool to search inside the documents. It was originally started to search the content in pdf files under the project "HAWK Search". Searching with this tool is query-based not word-based as in most of the document search tools OR document readers. It also takes care of jumbling of words within query and spelling mistakes. Commonly used techniques in this project are Natural Language...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11

    Texalyzer

    Text analyzer

    Analyzes text document using TF-IDF and optionally stopword list, and extracts important keywords.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Unsupervised TXT classifier

    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    ... trained with A to summarize a document B, and vice versa. This extracts a relevant structure for both documents (and thus avoids the over-training) which are then compared using the Vector-Space analysis to give a range of belonging of one document to another (and thus avoids the shortage of information). This method can be used to create the user-defined classes by merging texts of certain categories and then to calculate the relevant distances between the documents, but this is not necessary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    MyNook

    MyNook

    A machine learning system for supervised document classification

    An open source system for supervised document classification based on statistical machine learning techniques. On the contrary of the state of art classification techniques, MyNook just requires the title of the document, not the content itself.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    XmlView
    GUI utility in pure Java for viewing and editing XML content; example of application built with Superficial http://superficial.sourceforge.net
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    SIDoBI is an automatic summarization system for documents in Indonesian language. It is an acronym for Sistem Ikhtisar Dokumen untuk Bahasa Indonesia. SIDoBI is built based on MEAD, a public domain portable multi-document summarization system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    ILEDocs is a documentation tool which helps the software developers to document their programs in a convenient way similar to javadoc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    OpenSHORE is an XML based Semantic Document Repository (SDR) with a free definable meta model that builds up a semantic network from sections and relations in documents. The acronym SHORE means Semantic Hypertext Object Repository.
    Leader badge
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Maui is a multi-purpose automatic topic indexing algorithm. Given a document, Maui automatically identifies its topics. Depending on the task topics are tags, keywords, keyphrases, vocabulary terms, descriptors or Wikipedia titles.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    RDF-DocMan is a document manager based on a Sesame (RDF repository) backend. Documents are stored in the filesystem and their metadata in a Sesame repository. It was developed for porQual web content generator (also in sf.net).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    iDocs is a intellectual document work flow with text mining options project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Irudiko is a library written in C++ for generating Locality Sensitive Hashing sketches from any textual and web document. Mainly designed to work with HTML pages, it has also an optimization support for English or Italian documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Graphist uses PHP's GD library to produce data plots, in real time, served up as standard images for consumption by web pages (though such images could be saved for use in other document types).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Flesh is a Java application designed to analyze a document (plain text, rich text, Word documents, and PDFs) and display the difficulty associated with comprehending using the Flesch-Kincaid Grade Level and the Flesch Reading Ease Score.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 25
    Qualiweb aims at providing semantic web metrics for modeling a website visitors needs according to a given taxonomy or document classification. Web metrics provided by Qualiweb give an indication of how successful each of the website topics have been.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next