Showing 33 open source projects for "text clustering"

  • 1
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 10 This Week
  • 2
    rqlite

    The lightweight, distributed relational database built on SQLite

    rqlite is an easy-to-use, lightweight, distributed relational database, which uses SQLite as its storage engine. rqlite is simple to deploy, operating it is very straightforward, and its clustering capabilities provide you with fault-tolerance and high availability. rqlite is available for Linux, macOS, and Microsoft Windows. rqlite gives you the functionality of a rock solid, fault-tolerant, replicated relational database, but with very easy installation, deployment, and operation...
    Downloads: 2 This Week
  • 3
    MTEB

    MTEB: Massive Text Embedding Benchmark

    Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding...
    Downloads: 1 This Week
  • 4
    Finetuner

    Task-oriented finetuning for better embeddings on neural search

    ...-quality embeddings for semantic search, visual similarity search, cross-modal text image search, recommendation systems, clustering, duplication detection, anomaly detection, or other uses. Bring considerable improvements to model performance, making the most out of as little as a few hundred training samples, and finish fine-tuning in as little as an hour.
    Downloads: 0 This Week
  • 5
    HyperTools

    A Python toolbox for gaining geometric insights

    HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). Functions for plotting high-dimensional datasets in 2D/3D. Static and animated plots. Simple API for customizing plot styles. Set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. Support for lists of NumPy arrays, pandas DataFrames...
    Downloads: 0 This Week
  • 6
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package for working with text data efficiently. It gives NLP developers a tool to quickly understand any text-based dataset, and it provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 7
    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system for exploring your personal information space, supporting you with the 'orienteering' search paradigm. DynaQ is a (desktop) search engine with enhanced functionality for file, email and blog search. See our GitLab homepage for source code and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
  • 8
    yabasta

    Yet Another BAsic Scraper and Text Analysis

    YA BASTA! is a Python/R application for lyrics web scraping and text analysis. Web scraping is implemented in Python; text analysis is implemented in R and run as Python subprocesses. YA BASTA! has only been tested on Windows. To run it, type the following at the Windows command prompt: python.exe yabasta.py
    Downloads: 0 This Week
  • 9

    AngClust

    AngClust: Angle-based feature clustering for time series

    Citation: Aimin Li, Siqi Xiong, Junhuai Li, Saurav Mallik, Yajun Liu, Rong Fei, Hongfang Zhou, Guangming Liu. AngClust: Angle Feature-Based Clustering for Short Time Series Gene Expression Profiles. January 2022. IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM. DOI: 10.1109/TCBB.2022.3192306 Full text: https://ieeexplore.ieee.org/document/9833353/ https://pubmed.ncbi.nlm.nih.gov/35853049/ Highlights * We proposed a novel clustering algorithm based...
    Downloads: 0 This Week
  • 10
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows...
    Downloads: 16 This Week
  • 11
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is free, open source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size is available at: https://sourceforge.net/projects/dstk3/ It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK aims to be a broad workbench that also includes text analysis and predictive analytics features. Of course you may specify...
    Downloads: 0 This Week
  • 12

    cbrTekStraktor

    an application to automatically extract text from comic books.

    ... by a combination of statistical and graphical processing operations. It is based on three major algorithms: binarization of color images (Niblack and other methods), connected components, and K-Means clustering. Apache Tesseract is used to perform Optical Character Recognition on the extracted text. A subsequent version of the application will integrate with translation software in order to provide automated translation of comic book texts and re-insertion of the translated texts.
    Downloads: 1 This Week
  • 13
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 0 This Week
  • 14
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 0 This Week
  • 15

    webtextanalysis

    Mining knowledge from text data

    This project aims to implement in java the following text mining techniques: Text Language Detection, Keywords and keyphrases extraction, Text Classification, Text Clustering, Single or multiple documents Summarization, Plagiarism Detection.
    Downloads: 0 This Week
  • 16

    Fast Matrix for Java

    General purpose matrix utilities for Java in Parallel Computing

    Fast Matrix for Java (fm4j) is a general-purpose matrix utility library for computing with dense matrices. fm4j encapsulates different underlying implementations and selects the optimal one at run time depending on the size of the input matrix. Moreover, fm4j employs Java(TM) concurrency to take advantage of the computing power of multi-core processors.
    Downloads: 0 This Week
  • 17

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
  • 18

    TextProcessor

    A Java package to preprocess text datasets for posterior text analysis

    The TextProcessor Java package is a text processing toolkit, which provides some frequently used text processing functions such as stemming, removing stop-words, generating a term vocabulary, and calculating the term-doc frequency matrix. Basic topic mining models such as LDA and sparse NMF are also supported. The package can also generate feature files from a given text dataset with LDA and LIBSVM format for posterior procedures such as classification or clustering. The toolkit is also being...
    Downloads: 0 This Week
  • 19

    WaveSorter

    A powerful, versatile tool for offline spike analysis and sorting

    WaveSorter emphasizes dynamic visualization and versatility. Slider controls let the user select any coefficient or sample from any of several transforms, which can then be plotted to either axis of a 2D histogram (scatterplot). Within the waveform space, cursor-based controls let the user select subregions of the waveform space or individual waveforms to view. The user may cluster waveforms manually or via one of several popular clustering programs. The classification along with waveform...
    Downloads: 0 This Week
  • 20

    Ticket Cluster

    Java application that groups related text documents (text mining)

    ... Once you have created the folder with text, run Ticket Cluster, select the folder, and click Process. When processing is done, you can view the results as a dendrogram or tree. Text mining and text clustering using LINGO.
    Downloads: 0 This Week
  • 21
    This project aims to implement in java the following text mining techniques: Text Language Detection, Keywords and keyphrases extraction, Text Classification, Text Clustering, Single or multiple documents Summarization, Plagiarism Detection.
    Downloads: 0 This Week
  • 22
    The first 3D search engine for text. JavaScript only; works in all browsers. Ajax downloads new words (and the links between them) as you move the mouse, which steers the AI to learn what you're looking for (in context) and put it on screen. Includes Wikipedia data.
    Downloads: 0 This Week
  • 23
    Blaze - Appliance for Solr
    Indexing and Search Appliance Powered by Apache Solr. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling.
    Downloads: 0 This Week
  • 24
    bocca is a text-based, scriptable, interactive development environment for SIDL/Babel-based development of mixed language C/C++/Fortran/Java/Python code. It manages source and build systems. Bocca automates creating SIDL/Babel code or CCA components.
    Downloads: 0 This Week
  • 25
    Cyn.in - Open Source Group Collaboration
    Cyn.in helps teams build collaborative knowledge by sharing and discussing digital content within a secure, unified application. It combines the capabilities of wikis, social networks, blogs, files, microblogs, and discussions into a secure enterprise platform.
    Downloads: 1 This Week