Showing 38 open source projects for "text clustering"

  • 1
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 11 This Week
  • 2
    MTEB

    MTEB: Massive Text Embedding Benchmark

    Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding...
    Downloads: 2 This Week
  • 3
    Finetuner

    Task-oriented finetuning for better embeddings on neural search

    ...-quality embeddings for semantic search, visual similarity search, cross-modal text image search, recommendation systems, clustering, duplication detection, anomaly detection, or other uses. Bring considerable improvements to model performance, making the most out of as little as a few hundred training samples, and finish fine-tuning in as little as an hour.
    Downloads: 2 This Week
  • 4
    rqlite

    The lightweight, distributed relational database built on SQLite

    rqlite is an easy-to-use, lightweight, distributed relational database, which uses SQLite as its storage engine. rqlite is simple to deploy, operating it is very straightforward, and its clustering capabilities provide you with fault-tolerance and high availability. rqlite is available for Linux, macOS, and Microsoft Windows. rqlite gives you the functionality of a rock solid, fault-tolerant, replicated relational database, but with very easy installation, deployment, and operation...
    Downloads: 0 This Week
  • 5
    HyperTools

    A Python toolbox for gaining geometric insights

    HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). Functions for plotting high-dimensional datasets in 2/3D. Static and animated plots. Simple API for customizing plot styles. Set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. Support for lists of NumPy arrays, pandas DataFrames...
    Downloads: 0 This Week
  • 6
    m23

    Your Linux deployment tool!

    m23 is a free software distribution system (license: GPL), that installs (via network, starting with partitioning and formatting) and administrates (updates, adds / removes software, adds / removes scripts) clients with Debian, (X/K)Ubuntu and LinuxMint. It is used for deployment of Linux clients in schools, institutions and enterprises. The m23 server is controlled via a web interface. A new m23 client can be installed easily in only three steps. Group functions and mass installation...
    Downloads: 7 This Week
  • 7
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package to work with text data efficiently. It empowers NLP developers with a tool to quickly understand any text-based dataset and it provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 8
    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the searching paradigm 'orienteering'. DynaQ is a (desktop) search engine with enhanced functionality for file, email and blog search. Look at our GitLab homepage for source code and documentation: http://dynaq.opendfki.de
    Downloads: 4 This Week
  • 9
    yabasta

    Yet Another BAsic Scraper and Text Analysis

    YA BASTA! is a Python/R application for lyrics web scraping and text analysis. Web scraping is implemented in Python; text analysis is implemented in R and run as Python subprocesses. YA BASTA! has only been tested on Windows. To run YA BASTA!, type at the Windows command prompt: python.exe yabasta.py
    Downloads: 0 This Week
  • 10

    AngClust

    AngClust: Angle-based feature clustering for time series

    Citation: Aimin Li, Siqi Xiong, Junhuai Li, Saurav Mallik, Yajun Liu, Rong Fei, Hongfang Zhou, Guangming Liu. AngClust: Angle Feature-Based Clustering for Short Time Series Gene Expression Profiles. January 2022. IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM. DOI: 10.1109/TCBB.2022.3192306 Full text: https://ieeexplore.ieee.org/document/9833353/ https://pubmed.ncbi.nlm.nih.gov/35853049/ Highlights * We proposed a novel clustering algorithm based...
    Downloads: 0 This Week
  • 11
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantics, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows...
    Downloads: 43 This Week
  • 12
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify JASP...
    Downloads: 0 This Week
  • 13

    cbrTekStraktor

    An application to automatically extract text from comic books.

    ... by a combination of statistical and graphical processing operations. It is based on the following 3 major algorithms: binarization of color images (Niblack and other methods), connected components, and K-means clustering. Apache Tesseract is used to perform Optical Character Recognition on the extracted text. A subsequent version of the application will integrate with translation software in order to provide automated translation of comic book texts and re-insertion of translated texts.
    Downloads: 6 This Week
  • 14
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 0 This Week
  • 15
    JInsect
    The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
    Downloads: 2 This Week
  • 16

    webtextanalysis

    Mining knowledge from text data

    This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- or multi-document summarization, and plagiarism detection.
    Downloads: 0 This Week
  • 17

    Fast Matrix for Java

    General purpose matrix utilities for Java in Parallel Computing

    Fast Matrix for Java (fm4j) is a general-purpose matrix utility library for computing with dense matrices. fm4j encapsulates different underlying implementations and selects the optimal one at run time depending on the size of the input matrix. Moreover, fm4j employs Java(TM) concurrency to take advantage of the computing power of multi-core processors.
    Downloads: 0 This Week
  • 18

    Multi-Pipeline Gene Expression Analysis

    Helps users to determine the optimal gene expression analysis pipeline

    .... Summarization (mas5, rma, farms, DFW) 4. Gene Selection (SAM, ANOVA) Then executes K-means clustering on the significant genes, and evaluates the pipelines using the cumulative distribution function of the GO term co-clustering p-values. From this, the optimal microarray data workflow is chosen. Input files are CEL files along with a CLM file, which is a tab-delimited text file containing one scan, sample and class per line.
    Downloads: 0 This Week
  • 19

    TextProcessor

    A Java package to preprocess text datasets for posterior text analysis

    The TextProcessor Java package is a text processing toolkit that provides frequently used text processing functions such as stemming, removing stop-words, generating a term vocabulary, and calculating the term-doc frequency matrix. Basic topic mining models such as LDA and sparse NMF are also supported. The package can also generate feature files from a given text dataset in LDA and LIBSVM formats for subsequent procedures such as classification or clustering. The toolkit is also being...
    Downloads: 0 This Week
  • 20

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 2 This Week
  • 21

    WaveSorter

    A powerful, versatile tool for offline spike analysis and sorting

    WaveSorter emphasizes dynamic visualization and versatility. Slider controls let the user select any coefficient or sample from any of several transforms, which can then be plotted to either axis of a 2D histogram (scatterplot). Within the waveform space, cursor-based controls let the user select subregions of the waveform space or individual waveforms to view. The user may cluster waveforms manually or via one of several popular clustering programs. The classification along with waveform...
    Downloads: 0 This Week
  • 22
    Meresco is both an OAI Data Provider and a Service Provider. SourceForge is only used to host the source control (Subversion). Sources: http://sources.meresco.org/ Binaries: http://repository.cq2.org/ Mail: http://groups.google.com/group/meresco
    Downloads: 0 This Week
  • 23

    Ticket Cluster

    Java application that groups related text documents (text mining)

    ... created the folder with text, run Ticket Cluster, select the folder and click Process. After processing is done, you can view the results as a dendrogram or tree. Text mining, text clustering using LINGO.
    Downloads: 0 This Week
  • 24
    This project aims to implement the following text mining techniques in Java: text language detection, keyword and keyphrase extraction, text classification, text clustering, single- or multi-document summarization, and plagiarism detection.
    Downloads: 0 This Week
  • 25
    The first 3D search engine for text. JavaScript only. Works in all browsers. Ajax downloads new words (and the links between them) as you move the mouse, which controls an AI that learns what you're looking for (in context) and puts it on screen. Includes Wikipedia data.
    Downloads: 0 This Week