source code data mining free download

Open Interpreter

A natural language interface for computers

Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, and so on), so you can ask your computer in plain language to perform tasks such as data analysis, file manipulation, and browsing (“chat with your computer”), with safeguards.

Downloads: 5 This Week

Last Update: 2025-09-12

Milvus Bootcamp

Dealing with all unstructured data, such as reverse image search

Milvus Bootcamp is a collection of tutorials, examples, and best practices for using Milvus, an open-source vector database designed for AI-powered similarity search and retrieval applications.

Downloads: 0 This Week

Last Update: 2025-05-22

Datasets

Hub of ready-to-use datasets for ML models

Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency. We also feature a deep...

Downloads: 0 This Week

Last Update: 2 days ago

Spark NLP

State of the Art Natural Language Processing

Experience the power of large language models like never before, unleashing the full potential of Natural Language Processing (NLP) with Spark NLP, the open source library that delivers scalable LLMs. The full code base is open under the Apache 2.0 license, including pre-trained models and pipelines. The only NLP library built natively on Apache Spark. The most widely used NLP library in the enterprise. Spark ML provides a set of machine learning applications that can be built using two main components, estimators and transformers. ...

Downloads: 0 This Week

Last Update: 2026-01-27

Obsei

Obsei is a low-code, AI-powered automation tool

Obsei is an automated no-code/low-code AI-powered text observation and analysis framework, designed for extracting insights from unstructured text data such as social media, reviews, and logs.

Downloads: 0 This Week

Last Update: 2025-01-24

BioNLP

BioNLP is an initiative by the University of Colorado Denver Health Sciences Center to create and distribute code, software, and data for applying natural language processing techniques to biomedical texts.

Downloads: 0 This Week

Last Update: 2022-10-26

XLM (Cross-lingual Language Model)

PyTorch original implementation of Cross-lingual Language Model

XLM (Cross-lingual Language Model) is a family of multilingual pretraining methods that align representations across languages to enable strong zero-shot transfer. It popularized objectives like Masked Language Modeling (MLM) across many languages and Translation Language Modeling (TLM) that jointly trains on parallel sentence pairs to tighten cross-lingual alignment. Using a shared subword vocabulary, XLM learns language-agnostic features that work well for classification and sequence...

Downloads: 0 This Week

Last Update: 2025-10-07
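
The XLM description above mentions Translation Language Modeling (TLM), which trains on parallel sentence pairs. As a rough illustration only, the sketch below concatenates a tokenized sentence pair and masks tokens on both sides, so a model would have to predict a masked word from context in either language. The names `mask_tokens`, `tlm_example`, `MASK`, and `SEP`, and the simple replace-with-mask rule, are assumptions made for this example; they are not taken from the XLM code base, which handles further details not shown here.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this sketch
SEP = "</s>"     # hypothetical sentence separator, chosen for this sketch


def mask_tokens(tokens, mask_prob, rng):
    """Replace each token with MASK with probability mask_prob.

    Returns (inputs, labels). labels holds the original token at masked
    positions and None everywhere else.
    """
    inputs, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(token)
        else:
            inputs.append(token)
            labels.append(None)
    return inputs, labels


def tlm_example(source, target, mask_prob=0.15, seed=0):
    """Build one TLM-style training example from a parallel sentence pair."""
    rng = random.Random(seed)
    src_in, src_lab = mask_tokens(source, mask_prob, rng)
    tgt_in, tgt_lab = mask_tokens(target, mask_prob, rng)
    # The separator is never masked, so its label is always None.
    return src_in + [SEP] + tgt_in, src_lab + [None] + tgt_lab


english = ["the", "cat", "sleeps"]
french = ["le", "chat", "dort"]
inputs, labels = tlm_example(english, french, mask_prob=0.5, seed=1)
print(inputs)
print(labels)
```

Applying the same masking to a single-language sentence (without the paired translation) gives the corresponding MLM-style example.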

fastNLP

fastNLP: A Modularized and Extensible NLP Framework

fastNLP is a lightweight framework for natural language processing (NLP) whose goal is to quickly implement NLP tasks and build complex models. A unified tabular data container simplifies data preprocessing. Built-in Loader and Pipe for multiple datasets eliminate the need for preprocessing code. It offers various convenient NLP tools, such as embedding loading (including ELMo and BERT) and intermediate data caching, and provides a variety of neural network components and recurrence models...

Downloads: 0 This Week

Last Update: 2022-08-05

neural network designer

A DBMS for neural nets. Chatbots, DTrees, random forests, n-grams,...

This project consists of a Windows-based designer application and a library (that can run on multiple platforms, including Android), together with several demo applications (including an MVC3 chatbot client and an Android application). It is probably best compared to a database management system, but for neural networks instead of relational data. As such, the library is optimized for handling data of any size by using advanced streaming and caching algorithms. With the designer,...

Downloads: 0 This Week

Last Update: 2017-03-07

TextProcessor

A Java package to preprocess text datasets for subsequent text analysis

The TextProcessor Java package is a text processing toolkit that provides frequently used text processing functions such as stemming, removing stop-words, generating a term vocabulary, and calculating the term-doc frequency matrix. Basic topic mining models such as LDA and sparse NMF are also supported. The package can also generate feature files from a given text dataset in LDA and LIBSVM formats for subsequent procedures such as classification or clustering. The toolkit is also...

Downloads: 1 This Week

Last Update: 2015-11-23
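
The preprocessing steps named in the TextProcessor description (removing stop-words, generating a term vocabulary, and calculating a term-doc frequency matrix) can be sketched in a few lines. This is an illustration in Python, not the package's Java API: the function names, the tiny stop-word list, and the punctuation handling are all assumptions made for the example, and stemming is left out.

```python
from collections import Counter

# Hypothetical stop-word list for illustration; not TextProcessor's own list.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop-words."""
    words = (word.strip(".,;:!?\"'()") for word in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def term_doc_matrix(docs):
    """Return (vocabulary, matrix), where matrix[i][j] counts term i in doc j."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


docs = ["The cat sat.", "The cat and the dog."]
vocabulary, matrix = term_doc_matrix(docs)
for term, row in zip(vocabulary, matrix):
    print(term, row)
# prints:
# cat [1, 1]
# dog [0, 1]
# sat [1, 0]
```

Each row of the matrix corresponds to one vocabulary term and each column to one document, which is the layout that topic models and classifiers typically consume after a transpose or sparse encoding.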

BioEvent

This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and their parameters in a text. The method is based on SVM, but other ML algorithms can be adopted. The method details are explained in the...

Downloads: 0 This Week

Last Update: 2013-04-25

Semantic Weblog Monitoring Framework

Facilitates running data mining and natural language processing experiments on weblogs, such as classification, clustering and rating. Latent Semantic Analysis can be applied as part of these experiments.

Downloads: 0 This Week

Last Update: 2014-03-29

semantic search

A project intended to extract structure from the unstructured WWW, making web documents "understandable" to computers. Fields: NLP, Computational Linguistics, Information Theory, Information Retrieval, Clustering, Data Mining, Semantic Web

Downloads: 0 This Week

Last Update: 2014-06-06

Search Results for "source code data mining"

Showing 13 open source projects for "source code data mining"

Open Interpreter

Milvus Bootcamp

Datasets

Spark NLP

Obsei

BioNLP

XLM (Cross-lingual Language Model)

fastNLP

neural network designer

TextProcessor

BioEvent

Semantic Weblog Monitoring Framework

semantic search
