Cuttlefish aims to be a highly extensible visualization and analysis platform for all kinds of network data. IMPORTANT: The Cuttlefish project has been migrated to GitHub: https://github.com/dev-cuttlefish/cuttlefish
The FAKE GAME tool uses natural evolution to evolve Data Mining models. It incorporates several preprocessing, optimization and visualization methods aimed at streamlining the Knowledge Discovery process. Knowledge Extraction from data is being automated!
Bowtie is an ultrafast, memory-efficient aligner for short DNA sequences (reads) from next-generation sequencers. Please cite: Langmead B, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.
ZedGraph is a class library, user control, and web control for .NET, written in C#, for drawing 2D Line, Bar, and Pie Charts. It features full, detailed customization capabilities, but most options have defaults for ease of use.
Java graph/network library
JUNG provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network. New version now available on GitHub: https://github.com/jrtom/jung/releases/tag/jung-2.1
A set of tools for working with high-throughput sequencing data
A set of tools (in Java) for working with next generation sequencing data in the SAM/BAM format. Note that development has moved to GitHub at https://github.com/broadinstitute/picard and support is available on the GATK forum at http://gatkforums.broadinstitute.org/categories/ask-the-team
SciDAVis is a user-friendly data analysis and visualization program primarily aimed at high-quality plotting of scientific data. It strives to combine an intuitive, easy-to-use graphical user interface with powerful features such as Python scriptability.
The Network Forensics Tool
NetworkMiner is a Network Forensic Analysis Tool (NFAT) for Windows that can detect the OS, hostname and open ports of network hosts through packet sniffing or by parsing a PCAP file. NetworkMiner can also extract transmitted files from network traffic. Since version 2.0, new versions of NetworkMiner are released exclusively on www.netresec.com. This page on SourceForge is kept only to host older versions of the software. To get the latest version of NetworkMiner, please visit: http://www.netresec.com/?page=NetworkMiner
A cross-platform statistical package for econometric analysis
gretl is a cross-platform software package for econometric analysis, written in the C programming language.
C++/C library to construct Excel .xls files in code.
A multiplatform C++ library for dynamic generation of Excel .xls files containing multiple worksheets. Unlike .csv files, these can be directly opened by Excel and thus provide an excellent way to output large data sets that require further analysis. To see the latest changes, select "Files" and view the README text displayed at the bottom of that pane. IMPORTANT: Major changes are contained in the current SVN source. If you have time, please try to use it or the xlslib-package-2.4.0b1.zip archive, and enter bug reports on any problems! Changes: library-specific strings now in their own namespace; iOS Objective-C library; most project files updated (MSVS etc.); C bridge now supports formulas. Note: there is a related SF project, libxls, to read Excel files.
The National Library of New Zealand's Metadata Extraction Tool automatically extracts preservation-related metadata from digital files, then outputs that metadata in XML formats. It can be used through a graphical user interface or command-line interface. Please take the latest code from 'https://github.com/DIA-NZ/Metadata-Extraction-Tool.git'. The code on SourceForge will no longer be updated, as the project has moved to GitHub.
An easy-to-use Java program that allows you to digitize data points from scanned plots, scaled drawings, or orthographic photographs. Includes an automatic digitization feature that handles many types of functional data.
Magstripper is a magnetic card reader and decoder that takes raw waveform information from a magnetic audio head (soldered directly to a mono audio jack) and processes it via a mic input. It also includes a multi-user door lock access control system.
Hunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex compounding or character encoding. Hunspell interfaces: Curses, Ispell-compatible pipe interface, OpenOffice.org UNO module.
The Wikipedia Miner toolkit provides simplified access to Wikipedia. This open encyclopedia represents a vast, constantly evolving multilingual database of concepts and semantic relations; a promising resource for NLP and related research.
World's first open source data quality & data preparation project
This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real-time alerting, basket analysis, bubble chart, warehouse validation, single customer view etc. defined by Strategy. The tool is being developed into a high-performance integrated data management platform that will seamlessly handle Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytics. It also has Hadoop (big data) support to move files to/from a Hadoop grid and to create, load and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and an Apache Spark-based data quality tool is being built at https://sourceforge.net/projects/apache-spark-osdq/
C++, Matlab and Python library for Hidden-state Conditional Random Fields. Implements 3 algorithms: LDCRF, HCRF and CRF. For Windows and Linux, 32- and 64-bit. Optimized for multi-threading. Works with sparse or dense input features.
GATE (General Architecture for Text Engineering) is an architecture, framework and development environment for developing, evaluating and embedding Human Language Technology. See http://gate.ac.uk for full details.
Defraser is a forensic analysis application that can be used to detect full and partial multimedia files in data streams. It is typically used to find (and restore) complete or partial video files in data streams (for instance, unallocated disk space).
GridLAB-D is a new power system simulation tool that provides valuable information to users who design and operate electric power transmission and distribution systems, and to utilities that wish to take advantage of the latest smart grid technology. It incorporates advanced modeling techniques with high-performance algorithms to deliver the latest in end-use load modeling technology integrated with three-phase unbalanced power flow, and retail market systems. Historically, the inability to effectively model and evaluate smart grid technologies has been a barrier to adoption; GridLAB-D is designed to address this problem. User documentation can be found at: http://gridlab-d.shoutwiki.com/wiki/Quick_links The source code is available from GitHub. See https://github.com/gridlab-d/gridlab-d. Issue tracking is handled by GitHub. See https://github.com/gridlab-d/gridlab-d/issues.
FormScanner - Free OMR Software
FormScanner is OMR (Optical Mark Recognition) software that automatically marks multiple-choice papers. FormScanner does not bind you to a default form template; instead, it lets you use a custom template created from a simple scan of a blank form. The forms can be scanned as images with a simple scanner and processed with the FormScanner software. All the collected information can be easily exported to a spreadsheet.
A real-time graph plotter. While your application is computing and logging results to a CSV file using the LiveGraph Writer API, the plotter lets you visualise and monitor the results live - by instantly plotting charts and graphs of the data.
Java Modelling Tools is a suite for performance evaluation and modelling. Queuing Network models are solved with analytical, asymptotic and simulation methods; workload is characterized using clustering techniques.
This program provides two main functions: 1) Generate SRT subtitles with timing (but without text) from subtitles hardcoded into the video, using digital image processing algorithms. 2) Generate cleared (text-only) images from images with hardcoded subtitles (using text mining algorithms) for further recognition by software like FineReader and generation of SRT subtitles with timing and text. Running this program may require installing "K-Lite Codec Pack Full" and the Visual Studio 2017 x64 runtime. The latest version, 2.20, was tested on Windows 10. For faster support with bug fixes, please contact me at: https://vk.com/skosnits
ThManager is a tool for creating and visualizing knowledge organization models, such as thesauri, classification schemes, glossaries and other types of controlled vocabulary, in SKOS format.