crawler4j is an open source web crawler for Java that provides a simple interface for crawling the Web. With it, you can set up a multi-threaded web crawler in a few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles each downloaded page. The shouldVisit function decides whether a given URL should be crawled.
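The kind of decision shouldVisit makes can be sketched without the library. This is a minimal stand-alone illustration, not crawler4j's actual API: the class name CrawlFilterSketch, the skipped-extension list, and the https://www.example.com/ prefix are all assumptions chosen for the example.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT crawler4j's API: it only mirrors the kind of
// yes/no decision a shouldVisit implementation has to make for each URL.
class CrawlFilterSketch {

    // Assumed policy: skip common static and binary assets.
    private static final Pattern SKIPPED_EXTENSIONS =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz)$");

    // Assumed seed prefix; replace with the site being crawled.
    private static final String ALLOWED_PREFIX = "https://www.example.com/";

    // Returns true when the URL stays on the allowed site and does not
    // end in one of the skipped extensions.
    static boolean shouldVisit(String url) {
        String href = url.toLowerCase(Locale.ROOT);
        return !SKIPPED_EXTENSIONS.matcher(href).matches()
                && href.startsWith(ALLOWED_PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(shouldVisit("https://www.example.com/docs/index.html"));
        System.out.println(shouldVisit("https://www.example.com/logo.png"));
        System.out.println(shouldVisit("https://other.example.org/page.html"));
    }
}
```

In a real crawler the same check would live inside the class that extends WebCrawler; keeping it a pure function of the URL makes the crawl policy easy to test on its own.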
Create XML and JSON data services from any data source
Create services to integrate applications & move data of any type. Build data views across DBMS, SOAP, HTTP/REST, Salesforce, SAP, Microsoft, SharePoint, Text, LDAP, FTP sources to read, write & transfer data. Eclipse designer & run-time engine.
A RESTful/JSON Web Service for text and metadata extraction
An open source RESTful Web Service for text and metadata extraction and analysis.
oss-text-extractor supports various binary formats:
Word processor (doc, docx, odt, rtf)
Spreadsheet (xls, xlsx, ods)
Presentation (ppt, pptx, odp)
Publishing (pdf, pub)
Web (rss, html/xhtml)
Media (audio, images)
Others (vsd, text)
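The format list above can be expressed as a small lookup table. This is an illustrative sketch only and says nothing about how oss-text-extractor detects formats internally: the class name FormatCategorySketch and the "Unknown" fallback are assumptions, and the entries "audio", "images", and "text" are left out because the listing does not give concrete file extensions for them.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a file extension to the category named in the
// listing. It is not part of oss-text-extractor.
class FormatCategorySketch {

    static final Map<String, String> CATEGORY_BY_EXTENSION = new LinkedHashMap<>();

    static {
        register("Word processor", "doc", "docx", "odt", "rtf");
        register("Spreadsheet", "xls", "xlsx", "ods");
        register("Presentation", "ppt", "pptx", "odp");
        register("Publishing", "pdf", "pub");
        register("Web", "rss", "html", "xhtml");
        register("Others", "vsd");
    }

    private static void register(String category, String... extensions) {
        for (String extension : extensions) {
            CATEGORY_BY_EXTENSION.put(extension, category);
        }
    }

    // Looks up the category by the text after the last dot in the file name;
    // returns the assumed fallback "Unknown" when there is no known extension.
    static String categoryOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "Unknown";
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return CATEGORY_BY_EXTENSION.getOrDefault(extension, "Unknown");
    }

    public static void main(String[] args) {
        System.out.println(categoryOf("report.docx"));
        System.out.println(categoryOf("budget.ODS"));
        System.out.println(categoryOf("README"));
    }
}
```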
Syndicate text and multimedia content with this API and storefront.
Use this suite of Application Programming Interface (API) platforms to share web content across multiple channels. Mobile and tablet applications, widgets, and web pages may use the APIs to deliver and update content. The APIs allow content reuse and reduce development costs and product time-to-market. The APIs are available as .NET or Java instances. For more information, see the ReadMe.txt file in the downloadable zip archive.
The Centers for Disease Control and Prevention (CDC) and...
Jitterbit is an open source integration tool that delivers a quick and simple way to design, configure, test, and deploy integration solutions. It supports many document types and protocols: XML, web services, database, LDAP, text, FTP, HTTP(S), file
Software that stores (PUT) meteorological data from a complex free-form text format in databases and retrieves (GET) stored (and already loaded) data from databases using the OPeNDAP protocol. Written using Java 6, XSD, and C++. It supports OPeNDAP clients thanks to
This is our implementation of a tagger that was described in iTag: A Personalized Blog Tagger. It will automatically tag blog posts, text documents and other textual media. Comes with an XML-RPC server for web deployment.
The p2dir "phone mashup" provides point-to-point driving directions based on start and destination phone numbers. It combines telephony, speech recognition, reverse number, and driving-direction web services; no Java, GPS, or cell towers needed.