
If you’re looking into Hadoop you might be interested in HPCC Systems

This is a guest blog post from HPCC Systems. HPCC Systems and Hadoop are open source projects, with both leveraging commodity hardware nodes and local storage interconnected through IP networks, allowing for parallel data processing and querying across this architecture. But this is where similarities end.

HPCC Systems was designed and developed about 14 years ago (1999-2000), under a different paradigm: to provide a comprehensive, consistent, high-level and concise declarative, dataflow-oriented programming model, represented by the ECL language. You can express data workflows and data queries at a very high level, avoiding the complexities of the system’s underlying parallel architecture.

Hadoop has two scripting languages which allow for some abstractions (Pig and Hive), but they don’t compare with the formal aspects, sophistication and maturity of the ECL language, which provides a number of benefits such as data and code encapsulation, the absence of side effects, flexibility and extensibility through macros, functional macros and functions, and libraries of production-ready, high-level algorithms.

One of the limitations of the MapReduce model used by Hadoop is that internode communication is left to the shuffle phase. This makes certain iterative algorithms that require frequent internode data exchange hard to code and slow to execute: they need to go through multiple phases of Map, Shuffle and Reduce, each representing a barrier operation that forces the serialization of the long tails of execution.
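To make the point about iterative algorithms concrete, here is a toy, single-process simulation of the Map, Shuffle and Reduce phases. It is only a sketch: it does not use the Hadoop API, and the names `map_reduce` and `connected_components` are invented for this illustration. It labels the connected components of a small graph, and every hop a label travels costs one full round with its barrier.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one Map -> Shuffle -> Reduce round over (key, value) records in memory."""
    # Map phase: each record may emit any number of (key, value) pairs.
    emitted = [pair for key, value in records for pair in mapper(key, value)]
    # Shuffle phase: the only point where values are exchanged between keys.
    groups = defaultdict(list)
    for key, value in emitted:
        groups[key].append(value)
    # Reduce phase: starts only once every mapper has finished (the barrier).
    return [reducer(key, values) for key, values in sorted(groups.items())]


def connected_components(edges):
    """Label every node with the smallest node id in its component.

    Returns (labels, rounds), where rounds counts full MapReduce rounds.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    labels = {node: node for node in neighbors}

    def mapper(node, label):
        yield node, label              # keep the node's own label
        for other in neighbors[node]:  # offer the label to each neighbour
            yield other, label

    def reducer(node, candidate_labels):
        return node, min(candidate_labels)

    rounds = 0
    while True:
        rounds += 1
        new_labels = dict(map_reduce(labels.items(), mapper, reducer))
        if new_labels == labels:       # nothing changed, so we have converged
            return labels, rounds
        labels = new_labels
```

On a chain of four nodes the smallest label needs three rounds to reach the far end, plus one more round to detect that nothing changed. On a real cluster each of those rounds would be a separate job with its own shuffle.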

The HPCC Systems platform provides direct internode communication, leveraged by many of the high-level ECL primitives. Another Hadoop disadvantage is the use of Java as the programming language for the entire platform, including the HDFS distributed file system, which adds overhead from the JVM. In contrast, HPCC and ECL are compiled into C++, which executes natively on top of the operating system, leading to more predictable latencies and overall faster execution (the HPCC Systems platform runs anywhere between 3 and 10 times faster than Hadoop on the same hardware).

The HPCC Systems platform comprises two components: a back-end, batch-oriented data workflow processing and analytics system called Thor (a data refinery engine equivalent to Hadoop MapReduce), and a front-end, real-time data querying and analytics system called Roxie (a data delivery engine which has no direct equivalent in the Hadoop world). Roxie allows for real-time delivery and analytics of data through parameterized ECL queries (think of them as equivalent to stored procedures in your traditional RDBMS). The closest to Roxie in the Hadoop ecosystem is HBase, which is a strict key/value store and thus provides only very rudimentary retrieval of values by exact or partial key matching. Roxie allows for compound keys, dynamic indices, smart stepping of these indices, aggregation and filtering, and complex calculations and processing.

Moreover, the HPCC Systems platform presents users with a homogeneous platform which is production ready and has been proven over many years in our own data services, from a company which has been in the Big Data Analytics business since before Big Data was called Big Data.

A Quick Guide to Driving a Project to Success with SourceForge

This is a guest blog post from eXo. At eXo, we’ve been partnering with SourceForge since 2003. As an open-source project that has transformed into a professional open-source vendor over the years, we’ve found SourceForge to be the best place for distributing our software. I’m pleased to share some of our experience with the system.

Do your basic homework

SourceForge provides an impressive list of software project management tools (forums, mailing lists, version control, wiki, tickets, etc.). However, your first stop is to choose a categorization for your project and fill in the metadata carefully. This sounds obvious, but it is critical for standing out from the mass of projects living on SourceForge. Put the right name and the right keywords in your description, add a few compelling screenshots and design a logo. Hint: Look at your competitors and do better.

Downloads are your driving force

Once we had this done properly, we found we were easier to find on SourceForge and more traffic started to come to our own website.

But for us, the killer feature is the file management tool. It has a simple yet efficient design that makes it very easy to upload a file via SCP or the web interface. Any file referenced as your primary download will turn instantly into a very appealing button on the project page.

And you know what? It works. Visitors just click it and download your software because it looks so easy.

But what’s really great about file management on SourceForge is that it comes with download statistics. At any time, you can see how many people have downloaded your software, where they came from and what operating system they use. At eXo, we’ve even built a dashboard to track our downloads through the very convenient Download Stats API.
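For readers who want to try something similar, here is a minimal sketch of how such a dashboard might pull its numbers. It is not eXo’s actual dashboard code. The URL pattern and the JSON field names (`total`, `oses`, `countries`) are assumptions and should be checked against the SourceForge documentation, and `example-project` is a placeholder project name.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def stats_url(project, start_date, end_date):
    """Build the stats URL for a project (URL pattern is an assumption)."""
    query = urlencode({"start_date": start_date, "end_date": end_date})
    return f"https://sourceforge.net/projects/{project}/files/stats/json?{query}"


def fetch_stats(project, start_date, end_date):
    """Download and decode the statistics for the given date range."""
    with urlopen(stats_url(project, start_date, end_date)) as response:
        return json.load(response)


def top_entries(pairs, limit=3):
    """Return the `limit` largest (name, count) pairs, biggest first."""
    ranked = sorted(
        ((name, count) for name, count in pairs),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:limit]


def summarize(stats, limit=3):
    """Reduce a decoded response to the figures a small dashboard would show.

    Assumes `oses` and `countries` are lists of [name, count] pairs.
    """
    return {
        "total": stats.get("total", 0),
        "top_oses": top_entries(stats.get("oses", []), limit),
        "top_countries": top_entries(stats.get("countries", []), limit),
    }
```

A typical call would be `summarize(fetch_stats("example-project", "2014-01-01", "2014-04-25"))`. Keeping the network call separate from the summary makes the summary easy to test against a saved response.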

 

So, last year, when we launched eXo Platform 4.0, we decided to make SourceForge the only location from which to download our eXo Platform Community Edition package. This turned out to be one of our best decisions ever.

The number of downloads instantly took off to levels we had never seen before. We ranked better in SourceForge listings and a virtuous loop started. As we got more reviews, more people visited our website from SourceForge, leading to more downloads, leading to a better rank, and so on. Within a couple of months, the rate of community registration had been boosted by an order of magnitude. And things have never stopped since then.

Foster your community: build a marketplace

As a happy consequence of this renewed success and a growing community, we saw many discussions in the forums. People from different backgrounds came with new ideas and requirements. We observed a blossoming of side projects built by very motivated individuals to address their requirements. So much creativity had to be made public. We needed to provide a place to promote these projects. That’s why we built the eXo Add-on center, a collection of third-party add-ons to complement eXo Platform’s core features. Add-ons can be many things: templates, new apps such as blogging or chat, or new integrations such as Google Drive or Bonita.


The add-on center is open to all eXo community members. Submitting an add-on to the center is as simple as filling in a form to describe what it does and where to get the downloads and docs. In fact, eXo imposes no special constraints on add-ons except that they must be related to eXo Platform. An add-on can even be proprietary or commercial software, but making it open source and free will bring more benefits.

First, we can host open-source code on GitHub under the official exo-addons organization. Second, we can host add-on files on SourceForge, giving them extra exposure and statistics.

The cherry on the cake is that since eXo is referenced in the SourceForge Enterprise directory, add-ons can be promoted directly on our SourceForge project page.

For an open-source project, SourceForge can be a key driver of your downloads and popularity. If you have a business on top of your project, I suggest making SourceForge an integral part of your open-source strategy. Any decision around your strategy should start by asking yourself how you can leverage SourceForge before building anything yourself.

Guest Post from the Podcast Generator project

This is a guest post written by Alberto Betella of the Podcast Generator project.

Podcast Generator (PG) is an open source Content Management System specifically designed for podcast publishing. It has been developed and maintained since 2006 by Alberto Betella.

PG provides the user with the tools to easily manage all of the aspects related to the publication of a podcast, from the upload of episodes to its submission to the iTunes store.

PG was originally developed for the academic environment, where teachers often lack the technical skills (or the time) to deal with the technicalities of publishing a podcast (e.g. creation and maintenance of an RSS feed) and prefer to focus on producing quality content for their students.

With this in mind, PG is conceived to be extremely simple to use and easy to customize, yet still powerful.

Publishing an episode using PG is as simple as uploading a file through the web browser, along with a title and a short description. PG automatically generates or updates a W3C-compliant RSS podcast feed. In doing so, it preprocesses the input to avoid the most common formatting errors (e.g. non-alphanumeric characters in filenames, HTML entity conversion, etc.), ensuring feed interoperability and compatibility with the widest range of RSS clients. The RSS feed includes support for iTunes-specific tags such as a long description of the episodes, keywords, content rating, iTunes category and cover art. The episodes can optionally be organized into thematic categories, each of which features its own separate feed.
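The kind of preprocessing and feed generation described above can be sketched in a few lines. This is an illustration only, written in Python rather than PG’s PHP, and it is not PG’s code: the function names, the `media/` URL layout and the episode dictionary keys are all hypothetical. It cleans up a filename, lets the XML library escape special characters, and emits one iTunes-specific tag.

```python
import re
import xml.etree.ElementTree as ET
from email.utils import format_datetime

ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"
ET.register_namespace("itunes", ITUNES_NS)


def safe_filename(name):
    """Replace runs of non-alphanumeric characters and lowercase the extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = name, ""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", stem).strip("_")
    return f"{cleaned}.{ext.lower()}" if ext else cleaned


def build_feed(title, link, description, episodes):
    """Return an RSS 2.0 document (as a string) for the given episodes.

    Each episode is a dict with the hypothetical keys: title, description,
    filename, size_bytes and published (a timezone-aware datetime).
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for episode in episodes:
        url = f"{link.rstrip('/')}/media/{safe_filename(episode['filename'])}"
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        # ElementTree escapes &, < and > in text, so raw input stays valid XML.
        ET.SubElement(item, "description").text = episode["description"]
        ET.SubElement(item, "pubDate").text = format_datetime(episode["published"])
        ET.SubElement(item, "guid").text = url
        ET.SubElement(
            item,
            "enclosure",
            {"url": url, "length": str(episode["size_bytes"]), "type": "audio/mpeg"},
        )
        # One iTunes-specific tag; a real feed would carry several more.
        ET.SubElement(item, f"{{{ITUNES_NS}}}summary").text = episode["description"]
    return ET.tostring(rss, encoding="unicode")
```

Building the feed through an XML library, rather than by string concatenation, means that an ampersand or angle bracket typed into a description cannot break the feed.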

In addition, PG produces a dynamic website that includes a list of the most recent podcasts, a podcast archive and an mp3 streaming player. In this way, podcast episodes are available not only via RSS but also through the website, which increases their discoverability and visibility in search engines. To this end, PG fully implements the sharing capabilities of some of the most popular social networks (Facebook, Google+ and Twitter) and adopts SEO techniques such as permalinks and Open Graph meta tags.

Through PG’s admin interface, the user can upload new episodes or edit existing ones, manage categories and customize all the details of the podcast feed, including – but not limited to – title, description, cover image, author and language. Furthermore, a number of extra features are offered to more advanced users. PG adopts a tailor-made theme engine that allows the graphical appearance to be customized with new skins or integrated with existing websites.

PG also provides the means to manually upload one or more episodes via (s)FTP and easily include them in the podcast feed from the admin area (called the “FTP Feature”). This allows for the upload of multiple files at once without the need to postprocess the episodes individually, since title, author and other details are extracted directly from the embedded ID3 tags. Moreover, the FTP feature comes in handy as a workaround for server-side restrictions on the size of files uploaded through a web browser, when the hosting provider does not allow users to override the default server settings and increase these limits.
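To give an idea of what extracting details from embedded ID3 tags involves, here is a small sketch that reads the older ID3v1 tag, a fixed 128-byte block at the end of an mp3 file. It is an illustration in Python, not PG’s code, and it deliberately ignores the richer ID3v2 format that most modern files also carry. The function name `read_id3v1` is invented for this example.

```python
def read_id3v1(data):
    """Return title, artist, album and year from an ID3v1 tag, or None if absent.

    `data` is the raw content of an mp3 file as bytes. An ID3v1 tag is the
    last 128 bytes: "TAG", then 30-byte title, artist and album fields, a
    4-byte year, a 30-byte comment and a 1-byte genre.
    """
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes (or spaces) up to their fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
```

For a file on disk, something like `read_id3v1(open(path, "rb").read())` would do. A production tool would normally rely on a dedicated tag-parsing library instead.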

PG has very few server requirements: it works on any web host with PHP support. The user’s data is stored in XML format, hence no MySQL DB is needed. PG can be installed in less than a minute through a 3-step setup wizard. In most cases a manual installation is not even necessary, as PG is offered as a preinstalled package by some of the biggest hosting and NAS service providers worldwide (e.g. PG is part of the Softaculous bundle, available to millions of users through the control panel provided with their hosting plan).

PG adopts GNU gettext to handle localizations and is currently available in 13 languages. The translations are autonomously managed by a community of volunteers.

As a final remark, a well-known problem, common to many open source software projects (especially those maintained by small teams), is the lack of quality documentation and support. PG provides users with comprehensive documentation that covers a wide range of topics, from common issues and FAQs to more technical aspects. In addition, PG offers enterprise-like free support through the SourceForge ticket system: over the past 8 years, users have regularly submitted support requests, totaling in the hundreds, most of which were answered and solved within a few hours.

PG has gone far beyond 100k downloads on SourceForge and counts several thousand active users. The latest version can be downloaded from the official project page.

Convertigo, Mobile Application Development Platform for Enterprises

This is a guest blog post from Convertigo. Enterprise mobility projects are spreading like wildfire. When they deliver value, business leaders in every department seem to want more – and that’s when the bottleneck happens. Instead of connecting one data source to one type of mobile device, your IT infrastructure is faced with multiple device platforms, complex security issues, and disparate enterprise applications and data sources – many with no API.

To accommodate the many-device to many-platform mobile application integration scenario and the BYOD (Bring Your Own Device) paradigm that most IT teams operate within, Convertigo brings one of the most advanced Open Source Mobile Application Development Platforms to enterprises willing to embrace mobility in an industrial, secured and managed environment.

The Convertigo platform features front-end cross-platform development tools linked with a powerful back-end orchestration middleware able to connect to any enterprise back-office data and processes, whether there is an API or not.

Convertigo leverages standard open source technologies such as Eclipse, jQuery Mobile, PhoneGap and many others to provide an industrial platform for building, running and managing any B2E, B2B and B2C mobile application connected to back-office data.

Project of the Month Changes…

Hi folks,

As most of you know, we have been doing a Project of the Month on SourceForge since October of 2002. When we feature a project, it’s a little like the Colbert Bump: projects get some visibility, and often, good things happen for these projects.

Starting this month, we are making a change in the program. We have recently returned to a Project of the Month voting process where you in the community have the power to select a project from a set of nominees we draw based on project growth, releases, and other data. This project is the “Community Choice.”

We are adding a second project, selected by our team, which we’re calling the “Staff Pick.” Simply put, a member of our team generally comes across a project they have used, one they love, or one that solves an interesting problem. We consider such nominations from our team and pick one to highlight.

Both projects will get featured on the SourceForge front page. As well, we’ll do a blog post about them both, and continue to track them in our running PoTM historical list. As usual, we’ll also mention these in our monthly newsletter.

Many thanks to you all for your contributions to OSS; the Project of the Month has become what it is because of your contributions to making software that is important to you and that others love.