If you’re looking into Hadoop you might be interested in HPCC Systems

This is a guest blog post from HPCC Systems. HPCC Systems and Hadoop are open source projects, with both leveraging commodity hardware nodes and local storage interconnected through IP networks, allowing for parallel data processing and querying across this architecture. But this is where similarities end.

HPCC Systems was designed and developed about 14 years ago (1999-2000) under a different paradigm: to provide a comprehensive, consistent, concise and high-level declarative, dataflow-oriented programming model, represented by the ECL language. You can express data workflows and data queries at a very high level, avoiding the complexities of the system’s underlying parallel architecture.

Hadoop has two scripting languages which allow for some abstractions (Pig and Hive), but they don’t compare with the formal aspects, sophistication and maturity of the ECL language, which provides for a number of benefits such as data and code encapsulation, the absence of side effects, the flexibility and extensibility through macros, functional macros and functions, and the libraries of production ready high level algorithms available.

One of the limitations of the MapReduce model used by Hadoop is that internode communication is confined to the shuffle phase. This makes certain iterative algorithms that require frequent internode data exchange hard to code and slow to execute: they need to go through multiple phases of Map, Shuffle and Reduce, each representing a barrier operation that forces the serialization of the long tails of execution.
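To make the barrier cost concrete, here is a minimal single-process sketch in Python (not Hadoop code) of an iterative algorithm, connected-component labeling by minimum-label propagation, written as repeated Map, Shuffle and Reduce rounds. The function names and the toy graph are assumptions made purely for illustration. Each round must finish its shuffle before the next one can start, so information travels only one hop per round.

```python
from collections import defaultdict


def map_phase(labels, edges):
    """Emit each node's own label, plus its label to every neighbour."""
    for node, label in labels.items():
        yield node, label
    for a, b in edges:
        yield b, labels[a]
        yield a, labels[b]


def shuffle_phase(pairs):
    """Group values by key; in MapReduce this is the only data exchange."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Keep the smallest label seen for each node."""
    return {node: min(values) for node, values in grouped.items()}


def connected_components(nodes, edges):
    """Repeat full Map/Shuffle/Reduce rounds until no label changes."""
    labels = {n: n for n in nodes}
    rounds = 0
    while True:
        new_labels = reduce_phase(shuffle_phase(map_phase(labels, edges)))
        rounds += 1
        if new_labels == labels:
            return labels, rounds
        labels = new_labels
```

On the chain 1-2-3-4 plus an isolated node 5, `connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (3, 4)])` takes four rounds: three to carry label 1 down to node 4, and one more just to detect that nothing changed.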

The HPCC Systems platform provides direct internode communication, leveraged by many of the high-level ECL primitives. Another Hadoop disadvantage is the use of Java as the programming language for the entire platform, including the HDFS distributed file system, which adds overhead from the JVM. In contrast, HPCC Systems is built on C++ and ECL is compiled into C++, which executes natively on top of the operating system, leading to more predictable latencies and faster overall execution (the HPCC Systems platform runs anywhere between 3 and 10 times faster than Hadoop on the same hardware).

The HPCC Systems platform comprises two components: a back-end, batch-oriented data workflow processing and analytics system called Thor (a data refinery engine equivalent to Hadoop MapReduce), and a front-end, real-time data querying and analytics system called Roxie (a data delivery engine which has no equivalent in the Hadoop world). Roxie allows for real-time delivery and analytics of data through parameterized ECL queries (think of them as equivalent to stored procedures in your traditional RDBMS). The closest thing to Roxie in the Hadoop ecosystem is HBase, which is a strict key/value store and thus provides only very rudimentary retrieval of values by exact or partial key matching. Roxie allows for compound keys, dynamic indices, smart stepping of these indices, aggregation and filtering, and complex calculations and processing.

Moreover, the HPCC Systems platform presents users with a homogeneous platform that is production ready and has been proven over many years in our own data services, from a company which has been in the Big Data Analytics business since before Big Data was called Big Data.

A Quick Guide to Driving a Project to Success with SourceForge

This is a guest blog post from eXo. At eXo, we’ve been partnering with SourceForge since 2003. As an open-source project, which transformed into a professional open-source vendor over the years, we’ve found SourceForge to be the best place for distributing our software. I’m pleased to share some of our experience with the system.

Do your basic homework

SourceForge provides an impressive list of software project management tools (forums, mailing lists, version control, wiki, tickets, etc.). However, your first stop is to choose a categorization for your project and fill in the metadata carefully. This sounds obvious, but it is critical for standing out from the mass of projects living on SourceForge. Put the right name and the right keywords in your description, add a few compelling screenshots and design a logo. Hint: Look at your competitors and do better.

Downloads are your driving force

Once we had this done properly, we found we were easier to find on SourceForge and more traffic started to come to our own website.

But for us, the killer feature is the file management tool. It has a simple yet efficient design that makes it very easy to upload a file via SCP or a web interface. Any file referenced as your primary download will turn instantly into a very appealing button on the project page.

And you know what? It works. Visitors just click it and download your software because it looks so easy.

But what’s really great about file management on SourceForge is that it comes with download statistics. At any time, you can see how many people have downloaded your software, where they came from and what operating system they use. At eXo, we’ve even built a dashboard to track our downloads through the very convenient Download Stats API.

So, last year, when we launched eXo Platform 4.0, we decided to make SourceForge the only location from which to download our eXo Platform Community Edition package. This turned out to be one of our best decisions ever.

The number of downloads instantly took off to levels we had never seen before. We ranked better in SourceForge listings and a virtuous loop started. As we got more reviews, more people visited our website from SourceForge, leading to more downloads, leading to a better rank, and so on. Within a couple of months, the rate of community registration had been boosted by an order of magnitude. And things have never stopped since then.

Foster your community: build a marketplace

As a happy consequence of this renewed success and a growing community, we saw many discussions in the forums. People from different horizons came with new ideas and requirements. We observed a blossoming of side projects built by very motivated individuals to address their requirements. So much creativity had to be made public. We needed to provide a place to promote these projects. That’s why we built the eXo Add-on center, a collection of third-party add-ons to complement eXo Platform’s core features. Add-ons can be many things: templates, new apps such as blogging or chat, or new integrations such as Google Drive or Bonita.

The add-on center is open to all eXo community members. Submitting an add-on to the center is as simple as filling in a form to describe what it does and where to get the downloads and docs. In fact, eXo imposes no special constraints on add-ons except that they must be related to eXo Platform. An add-on can even be proprietary or commercial software, but making it open source and free brings more benefits.

First, we can host open-source code on GitHub under the official exo-addons organization. Second, we can host add-on files on SourceForge, giving them extra exposure and statistics.

The cherry on the cake is that since eXo is referenced in the SourceForge Enterprise directory, add-ons can be promoted directly on our SourceForge project page.

For an open-source project, SourceForge can be a key driver to your downloads and popularity. If you have a business on top of your project, I suggest making SourceForge an integral part of your open-source strategy. Any decision around your strategy should start by asking yourself how you can leverage SourceForge before building anything yourself.

Guest Post from the Podcast Generator project

This is a guest post written by Alberto Betella of the Podcast Generator project.

Podcast Generator (PG) is an open source Content Management System specifically designed for podcast publishing. It has been developed and maintained since 2006 by Alberto Betella.

PG provides the user with the tools to easily manage all of the aspects related to the publication of a podcast, from the upload of episodes to its submission to the iTunes store.

PG was originally developed for the academic environment, where teachers often lack the technical skills (or the time) to deal with the technicalities of publishing a podcast (e.g. creating and maintaining an RSS feed) and prefer to focus on producing quality content for their students.

With this in mind, PG is conceived to be extremely simple to use and easy to customize, yet still powerful.

Publishing an episode using PG is as simple as uploading a file through the web browser, along with a title and a short description. PG automatically generates or updates a W3C-compliant RSS podcast feed. In doing so, it preprocesses the input to avoid the most common formatting errors (e.g. non-alphanumeric characters in filenames, HTML entity conversion, etc.), ensuring feed interoperability and compatibility with the widest range of RSS clients. The RSS feed supports iTunes-specific tags such as a long description of each episode, keywords, content rating, iTunes category and cover art. Episodes can optionally be organized into thematic categories, each of which has its own separate feed.
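As a rough illustration of the kind of preprocessing and feed generation described above, here is a minimal Python sketch. It is not PG's code (PG itself runs on PHP); the function names, the `/media/` path and the episode dictionary layout are all assumptions made for this example. It replaces unsafe filename characters and lets the XML library escape special characters in titles and descriptions.

```python
import re
import xml.etree.ElementTree as ET


def sanitize_filename(name):
    """Replace every character outside [A-Za-z0-9._-] with an underscore."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def build_feed(title, link, description, episodes):
    """Return a small RSS 2.0 document as a string.

    `episodes` is a list of dicts with the hypothetical keys
    'title', 'description', 'file' and 'size' (bytes).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        # ElementTree escapes &, < and > in text and attribute values.
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "description").text = ep["description"]
        url = link.rstrip("/") + "/media/" + sanitize_filename(ep["file"])
        ET.SubElement(item, "enclosure", url=url,
                      length=str(ep["size"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")
```

For example, `sanitize_filename("my show #1 (final).mp3")` yields `my_show__1__final_.mp3`, and an episode titled `Tom & Jerry` is written to the feed as `Tom &amp; Jerry`, so clients that parse the XML strictly still accept it.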

In addition, PG produces a dynamic website that includes a list of the most recent podcasts, a podcast archive and an mp3 streaming player. In this way, podcast episodes are available not only via RSS but also through the website, which increases their discoverability and visibility in search engines. To this end, PG implements the sharing capabilities of some of the most popular social networks (Facebook, Google+ and Twitter) and adopts SEO techniques such as permalinks and Open Graph meta tags.

Through PG’s admin interface, the user can upload new episodes or edit existing ones, manage categories and customize all the details of the podcast feed, including – but not limited to – title, description, cover image, author and language. Furthermore, a number of extra features are offered to more advanced users. PG adopts a tailor-made theme engine that allows the graphical appearance to be customized with new skins or integrated with existing websites.

PG also provides the means to manually upload one or more episodes via (s)FTP and easily include them in the podcast feed from the admin area (called the “FTP Feature”). This allows multiple files to be uploaded at once without the need to postprocess the episodes individually, since title, author and other details are extracted directly from the embedded ID3 tags. Moreover, the FTP feature comes in handy as a workaround for server-side restrictions on the size of files uploaded through a web browser, when the hosting provider does not allow users to override the default server settings and increase these limits.

PG has very few server requirements; in fact, it works on any web host with PHP support. The user’s data is stored in XML format, hence no MySQL DB is needed. PG can be installed in less than a minute through a 3-step setup wizard. In most cases a manual installation is not even necessary, as PG is offered as a preinstalled package by some of the biggest hosting and NAS service providers worldwide (e.g. PG is part of the Softaculous bundle, available to millions of users through the control panel provided with their hosting plan).

PG adopts GNU gettext to handle localizations and is currently available in 13 languages. The translations are autonomously managed by a community of volunteers.

As a final remark, a well-known problem common to many open source software projects (especially those maintained by small teams) is the lack of quality documentation and support. PG provides users with comprehensive documentation that covers a wide range of topics, from common issues and FAQs to more technical aspects. In addition, PG offers enterprise-like free support through the SourceForge ticket system: over the past 8 years, users have regularly submitted support requests, totaling in the hundreds, most of which were answered and resolved within a few hours.

PG has gone far beyond 100k downloads on SourceForge and counts several thousand active users. The latest version can be downloaded from the official project page.

March 2014 Staff Pick Project of the Month, Win32 Disk Imager

The Win32 Disk Imager project is a father (Tobin) / son (Justin) team, plus another developer, Jeff. Tobin is a regular in our IRC channel (Freenode: #sourceforge). This is a pretty cool story. Read on!

SF: Tell us what the Win32 Disk Imager project can do for folks…

Tobin:  Win32DiskImager is a tool to take filesystem images and raw files and write them to memory devices (USB memory sticks, SD/CF cards, etc).  It can also read from the device and save the image as a backup.

SF: What was the problem you were trying to solve with this effort?

Tobin:  This tool was originally developed for the Ubuntu 9.04 (Jaunty) Netbook release, targeting users of Netbooks with Windows preloaded.  At the time, Ubuntu only shipped CD ISO images and Netbooks don’t have a CD drive.  This was created as an easy-to-use solution for Windows users interested in trying Ubuntu.  I should note that the program went from concept to working release in a weekend.  Justin can comment more on this.

Justin: Tobin simply called me up on a Thursday after school (Senior year of high school if I remember correctly), and needed a screenshot by the end of the weekend so they could do preliminary documentation. I sent a screenshot only a few minutes later (gotta love developing with Qt) and then had to learn the win32 API *shudder* and by Friday night I had a fully functional prototype.

SF: Has your original vision been achieved?

Tobin:  For the targeted release, it went quite well.  After it was released in April, 2009, Ubuntu changed the format of their ISO images and combined the Desktop and Netbook images, so the tool was no longer needed for this purpose.  At this time, it was all but abandoned.

Justin: My original vision for it was simply a temporary tool for a temporary problem, that had other uses as well. After being asked to allow Ubuntu to take over the project and turn it into an Ubuntu specific tool, I kindly refused since I wanted to keep it a generalized tool with a wide range of uses. I didn’t quite imagine that that decision is what would ultimately allow it to explode in popularity like it has done.

SF: Who can benefit the most from Win32 Disk Imager?

Tobin:  Anyone that is using Windows based systems to do development work on embedded systems or users that want to test the latest Cyanogenmod on their Android devices.  I have also heard from users that use it to just back up their SD cards from their cameras.

SF: What’s the best way to get the most out of using Win32 Disk Imager?

Tobin:  The program is very simple in design.  The first thing to remember is to back up any important data you may have on your memory device before writing to it.  And also, read the readme.txt file.  If you have questions, please ask.  We try to answer all questions that we can.

SF: What is the Win32 Disk Imager release philosophy; do you all use the release early, release often precept?

Tobin:  There are currently two people actively maintaining it (Justin is focusing on another project also hosted on SourceForge).  We also have received a few translations from other users, which is great. Unfortunately, we don’t spend nearly as much time on it as we probably should.  We try to outline what features or bugs we want to resolve in the next release, then work towards that goal.  My biggest issues are the constantly changing APIs in Windows, and having to find out how to integrate them when something breaks.

SF: If not or if so, why?

Tobin:  Time and resources are the biggest factors here.

SF: What are the key features from your most current release?

Tobin: As of v0.9, we support generating MD5 checksums for image verification (helpful for downloaded images), Drag and Drop images from Windows Explorer, and the ability to define a default directory for images through an environment variable (defaults to the user’s Downloads directory).  This works quite well in Windows XP, but we have seen issues in newer Windows releases due to API changes.
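For readers curious what MD5-based image verification involves, here is a minimal Python sketch. It is not Win32 Disk Imager's actual code; the function names and the chunk size are assumptions for illustration. The file is hashed in chunks so that large images do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as image:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, expected_md5):
    """Compare a downloaded image against the checksum its publisher lists."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A mismatch usually means the download was truncated or corrupted, which is worth knowing before the image is written to a card.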

SF: What did the project team do to make sure these were completed in a timely manner?

Tobin:  Timely???  Due to our sporadic release cycle, just getting an updated release out was challenging enough.  :P

Justin: As a side note here the first release was dubbed the “Truck Stop” release since one of the guys debugging it was doing so from a truck stop since we had very little time to get the project ready for the initial release.

SF: What was the first big thing that happened for your project?

Tobin:  It wasn’t until mid-2010 when I had bought a Nook Color from Barnes and Noble that I discovered other interests in this program.  The guy selling the Nooks was trying to also sell a book on using the Nook. I told him I was a Linux developer and could pretty much figure it out on my own.  Then he showed me the chapter on “Rooting your Nook”. Glancing through it, I saw a screenshot of our program along with a url to the Wiki page instructions that I had written.

I immediately ran a Google search and found an entire community of users, mainly in the Android Hacker community, but also developers of embedded Linux systems and other types of devices.  There were also a large number of open bugs.  Since Justin had moved on to other projects, I took over as lead maintainer, and along with Jeff, we have cleaned up all of the original bugs and added some new features along the way.

The other major event was moving the project to SourceForge (YEA!!!). This has helped out a lot, both in exposure and in the tools now available to us to make this project more noticeable. Since moving (and subsequently being targeted as a SF Project of the Week), our user base has grown a lot.  Last time I was at Barnes and Noble, I found 6 different publications recommending our tool to their target audiences.

SF: What helped make that happen?

Tobin:  For the first part, word of mouth, I guess.  I can definitely say that just being on SourceForge has been a big thing.

Justin: It’s quite interesting that this project had received such worldwide fame despite having zero forms of advertising on our part. I guess that’s what you call going viral.

SF: What was the net result for that event?

Tobin:  I recently received an email from a German magazine editor, saying they were going to write a feature on our project.  I have also seen countless reviews, blogs, and even several video tutorials on Youtube.  Downloads are continuously growing week over week (I check the stats daily while sipping my morning coffee).

SF: What is the next big thing for Win32 Disk Imager?

Tobin:  We have a lot planned for upcoming releases.  First and foremost is to move to either a newer release of mingw or something equivalent, as there are a lot of new API issues in Windows that aren’t addressed in the release we currently build against.  Once we get that resolved, we have a wish list of features we want to integrate, starting with image compression/decompression on the fly.

Justin: A couple things I’ve been experimenting with outside of the project was to possibly have the drop-down-box show not only the drive letter but also the label on the device (for example mine might show up as “F: TuxDrive”). This would help a lot of people I think since my own personal experience of safely removing the drive on XP where it only tells you the letter has been annoying when the computer has 3 or 4 different removable devices plugged in. Also, it would be nice to eventually support batch processing of multiple images since the program is now also being used a lot in major tech companies where they’re flashing dozens of cards all hooked up to one system.

SF: How long do you think that will take?

Tobin:  Hard to say.  We already missed our soft target of 1.0 for the end of 2013.  But we do have an installer in the tree now.  That was one of my goals for 1.0. Right now I am focusing on an updated tool base that supports the newer APIs for Windows 7/8.

Justin: As for the pieces I’d like to see, it might be difficult as I’m tidying up other projects, most notably my Open RPG Maker, before going off to college this fall. However, I may be able to squeeze enough time in there to get those two small parts easily done.

SF: Do you have the resources you need to make that happen?

Tobin:  We could definitely use more help.  We are always open to contributions. We have already received a few translations from users, along with some code contributions.  I would also like to thank Jeff B (skydvr68) for his contributions in both code and with the questions forum.

SF: If you had it to do over again, what would you do differently for Win32DiskImager?

Tobin:  I’ll let Justin answer the next few.

Justin: I don’t really think there is anything I’d do differently since the initial release, while a bit buggy, was still fully functional.

SF: Is there anything else I should know?

Tobin: If we can get our development environment issues resolved, 2014 will be a great year for new features.  Hopefully.

Justin: Really awesome to have this project recognized as project of the month, especially seeing some of the other projects that usually get nominated. Feels pretty awesome to have played a part in getting this project there.

Convertigo, Mobile Application Development Platform for Enterprises

This is a guest blog post from Convertigo. Enterprise mobility projects are spreading like wildfire. When they deliver value, business leaders in every department seem to want more – and that’s when the bottleneck happens. Instead of connecting one data source to one type of mobile device, your IT infrastructure is faced with multiple device platforms, complex security issues, and disparate enterprise applications and data sources – many with no API.

To accommodate the many-device to many-platform mobile application integration scenario and the BYOD (Bring Your Own Device) paradigm that most IT teams operate within, Convertigo brings one of the most advanced Open Source Mobile Application Development Platforms to enterprises willing to embrace mobility in an industrial, secured and managed environment.

The Convertigo platform features front-end cross-platform development tools linked with a powerful back-end orchestration middleware able to connect to any enterprise back-office data and processes, whether there is an API or not.

Convertigo leverages standard open source technologies such as Eclipse, jQuery Mobile, PhoneGap and many others to build an industrial platform to build, run and manage any B2E, B2B and B2C mobile application connected to back office data.