Attempting Liferay Portal integration with Apache Hadoop

Kim Kunc

The goal of the project is to have a Liferay 6 instance running on an Apache Hadoop system.

Hadoop features - how could Liferay Portal benefit from the Hadoop framework?

  1. Map/reduce service for portlets and services
    This should allow portlets and services to run map/reduce jobs.

As pointed out in this article, map/reduce algorithms are used on large datasets.
In a portal context, map/reduce could enhance indexing and searching capabilities,
data mining of social activity, social equity and messaging data, and general auditing capabilities.

See article on warehousing:
http://www.dbms2.com/2008/08/26/why-mapreduce-matters-to-sql-data-warehousing/
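To make the idea more concrete, here is a minimal sketch of the map/reduce pattern applied to social activity data. It is plain Java and does not use the Hadoop API; the class name ActivityCount and the "userId,activityType" log line format are assumptions made up for illustration, not anything Liferay or Hadoop provides. On a real cluster the map and reduce steps would run distributed across nodes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map/reduce pattern (no Hadoop dependency).
public class ActivityCount {

    // Map step: turn one log line "userId,activityType" (hypothetical format)
    // into a (userId, 1) pair.
    public static Map.Entry<String, Integer> map(String logLine) {
        String userId = logLine.split(",")[0].trim();
        return new SimpleEntry<>(userId, 1);
    }

    // Reduce step: combine two partial counts for the same user.
    public static int reduce(int left, int right) {
        return left + right;
    }

    // Driver: map every line, group the pairs by user, and reduce each group.
    public static Map<String, Integer> countByUser(List<String> logLines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : logLines) {
            Map.Entry<String, Integer> pair = map(line);
            totals.merge(pair.getKey(), pair.getValue(), ActivityCount::reduce);
        }
        return totals;
    }
}
```

Calling ActivityCount.countByUser with a list of activity log lines returns the number of activities per user. The same shape (emit key/value pairs, then combine values per key) is what a Hadoop job would need for indexing or audit reports.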

  2. HDFS - allow the portal to store data and documents on HDFS nodes.
    I am not sure whether HDFS works for images as well. As far as I can gather, HDFS is good with large data sets, as in 10,000 documents and more.

http://hadoopblog.blogspot.com/2011/05/realtime-hadoop-usage-at-facebook-part.html

  3. Database access - I'm still guessing here ... A first goal might be to have some sort of Expando integration, as Ray Auge from Liferay suggests in his forum post:
    http://www.liferay.com/de/community/forums/-/message_boards/message/6952760

Another task may be to find out how HBase integration could work.

  4. Adding more features... All ideas are welcome!

See links to Hadoop and Liferay Portal here

