LTYARN Prototype Version 0.1 Released.
(http://ltyarn.sourceforge.net/)
Jul 20, 2014.
INTRODUCTION
============
Life is not fair, but with a little help, existing large-scale data processing systems (e.g., YARN, Spark, Dryad) can be, by sharing resources fairly among users. However, past work on fair sharing considered only memoryless fairness, i.e., an instantaneous fair share computed without any historical information. In cloud computing (i.e., pay-as-you-use computing), memoryless fairness fails to satisfy service-as-you-pay fairness (i.e., the total service that each user enjoys should be proportional to her payment) from a long-term view. For example, a user who leaves her share idle for a while is not compensated for it later. Long-Term Resource Fairness (LTRF) generalizes max-min fairness to this case. LTYARN implements LTRF for YARN in cloud computing.
Currently, LTYARN supports hadoop-2.2.0.
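The difference between memoryless and long-term fairness can be illustrated with a toy calculation. The sketch below is not LTYARN's scheduler and not the LTRF algorithm from the paper. It assumes two users who pay the same, a made-up capacity of 100 units per round, and made-up demands, and it compares per-round max-min sharing with a simple rule that lets the user who is behind catch up first. All function and variable names are hypothetical.

```shell
# Toy illustration only; all numbers and names are assumptions.
CAPACITY=100               # resource units available in each round
ROUNDS="0:100 100:100"     # "demandA:demandB" per round; user A is idle in round 1

min() { if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi; }

# Max-min split of $1 units between demands $2 (A) and $3 (B).
# Results are stored in SHARE_A and SHARE_B.
maxmin_split() {
  local cap="$1" da="$2" db="$3"
  SHARE_A=$(min "$da" $((cap / 2)))
  SHARE_B=$(min "$db" $((cap - SHARE_A)))
  SHARE_A=$(min "$da" $((cap - SHARE_B)))   # A may take what B left unused
}

# Memoryless: every round is split on that round's demands alone.
memoryless_totals() {
  local ta=0 tb=0 r
  for r in $ROUNDS; do
    maxmin_split "$CAPACITY" "${r%%:*}" "${r##*:}"
    ta=$((ta + SHARE_A)); tb=$((tb + SHARE_B))
  done
  echo "$ta $tb"
}

# Long-term (simplified): the user with the smaller running total
# catches up first, then the remaining capacity is split max-min.
longterm_totals() {
  local ta=0 tb=0 r da db cap give
  for r in $ROUNDS; do
    da=${r%%:*}; db=${r##*:}; cap=$CAPACITY
    if [ "$ta" -lt "$tb" ]; then
      give=$(min "$da" "$(min "$cap" $((tb - ta)))")
      ta=$((ta + give)); da=$((da - give)); cap=$((cap - give))
    elif [ "$tb" -lt "$ta" ]; then
      give=$(min "$db" "$(min "$cap" $((ta - tb)))")
      tb=$((tb + give)); db=$((db - give)); cap=$((cap - give))
    fi
    maxmin_split "$cap" "$da" "$db"
    ta=$((ta + SHARE_A)); tb=$((tb + SHARE_B))
  done
  echo "$ta $tb"
}

echo "memoryless totals (A B): $(memoryless_totals)"   # totals: 50 150
echo "long-term totals (A B): $(longterm_totals)"      # totals: 100 100
```

In this toy run both users pay the same, yet memoryless sharing leaves user A with 50 units against 150 for user B, while the catch-up rule ends with 100 units each.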
BUILDING
========
1. Download LTYARN_v0.1.patch.
2. Download the source archive hadoop-2.2.0-src.tar.gz from http://mirror.nus.edu.sg/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0-src.tar.gz,
and the binary archive hadoop-2.2.0.tar.gz from http://mirror.nus.edu.sg/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
3. Decompress both archives:
tar xvzf hadoop-2.2.0-src.tar.gz
tar xvzf hadoop-2.2.0.tar.gz
4. cd hadoop-2.2.0-src and run the following command:
patch -p1 < ../LTYARN_v0.1.patch
5. Build the whole source tree in hadoop-2.2.0-src:
mvn clean
mvn package -DskipTests
6. Go back to the parent directory (cd ..) and copy the rebuilt jar files into hadoop-2.2.0:
cp hadoop-2.2.0-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/target/hadoop-mapreduce-client-core-2.2.0.jar hadoop-2.2.0/share/hadoop/mapreduce/
cp hadoop-2.2.0-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/hadoop-mapreduce-client-jobclient-2.2.0.jar hadoop-2.2.0/share/hadoop/mapreduce/
cp hadoop-2.2.0-src/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/hadoop-mapreduce-client-app-2.2.0.jar hadoop-2.2.0/share/hadoop/mapreduce/
cp hadoop-2.2.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/hadoop-yarn-server-resourcemanager-2.2.0.jar hadoop-2.2.0/share/hadoop/yarn/
cp hadoop-2.2.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/hadoop-yarn-common-2.2.0.jar hadoop-2.2.0/share/hadoop/yarn/
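The copy step above can be wrapped in a small shell function that stops at the first missing jar, which makes a failed or partial build easier to spot. This is only a sketch: copy_ltyarn_jars is a hypothetical helper, not something shipped with LTYARN, and it assumes that all five jars live under the patched source tree in the layout produced by the build step.

```shell
# Hypothetical helper (not part of LTYARN): copy the five rebuilt jars
# from the patched source tree into the binary distribution.
# Usage: copy_ltyarn_jars <src-tree> <binary-dist> [version]
copy_ltyarn_jars() {
  local src="$1" dst="$2" ver="${3:-2.2.0}"
  local mr="$src/hadoop-mapreduce-project/hadoop-mapreduce-client"
  local yarn="$src/hadoop-yarn-project/hadoop-yarn"
  local name jar

  # MapReduce client jars go to share/hadoop/mapreduce.
  for name in core jobclient app; do
    jar="$mr/hadoop-mapreduce-client-$name/target/hadoop-mapreduce-client-$name-$ver.jar"
    [ -f "$jar" ] || { echo "missing jar: $jar" >&2; return 1; }
    cp "$jar" "$dst/share/hadoop/mapreduce/" || return 1
  done

  # YARN jars go to share/hadoop/yarn.
  for jar in \
    "$yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/hadoop-yarn-server-resourcemanager-$ver.jar" \
    "$yarn/hadoop-yarn-common/target/hadoop-yarn-common-$ver.jar"
  do
    [ -f "$jar" ] || { echo "missing jar: $jar" >&2; return 1; }
    cp "$jar" "$dst/share/hadoop/yarn/" || return 1
  done
}
```

From the directory that contains both trees, it would be called as: copy_ltyarn_jars hadoop-2.2.0-src hadoop-2.2.0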
CONFIGURING
===========
1. Edit the file hadoop-2.2.0/etc/hadoop/yarn-site.xml and set yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.memorableFair.MemorableFairScheduler, i.e.,
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.memorableFair.MemorableFairScheduler</value>
</property>
2. Configure hadoop-2.2.0/etc/hadoop/fair-scheduler.xml, e.g.,
<?xml version="1.0"?>
<allocations>
<queue name="Spark">
<schedulingPolicy>fair</schedulingPolicy>
<queueFairShareSchedulingMode>memorable_maxmin_fairshare</queueFairShareSchedulingMode>
<workflowFairShareSchedulingMode>memorable_maxmin_fairshare</workflowFairShareSchedulingMode>
</queue>
<queue name="Facebook">
<schedulingPolicy>fair</schedulingPolicy>
<queueFairShareSchedulingMode>memorable_maxmin_fairshare</queueFairShareSchedulingMode>
<workflowFairShareSchedulingMode>memorable_maxmin_fairshare</workflowFairShareSchedulingMode>
</queue>
<adaptiveTaskQuantumEnabled>true</adaptiveTaskQuantumEnabled>
<watchTimeBasedEnabled>true</watchTimeBasedEnabled>
<defaultQueueFairShareSchedulingMode>memorable_maxmin_fairshare</defaultQueueFairShareSchedulingMode> <!-- memoryless_maxmin_fairshare or memorable_maxmin_fairshare -->
<defaultWorkflowFairShareSchedulingMode>memorable_maxmin_fairshare</defaultWorkflowFairShareSchedulingMode> <!-- memoryless_maxmin_fairshare or memorable_maxmin_fairshare -->
<defaultQueueRoundRobinTaskTimeQuantum>60</defaultQueueRoundRobinTaskTimeQuantum> <!-- by default, 60 sec -->
<defaultQueueRoundRobinTimeRoundLength>3600</defaultQueueRoundRobinTimeRoundLength> <!-- by default, 1 hour -->
<defaultWorkflowRoundRobinTaskTimeQuantum>60</defaultWorkflowRoundRobinTaskTimeQuantum>
<defaultWorkflowRoundRobinTimeRoundLength>600</defaultWorkflowRoundRobinTimeRoundLength>
<defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
<defaultWorkflowSchedulingPolicy>fair</defaultWorkflowSchedulingPolicy>
</allocations>
3. For other Hadoop-specific configuration and cluster setup, please refer to http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-common/ClusterSetup.html
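Before starting the cluster, the two edited files can be given a quick sanity check. The function below is only a sketch: check_ltyarn_config is a hypothetical helper, not part of LTYARN or Hadoop, and it does plain text matching with grep rather than real XML validation, so it can only catch obvious omissions.

```shell
# Hypothetical helper (not part of LTYARN): text-level sanity check of the
# two configuration files edited above.
# Usage: check_ltyarn_config <binary-dist>
check_ltyarn_config() {
  local conf_dir="$1/etc/hadoop"
  local sched="org.apache.hadoop.yarn.server.resourcemanager.scheduler.memorableFair.MemorableFairScheduler"

  [ -f "$conf_dir/yarn-site.xml" ] || { echo "missing: $conf_dir/yarn-site.xml" >&2; return 1; }
  # -F matches the class name literally, so the dots are not regex wildcards.
  grep -qF "$sched" "$conf_dir/yarn-site.xml" \
    || { echo "scheduler class not found in yarn-site.xml" >&2; return 1; }

  [ -f "$conf_dir/fair-scheduler.xml" ] || { echo "missing: $conf_dir/fair-scheduler.xml" >&2; return 1; }
  grep -qF "</allocations>" "$conf_dir/fair-scheduler.xml" \
    || { echo "fair-scheduler.xml has no closing </allocations> tag" >&2; return 1; }

  echo "LTYARN configuration files look plausible"
}
```

It would be called as: check_ltyarn_config hadoop-2.2.0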
COPYRIGHT
=========
This software is developed by Tang Shanjiang, School of Computer Engineering,
Nanyang Technological University. For comments or problems, please contact Tang Shanjiang
directly at any of the following email addresses:
stang5@e.ntu.edu.sg; tashj@tju.edu.cn; tashj@sina.com.
LTYARN is open-source software released under the Apache License, Version 2.0.
LTYARN is distributed WITHOUT WARRANTY, express or implied. The authors accept NO LEGAL
LIABILITY OR RESPONSIBILITY for loss due to reliance on the program.
If you use this software, please cite the following papers:
/******************************************************
Shanjiang Tang, Bu-Sung Lee, Bingsheng He, and Haikun Liu:
"Long-Term Resource Fairness: Towards Economic Fairness on Pay-as-you-use Computing Systems", ICS'14, pp. 251-260, 2014.
Shanjiang Tang, Bu-Sung Lee, and Bingsheng He:
"Fair Resource Allocation for Data-Intensive Computing in the Cloud", IEEE Transactions on Services Computing, 2016.
*******************************************************/