Passing parameters to mapper in hadoop pipes


    mayank jalan - 2014-05-09

    First of all, awesome job.

    I would like to know whether it is possible to pass parameters to a Hadoop Pipes map program.
    Can I also pass input to the record reader?
    Please help.

    The example you have shown only works with Pydoop Script. Can you provide a similar example for

    Example:

    Job Parameters
    Suppose you want to select all lines containing a substring given at run time. Create a module grep.py:

    def mapper(_, text, writer, conf):  # notice the fourth 'conf' argument
        if text.find(conf['grep-expression']) >= 0:
            writer.emit("", text)
    Job parameters, as in Hadoop Pipes, are passed via the -D option:

    pydoop script --num-reducers 0 -t '' -D grep-expression=my_substring \
        grep.py hdfs_input hdfs_output
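    The mechanism above can be sketched locally without a cluster: the framework hands the mapper a conf object that behaves like a dict of the -D key/value pairs. The stub writer and the plain-dict conf below are stand-ins for illustration, not Pydoop's actual classes; in a real job the framework supplies both.

    ```python
    # Sketch of how a -D job parameter reaches the mapper's fourth argument.
    # StubWriter and the plain dict 'conf' are assumptions standing in for
    # the objects Pydoop Script would pass at run time.

    def mapper(_, text, writer, conf):
        # Emit only lines containing the run-time substring.
        if text.find(conf['grep-expression']) >= 0:
            writer.emit("", text)

    class StubWriter:
        """Collects emitted (key, value) pairs in memory."""
        def __init__(self):
            self.pairs = []
        def emit(self, key, value):
            self.pairs.append((key, value))

    # As if the job were launched with: -D grep-expression=error
    conf = {'grep-expression': 'error'}
    writer = StubWriter()
    for offset, line in enumerate(["all good", "an error occurred", "done"]):
        mapper(offset, line, writer, conf)

    print(writer.pairs)  # [('', 'an error occurred')]
    ```

    Only the line containing the configured substring is emitted; changing the -D value changes the filter without touching the mapper code.
    
    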


