From: Joydeep S. S. <js...@fa...> - 2008-11-21 00:14:09
Hi folks,

I can shed some light on scribe and hdfs/hadoop integration at FB:

- when we started out (actually Avinash - who's leading the Cassandra project now), we investigated writing log files from scribe directly to hdfs (using the libhdfs c++ api). However, there were a few issues with this approach that steered us in a different direction:
  o hdfs uptime: there have been periods of sustained downtime, and we can't rule that out in the future. There are many reasons - software upgrades being the most common. Buffering data in scribe for such long periods didn't seem like a very good route
  o lack of append support in hdfs in the early days
  o desire to build loosely coupled systems (otherwise we would have to upgrade scribe servers with a new libhdfs every time we had a software upgrade on hdfs)
  o flexibility in transforming data while copying into hdfs (more on this later)

- currently we have an rsync-like model to pull data from scribe to hdfs:
  o scribe writes data to netapp filers. These filers are high-speed buffers for the most part
  o we have 'copier' jobs that pull data from scribe output locations on these filers into hdfs. They maintain file offsets for copied data in a registry, so the jobs can be invoked periodically and copying can proceed continuously (a sketch of this loop follows after the list)
  o 'copier' jobs can run in continuous mode, or can be invoked to copy (or re-copy) data from older dates (this can be important if incorrect data was logged or data shows up late)
  o 'copier' jobs are map-only jobs in hadoop, which means we can increase the copy parallelism if required. For example, if we are falling behind, or hdfs was down for a long time and there's a lot of accumulated data, the copiers will dial up the parallelism (up to a maximum, so as not to trip up the filers completely)

- data copied into hdfs is eventually loaded into Hive (our open source data warehousing layer on top of Hadoop). Usually this loading is a nightly process, but in a small number of cases we load data at hourly granularity for semi-real-time applications. Application processing over scribe log sets is typically done in Hive QL.

- one interesting angle is 'close of books'. Scribe itself does not provide any hard guarantee on when all data for a given date will have been logged. However, several applications (especially revenue-sensitive ones) need a hard deadline (invoke me when all data for a given day has been logged). For such applications, the loading process typically waits until 2am or so on day N+1 and then scans data from days N-1, N and N+1 to find all the relevant data for day N (using the unix timestamps that are typically logged with the data). This is the data that's loaded into date partition N of the relevant hive table (see the second sketch below). Clearly the 2am boundary is arbitrary, and we will move towards more heuristic-based ways of determining when data for a given date is (almost) complete.

- we have instances of text, json and thrift data sets logged via scribe. For thrift (particularly when a thrift file contains heterogeneous records), we do some transforms in the copying process to make the subsequent loading easier. Thrift data also shows up in TFileTransport format, which cannot be parallel-processed by Hadoop natively (although it wouldn't be so hard to arrange that as well), so we always convert thrift data into sequencefiles as it's copied into hadoop (see the last sketch below).
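To make the copier model concrete, here is a minimal sketch of the offset-registry idea. All the names here (ScribeCopierSketch, the registry file layout) are hypothetical - our actual copier runs as a hadoop map-only job and is not open sourced - but the core loop is the same: look up the last copied offset, seek to it, copy the new bytes, and record the new offset so the next invocation resumes where this one stopped.

    import java.io.*;
    import java.util.Properties;

    // Hypothetical sketch: copy the bytes appended to a scribe output file
    // since the last run, then persist the new offset in a registry file.
    public class ScribeCopierSketch {
        public static void copyNewBytes(File src, OutputStream hdfsOut, File registry)
                throws IOException {
            Properties offsets = new Properties();
            if (registry.exists()) {
                try (InputStream in = new FileInputStream(registry)) {
                    offsets.load(in);
                }
            }
            long offset = Long.parseLong(offsets.getProperty(src.getPath(), "0"));

            try (RandomAccessFile raf = new RandomAccessFile(src, "r")) {
                raf.seek(offset);                 // resume where the last run stopped
                byte[] buf = new byte[64 * 1024];
                int n;
                while ((n = raf.read(buf)) != -1) {
                    hdfsOut.write(buf, 0, n);     // in reality this is an hdfs stream
                }
                offsets.setProperty(src.getPath(), Long.toString(raf.getFilePointer()));
            }
            try (OutputStream out = new FileOutputStream(registry)) {
                offsets.store(out, "scribe copier offsets");
            }
        }
    }

Re-copying older dates then amounts to resetting (or ignoring) the registered offsets for the affected files.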
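The 'close of books' scan is essentially a timestamp filter applied over a three-day window. A minimal sketch of that filter follows - in practice this logic runs as a Hive QL query or hadoop job over the N-1, N and N+1 data sets, so the class below is purely illustrative:

    import java.util.concurrent.TimeUnit;

    // Hypothetical sketch: keep only records whose logged unix timestamp
    // falls inside calendar day N, while scanning days N-1, N and N+1.
    public class CloseOfBooksFilter {
        private final long dayStart;   // first second of day N (unix time)
        private final long dayEnd;     // first second of day N+1

        public CloseOfBooksFilter(long dayStartUnixSeconds) {
            this.dayStart = dayStartUnixSeconds;
            this.dayEnd = dayStartUnixSeconds + TimeUnit.DAYS.toSeconds(1);
        }

        // called for every record scanned from the N-1, N and N+1 data sets
        public boolean belongsToDayN(long recordUnixSeconds) {
            return recordUnixSeconds >= dayStart && recordUnixSeconds < dayEnd;
        }
    }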
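And the thrift-to-sequencefile conversion amounts to re-wrapping each serialized thrift record as a value in a hadoop sequencefile, which is splittable and hence parallel-processable. A rough sketch, using the standard hadoop SequenceFile api - the iteration over deserialized TFileTransport records (the Iterator<byte[]> below) is an assumption, since that java implementation isn't open sourced:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import java.io.IOException;
    import java.util.Iterator;

    // Hypothetical sketch: re-wrap serialized thrift records into a
    // sequencefile so hadoop can split and process the data in parallel.
    public class ThriftToSequenceFile {
        public static void convert(Iterator<byte[]> thriftRecords, Path out)
                throws IOException {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            SequenceFile.Writer writer = SequenceFile.createWriter(
                    fs, conf, out, LongWritable.class, BytesWritable.class);
            try {
                long seq = 0;
                while (thriftRecords.hasNext()) {
                    byte[] rec = thriftRecords.next();  // one serialized thrift record
                    writer.append(new LongWritable(seq++), new BytesWritable(rec));
                }
            } finally {
                writer.close();
            }
        }
    }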
There are several pieces here that are not open sourced and, depending on community interest, can be made available: the scribe-to-hdfs copier code, for one. TFileTransport's java implementation is also not open sourced (since there is constant talk of superseding it with newer, better transports).

Please let us know if there are more questions - we would be happy to answer.

Joydeep

------ Forwarded Message
From: Johan Oskarsson <jo...@os...>
Date: Wed, 19 Nov 2008 09:36:38 -0800
To: <scr...@li...>
Subject: [Scribeserver-users] Scribe and Hadoop

I understand Scribe is being used to put logs into the Hadoop HDFS at Facebook. I'd love to hear more about how that works and how to replicate the setup you guys have.

/Johan
------ End of Forwarded Message