From: Ian H. <li...@ho...> - 2008-11-21 01:01:10
Hi Joydeep. We currently take the naive approach of writing log files directly into Hadoop as individual files, rotating them every 15 minutes to avoid the append problem. We use logtail on the client side to decouple the system, and a map/reduce job then aggregates the data 10-30 minutes later. I was wondering if you had seen the recent contribution to Hadoop called 'Chukwa', and what your thoughts were on it. Personally I'm looking at Scribe (and Chukwa) for real-time logging and decision systems.

Joydeep Sen Sarma wrote:

Hi folks,

I can shed some light on scribe and hdfs/hadoop integration at FB:

- when we (actually Avinash, who's leading the Cassandra project now) started out, we investigated writing log files from scribe directly to hdfs (using the libhdfs C++ API). However, there were a few issues with this approach that steered us in a different direction:

o hdfs uptime: there have been periods of sustained downtime, and we can't rule that out in the future. There are many reasons, software upgrades being the most common. Buffering data in scribe for such long periods didn't seem like a very good route.

o lack of append support in hdfs in the early days.

o desire to build loosely coupled systems (otherwise we would have to upgrade the scribe servers with a new libhdfs every time we had a software upgrade on hdfs).

o flexibility in transforming data while copying into hdfs (more on this later).

- currently we have an rsync-like model to pull data from scribe to hdfs:

o scribe writes data to NetApp filers. These filers are high-speed buffers for the most part.

o we have 'copier' jobs that 'pull' data from the scribe output locations in these filers into hdfs. They maintain file offsets for the copied data in a registry, so the jobs can be invoked periodically and copying happens continuously.

o 'copier' jobs can run in continuous mode, or can be invoked to copy (or re-copy) data from older dates (this can be important if incorrect data was logged or data shows up late).

o 'copier' jobs are map-only jobs in hadoop; this means we can increase the copy parallelism if required. For example, if we are falling behind, or hdfs was down for a long time and there's a lot of accumulated data, the copiers will dial up the parallelism (up to a maximum, so as not to trip up the filers completely). (A sketch of such a copier follows below.)

- data 'copied' into hdfs is eventually 'loaded' into Hive (our open source data warehousing layer on top of Hadoop). Usually this loading is a nightly process, but in a small number of cases we load data at hourly granularity for semi-real-time applications. Application processing over scribe log sets typically uses Hive QL.

- one interesting angle is 'close of books'. Scribe itself does not provide any hard guarantee on when all the data for a given date will have been logged. However, several applications (especially revenue-sensitive ones) need a hard deadline ("invoke me when all the data for a given day has been logged"). For such applications, the loading process typically waits until about 2am on day N+1 and then scans the data from days N-1, N and N+1 to find all the records relevant to day N (using the unix timestamps that are typically logged with the data). This is the data that gets loaded into date partition N of the relevant hive table. Clearly the 2am boundary is arbitrary, and we will move towards more heuristic ways of determining when the data for a given date is (almost) complete.
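Since the copier code mentioned above is not open source, the following is a minimal sketch of what such a map-only copier might look like, written against Hadoop's old org.apache.hadoop.mapred API (current as of this thread). The input format (one "sourceFile destinationDir" line per map task), the per-file ".offset" sidecar standing in for the offset registry, and all class names are hypothetical, invented purely for illustration:

// Hypothetical sketch only; all names invented for illustration.
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class ScribeCopierMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, NullWritable, NullWritable> {

  private JobConf conf;

  public void configure(JobConf conf) { this.conf = conf; }

  // Each input record is a line "sourceFile destinationDir", one file per
  // map task (e.g. via NLineInputFormat), so copy parallelism is simply
  // the number of files handed to one job invocation.
  public void map(LongWritable key, Text value,
                  OutputCollector<NullWritable, NullWritable> out,
                  Reporter reporter) throws IOException {
    String[] parts = value.toString().trim().split("\\s+");
    Path src = new Path(parts[0]);
    Path dstDir = new Path(parts[1]);
    Path offsetFile = new Path(dstDir, ".offset"); // stands in for the registry

    FileSystem srcFs = src.getFileSystem(conf);
    FileSystem dstFs = dstDir.getFileSystem(conf);

    long start = readOffset(dstFs, offsetFile);    // resume where we left off
    long end = srcFs.getFileStatus(src).getLen();
    if (end <= start) return;                      // nothing new to copy

    FSDataInputStream in = srcFs.open(src);
    in.seek(start);
    // No append assumed: each incremental copy lands in its own part file,
    // named by the byte range it covers.
    FSDataOutputStream part = dstFs.create(new Path(dstDir, start + "-" + end));
    byte[] buf = new byte[65536];
    long remaining = end - start;
    while (remaining > 0) {
      int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
      if (n < 0) break;
      part.write(buf, 0, n);
      remaining -= n;
      reporter.progress();                         // keep the task alive
    }
    part.close();
    in.close();
    writeOffset(dstFs, offsetFile, end);           // commit the new offset
  }

  private long readOffset(FileSystem fs, Path p) throws IOException {
    if (!fs.exists(p)) return 0L;
    FSDataInputStream in = fs.open(p);
    try { return in.readLong(); } finally { in.close(); }
  }

  private void writeOffset(FileSystem fs, Path p, long off) throws IOException {
    FSDataOutputStream out = fs.create(p, true);   // overwrite old offset
    try { out.writeLong(off); } finally { out.close(); }
  }
}

With this layout, "dialing up the parallelism" is just a matter of handing more input lines (and hence more map tasks) to a single job run, capped so the filers are not overwhelmed.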
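In the same spirit, here is a minimal sketch of the day-N 'close of books' filter. It assumes, hypothetically, that each record carries its unix timestamp as the first tab-separated field and that the loading process passes the day boundaries in through the job configuration:

// Hypothetical sketch only; record layout and config keys are assumptions.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class CloseOfBooksFilter extends MapReduceBase
    implements Mapper<LongWritable, Text, NullWritable, Text> {

  private long dayStart;  // first second of day N, inclusive
  private long dayEnd;    // first second of day N+1, exclusive

  public void configure(JobConf conf) {
    // The loading process runs this over days N-1, N and N+1 after the
    // 2am deadline, passing the boundaries of day N in the job conf.
    dayStart = conf.getLong("books.day.start", 0L);
    dayEnd = conf.getLong("books.day.end", Long.MAX_VALUE);
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<NullWritable, Text> out,
                  Reporter reporter) throws IOException {
    String line = value.toString();
    int tab = line.indexOf('\t');
    if (tab < 0) return;                       // malformed record: skip
    long ts = Long.parseLong(line.substring(0, tab));
    if (ts >= dayStart && ts < dayEnd) {
      out.collect(NullWritable.get(), value);  // record belongs to day N
    }
  }
}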
- we have instances of text, json and thrift data sets logged via scribe. In the case of thrift (particularly when a thrift file contains heterogeneous records), we do some transforms in the copying process to make the subsequent loading easier. Thrift data also shows up in TFileTransport format, which Hadoop cannot parallel-process natively (although it wouldn't be so hard to arrange that as well), so we always convert thrift data into sequencefiles as it is copied into hadoop. (A sketch of this conversion appears at the end of this mail, after the forwarded message.)

There are several pieces here that are not open sourced and, depending on community interest, can be made available: the scribe-to-hdfs copier code, for one. TFileTransport's java implementation is also not open sourced (since there is constant talk of superseding it with newer, better transports).

Please let us know if there are more questions and we would be happy to answer.

Joydeep

------ Forwarded Message
From: Johan Oskarsson <jo...@os...>
Date: Wed, 19 Nov 2008 09:36:38 -0800
To: <scr...@li...>
Subject: [Scribeserver-users] Scribe and Hadoop

I understand Scribe is being used to put logs into the Hadoop HDFS at Facebook. I'd love to hear more about how that works and how to replicate the setup you guys have.

/Johan

------ End of Forwarded Message
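As for the thrift-to-sequencefile conversion mentioned above: TFileTransport's java implementation is not open source, so the reader below assumes a simple length-prefixed record framing purely for illustration (real TFileTransport files are chunked and more involved). The point of the sketch is the output side: each opaque thrift record becomes one BytesWritable value in a block-compressed SequenceFile, which Hadoop can split and process in parallel:

// Hypothetical sketch only; the record framing is an assumption, not the
// actual TFileTransport format.
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

public class ThriftToSequenceFile {

  // Stand-in reader: assumes each serialized thrift record is preceded by
  // a 4-byte big-endian length. Only for illustration.
  static List<byte[]> readRecords(String localFile) throws IOException {
    List<byte[]> records = new ArrayList<byte[]>();
    DataInputStream in = new DataInputStream(
        new BufferedInputStream(new FileInputStream(localFile)));
    try {
      while (true) {
        int len;
        try { len = in.readInt(); } catch (EOFException eof) { break; }
        byte[] rec = new byte[len];
        in.readFully(rec);
        records.add(rec);
      }
    } finally {
      in.close();
    }
    return records;
  }

  // Usage: ThriftToSequenceFile <local thrift log> <hdfs output path>
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Block-compressed sequencefiles are splittable, so the converted
    // records can be processed in parallel by later map/reduce jobs.
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path(args[1]),
        LongWritable.class, BytesWritable.class,
        SequenceFile.CompressionType.BLOCK);
    long seq = 0;
    for (byte[] record : readRecords(args[0])) {
      writer.append(new LongWritable(seq++), new BytesWritable(record));
    }
    writer.close();
  }
}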