Re: [Openscrobbler-devel] Distributing the system

Thanks for the clarifications, sorry for the delay in reply time.

On Mar 5, 2005, at 1:08 AM, Jonathan Dance wrote:

> Continuing my inline-whoring....
>
> On Mar 4, 2005, at 11:56 PM, Mr.Deep wrote:
>
>> I think it would be better to develop it as the central 
>> server/clusters system that we have been discussing. As you 
>> mentioned, building an AS-like clone may end up just making things 
>> harder on us because we'll have to put a significant amount of effort 
>> into regrouping it into a distributed system.  I guess the bad part 
>> of going straight to the central server/clusters system is that it 
>> will take longer, right?
>
> Yea, it could take a really long time (especially at the current pace) 
> to get there. It would of course make sense to avoid as much 
> re-development as possible, but I think it reasonable to assume that 
> we're not going to jump from 0% to 100% - we're going to need a way to 
> get there, and that probably involves a "standalone" cluster-ish 
> system in the shorter term.
>
>> I finally took a look at the docs, and I am still having difficulty 
>> figuring out exactly what sort of db interaction is going to be 
>> taking place when a song play is submitted, and when a view [misc 
>> data] request is received.  I *think* it is better from a db design 
>> standpoint to simply insert the fact that a song is played when it is 
>> (and I think that's what the song_data table is for), but I think we 
>> would be able to provide a faster overall experience to the users if 
>> we were to include play counts with every song, artist, album, etc, 
>> and update them with every submission.  I think it would be worth it 
>> to have faster statistic browsing at the cost of slower submission 
>> processing.  I think I'm pretty much suggesting that we keep a 
>> submission queue / cruncher, and hope to have faster / simpler 
>> queries for viewing statistics.  Are we already planning on doing 
>> something like this (updating total playcounts) and I'm just not 
>> seeing it being mentioned? Is it really stupid for some reason that I 
>> don't understand? Are we doing anything to improve upon AS beyond 
>> turning it into a distributed system? (And is this even one of the 
>> project goals? Does it need to be?)
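
A minimal sketch of the suggestion above: log every play and also bump a cached play count at submission time. It uses SQLite only so it runs standalone. Only the table name song_data comes from the thread; the songs.play_count column, the other column names, and the function names are assumptions, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: only the name "song_data" appears in the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE songs (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        play_count INTEGER NOT NULL DEFAULT 0   -- cached total (assumed column)
    );
    CREATE TABLE song_data (
        user_id INTEGER NOT NULL,
        song_id INTEGER NOT NULL REFERENCES songs(id),
        played_at INTEGER NOT NULL               -- Unix timestamp
    );
""")

def submit_play(conn, user_id, song_id, played_at):
    """Log the play and bump the cached total in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO song_data (user_id, song_id, played_at) VALUES (?, ?, ?)",
            (user_id, song_id, played_at),
        )
        conn.execute(
            "UPDATE songs SET play_count = play_count + 1 WHERE id = ?",
            (song_id,),
        )

def cached_total(conn, song_id):
    """Fast path: read the precomputed counter."""
    return conn.execute(
        "SELECT play_count FROM songs WHERE id = ?", (song_id,)
    ).fetchone()[0]

def counted_total(conn, song_id):
    """Slow path: count the raw play rows."""
    return conn.execute(
        "SELECT COUNT(*) FROM song_data WHERE song_id = ?", (song_id,)
    ).fetchone()[0]

conn.execute("INSERT INTO songs (id, title) VALUES (1, 'Example Song')")
conn.commit()
submit_play(conn, 10, 1, 1110000000)
submit_play(conn, 11, 1, 1110000300)
```

Submission does two writes instead of one, but viewing a total becomes a single-row read. Keeping the raw rows means the counter can always be rebuilt from song_data if it drifts.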
>
> The database at the moment is a result of the ERD and is not 
> final nor optimized.  I also did it before I came up with any 
> solutions for the distributed system.
>
> At the moment, there is also no easy place to put the "cached" data in 
> the DB. (We call it caching, even though it's not in RAM or anything.) 
> For example:
> - Total song count [for all users] is easy. Just put it in songs.
> - Song count per user is not. There's no "user-songs" table. (Yet)
>
> Only saving aggregated statistics makes you lose granularity 
> (basically you lose the "time" element); for instance, you can't say 
> what happened in the past week, unless you capture that specifically. 
> We're having the same issue at work trying to create a stats package 
> for our game - balancing lots of details with performance, as well as 
> storage.
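
To illustrate the granularity point: an all-time counter alone cannot answer "what happened in the past week", but a counter keyed by week can, at the cost of more stored rows. This is a sketch under assumptions; the class and method names are hypothetical and weeks are taken to start on Monday, UTC.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def week_start(ts):
    """Midnight on the Monday of the week containing ts."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return day - timedelta(days=day.weekday())

class WeeklyCounts:
    """Aggregates that capture the time element explicitly, per week."""

    def __init__(self):
        self.total = Counter()    # song_id -> all-time plays
        self.by_week = Counter()  # (song_id, week start) -> plays that week

    def record(self, song_id, played_at):
        self.total[song_id] += 1
        self.by_week[(song_id, week_start(played_at))] += 1

    def plays_in_week(self, song_id, any_time_in_week):
        return self.by_week[(song_id, week_start(any_time_in_week))]

stats = WeeklyCounts()
stats.record(1, datetime(2005, 2, 22, 9, 0, tzinfo=timezone.utc))
stats.record(1, datetime(2005, 3, 1, 14, 30, tzinfo=timezone.utc))
stats.record(1, datetime(2005, 3, 4, 23, 59, tzinfo=timezone.utc))
this_week = stats.plays_in_week(1, datetime(2005, 3, 5, tzinfo=timezone.utc))
```

Anything finer than the chosen bucket (per day, per hour) is still lost, which is the details-versus-storage balance described above.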
>
> Yes, at the moment this is all in "song_data" which does the job just 
> great, just not too quickly. My hope was to escape this problem by:
> - clustering
> - caching data to memory or disk - use memcached and/or store 
> generated profile data somewhere.
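
The caching idea could look roughly like this. A plain in-process dict stands in for memcached so the sketch runs with no dependencies; it is not the memcached API, and the class name, key format, and 60-second lifetime are all assumptions.

```python
import time

class TTLCache:
    """Keep generated data for a fixed time, then regenerate on next request."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]                 # still fresh: skip the slow query
        value = compute()                 # slow path, e.g. a scan of song_data
        self._store[key] = (now + self.ttl, value)
        return value

# Usage with a fake clock and a counter of how often the slow path runs.
now = [0.0]
calls = []

def build_profile():
    calls.append(1)
    return {"user": "example", "top_artist": "Example Artist"}

cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.get_or_compute("profile:example", build_profile)
cache.get_or_compute("profile:example", build_profile)  # served from cache
now[0] = 61.0
cache.get_or_compute("profile:example", build_profile)  # expired, rebuilt
```

The price is staleness: a profile can be up to one lifetime out of date.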
>
> As far as goals... no it doesn't need to be distributed (or, depending 
> on your point of view, aggregated). This is a "would be cool" factor 
> that would help bring all users together. It stemmed from the fact 
> that it would be awesome if all the music tracking sites could be 
> networked in a way so that there could be a "definitive" aggregation. 
> Having 5, 10, or 100 little sites with all their own statistics would 
> be inherently bad. (For example: LiveJournal. You want everyone to 
> have their journal at LJ so that you don't have to hop around the web, 
> etc.) There is a definite advantage to having lots of people on one 
> system. By aggregating the pieces, you create one system where there 
> were previously many.
>
> Maybe this is why I always think of it as "bringing together the 
> clusters" because part of my idea was even a site like Audioscrobbler, 
> which does not run Openscrobbler, could possibly contribute to the 
> global statistics. If there was an API that could be implemented for 
> any system, then even this would be possible.
>
> But I digress.
>
> The more important goal in the shorter-term is/was to get an 
> open-source listener tracking system that is geared to providing a 
> smaller number of users a larger number of features (compared to 
> Audioscrobbler). I want to see the return of time played, 
> weekly/monthly/etc stats, and stats that update more often than 
> whenever-they-feel-like-it. I also want to see albums! My real 
> motivation for doing this is more out of user frustration than geek 
> pride.
>
>> The "Please Wait ..." screen should be fine, it would definitely 
>> be better than just having the page take forever.
>
> Yea, that'd be bad. I think the other option is to display something 
> like:
> "Global statistics have not yet been generated for this 
> <song,album,artist>. Your request has been added to the queue and will 
> be processed shortly. Please check back in a few minutes."
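
One possible shape for that behaviour: a request returns the statistics if they exist, otherwise it queues the item once and returns the notice. The class, the method names, and the worker loop are hypothetical; nothing here is settled in the thread.

```python
from collections import deque

PENDING_MESSAGE = (
    "Global statistics have not yet been generated for this {kind}. "
    "Your request has been added to the queue and will be processed "
    "shortly. Please check back in a few minutes."
)

class StatsService:
    def __init__(self, compute):
        self.compute = compute   # slow function: (kind, item_id) -> stats
        self.ready = {}          # (kind, item_id) -> generated stats
        self.queue = deque()     # work waiting for the cruncher, oldest first
        self.queued = set()      # avoids queueing the same item twice

    def request(self, kind, item_id):
        """Return stats if generated, else enqueue and return the notice."""
        key = (kind, item_id)
        if key in self.ready:
            return self.ready[key]
        if key not in self.queued:
            self.queued.add(key)
            self.queue.append(key)
        return PENDING_MESSAGE.format(kind=kind)

    def process_one(self):
        """Run by a background worker; returns False when the queue is empty."""
        if not self.queue:
            return False
        key = self.queue.popleft()
        self.queued.discard(key)
        self.ready[key] = self.compute(*key)
        return True

service = StatsService(lambda kind, item_id: {"kind": kind, "id": item_id, "plays": 0})
first = service.request("song", 13)    # not generated yet: notice, item queued
second = service.request("song", 13)   # still pending, not queued a second time
service.process_one()
third = service.request("song", 13)    # now the generated stats
```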
>
>> Concerning the ids we're going to be assigning, what if the central 
>> server just had some sort of id mapping table, so it would know that 
>> song 13 on server a = song 337 on server b?
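
A rough sketch of such a mapping table, held in memory for illustration. The global ID scheme and every name below are assumptions; how a server would decide that two local songs are the same track is left open here, as it is in the thread.

```python
class SongIdMap:
    """Central table: (server, local song id) -> shared global id."""

    def __init__(self):
        self._global_by_local = {}
        self._next_global = 1

    def register(self, server, local_id, same_as=None):
        """Assign a global id. same_as is an already-registered
        (server, local_id) pair known to be the same song."""
        key = (server, local_id)
        if key in self._global_by_local:
            return self._global_by_local[key]
        if same_as is not None:
            global_id = self._global_by_local[same_as]
        else:
            global_id = self._next_global
            self._next_global += 1
        self._global_by_local[key] = global_id
        return global_id

    def same_song(self, a, b):
        """True only if both pairs are registered under one global id."""
        ga = self._global_by_local.get(a)
        gb = self._global_by_local.get(b)
        return ga is not None and ga == gb

ids = SongIdMap()
ids.register("a", 13)
ids.register("b", 337, same_as=("a", 13))  # song 13 on a = song 337 on b
ids.register("b", 338)                     # unrelated song on server b
```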
>
> Not sure. Need more thought on how IDs will be used and how critical 
> it is that things get "aligned."
>
> --JD
>
>
>