Thread: Re: [GD-General] Re: asset & document management
From: Neil S. <ne...@r0...> - 2003-05-17 15:56:00
> How does it handle an artist that creates 182 versions of an 80 megabyte
> binary file in the course of 3 weeks? I suppose CVS would end up with 14 GB
> of archives, by which time it has probably long croaked :-)

Well, it doesn't do binary diffs, which means it would have to store a
complete copy of every version of the file (i.e. 14 GB, same as CVS). It
can, however, do a basic binary compare so you can avoid storing multiple
copies of the same file. In my experience, artists tend to check out an
entire directory, change one or two files, and then check them all in
again, so this simple compare can save a lot of space. The only downside is
that it only seems to do the compare when you ask it to "revert unchanged
files", not when simply checking in the files, so it's a bit of a pain if
people forget to do that (also quite common).

One ray of light, though, is that it will use a user-provided diff utility,
so you could try giving it a binary-aware one. What I'm not sure about is
whether it will handle the output from a binary diff or not. I was planning
to look into this at some point, so if I ever get round to it, I'll let you
know what I find out.

> I've looked at Perforce in the past, it's a really nice product, similar
> to what we already know. But does it handle really, really large amounts
> of data?

Contrary to what I've said above, it actually does handle a lot of data
rather well, albeit in a brute-force manner and using a lot of hard disk
space. With a huge amount of data, it does start to slow a little, but not
nearly as badly as you would expect. We split our code and data into
separate depots, just to maintain a certain level of slickness in the code
depot, but we do have a lot of data, and this wouldn't be necessary for all
projects. On top of this, Perforce is extremely reliable, almost
supernaturally so, which is a major factor when looking after your most
important assets. ;)

- Neil.
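For what it's worth, the space saving from that "revert unchanged files"
step needs nothing more than a content comparison between each opened file
and its head revision. A minimal sketch of the idea in Python - the helper
names and the (working copy, head revision) pairing are assumptions for
illustration, not Perforce's actual implementation:

    import hashlib

    def file_digest(path, chunk_size=1 << 20):
        """Hash a file in 1 MB chunks so large binary assets never sit whole in memory."""
        digest = hashlib.sha1()
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                digest.update(chunk)
        return digest.hexdigest()

    def unchanged_files(candidates):
        """Return the working files whose contents match their head revision.

        `candidates` is a list of (working_path, head_path) pairs; a client
        could revert these instead of submitting duplicate 80 MB copies.
        """
        return [work for work, head in candidates
                if file_digest(work) == file_digest(head)]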
From: Neil S. <ne...@r0...> - 2003-05-17 16:23:00
> Well, it doesn't do binary diffs, which means it would have to store a
> complete copy of every version of the file (i.e. 14 GB, same as CVS). It
> can,

I just realised I wasn't very clear here. When I say it doesn't do binary
diffs, I mean it doesn't store just the differences, but the entire file.
It can perform a binary diff and show you the results, but that isn't very
helpful from a storage point of view.

- Neil.
From: Stefan B. <ste...@te...> - 2003-05-17 21:10:09
> I just realised I wasn't very clear here. When I say it doesn't do binary
> diffs, I mean it doesn't store just the differences, but the entire file.
> It can perform a binary diff and show you the results, but that isn't
> very helpful from a storage point of view.

While P4 doesn't do binary diffs, they do support compression (gzip), which
works reasonably well on most data (exception: huge WAV files). It would be
nice if they supported some sort of xdelta-like diff storage instead, but
it seems like it's not a massive issue nowadays when huge drives are cheap.

Cheers,
Stef! :)
--
Stefan Boberg, R&D Manager - Team17 Software Ltd.
bo...@te...
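As a concrete picture of why gzip helps with most assets but does little
for WAV audio: per-revision compression is a few lines on the server side,
and already-dense data simply doesn't shrink. A rough sketch using Python's
standard gzip module; the paths and the archiving role are assumptions, not
how Perforce actually lays out its depot:

    import gzip, os, shutil

    def store_compressed(src_path, depot_path):
        """Archive one revision as a gzip file, roughly one archive file per check-in."""
        with open(src_path, "rb") as src, gzip.open(depot_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        raw, packed = os.path.getsize(src_path), os.path.getsize(depot_path)
        # Text and most structured data shrink a lot; PCM audio in a WAV barely budges.
        print(f"{src_path}: {raw} -> {packed} bytes ({packed / raw:.0%})")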
From: Neil S. <ne...@r0...> - 2003-05-17 22:10:12
> While P4 doesn't do binary diffs, they do support compression (gzip),
> which works reasonably well on most data (exception: huge WAV files).

Good point. I was forgetting about that. Although, it's not much
consolation when you make a 20 byte change to a 20 meg file and it has to
use another 10 megs of disk space. One solution is to try and avoid having
monolithic file formats, so no single file is going to waste large chunks
of space on every check-in, but you can't do much about formats that you
don't create yourself.

> It would be nice if they supported some sort of xdelta-like diff storage
> instead, but it seems like it's not a massive issue nowadays when huge
> drives are cheap.

I asked Perforce about this a while ago (when we first started using it, in
fact) and IIRC they said that although storing binary deltas would save
disk space, it would hurt performance quite badly, as they would have to
construct enormous files from lots of tiny changes. They went for the
performance option, which is fair enough.

One idea I had was to use deltas but, in a manner similar to video formats
(like MPEG), store complete images every now and then, reducing the maximum
reconstruction to some - possibly user-specified - amount. I don't know if
we'll ever see this though, certainly not if storage costs keep going the
way they are.

- Neil.
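The MPEG-style scheme is easy to sketch: store a full copy (a "keyframe")
every N revisions and deltas in between, so rebuilding any revision never
applies more than N-1 patches. A toy version in Python - difflib stands in
for a real binary delta tool like xdelta, and the history layout is
invented for illustration:

    import difflib

    KEYFRAME_INTERVAL = 16  # full copy every 16 revisions: disk space vs. rebuild speed

    def make_delta(base: bytes, target: bytes) -> list:
        """Crude byte-level delta; a real system would use xdelta or similar."""
        ops = []
        for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(
                None, base, target).get_opcodes():
            ops.append(("copy", i1, i2) if tag == "equal" else ("data", target[j1:j2]))
        return ops

    def apply_delta(base: bytes, ops: list) -> bytes:
        out = bytearray()
        for op in ops:
            out += base[op[1]:op[2]] if op[0] == "copy" else op[1]
        return bytes(out)

    def store_revision(history, data: bytes):
        """Append a revision: a full snapshot on keyframe boundaries, a delta otherwise."""
        if len(history) % KEYFRAME_INTERVAL == 0:
            history.append(("full", data))
        else:
            previous = reconstruct(history, len(history) - 1)
            history.append(("delta", make_delta(previous, data)))

    def reconstruct(history, rev: int) -> bytes:
        """Walk back to the nearest keyframe, then apply at most
        KEYFRAME_INTERVAL - 1 deltas forward - the bounded cost suggested above."""
        start = rev - rev % KEYFRAME_INTERVAL
        _, data = history[start]  # always a ("full", ...) entry
        for i in range(start + 1, rev + 1):
            data = apply_delta(data, history[i][1])
        return data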
From: Anders N. <br...@ho...> - 2003-05-17 22:21:12
Why aren't they storing the most recent file and then doing the binary
diffs backwards? That is, store the diff of how to get from the newest file
to the next newest one. That way the newest file is fast to use, you can
save full files every time you branch, etc. It might be slower to get old
versions, but those are mostly for backup/safety anyway.

Anders Nilsson

>> It would be nice if they supported some sort of xdelta-like diff storage
>> instead, but it seems like it's not a massive issue nowadays when huge
>> drives are cheap.

Neil:
> I asked Perforce about this a while ago (when we first started using it,
> in fact) and IIRC they said that although storing binary deltas would
> save disk space, it would hurt performance quite badly, as they would
> have to construct enormous files from lots of tiny changes. They went for
> the performance option, which is fair enough.
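Sketching what Anders describes (it's essentially what RCS does for text):
the depot keeps the newest revision whole, and each check-in demotes the
previous head to a backward delta. This reuses the hypothetical make_delta
and apply_delta helpers from the earlier keyframe sketch:

    def check_in(history, new_data: bytes):
        """Keep only the newest revision whole; the old head becomes a backward delta."""
        if history:
            _, old_head = history[-1]
            # The delta rebuilds the old head FROM the new head, i.e. it points backwards.
            history[-1] = ("delta", make_delta(new_data, old_head))
        history.append(("full", new_data))

    def get_revision(history, rev: int) -> bytes:
        """The latest revision costs nothing; older ones walk the chain backwards."""
        _, data = history[-1]
        for _, delta in reversed(history[rev:-1]):
            data = apply_delta(data, delta)
        return data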
From: Neil S. <ne...@r0...> - 2003-05-17 23:10:15
> Why aren't they storing the most recent file and then doing the binary
> diffs backwards? That is, store the diff of how to get from the newest
> file to the next newest one. That way the newest file is fast to use, you
> can save full files every time you branch, etc. It might be slower to get
> old versions, but those are mostly for backup/safety anyway.

Well, they aren't storing any binary diffs at the moment, never mind
backwards ones. ;)

I get your point, but I don't think they were willing to compromise
performance on _any_ revision of a file, including the most common usage of
getting the latest version. One of the things they seem to pride themselves
on is Perforce's ability to reliably roll back to any changelist or label
very quickly, specifically as a bug-finding tool, i.e. you can roll back to
a version where a nasty bug does not exist and then roll forward to see
what broke. This would not be possible if they only optimised for the
latest version of a file, so they chose not to store binary diffs at all.

What I was suggesting was a halfway house, where you could trade off
overall performance (on all files) against disk usage.

- Neil.
From: J C L. <cl...@ka...> - 2003-05-20 07:12:25
On Sun, 18 May 2003 00:26:40 +0200 Anders Nilsson <br...@ho...> wrote:

> Why aren't they storing the most recent file and then doing the binary
> diffs backwards? That is, store the diff of how to get from the newest
> file to the next newest one. That way the newest file is fast to use, you
> can save full files every time you branch, etc. It might be slower to get
> old versions, but those are mostly for backup/safety anyway.

The reverse diffs are one of the many reasons I generally won't use CVS or
any of the other RCS-based repositories. Very large chunks of the
historical versions of the IRIX and HP-UX source trees are no longer
accessible in any form due to undetected corruption of the ,v files in the
backups. In one way or another the ,v file was corrupted, the corrupted
file was backed up, time passed, and no uncorrupted versions exist.

Forward-diff-based systems, like BitKeeper and SCCS, are necessarily more
expensive, but have the advantage of checking the logical consistency of
the files on every operation. Ergo, corruption and other problems are
revealed essentially instantly. In practice the added expense, when
compared to the overhead of open(2)/close(2), is generally minimal, even
for files with >2^16 changesets (which I have).

--
J C Lawrence ---------(*) Satan, oscillate my metallic sonatas.
cl...@ka... He lived as a devil, eh?
http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
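The early-detection property comes down to verifying every read rather than
trusting the archive. A minimal sketch of that idea - the in-memory store
standing in for the archive files is an assumption for illustration, not
how SCCS or BitKeeper actually lay out their data:

    import hashlib

    def archive(store, rev_id, data: bytes):
        """Record a checksum alongside every archived revision."""
        store[rev_id] = {"sha1": hashlib.sha1(data).hexdigest(), "data": data}

    def retrieve(store, rev_id) -> bytes:
        """Fail loudly the moment stored bytes stop matching their checksum,
        so corruption surfaces on the next read, not years later in the backups."""
        entry = store[rev_id]
        if hashlib.sha1(entry["data"]).hexdigest() != entry["sha1"]:
            raise IOError(f"revision {rev_id} is corrupt; restore a clean copy now")
        return entry["data"]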
From: J C L. <cl...@ka...> - 2003-05-20 07:06:53
On Sat, 17 May 2003 23:10:05 +0100 Neil Stewart <ne...@r0...> wrote:

> I asked Perforce about this a while ago (when we first started using it,
> in fact) and IIRC they said that although storing binary deltas would
> save disk space, it would hurt performance quite badly, as they would
> have to construct enormous files from lots of tiny changes. They went for
> the performance option, which is fair enough.

BitKeeper (of which I'm inordinately fond) takes the default approach of
checking in UUencoded versions of binary files. As a one-size-fits-all
approach it works reasonably well. You can of course also add your own
diff/delta/etc tools with appropriate configs, and it will willingly use
those methods instead for the file types you specify.

--
J C Lawrence ---------(*) Satan, oscillate my metallic sonatas.
cl...@ka... He lived as a devil, eh?
http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live.
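The point of UUencoding here is that it turns arbitrary bytes into short
text lines, which a line-oriented diff engine can then version like any
other source file. Python's standard binascii module can show the
transformation; this illustrates the encoding only, not BitKeeper's actual
storage format:

    import binascii

    def uuencode_lines(data: bytes):
        """Split binary data into 45-byte chunks, each uuencoded as one text line."""
        for i in range(0, len(data), 45):
            yield binascii.b2a_uu(data[i:i + 45]).decode("ascii").rstrip("\n")

    # A small binary change now touches only one or two of these text lines,
    # so a line-based delta stays small.
    for line in uuencode_lines(b"\x00\x01\x02" * 40):
        print(line)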
From: Colin F. <cp...@ea...> - 2003-05-17 16:49:15
>>> On top of this, Perforce is extremely reliable,
                                ===================
>>> almost supernaturally so, [...]
    ========================

THAT is one heck of a compliment! Maybe their marketing department will
check with their lawyers and eventually get the go-ahead to put a
bullet-point on the box:

* Supernaturally Reliable

But this will raise questions, like WHICH deity is behind the reliability,
and whether or not the supernatural force will eventually demand souls
(better read that EULA!). Also, a new kind of potential incompatibility is
introduced: your new "Aligned Good" application might not work on your
"Chaotic Evil" operating system.

Tech support becomes a prayer line, computers become shrines, user manuals
become scriptures, uninstall becomes an XOR-cism, programmers become
clerics, users become disciples, and the Internet becomes the Astral plane!

Maybe they'll have a new anime series like Yu Yu Hakusho or Inu-Yasha, or
new Dungeons & Dragons expansion packs, or a new set of Magic: The
Gathering cards, featuring the battle for cyberspace.

Okay, I got in trouble the last time I extrapolated an analogy, so I'll
just stop now.

--- Colin
From: Neil S. <ne...@r0...> - 2003-05-17 17:06:19
> THAT is one heck of a compliment! Maybe their marketing department will
> check with their lawyers and eventually get the go-ahead to put a
> bullet-point on the box:
>
> * Supernaturally Reliable

It's kind of hard to disprove, so they might just get away with it.

> But this will raise questions, like WHICH deity is behind the
> reliability, and whether or not the supernatural force will eventually
> demand souls (better read that EULA!).
>
> <snip>
>
> Maybe they'll have a new anime series like Yu Yu Hakusho or Inu-Yasha, or
> new Dungeons & Dragons expansion packs, or a new set of Magic: The
> Gathering cards, featuring the battle for cyberspace.

Yikes, I'll have some of what you've been smoking.

- Neil.
From: Enno R. <en...@de...> - 2003-06-06 17:04:54
After all the input I got from this thread a few weeks back, I decided to
give Sharepoint another shot for document management.

Sharepoint Portal Server is hideously expensive, though - I need to pay for
the SPS server, the Win2K server, plus CALs for all our clients - in total,
I end up somewhere around $12K real quick for a small team of 25 people,
unless I've totally misunderstood MS licensing (which wouldn't surprise me
either).

Then I found Sharepoint Team Services, and it looked good. This is a
toned-down version of Sharepoint that comes with Frontpage 2002, which
costs $169. It installs on IIS, so yes, I still need one Windows server
license. It also needs SQL Server 2000, because without that, I don't get
any search functionality. That's another 5,000 USD. And I still need to buy
a Client Access License (not for SPS, but for the Windows server) for every
user in the company, because although this is all done via web pages, it
requires logon using Windows authentication - and once you do that, every
user needs one CAL. For 25 clients, that's 1,800 USD for the server + CALs.

I end up paying around 7,000 USD for something that is a better web portal,
with most of that cost going into the search button. That's really, really
frustrating. I suppose for a company that's already using Windows servers
and MS SQL servers, Sharepoint Team Services is really well-priced (at
$169), but having to buy this long trail of other products is just way too
much. Damnit!

Enno.
From: Garett B. <gt...@st...> - 2003-06-06 19:38:35
> Then I found Sharepoint...

Subversion and a Linux server sounds like a cheap alternative. As soon as I
get my Linux box up, I'll try it out. I would install it on a Windows XP
box, but I can't afford to install XP Service Pack 1. Perhaps I too will
someday be able to afford Windows servers :P

$0.0002,
Garett Bass
gt...@st...
From: Enno R. <en...@de...> - 2003-06-06 20:05:31
Garett Bass wrote:

>> Then I found Sharepoint...
>
> Subversion and a Linux server sounds like a cheap alternative. As soon as
> I get my Linux box up, I'll try it out. I would install it on a Windows
> XP box, but I can't afford to install XP Service Pack 1. Perhaps I too
> will someday be able to afford Windows servers :P

Subversion is okay for managing code and assets. CVS is still better for
us, because we need cvs annotate (aka cvs blame), and Subversion does not
have that yet. But it looks like whenever Subversion is good enough,
migration from CVS will not be difficult.

Sharepoint, though, is supposed to solve another problem. CVS does a good
job of keeping versions, but it's not a good tool for document management.
In Sharepoint, I can organize documents much better, I get a nice portal
with search functionality that even our marketing folks can use, and I get
features like alerts - notifications when somebody changes a certain
document that I've shown an interest in. Plus, I can file documents with
metadata, like keywords, production status, etc., that I can search or
organize them by.

CVS or Subversion could be a nice part of the backend for this. If I were
to build my own, it would come out as a mix of a MySQL database to store
the metadata, CVS to keep the versions, and some portal features like those
in phpBB.

Enno.
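That mix is easy to picture: the version control system holds the file
contents, while a small database carries the searchable metadata that CVS
lacks. A hypothetical sketch with SQLite standing in for MySQL - the table
layout, column names, and paths are all invented for illustration:

    import sqlite3

    conn = sqlite3.connect("docs.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            cvs_path TEXT PRIMARY KEY,  -- where CVS keeps the actual versions
            title    TEXT,
            keywords TEXT,              -- searchable, unlike anything CVS stores
            status   TEXT,              -- e.g. 'draft', 'in review', 'final'
            owner    TEXT
        )
    """)
    conn.execute(
        "INSERT OR REPLACE INTO documents VALUES (?, ?, ?, ?, ?)",
        ("docs/design/combat.doc", "Combat design", "combat,balance", "draft", "enno"),
    )
    conn.commit()

    # The 'search button' a portal front end would hook up to:
    for row in conn.execute(
            "SELECT cvs_path, title FROM documents WHERE keywords LIKE ?",
            ("%combat%",)):
        print(row)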