From: Michael Nahas <mike.nahas@gm...> - 2010-06-28 16:50:16
Hi all,
I've just read on SlashDot that the Bilski case has been decided. The news
summaries seem to say there is no change. The Supreme Court decided based
on previous precedents ("no abstract ideas", which covers math) and there is
_not_ a new rule to decide what is and what is not patentable. I'll be
reading the decision and more news to see what the results are and whether
or not Tornado Codes is an "abstract idea".
Software patents are still valid in Japan. Yutaka Sawada has found a patent
by Luby from 1999/2000, but that is later than the ~1995 time period that
would cover Tornado Codes. So, if the US patent is invalidated, I'd hope
to use Tornado Codes in PAR3.
LDPC codes did exist before Tornado Codes, so we have a number of paths to
look at.
I also think we should look at >=16bit Reed-Solomon Codes, since I think we
want to keep that feature and it is currently broken. I've heard a lot of
descriptions of approaches. Can we build a list of them with (1) number of
usable blocks [currently 2^15], (2) storage efficiency [less than 100% if we
used 2^16+1 as the field], (3) algorithm speed, and (4) links to documents
explaining how to implement them?
Michael Nahas
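Nahas's four axes can at least be roughed out numerically. A minimal sketch, assuming textbook Reed-Solomon limits (code length at most q-1 over GF(q)); the three candidate fields are common suggestions for illustration, not decisions from this thread:

```python
# Hedged sketch tabulating two of the axes Nahas asks for, for three
# candidate fields. Figures follow from the standard Reed-Solomon limit
# (code length at most q-1 over GF(q)); field choices are illustrative.
def field_stats(q, stored_bits):
    return {
        "usable_blocks": q - 1,  # max RS code length over GF(q)
        # fraction of the field's values a stored_bits-wide symbol can hold;
        # below 1.0 means an escape/overhead scheme is needed (e.g. GF(2^16+1))
        "storage_efficiency": min(1.0, (2 ** stored_bits) / q),
    }

gf65536 = field_stats(2 ** 16, 16)      # GF(2^16), PAR2's familiar setting
gf65537 = field_stats(2 ** 16 + 1, 16)  # Fermat-prime field, FFT-friendly
gf2_32 = field_stats(2 ** 32, 32)       # a 32-bit field for many more blocks
```

Note PAR2 uses only 2^15 of GF(2^16)'s possible blocks; the table above shows field-theoretic maxima, so the remaining axes (algorithm speed, implementation references) still need the list Nahas asks for.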
From: Michael Niedermayer <michaelni@gm...> - 2010-06-27 23:41:14
Hi,

My comments are inline below. Sections my reply does not refer to are
replaced by [...] for clarity's sake.

On Fri, Jun 25, 2010 at 02:37:49PM -0400, Michael Nahas wrote:
[...]
> 6) Incomplete recovery
> This is difficult to do. Clients can support it, but I don't think the spec
> should say that all clients need to support it.

I agree, but we must at least make sure that this works and we don't end up
with a mess similar to the matrices occasionally being singular in PAR2.
For example, all the ways I remember seeing discussed here for storing
GF(2^32+-c) codes conflict with error correction; to put it differently,
they significantly reduce the ability to correct errors at unknown
locations.

> 7) Duplicate input slices
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1096
> The use case here was that someone was taking a large video file and
> splitting it up into pieces and then editing it. So, the input set had a
> lot of data that overlapped. I have done multiple subdirectories where each
> subdirectory held different versions of a program, which only differed in
> 20% of the files.
> I guess the user wanted the PAR client to identify duplicate slices in the
> input. I don't know if this included offsets that were not a multiple of
> the slice size. (That is, the file changes did not align with the length of
> a slice.) Duplicate slices would not be duplicated in the input slice set
> and either one of the slices as an input could be used to repair the other
> before going to RS recovery (or another algorithm).
> This is a neat idea, but I don't think we should handle it - unless it
> interferes with the recovery algorithm. And I don't think it interferes
> with RS or significantly with LDPC. Detecting duplicates is difficult
> unless they are slice-aligned and even then it doesn't gain us a smaller
> recovery file or a much shorter recovery time. It just means there are more
> cases that we could recover from - and some of these may be done by a
> recovering client without changing the spec.

Storing a checksum per file during PAR3 creation would allow a client to
know about duplicate files during error correction. A client could choose
to ignore this information. Also, I would suggest that any developer who
has tons of slightly different versions of source trees lying around look
at git.

[...]
> 12) >32k slices error.
> Yes, it's a client issue. Do we need to do more to make sure clients are
> compatible? Also, on an error, a client should be required to show the
> Creator Packet so that users can know the client that did this.

A few reference PAR3 files for conformance testing would likely be useful.
Were such files available, one could test one's client against these
intentionally damaged files to check that it works. These files could test
all the corner cases that are tricky and could easily be implemented
wrongly.

[...]
--
Michael

GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not
or of what sort they may be, because of the obscurity of the subject, and
the brevity of human life -- Protagoras
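The per-file (or per-slice) checksum idea above is easy to sketch. This is a minimal illustration, assuming MD5 and these function names for demonstration only; nothing here is from any PAR specification:

```python
# Hedged sketch of the per-file checksum idea: record a hash per file (and
# per slice) at creation time, so a repairing client can recognize
# duplicates and repair them by copying instead of decoding. The hash
# choice (MD5) and structure are illustrative, not from the PAR3 spec.
import hashlib
from collections import defaultdict

def index_duplicates(files, slice_size=4):
    """files: dict name -> bytes. Returns (duplicate file groups,
    duplicate slice groups), where a group lists identical content."""
    by_file, by_slice = defaultdict(list), defaultdict(list)
    for name, data in files.items():
        by_file[hashlib.md5(data).hexdigest()].append(name)
        for i in range(0, len(data), slice_size):
            digest = hashlib.md5(data[i:i + slice_size]).hexdigest()
            by_slice[digest].append((name, i // slice_size))
    dup_files = [g for g in by_file.values() if len(g) > 1]
    dup_slices = [g for g in by_slice.values() if len(g) > 1]
    return dup_files, dup_slices

# Example: two renamed copies, plus an edited version sharing one slice.
files = {"a.bin": b"ABCDEFGH", "b.bin": b"ABCDEFGH", "c.bin": b"ABCDWXYZ"}
dup_files, dup_slices = index_duplicates(files)
```

As the reply notes, a client is free to ignore this index; it only widens the set of recoverable cases.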
From: Michael Niedermayer <michaelni@gm...> - 2010-06-27 23:19:00
On Sun, Jun 27, 2010 at 10:07:22AM +0900, ten_fon@... wrote:
> Hello, parchive developers.
> I am Yutaka Sawada.
> Michael Niedermayer's proposal to use concatenation is interesting and
> will be useful. But simple concatenation is hard to implement without
> using a large temporary file, as Michael Nahas said.

I don't know if it is hard, but it is not impossible.

> When a user protects 5GB of data with a 500MB recovery file for DVD, he
> needs an additional 5GB of free space to keep the joined file
> temporarily. Because my HDD is slow and small, I don't want to use a
> large temporary file for my usage. How do users feel about additional
> free disk space? Or do DVD users normally have a large enough HDD?
>
> from Niedermayer's previous mail;
> > and simply concatenation:
> > [..........
> > ..file1....
> > ...........
> > .....][....
> > ...........
> > ...file2...
> > ..........]
> > [..........
> > ...file3...
> > ...][......
> > file4...][. <-- partial damaged slice #1
> > file5.][f6] <-- partial damaged slice #2
> >
> > If you now for example loose file5
> > you need 1 slice to recover it not 2
> > In each column you have only 1 lost symbol for file5,
> > thus you never need more than 1 slice to recover it
>
> The idea is a partially damaged slice. In PAR2, there are two types of
> input slice: complete slices and damaged slices. I note "partial damaged
> slice #1 and #2" in your sample. However, those two partially damaged
> slices are not damaged at the same offset, so I need 2 recovery slices
> anyway.
>
> Caution: (unknown position) error correction requires double the
> recovery symbols of (known position) erasure correction.

Assuming you know that file5 is damaged and know that the other files are
not damaged, then you only need 1 recovery slice, because the error
locations are known.

> There is no way to check which byte is complete or damaged in multiple
> partially damaged slices. Even if a damaged part is on one side of two
> partially damaged slices, it is treated as an unknown-position error, so
> it requires two recovery slices. The problem is the information about
> which part is damaged or complete. A PAR client keeps it for each slice.

The user might simply have all files except file5; then the client could
guess what is available and what is not. Also, checksums could be stored
per file, or at finer granularity than slices.

> When there are 3 or more partially damaged slices, error correction may
> help. For example, 2 recovery slices can recover 3 partially damaged
> slices, if only one part is damaged among the 3 slices at each offset.
> The offset of the damaged byte is the key. Error correction is useful
> only when the offsets differ from each other. It is possible, but will
> it be a rare case? In this case (a split slice in simple concatenation),
> it cannot help.
>
> from Re: [parchive-devel] QuickPar use cases;
> > use libnoe, it will correct your damaged bytes
> > (if theres sufficient redundancy of course only)
>
> I omitted the error correction capability from my sample implementation
> of PAR3. I thought its usefulness might be rare: it only helps when
> there are many partially damaged slices and the damaged parts are at
> different offsets. However, I can modify my sample to use that feature
> of libnoe; is this worth doing?

I think it is.

[...]
--
Michael

GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I do not agree with what you have to say, but I'll defend to the death
your right to say it. -- Voltaire
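The "double recovery symbols" caution above is the standard Reed-Solomon decoding budget, and the disagreement resolves once it is written down. A minimal sketch; the rule 2e + s <= r is textbook coding theory, not quoted from any PAR spec:

```python
# Hedged sketch of the symbol budget behind Sawada's caution: with r
# recovery slices, a Reed-Solomon code can fix a pattern of e
# unknown-position errors plus s known-position erasures exactly when
# 2*e + s <= r (the classic decoding bound).
def recoverable(r, unknown_pos_errors, known_pos_erasures):
    """True if r recovery slices suffice for this damage pattern."""
    return 2 * unknown_pos_errors + known_pos_erasures <= r
```

So one damaged slice at a known offset costs one recovery slice, while the same damage at an unknown offset costs two; knowing that only file5 is missing converts errors into erasures and halves the cost, which is Niedermayer's point.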
From: <ten_fon@ma...> - 2010-06-27 01:07:32
Hello, parchive developers. I am Yutaka Sawada.

Michael Niedermayer's proposal to use concatenation is interesting and
will be useful. But simple concatenation is hard to implement without
using a large temporary file, as Michael Nahas said. When a user protects
5GB of data with a 500MB recovery file for DVD, he needs an additional
5GB of free space to keep the joined file temporarily. Because my HDD is
slow and small, I don't want to use a large temporary file for my usage.
How do users feel about additional free disk space? Or do DVD users
normally have a large enough HDD?

from Niedermayer's previous mail;
> and simply concatenation:
> [..........
> ..file1....
> ...........
> .....][....
> ...........
> ...file2...
> ..........]
> [..........
> ...file3...
> ...][......
> file4...][. <-- partial damaged slice #1
> file5.][f6] <-- partial damaged slice #2
>
> If you now for example loose file5
> you need 1 slice to recover it not 2
> In each column you have only 1 lost symbol for file5,
> thus you never need more than 1 slice to recover it

The idea is a partially damaged slice. In PAR2, there are two types of
input slice: complete slices and damaged slices. I note "partial damaged
slice #1 and #2" in your sample. However, those two partially damaged
slices are not damaged at the same offset, so I need 2 recovery slices
anyway.

Caution: (unknown position) error correction requires double the recovery
symbols of (known position) erasure correction.

There is no way to check which byte is complete or damaged in multiple
partially damaged slices. Even if a damaged part is on one side of two
partially damaged slices, it is treated as an unknown-position error, so
it requires two recovery slices. The problem is the information about
which part is damaged or complete. A PAR client keeps it for each slice.

When there are 3 or more partially damaged slices, error correction may
help. For example, 2 recovery slices can recover 3 partially damaged
slices, if only one part is damaged among the 3 slices at each offset.
The offset of the damaged byte is the key. Error correction is useful
only when the offsets differ from each other. It is possible, but will it
be a rare case? In this case (a split slice in simple concatenation), it
cannot help.

from Re: [parchive-devel] QuickPar use cases;
> use libnoe, it will correct your damaged bytes
> (if theres sufficient redundancy of course only)

I omitted the error correction capability from my sample implementation
of PAR3. I thought its usefulness might be rare: it only helps when there
are many partially damaged slices and the damaged parts are at different
offsets. However, I can modify my sample to use that feature of libnoe;
is this worth doing?

Because James Plank's RS implementation is erasure correction, partially
damaged slices are usable with Niedermayer's libnoe-style RS_FFT only.
Will unknown-position error correction become an interesting feature of
PAR3, or a strange gadget? There is no problem in implementation:
partially damaged slices can be collected during file verification, and
libnoe has that feature already. But recovery speed will become a little
slower.

Best regards,
Yutaka Sawada
From: Michael Nahas <mike.nahas@gm...> - 2010-06-25 18:37:57
Thanks, Yutaka Sawada.

1) Unicode
I think we've agreed on using UTF-8, which is an extension of ASCII.

2) Folder recursion
This is already supported by the spec. I think we've agreed to support
creating empty directories.

3) More than 32k blocks
We've seen that it usually isn't needed, and it is dependent on the
recovery algorithm. We know we're going to support a new algorithm. If
the algorithm is LDPC, we will probably support more than 32k blocks.

4) Automatic removal of recovery files
This is a client feature. We don't need to change the spec to do this. I
will say that deleting data is always dangerous. I feel it should only be
done with the user's express permission.

5) Automatic calling of UnRAR, UnPkZip, etc.
A client can do this. We don't need to change the spec.

6) Incomplete recovery
This is difficult to do. Clients can support it, but I don't think the
spec should say that all clients need to support it.

7) Duplicate input slices
http://www.quickpar.org.uk/forum/viewtopic.php?id=1096
The use case here was that someone was taking a large video file and
splitting it up into pieces and then editing it. So, the input set had a
lot of data that overlapped. I have done multiple subdirectories where
each subdirectory held different versions of a program, which only
differed in 20% of the files.
I guess the user wanted the PAR client to identify duplicate slices in
the input. I don't know if this included offsets that were not a multiple
of the slice size. (That is, the file changes did not align with the
length of a slice.) Duplicate slices would not be duplicated in the input
slice set, and either one of the slices as an input could be used to
repair the other before going to RS recovery (or another algorithm).
This is a neat idea, but I don't think we should handle it - unless it
interferes with the recovery algorithm. And I don't think it interferes
with RS or significantly with LDPC. Detecting duplicates is difficult
unless they are slice-aligned, and even then it doesn't gain us a smaller
recovery file or a much shorter recovery time. It just means there are
more cases that we could recover from - and some of these may be done by
a recovering client without changing the spec.

8) Subslice checksums
Supported.

9) Redundancy beyond 100%
Supported. But we can only support ~32k slices, so if you want lots of
recovery slices, you need to use fewer input slices. Also, using all ~32k
slices is costly.

10) Better support for small files
Use TAR or PkZip to make them one large file, then use PAR.

11) Not using a whole block to fix one byte
Subslices can do this.

12) >32k slices error
Yes, it's a client issue. Do we need to do more to make sure clients are
compatible? Also, on an error, a client should be required to show the
Creator Packet so that users can know the client that did this.

13) Matrix inversion problems
Yes, it's a known PAR2 problem. It was known in PAR1 and I tried to fix
it. I think it occurs less often now, but it still occurs and it's my
fault. I think that's one of the big motivations for PAR3 - to fix this
problem.

14) Non-ASCII characters (8th bit set)
This was another of my failures in the PAR2 spec. I thought 8-bit ASCII
was a standard. If I had known there were multiple extensions to ASCII
with the 8th bit set, I would have either chosen one or clearly stated
that ASCII was 7-bit and that having the 8th bit set would violate the
spec. I think we fix it by making everything UTF-8.

Any comments, critique, or criticism?

Michael Nahas

On Sat, Jun 19, 2010 at 9:50 PM, <ten_fon@...> wrote:
> Hello, parchive developers.
> I am Yutaka Sawada.
>
> from Michael Nahas 2010-06-16
> > What are the use cases and problems you see in the Par2 client forums?
>
> The use cases are mainly the following two:
> (C1) transport files on UseNet
> (C2) protect files on backup media like CD/DVD
>
> I read the QuickPar homepage's forum. There are many interesting
> ideas/thoughts/proposals/claims.
>
> [ subject in forum ]
> I write in short Q&A style. Refer to the original post if you are
> interested.
>
> [ Wishlist >> High need of Folder Recursion and Unicode file name support ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1227
> Q) I want Unicode filename and directory support.
> A) Use MultiPar or ICE ECC.
>
> [ Wishlist >> QP without limitation of 32765 input blocks ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1105
> Q) More than 32k slices may be useful, even if QuickPar can use them
> only for verify/repair.
> A) Wait for PAR3.
>
> [ Wishlist >> Feature Request: Unpack & Cleanup ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1177
> Q) A PAR client had better have the features of an archiver.
> A) Just use a favorite pairing of PAR client and archiver.
>
> [ Wishlist >> Incomplete recovery ( as good as possible) ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1214
> [ Wishlist >> A suggestion for a useful new function ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1136
> Q) Recover as much as possible, even when there are not enough recovery
> slices.
> A) This is impossible for Reed-Solomon codes.
>
> [ Wishlist >> detecting and handling (partially) dupes ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1096
> Q) A PAR client should check for duplicate slices before creating
> recovery slices: create recovery slices from unique slices only;
> duplicate slices can be recovered by copying.
> A) This is an interesting idea, but maybe useless for most files.
>
> [ Wishlist >> Idea for improve the performance ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=838
> Q) Using parallel split slices may help.
> A) This idea is the same as the subslices of the "Packed Main packet".
>
> [ Wishlist >> Feature request: Redundancy beyond 100% ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=805
> Q) I want to protect my DVD with huge redundancy.
> A) Just make a backup DVD; it's much faster.
>
> [ Technical support >> Quickpar efficient for small files? ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1190
> Q) Is a PAR2 file inefficient for small files?
> A) It depends on the slice size. Less packet repetition may help a bit.
>
> [ Technical support >> Data fault and block consumption ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1098
> Q) Does even a small byte error consume a whole recovery slice?
> A) Yes. Even a small error makes the whole slice useless. A smaller
> slice size is better for protection, but speed drops as the slice count
> grows.
>
> [ Technical support >> Error when repairing: "too many input blocks for
> Read Soloman matrix" ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1090
> Q) QuickPar cannot repair when there are more than 32768 input slices.
> A) PAR2 supports a maximum of 32768 input file slices for recovery.
> This is known as the YencPowerPost problem.
>
> [ Technical support >> Matrix inversion error ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1080
> [ Technical support >> Reed Solomon Error ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=211
> Q) QuickPar fails to recover even when there are enough recovery slices.
> A) PAR2 and PAR1 have a flaw: they cannot invert the matrix for certain
> combinations.
>
> [ Technical support >> Failed: Error creating description packet ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=161
> Q) I cannot create a PAR file for certain files.
> A) The filename has a Unicode character which cannot be displayed.
>
> [ UseNet >> European language accented file names make recovery impossible ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1128
> Q) QuickPar fails to handle non-ASCII filenames.
> A) It is a fault of QuickPar. The PAR2 spec is not to blame.
>
> Best regards,
> Yutaka Sawada
From: Michael Niedermayer <michaelni@gm...> - 2010-06-21 00:46:24
On Sun, Jun 20, 2010 at 10:50:32AM +0900, ten_fon@... wrote:
> Hello, parchive developers.
> I am Yutaka Sawada.
>
> from Michael Nahas 2010-06-16
> > What are the use cases and problems you see in the Par2 client forums?
>
> The use cases are mainly the following two:
> (C1) transport files on UseNet
> (C2) protect files on backup media like CD/DVD
>
> I read the QuickPar homepage's forum. There are many interesting
> ideas/thoughts/proposals/claims.
>
> [ subject in forum ]
> I write in short Q&A style. Refer to the original post if you are
> interested.
>
[...]
> [ Wishlist >> Feature request: Redundancy beyond 100% ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=805
> Q) I want to protect my DVD with huge redundancy.
> A) Just make a backup DVD; it's much faster.

Not as reliable, though. If you make 2 backups and lose the same sector
on the original and both backups, then you are screwed ;)
But with a PAR-like system you must lose more than 66% of the slices if
you have 2 recovery slices per data slice. That's 2 DVDs full of sectors,
not just 3 sectors.

[...]
> [ Technical support >> Data fault and block consumption ]
> http://www.quickpar.org.uk/forum/viewtopic.php?id=1098
> Q) Does even a small byte error consume a whole recovery slice?
> A) Yes. Even a small error makes the whole slice useless. A smaller
> slice size is better for protection, but speed drops as the slice count
> grows.

Use libnoe; it will correct your damaged bytes (if there's sufficient
redundancy, of course).

[...]
--
Michael

GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

> ... defining _GNU_SOURCE...
For the love of all that is holy, and some that is not, don't do that.
-- Luca & Mans
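The reliability comparison above can be made concrete. A minimal sketch, assuming illustrative slice counts and the standard MDS property of Reed-Solomon (any pattern of losses up to the redundancy budget is survivable); nothing here is measured data:

```python
# Hedged sketch of Niedermayer's point: plain copies die when the SAME
# sector is lost on every copy, while an MDS (Reed-Solomon-like) code
# survives ANY loss pattern up to its redundancy budget. Counts are
# illustrative assumptions.
data_slices = 1000
recovery_slices = 2 * data_slices    # "2 recovery slices per data slice"
total = data_slices + recovery_slices

# Plain copies (original + 2 backup DVDs): three unluckily placed bad
# sectors, one per copy at the same position, can already destroy data.
min_losses_to_fail_copies = 3

# MDS code: data survives while at most recovery_slices are lost,
# regardless of where; failure needs strictly more than that.
min_losses_to_fail_rs = recovery_slices + 1
```

With these numbers the coded scheme fails only after losing more than two thirds of everything written, versus a worst case of three sectors for naive copies.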
From: Michael Niedermayer <michaelni@gm...> - 2010-06-21 00:35:41
On Sun, Jun 20, 2010 at 10:38:00AM +0900, ten_fon@... wrote:
[...]
> The files are like below; full-size slices and a remainder-size slice:
> file1: [fullsize1][fullsize2][fullsize3][rem1]
> file2: [fullsize4][fullsize5][fullsize6][re2]
> file3: [fullsize7][fullsize8][r3]
> file4: [fullsize9][re4]
> file5: [remain5]
> file6: [r6]
>
> PAR2 aligns the slices like this:
> [fullsize1][fullsize2][fullsize3][rem1_____][fullsize4]
> [fullsize5][fullsize6][re2______][fullsize7][fullsize8]
> [r3_______][fullsize9][re4______][remain5__][r6_______]
> total 15 blocks.
>
> ICE ECC may align the slices like this:
> [fullsize1][fullsize2][fullsize3][fullsize4][fullsize5]
> [fullsize6][fullsize7][fullsize8][fullsize9][rem1][re2]
> [r3][re4][remain5][r6]
> total 12 blocks (when the block size is the same as the slice size).
> Note that "file5: [remain5]" is split between 2 blocks. Even though
> file5 is smaller than the block size, it requires 2 blocks to recover.
>
> My idea is:
> [fullsize1][fullsize2][fullsize3][fullsize4][fullsize5]
> [fullsize6][fullsize7][fullsize8][fullsize9][rem1][re2]
> [r3][re4__][remain5__][r6_______]
> total 13 blocks (6 remainder slices are joined into 4 blocks).
> While the input data is aligned by slice size the same as in PAR2, a
> non-full-size slice is padded with other non-full-size slices. The
> efficiency will be better than PAR2, but worse than ICE ECC. The bad
> point: when the remainder size is large, it cannot be joined with other
> non-full-size slices. The good point: file access may be easier to
> implement, because slices are not split between blocks.
>
> My idea is easy to write into the specification. Nahas wrote that the
> last slice is padded with 0s up to the full slice size; that could be
> changed to: the last slice is padded with other last slices up to the
> full slice size. Arranging the input files by remainder size may be
> important. Does this work well or badly?
> Personally, as I want to add 32-bit RS, this method is not necessary.
> If PAR3 will use 16-bit RS only, as Niedermayer says, this joined-file
> style should be added. Maybe our genius Michaels have a good idea?

I do, but it's really not genius. Consider your example:

[fullsize1]
[fullsize2]
[fullsize3]
[fullsize4]
[fullsize5]
[fullsize6]
[fullsize7]
[fullsize8]
[fullsize9]
[rem1][re2]
[r3][re4__]
[remain5__]
[r6_______]

and simple concatenation:

[..........
..file1....
...........
.....][....
...........
...file2...
..........]
[..........
...file3...
...][......
file4...][.
file5.][f6]

If you now, for example, lose file5, you need 1 slice to recover it, not
2. In each column you have only 1 lost symbol for file5, thus you never
need more than 1 slice to recover it.

The example code in libnoe of course supports that; people, though,
prefer to reimplement with fewer features and more bugs. My question
still is: why isn't libnoe used as-is for PAR3? I can happily change its
license to LGPL if people want that, and with libnoe you get the ability
to correct damaged bytes too.

[...]
--
Michael

GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Avoid a single point of failure, be that a person or equipment.
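The column argument above can be checked mechanically. A minimal sketch, assuming per-column erasure decoding (r recovery slices fix up to r missing symbols in each column position); the file sizes are illustrative, loosely mirroring the diagram:

```python
# Hedged sketch of the concatenated layout: files laid end to end, slices
# are consecutive slice_size-byte rows, and recovery operates column-wise
# across slices. The cost of losing one whole file is then the maximum,
# over columns, of lost symbols in that column.
def recovery_slices_needed(file_sizes, slice_size, lost_file):
    """Recovery slices needed to restore one wholly lost file."""
    offset, lost_per_column = 0, [0] * slice_size
    for i, size in enumerate(file_sizes):
        if i == lost_file:
            for b in range(offset, offset + size):
                lost_per_column[b % slice_size] += 1
        offset += size
    return max(lost_per_column)

# Six files against an 11-byte slice; the 7-byte fifth file (index 4)
# straddles a slice boundary, yet still costs only 1 recovery slice.
sizes = [38, 31, 14, 12, 7, 3]
```

This reproduces the claim: a file no bigger than a slice never costs more than one recovery slice under concatenation, even when it straddles a slice boundary, while under Sawada's split-block example the same file would cost two.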
From: <ten_fon@ma...> - 2010-06-20 02:04:51
Hello, parchive developers. I am Yutaka Sawada.

from Michael Niedermayer 2010-06-17
> I think the current par3 design is not ideal.
> Currently 1 slice can not contain more than 1 file,
> this leads to a series of problems like

The idea is good. As proof, later file-repairing programs seem to join
files: ICE Graphics's ICE ECC and persicum's CRC32.EXE.

from ICE ECC's website;
> ICE ECC offers these new techniques for the protection of your files:
> 4. There is no limitation on the number or size of protected files or
> directories.

In a sample, ICE ECC set 1024 blocks for 6571 files. He considers this a
superiority over QuickPar.

from Michael Nahas 2010-06-17
> There is the general idea of packing multiple input files into an input
> slice.

I will explain how it works. There is no need to use TAR, ZIP, or any
other special archive format: just join all input files into one large
input stream internally. This can be done with well-arranged file access,
as ICE ECC does, or simply with a large temporary file, as CRC32.EXE
does. Because verification is done on the input file slices, the only
difference is the alignment of the input file data.

Now, I use the words as follows: a slice is a piece of an input file; a
block is a piece of the total input data. In the PAR2 spec, slice and
block are the same (a non-full-size slice is padded with 0s). In ICE ECC,
slice and block are independent: one block may contain multiple slices.

For example, there are 6 files of 100MB, 10 files of 3MB, and 10 files of
1MB. The total file size is 640MB. I want to save them on CD with 60MB of
recovery data.

If I use QuickPar, the comparison of slice size vs. count is:
Slice size : # of input slices : # of recovery slices
1MB : 640 input slices : 60 recovery slices
3MB : 224 input slices : 20 recovery slices
10MB : 80 input slices : 6 recovery slices

If I use ICE ECC, the comparison of block size vs. count is:
Block size : # of source blocks : # of recovery blocks
1MB : 640 source blocks : 60 recovery blocks
3MB : 214 source blocks : 20 recovery blocks
10MB : 64 source blocks : 6 recovery blocks
(In these, the slice size is always 1MB.)

As I set a larger block size, the number of source blocks becomes smaller
than the number of input slices. If I create recovery data at a fixed %
redundancy, ICE ECC can make a smaller recovery file for the same %
redundancy. Therefore, ICE ECC normally shows better efficiency than
QuickPar. This is more convenient than TAR & PAR, because users do not
need to TAR the files before verify/repair.

Now, I know what it is and how to do it, but I did not implement it for
my PAR3 proposal. The same thing has several aspects. While Michael
Niedermayer said joined file data would remove the need for 32-bit RS, I
thought 32-bit RS was enough. We see this from opposite sides, hehe.
From him: join input files = larger slice size can be efficient = no need
for the many slices of 32-bit RS.
From me: 32-bit RS enables many slices = slice size can be smaller = no
need to join input files for efficiency.

By the way, Michael Niedermayer missed a problem, and Michael Nahas did
not point it out. With ICE ECC style joined-file alignment, while one
block may contain multiple slices, one slice may consume two blocks. See
my example above: when the block size is 10MB, 3 blocks contain the 10
files of 3MB, so a 3MB file is split into 1MB and 2MB between 2 blocks.
If that file is lost, two recovery blocks are required. This problem was
raised on ICE ECC's web forum.

I thought about how to implement this without a temporary file. The
number of non-full-size slices is at most the number of files: only the
last slice of each file can be non-full-size, and its size is the
remainder (file size mod slice size). With many varied file sizes, the
average remainder is half the slice size. So I make special joined slices
for the non-full-size slices: a joined slice will contain on average 2
non-full-size slices.

For example, there are 6 files and 15 input file slices. The number of
non-full-size slices is 6 or less, and they are joined into on average 3
joined slices, so the total becomes about 12 slices. If 1 joined slice
contains all 6 remainder slices, the total becomes 10; if the remainder
sizes are too large to join, the total stays 15.

The files are like below; full-size slices and a remainder-size slice:
file1: [fullsize1][fullsize2][fullsize3][rem1]
file2: [fullsize4][fullsize5][fullsize6][re2]
file3: [fullsize7][fullsize8][r3]
file4: [fullsize9][re4]
file5: [remain5]
file6: [r6]

PAR2 aligns the slices like this:
[fullsize1][fullsize2][fullsize3][rem1_____][fullsize4]
[fullsize5][fullsize6][re2______][fullsize7][fullsize8]
[r3_______][fullsize9][re4______][remain5__][r6_______]
total 15 blocks.

ICE ECC may align the slices like this:
[fullsize1][fullsize2][fullsize3][fullsize4][fullsize5]
[fullsize6][fullsize7][fullsize8][fullsize9][rem1][re2]
[r3][re4][remain5][r6]
total 12 blocks (when the block size is the same as the slice size).
Note that "file5: [remain5]" is split between 2 blocks. Even though file5
is smaller than the block size, it requires 2 blocks to recover.

My idea is:
[fullsize1][fullsize2][fullsize3][fullsize4][fullsize5]
[fullsize6][fullsize7][fullsize8][fullsize9][rem1][re2]
[r3][re4__][remain5__][r6_______]
total 13 blocks (6 remainder slices are joined into 4 blocks).
While the input data is aligned by slice size the same as in PAR2, a
non-full-size slice is padded with other non-full-size slices. The
efficiency will be better than PAR2, but worse than ICE ECC. The bad
point: when the remainder size is large, it cannot be joined with other
non-full-size slices. The good point: file access may be easier to
implement, because slices are not split between blocks.

My idea is easy to write into the specification. Nahas wrote that the
last slice is padded with 0s up to the full slice size; that could be
changed to: the last slice is padded with other last slices up to the
full slice size. Arranging the input files by remainder size may be
important. Does this work well or badly?

Personally, as I want to add 32-bit RS, this method is not necessary. If
PAR3 will use 16-bit RS only, as Niedermayer says, this joined-file style
should be added. Maybe our genius Michaels have a good idea?

Best regards,
Yutaka Sawada
From: <ten_fon@ma...> - 2010-06-20 01:57:25
|
Hello, parchive developers. I am Yutaka Sawada.

from Michael Nahas 2010-06-16
> What are the use cases and problems you see in the Par2 client forums?

The use cases are mainly the following two:
(C1) transporting files on UseNet
(C2) protecting files on backup media like CD/DVD

I read the QuickPar homepage's forum. There are many interesting ideas, thoughts, proposals, and complaints.

[ subject in forum ]
I write in short Q&A style. Refer to the original post if you are interested.

[ Wishlist >> High need of Folder Recursion and Unicode file name support ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1227
Q) I want Unicode filename and directory support.
A) Use MultiPar or ICE ECC.

[ Wishlist >> QP without limitation of 32765 input blocks ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1105
Q) More than 32k slices may be useful, even if QuickPar can use them only for verify/repair.
A) Wait for PAR3.

[ Wishlist >> Feature Request: Unpack & Cleanup ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1177
Q) A PAR client would do better to have archiver features.
A) Just use your favorite combination of PAR client and archiver.

[ Wishlist >> Incomplete recovery (as good as possible) ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1214
[ Wishlist >> A suggestion for a useful new function ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1136
Q) Recover as much as possible, even when there are not enough recovery slices.
A) This is impossible for Reed-Solomon codes.

[ Wishlist >> detecting and handling (partially) dupes ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1096
Q) A PAR client should check for duplicate slices before creating recovery slices, create recovery slices from unique slices only, and recover duplicate slices by copying.
A) This is an interesting idea, but maybe useless for most files.

[ Wishlist >> Idea for improve the performance ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=838
Q) Using parallel split slices may help.
A) This idea is the same as the subslices of the "Packed Main packet".

[ Wishlist >> Feature request: Redundancy beyond 100% ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=805
Q) I want to protect my DVD with huge redundancy.
A) Just make a backup DVD; it is much faster.

[ Technical support >> Quickpar efficient for small files? ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1190
Q) Is the PAR2 format inefficient for small files?
A) It depends on slice size. Less packet repetition may help a bit.

[ Technical support >> Data fault and block consumption ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1098
Q) Does even a small byte error consume a whole recovery slice?
A) Yes. Even a small error makes the whole slice useless. While a smaller slice size gives better protection, speed becomes slow as slices become many.

[ Technical support >> Error when repairing: "too many input blocks for Read Soloman matrix" ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1090
Q) QuickPar cannot repair when there are more than 32768 input slices.
A) PAR2 supports a maximum of 32768 input file slices for recovery. This is known as the YencPowerPost problem.

[ Technical support >> Matrix inversion error ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1080
[ Technical support >> Reed Solomon Error ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=211
Q) QuickPar fails to recover even though there are enough recovery slices.
A) PAR2 & PAR1 have a flaw: they cannot invert the matrix for certain combinations.

[ Technical support >> Failed: Error creating description packet ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=161
Q) I cannot create a PAR file for certain files.
A) The filename has a Unicode character which cannot be displayed.

[ UseNet >> European language accented file names make recovery impossible ]
http://www.quickpar.org.uk/forum/viewtopic.php?id=1128
Q) QuickPar fails to handle non-ASCII filenames.
A) It is a fault of QuickPar. The PAR2 spec is not bad.

Best regards,
Yutaka Sawada |
|
From: <ten_fon@ma...> - 2010-06-18 07:17:09
|
Hello, parchive developers.
I am Yutaka Sawada.
from Michael Nahas 2010-06-15
> Also, the Par2 spec included optional packets for containing input file slices.
> Do we want to push to make those packets mandatory?
This is my question, too.
Do I put the packet in the PAR3 spec or not?
If you view the HTML source of my proposal file "par3_spec_prop.htm",
you will find the "Input File Slice packet" is commented out.
While containing file slices in the PAR file is an interesting idea,
there is no PAR2 client which supports this packet.
This may be proof of its uselessness?
While some say "use TAR or ZIP to archive many files,"
they say "do not use RAR to split large files" at the same time.
I feel this is odd.
From a programmer's point of view,
I like this slice packet idea and want to implement it.
From a user's point of view,
I will not use this packet myself...
Because PAR2 has a strong file-slice searching method,
saving split files and the PAR2 file at the same time may be enough.
A PAR2 client can find a slice equally well,
whether it is in simply split files, in split RAR files,
or in PAR files as "Input File Slice packets".
The difference is the total size.
An "Input File Slice packet" requires additional size:
the packet header and body header (64 + 24 + 0-3 bytes for alignment).
The "Input File Slice packet" may be useful
when a user wants to set more than 100% redundancy.
For example, suppose there are 2000 input file slices,
and one wants to save them as PAR files with 105% redundancy:
A) create 2100 "Recovery Slice packets".
B) create 2000 "Input File Slice packets" and 100 "Recovery Slice packets".
Both will create PAR files of similar size,
but the creation speed is different.
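The reason the creation speed differs can be seen with a rough cost model (my own sketch, not from any spec): each Reed-Solomon recovery slice is a linear combination of all input slices, while an "Input File Slice packet" is just a copy of one slice.

```python
def creation_cost(n_input, n_recovery, n_stored=0):
    """Rough cost in slice-sized operations: computing one recovery slice
    touches every input slice once; storing an input slice is one copy."""
    return n_input * n_recovery + n_stored

# A) 2100 recovery slices computed from 2000 input slices
cost_a = creation_cost(2000, 2100)          # 4,200,000 slice operations
# B) 2000 input slices stored verbatim + 100 recovery slices
cost_b = creation_cost(2000, 100, 2000)     # 202,000 slice operations
print(cost_a // cost_b)                     # → 20: option B is ~20x less work
```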
from Michael Nahas 2010-06-16
> The order should go: Use cases -> Goals -> Spec -> Code
> The order should not go: Code -> anything else.
> Please refer to your design as "my proposal for Par3".
> We should focus on problems that cannot be fixed using the current spec.
I agree with you. It should be so.
The problem is that nobody is progressing the PAR3 design.
At first I thought that someone would write the PAR3 spec and
I would help him by creating a sample implementation.
For 2 years, no one has written it... everybody is busy.
Should I (and users) wait for more years, or forever?
Now, I show from a programmer's side what PAR3 could do.
It is only a proposal. (Thanks for teaching me the proper English word.)
Then, who can be an editor?
> Can you state the size of overhead for your proposal for Par3?
Par3Main is 22 from header + 4~18
Par3FileDescription is 22~23 from header + 2~32 + length of filename (say 20 bytes?)
Par3InputChecksum is 22~25 from header + 1~3 + 12*InputSize/SliceSize
Par3RecoveryPacket is 22~30 from header + 2~3 + SliceSize
As variable length format, the size is not fixed.
possible Par3Size (without PAR2 packets for compatibility) =
N* (40
+ 75*#ofInputFiles
+ 28*#ofInputFiles + 12*InputSize/SliceSize)
+ 33*RecoverySize/SliceSize + RecoverySize
=
N*(40 + 103*#ofInputFiles + 12*InputSize/SliceSize)
+ 33*RecoverySize/SliceSize + RecoverySize
Par3's overhead per file is around 60% of PAR2's.
When a user creates a PAR2 file with 90% efficiency,
the efficiency becomes 90/((100-90)*0.6+90) = about 94% for PAR3.
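That arithmetic can be evaluated directly (my own restatement of the calculation above; note 90/96 = 0.9375, i.e. roughly 93-94%):

```python
def par3_efficiency(par2_efficiency, overhead_ratio=0.6):
    """Payload fraction for PAR3, assuming PAR3's per-file overhead is
    overhead_ratio (~60%) of PAR2's for the same data."""
    par2_overhead = 1.0 - par2_efficiency
    return par2_efficiency / (par2_overhead * overhead_ratio + par2_efficiency)

print(round(par3_efficiency(0.90), 4))   # → 0.9375, i.e. about 94%
```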
If PAR2 packets are added to PAR3 file,
possible Par3Size (with PAR2 packets for compatibility) =
N*(40 + 103*#ofInputFiles + 12*InputSize/SliceSize)
+ 33*RecoverySize/SliceSize + RecoverySize
+ 30 + (76 + 228*#ofInputFiles + 20*InputSize/SliceSize)
Though the efficiency becomes worse for N=1 or 2,
it will be better for N=4 or more.
In this, PAR2 packets are used only by PAR2 client,
and PAR3 client will ignore those PAR2 packets.
I feel the idea of a smaller packet header is not so bad.
Anyway, I can implement any packet form, and I will follow the official release.
> What are the use cases and problems you see in the Par2 client forums?
I will search the forums, but I cannot use the internet so much.
I think we need an easy web forum for users.
This mailing list is hard for the general public to post to.
(At first I could not post for a long time.)
Or can someone access/admin/edit SourceForge's Parchive forum?
from Michael Nahas 2010-06-17
> A good start would be to look at what languages QuickPar has been translated into.
QuickPar does not translate (encode/decode) the filename.
I think it gets the filename via the system's local codepage
and writes it directly into the PAR file.
So, QuickPar does not strictly write (7-bit) ASCII filenames.
This may not be a problem for single-byte characters (8-bit ASCII extensions),
but it causes serious problems for users of multi-byte characters,
because sometimes the filename is not parsed correctly.
Some characters like \ are not usable in filenames,
but multi-byte characters may contain them in the second byte.
QuickPar refuses to accept those filenames as invalid,
so Japanese users sometimes cannot use Japanese filenames...
> Any objections to the proposed changes?
from PAR2 spec
> File names are case sensitive and can be of any length.
> If a client is doing recovery on an operating system that has
> case-insensitive filenames or limited-length filenames,
> it is up to the client to rename files and directories.
I mention one more compatibility issue: normalization.
This may be hard to understand for single-byte-character users.
Normally, normalization is used for search words:
you can match "ABC", "AbC", or "abc" by using the keyword "abc".
In a PAR2 file, the filename should be written as it is:
"AbC.txt" is written as "AbC.txt", not as "ABC.TXT".
This is a good solution.
For multi-byte characters, normalization is more complex,
because there are several ways to represent one character.
As an easy example, imagine [W] and [VV] (these are not real characters).
As graphics, [W] and [VV] are similar, but the character codes are different.
Unicode has a normalization method to match both with one keyword.
The problem is between OSs:
Windows, Linux, Java, etc. distinguish them;
Mac OS X does not. (I don't know why.)
I think PAR3 should not apply Unicode normalization
(it should write a filename in its OS's style),
just as PAR2 does not change the case of the original filename.
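The [W]/[VV] situation is real in Unicode. For example, "é" can be stored precomposed (NFC, one code point) or decomposed (NFD, "e" plus a combining accent; the decomposed form is what Mac OS X's filesystem stores). A small illustration:

```python
import unicodedata

nfc = "caf\u00e9"     # precomposed: 'é' is U+00E9
nfd = "cafe\u0301"    # decomposed: 'e' + U+0301 combining acute accent

print(nfc == nfd)                                # False: code points differ
print(unicodedata.normalize("NFC", nfd) == nfc)  # True: normalization matches them
print(len(nfc), len(nfd))                        # 4 5: same glyphs, different lengths
```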
Best regards,
Yutaka Sawada
|
|
From: Michael Nahas <mike.nahas@gm...> - 2010-06-17 14:59:02
|
Re M. Niedermayer:

First, TAR is Unix-specific. Sure, Cygwin has a TAR for Windows, but we don't want every user to install Cygwin.

Second, TAR+PAR is slower. Clients creating the PAR file would first have to produce the TAR file. This file is probably too large to keep in memory, and the clients might be forced to write it to disk and then read it in again.

Last, and probably most important, is what to do with the TAR file. Users would have to transmit/save the TAR file. If they don't, it isn't always possible for the recovery client to reproduce it. So, TAR/PkZip can't be done transparently. Any use of TAR/PkZip should be explicit by the user, although it could be made easier by the client.

Also, the comparison of 10 files in 1 slice vs. 10 files in 10 slices ignores the slice size. The slice size required in the first case is equal to the sum of all input files; the slice size in the second case is just the size of the largest file. So the second case uses a smaller recovery file.

There is the general idea of packing multiple input files into an input slice. We can do that without TAR. This approach would pack input files together and not start each file at the start of a slice. Doing that would make Par harder to implement. Right now, each input slice belongs to one and only one file. I don't think the benefits of the approach are enough to justify a change to the spec. If it is an issue, people can (explicitly) use TAR or PkZip.

We may want to decide this after we decide about the new recovery algorithm(s), which might include LDPC. LDPC can support huge numbers of slices, so this may be less of an issue.

Michael Nahas

On Thu, Jun 17, 2010 at 2:06 AM, Michael Niedermayer <michaelni@...> wrote:
> On Wed, Jun 16, 2010 at 10:46:19AM -0400, Michael Nahas wrote:
> [...]
> > files; even a few thousand files if the InputSize is large. If users want
> > to do more, I'd recommend they use TAR or PkZip (which enterprising
> > developers might integrate into their client for ease-of-use).
>
> I think the current par3 design (which is the same as par2 in this respect)
> is not ideal.
>
> Currently 1 slice cannot contain more than 1 file. This leads to a
> series of problems:
> * requiring more slices than files (and more slices are slower)
> * leaving large parts of slices unused. Think of 10 small files: if all 10
>   files are lost, 10 parity slices are needed, but if all 10 files had been
>   in 1 slice, only 1 parity slice would have been needed.
> * 16-bit RS limits the slices to 65535, thus making 32-bit RS codes (which
>   are slower) required for more files.
>
> If the 1-file-per-slice limitation is lifted, these problems disappear:
> more than 65535 files could be stored, multiple small files could be
> recovered through a single parity slice, and slow 32-bit RS codes would
> not be needed. (This is why I am arguing against 32-bit RS codes; it seems
> only the current design benefits from them, and this design is for many
> reasons not ideal.)
>
> How can the slice limit be lifted? Quite simply, by using tar. And as you
> say, "which enterprising developers might integrate into their client for
> ease-of-use", it can be done without the user having to do anything. Both
> command-line and GUI apps can just implement or call tar.
>
> Now, if we take this a small step further, this integration could be made
> mandatory by the spec, and then clients would need to implement only a
> single way of storing multiple files: no special cases for >65535 files.
> Users would not even need to know that tar is run in the background to
> combine the files, so it would also be friendly to uneducated/lazy users.
> Users would just see a par3 GUI that accepted multiple files.
>
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Rewriting code that is poorly written but fully understood is good.
> Rewriting code that one doesn't understand is a sign that one is less
> smart than the original author; trying to rewrite it will not make it
> better.
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
>
> iD8DBQFMGbtqYR7HhwQLD6sRAprTAJ43aLMMzZhJXuHYo2P+SFqvOwcmbQCeLISP
> 6Eu8i5aJBAvYfLTAa5/h2Ck=
> =mdVv
> -----END PGP SIGNATURE-----
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit. See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> Parchive-devel mailing list
> Parchive-devel@...
> https://lists.sourceforge.net/lists/listinfo/parchive-devel
|
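The slice-size trade-off in the reply above can be made concrete with small numbers (my own illustrative sizes, not from the thread):

```python
# Ten small files (assumed sizes, in bytes)
files = [900, 800, 750, 700, 650, 600, 550, 500, 450, 400]

# Packed (TAR-style): one slice holds all ten files.
slice_packed = sum(files)     # 6300: each recovery slice is this big
# Split (PAR2-style): one file per slice, sized for the largest file.
slice_split = max(files)      # 900: each recovery slice is this big

# Protecting against the loss of any ONE slice costs one recovery slice:
print(slice_split, "vs", slice_packed)           # 900 vs 6300: split wins
# Protecting against losing ALL ten files (Niedermayer's scenario):
print(10 * slice_split, "vs", 1 * slice_packed)  # 9000 vs 6300: packed wins
```

So each side's example is right for its own failure model; which layout is cheaper depends on how many slices you expect to lose.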
|
From: Michael Niedermayer <michaelni@gm...> - 2010-06-17 06:11:14
|
On Wed, Jun 16, 2010 at 10:46:19AM -0400, Michael Nahas wrote:
[...]
> files; even a few thousand files if the InputSize is large. If users want
> to do more, I'd recommend they use TAR or PkZip (which enterprising
> developers might integrate into their client for ease-of-use).

I think the current par3 design (which is the same as par2 in this respect) is not ideal.

Currently 1 slice cannot contain more than 1 file. This leads to a series of problems:
* requiring more slices than files (and more slices are slower)
* leaving large parts of slices unused. Think of 10 small files: if all 10 files are lost, 10 parity slices are needed, but if all 10 files had been in 1 slice, only 1 parity slice would have been needed.
* 16-bit RS limits the slices to 65535, thus making 32-bit RS codes (which are slower) required for more files.

If the 1-file-per-slice limitation is lifted, these problems disappear: more than 65535 files could be stored, multiple small files could be recovered through a single parity slice, and slow 32-bit RS codes would not be needed. (This is why I am arguing against 32-bit RS codes; it seems only the current design benefits from them, and this design is for many reasons not ideal.)

How can the slice limit be lifted? Quite simply, by using tar. And as you say, "which enterprising developers might integrate into their client for ease-of-use", it can be done without the user having to do anything. Both command-line and GUI apps can just implement or call tar.

Now, if we take this a small step further, this integration could be made mandatory by the spec, and then clients would need to implement only a single way of storing multiple files: no special cases for >65535 files. Users would not even need to know that tar is run in the background to combine the files, so it would also be friendly to uneducated/lazy users. Users would just see a par3 GUI that accepted multiple files.

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesn't understand is a sign that one is less smart
than the original author; trying to rewrite it will not make it better. |
|
From: Michael Nahas <mike.nahas@gm...> - 2010-06-16 16:49:49
|
Nice work! I hate being wrong, let alone twice in one day!
So, a lot of strings that use Latin-1 or other ASCII extensions will show up
as invalid UTF-8 strings.
So the proposed changes are:
1. Changing ASCII to UTF-8 in Par3
Any string that is not valid UTF-8 must be reported as an error to
the user. Clients may try to recover from the error, but the specification
does not suggest how.
2. Removing the optional 16-bit Unicode packets in Par3. (Essentially
deprecating them.)
The spec should probably include a list of common operating systems
(Windows, OS X, Linux, AT&T Unix, BSD Unix) with recommended libraries for
supporting Unicode and, if required, recommended methods for translating
Unicode strings to unique local filenames.
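Proposed change 1 is cheap to implement. A minimal sketch of the required check (the function name and error message are my own, not from any spec):

```python
def check_filename(raw: bytes) -> str:
    """Decode a filename field as UTF-8; raise with a clear message on
    invalid input, as proposed change 1 requires reporting an error."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as e:
        raise ValueError(
            f"filename is not valid UTF-8 at byte {e.start}: {raw!r}")

print(check_filename(b"caf\xc3\xa9.txt"))   # valid UTF-8 → 'café.txt'
# b"caf\xe9.txt" (Latin-1 'é') would raise: 0xE9 is not valid UTF-8 there
```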
Any objections to the proposed changes?
Mike
On Wed, Jun 16, 2010 at 11:55 AM, Jesus Cea <jcea@...> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 16/06/10 17:51, Michael Nahas wrote:
> > I have argued against changing the spec from ASCII to UTF-8 because I
> > believed the two to be incompatible. You've now corrected me and shown
> > that they are compatible. And I'll gladly accept a change from ASCII to
> > UTF-8 if we don't see that people made the same mistaken assumption that
> > I did OR if they make incompatible mistaken assumptions (e.g., some
> > clients support Latin-1, while some support Windows-1252 or UTF-8).
>
> "autodetecting" utf-8 is pretty safe:
>
> > <http://en.wikipedia.org/wiki/Utf-8#Advantages> )
>
> Anyway I would put in PAR3 standard that the filenames MUST be UTF-8. It
> is the client responsability to translate to/from the local filesystem
> charset.
>
> - --
> Jesus Cea Avion / jcea@... - http://www.jcea.es/
> jabber / xmpp:jcea@...
> "Things are not so easy"
> "My name is Dump, Core Dump"
> "El amor es poner tu felicidad en la felicidad de otro" - Leibniz
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iQCVAwUBTBjz8plgi5GaxT1NAQKUiwP9HPdnzTNsxQt70mWErgXUDVcmVEG5yj/j
> q81wc4E1vE9lrodYQBDsTgIX2nFqWuPD7hdB6UxKyjC+GIAB6gJeBuTA7iWIT5xn
> i+xphqRZPW6Te9yNv2ei2R6svLQqFantHqag/6JHVf4keVTB0/IzmgmPl1yx/l5Z
> bNI2hkr5z1Y=
> =z84P
> -----END PGP SIGNATURE-----
>
>
>
|
|
From: Jesus Cea <jcea@jc...> - 2010-06-16 15:55:50
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 16/06/10 17:51, Michael Nahas wrote:
> I have argued against changing the spec from ASCII to UTF-8 because I
> believed the two to be incompatible. You've now corrected me and shown
> that they are compatible. And I'll gladly accept a change from ASCII to
> UTF-8 if we don't see that people made the same mistaken assumption that
> I did OR if they make incompatible mistaken assumptions (e.g., some
> clients support Latin-1, while some support Windows-1252 or UTF-8).

"Autodetecting" UTF-8 is pretty safe:

<http://en.wikipedia.org/wiki/Utf-8#Advantages>

Anyway, I would put in the PAR3 standard that filenames MUST be UTF-8. It is the client's responsibility to translate to/from the local filesystem charset.

- --
Jesus Cea Avion / jcea@... - http://www.jcea.es/
jabber / xmpp:jcea@...
"Things are not so easy"
"My name is Dump, Core Dump"
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBjz8plgi5GaxT1NAQKUiwP9HPdnzTNsxQt70mWErgXUDVcmVEG5yj/j
q81wc4E1vE9lrodYQBDsTgIX2nFqWuPD7hdB6UxKyjC+GIAB6gJeBuTA7iWIT5xn
i+xphqRZPW6Te9yNv2ei2R6svLQqFantHqag/6JHVf4keVTB0/IzmgmPl1yx/l5Z
bNI2hkr5z1Y=
=z84P
-----END PGP SIGNATURE----- |
|
From: Michael Nahas <mike.nahas@gm...> - 2010-06-16 15:51:11
|
You're right. I'm wrong. "ASCII" always means 7-bit values.

I'd now accept changing ASCII to UTF-8 if a survey of common Par2 usage shows that people are not using Latin-1 (or, worse, Windows-1252), as I would have mistakenly done, or if they are using incompatible assumptions about which 8-bit ASCII extension to use. A good start would be to look at what languages QuickPar has been translated into.

By the way, I have always supported using Unicode. Unicode was required in the Par1 standard and I wanted to keep it standard in Par2. The Par1 developers told me that Unicode libraries did not exist in all languages at that time and, because of this, that compatibility between Par1 clients was bad. So, Unicode became optional in Par2. Since there appears to be good library support now, I've backed changing Unicode packets to being required.

I have argued against changing the spec from ASCII to UTF-8 because I believed the two to be incompatible. You've now corrected me and shown that they are compatible. And I'll gladly accept a change from ASCII to UTF-8 if we don't see that people made the same mistaken assumption that I did OR if they make incompatible mistaken assumptions (e.g., some clients support Latin-1, while some support Windows-1252 or UTF-8).

Michael Nahas

On Wed, Jun 16, 2010 at 11:09 AM, Jesus Cea <jcea@...> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 16/06/10 16:46, Michael Nahas wrote:
> > UTF-8 is NOT compatible with ASCII. It's compatible with 7-bit ASCII,
> > but not 8-bit. I would strongly oppose changing the spec so that ASCII
> > is now UTF-8, because some people may be using 8-bit ASCII.
>
> There is no such thing as "8-bit ASCII". ASCII defines 0-127. You
> are probably referring to the Latin-1/ISO8859-1 encoding, a (western
> European) superset of ASCII using 0-255. In fact, there are a lot of
> different encodings using 0-255, most of them compatible in the ASCII
> range 0-127, but with different meanings in the 128-255 range.
>
> http://en.wikipedia.org/wiki/ASCII#Unicode
>
> UTF-8 is actually a superset of ASCII, too. That is, an ASCII string is
> a valid UTF-8 string.
>
> http://en.wikipedia.org/wiki/Utf-8
>
> (Read the second paragraph, and
> <http://en.wikipedia.org/wiki/Utf-8#Advantages>.)
>
> ANY modern spec MUST support Unicode, and UTF-8 is a practical and
> convenient way to do it.
>
> [...]
|
|
From: Jesus Cea <jcea@jc...> - 2010-06-16 15:10:07
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 16/06/10 16:46, Michael Nahas wrote:
> UTF-8 is NOT compatible with ASCII. It's compatible with 7-bit ASCII,
> but not 8-bit. I would strongly oppose changing the spec so that ASCII
> is now UTF-8, because some people may be using 8-bit ASCII.

There is no such thing as "8-bit ASCII". ASCII defines 0-127. You are probably referring to the Latin-1/ISO8859-1 encoding, a (western European) superset of ASCII using 0-255. In fact, there are a lot of different encodings using 0-255, most of them compatible in the ASCII range 0-127, but with different meanings in the 128-255 range.

http://en.wikipedia.org/wiki/ASCII#Unicode

UTF-8 is actually a superset of ASCII, too. That is, an ASCII string is a valid UTF-8 string.

http://en.wikipedia.org/wiki/Utf-8

(Read the second paragraph, and <http://en.wikipedia.org/wiki/Utf-8#Advantages>.)

ANY modern spec MUST support Unicode, and UTF-8 is a practical and convenient way to do it.

- --
Jesus Cea Avion / jcea@... - http://www.jcea.es/
jabber / xmpp:jcea@...
"Things are not so easy"
"My name is Dump, Core Dump"
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTBjpOJlgi5GaxT1NAQL0KwP/UmSU68xTbJGfEJsDP/fg+YLbszLHX+Ka
Q0yZvPNt4MSONpQR/38I82Rqkr54ukwD+A/juKRx4O/6ueMDy5v8D3zLWj0+lSzl
kxSdRORR1gNGe+TQ+Ps8Z188/BdCD/wcliDHfATChoCnSBqr+b0zWz+R0qJedJ/O
W78Ez43n5JY=
=KQKs
-----END PGP SIGNATURE----- |
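Cea's superset claim is easy to verify mechanically (a quick demonstration of my own, not part of the thread):

```python
# Every 7-bit ASCII byte decodes identically under ASCII and UTF-8.
for i in range(128):
    b = bytes([i])
    assert b.decode("ascii") == b.decode("utf-8")

# Bytes 128-255 are where encodings diverge: a lone Latin-1 high byte
# is NOT valid UTF-8, which is why "autodetecting" UTF-8 is fairly safe.
latin1 = "A\u00f1o".encode("latin-1")   # Spanish "Año" → b'A\xf1o'
try:
    latin1.decode("utf-8")
except UnicodeDecodeError:
    print("Latin-1 bytes rejected as UTF-8")
```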
|
From: Michael Nahas <mike.nahas@gm...> - 2010-06-16 14:46:25
|
Re: Y. Sawada,
The order should go: Use cases -> Goals -> Spec -> Code
The order should not go: Code -> anything else.
Please refer to your design as "my proposal for Par3". I know you've been
the one pushing this, but from observing the emails it does not look like
you have consensus yet. Without consensus, you'll never get others to
implement your proposal for Par3.
If people want utility, they will use a good client. Many people use a GUI
client rather than a command-line client because they want utility. The
file spec has to do with possibility, not utility. We should focus on
problems that cannot be fixed using the current spec.
UTF-8 is NOT compatible with ASCII. It's compatible with 7-bit ASCII, but
not 8-bit. I would strongly oppose changing the spec so that ASCII is now
UTF-8, because some people may be using 8-bit ASCII. If the Unicode tools
exist, and I think they do now, I would be in favor of making Unicode
support mandatory in Par3.
Can you state the size of overhead for your proposal for Par3? Please state
in a form like I did below for Par2.
Par2Main is 64 from header + 12 + 16 * #ofInputFiles
Par2FileDescription is 64 from header + 48 + length of filename (say 20
bytes?)
Par2InputChecksum is 64 from header + 16 + 20*InputSize/SliceSize
Par2RecoveryPacket is 64 from header + 4 + SliceSize
If N is the number of repetitions of Main, FileDescription, and
InputChecksum packets, and RecoverySize/SliceSize is the number of
Par2RecoveryPackets
Par2Size =
N* (76 + 16*#ofInputFiles
+ 132*#ofInputFiles
+ 80*#ofInputFiles + 20*InputSize/SliceSize)
+ 68*RecoverySize/SliceSize + RecoverySize
=
N*(76 + 228*#ofInputFiles + 20*InputSize/SliceSize)
+ 68*RecoverySize/SliceSize + RecoverySize
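Plugging numbers into the simplified formula above (a direct transcription; the example parameter values are my own illustration):

```python
def par2_size(n, n_files, n_slices, n_recovery, slice_size):
    """Par2Size per the formula above: N repeated copies of the Main,
    FileDescription, and InputChecksum packets, plus recovery packets.
    n_slices is InputSize/SliceSize; n_recovery is RecoverySize/SliceSize."""
    metadata = n * (76 + 228 * n_files + 20 * n_slices)
    recovery = n_recovery * (68 + slice_size)
    return metadata + recovery

# Illustrative: 100 files in 382 slices of 256 KiB (~100 MB of input),
# 20 recovery slices (~5% redundancy), metadata repeated N=3 times.
print(par2_size(3, 100, 382, 20, 262_144))   # → 5335788; ~91.5 kB is metadata
```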
Par2's overhead per-file is not to be laughed at. For expected N, it's a
few kB per input file. Personally, I think that's fine for a few hundred
files; even a few thousand files if the InputSize is large. If users want
to do more, I'd recommend they use TAR or PkZip (which enterprising
developers might integrate into their client for ease-of-use).
What are the use cases and problems you see in the Par2 client forums?
Michael Nahas
On Tue, Jun 15, 2010 at 9:36 PM, <ten_fon@...> wrote:
> Hello, Michael Nahas
> I am Yutaka Sawada.
> Even if you are not writing PAR3 specifications,
> your comments as PAR2 designer are helpful.
>
>
> > "My programing skill is not so high, and I can not write
> > standard C++ code." - Yutaka Sawada
> > Does this not worry anyone?!
>
> Don't worry.
> I am making only a sample application of PAR3.
> When the PAR3 spec becomes stable,
> other highly skilled programmers like Peter Clements or
> Peter Cordes will create much better PAR3 clients.
> If I miss something, someone else may find the bug.
>
>
> > What problems are you trying to solve with Par3?
>
> I want to solve as many problems as I can.
> Michael Nahas did a very good job with the PAR2 design.
> I copied most of the PAR2 spec and fixed only some small parts.
> In the PAR2 specifications, there seems to be no serious problem
> which cannot be solved by combining other methods.
> Many people still use old, obsolete PAR2 clients.
> Simple solutions by such users are like below:
>
> > * Fix the bug in the Reed-Solomon algorithm.
> Create more recovery slices.
>
> > * Support a faster algorithm
> Use a faster PC.
>
> > * Support for Unicode/UTF-8?
> Don't use Unicode filenames.
>
> > * Support for large files / more slices?
> Use RAR (archive files and split).
>
> > * Lower overhead in Par2 files?
> Use larger media/drive or faster network.
>
> > * Support large numbers of files?
> > * Better support for directories?
> > * Use with CD/DVD/? disk images.
> Use archiver like TAR/ZIP/RAR.
>
> But those solutions are not convenient for users.
> When PAR3 is used widely, it will be not only for skilled PC users
> but for novice, lazy users also.
> This is the difference between possible to do and easy to do.
> I think users want utility.
>
>
> > mail by me at 2010-05-23
> > User can make both PAR2 and PAR3 files for same recovery set.
> > Even if there is no compatibility in packet level,
> > PAR files can contain both PAR2 packets and PAR3 packets same time.
> > PAR2 client reads PAR2 packets, and PAR3 client reads PAR3 packets.
>
> As I wrote before, a PAR3 file can be constructed in PAR2 format
> by adding some extra packets.
> There are three packet form selections;
> (2.0) PAR2 body data with PAR2 packet header - PAR2 spec,
> (2.5) PAR3 body data with PAR2 packet header - previous PAR3 spec, and
> (3.0) PAR3 body data with PAR3 packet header - new PAR3 spec.
>
> Because a PAR2 client cannot understand PAR3 data anyway,
> I thought using the PAR2 packet header for PAR3 data might be useless.
> The difference is the parser of a PAR3 client.
> In (2.5), a PAR3 client searches for PAR2 packet headers,
> and reads PAR3 body data and common PAR2 body data.
> In (3.0), a PAR3 client searches for PAR3 packet headers,
> and reads PAR3 body data.
> Both methods work well to collect PAR3 data.
>
> The only problem with using packet form (2.5) is that
> the PAR2 packet header is inefficient.
> So I made a smaller, more compact PAR3 packet header.
> If a PAR3 file consists of PAR2 packets plus additional PAR3 packets,
> it becomes larger than a PAR2 file.
> If a PAR3 file consists of PAR3 packets with the smaller header,
> it becomes smaller than a PAR2 file.
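For reference, the 64-byte PAR2 packet header whose overhead is being discussed can be sketched like this (a minimal illustration based on the PAR 2.0 spec; the zeroed recovery-set ID and the body text are placeholder values):

```python
import hashlib
import struct

def par2_packet(set_id: bytes, ptype: bytes, body: bytes) -> bytes:
    """Wrap a packet body in the 64-byte PAR2 packet header:
    8-byte magic, 8-byte packet length, 16-byte MD5 checksum,
    16-byte recovery set ID, 16-byte packet type."""
    assert len(set_id) == 16 and len(ptype) == 16
    length = 64 + len(body)  # the length field covers header + body
    # The MD5 covers everything after the checksum field itself
    md5 = hashlib.md5(set_id + ptype + body).digest()
    return (b"PAR2\x00PKT" + struct.pack("<Q", length)
            + md5 + set_id + ptype + body)

# 64 bytes of header per packet, no matter how small the body is
pkt = par2_packet(b"\x00" * 16, b"PAR 2.0\x00Creator\x00", b"demo")
print(len(pkt) - 4)  # -> 64
```

A per-packet header this large is what makes many small packets costly, and why a more compact PAR3 header shrinks the file.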
>
> To make compatibility clear, I updated the PAR3 specification draft.
> I added a new optional packet, the "PAR2 Container packet".
> A PAR2 index file can be included in a PAR3 file.
> In my latest PAR3 sample, MultiPar v1.1.5.5,
> there is an option, "Create PAR 2.0 compatible PAR3 file".
> Because the basic mechanism of PAR3 is the same as PAR2
> (just the packet header and checksum are different),
> a PAR3 client can create PAR3 packets and PAR2 packets at the same time.
> I tested with QuickPar and phpar2.exe.
> Both PAR2 clients can read PAR3 files and verify input files with them.
>
>
> > I don't know if libpar2 uses the trick I showed Peter Clements,
> > but it was to split 16-bit computations into high and low-byte operations.
>
> It uses the trick. (And so does my PAR2 client.)
> Thank you for the good sample implementation.
> I hope you will make a sample of PAR3.
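The high/low-byte trick works because multiplication by a fixed factor in GF(2^16) is linear over XOR: a 16-bit word splits into its two bytes, and the product becomes the XOR of two 256-entry table lookups. A minimal sketch, assuming PAR2's generator polynomial 0x1100B; `gf_mul` here is only a slow bitwise reference, not the optimized routine from libpar2:

```python
# GF(2^16) as used by PAR2: polynomial x^16 + x^12 + x^3 + x + 1
POLY = 0x1100B

def gf_mul(a: int, b: int) -> int:
    """Slow reference multiply: carry-less multiply reduced mod POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10000:
            a ^= POLY
        b >>= 1
    return r

def make_tables(factor: int):
    """The trick: precompute two 256-entry tables for one factor."""
    lo = [gf_mul(factor, b) for b in range(256)]        # low byte
    hi = [gf_mul(factor, b << 8) for b in range(256)]   # high byte
    return lo, hi

def gf_mul_fast(word: int, lo, hi) -> int:
    # GF multiplication distributes over XOR, so split the word
    return lo[word & 0xFF] ^ hi[word >> 8]
```

With the tables built once per factor, each 16-bit word in a slice costs two lookups and one XOR instead of a full field multiplication.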
>
>
> > I strongly recommend against changing existing ASCII names to be UTF-8,
> > since it would not be backwards compatible.
>
> You may be misunderstanding.
> UTF-8 is compatible with ASCII when the filename is ASCII,
> so ASCII filenames are not changed by UTF-8 encoding.
>
> If a filename contains non-ASCII characters,
> an "ASCII char array" cannot represent it anyway.
> Under the standard PAR 2.0 specification,
> the Unicode Filename packet is required to support Japanese characters,
> but no PAR2 clients supported that optional packet.
> (This is why I made my PAR2 client for Japanese users.)
> Because converting between UTF-16 and UTF-8 is always possible,
> UTF-8 will be better than the optional Unicode Filename packet.
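Both claims here can be checked directly: ASCII names are byte-identical under UTF-8, and valid Unicode round-trips losslessly between UTF-16 and UTF-8 (a quick sketch; the filenames are made-up examples):

```python
# An ASCII filename encodes to the same bytes under ASCII and UTF-8,
# so old clients reading "ASCII char arrays" see no difference.
name = "backup01.dat"
assert name.encode("utf-8") == name.encode("ascii")

# A non-ASCII (e.g. Japanese) filename round-trips losslessly
# between UTF-16 and UTF-8.
jp = "\u65e5\u672c\u8a9e.txt"  # "Japanese" + .txt, written with escapes
utf16 = jp.encode("utf-16-le")
assert utf16.decode("utf-16-le").encode("utf-8").decode("utf-8") == jp
print("round trip ok")
```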
>
>
> > what will convince people to switch from Par2 to Par3?
>
> Read the forums of QuickPar and ICE ECC.
> Both are widely used file-repair applications with a GUI for Windows.
> Their missing features and weak points will become the reason to switch.
> There are some other applications without a GUI,
> but they are not useful for general novice users.
>
> Now a PAR3 file is compatible with PAR2,
> and it is smaller, faster, and more convenient than PAR2.
> PAR3 can solve most of the problems of PAR2.
> While the problems of PAR2 are not serious for skilled users,
> some novice users find it annoying to solve them by themselves.
> I am making PAR3 for general users.
>
>
> Best regards,
> Yutaka Sawada
>
>
> ------------------------------------------------------------------------------
> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> lucky parental unit. See the prize list and enter to win:
> http://p.sf.net/sfu/thinkgeek-promo
> _______________________________________________
> Parchive-devel mailing list
> Parchive-devel@...
> https://lists.sourceforge.net/lists/listinfo/parchive-devel
>
|
|
From: <ten_fon@ma...> - 2010-06-16 02:03:20
|
Hello, Michael Nahas. I am Yutaka Sawada.
Even if you are not writing the PAR3 specifications, your comments as the PAR2 designer are helpful.

> "My programing skill is not so high, and I can not write
> standard C++ code." - Yutaka Sawada
>
> Does this not worry anyone?!

Don't worry. I am making only a sample application of PAR3.
When the PAR3 spec becomes stable, other highly skilled programmers
like Peter Clements or Peter Cordes will create much better PAR3 clients.
If I miss something, someone else may find the bug.

> What problems are you trying to solve with Par3?

I want to solve as many as I can.
Michael Nahas did a very good job on the PAR2 design.
I copied most of the PAR2 spec and fixed only some small parts.
The PAR2 specification seems to have no serious problem
that cannot be solved by combining it with other methods.
Many people still use old, obsolete PAR2 clients.
Simple workarounds for users look like this:

> * Fix the bug in the Reed-Solomon algorithm.
Create more recovery slices.

> * Support a faster algorithm
Use a faster PC.

> * Support for Unicode/UTF-8?
Don't use Unicode filenames.

> * Support for large files / more slices?
Use RAR (archive the files and split them).

> * Lower overhead in Par2 files?
Use larger media/drives or a faster network.

> * Support large numbers of files?
> * Better support for directories?
> * Use with CD/DVD/? disk images.
Use an archiver like TAR/ZIP/RAR.

But those workarounds are not convenient for users.
When PAR3 comes into wide use, it will serve not only skilled PC users
but also novice or lazy users.
This is the difference between possible to do and easy to do.
I think users want utility.

> mail by me at 2010-05-23
> User can make both PAR2 and PAR3 files for same recovery set.
> Even if there is no compatibility in packet level,
> PAR files can contain both PAR2 packets and PAR3 packets same time.
> PAR2 client reads PAR2 packets, and PAR3 client reads PAR3 packets.

As I wrote before, a PAR3 file can be constructed in PAR2 format
by adding some extra packets.
There are three possible packet forms:
(2.0) PAR2 body data with a PAR2 packet header - the PAR2 spec,
(2.5) PAR3 body data with a PAR2 packet header - the previous PAR3 spec, and
(3.0) PAR3 body data with a PAR3 packet header - the new PAR3 spec.

Because a PAR2 client cannot understand PAR3 data anyway,
I thought using the PAR2 packet header for PAR3 data might be useless.
The difference is in the parser of a PAR3 client.
In (2.5), a PAR3 client searches for PAR2 packet headers,
and reads the PAR3 body data and the common PAR2 body data.
In (3.0), a PAR3 client searches for PAR3 packet headers,
and reads the PAR3 body data.
Both methods work well for collecting PAR3 data.

The only problem with using packet form (2.5) is that
the PAR2 packet header is inefficient.
So I made a smaller, more compact PAR3 packet header.
If a PAR3 file consists of PAR2 packets plus additional PAR3 packets,
it becomes larger than a PAR2 file.
If a PAR3 file consists of PAR3 packets with the smaller header,
it becomes smaller than a PAR2 file.

To make compatibility clear, I updated the PAR3 specification draft.
I added a new optional packet, the "PAR2 Container packet".
A PAR2 index file can be included in a PAR3 file.
In my latest PAR3 sample, MultiPar v1.1.5.5,
there is an option, "Create PAR 2.0 compatible PAR3 file".
Because the basic mechanism of PAR3 is the same as PAR2
(just the packet header and checksum are different),
a PAR3 client can create PAR3 packets and PAR2 packets at the same time.
I tested with QuickPar and phpar2.exe.
Both PAR2 clients can read PAR3 files and verify input files with them.

> I don't know if libpar2 uses the trick I showed Peter Clements,
> but it was to split 16-bit computations into high and low-byte operations.

It uses the trick. (And so does my PAR2 client.)
Thank you for the good sample implementation.
I hope you will make a sample of PAR3.

> I strongly recommend against changing existing ASCII names to be UTF-8,
> since it would not be backwards compatible.

You may be misunderstanding.
UTF-8 is compatible with ASCII when the filename is ASCII,
so ASCII filenames are not changed by UTF-8 encoding.

If a filename contains non-ASCII characters,
an "ASCII char array" cannot represent it anyway.
Under the standard PAR 2.0 specification,
the Unicode Filename packet is required to support Japanese characters,
but no PAR2 clients supported that optional packet.
(This is why I made my PAR2 client for Japanese users.)
Because converting between UTF-16 and UTF-8 is always possible,
UTF-8 will be better than the optional Unicode Filename packet.

> what will convince people to switch from Par2 to Par3?

Read the forums of QuickPar and ICE ECC.
Both are widely used file-repair applications with a GUI for Windows.
Their missing features and weak points will become the reason to switch.
There are some other applications without a GUI,
but they are not useful for general novice users.

Now a PAR3 file is compatible with PAR2,
and it is smaller, faster, and more convenient than PAR2.
PAR3 can solve most of the problems of PAR2.
While the problems of PAR2 are not serious for skilled users,
some novice users find it annoying to solve them by themselves.
I am making PAR3 for general users.

Best regards,
Yutaka Sawada
|
|
From: Michael Nahas <mike.nahas@gm...> - 2010-06-15 16:59:07
|
Re M. Niedermayer:

PAR + TAR/PkZip: Sorry, I must have been unclear. While working on the Par2 spec, I had recommended getting rid of multi-file support and having users use TAR. However, the Par1 developers strongly recommended that I keep multi-file support, so it was kept in. I believe we should keep multi-file support. I think that supporting a few thousand files is fine; if users need more than that, they can use TAR or PkZip.

Unicode: I think it is better to be backwards compatible and use 16-bit Unicode than to be non-compatible and use UTF-8.

Use Cases:

* Streaming (e.g., NASA or multicast): These usually have burst errors. The best way to get around them is to randomize the order of the input slices (to some extent) and to interleave recovery slices before transmission. I don't think we should support streaming, but it is something to think about.

* User-level file system (as a sister project): I've liked this idea for a while, but I think it is well beyond the scope of PAR. It only works well with XOR-based LDPC, and it has many requirements that differ from PAR's.

* Verification/redundancy for file distribution: I'm not sure my own ideas were really clear. :) I was thinking of people who distribute binaries and indicate on a webpage that "the valid MD5 hash is XXXX:XXXX..." to prevent malicious modification. If someone included a Par2 file with the binary, a person could just download and double-click on the Par2 file to verify the binary and its MD5 hash. This is supported now by the file spec; however, I don't know if the GUIs or the reference implementation print out the MD5 hashes of the files.

Niedermayer, your comments got truncated again. I doubt it is my end (Google Mail). Do you know what might be causing it?

Mike

On Mon, Jun 14, 2010 at 4:04 PM, Michael Niedermayer <michaelni@...> wrote:
> On Mon, Jun 14, 2010 at 12:48:21PM -0400, Michael Nahas wrote:
> > PAR + TAR/PkZip: I agree with P. Cordes, tools for using TAR on Windows
> > exist, as do tools for using PkZip on Unixes. If someone needs
> > more than 2^16 files, they can use one of those. We don't need to say
> > which.
>
> I think I misunderstood the suggestion. The way I understood it was that
> we could support just one file in par3 and use tar for multiple-file
> support (as, for example, gzip/bzip2 do). In that case, using PkZip on
> Windows and tar on Linux would have meant an annoying incompatibility.
> If it is just for the rare case of more than 65535 files, this is no
> longer a real issue.
>
> [...]
^^^^^^^^^^^^^ Truncation?
> > UTF-8 Support: I wish it were as easy as saying "Unicode has been around for
> > ages" to say that support exists. Obviously, Java supports it. It looks
> > like GNU has a C library.
> > (http://www.gnu.org/software/libidn/manual/libidn.html#Utility-Functions)

... [Cut by Michael Nahas]

> > Any others? Any arguments in favor of/against one of these? Any ideas on how
> > to implement redundancy for a single-disk backup? Should we have a sister
> > project for a user-level file system? (Sounds expensive with RS codes.)
> >
> > Also, the Par2 spec included optional packets for containing input file
> > slices. So, people would not have to use a file-splitter, like RAR, for
> > Usenet or multi-disk backups. Do we want to push to make those packets
> > mandatory?
> [...]
^^^^^^^^^^^^^^^^ Truncation?
|
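The slice-interleaving idea raised for burst errors can be sketched as a simple block interleaver (a minimal illustration, not part of any PAR spec; `depth` and the twelve-slice input are made-up values):

```python
def interleave(slices, depth):
    """Block interleaver: neighbours in the output were `depth` apart in
    the input, so a burst loss in transmission maps to scattered
    erasures after de-interleaving."""
    assert len(slices) % depth == 0
    cols = len(slices) // depth
    # Read the (depth x cols) table row by row after filling it column-wise
    return [slices[r + c * depth] for r in range(depth) for c in range(cols)]

def deinterleave(slices, depth):
    # Inverting a (depth x cols) interleave is interleaving with `cols`
    return interleave(slices, len(slices) // depth)

order = interleave(list(range(12)), 3)
print(order)  # -> [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
```

Losing, say, three consecutive transmitted slices now erases slices that were three apart in the original order, which an erasure code handles far better than a contiguous gap.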
|
From: Michael Niedermayer <michaelni@gm...> - 2010-06-14 20:08:36
|
On Mon, Jun 14, 2010 at 12:48:21PM -0400, Michael Nahas wrote:
> PAR + TAR/PkZip: I agree with P. Cordes, tools for using TAR on Windows
> exist, as do tools for using PkZip on Unixes. If someone needs
> more than 2^16 files, they can use one of those. We don't need to say
> which.

I think I misunderstood the suggestion. The way I understood it was that we could support just one file in par3 and use tar for multiple-file support (as, for example, gzip/bzip2 do). In that case, using PkZip on Windows and tar on Linux would have meant an annoying incompatibility. If it is just for the rare case of more than 65535 files, this is no longer a real issue.

[...]

> UTF-8 Support: I wish it were as easy as saying "Unicode has been around for
> ages" to say that support exists. Obviously, Java supports it. It looks
> like GNU has a C library.
> (http://www.gnu.org/software/libidn/manual/libidn.html#Utility-Functions)
> However, since we already have Unicode's 16-bit filenames as an optional
> feature, do we need UTF-8, or should we just make the Unicode packets mandatory?

If the par3 spec is compatible with existing par2 clients, then this is an option. Otherwise I think supporting only UTF-8 is the better choice.

> BSD vs. LGPL/GPL licence for the reference implementation: First, this is the
> reference implementation. It should be clean and clear, not necessarily the
> high-performance library used by everyone. Second, as much as I try to work
> on the spec and not the code, I think the license on the reference
> implementation should change from the GPL to the LGPL or BSD. This will
> allow people to use the code in a dynamic library (*.DLL/*.so) and not have
> to make public the source of their entire application. (For the LGPL, they
> would have to make public their changes to the library.) Changing the
> license will require either getting the permission of the authors of the
> current reference code or starting a new reference implementation from
> scratch. As it is a reference implementation, and not necessarily a
> high-performance implementation, I'd suggest the LGPL. But I'm willing to
> leave it up to the person who invests the time to write it.
>
> [Side note: http://news.slashdot.org/article.pl?sid=10/06/04/1953232 Looks
> like Google released VP8 with a BSD license plus a separate patent license.
> The patent license is voided if a company brings a patent suit against
> Google. Thus, if someone sues Google, Google is free to sue them back using
> the VP8 patents.]

Right; I had just remembered that they changed it from their GPL-incompatible license, not what they changed it to. Their bitstream spec license, by the way, seems to be CC.

> The important question, in my opinion, is what use cases we should be aiming
> to support?
> * Usenet transmission of large files
> * Multi-disk backup redundancy (e.g., burn 4 CDs, burn a 5th with redundant
> data).
> * ? redundancy for remote backups
> * ? redundancy on a single-disk backup (Already done by DVDisaster; we'd like
> to support it, but there isn't an easy implementation yet. Is this better
> done by a filesystem?)

par* surely can be used for this, but when it is done in a filesystem or at a lower level, it is possible to do things that a file-level par* tool cannot do. Examples are:
* random read & write to the disk (a filesystem can just update the parity sectors in the background)
* disks use their own FEC

With a par*-like tool, one would thus be writing several layers of ECC information onto a disk and would not be able to pass any information between them; that is, if too many errors exist in a sector, it would not be visible to par at all. This is quite inefficient: for example, 90% of a sector might be undamaged and could significantly help an RS error-and-erasure decoder. Furthermore, the hardware generally knows which parts are damaged. For CDs you can see 'readcd -edc-corr', which bypasses the last stage of error correction done by CD drives and does it in software. A hypothetical tool could use the information from sectors that fail this last-stage correction on CDs instead of treating them as black-box uncorrectable and unavailable.

> * ? redundancy for streaming (e.g., NASA transmissions)

Deep space communication is affected by bit flips and burst errors; the packet erasure correction aimed at by par* is not suitable for this.

> * ? verification/redundancy for file distribution (stronger than just an MD5
> check?)

I am not sure I understand this use case; could you elaborate?

> Any others? Any arguments in favor of/against one of these? Any ideas on how
> to implement redundancy for a single-disk backup? Should we have a sister
> project for a user-level file system? (Sounds expensive with RS codes.)
>
> Also, the Par2 spec included optional packets for containing input file
> slices. So, people would not have to use a file-splitter, like RAR, for
> Usenet or multi-disk backups. Do we want to push to make those packets
> mandatory?
[...]

--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Breaking DRM is a little like attempting to break through a door even though the window is wide open and the only thing in the house is a bunch of things you don't want and which you would get tomorrow for free anyway
|
|
From: Michael Nahas <mike.nahas@gm...> - 2010-06-14 16:48:28
|
PAR + TAR/PkZip: I agree with P. Cordes, tools for using TAR on Windows exist, as do tools for using PkZip on Unixes. If someone needs more than 2^16 files, they can use one of those. We don't need to say which.

Directories: The spec should probably be modified to support empty directories. P. Cordes says we also have a problem in the reference implementation's support for directories.

32-bit RS: No clear answer yet.

UTF-8 Support: I wish it were as easy as saying "Unicode has been around for ages" to say that support exists. Obviously, Java supports it. It looks like GNU has a C library. (http://www.gnu.org/software/libidn/manual/libidn.html#Utility-Functions) However, since we already have Unicode's 16-bit filenames as an optional feature, do we need UTF-8, or should we just make the Unicode packets mandatory?

BSD vs. LGPL/GPL licence for the reference implementation: First, this is the reference implementation. It should be clean and clear, not necessarily the high-performance library used by everyone. Second, as much as I try to work on the spec and not the code, I think the license on the reference implementation should change from the GPL to the LGPL or BSD. This will allow people to use the code in a dynamic library (*.DLL/*.so) and not have to make public the source of their entire application. (For the LGPL, they would have to make public their changes to the library.) Changing the license will require either getting the permission of the authors of the current reference code or starting a new reference implementation from scratch. As it is a reference implementation, and not necessarily a high-performance implementation, I'd suggest the LGPL. But I'm willing to leave it up to the person who invests the time to write it.

[Side note: http://news.slashdot.org/article.pl?sid=10/06/04/1953232 Looks like Google released VP8 with a BSD license plus a separate patent license. The patent license is voided if a company brings a patent suit against Google. Thus, if someone sues Google, Google is free to sue them back using the VP8 patents.]

The important question, in my opinion, is what use cases we should be aiming to support?
* Usenet transmission of large files
* Multi-disk backup redundancy (e.g., burn 4 CDs, burn a 5th with redundant data).
* ? redundancy for remote backups
* ? redundancy on a single-disk backup (Already done by DVDisaster; we'd like to support it, but there isn't an easy implementation yet. Is this better done by a filesystem?)
* ? redundancy for streaming (e.g., NASA transmissions)
* ? verification/redundancy for file distribution (stronger than just an MD5 check?)

Any others? Any arguments in favor of/against one of these? Any ideas on how to implement redundancy for a single-disk backup? Should we have a sister project for a user-level file system? (Sounds expensive with RS codes.)

Also, the Par2 spec included optional packets for containing input file slices. So, people would not have to use a file-splitter, like RAR, for Usenet or multi-disk backups. Do we want to push to make those packets mandatory?

Mike Nahas

On Sun, Jun 13, 2010 at 11:50 PM, Michael Niedermayer <michaelni@...> wrote:
> On Sun, Jun 13, 2010 at 11:18:21AM +0200, Kristian Trenskow wrote:
> [....]
> > I have no personal interest in the licensing choice of PAR3. I agree that
> > LGPL is sufficient, but I have been following the mailing list for some
> > years now, and I really believe that if you want to extend PAR's
> > popularity beyond Usenet, you should consider another license.
> >
> > As of today PAR2 is more or less a Usenet technology, but it's practical
> > in so many other applications. I think the ISO standardization process -
> > which was initiated for PAR2 - is a hugely great idea. I really believe
> > you - the Parchive team - should aim at it again, once the PAR3
> > specification is written, implemented and tested.
> >
> > I mean, how would you feel, if - as an example - NASA adopted it, and
> > suddenly your technology was orbiting the earth? NASA already uses a
> > modified Reed-Solomon algorithm for transferring data from space to
> > earth. The PAR3 technology would suit them well if they could freely
> > adopt it, and use it for data transfer. I know this is a far-fetched
> > example, but I do believe that governmental institutions like the
> > military would have far more confidence in using a technology like PAR3,
> > if it was standardized and open.
>
> NASA can internally use GPL software as they see fit; they would have
> no obligation to distribute source. Only if they distribute a binary to
> someone are they required to also make the source of the whole program
> available to her. Note, IANAL of course, so this is just my opinion.
>
> > And that's not GPL. GPL is bad for technology, because - as you said - it
> > forces people to redistribute any changes as source code. BSD is more
> > lightweight, and makes people able to apply the technologies to whatever
> > purpose they might have.
>
> btw, did you ever read the GPL?
>
> [...]
>
> --
> Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Concerning the gods, I have no means of knowing whether they exist or not
> or of what sort they may be, because of the obscurity of the subject, and
> the brevity of human life -- Protagoras
|
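The "double-click to verify" idea for file distribution boils down to streaming each file through MD5 and comparing against a published hash; a minimal sketch of the verifying side (the file name `demo.bin` is a placeholder):

```python
import hashlib

def md5_of_file(path: str, chunk: int = 1 << 16) -> str:
    """Stream a file through MD5 without loading it whole,
    as a verifying client would for large binaries."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()
```

A client could print this value next to the hash stored in its packets, so a user can compare both against the one published on the download page.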
|
From: Michael Niedermayer <michaelni@gm...> - 2010-06-14 03:54:23
|
On Sun, Jun 13, 2010 at 11:18:21AM +0200, Kristian Trenskow wrote:
[....]
> I have no personal interest in the licensing choice of PAR3. I agree that
> LGPL is sufficient, but I have been following the mailing list for some
> years now, and I really believe that if you want to extend PAR's
> popularity beyond Usenet, you should consider another license.
>
> As of today PAR2 is more or less a Usenet technology, but it's practical
> in so many other applications. I think the ISO standardization process -
> which was initiated for PAR2 - is a hugely great idea. I really believe
> you - the Parchive team - should aim at it again, once the PAR3
> specification is written, implemented and tested.
>
> I mean, how would you feel, if - as an example - NASA adopted it, and
> suddenly your technology was orbiting the earth? NASA already uses a
> modified Reed-Solomon algorithm for transferring data from space to earth.
> The PAR3 technology would suit them well if they could freely adopt it,
> and use it for data transfer. I know this is a far-fetched example, but I
> do believe that governmental institutions like the military would have far
> more confidence in using a technology like PAR3, if it was standardized
> and open.

NASA can internally use GPL software as they see fit; they would have no obligation to distribute source. Only if they distribute a binary to someone are they required to also make the source of the whole program available to her. Note, IANAL of course, so this is just my opinion.

> And that's not GPL. GPL is bad for technology, because - as you said - it
> forces people to redistribute any changes as source code. BSD is more
> lightweight, and makes people able to apply the technologies to whatever
> purpose they might have.

btw, did you ever read the GPL?

[...]

--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not or of what sort they may be, because of the obscurity of the subject, and the brevity of human life -- Protagoras
|
|
From: Michael Niedermayer <michaelni@gm...> - 2010-06-14 03:12:42
|
On Sun, Jun 13, 2010 at 11:18:21AM +0200, Kristian Trenskow wrote:
> Mike
>
> Personally I just like the BSD licenses a lot more than the GPL. I don't
> like being forced - not into a proprietary model, nor into an open-source
> one. As you say, the GPL forces people to open up any improvements, which
> I really think is bad.

If (and only if) you want to benefit from other people's work for free without giving back, then yes, it's bad. I don't think BSD is in the interest of the people who work on par3, but this is their choice. It's their work and their code, and they can choose to license it any way they want.

> To me the GPL is as bad as a proprietary license. If you want to go open
> standard - go BSD. Remember, the GPL was created at a time when there was
> hostility towards open software. Therefore there was a great risk of
> people just taking your work and putting it into proprietary software. If
> you wanted to create an open source community, open source had to be
> forced through.

You either have no clue, or worse. FOSS is being taken by companies, put into their proprietary software, and sold for profit without giving anything back today as much as in the past; the difference is that nowadays they don't give a damn whether they are allowed to or not. FFmpeg's bug tracker lists 140 cases where our license was or is violated. Were FFmpeg BSD, it would be many, many more companies doing this, and in that case we couldn't even stop them.

[...]

> As of today PAR2 is more or less a Usenet technology, but it's practical
> in so many other applications. I think the ISO standardization process -
> which was initiated for PAR2 - is a hugely great idea. I really believe
> you - the Parchive team - should aim at it again, once the PAR3
> specification is written, implemented and tested.
>
> I mean, how would you feel, if - as an example - NASA adopted it, and
> suddenly your technology was orbiting the earth? NASA already

NASA will not

> uses a modified Reed-Solomon algorithm for transferring data from space to
> earth. The PAR3 technology would suit them well if they could freely adopt
> it, and use it for data transfer. I know this is a far-fetched example,
> but I do believe that governmental institutions like the military would
> have far more confidence in using a technology like PAR3, if it was
> standardized and open.

You have absolutely no clue; the JPL people know FEC/RS much better than we do. Besides, none of the proposed algorithms for par3 is suitable for deep space communication on its own. Several decades ago, JPL/NASA used Viterbi + interleaved RS, which performs far better for their purpose than pure RS would have, which in turn would be better by another very large factor than the pure erasure correction par* does. Anyway, deep space communication is a quite specific thing and very different from the packet erasure handling that par* aims for.

> And that's not GPL. GPL is bad for technology, because - as you said - it
> forces people to redistribute any changes as source code. BSD is more
> lightweight, and makes people able to apply the technologies to whatever
> purpose they might have.
>
> There is a reason why Google releases their open source as BSD. It's for
> the greater good of humanity. Not because they want to benefit from the
> improvements made outside Google.

Google did not release VP8 under BSD, last time I checked. I need to recheck, though, what they released it under now, after their first attempt that was GPL-incompatible by mistake.

[...]

--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Incandescent light bulbs waste a lot of energy as heat, so the EU forbids them. Their replacement, compact fluorescent lamps, are much more expensive, don't fit in many old lamps, flicker, contain toxic mercury, produce a fraction of the light that is claimed, and do so in an unnatural spectrum, rendering colors differently than natural light. Ah, and we now need to turn the heaters up more in winter to compensate for the lower wasted heat. Who wins? Not the environment, that's for sure.
|
|
From: Kristian Trenskow <trenskow@me...> - 2010-06-13 09:18:47
|
Mike Personally I just like the BSD-licenses a lot more than the GPL. I don't like being forced. Not to a proprietary model nor to an open sourced. As you say, the GPL forces people to open up any improvements, which I really think is bad. To me the GPL is as bad as a proprietary license. If you want to go open standard - go BSD. Remember GPL was created at a time were there where hostility towards open software. Therefore there was a great risk of people just taking your work, and putting it into proprietary software. If you wanted to create an open source community, open source had to be forced through. BSD - on the other hand - encourages to the best in people. It doesn't tie anyone to any specific licensing model. If you want to improve/adapt the sources for your proprietary needs - feel free. If you want to improve and redistribute - feel free. I have no personal interest in the licensing choice of PAR3. I agree that LPGL is sufficient, but I have been following the mailinglist for some years now, and I really believe, that if you want to extend PARs popularity beyond usenet, you should consider another license. As of today PAR2 is more or less a Usenet technology, but it's practical in so many other applications. I think the ISO standardization process - which was initiated for PAR2 - is a hugely great idea. I really believe you - the Parchive team - should aim at it again, once the PAR3 specifications is written, implemented and tested. I mean, how would you feel, if - as an example - NASA adopted it, and suddenly your technology was orbiting the earth? NASA already uses a modified Reed-Solomon algorithm for transferring data from space to earth. The PAR3 technology would fit in their pants if they could freely adopt it, and use it for data transfer. I know this is a far fetched example, but I do believe that governmental institutions like the military, would have far more confidence in using a technology like PAR3, if it was standardized and open. 
And that's not GPL. The GPL is bad for technology because, as you said, it forces people to redistribute any changes as source code. BSD is more permissive and lets people apply the technology to whatever purpose they might have. There is a reason why Google releases its open source under BSD: it's for the greater good of humanity, not because they want to benefit from improvements made outside Google.

Hope I've drawn a clearer picture.

Regards,
Kristian

On 11/06/2010, at 19.48, Michael Nahas <mike.nahas@...> wrote:

> Re M. Niedermayer:
>
> TAR on Windows: Windows users have PkZip or other tools that can aggregate files and preserve permissions.
>
> 32-bit RS: You make good points. I think the question comes down to our goals with Par3 and how easy it is to support more slices. I think supporting more slices is better if we can do it well. Hopefully, the Bilski case will help us there by invalidating the LDPC software patents in the USA. (I know Europe doesn't support software patents. Don't know about Japan.) Tornado Codes can get by easily with an 8-bit Reed-Solomon plus XOR.
>
> Unicode: Your comments appear to have been truncated after "2010". (Weird!) Can you repeat them?
>
> Your libraries: I don't know what makes a standard get adopted by people. For all I know, Par2 was accepted just because of the name.
>
>
> Re J. Cea:
>
> DVDisaster: First of all, a great example of a use case. Yes, I agree with your (and DVDisaster's author's) point that TAR+PAR2 is not perfect for protecting data on a DVD. My question is, how easy is this to add to PAR? I don't know anything about ISO files. I've always thought of PAR as one of the UNIX pipe-able command-line tools, like "tar c file1.txt file2.txt | par c -- -- > foo.par2". Unless there is a good library for working with ISO files, or a trick like filling up an ISO with a dummy PAR file and then overwriting the file's contents with the PAR data, I think it is not a use case we should consider.
> (I don't like ignoring it, though; I use TAR+PAR on a DVD for my backups.)
>
>
> Re K. Trenskow:
>
> Licensing: You didn't make your reasoning clear.
>
> I work with the specifications, which I believe use a GNU document license.
>
> I think the best design for the reference code is an LGPLed library plus a GPLed command-line program that just processes arguments and calls the library. The LGPL is nice since it forces people to make public any improvements to the library without having to make their entire program public. And the LGPL/GPL only applies to the code; someone else can write their own code from scratch that implements the standard without having to release any of their code.
>
> If that isn't good enough for you, can you state explicitly what your problem is with the LGPL or GPL, and say why the BSD license is better? (BTW, if we change away from a GNU license now, we have to either get permission from every author or start the Par3 reference code from scratch.)
>
>
> Any more use cases or goals?
>
> Comments & criticism always welcome.
>
> Mike
>
>
> On Fri, Jun 11, 2010 at 12:39 PM, Kristian Trenskow <trenskow@...> wrote:
> Please... No matter what you do, please consider the licensing issues in the context of bringing in TAR. As far as I am aware, TAR is GPL'ed, so please do a recode in order to make it BSD-license compatible. It would be wise not to GPL the code, as PAR2 is. If you guys want to ISO-standardize PAR3, you probably should go with BSD-licensed reference code.
>
> // Kristian Trenskow
>
> On 11/06/2010, at 17.27, Jesus Cea <jcea@...> wrote:
>
> > On 10/06/10 02:34, Michael Nahas wrote:
> >> * Use with CD/DVD/? disk images. (Again, can this be solved with TAR?)
> >
> > I would like something like <http://dvdisaster.net/en/> standardized.
> > Compared with TAR, it protects all the blocks of the DVD, so you can recover the data even if the directory blocks are bad (something you cannot do with TAR+PAR2). Also, you are protecting media blocks, so content and structure (filesystem) are transparent.
> >
> > --
> > Jesus Cea Avion
> > jcea@... - http://www.jcea.es/
> > jabber / xmpp:jcea@...
> > "Things are not so easy"
> > "My name is Dump, Core Dump"
> > "El amor es poner tu felicidad en la felicidad de otro" - Leibniz ("Love is to place your happiness in the happiness of another")
> >
> > _______________________________________________
> > Parchive-devel mailing list
> > Parchive-devel@...
> > https://lists.sourceforge.net/lists/listinfo/parchive-devel
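[Editorial note: Mike's remark above, that Tornado Codes "can get by easily with an 8-bit Reed-Solomon plus XOR", rests on the simplest possible recovery slice: the XOR of all data slices, which can rebuild any one missing slice. The sketch below is an illustration only, not PAR2/PAR3 code; the slice contents and function names are made up.]

```python
# Slice-level parity via XOR: one recovery slice, one recoverable loss.
# All slices are assumed to be the same length (PAR pads to a fixed
# slice size for the same reason).

def make_parity(slices):
    """XOR all data slices together into a single recovery slice."""
    parity = bytearray(len(slices[0]))
    for s in slices:
        for i, b in enumerate(s):
            parity[i] ^= b
    return bytes(parity)

def recover(survivors, parity):
    """Rebuild the one missing slice by XORing the survivors into the parity."""
    rebuilt = bytearray(parity)
    for s in survivors:
        for i, b in enumerate(s):
            rebuilt[i] ^= b
    return bytes(rebuilt)
```

To tolerate more than one lost slice, a code needs independent recovery slices, which is where Reed-Solomon over a Galois field comes in; XOR is the degenerate single-slice case.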
From: Michael Niedermayer <michaelni@gm...> - 2010-06-13 07:12:59
On Fri, Jun 11, 2010 at 01:48:04PM -0400, Michael Nahas wrote:
> Re M. Niedermayer:
>
> TAR on Windows: Windows users have PkZip or other tools that can aggregate
> files and preserve permissions.

That would make Linux and Windows PAR3 incompatible.

> 32-bit RS: You make good points. I think the question comes down to our
> goals with Par3 and how easy it is to support more slices. I think
> supporting more slices is better if we can do it well. Hopefully, the [...]

The question remains whether more slices are needed or useful at all, or whether the usefulness of more slices arises out of PAR2's design for how files are distributed over slices (and PAR3 has the same issue), and whether a different design would not benefit in the same way from more slices. Also, if there is a remaining use for more slices, the cost of this larger number of slices has to be considered; 32-bit RS has its costs.

Also, RS codes allow error correction at the word level, not just at the slice level. A PAR3 spec should not use a code that makes it impossible to implement such word-level correction, even if the reference implementation does not support it. Word-level correction would allow recovering data that would otherwise be lost.

[...]

> Unicode: Your comments appear to have been truncated after "2010".
> (Weird!) Can you repeat them?

What I meant was: it is the year 2010, and UTF-8 is so old that I really think one can assume it to be supported.

[...]

--
Michael

GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Concerning the gods, I have no means of knowing whether they exist or not or of what sort they may be, because of the obscurity of the subject, and the brevity of human life -- Protagoras
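[Editorial note: the "word-level correction" Niedermayer describes is possible because Reed-Solomon operates on individual field symbols (words), so a decoder can in principle fix corrupted words inside a slice, not only replace whole missing slices. The sketch below shows only the underlying field arithmetic, in GF(2^8) with the common 0x11d reduction polynomial (used e.g. by QR codes); PAR2 itself works over GF(2^16) with a different polynomial, so this is an illustration, not PAR's field.]

```python
# Carry-less "Russian peasant" multiplication in GF(2^8).
# Addition in this field is plain XOR; multiplication is polynomial
# multiplication over GF(2) reduced modulo an irreducible polynomial.

def gf256_mul(a, b, poly=0x11d):
    """Multiply two GF(2^8) elements, reducing modulo `poly`."""
    r = 0
    while b:
        if b & 1:           # add (XOR) the current shift of `a`
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:       # degree-8 overflow: reduce modulo `poly`
            a ^= poly
    return r

def gf256_inv(a, poly=0x11d):
    """Brute-force multiplicative inverse (a sketch; real code uses log tables)."""
    for x in range(1, 256):
        if gf256_mul(a, x, poly) == 1:
            return x
    raise ZeroDivisionError("0 has no inverse in GF(2^8)")
```

Because every non-zero element is invertible, an RS decoder can solve a linear system for the value of each corrupted word individually, which is exactly what slice-only schemes give up.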