From: Lorenzo C. <lor...@em...> - 2003-10-11 15:32:45
Hi,

working with the Relationship Calculator, I noticed that, because of the way it's implemented, there might be a tendency to build up a set of rel_*.py files which will be difficult to maintain.

I fear that in the end there'll be tens of versions of the get_relationship() function, which will be difficult to manage if a bug is discovered. (Alex cloned apply_filter(), too, while I managed not to.)

My proposal is to let programmers write their own get_relationship() function if they really need it, but to offer another possibility as the first option. In practice, the first customizable functions are get_father(), get_uncle(), get_cousin(), and so on, which are passed the necessary variables so as to return an i18n string.

If a language defines a relationship which is not known in English (for example, Alex defined get_senior_male_cousin() among others), the definition may be caught by a special function called from within get_relationship() before any evaluation.

If Alex doesn't mind, I can try on my own to improve the mechanism.

The idea is summarized here:

  Relationship.py:

    def get_father
    def get_son
    def get_*

    def special_cases:
      return None

    def apply_filter

    def get_relationship:
      apply_filter()
      special_cases()
      if ...
        get_father
      elif ...
        get_son
      elif ...
        get_*

  plugins/rel_*.py

    def get_father
    def get_son
    def get_*

    def special_cases:
      return None

    def apply_filter       # Only if strictly necessary!!!

    def get_relationship:  # Only if strictly necessary!!!
      ...

    register_relcal(get_father, get_son, ...)

--
email: lor...@em...
Jabber: lo...@li...
Fingerprint: 8CDD 3408 53B2 6122 99DA EE37 1523 68FC D906 4C08
Would you like to help us get Debian package descriptions in Italian?
http://ddtp.debian.org/
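For readers who want something concrete, the outline in the message above could be sketched roughly as follows. This is only an illustration, not the actual Relationship.py: grouping the functions in a class, the `up`/`down` parameters, and the gender constants are all assumptions made for the example. Only the names get_father, get_son, special_cases and get_relationship come from the message.

```python
MALE, FEMALE = "M", "F"  # hypothetical gender constants


class RelationshipCalculator:
    """Hypothetical shared base; method names mirror the outline above."""

    def get_father(self):
        return "father"

    def get_mother(self):
        return "mother"

    def get_son(self):
        return "son"

    def get_daughter(self):
        return "daughter"

    def special_cases(self, up, down, gender):
        """Hook for relationships the shared logic does not name."""
        return None

    def get_relationship(self, up, down, gender):
        """Name B relative to A (assumed convention).

        up:   generations from A up to the common ancestor
        down: generations from the common ancestor down to B
        """
        special = self.special_cases(up, down, gender)
        if special is not None:
            return special
        if (up, down) == (1, 0):
            return self.get_father() if gender == MALE else self.get_mother()
        if (up, down) == (0, 1):
            return self.get_son() if gender == MALE else self.get_daughter()
        return None  # not covered by this sketch


class ItalianCalculator(RelationshipCalculator):
    """What a plugins/rel_it.py might override: strings plus one special case."""

    def get_father(self):
        return "padre"

    def get_mother(self):
        return "madre"

    def get_son(self):
        return "figlio"

    def get_daughter(self):
        return "figlia"

    def special_cases(self, up, down, gender):
        # Italian distinguishes male and female first cousins.
        if (up, down) == (2, 2):
            return "cugino" if gender == MALE else "cugina"
        return None
```

In this arrangement the branching in get_relationship() lives in one place, and a language file only replaces strings or traps extra cases.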
From: Alex R. <sh...@al...> - 2003-10-12 16:29:03
On Sat, Oct 11, 2003 at 02:53:59PM +0200, Lorenzo Cappelletti wrote:
> working with the Relationship Calculator, I noted that, because of the
> way it's implemented, there might be a tendency to build a set of
> rel_*.py files which will be difficult to maintain.
>
> I fear that in the end there'll be tens of get_relationship() function
> versions which are difficult to manage if a bug is discovered. (Alex
> cloned the apply_filter(), too, while I managed to not do it.)

I agree, I could have just imported it, as you did.

> My proposal is to let the programmer write their own
> get_relationship() function, if they really need it, but to offer
> another possibility as first option. In practice, the first
> customizable functions are get_father(), get_uncle(), get_cousin(), and
> so on, which are passed the necessary variables so as to return an i18n
> string.
>
> If a language defines a relationship which is not known to English (for
> example, Alex defined get_senior_male_cousin() among others), the
> definition may be caught by a special function called from within
> get_relationship() before any evaluation.

I see what the idea is. I'm just not sure whether it's worth the effort. Hypothetically, the differences between some language and English could be big enough that customizing English-based functions won't work, or will be far more complex than the existing implementation.

> The idea is summarized here:
>
>   Relationship.py:
>
>     def get_father
>     def get_son
>     def get_*
>
>     def special_cases:
>       return None
>
>     def apply_filter
>
>     def get_relationship:
>       apply_filter()
>       special_cases()
>       if ...
>         get_father
>       elif ...
>         get_son
>       elif ...
>         get_*
>
>   plugins/rel_*.py
>
>     def get_father
>     def get_son
>     def get_*
>
>     def special_cases:
>       return None
>
>     def apply_filter       # Only if strictly necessary!!!
>
>     def get_relationship:  # Only if strictly necessary!!!
>       ...
>
>     register_relcal(get_father, get_son, ...)

I see. So the attempt is to avoid copying get_relationship if the relationship structure is the same. However, the very need for the rel calc comes from the observation that the relationship structure is not the same. For example, you needed male and female cousins returning different strings in Italian, whereas in English there's no difference.

I definitely don't mind you trying it out, but honestly I'm skeptical that it will help for more than one language :-)

The way it is right now, you can copy an existing rel calc and then tweak it to comply with your language. It does involve some duplication, but it's not terribly inefficient in this case, because only one of the similar pieces gets executed, depending on the LANG.

As for a bug in one function being copied to all langs, the results will be the same if all langs use that same buggy function :-) The only difference would be in fixing -- one would have to go through all rel_*.py files and fix the same line. Hopefully, (1) it's not so hard and (2) we will catch most of them before there are too many langs (currently 2) involved :-)

Just MHO,
Alex

--
Alexander Roitman http://ebner.neuroscience.umn.edu/people/alex.html
Dept. of Neuroscience, Lions Research Building
2001 6th Street SE, Minneapolis, MN 55455
Tel (612) 625-7566 FAX (612) 626-9201
From: John S. <st...@lu...> - 2003-10-13 15:21:13
>>>>> "Alex" == Alex Roitman <sh...@al...> writes:

Alex> I see. So the attempt is to not copy get_relationship if the
Alex> relationship structure is the same. However, the very need for
Alex> the rel calc comes from the observation that the relationship
Alex> structure is not the same. For example, you needed male and
Alex> female cousins returning different strings in Italian, whereas
Alex> in English there's no difference.

So make the default version in English take the sex into account, with the returned string simply being the same in both cases (i.e. copied twice), while for Italian or other languages it comes out properly.

Alex> The way it is right now, you can copy an existing rel calc and
Alex> then tweak it to get in compliance with your language. It does
Alex> involve some duplication, but it's not terribly inefficient in
Alex> this case, because only one of the similar pieces gets executed,
Alex> depending on the LANG. As for the bug in one function being
Alex> copied to all langs, the results will be the same if all langs
Alex> use that same buggy function :-) The only difference would be in
Alex> fixing -- one would have to go through all rel_*.py files and
Alex> fix the same line. Hopefully, (1) it's not so hard and (2) we
Alex> will catch most of them before there's too many langs (currently
Alex> 2) involved :-)

The problem as I see it is that once a bug is found in one place, it's not going to get fixed in the other places, especially if you don't know the other language.

I agree with Lorenzo here: it's better to factor out the code into one set of functions, and to provide translation strings for individual languages. With a proper setup, we can provide a base function that will cover most languages, while still allowing a weird and funky language (such as aliens with three sexes ;-) to override it when needed.

Keeping as much code as possible in a common set of functions will make maintenance much simpler, and then the translators will just have to translate strings. So what if we have some redundancy in the English strings file? That's OK.

John
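The "redundant English strings" idea in the message above can be pictured with a small lookup table. This is a hedged sketch only: the `STRINGS` table, its key layout, and `relationship_string()` are hypothetical names invented for the illustration and are not part of the project.

```python
# Hypothetical string tables; keys are (relationship, gender).
STRINGS = {
    "en": {
        ("cousin", "M"): "cousin",
        ("cousin", "F"): "cousin",  # same word, deliberately duplicated
    },
    "it": {
        ("cousin", "M"): "cugino",
        ("cousin", "F"): "cugina",
    },
}


def relationship_string(lang, relationship, gender):
    """Look up the word for a relationship, falling back to English."""
    table = STRINGS.get(lang, STRINGS["en"])
    return table[(relationship, gender)]
```

The shared code always asks for a gendered key; English just happens to answer with the same word twice, which is the redundancy being accepted here.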
From: Alex R. <sh...@al...> - 2003-10-13 16:08:13
On 2003.10.13 10:15, John Stoffel wrote:
> Alex> I see. So the attempt is to not copy get_relationship if the
> Alex> relationship structure is the same. However, the very need for
> Alex> the rel calc comes from the observation that the relationship
> Alex> structure is not the same. For example, you needed male and
> Alex> female cousins returning different strings in Italian, whereas
> Alex> in English there's no difference.
>
> So make the default version in English take into account the sex, but
> the returned string is just the same (i.e. copied twice), while for
> Italian or other languages, it comes out properly.

John,

This has been discussed time and again in different contexts, but it seems to always come back :-) There is a tradeoff between "effectiveness/maintainability" and translation quality.

Take a look at the code of the FTM style reports. A very similar line gets repeated over and over with little changes (He was born on date in place and died on date in place). Prior to that we used to have strings built from pieces which would be conditionally glued to each other. This worked for English, but translated poorly into other languages. Eventually, we decided to use complete strings to have human-readable text in all languages, which means sacrificing some efficiency.

In a similar way, it is next to impossible to make an English rel calc compatible with all (or most) languages. I remember Lars saying that in Danish one can use strings like "father's brother's daughter's cousin" which simply don't exist in English.

Take another example: Russian distinguishes removed cousins (my mom's second cousin and I are not the same to each other: she's my "second aunt" and I'm her "second nephew"), while English calls us both second cousins once removed to each other. The list can go on and on. With more than a couple of languages, the task of maintaining a generic relationship calculator compatible with all the quirks of all languages becomes futile IMHO.

> The problem as I see it is that once a bug is found in one place,
> it's not going to get fixed in other places, especially if you don't
> know the other language.

You are correct: we sacrifice some efficiency in exchange for human readability here. I don't think the problem is that bad though. For sure, if the problem is relevant to more than one language then you can easily figure it out without knowing the language.

> I agree with Lorenzo here, it's better to factor out the code into
> one set of functions, and to provide translation strings for
> individual languages. With a proper setup, we can provide a base
> function that will cover most languages, while still having the
> ability of a weird and funky language (such as aliens with three
> sexes ;-) the ability to override when needed.

I am all for it when and where it can be done reasonably well. As for being able to provide a base function, this is exactly where I think the argument is wrong. If you manage to construct such a thing it would be good, but I doubt it is possible. If we could, we would just translate English relationships, but that's clumsy.

> Keeping as much code in a common set of functions will make
> maintenance much simpler, and then the translators will just have to
> translate strings. So what if we have some redundancy in the English
> strings file, that's ok.

Again, I agree with this in general, but I think one can program a function (it's only one function we're talking about) that doesn't obey this rule without it being a huge efficiency/maintainability problem.

I would really love to get more input from other translators. I know that when I was writing my rel calc, it was extremely helpful to just write Russian relations and wrap them in a similar form rather than figure out how to fit what I know into a generic prototype (and create that prototype as I go, btw :-).

Alex
From: Alex R. <sh...@al...> - 2003-10-13 16:26:47
>> Keeping as much code in a common set of functions will make
>> maintenance much simpler, and then the translators will just have
>> to translate strings. So what if we have some redundancy in the
>> English strings file, that's ok.
>
> Again, I agree with this in general, but I think one can program a
> function (it's only one function we're talking about) not obeying
> this rule and it would not be a huge efficiency/maintainability
> problem.

Actually, now that I've thought more about this :-), I would even argue that trying to make a generic function that "fits all" languages makes the code more complex and eventually more error-prone and less efficient. Of course it buys you the assurance that all 1001 bugs are in one file now, but it comes at the expense of a far more complex generic program.

Alex
From: Lorenzo C. <lor...@em...> - 2003-10-13 21:05:17
Alex Roitman <sh...@al...>, Mon 13 Oct 2003 11:07 -0500:
> languages. Eventually, we decided to use complete strings to have human
> readable text in all languages, which means sacrificing some
> efficiency.

The basic idea to bear in mind here is that the programmer should make concepts atomic, not words. To put it differently, concepts are the same for everyone; a language is just a tool to express the same concept.

> Take another example: Russian distinguishes removed cousins (my mom's
> second cousin and I are not the same to each other: she's my "second
> aunt" and I'm her "second nephew"), while English calls us both second
> cousins once removed to each other. The list can go on and on. With

But for both languages you have two individuals with a common ancestor and a number of families between them. The concept is the same, the words differ.

If you want to go further, father, mother, child, brother/sister, and uncle/aunt are concepts reasonably common to all cultures and may constitute the base of the shared function that John was talking about. All the others are just two related individuals with a common ancestor that need to be processed with a language-specific function. (Spouses represent an exception here.)

Again, the ancestor list should be built in only one place.
From: Alex R. <sh...@al...> - 2003-10-13 21:30:35
On 2003.10.13 15:49, Lorenzo Cappelletti wrote:
> But for both languages you have two individuals with a common
> ancestor and a number of families between them. The concept is the
> same, the words differ.

Sure, I understand that. The thing is that some relations have names in one language and not in another, and vice versa. In that regard, the concepts are not the same.

> If you want to go further, father, mother, child, brother/sister, and
> uncle/aunt are reasonably common concepts to all cultures and may
> constitute the base of the shared function that John was talking
> about.

Again, I agree in general that sharing the common concepts is good. The real benefits, though, will depend on how much the languages have in common. If the amount of overlap is not too big, it might not be worth the effort to bring everything into a shareable framework.

> All the others are just two related individuals with a common
> ancestor that need to be processed with a language-specific
> function. (spouses represents an exception here.)

Yes, but the choice of which relations have names and which don't is different in different langs. Also, even where there is a name, different langs have different algorithms (patterns) for naming the relatives you share a common ancestor with.

> Again, the ancestor list should be built in only one place.

Oh, I have no problem with that. I was only talking about the language-specific parts. I will correct my calculator in that regard, sorry :-)

Trying to summarize my view of this discussion: sharing is good, but it has to be justified. In our case, all that the rel calc has to provide is a function that returns the language-specific relation between two people. If that were not language-specific (and I don't mean different words, I mean different concepts, since some relations have names in a lang while others don't) then we would not be writing rel calcs in the first place. We would translate English relationships (and nobody was happy with that :-). Therefore, I believe that the rel calc can afford a little redundancy _if_ this allows us to create it easily without making the whole system overly complex. This does not mean that everybody should copy language-independent functions as I did :-)

As for my calculator, I can see how I can factor out the father/mother/sister/brother/niece/nephew business and place it into a generic tool. I can see no way to squeeze cousins into an English-based tool. I also think that doing so (if there's some tricky way) will make that generic tool overly complex, all to accommodate just one language. What will it look like when we add Danish and Japanese? Would it be worth the trouble to factor out some of my relations?

Maybe it is worth the trouble, I don't know, but I'm fairly happy with the way it is now (except for the lang-independent part I cloned :-)

Alex
From: Lorenzo C. <lor...@em...> - 2003-10-14 22:50:25
Alex Roitman <sh...@al...>, Mon 13 Oct 2003 16:30 -0500:
> As for my calculator, I can see how I can factor out father/mother/
> sister/brother/niece/nephew business and place it into generic tool. I
> can see no way to squeeze in cousins into english-based tool. I also
> think that doing so (if there's some tricky way) will make that generic
> tool overly complex, all to accommodate just one language. What will it
> look like when we'll add Danish and Japanese? Would it be worth trouble
> to factor out some of my relations?

OK, I think the thread has thrown some new light on how I'd like relcalc to be implemented.

No father/mother/sister/brother/niece/nephew functions. A language might use a different word for the father of a son or of a daughter. Then, each rel_??.py should provide one and only one function which returns a string expressing the relationship between individuals A and B.

The language programmer should be given all the pieces of information necessary to correctly determine the relationship: individual A (along with gender), individual B, A's level, B's level, and the common ancestor list.

A possible function prototype would be:

  getRelationshipString(Person.a, Person.b, integer.aLevel, integer.bLevel, list.commonAncestors)

Here Alex is right. There'll be some redundancy: languages which compute a relationship in a common way will have the same, cloned if-statement test. But that's the (small) price we have to pay for comprehensiveness. (Note that the spouse relationship is an exception to this function.)

Now that I think more about it, why a list of ancestors? If there's consanguinity between two people, shouldn't there be just one common ancestor? Mmhhh...

The fact is that an individual is born into a family, not to two individuals. Then, our function is:

  getRelationshipString(Person.a, Person.b, integer.aLevel, integer.bLevel, Family.commonFamily)

This way, the spouse relationship is no longer an exception. If aLevel=0 and bLevel=0, a is a spouse of b.
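As an illustration of the final prototype in the message above, an English version might look roughly like this. It is a sketch under stated assumptions: the `Person` and `Family` classes are stand-ins invented for the example (the real classes are not shown in the thread), and the level convention (the couple in commonFamily is at level 0, their children at level 1) is inferred from the remark that aLevel=0 and bLevel=0 means spouses.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Stand-in for the real Person class; only gender is used here."""
    name: str
    gender: str  # "M" or "F"


@dataclass
class Family:
    """Stand-in for the real Family class (the couple at level 0)."""
    father: Person
    mother: Person


def getRelationshipString(a, b, aLevel, bLevel, commonFamily):
    """Return an English string saying what b is to a.

    Assumed convention: a level counts generations below commonFamily,
    so the couple itself is at level 0 and their children at level 1.
    commonFamily is accepted to match the prototype but unused here.
    """
    male = b.gender == "M"
    if (aLevel, bLevel) == (0, 0):
        return "husband" if male else "wife"
    if (aLevel, bLevel) == (1, 0):
        return "father" if male else "mother"
    if (aLevel, bLevel) == (0, 1):
        return "son" if male else "daughter"
    if (aLevel, bLevel) == (1, 1):
        return "brother" if male else "sister"
    return "relative"  # everything else is left out of this sketch
```

A rel_??.py for another language would supply the same single function with its own tests on the levels and genders, which is where the cloned if-statements mentioned in the message would appear.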
From: Alex R. <sh...@al...> - 2003-10-15 20:54:59
On 2003.10.14 04:26, Lorenzo Cappelletti wrote:
> No father/mother/sister/brother/niece/nephew functions. A language
> might use a different word for the father of a son or a daughter.
> Then, each rel_??.py should provide one and only one function which
> returns a string expressing the relationship between individuals A
> and B.

Yay! Thank you, that was exactly the goal behind having language-dependent rel calc functions :-)

> Language programmer should be given all the necessary pieces of
> information as to correctly determine the relationship: individual A
> (along with gender), individual B, A's level, B's level, and common
> ancestor list.

I agree with that.

> A possible function prototype would be:
>
>   getRelationshipString(Person.a, Person.b, integer.aLevel,
>     integer.bLevel, list.commonAncestors)
>
> Here Alex is right. There'll be some redundancy: languages which
> compute a relationship in a common way will have the same, cloned
> if-statement test. But that's the (small) price we have to pay for
> comprehensiveness. (Note that spouse relationship is an exception to
> this function.)

Seems like a good idea too.

> Now that I think more about it, why a list of ancestors? If there's
> a consanguinity between two people, should there be just one common
> ancestor? Mmhhh...
>
> The fact is that an individual gets birth from a family, not from two
> individuals. Then, our function is:
>
>   getRelationshipString(Person.a, Person.b, integer.aLevel,
>     integer.bLevel, Family.commonFamily)
>
> This way, spouse relationship is no longer an exception. If aLevel=0
> and bLevel=0, a is a spouse of b.

Sounds very good to me. Any other opinions, anybody?

Lorenzo, if you'd like to implement this it would be great. If not, let me know and I'll have a go at this. Should not be too hard to do.

Alex
From: Lorenzo C. <lor...@em...> - 2003-10-16 21:35:12
Alex Roitman <sh...@al...>, Wed 15 Oct 2003 15:54 -0500:
> Lorenzo, if you'd like to implement this it would be great. If not, let
> me know and I'll have a go at this. Should not be too hard to do.

You can go ahead on only one condition: the source code should be well documented. ;-P

That is to say, the source code, if well documented, can help other programmers implement the function for their own language. For this purpose I also suggest writing a rel_en.py file: it's a kind of template for new rel_??.py files and helps keep the plugin structure consistent.

Hint: Python provides a library for locale support, locale.py. On my Debian installation, it's located in the /usr/lib/python2.[123]/ directory. Specifically, I suggest using getlocale() and normalize(): if normalize(getlocale()) == normalize(locale_from_rel_xx_py), then the getRelationshipString() function is loaded. locale_from_rel_xx_py is of the form "en" for English, "it" for Italian, and so on.

I've never tried it out, it's just an idea.
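The locale hint above could be tried along these lines. Two caveats: locale.getlocale() returns a (language code, encoding) tuple rather than a single string, so the sketch takes its first element, and comparing full normalized names would rarely match a bare "it", so only the language part is compared. `language_of()` and `plugin_matches()` are hypothetical helper names for this illustration, not part of the project.

```python
import locale


def language_of(locale_name):
    """Return the lowercase language part of a locale name.

    For example "it_IT.UTF-8" and "it" both give "it".
    """
    normalized = locale.normalize(locale_name)
    return normalized.split("_")[0].split(".")[0].lower()


def plugin_matches(plugin_lang, current_locale=None):
    """True if a rel_??.py declaring plugin_lang fits the current locale."""
    if current_locale is None:
        current_locale = locale.getlocale()[0]  # may be None if unset
    if not current_locale:
        return False
    return language_of(current_locale) == language_of(plugin_lang)
```

A loader could call plugin_matches("it") for rel_it.py and register that file's getRelationshipString() only when it returns True, falling back to the English version otherwise.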
From: Alex R. <sh...@al...> - 2003-10-22 00:57:32
On 2003.10.16 08:29, Lorenzo Cappelletti wrote:
> Alex Roitman <sh...@al...>, Wed 15 Oct 2003 15:54 -0500:
>
>> Lorenzo, if you'd like to implement this it would be great. If not,
>> let me know and I'll have a go at this. Should not be too hard to
>> do.
>
> You can go ahead on only one condition: the source code should be
> well documented. ;-P

Lorenzo,

I'm kinda tied up at work (preparing for the big conference), so it is unlikely that I'll be able to seriously work on it for about a month.

If you'd like to do it yourself then go ahead, it's great. Otherwise, I'll get on it around November 20.

Alex
From: Lorenzo C. <lor...@em...> - 2003-10-13 21:05:11
Alex Roitman <sh...@al...>, Sun 12 Oct 2003 11:29 -0500:
> I see what the idea is. I'm just not sure whether it's worth the effort.
> Hypothetically, the differences between some language and english could
> be big enough so that customization of english-based functions won't
> work or will be far more complex than the existing implementation.

That's why I think that get_relationship should be kept overridable. If one needs to customize the function down to its core, they should be able to do it.

> I see. So the attempt is to not copy get_relationship if the relationship
> structure is the same. However, the very need for the rel calc comes
> from the observation that the relationship structure is not the same.

If there are just a couple of special cases, the special_cases() function can trap them.

> For example, you needed male and female cousins returning different
> strings in Italian, whereas in English there's no difference.

Actually, my idea is not to implement two functions because of gender, but to let the programmer check the individual's gender and make the decision themselves.

> I definitely don't mind you trying it out, but honestly I'm skeptical
> that it will be help for more than one language :-)

OK, I hope I can send you all a patch in the next few days.

> The way it is right now, you can copy an existing rel calc and then
> tweak it to get in compliance with your language. It does involve some
> duplication, but it's not terribly inefficient in this case, because
> only one of the similar pieces gets executed, depending on the LANG.

The inefficiency is not in execution but in management. IMHO, a big step forward would be to split the building of the ancestor list from its processing. The former is common to all languages, while the latter may be customized (and partially copied).
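The split proposed above, with ancestor building in one shared place and naming left to each language, might be sketched as follows. Everything here is an assumption for illustration: people are plain strings, `parents` is a hypothetical dict mapping a person to their known parents, and neither function name comes from the project.

```python
def ancestor_levels(person, parents):
    """Map person and each of their ancestors to a generation distance."""
    levels = {person: 0}
    frontier = [person]
    while frontier:
        current = frontier.pop()
        for parent in parents.get(current, ()):
            distance = levels[current] + 1
            # Keep the shortest distance if an ancestor is reachable twice.
            if parent not in levels or distance < levels[parent]:
                levels[parent] = distance
                frontier.append(parent)
    return levels


def nearest_common_ancestor(a, b, parents):
    """Return (ancestor, a_level, b_level) for the closest shared ancestor.

    Returns None when a and b have no known ancestor in common.
    """
    a_levels = ancestor_levels(a, parents)
    b_levels = ancestor_levels(b, parents)
    common = set(a_levels) & set(b_levels)
    if not common:
        return None
    # Smallest combined distance wins; the name breaks ties predictably.
    best = min(common, key=lambda p: (a_levels[p] + b_levels[p], p))
    return best, a_levels[best], b_levels[best]
```

The language-independent part stops at the two levels; each language-specific function would take them from there and decide which word, if any, applies.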
From: Alex R. <sh...@al...> - 2003-10-14 02:39:16
On Mon, Oct 13, 2003 at 10:26:52PM +0200, Lorenzo Cappelletti wrote:
> > I definitely don't mind you trying it out, but honestly I'm skeptical
> > that it will be help for more than one language :-)
>
> Ok, I hope I can send you all a patch in the following few days.

Lorenzo,

I have checked in rel_ru.py, which now uses apply_filter from Relationship.py. Thanks for pointing that out.

As for the patch -- by all means, if there's an easy way to share portions of the written code then let's do it. I may have been too eager in saying that it would not work. I hope I did not offend anybody. I apologize if I did and I take it back :-)

Alex
From: Lorenzo C. <lor...@em...> - 2003-10-15 20:17:10
Alex Roitman <sh...@al...>, Mon 13 Oct 2003 21:39 -0500:
> in saying that it would not work. I hope I did not offend anybody.
> I apologize if I did and I take it back :-)

Not me at all! We were just debating ;-P