From: Tiago T. <tia...@te...> - 2003-09-14 00:56:32
|
> It seems at least one person is still on this list... :)

But, even after all this long time, there are still about a dozen of us subscribed to the list.

> Good news that some new stuff is finally going on with Traduki! I'll try
> to look at it when I get time (which won't be often as I'm taking six
> university classes). The language I'm most interested in right now is
> Japanese so it would be cool if I could get some kind of output module
> hacked together for that. Have to learn Lua first though. :P

Well, in my "let's wait some months for anything" message last year I tried to make it clear that I was not closing the project, but taking it through a thorough rewrite (I know many vapourware developers say exactly the same about their projects...). I hope to start committing some code to the CVS soon, but we will probably have to wait some weeks or even months to get something really usable (read 'usable' here as 'something that can translate very simple sentences with minor errors when there are no exceptions'), because there are still a few things to design and many, many to implement (not only in each language module, but in the "core" of Traduki).

Japanese will certainly be interesting (the homepage gets a lot of hits from Asia), but I have intentionally left features specific to Asian languages out of the InterLingvo for the time being: there is obviously room for them, and they will be included during the development of these languages, but I want to keep things simple for myself during the "foundation" phase.

The roadmap that I have in mind is:
- develop an Italian TranslateInto() (done for the first version)
- develop an Esperanto TranslateFrom() and all the common TranslateFrom() functions (I am working on this now)
- develop an English TranslateInto() (for announcements, of course!)
- finally start filling in other languages.

But this is of course subject to change! :) |
From: Justin K. <dop...@co...> - 2003-09-14 00:51:46
|
Tiago Tresoldi wrote: >(by the way, if some-one could do the favor to check into CVS a .tar.gz file >containing the sources that I would sent, I would be very grateful). > > Sure, send it my way. -- __ ____ ___ __ __ | | | \| | | Cast in the name of God...ye not guilty! | | __| | | _| ICQ: 76824935 __| |__ | | | /_ DopefishJustin@MailandNews.com is spambait | | | | | | | dopefish underscore justin at cox dot net |______________/___|__| http://gothmog.homeip.net:8000/ |
From: Justin K. <dop...@co...> - 2003-09-10 06:06:14
|
It seems at least one person is still on this list... :) Good news that some new stuff is finally going on with Traduki! I'll try to look at it when I get time (which won't be often as I'm taking six university classes). The language I'm most interested in right now is Japanese so it would be cool if I could get some kind of output module hacked together for that. Have to learn Lua first though. :P -Justin -- __ ____ ___ __ __ | | | \| | | Cast in the name of God...ye not guilty! | | __| | | _| ICQ: 76824935 __| |__ | | | /_ DopefishJustin@MailandNews.com is spambait | | | | | | | dopefish underscore justin at cox dot net |______________/___|__| http://gothmog.homeip.net:8000/ |
From: Tiago T. <tia...@te...> - 2003-09-07 16:31:07
|
Hello everyone (assuming that people are still listening to this list; otherwise this will just be a note stored in the list archive),

After a long time I have news to report regarding Traduki. I won't start telling sad stories to explain the long time without any news; I am sure it is quite easy to understand. But there is news in the project. It is certainly not easy to see, but I have been working extensively on Traduki in the last months. I have uploaded a new version, 0.3.0, which is really more of a 'see what I am planning to do' version; the NLPTK was released; and today I released a small application for my father, who teaches Italian, called "Verbi" (http://planeta.terra.com.br/educacao/tresoldi/verbi.html), which gives you conjugations for Italian verbs. Its code is obviously based on Traduki, although on an old version of it.

The development of Traduki is finally in the coding phase: I spent a really long time studying what needed to be implemented and how to implement it, and I am finally coding it, in C and Lua (well, mostly Lua actually). I can't and won't promise any release date, but I will probably start uploading some files to the CVS when I feel it is the right time (it shouldn't take too long) and when I finally manage to make my winmodem work in Linux... (by the way, if someone could do me the favor of checking into CVS a .tar.gz file containing the sources that I would send, I would be very grateful).

Two days ago I started writing (once more) an updated version of the Interlingvo (the 'spirit' and many parts are still exactly as in the previous one, don't worry!) and it should finally make Traduki somewhat usable, as the TranslateInto() (now just ti()) method of the Italian module is already working, and quite well for noun phrases (syntagmas); the verbs are still very raw and the end results for whole sentences are still very buggy.

The focus at this moment is on the neural network implementation (needed for the parsing; this time I will start with Esperanto) and on setting the project up as a 'real' project (not just a .tar.gz file, but a project with README files, a Makefile, etc.; almost all of this is ready now).

Sometimes I think that Traduki might really become something real, sometime, somewhere... :)

Hope to have more news soon,
Tiago |
From: Pedro M.V. <ma...@in...> - 2003-02-09 23:52:32
|
I suggest creating a program to translate the www.wikipedia.org front page into all the languages, preferably for when the user's primary language is not included in Wikipedia. Regards. |
From: Tiago T. <tia...@te...> - 2002-08-04 17:57:19
|
Hello people,

probably many of you are quite upset with all my delays and disappearances, but please read this message, which is quite definitive regarding Traduki.

I am still very interested in NLP and in MT, as all of us are. However, Traduki is still vapourware... The reason for this is always the same: time. Everyone in the world seems to be interested in Traduki-like software (I am still surprised that our website gets so many hits even more than a year after the last 'true', useless release), but as we know MT is not an easy task if you want to do it the right way, and it is even harder if you have a project with as many goals as Traduki.

I must admit that I have only worked about four hours on Traduki during the last two months. Apart from some private matters, the economic crisis here in Brazil (thanks to the Thatcherite neo-liberalism of our president and the IMF) is a real problem and, just to help me, I was fired last Friday due to this crisis. I really don't have much time to work on Traduki, because, as I've already mentioned, I am also trying to set up my own company in the HVAC/R field, a field in no way related to machine translation. This is the first note.

The second note is Traduki itself. Traduki still doesn't run, and even though it is programmed in a high-level language such as Python it is very complex. Far too complex if we plan to get help from people who have never programmed and maybe never used a computer. The solution here is really complex: even with a high-level representation of linguistic concepts it is still impossible to maintain backward compatibility with previous releases, and changing interlingvo so much (trying to achieve a utopian easy-to-understand version) has made it a strange collection of objects that feels completely artificial or, better, unnatural. I am the only one who understands it, and I still have to check what I wrote months ago to remember some parts of it, despite the long guide to interlingvo that I wrote.

The result of this is that Traduki should be almost rewritten from scratch, but in a way that is much more difficult for us (programmers), in order to get an easy-to-program framework (for the users), in ways that I don't even want to mention here because of their ambitiousness. However, I know that I don't and won't have time to do so in the foreseeable future, and it is not right to promise and promise, leaving people waiting for something that may never be released.

In other words, this message is to state that I am putting Traduki in a permanent state of stand-by. I will work on the changes that I think are needed, but won't make any public release until it is really usable (at least as usable as Linguaphile is nowadays), but all the changes will be on the web site. For those who want to work on machine translation, I suggest starting your own project: maybe it will be so interesting that I will join. Of course, feel free to ask for whatever you want and to use the code that has already been produced by Traduki; it is GPLed.

I know that this is not good news, but at least I will stop wasting your time. I will announce my new version when it gets released, in a year, five years, or ten years. For the time being that is all. And thank you for all the cooperation, ideas and code during this time.

Bye! |
From: <hip...@ya...> - 2002-07-19 01:21:23
|
This is just a note to let the Traduki people know that my similar project, Linguaphile has now finally got a web interface where you can play with it: http://linguaphile.sourceforge.net/cgi-bin/translator.pl I should have all the languages online by tonight but only Spanish is likely to be interesting since it has both dictionary and grammar rules. Most languages have skeleton implementations with almost empty dictionaries. Thanks for your time. Andrew Dunbar. ===== http://linguaphile.sourceforge.net http://www.abisource.com __________________________________________________ Do You Yahoo!? Everything you'll ever need on one web page from News and Sport to Email and Music Charts http://uk.my.yahoo.com |
From: Tiago T. <tia...@te...> - 2002-06-10 16:51:06
|
> Well, I just did it. A preliminary Klingon translateFrom module using
> the new Interlingvo has been committed. Right now it's a massive kludge
> combining the old Klingon module and the new Esperanto module; I'll
> clean things up as I figure out more about how the new module design
> works. It's mainly missing verb prefix handling now. It's also hard to
> test things due to the unfinished nature of the Esperanto input module
> and the complexity of the test sentence in the dummy English module.

More than dummy, actually... Good work, I like to see code that is not mine in the CVS :)

The good news is that interlingvo shouldn't change anymore; it should only get updated. Regarding the new module design, I can't say too much, because it was redesigned precisely to give more freedom to language module writers. I know nothing about Klingon and don't know what its needs are. Of course the Esperanto code will serve as a 'guide', but you will need to figure out by yourself how to make it work with Klingon. |
From: Tiago T. <tia...@te...> - 2002-06-10 16:51:06
|
On 9 Jun 2002 at 14:04, Justin the Almighty wrote:

> I was using 2.1.1. Python sure gets out of date fast! ;)

Actually it should have worked from inside 2.1.1 with the import future, or something like that.

> File "./il.py", line 84, in ToXML
> xml_source += self.klauxzo[i].ToXML()
> TypeError: unbound method ToXML() must be called with Klauxzo instance
> as first
> argument (got nothing instead)

I knew there would be bugs like that. As you can see in the code, the translation into XML is very simple, but there are some checks that should be performed to know whether a variable is a string or something else that has its own .ToXML() method. This does not matter too much for the time being; I need to finish a usable epo/tFrom.py to generate valid and 'real' interlingvo structures so that we can work on all the missing parts: translate to, xml, etc. |
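The kind of check described above (string versus object with its own .ToXML() method) could look roughly like the sketch below. It is written for current Python 3, not the Python 2.1/2.2 used in the thread, and it is not the code from il.py: the helper name `to_xml`, the `parts` attribute, and the `<klauxzo>` element name are assumptions made for illustration. The thread does not say what caused the traceback; the sketch assumes it was a class stored where an instance was expected.

```python
from xml.sax.saxutils import escape


def to_xml(value):
    """Serialize a string, a list/tuple, or an object exposing ToXML()."""
    if isinstance(value, str):
        # Plain strings are escaped and emitted as text
        return escape(value)
    if isinstance(value, (list, tuple)):
        # Containers are serialized item by item
        return "".join(to_xml(item) for item in value)
    if isinstance(value, type):
        # Assumption: a class stored instead of an instance is what
        # produced the "unbound method ToXML()" error in the traceback
        raise TypeError("expected an instance, got the class %s" % value.__name__)
    return value.ToXML()


class Klauxzo:
    """Hypothetical stand-in for the clause class named in the traceback."""

    def __init__(self, parts):
        self.parts = parts

    def ToXML(self):
        return "<klauxzo>%s</klauxzo>" % to_xml(self.parts)


if __name__ == "__main__":
    print(to_xml([Klauxzo(["La knabo", " mangxas la pomon"])]))
```

Routing every value through one dispatcher keeps the string-or-object decision in a single place instead of repeating it in each ToXML() method.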
From: Justin t. A. <dop...@ya...> - 2002-06-10 04:08:50
|
Well, I just did it. A preliminary Klingon translateTo module using the new Interlingvo has been committed. Right now it's a massive kludge combining the old Klingon module and the new Esperanto module; I'll clean things up as I figure out more about how the new module design works. It's mainly missing verb prefix handling now. It's also hard to test things due to the unfinished nature of the Esperanto input module and the complexity of the test sentence in the dummy English module. Anyway, enjoy! Qapla' -Justin ===== ~ __ ____ ___ __ __ | | | \| | | "If it ain't broke, you're not trying!"--Red Green | | __| | | _| __| |__ | | | /_ ICQ: 76824935 | | | | | | | dop...@ya... |______________/___|__| http://worshipjustin.tk/ Geek code: http://dopefishjustin.tripod.com/geekcode.txt __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com |
From: Justin t. A. <dop...@ya...> - 2002-06-10 04:08:37
|
Well, I just did it. A preliminary Klingon translateFrom module using the new Interlingvo has been committed. Right now it's a massive kludge combining the old Klingon module and the new Esperanto module; I'll clean things up as I figure out more about how the new module design works. It's mainly missing verb prefix handling now. It's also hard to test things due to the unfinished nature of the Esperanto input module and the complexity of the test sentence in the dummy English module. Anyway, enjoy! Qapla' -Justin ===== ~ __ ____ ___ __ __ | | | \| | | "If it ain't broke, you're not trying!"--Red Green | | __| | | _| __| |__ | | | /_ ICQ: 76824935 | | | | | | | dop...@ya... |______________/___|__| http://worshipjustin.tk/ Geek code: http://dopefishjustin.tripod.com/geekcode.txt __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com |
From: Justin t. A. <dop...@ya...> - 2002-06-09 21:04:22
|
--- Tiago Tresoldi <tia...@te...> wrote:
> You are using an old Python, probably 2.1. Get 2.2.1 or replace
> =file(filename with =open(filename

I was using 2.1.1. Python sure gets out of date fast! ;) Upgrading fixed that particular error, but now we move on to the next: ;)

Translating 'test-epo.txt' from Esperanto to Esperanto.
subject:['La'/'artikolo'@[0], 'knabo'/'nomo'@[1]]
verb:['mangxas'/'verbo'@[2]]
object:['la'/'artikolo'@[3], 'pomon'/'nomo'@[4]]
Traceback (most recent call last):
  File "./traduki.py", line 93, in ?
    main()
  File "./traduki.py", line 74, in main
    engine(filename, from_lang, to_lang)
  File "./engine.py", line 59, in engine
    o.write(translate(entity[i], from_lang, to_lang))
  File "./engine.py", line 80, in translate
    return to_lang.TranslateTo(from_lang.TranslateFrom(text))
  File "./epo/tFrom.py", line 117, in TranslateFrom
    il.il2XML(interlingvo, "test.xml")
  File "./il.py", line 821, in il2XML
    f.write(i.ToXML())
  File "./il.py", line 84, in ToXML
    xml_source += self.klauxzo[i].ToXML()
TypeError: unbound method ToXML() must be called with Klauxzo instance as first argument (got nothing instead) |
From: Tiago T. <tia...@te...> - 2002-06-09 18:04:40
|
On 8 Jun 2002 at 11:30, Justin the Almighty wrote:

> File "./il.py", line 818, in il2XML
> f = file(filename, 'w')
> NameError: global name 'file' is not defined

You are using an old Python, probably 2.1. Get 2.2.1 or replace =file(filename with =open(filename |
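A minimal sketch of the second workaround: `open()` exists in every Python version mentioned in this thread, whereas the `file` builtin only appeared in Python 2.2 (and was later removed in Python 3). The function name `write_text` and the sample content are illustrative assumptions, not Traduki code.

```python
import os
import tempfile


def write_text(filename, text):
    """Write text to filename using open(), the portable spelling."""
    # file(filename, 'w') raises NameError on Python 2.1; open() does not
    f = open(filename, "w")
    try:
        f.write(text)
    finally:
        # Close explicitly so the data is flushed even if write() fails
        f.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "test.xml")
    write_text(path, "<interlingvo/>")
    with open(path) as handle:
        print(handle.read())
```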
From: Justin t. A. <dop...@ya...> - 2002-06-08 18:30:53
|
OK, a change to general.py was committed that fixed that error. Now it progresses further, but still dies:

$ ./traduki.py --fromlang epo --tolang epo --file test-epo.txt
Translating 'test-epo.txt' from Esperanto to Esperanto.
subject:['La'/'artikolo'@[0], 'knabo'/'nomo'@[1]]
verb:['mangxas'/'verbo'@[2]]
object:['la'/'artikolo'@[3], 'pomon'/'nomo'@[4]]
Traceback (most recent call last):
  File "./traduki.py", line 93, in ?
    main()
  File "./traduki.py", line 74, in main
    engine(filename, from_lang, to_lang)
  File "./engine.py", line 59, in engine
    o.write(translate(entity[i], from_lang, to_lang))
  File "./engine.py", line 80, in translate
    return to_lang.TranslateTo(from_lang.TranslateFrom(text))
  File "./epo/tFrom.py", line 117, in TranslateFrom
    il.il2XML(interlingvo, "test.xml")
  File "./il.py", line 818, in il2XML
    f = file(filename, 'w')
NameError: global name 'file' is not defined |
From: Tiago T. <tia...@te...> - 2002-06-08 13:44:54
|
On 7 Jun 2002 at 17:21, Justin the Almighty wrote:

> NameError: global name 'endsIn' is not defined

Sorry, I forgot to upload/commit general.py. Already fixed. |
From: Justin t. A. <dop...@ya...> - 2002-06-08 00:21:09
|
Hmm, trying to test things out I get the following error:

$ ./traduki.py --fromlang epo --tolang epo --file test-epo.txt
Translating 'test-epo.txt' from Esperanto to Esperanto.
Traceback (most recent call last):
  File "./traduki.py", line 93, in ?
    main()
  File "./traduki.py", line 74, in main
    engine(filename, from_lang, to_lang)
  File "./engine.py", line 59, in engine
    o.write(translate(entity[i], from_lang, to_lang))
  File "./engine.py", line 80, in translate
    return to_lang.TranslateTo(from_lang.TranslateFrom(text))
  File "./epo/tFrom.py", line 45, in TranslateFrom
    tag_tokens = Tag(tokenizer.tokenize(text))
  File "./epo/tFrom.py", line 131, in Tag
    elif endsIn(token.type().lower(), [u"o", u"on", u"oj", u"ojn"]):
NameError: global name 'endsIn' is not defined |
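The actual endsIn from general.py is not shown anywhere in the thread, so the following is only a guess at what such a helper does, written for current Python 3. The noun suffixes come from the traceback above; the verb suffixes, the simplified `tag` function, and the fallback tag "nekonata" are assumptions added for illustration.

```python
def endsIn(word, suffixes):
    """Return True if word ends with any of the given suffixes."""
    # str.endswith accepts a tuple of alternatives
    return word.endswith(tuple(suffixes))


def tag(token):
    """Very rough Esperanto part-of-speech guess based on word endings."""
    lowered = token.lower()
    if lowered == "la":
        return "artikolo"
    if endsIn(lowered, ["o", "on", "oj", "ojn"]):
        return "nomo"
    if endsIn(lowered, ["as", "is", "os", "us"]):
        # Assumed verb endings; not taken from the Traduki sources
        return "verbo"
    return "nekonata"  # hypothetical "unknown" tag


if __name__ == "__main__":
    for word in "La knabo mangxas la pomon".split():
        print(word, tag(word))
```

Run on the test sentence from the thread, this reproduces the artikolo/nomo/verbo tags visible in the later debug output, although the real tagger almost certainly handled more cases.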
From: Tiago T. <tia...@te...> - 2002-06-07 22:11:59
|
As the old Nike slogan goes: yet some more new code in the CVS. You need the NLTK (http://nltk.sf.net/) toolkit in the /traduki/nltk directory to make it work, even if it is not very intensively used yet. I keep working here, no more studying, just testing and developing. The new code makes it possible to get a raw XML version of an Interlingvo structure, but it is probably buggy here and there, and the semantic cases headache is still to be solved (but I am working on it). There is an example of the 2XML translation at: http://traduki.sf.net/il.xml |
From: Tiago T. <tia...@te...> - 2002-06-02 21:22:02
|
Once more I'm back, no excuses; I would just like you to point your browsers or CVS repositories to the new code in the Traduki CVS. Yes, I took a lot of things out that I will start adding back soon, but the great news is:
- The project is approaching the form I plan. Check the test code in epo.py to see.
- New interlingvo: almost complete now; the only thing that really needs some work is cases.
- New Guide. What you were wanting. A long description of the Traduki Interlingvo that should make some points clear. |
From: Tiago T. <tia...@te...> - 2002-04-22 05:04:28
|
Yet more on the IDE and cases, as they are the topics of the week on the list.

Some little but important changes to the IDE: now you only navigate with the arrow keys, somewhat like an internet browser: left for backwards, right for forward. Up and down change the field to be edited, and the values can be changed with '+' and '-'. Now there is also support for Unknown and Initial Value (please note that the support for Unknown is a trick, but easy to understand), since for debugging we have to test all the possibilities.

Regarding cases, I started to implement them in interlingvo today. A little part is done, but they need to be clearly differentiated from each other, not only for the interlingvo implementation but also for building those trees for selecting the right case (not binary anymore, but still trees). I started to write a list with examples, but please give your own.

abessive - indicating absence or lack: "She is singing without music."
ablative - indicating direction from or time when: ?
absolutive - indicating subject or object of intransitive verb: ?
accusative - indicating direct object of a verb: ?
adessive - indicating place where or proximity to: "Bob lives at the new street." "Bob lives near the new street."
agentive - indicating agent performing action: "The lecture was given by Alice."
allative - indicating movement towards: "I am going to his house."
associative - indicating association with or accompaniment by: ?
benefactive - indicating for whom or which: "She did it for me."
causative - indicating causation by: "It was done by us."
comitative - indicating accompaniment: ?
dative - indicating indirect object of a verb: ?
delative - indicating motion downward: ?
elative - indicating movement out of or away from: "He escaped from the jail."
equative - indicating likeness or identity: ?
ergative - indicating subject of a transitive verb: ?
essive - indicating a temporary state of being: ?
factive - indicating causation: ?
genitive - indicating possession, origin or relation: my/your/his/her/its/our/your/their
illative - indicating movement into or toward: "She is going inside the house." "She is going to her house."
inessive - indicating location within: ?
instructive - indicating means whereby: ?
instrumental - indicating means by which: "Alice is writing with the pencil."
lative - indicating motion up to or as far as: ?
locative - indicating location or place where: "Bob lives in Europe."
nominative - indicating subject of a verb: (normal English nouns)
partitive - indicating a part of a larger whole: ?
perlative - indicating movement through or across: "Don't go by that road."
predicative - indicating the predicate: ?
privative - indicating absence, deprivation or negation: "We can write without a computer."
prolative - indicating motion alongside or by: ?
relative - indicating relation or a prepositional object: ?
similitive - indicating similarity to: "Bob acts like a kid."
subessive - indicating location under or below: ?
sublative - indicating movement towards the top of: ?
superessive - indicating location upon or on top of: ?
terminative - indicating motion up to or time until: ?
translative - indicating process of change or movement through: ?
vocative - indicating calling or personal address: ?

Still working. :)

Tiago |
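As a rough illustration of why a selection step is needed, the English examples in the list above can be turned into a lookup from preposition to candidate semantic cases. This is only a sketch under stated assumptions: the table `CANDIDATE_CASES` and both function names are hypothetical, the entries are limited to the examples given in this message, and nothing here reflects how interlingvo actually stores cases.

```python
# Candidate semantic cases per English preposition, taken only from the
# example sentences in the list above (hypothetical data structure).
CANDIDATE_CASES = {
    "without": ["abessive", "privative"],
    "at": ["adessive"],
    "near": ["adessive"],
    "by": ["agentive", "causative", "perlative"],
    "to": ["allative", "illative"],
    "for": ["benefactive"],
    "from": ["elative"],
    "inside": ["illative"],
    "with": ["instrumental"],
    "in": ["locative"],
    "like": ["similitive"],
}


def candidate_cases(preposition):
    """Return the semantic cases a preposition may signal (empty if unknown)."""
    return CANDIDATE_CASES.get(preposition.lower(), [])


def is_ambiguous(preposition):
    """True when the preposition alone cannot decide the semantic case."""
    return len(candidate_cases(preposition)) > 1


if __name__ == "__main__":
    for prep in ("by", "from", "without"):
        print(prep, candidate_cases(prep), is_ambiguous(prep))
```

Prepositions such as "by" map to several cases even in this small sample, which is where a decision tree (or any other disambiguation step using the surrounding words) would have to take over.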
From: Tiago T. <tia...@te...> - 2002-04-21 21:46:06
|
I changed some little things, now it should be easier to use and understand it. Navigate with the cursor keys, change with + and -, ENTER is forward, ESC is back. |
From: Tiago T. <tia...@te...> - 2002-04-21 20:09:27
|
Along with some little Interlingvo changes, I've uploaded the first version of the interlingvo ide. It is still useless, don't bother downloading if you want to create Interlingvos. I just wrote the function to display a window and accept values, it can be list of strings, integers, strings, etc. so it shouldn't take long to make it useful (translation: make it produce valid interlingvos). PS: Justin I've updated instead of checked out as you warned me. Thank you again. Tiago |
From: Tiago T. <tia...@te...> - 2002-04-21 18:06:42
|
Andrew,

> Quite a few already exist on a site similar to Wiki:
> http://www.everything2.com - do a search for a
> language name. But there's no common format of
> course. I can't really work on this stuff myself just
> yet since I don't have my own computer, internet
> connection, job, or income ):

My income is also small; I'm afraid we'll have to give up the Internet at home, and I'll only be able to use it from those slooow university terminals with Windows.

> I always think of 'my' as a possessive adjective sine
> it's primary function is to qualify the noun that
> comes after it. It's secondary function is to
> provide info on who the possessor is. 'Yours' is a
> possessive pronoun as it is not followed by a noun
> but refers indirectly to two things: primarily the
> object which it directly replaces, secondarily the
> person (or persons) who possess that object.

I also consider 'my' a possessive adjective. What I said is that we grammatically use a genitive case of the pronoun for the possessive role. 'Yours' (like 'mine' and so on) is just another kind of genitive.

> And all grammars seem to interpret these differently.

That's the cool thing about languages: DIY. :)

> I also say 'I was' because saying 'I were' would sound
> affected or snobby in my world (:

<grin> When I wrongly said 'I was' my teacher was always complaining. Good to know that... :)

> > Yes. Interlingvo has to work in semantic case,
> > because grammatical cases are different from
> > language to language. The
> > problem is that we get the semantic cases through
> > grammatic ways...
>
> Well we have to determine it through a combination of
> both grammatical ways and vocabulary. "with" on its
> own doesn't tell me whether I hit a hammer with a nail
> (instrumental) or walked down the street with a
> friend (forget the name of this case).

Associative or maybe comitative.

[Propositions]
> This sounds very close to what we call "clauses" in
> English.

Perhaps they are the same... The important thing to note is that a proposition is abstract while a sentence is grammatical.

> I guess so but I'm still thinking about mine and not
> implementing it because I want to get to the
> universals first (: That's why I want to know more
> about non-western-european languages.

And it's a shame how little content is available on the Internet.

> But as you know there are two directions MT can take.
> One is language pairs, the other is via a universal
> representation. Major online translators all use the
> former because it's easier to get right. Traduki and
> Linguaphile both use the latter because we're more
> ambitious (:

Maybe it is not only ambition, but different goals. I decided to write Traduki because I wanted free speech, because I wanted to be an active part of the open source movement, and also because I wanted to learn more and more about languages, probably like you. Major online translators just want to give you a fast overview of a text in a different language; they really don't bother about the differences between associative and comitative cases. Money plays an important role in these systems: Systran, for example, was written as fast as possible so that SystranSoft could sell it to the American military. And they keep using that old code... I know it would have been impossible in the '60s to do something like Traduki and Linguaphile, as even today they are/will be hungry for computer resources, but they follow the old saying 'if it runs, it is ok'. That's also why open source is important: we can't see their code, and I suppose that there are so many dirty tricks in there that it would be at least funny trying to understand it.

Traduki and Linguaphile are much more difficult to implement at first (you just have to think that I've been working on Interlingvo since 1997), but once we get something really usable (and if we keep the development going this way, it shouldn't take long for Traduki) writing the language modules will be easier and the output will be better (hopefully, I know :). A company could not wait more than five years for an R&D group to release something usable...

> Actually I want Linguaphile to use a hybrid so that
> bilingual language implementors can create
> "partial specializations" like OO languages like C++
> can do, where we can take a shortcut from language A
> to language B when we know they are closely related.
> This is on a feature by feature basis, not for the
> whole language.

This is tempting, but I don't want this in Traduki. Not only because it would make the language modules difficult to understand and maintain, but also because it is a bit lazy imho: if the Interlingvo is well written, the shortcuts naturally fit inside it. If they don't, then we have to fix Interlingvo. If that is impossible, Interlingvo was a hoax. :)

> > And yes, programming language are also ambigous
> > sometimes. Ever heard of Python? :)
>
> If you think Python is ambiguos, don't even think of
> looking at Perl! (But I love it)

I took a look at Perl once. It is really cryptic, but the P-scripting-languages give us, along with ambiguousness (is that right?), a freedom that is all we need for MT. With P* we can do whatever we want (or almost :).

> > As far as I know, armenian and georgian are the only
> > languages that have all the 40-cases and each one is
> > a different suffix.
>
> Of course they both also have their own alphabets so
> putting them in text files is very tricky. I hope
> to visit these countries on a future trip.

But where do you live? I would love to visit different countries to learn more about languages and cultures, too. Maybe we'll do that after the first Traduki&Linguaphile conference in 2010. :)

Tiago |
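The trade-off between language pairs and a universal representation, discussed in the message above, can be made concrete with a little arithmetic. The sketch assumes one translator per ordered language pair in the first design and one TranslateFrom plus one TranslateTo module per language in the second; these counting assumptions and both function names are illustrative and are not stated in the thread.

```python
def pairwise_translators(n_languages):
    """Directed translators needed when every language pair is handled directly."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages):
    """Modules needed when every language only talks to the interlingua."""
    # One "from" module and one "to" module per language
    return 2 * n_languages


if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(n, pairwise_translators(n), interlingua_modules(n))
```

Under these assumptions the pairwise design is cheaper for two languages and equal at three, while from four languages on the interlingua design needs fewer modules, which matches the observation that the universal approach is harder at first but pays off as languages are added.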
From: <hip...@ya...> - 2002-04-21 04:51:42
|
--- Tiago Tresoldi <tia...@te...> wrote: > > I think you misunderstood me. The sketch grammars > are > > for the humans to read, not the machine. It is > much > > easier to write a sketch grammar than a > translation > > module. People can refer to the sketch grammar to > see > > how the language is supposed to work and use this > as > > an > > aid when working on the translation module. Being > in > > XML would be merely to keep them consistent so > that > > when you've read one, the others have the same > kind of > > information in the same places. > > I misunderstood you. But your idea is really good... > What about implementing it online as a wiki or > something like that? > Than a script (python, even ifyou prefer perl :) > would get the text to insert as a part of the common > Traduki and Linguiphile > distributions. Quite a few already exist on a site similar to Wiki: http://www.everything2.com - do a search for a language name. But there's no common format of course. I can't really work on this stuff myself just yet since I don't have my own computer, internet connection, job, or income ): > > The genitive never felt like a case in the same > sense > > as the other cases to me. Especially in English > but > > maybe this is why we call it the possessive > instead. > > In my opinion, possessive is a pronoun that acts > like an adjective, but the pronoun is not in its > nominative (like 'I' and 'you) > but in its genitive case (like 'my' and 'yours'). > Yes, it is not easy to program something that is > subject to interpretations... :) I always think of 'my' as a possessive adjective sine it's primary function is to qualify the noun that comes after it. It's secondary function is to provide info on who the possessor is. 'Yours' is a possessive pronoun as it is not followed by a noun but refers indirectly to two things: primarily the object which it directly replaces, secondarily the person (or persons) who possess that object. 
And all grammars seem to interpret these differently. > > English actually does have a subjunctive case. > People > > just don't realize this because it shares the same > > morphology as a past tense: > > Correct English: I wish I were rich. > > Common English: I wish I was rich. > > Illegal English: I wish I am rich. > > After verbs of wishing and others, we don't use > the > > nominative form of "to be" at all. Most speakers > use > > the 3rd person singular past, correct English uses > the > > 3rd person plural past. Native speakers rarely > > realize they do this. Only very new learners of > > English would use the nominitive "am" and I don't > > think I've ever heard it. > > You're right, I used to say 'I was' when I was > learning (well, I still am :) english. At least for > portuguese speakers it is not so > hard, as subjunctive in both languages are quite > similar. I also say 'I was' because saying 'I were' would sound affected or snobby in my world (: > > I think you're stuck in the area of "semantic > cases" > > versus "grammatical cases". In our thoughts or > > Yes. Interlingvo has to work in semantic case, > because grammatical cases are different from > language to language. The > problem is that we get the semantic cases through > grammatic ways... Well we have to determine it through a combination of both grammatical ways and vocabulary. "with" on its own doesn't tell me whether I hit a hammer with a nail (instrumental) or walked down the street with a friend (forget the name of this case). > > "propositions" as I understand you call them, we > have > > Proposition is not a term of mine, but a term really > common in italian grammatics that I think is used in > almost all > languages. A proposition is the semantic smaller > representation like 'the boy eats an apple' (notice > no punctuation). A > sentence, the smaller grammatic representation, is > usually (but not always) built of a single > proposition like 'The boy eats > the apple.' 
> but can be made of various propositions
> like 'The boy eats the apple and the girl sings.'.
> Sometimes propositions are independent ('parataxis' -
> from Greek), but sometimes a proposition has to be
> considered dependent or complement of a parataxis -
> when we have a hypotaxis like 'the girl eats an orange'
> in the sentence 'The boy eats the apple and the girl
> an orange.'

This sounds very close to what we call "clauses" in English.

> > these semantic cases, no matter which language we
> > speak. Such semantic cases map into prepositions,
> > postpositions, cases, etc, in complex ways. This is
> > why I want language logic in the core of Linguaphile.
>
> Yes I know, when I refer to prepositions it is just
> because they are what we see in languages like
> English, just to be short. But isn't your language
> logic a sort of interlingvo too?

I guess so but I'm still thinking about mine and not
implementing it because I want to get to the universals
first (: That's why I want to know more about
non-western-European languages. But as you know there are
two directions MT can take. One is language pairs, the other
is via a universal representation. Major online translators
all use the former because it's easier to get right. Traduki
and Linguaphile both use the latter because we're more
ambitious (: Actually I want Linguaphile to use a hybrid so
that bilingual language implementors can create "partial
specializations", as OO languages like C++ allow, where we
can take a shortcut from language A to language B when we
know they are closely related. This is on a feature by
feature basis, not for the whole language.

> > I want the core code to handle the mapping in the
> > large scale with the language modules just telling the
> > core which of their grammatical cases/prepositions/etc
> > map onto which semantic cases. All languages are
> > ambiguous here - even Esperanto.
> > Languages use the
>
> That is a problem, but there is no way to solve it -
> ambiguity is something intrinsic to humans. But
> Esperanto is the most regular I have found.
> And yes, programming languages are also ambiguous
> sometimes. Ever heard of Python? :)

If you think Python is ambiguous, don't even think of looking
at Perl! (But I love it)

> > same word or case ending for multiple semantic cases
> > and represent the same semantic case with multiple
> > words in different situations. This is the part that
> > I want the language modules to deal with.
> > Oh - I hear Armenian is the least ambiguous language
> > by the way but Linguaphile doesn't support it yet (:
>
> As far as I know, Armenian and Georgian are the only
> languages that have all the 40 cases and each one is
> a different suffix.

Of course they both also have their own alphabets so putting
them in text files is very tricky. I hope to visit these
countries on a future trip.

> > That's exactly what I'm talking about whenever I say
> > "mappings". I plan to do a similar thing for tenses,
> > number, gender/noun classes, and all other language
> > features that don't have perfect mapping between
> > languages.
>
> Of course we need to 'map' it for tenses, numbers,
> etc. and that's why there are already so many
> variables in Interlingvo.
> The good thing about the interlingvo is that, if a
> certain 'language feature' is missing in both source
> and destination languages, there's nothing to care about.
> I just want to finish this Interlingvo IDE soon so I
> can start really testing Interlingvo itself with
> brute-force methods. In theory it works perfectly, but...
>
> > > Still working.
> >
> > Yep lots of work (: Andrew Dunbar.
>
> It is a very good hobby, isn't it?

I love it.

Andrew.

=====
http://linguaphile.sourceforge.net http://www.abisource.com
|
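The hybrid described in the message above (a universal representation by default, with feature-by-feature shortcuts between closely related languages) could be sketched roughly as follows. This is only an illustration under stated assumptions: the tables `TO_PIVOT`, `FROM_PIVOT` and `SHORTCUTS`, the function `translate_feature`, and the toy Portuguese/Italian gender labels are all hypothetical and are not taken from the Linguaphile or Traduki sources.

```python
# Hypothetical sketch of a hybrid translator: every feature normally goes
# through a shared pivot (interlingua), but a language pair may register
# a direct shortcut for an individual feature.

# Per-language mappings into the pivot, one table per feature (toy data).
TO_PIVOT = {
    "pt": {"gender": {"masculino": "MASC", "feminino": "FEM"}},
    "it": {"gender": {"maschile": "MASC", "femminile": "FEM"}},
}

# Mappings out of the pivot, built by inverting the tables above.
FROM_PIVOT = {
    lang: {
        feature: {pivot: value for value, pivot in table.items()}
        for feature, table in features.items()
    }
    for lang, features in TO_PIVOT.items()
}

# Direct shortcuts, keyed by (source, target, feature).
SHORTCUTS = {
    ("pt", "it", "gender"): {"masculino": "maschile", "feminino": "femminile"},
}


def translate_feature(src, tgt, feature, value):
    """Translate one feature value, preferring a pair-specific shortcut."""
    shortcut = SHORTCUTS.get((src, tgt, feature))
    if shortcut is not None:
        return shortcut[value], "shortcut"
    # No shortcut registered for this pair and feature: go via the pivot.
    pivot_value = TO_PIVOT[src][feature][value]
    return FROM_PIVOT[tgt][feature][pivot_value], "pivot"


print(translate_feature("pt", "it", "gender", "feminino"))
print(translate_feature("it", "pt", "gender", "maschile"))
```

The first call takes the registered Portuguese-to-Italian shortcut; the second has no shortcut for the reverse direction and falls back to the pivot, which is the "feature by feature, not for the whole language" behaviour the message asks for.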
From: Tiago T. <tia...@te...> - 2002-04-21 04:19:29
|
> I think you misunderstood me. The sketch grammars are
> for the humans to read, not the machine. It is much
> easier to write a sketch grammar than a translation
> module. People can refer to the sketch grammar to see
> how the language is supposed to work and use this as
> an aid when working on the translation module. Being in
> XML would be merely to keep them consistent so that
> when you've read one, the others have the same kind of
> information in the same places.

I misunderstood you. But your idea is really good...
What about implementing it online as a wiki or something
like that?
Then a script (Python, even if you prefer Perl :) would get
the text to insert as a part of the common Traduki and
Linguaphile distributions.

> The genitive never felt like a case in the same sense
> as the other cases to me. Especially in English but
> maybe this is why we call it the possessive instead.

In my opinion, possessive is a pronoun that acts like an
adjective, but the pronoun is not in its nominative (like
'I' and 'you') but in its genitive case (like 'my' and
'yours').
Yes, it is not easy to program something that is subject to
interpretations... :)

> English actually does have a subjunctive mood. People
> just don't realize this because it shares the same
> morphology as a past tense:
>   Correct English: I wish I were rich.
>   Common English: I wish I was rich.
>   Illegal English: I wish I am rich.
> After verbs of wishing and others, we don't use the
> indicative form of "to be" at all. Most speakers use
> the 3rd person singular past, correct English uses the
> 3rd person plural past. Native speakers rarely
> realize they do this. Only very new learners of
> English would use the indicative "am" and I don't
> think I've ever heard it.

You're right, I used to say 'I was' when I was learning
(well, I still am :) English. At least for Portuguese
speakers it is not so hard, as the subjunctive in both
languages is quite similar.
> I think you're stuck in the area of "semantic cases"
> versus "grammatical cases". In our thoughts or

Yes. Interlingvo has to work in semantic cases, because
grammatical cases are different from language to language.
The problem is that we get the semantic cases through
grammatical ways...

> "propositions" as I understand you call them, we have

Proposition is not a term of mine, but a term really common
in Italian grammar that I think is used in almost all
languages. A proposition is the smallest semantic
representation, like 'the boy eats an apple' (notice no
punctuation). A sentence, the smallest grammatical
representation, is usually (but not always) built of a
single proposition like 'The boy eats the apple.' but can be
made of various propositions like 'The boy eats the apple
and the girl sings.'. Sometimes propositions are independent
('parataxis' - from Greek), but sometimes a proposition has
to be considered dependent or complement of a parataxis -
when we have a hypotaxis like 'the girl eats an orange' in
the sentence 'The boy eats the apple and the girl an
orange.'

> these semantic cases, no matter which language we
> speak. Such semantic cases map into prepositions,
> postpositions, cases, etc, in complex ways. This is
> why I want language logic in the core of Linguaphile.

Yes I know, when I refer to prepositions it is just because
they are what we see in languages like English, just to be
short. But isn't your language logic a sort of interlingvo
too?

> I want the core code to handle the mapping in the
> large scale with the language modules just telling the
> core which of their grammatical cases/prepositions/etc
> map onto which semantic cases. All languages are
> ambiguous here - even Esperanto. Languages use the

That is a problem, but there is no way to solve it -
ambiguity is something intrinsic to humans. But Esperanto is
the most regular I have found.
And yes, programming languages are also ambiguous sometimes.
Ever heard of Python? :)

> same word or case ending for multiple semantic cases
> and represent the same semantic case with multiple
> words in different situations. This is the part that
> I want the language modules to deal with.
> Oh - I hear Armenian is the least ambiguous language
> by the way but Linguaphile doesn't support it yet (:

As far as I know, Armenian and Georgian are the only
languages that have all the 40 cases and each one is a
different suffix.

> That's exactly what I'm talking about whenever I say
> "mappings". I plan to do a similar thing for tenses,
> number, gender/noun classes, and all other language
> features that don't have perfect mapping between
> languages.

Of course we need to 'map' it for tenses, numbers, etc. and
that's why there are already so many variables in
Interlingvo.
The good thing about the interlingvo is that, if a certain
'language feature' is missing in both source and destination
languages, there's nothing to care about.
I just want to finish this Interlingvo IDE soon so I can
start really testing Interlingvo itself with brute-force
methods. In theory it works perfectly, but...

> > Still working.
>
> Yep lots of work (: Andrew Dunbar.

It is a very good hobby, isn't it?

Tiago
|
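The remark above, that a language feature missing in both the source and the destination language needs no handling, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function `plan_features`, the four action labels, and the rough feature inventories are hypothetical and do not describe the actual Interlingvo variables.

```python
# Hypothetical sketch: decide what to do with each interlingua feature,
# depending on which of the two languages expresses it.

def plan_features(source_features, target_features, all_features):
    """Classify every feature for one source/target language pair."""
    plan = {}
    for feature in sorted(all_features):
        in_source = feature in source_features
        in_target = feature in target_features
        if in_source and in_target:
            plan[feature] = "map"      # carry the value across
        elif in_source:
            plan[feature] = "drop"     # the target cannot express it
        elif in_target:
            plan[feature] = "infer"    # the target needs a value the source lacks
        else:
            plan[feature] = "ignore"   # missing in both: nothing to care about
    return plan


# Rough, illustrative inventories only.
ALL_FEATURES = {"tense", "number", "gender", "definiteness", "evidentiality"}
ESPERANTO = {"tense", "number", "definiteness"}
ITALIAN = {"tense", "number", "gender", "definiteness"}

for feature, action in plan_features(ESPERANTO, ITALIAN, ALL_FEATURES).items():
    print(f"{feature}: {action}")
```

With these toy inventories, "evidentiality" is classified as "ignore" because neither language marks it, while "gender" is classified as "infer" because only the destination language needs it.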
From: <hip...@ya...> - 2002-04-21 03:17:42
|
--- Tiago Tresoldi <tia...@te...> wrote:
> > Since I've been doing Linguaphile, one thing I've been
> > wanting is "sketch grammars" for all the languages.
> > It would be really cool if we could include in one of
> > our projects, or a new SourceForge project for everyone
> > to use, sketch grammars in a standardized form. Be it
> > XML or plain text. Just so everything is in the same
> > order for each one. In Linguaphile I have lines at
> > the beginning of each language module listing the cases
> > for the language as I've been able to find them in
> > books or on the internet. Feel free to look at the
> > Linguaphile CVS code (the archives are very old) and
> > see if these lists of cases or anything else is useful
> > for you. Or comment on the mistakes I have made (:
>
> I liked your definition 'sketch grammars'. :)
> Having those in an external file is not so easy for
> Traduki however. Each language will have something
> similar to what you mentioned but in the parsing side,
> i.e., the translateFrom() function. Maybe some languages will

I think you misunderstood me. The sketch grammars are for
the humans to read, not the machine. It is much easier to
write a sketch grammar than a translation module. People can
refer to the sketch grammar to see how the language is
supposed to work and use this as an aid when working on the
translation module. Being in XML would be merely to keep
them consistent so that when you've read one, the others
have the same kind of information in the same places.

> reuse this code, this is up
> to the language and its maintainer.
> Regarding XML, I really like it as a standard, but
> it is really too slow, at least with my
> implementation. I know that our main
> goal is quality and not speed, I know that we can
> implement something like a cache during the
> translation, but it is still very slow.
> Regarding the cases, I am planning to do the
> following: in the propositions, the object and the
> subject syntagma will become lists of object and
> subject syntagmas, where each one has the information
> for the cases and even the emphasis value for each one
> (so that it can translate 'At New York John has arrived'
> for the question 'Where has John arrived?'
> instead of 'John arrived at New York').
> The problem of the cases is still to be solved. Many
> linguists say that languages like English have only
> the nominative case (and of course the genitive),
> but I don't agree.

The genitive never felt like a case in the same sense as the
other cases to me. Especially in English but maybe this is
why we call it the possessive instead.

English actually does have a subjunctive mood. People just
don't realize this because it shares the same morphology as
a past tense:
  Correct English: I wish I were rich.
  Common English: I wish I was rich.
  Illegal English: I wish I am rich.
After verbs of wishing and others, we don't use the
indicative form of "to be" at all. Most speakers use the 3rd
person singular past, correct English uses the 3rd person
plural past. Native speakers rarely realize they do this.
Only very new learners of English would use the indicative
"am" and I don't think I've ever heard it.

> They have the case as word suffixes, but those that
> aren't suffixes are expressed by other words -
> generally our enemies, the prepositions. What I mean
> is that in 'I am going to Paris', Paris is
> nominative as a single word, but the 'to'
> preposition makes it an allative case of the verb
> go with the sense of 'physically changing location'.
> The problem is that there is not too much similarity
> among languages regarding cases and in some of
> them like Georgian all are really cases, in some we

I think you're stuck in the area of "semantic cases" versus
"grammatical cases".
In our thoughts or "propositions" as I understand you call
them, we have these semantic cases, no matter which language
we speak. Such semantic cases map into prepositions,
postpositions, cases, etc, in complex ways. This is why I
want language logic in the core of Linguaphile. I want the
core code to handle the mapping in the large scale with the
language modules just telling the core which of their
grammatical cases/prepositions/etc map onto which semantic
cases. All languages are ambiguous here - even Esperanto.
Languages use the same word or case ending for multiple
semantic cases and represent the same semantic case with
multiple words in different situations. This is the part
that I want the language modules to deal with.

Oh - I hear Armenian is the least ambiguous language by the
way but Linguaphile doesn't support it yet (:

> only have prepositions and so on. The main
> difference regards the transitivity of verbs. I still
> am not really sure about how to solve this problem,
> but I was thinking about implementing binary
> trees with a relationship of 'proximity sense' among
> all the cases, so that if a case is missing in a
> language we can get another appropriate one, and if
> the case is different, information attached to each
> verb sense in each language vortaro could help in
> finding the right case and thus the right
> suffix and/or preposition.

That's exactly what I'm talking about whenever I say
"mappings". I plan to do a similar thing for tenses, number,
gender/noun classes, and all other language features that
don't have perfect mapping between languages.

> Still working.

Yep lots of work (:

Andrew Dunbar.

=====
http://linguaphile.sourceforge.net http://www.abisource.com
|
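The "proximity" idea quoted above (when a case is missing in a language, pick another appropriate one) could be sketched as a walk up a tree of semantic cases. This is a simplified illustration, not the design of either project: the tree in `CASE_PARENT`, the marker table `ENGLISH_MARKERS` and the function `nearest_marker` are hypothetical, the tree is stored as child-to-parent links, and only ancestors are considered as fallbacks (sibling cases are not).

```python
# Hypothetical sketch: fall back to the nearest available case by walking
# up a tree of semantic cases. The tree and the marker table are invented.

# child -> parent; the root has parent None. Each node has at most two
# children, so this is a binary tree stored as parent links.
CASE_PARENT = {
    "relation": None,
    "location": "relation",
    "allative": "location",         # motion towards a place
    "locative": "location",         # static position
    "accompaniment": "relation",
    "instrumental": "accompaniment",
    "comitative": "accompaniment",
}

# What a language module might declare: semantic case -> marker it uses.
ENGLISH_MARKERS = {
    "allative": "to",
    "locative": "at",
    "accompaniment": "with",
}


def nearest_marker(case, markers):
    """Return (case_used, marker) for the closest ancestor the language marks."""
    current = case
    while current is not None:
        if current in markers:
            return current, markers[current]
        current = CASE_PARENT[current]
    raise LookupError(f"no marker available for {case!r}")


print(nearest_marker("allative", ENGLISH_MARKERS))
print(nearest_marker("instrumental", ENGLISH_MARKERS))
```

In this toy table English marks the allative directly with "to", while the instrumental has no entry of its own and falls back to its parent, which is marked with "with". A fuller version would also need the per-verb information mentioned in the quoted text to choose between candidates.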