From: Kevin D. <ke...@do...> - 2009-04-08 10:56:54
On Monday 06 April 2009 09:46, Jimmy O'Regan wrote:
> 2009/4/6 Paramesh Krishna <cut...@gm...>:
> > Dear all,
> > I am preparing linguistic data for Tamil Morphological Analyser. I have
> > two doubts.. Kindly solve it.
> > 1. In Tamil we have four suffixes which is due to morpho phonemic
> > process. I dont need to analyse that one.
> > For example,
> > the words ends in "k,c, t, p" I dont need any analysis of these morphs.
> > But I need the analysis of the words which preceeds it.
>
> Your examples are of words that follow.

No - the last phoneme of the previous word alters to harmonise with the
first phoneme of the following word. My understanding was that he wants to
know how to handle this final consonant harmony.

> > e.g. 'mannEk' here I dont need 'k' to be analysed. It is due to the
> > morpho-phonemic process . In sentence it will be
> > "mannaEk katanwu" . if it is ' mannEw wavira'' here 'w' is added because
> > the following word starts in 'w' . This rule applicable for all
> > inflections in a paradigm. Is there any elements in the XML file to take
> > care of this?
>
> You could add an extra paradigm that contains the phonetic
> alterations; that will only allow analysis, not generation, which is a
> more difficult problem here. I doubt any suggestions on how to achieve
> this in apertium would be of much use to you.

The problem is that it applies across a word-boundary. It's actually
similar to Celtic mutation, but that has been handled within the word, even
though the trigger for it is also across a word-boundary. I assume you mean
something similar to that for the analysis side.

For English-->Tamil, I would try approaching it by having the ending
specified as -eX, and then adding a post-generation shaping rule that
changes X to the first letter of the following word.
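To make the post-generation idea concrete, here is a rough sketch in plain Python rather than in Apertium's own rule format. The placeholder character "X", the function name shape_final_consonant, and the choice to drop the placeholder when no word follows are all assumptions made for illustration. A real rule would probably also need to restrict which following letters trigger the copy.

```python
# Hypothetical placeholder emitted by the generator at the end of a word.
PLACEHOLDER = "X"


def shape_final_consonant(text: str, placeholder: str = PLACEHOLDER) -> str:
    """Replace a word-final placeholder with the first letter of the next word.

    Assumption: words are separated by whitespace, and a placeholder with
    no following word is simply dropped.
    """
    words = text.split()
    shaped = []
    for i, word in enumerate(words):
        if word.endswith(placeholder):
            stem = word[: -len(placeholder)]
            if i + 1 < len(words):
                # Copy the first letter of the following word.
                word = stem + words[i + 1][0]
            else:
                # No following word: nothing to harmonise with.
                word = stem
        shaped.append(word)
    return " ".join(shaped)


if __name__ == "__main__":
    print(shape_final_consonant("mannEX katanwu"))
    print(shape_final_consonant("mannEX wavira"))
```

With the examples from the original question, "mannEX katanwu" becomes "mannEk katanwu" and "mannEX wavira" becomes "mannEw wavira".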
--
Pob hwyl / Best wishes

Kevin Donnelly

www.cymraeg.org.uk - Welsh-English autotranslator
www.klebran.org.uk - Gwirydd gramadeg rhydd i'r Gymraeg
www.eurfa.org.uk - Geiriadur rhydd i'r Gymraeg