In the last couple of years there have been a number of small revisions to the wording of <desc> or <gloss>, and additions of new elements, which means that the translations are slightly out of sync. It is time to return to the translators and give them a (hopefully small) list of things to check.
I'm changing this to amber because it's not clear to me that we know how to ensure this gets done, nor even how to determine which bits need re-translating.
We can use svn log and grep to find any <desc> or <gloss> elements that have changed since the date of each translation in the file. That shouldn't be hard at all, surely?
There is a script at Utilities-1/CheckI18N.sh which was my attempt at this a few years ago. It needs a rethink, but the job is doable. For every gloss and desc, extract today's version and compare it with the version as of the date in @version. Separately, find all the <desc> and <gloss> which have no translations at all (i.e. which are new).
There's a good day's work here to refine the scripting, to get to the stage where one can offer a report to translators of what needs doing.
I'll take that on, if someone can tell me the start date (i.e. the last date when we know the translations were all up to date).
There never was a time when everything was up to date, but I don't think you need to know. I have in mind a simple:
for each //gloss|desc
extract just that element to fileA
revert the containing file back to the date in @version
extract just that element to fileB
diff fileA fileB
done
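For illustration, the loop above could be sketched in Python roughly as follows. This is only a sketch under assumptions: it compares two snapshots of a spec file passed in as strings, and fetching the older snapshot from the repository at the translation's date is left out. The function names `english_texts` and `changed_since` are made up for this example, and it assumes that a <desc> or <gloss> without xml:lang is English.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def english_texts(node, path=(), found=None):
    """Collect English <desc>/<gloss> text, keyed by the @ident values above it."""
    found = {} if found is None else found
    if node.get("ident"):
        path = path + (node.get("ident"),)
    for child in node:
        name = child.tag.split("}")[-1]  # drop the namespace part of the tag
        if name in ("desc", "gloss"):
            # Assumption: a missing xml:lang means the text is English.
            if child.get(XML_LANG, "en") == "en":
                # Normalise whitespace so re-wrapped lines do not count as changes.
                found[path + (name,)] = " ".join("".join(child.itertext()).split())
        else:
            english_texts(child, path, found)
    return found


def changed_since(old_xml, new_xml):
    """Return {key: (old_text, new_text)} for English text that is new or differs."""
    old = english_texts(ET.fromstring(old_xml))
    new = english_texts(ET.fromstring(new_xml))
    return {key: (old.get(key), text)
            for key, text in new.items() if old.get(key) != text}
```

An entry whose old text is `None` is one that did not exist in the older snapshot, which covers the "new elements" case as well as the "changed wording" case.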
That makes sense. Which languages do we do this for? All of them? Different files seem to have different collections of languages.
I agree that this needs to be done in a consistent and scriptable way. Specifically, we should be able to output information about when different language versions were updated for any desc on the website. (It might be good to be able to display, on any particular output form, both when the English element was changed and when this particular language version was last updated.)
Council 2012 f2f in Ann Arbor decides: Rahtz to implement and kick off ongoing internationalisation updates. Also to check FM1 to see if the translators' list is up to date.
Providing translations for the <desc>s of new elements should be given priority, it seems to me. Can we generate a list of those easily? Maybe we should try crowd-sourcing this problem?
This is a list of all elements which have no translations at all (meaning no <desc>s with @versionDate):
calendar
classRef
constraint
constraintSpec
dim
elementRef
gb
licence
line
listChange
listForest
listTranspose
macroRef
metamark
mod
notatedMusic
pc
precision
redo
retrace
scriptDesc
scriptNote
sourceDoc
spGrp
styleDefDecl
substJoin
summary
surfaceGrp
transpose
undo
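A list like the one above could be regenerated with something along these lines. It is a minimal sketch, assuming each spec is available as an XML string in the TEI namespace and using the same test as stated above (no <desc> carrying @versionDate anywhere in the spec). The function name `untranslated` is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def untranslated(spec_xmls):
    """Return the @ident of each spec with no <desc> carrying @versionDate."""
    missing = []
    for text in spec_xmls:
        spec = ET.fromstring(text)
        # Look at every <desc> in the spec, including those on attributes.
        dated = [d for d in spec.iter(TEI + "desc") if d.get("versionDate")]
        if not dated:
            missing.append(spec.get("ident"))
    return sorted(missing)
```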
A discussion on the Council list has brought this ticket alive again. We now have a plan to generate spreadsheets for Google, one for each language, with (per Sebastian):
column 1: element/class name
column 2: attribute name (where appropriate)
column 3: <desc>
column 4: translation (if any)
column 5: <gloss>
column 6: translation (if any)
column 7: <remarks>
column 8: translation (if any)
column 9: reap statement (optional)
and then ask would-be translators to edit the spreadsheets. From those, we selectively cull newly-supplied or edited translations and put them back into the specs.
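As a rough illustration of that layout, the sketch below writes one CSV row per element and per attribute for a given language. It is not the catalogue stylesheet, just a Python approximation under assumptions: specs arrive as XML strings, text without xml:lang counts as English, and column 9 is left empty. The names `pick`, `row` and `sheet` and the header labels are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
HEADER = ["name", "attribute", "desc", "desc translation", "gloss",
          "gloss translation", "remarks", "remarks translation", "status"]


def pick(node, tag, lang):
    """Text of the first direct child <tag> in the given language, or ''."""
    for child in node.findall("tei:" + tag, NS):
        if child.get(XML_LANG, "en") == lang:
            return " ".join("".join(child.itertext()).split())
    return ""


def row(name, attribute, node, lang):
    """Build the nine cells for one element or attribute."""
    cells = [name, attribute]
    for tag in ("desc", "gloss", "remarks"):
        cells += [pick(node, tag, "en"), pick(node, tag, lang)]
    return cells + [""]  # column 9 is left empty in this sketch


def sheet(spec_xmls, lang):
    """Return CSV text with a header row, then one row per spec and attribute."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(HEADER)
    for text in spec_xmls:
        spec = ET.fromstring(text)
        writer.writerow(row(spec.get("ident"), "", spec, lang))
        for att in spec.iter("{%s}attDef" % NS["tei"]):
            writer.writerow(row(spec.get("ident"), att.get("ident"), att, lang))
    return out.getvalue()
```

Empty translation cells come out as empty strings, so they are easy to spot or colour once the file is imported into a spreadsheet.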
This looks good, but
(a) what on earth is a "reap statement"?
(b) should we record whether or not the spec includes at least one example in the target language too?
I assume the "reap statement" is the indicator that says whether the translation is out of date (in other words, the English source has changed subsequent to the versionDate on the translation).
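If that reading is right, the indicator comes down to a date comparison. The sketch below assumes both dates are available as ISO strings (YYYY-MM-DD): the @versionDate on the translation, and the date the English source last changed, which would have to come from the repository history. The name `is_stale` is made up for the example.

```python
from datetime import date


def is_stale(version_date, english_changed):
    """True if the English text changed after the translation's versionDate."""
    return date.fromisoformat(version_date) < date.fromisoformat(english_changed)
```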
proof of concept at https://docs.google.com/spreadsheet/ccc?key=0AhciBT9b4XaZdFo4Mk1FMEFlSHNUUkxuZmFaR0xqenc#gid=0
Whether it's better as a table rather than a spreadsheet, I am not sure.
I haven't done colouring of cells which are empty, which is important,
or even considered the issue of how to indicate those which may have changed.
See script at P5/Utilities/catalogue-fori18n.xsl
On 22 Oct 2013, at 17:10, Lou Burnard <louburnard@users.sf.net> wrote:
<resp>
I'd wonder if that was a rather different area of expertise.
Sebastian Rahtz
It doesn't take much expertise to see whether there is one or not.
Supplying it if there isn't is a different question, of course.
On 22/10/13 17:58, Sebastian Rahtz wrote:
Related Bugs: #312
It doesn't take any expertise at all to see whether there is an example for language X; we can do that as part of the transformation to the table.
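For what it's worth, that check could look something like this. It is a sketch only, assuming that examples are <exemplum> elements in the TEI namespace carrying xml:lang, and that an <exemplum> without xml:lang does not count for any particular language. The name `has_example` is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def has_example(spec_xml, lang):
    """True if the spec contains at least one <exemplum> tagged with lang."""
    spec = ET.fromstring(spec_xml)
    return any(ex.get(XML_LANG) == lang for ex in spec.iter(TEI + "exemplum"))
```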
proof of concept spreadsheet at https://docs.google.com/spreadsheet/ccc?key=0AhciBT9b4XaZdFo4Mk1FMEFlSHNUUkxuZmFaR0xqenc&usp=drive_web#gid=0
It's possible it would be better as a table. I don't know which format people find easier to edit.
Might it be possible to have a form that uses this as a data source and so presents a rendered view of this and allows them to update a textbox with the translation?
In what way is the Google spreadsheet not providing exactly that functionality?
Emailed original French team leader 2013-11-18; no response. Wrote to AC and FC re Italian 2013-12-20. Wrote to MB re Chinese 2013-12-20.
Last edit: Martin Holmes 2013-12-20