From: Ed A. <ed...@me...> - 2003-10-26 19:24:28
After Oscar Carlsson suggested doing so for movies, I am looking at the
output of tv_grab_sn and trying to move all information currently in
bracketed strings into its proper place.  For example in

    <title>Kongen av Queens (t) (19)</title>
    <desc>(King of Queens) blah blah blah</desc>

there should be a second <title> element giving the bracketed text
'King of Queens', which should then disappear from <desc>.  Ideally
one title should have lang="no" and one lang="en", but the grabber
isn't that clever.

So here's a list of all the bracketed things I found in one day of
listings, and I wonder if you could tell me what they mean and how
they should be handled.  Some are Norwegian, some Swedish, some are
other languages or none.  This list is partly a todo list for me.

- Strange numbers on channel 67.dagenstv.com, eg

    <title>Rebel TV (13)</title>
    <title>Concrete Wave (11)</title>

  Not every programme on that channel has such a number but most do.

- Bracketed title at the start of a description I can handle.

- Another number at end of title, this time channel 33:

    <title>On the Set (39)</title>

  and similar examples.  Is this an episode number?

- What appears to be a time:

    <programme start="20031026093000 CET" showview="5178607" channel="21.dagenstv.com">
      <title>I Mumindalen (2:10)</title>
      <desc>Japansk tegnefilmserie etter Tove Janssons bøker.</desc>
    </programme>

  So can I use '2:10' to calculate the stop time?  Ah hang on, it
  seems to be an episode number, see below.

- (t), does this mean teletext subtitles?  Eg in

    <title>En i havet (t) (3:8)</title>

- (3:8), see above example.  Third episode of eight?

- (ttv) as in

    <title>TV-universitetet Campus - litteraturvetenskap (ttv)</title>

- What does all this mean?

    <programme start="20031027030500 CET" showview="88490805" channel="17.dagenstv.com">
      <title>Baywatch - Die Rettungsschwimmer von Malibu (163:243)</title>
      <desc>Amerikansk dramaserie. Der gigantische Zitteraal.
      (Eel Nino.)</desc>
    </programme>

  It's strange enough that the title and description are in different
  languages, but what is (Eel Nino.)?  Hmm, I think it's just part of
  the episode title in some silly way.

- (k), as in

    <programme start="20031026171500 CET" showview="4014572" channel="103.dagenstv.com">
      <title>Håndbold: Danmark-Serbien og Montenegro (k)</title>
      <desc>Landskamp (k).</desc>
    </programme>

- (7. Ar), where 'A' is a-with-ring.  Seen in

    <desc>(The Amazing Panda Adventure) Amerikansk familiefilm fra
    1995. Medvirkende:Stephen Lang, Yi Ding, Ryan Slater.
    Regi:Christopher Cain. (7. Ar)</desc>

- When the title of a programme contains (R), I assume this is some
  trademark nonsense?

    <title>Gillettes sportsverden</title>
    <desc>(Gillette World Sport (R)) Ukemagasin med de heteste
    sportshendelsene fra hele verden. Sendt fredag.</desc>

  This also happens for (Champions League Weekly (R)).  Hmm, maybe it
  means 'repeat' in English.

- Don't know what this means, perhaps it is a malformed episode
  number:

    <title>Forfulgt i Hollywood (:13)</title>

- Some programmes seem to have the original title at the beginning
  _and_ end of the description:

    <desc>(The world is not enough) Brittisk/amerikansk action från
    1999. När Agent 007 får i uppdrag att skydda en het
    oljearvtagerska, kastas han in i ett passionerat och sprakande
    äventyr. Han ställs mot en av sina värsta motståndare: Renard, en
    hänsynslös high-tech terrorist. I rollerna: Pierce Brosnan, Sophie
    Marceau, Robert Carlyle. Regi: Michael Apted. (The world is not
    enough)</desc>

  I suppose I should handle this too.

- Sometimes the episode title is bracketed, rather than the programme
  title:

    <desc>CatDog. Dogs Monstertruck/Der Bienenstock (Monster Truck
    Folly/CatDog's Gold). Serie. Zeichentrick, USA</desc>

  Sometimes both original title and original episode title appear:

    <title>Titus</title>
    <desc>(Titus). 35/III. Im Sau-Stall (Amy's Birthday). 54-teilige
    Sitcom, USA 2000.</desc>

- (sv) means 'in Swedish'?

    <programme start="20031026175000 CET" showview="464268" channel="29.dagenstv.com">
      <title>Strömsö (sv)</title>
      <desc>Fritidsmagasin om bl a mat, trädgård, hobby, verkstad och
      båtliv.</desc>
    </programme>

  We also see

    <programme start="20031026151000 CET" showview="95373881" channel="2.dagenstv.com">
      <title>Eastenders omnibus</title>
      <desc>Episode 459. (sv) A round-up of the week's events from
      Albert Square.</desc>
    </programme>

  Maybe it doesn't mean Swedish after all.

- (9.) could be another episode number?

    <programme start="20031027040500 CET" channel="50.dagenstv.com">
      <title>Dosjei X (9.)</title>
    </programme>

- A number appearing after a capitalized episode title seems to be an
  episode number:

    <title>Tio Willy</title>
    <desc>EL INVITADO SORPRESA (12) blah blah</desc>

- Initial bracketed number must be year:

    <title>Quince</title>
    <desc>(1998) 82'48'' INTERPRETE/S: JAVIER ALBALA, ZOE BERRIATUA,
    BEATRIZ RICO blah blah</desc>

- Bracketed letters after someone's name are not something the grabber
  should process, I assume:

    <desc>Debat mellem folketingsmedlemmerne Svend Auken, Pia
    Christ-mas-Møller (K), Niels Helveg Petersen og Poul Nødgaard
    (DF).</desc>

- (R) in a description definitely means 'repeat':

    <desc>My mum's a doctor. The multi-award-winning children's
    programme, starring Tinky Winky, Dipsy, Laa-Laa and Po, and their
    friend Noo Noo the vacuum cleaner. (R)</desc>

- I don't know what (He) means however:

    <title>House invaders</title>
    <desc>Episode 7. Linda Barker and Anna Ryder-Richardson host this
    interior design series, showing how to transform one's home
    without spending any money. (He) (R)</desc>

- Or this:

    <title>Friends for dinner</title>
    <desc>Gary Rhodes. Seven-part series in which seven of Britain's
    top chefs turn up on the doorsteps of their biggest fans to help
    them prepare the dinner party of their dreams. In this episode,
    Gary Rhodes whips up a dessert for fan Louise Jenson. (HePoHu)
    (R)</desc>

  Perhaps 'He', 'Po', and 'Hu' are programme categories.
  In fact,

    <desc>Superstars Of Speed. Six-part series in which Jeremy
    Clarkson travels the world to discover why some people go to
    extraordinary lengths in the pursuit of speed. Along the way he
    meets desert racers, speed skiers, test pilots, skydivers and
    astronauts. Clarkson talks to Michael Scumacher, Colin Mcrae and
    Greg Rusedski, to learn why some excel at high speed and other's
    don't. (HeDkCzPoHu) (R)</desc>

  Those look like country codes (apart from 'He').  What is going on?

- I don't know what this is, except that it might be some kind of
  episode number and episode title:

    <title>OU mind bites</title>
    <desc>Mind Bites (Cutdown/65). Thought-provoking shorts.</desc>

- 'Erstsendung' can be translated as 'repeat' in the current format;
  the new DTD provides a way to specify the original screening.

    <programme start="20031026080000 CET" showview="4256065" channel="1.dagenstv.com">
      <title>Spanien - Sprache, Land und Leute</title>
      <desc>13-teiliger Sprachkurs. 8. Madrid. (Erstsendung:
      21.2.1991).</desc>
    </programme>

- (7/8) at end of description - this must be another place to store
  the episode number.

    <desc>Die erste gesprochene Literaturgeschichte der Lyrik im
    Fernsehen von und mit Lutz Görner. 107. Bertolt Brecht
    (7/8).</desc>

- 'Wiederholung von' can be treated as 'repeat'; again, the new format
  allows 'repeat-date' so you can store all the information.

    <desc>(Wiederholung von 16.00 Uhr).</desc>

- (USA 1972 ) must be country of origin and year.

    <desc>Comedy (USA 1972 ) En revisor från New York upplever skoj
    och äventyr när han ärver ett familjehotell i Klippiga
    Bergen.</desc>

- Occasional credits in brackets:

    <desc>(Présentation: Jérôme Chauvelot) blah blah</desc>

- (film noir et blanc) can become category and <colour>no</colour>.

- Sometimes the bracketed title is followed by a dot.

    <desc>(The hunger). Spielfilm. blah blah</desc>

- (Svartvit) - what does this mean?

    <programme start="20031026070000 CET" showview="67895666" channel="24.dagenstv.com">
      <title>Vi sitter i sjön</title>
      <desc>(All at Sea) Amerikansk komedi från 1957 med Alec
      Guinness, Irene Brown, Maurice Denham. Regi: Charles Frend. 79
      min. En ättling till en gammal sjöfararfamilj har dålig hälsa
      och förvandlar hemmet till ett hotell och nöjescenter. När
      lokalbefolkningen blir avundsjuka på hans nya rikedom ökar
      spänningen. Tillbakablickande sekvenser visar Alec Guinness i
      fler roller som ättlingens olika förfäder. (Svartvit)</desc>

  Is 'Svartvit' related to 'sv'?

- Sometimes we get the familiar big programme containing others:

    <programme start="20031026111500 CET" showview="79521065" channel="51.dagenstv.com">
      <title>Rudis Rabenteuer</title>
      <desc>Pingu (10.15). Das abenteuerliche Leben eines kleinen
      Pinguins. Pingu spielt Eishockey. Puppentrickserie. Angelina
      Ballerina (10.20). Fräulein Lilly geht weg. Zeichentrickserie.
      Siebenstein (10.35). Rudi zieht aus. Rudis Tipp (11.00). Der
      Koffer auf Spurensuche.</desc>
    </programme>

  I think I'd prefer to handle this inside tv_extractinfo_en or its
  successor, since it's not particularly grabber-specific (OK, neither
  are some of the other things mentioned) and because
  tv_extractinfo_en already has code for this.  The clash of timezones
  might be tricky.

- Sometimes two different titles are inside brackets separated by a
  slash:

    <title>Die blonde Versuchung</title>
    <desc>(The Marrying Man/Too Hot to Handle). blah blah</desc>

    <title>Tagebuch eines Vergewaltigers</title>
    <desc>(Cronaca di un amore violato / Journal d'un viol). blah
    blah</desc>

I also noticed lots of parsable things that weren't in brackets, but I
will limit myself to the above list for now.  Please let me know what
you think of those things listed which mean something in Swedish or
Norwegian.

-- 
Ed Avis <ed...@me...>
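
Two of the cases in the list look mechanical enough to sketch: a trailing '(3:8)' in <title>, and an original title in brackets at the start of <desc> (optionally followed by a dot, split by a slash, or repeated at the end). The following is only a rough illustration in Python, not code from tv_grab_sn. The function names are made up for this note, and reading (N:M) as 'episode N of M' is just the guess made in the list. The sketch cannot tell a bracketed title from things like '(Wiederholung von 16.00 Uhr)', and it deliberately leaves nested brackets such as '(Gillette World Sport (R))' alone.

```python
import re

# Trailing "(N:M)" in a title, e.g. "En i havet (t) (3:8)".
# ASSUMPTION: this means "episode N of M"; the listings do not confirm it.
EPISODE_RE = re.compile(r"\s*\((\d+):(\d+)\)\s*$")

# Bracketed text at the very start of a description, optionally followed
# by a dot, e.g. "(The hunger). Spielfilm."  Nested brackets do not match.
LEADING_RE = re.compile(r"^\(([^()]+)\)\.?\s*")


def split_episode(title):
    """Return (clean_title, episode, total); the numbers are None if absent."""
    m = EPISODE_RE.search(title)
    if not m:
        return title, None, None
    return title[:m.start()], int(m.group(1)), int(m.group(2))


def split_leading_title(desc):
    """Return (original_titles, rest_of_desc) for a leading bracketed title."""
    m = LEADING_RE.match(desc)
    if not m or m.group(1).strip().isdigit():
        # No leading brackets, or a bare number such as "(1998)" (a year).
        return [], desc
    # "A/B" and "A / B" are both treated as two alternative titles.
    titles = [t.strip() for t in m.group(1).split("/")]
    rest = desc[m.end():]
    # Drop a verbatim repeat of the bracketed text at the end, if present.
    tail = "(" + m.group(1) + ")"
    if rest.rstrip().endswith(tail):
        rest = rest.rstrip()[:-len(tail)].rstrip()
    return titles, rest


print(split_episode("En i havet (t) (3:8)"))
print(split_leading_title("(The Marrying Man/Too Hot to Handle). blah blah"))
```

A malformed marker such as '(:13)' is left in the title untouched, since there is no way to know what it was meant to be.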