From: <hon...@gm...> - 2014-05-11 16:14:18
|
There are a couple of requests on the Feature Request board to add a programme's CRID (unique programme identifier) to the DTD. c.f. FR #71 Add support for representing CRIDs http://sourceforge.net/p/xmltv/feature-requests/71/ FR #105 XMLTV DTD: Add attributes for DVB ONID and SID http://sourceforge.net/p/xmltv/feature-requests/105/ There probably aren't many sources where a CRID is available but, where they are available, then it opens new possibilities for downstream data handling. If we were to do this, I think a "system" attribute might also be useful (as per the episode-num element). This wouldn't be predefined but just some value agreed between grabber and application. E.g. _uk_atlas has crids available and up to now I've encoded them as a second episode-num element, e.g. <episode-num system="xmltv_ns"> 6.16/70. </episode-num> <episode-num system="brand.series.episode"> mr2.ryp52.r8cnm </episode-num> This could morph into: <crid system="brand.series.episode">mr2.ryp52.r8cnm</crid> FR#71 could simply be: <crid system="item">123456</crid> FR#105 could be <crid system="onid.sid">12.34</crid> So, should we add a new element to the DTD for <crid> ? Or do you think that the episode-num entity adequately handles CRIDs ? Rgds, Geoff |
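A rough sketch of how a downstream application might read the identifier under either encoding discussed above, using Python's standard `xml.etree.ElementTree`. The helper name `content_ids`, the channel id and the sample programme are illustrative assumptions; `<crid>` is only the element proposed in this thread, not part of the current DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same identifier encoded both ways, for comparison.
SAMPLE = """
<programme start="20140511200000 +0100" channel="example.channel">
  <title>Example</title>
  <episode-num system="xmltv_ns"> 6.16/70. </episode-num>
  <episode-num system="brand.series.episode"> mr2.ryp52.r8cnm </episode-num>
  <crid system="brand.series.episode">mr2.ryp52.r8cnm</crid>
</programme>
"""

def content_ids(programme, tag):
    """Return {system: value} for every <tag> child, with whitespace trimmed."""
    ids = {}
    for elem in programme.findall(tag):
        system = elem.get("system", "")
        ids[system] = (elem.text or "").strip()
    return ids

programme = ET.fromstring(SAMPLE)
print(content_ids(programme, "episode-num"))  # episode numbers and the id, mixed together
print(content_ids(programme, "crid"))         # only content identifiers
```

With the `episode-num` encoding the application has to know which `system` values are identifiers rather than episode numbers; with a separate element it does not.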
From: Robert E. <rm...@gm...> - 2014-05-12 04:19:17
|
On 5/11/2014 11:14 AM, hon...@gm... wrote: > There are a couple of requests on the Feature Request board to add a programme's CRID (unique programme identifier) to the DTD. > > c.f. > FR #71 Add support for representing CRIDs > http://sourceforge.net/p/xmltv/feature-requests/71/ > > FR #105 XMLTV DTD: Add attributes for DVB ONID and SID > http://sourceforge.net/p/xmltv/feature-requests/105/ > > > There probably aren't many sources where a CRID is available but, where they are available, then it opens new possibilities for downstream data handling. > > If we were to do this, I think a "system" attribute might also be useful (as per the episode-num element). This wouldn't be predefined but just some value agreed between grabber and application. > > > E.g. _uk_atlas has crids available and up to now I've encoded them as a second episode-num element, e.g. > > <episode-num system="xmltv_ns"> 6.16/70. </episode-num> > <episode-num system="brand.series.episode"> mr2.ryp52.r8cnm </episode-num> > > This could morph into: > <crid system="brand.series.episode">mr2.ryp52.r8cnm</crid> > > > FR#71 could simply be: > <crid system="item">123456</crid> > > FR#105 could be > <crid system="onid.sid">12.34</crid> > > > So, should we add a new element to the DTD for <crid> ? > > Or do you think that the episode-num entity adequately handles CRIDs ? > What's special about a CRID that it shouldn't go into episode-num? Many apps already look to episode-num for unique identifiers, so they won't have to do much other than accept crid as a "trusted unique provider". Robert |
From: <hon...@gm...> - 2014-05-12 06:57:11
|
On Sun, 11 May 2014 23:02:08 -0500, Robert Eden wrote: > What's special about a CRID that it shouldn't go into episode-num? Many apps already look to episode-num for unique identifiers, so they won't have to > do much other than accept crid as a "trusted unique provider". Maybe a perception / terminology thing perhaps: e.g. a film will have a CRID but it seems slightly odd to look in something labelled "episode-num" for something that isn't a series/ episode? Or does it? :-) |
From: Karl D. <de...@sp...> - 2014-05-21 04:53:10
|
On 12.05.2014 08:58, hon...@gm... wrote: > On Sun, 11 May 2014 23:02:08 -0500, Robert Eden wrote: > >> What's special about a CRID that it shouldn't go into episode-num? Many apps already look to episode-num for unique identifiers, so they won't have to >> do much other than accept crid as a "trusted unique provider". > > Maybe a perception / terminology thing perhaps: e.g. a film will have a CRID but it seems slightly odd to look in something labelled "episode-num" for something that isn't a series/ episode? Or does it? :-) As Nick noted, the CRID is a broadcaster's ID in a special system. One programme may have a bag of related CRIDs, think series link and episode id. Multiple grabbers should return the same id for the same broadcaster's content. But one grabber will likely return different ids for the same content transmitted by different broadcasters. What I'm missing is a way to signal "this is an episode of series X but we don't know which one". Similar to having a series link id but no programme id. Also the CRIDs should have a way to signal additional information that is encoded in the content_type bits of the content_identifier descriptor in DVB-SI. Regards, Karl PS: I'd love to see an XMLTV-TNG schema that uses modern XML concepts and data types in addition to allowing more complex content, like separate description/titles/ids for series and episodes. Also having more examples would be nice. E.g. how to model the 8 o'clock news in a way that allows it to be repeated across multiple channels with one of the later repeats carrying deaf-signage etc. |
From: Ben B. <lin...@bu...> - 2014-05-21 10:40:07
|
Karl Dietz wrote, On 21.05.2014 06:51: > On 12.05.2014 08:58, hon...@gm... wrote: >> On Sun, 11 May 2014 23:02:08 -0500, Robert Eden wrote: >> >>> What's special about a CRID that it shouldn't go into episode-num? Many apps already look to episode-num for unique identifiers, so they won't have to >>> do much other than accept crid as a "trusted unique provider". >> Maybe a perception / terminology thing perhaps: e.g. a film will have a CRID but it seems slightly odd to look in something labelled "episode-num" for something that isn't a series/ episode? Or does it? :-) > Like Nick noted the CRID is a broadcasters ID in a special system. 1. Agreed, there seems to be a specific definition for the format of CRIDs. We should use another tag name, e.g. <stableid> What we want here is just an opaque string that meets the 2 requirements mentioned: stable over years, and unique within the system. 2. Could you please add these requirements, including definitions of it that I sent in my last post, to the DTD text? > One programme may have a bag of related CRIDs, think series link and > episode id. Multiple grabbers should return the same id for the same > broadcasters content. But one grabber will likely return different ids > for the same content transmitted by different broadcasters. > > What I'm missing is a way to signal "this is an episode of series X but > we don't know which one". Similar to having a series link id but no > programme id. > > Also the CRIDs should have a way to signal additional information that > is encoded in the content_type bits of the content_identifier descriptor > in DVB-SI. > > Regards, > Karl > > PS: I'd love to see an XMLTV-TNG schema that uses modern XML concepts > and data types in addition to allowing more complex content, like > separate description/titles/ids for series and episodes. Also having > more examples would be nice. E.g. 
how to model the 8 o'clock news in a > way that allows it to be repeated across multiple channels with one of > the later repeats carrying deaf-signage etc. |
From: <hon...@gm...> - 2014-05-21 12:54:53
|
On Wed, 21 May 2014 12:39:56 +0200, Ben Bucksch wrote: > 1. Agreed, there seems to be a specific definition for the format of > CRIDs. We should use another tag name, e.g. <stableid> What is wrong with the tag <crid> ? The proposed use here is totally in line with the Wikipedia entry. Don't get hung up on the "format" section in that wiki entry. However if you really want to follow it then why not simply specify the crid as a locator, e.g.: <title>Scary Movie</title> <crid>crid://atlas.metabroadcast.com/episode/8hmr</crid> or <title>EastEnders</title> <crid>crid://atlas.metabroadcast.com/brand/cf2</crid> <crid>crid://atlas.metabroadcast.com/episode/cwhz8v</crid> <episode-num system="xmltv_ns">.4859.</episode-num> > 2. Could you please add these requirements, including definitions of it > that I sent in my last post, to the DTD text? Feel free to edit the proposed text and post your version. |
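For applications that would rather work with the parts of such a locator, a minimal sketch using the standard `urllib.parse.urlsplit` follows. The function name `split_crid` is an assumption for illustration, and the check is deliberately loose: it only looks for the `crid` scheme and a non-empty authority, not full conformance with the CRID specifications.

```python
from urllib.parse import urlsplit

def split_crid(locator):
    """Split 'crid://<authority>/<data>' into (authority, data)."""
    parts = urlsplit(locator)
    if parts.scheme != "crid" or not parts.netloc:
        raise ValueError(f"not a CRID locator: {locator!r}")
    return parts.netloc, parts.path.lstrip("/")

print(split_crid("crid://atlas.metabroadcast.com/episode/8hmr"))
print(split_crid("crid://atlas.metabroadcast.com/brand/cf2"))
```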
From: Ben B. <lin...@bu...> - 2014-05-22 13:02:50
|
hon...@gm... wrote, On 21.05.2014 14:54: > [http://en.wikipedia.org/wiki/CRID] > Don't get hung up on the "format" section in that wiki entry. Well, it says: "In fact, a CRID is a so-called URI <http://en.wikipedia.org/wiki/Uniform_resource_identifier>." That's quite affirmative, and we (or at least my intended usage of the <crid> tag) would violate it, because I'd treat it as opaque string, not URI. If Wikipedia indeed has the correct definition of "CRID", I wouldn't want to hijack and abuse it in a non-conformant way. Is there a more authoritative definition of what a "CRID" is? >> 2. Could you please add these requirements, including definitions of it >> that I sent in my last post, to the DTD text? > Feel free to edit the proposed text and post your version. Will do in my next post. |
From: <hon...@gm...> - 2014-05-22 14:33:24
|
On Thu, 22 May 2014 15:02:37 +0200, Ben Bucksch wrote: > Is there a more authoritative definition of what a "CRID" is? See the references (RFC4078, etc) at the bottom of the WP article. |
From: <hon...@gm...> - 2014-05-21 15:36:35
|
On Wed, 21 May 2014 06:51:19 +0200, Karl Dietz wrote: > One programme may have a bag of related CRIDs, think series link and > episode id. Multiple grabbers should return the same id for the same > broadcasters content. But one grabber will likely return different ids > for the same content transmitted by different broadcasters. I agree. > What I'm missing is a way to signal "this is an episode of series X but > we don't know which one". Similar to having a series link id but no > programme id. Isn't that simply a CRID for the series? e.g. an episode might have one crid for the series and another for the episode. Generic episode: <crid system="series">xxxx</crid> Specific episode: <crid system="series">xxxx</crid> <crid system="episode">xxxx</crid> > Also the CRIDs should have a way to signal additional information that > is encoded in the content_type bits of the content_identifier descriptor > in DVB-SI. Isn't that part of the transport stream - is a grabber going to have access to that information? Rgds, Geoff |
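The generic/specific distinction suggested above could be read by a consumer roughly as follows. This is only a sketch: the function name `episode_kind` is made up, the `system` values `series` and `episode` are the ones used in the example above, and the inference is only as good as the source's habit of always supplying an episode CRID when it knows the episode.

```python
import xml.etree.ElementTree as ET

def episode_kind(programme):
    """Infer 'specific', 'generic' or 'unknown' from the crid systems present."""
    systems = {crid.get("system") for crid in programme.findall("crid")}
    if "episode" in systems:
        return "specific"
    if "series" in systems:
        return "generic"   # series known, episode not identified
    return "unknown"       # no CRIDs at all

generic = ET.fromstring('<programme><crid system="series">xxxx</crid></programme>')
specific = ET.fromstring(
    '<programme><crid system="series">xxxx</crid>'
    '<crid system="episode">yyyy</crid></programme>'
)
print(episode_kind(generic), episode_kind(specific))
```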
From: Karl D. <de...@sp...> - 2014-05-21 15:58:08
|
On 21.05.2014 17:37, hon...@gm... wrote: > On Wed, 21 May 2014 06:51:19 +0200, Karl Dietz wrote: > >> What I'm missing is a way to signal "this is an episode of series X but >> we don't know which one". Similar to having a series link id but no >> programme id. > > Isn't that simply a CRID for the series? e.g. an episode might have one crid for the series and another for the episode. > > Generic episode: > <crid system="series">xxxx</crid> > > Specific episode: > <crid system="series">xxxx</crid> > <crid system="episode">xxxx</crid> The idea behind the XMLTV grabbers is to outsource the things you have to know about the data source to make sense of it into the grabbers. Encoding information in the absence of other information appears to be a step backwards. Let's say you have an episode for which you know the series CRID but not the episode CRID. But you do know which episode it is (say via the episode title). Now you must either make up a fake episode CRID or wrongly signal an unknown episode. With your example as input a grabber should turn the implicit "has a series CRID but no episode CRID, so it must be generic due to our knowledge of the data source" into an explicit "this is a generic episode". My local cable feed transmits CRIDs for the events in a way that does not let you see whether it's a generic episode or a specific one. Both may or may not appear next to a series CRID. So your implicit signaling would fail for that data source. (Also their source is so bad, I've regularly seen three different CRIDs for the same episode :( >> Also the CRIDs should have a way to signal additional information that >> is encoded in the content_type bits of the content_identifier descriptor >> in DVB-SI. > > Isn't that part of the transport stream - is a grabber going to have access to that information? I'm not sure I understand. Yes, the type is next to the CRID in the transport stream. But it should be next to the CRID in other data sources, too. 
How else do you know if you should treat it as a seriesid or a programid? Regards, Karl |
From: <hon...@gm...> - 2014-05-21 17:04:46
|
On Wed, 21 May 2014 17:56:12 +0200, Karl Dietz wrote: > With your example as input a grabber should turn the implicit "has a > series CRID but no episode CRID, so it must be generic due to our > knowledge of the data source" into an explicit "this is a generic episode". Yes I take your point. I guess we are spoilt with the Atlas data source since it has definitive info for series and episode; so the grabber *knows* the lack of an episode id means it's a generic. i.e. the "has a series CRID but no episode CRID" *is* explicit. Conversely if Atlas has no series id then we *know* it's a one-off (e.g. film). I don't have any experience of another source which provides CRIDs so I bow to your experience. > My local cable feed transmits CRIDs for the events in a way that does > not allow to see if its a generic episode or a specific one. Both may or > may not appear next to a series CRID. So your implicit signaling would > fail for that data source. (Also their source is so bad, I've regularly > seen three different CRIDs for the same episode :( True, but I think that's all you can do - i.e. if you have no CRID specifying an episode then you can only assume it's a generic. Far from ideal, but that's what my PVR does. It's the age-old story; there's only so much one can do to cater for bad incoming data :-( A question might be: how does one differentiate between 'explicit' signaling (e.g. Atlas) and 'implicit' signaling? > I'm not sure I understand. Yes, the type is next to the CRID in the > transport stream. But it should be next to the CRID in other data > sources, too. How else do you know if you should treat it as a seriesid > or a programid? Ok can I go back a step then: what 'additional information' would you like to add? Rgds, Geoff |
From: Nick M. <kno...@gm...> - 2014-05-12 10:00:53
|
On 11 May 2014 17:14, <hon...@gm...> wrote: > > There are a couple of requests on the Feature Request board to add a programme's CRID (unique programme identifier) to the DTD. > > c.f. > FR #71 Add support for representing CRIDs > http://sourceforge.net/p/xmltv/feature-requests/71/ > > FR #105 XMLTV DTD: Add attributes for DVB ONID and SID > http://sourceforge.net/p/xmltv/feature-requests/105/ <snip> > So, should we add a new element to the DTD for <crid> ? I would like to see a new and separate element to allow the inclusion of crid* to <programme> elements. Semantically the purpose of a CRID is very different to the (original) purpose of the <episode-num> element, and I'd prefer to not shoehorn a universal resource identifier for content into an element whose defined purpose is to give season/episode/part numbering for episodic broadcasts (but which are, e.g., irrelevant for movies and TV specials). The Wikipedia article on CRIDs (http://en.wikipedia.org/wiki/Crid) includes a lot of useful information stemming from previous TV-Anytime work and links to ETSI papers. I think the potential for a usable CRID implementation in the DTD necessitates using a new element that sits directly under the <programme> element. In addition to clearly separating the implementation of CRIDs from episode numbering, it also makes it clear to current and future XMLTV data consumers that this element has a clearly defined purpose. Cheers, Nick |
From: <hon...@gm...> - 2014-05-16 16:47:23
|
So as a straw-man proposal then:

++++++++++++++++++++++++++++++++++++++++++++++++
<!ELEMENT programme (title+, sub-title*, desc*, credits?, date?, category*, keyword*, language?, orig-language?, length?, icon*, url*, country*, episode-num*, video?, audio?, previously-shown?, premiere?, last-chance?, new?, subtitles*, rating*, star-rating*, review*, crid*)>

<!-- CRID : Content Reference Identifier

Not the episode number or series number. This is an identifier which uniquely identifies some 'content' within all the programmes for this grabber. A CRID may refer to a series (a 'group' CRID), or an individual programme.

There are several ways of defining a CRID, so the 'system' attribute lets you specify which you mean.

By definition, a CRID must *uniquely* identify some content within the context of a specific grabber.

When using CRIDs in downstream applications they should construct a URI consisting of grabber name + CRID. Where this is not unique then the 'system' attribute should also be included. This is to ensure a reference to a CRID is unique and does not overlap between grabbers. This is to allow for XML data from multiple grabbers to be combined without their CRIDs conflicting.
-->
<!ELEMENT crid (#PCDATA)>
<!ATTLIST crid system CDATA #IMPLIED>
++++++++++++++++++++++++++++++++++++++++++++++++

Please feel free to fix/amend as necessary.

Geoff |
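To illustrate the "grabber name + CRID" rule in the proposed text, here is a small sketch of how data from two grabbers might be merged without their identifiers colliding. It uses a tuple key instead of a string URI, which is a substitution on my part to sidestep separator and escaping questions; the function name `crid_key` and the grabber names are hypothetical.

```python
def crid_key(grabber, crid, system=None):
    """Key a CRID by grabber name (and system, if given) so merged data cannot clash."""
    return (grabber, system or "", crid)

merged = {}
for grabber, system, crid, title in [
    ("tv_grab_a", "item", "123456", "Programme from grabber A"),
    ("tv_grab_b", "item", "123456", "Unrelated programme from grabber B"),
]:
    merged[crid_key(grabber, crid, system)] = title

# Both entries survive even though the raw CRID strings are identical.
print(len(merged))
```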
From: Ben B. <lin...@bu...> - 2014-05-22 13:03:32
|
Suggestion for definition:

<!ELEMENT programme (title+, sub-title*, desc*, credits?, date?, category*, keyword*, language?, orig-language?, length?, icon*, url*, country*, episode-num*, video?, audio?, previously-shown?, premiere?, last-chance?, new?, subtitles*, rating*, star-rating*, review*, cid*)>
<!ELEMENT cid (#PCDATA)>
<!ATTLIST cid system CDATA #REQUIRED>

<!-- CID : Content Identifier

This is an identifier which uniquely identifies some 'content' within all the programmes for this grabber. An ID may refer to a film, episode of a series, or e.g. a news or sports broadcast.

If the video content (film, episode etc.) is the same, the ID should be the same, even if broadcasted at a different time. If the video content is different, the ID must be different.

Concrete criteria:
* Unique - There must never ever be 2 different videos with the same ID. That must be true globally for all programs.
* Stable - There must never be the same videos with 2 different IDs. I.e. if the same video is broadcasted within days or weeks, the ID MUST be the same. If the same video is broadcasted again 3 years later, it SHOULD have the same ID as 3 years before.

These IDs can be used as database key, duplication detection etc.

If any of the above criteria are not met by your IDs, you MUST NOT use the <cid> tag. If the above criteria are not met, then the downstream application will run into serious problems:
* Showing the wrong title and description for a programme
* recording the wrong shows, e.g. recording the documentary called "Titanic" instead of the movie
* massive duplication and waste of disk space by re-recording shows
* not recording shows that should be recorded

The IDs themselves are opaque strings, and valid within a "system" that you need to specify. IDs guarantee their properties only within the system. A system could be the IDs of the data source, or a third-party database.
--> |
From: <hon...@gm...> - 2014-05-22 14:28:23
|
On Thu, 22 May 2014 15:03:20 +0200, Ben Bucksch wrote: > * Unique - There must never ever be 2 different videos with the same > ID. That must be true globally for all programs. > * Stable - There must never be the same videos with 2 different IDs. In Utopia yes I'd agree with you; unfortunately that will never be possible. Unless you maintain some sort of central database to which all grabbers refer then you are never going to get a unique id across all grabbers. Even more so where the id is provided by the broadcaster/data source (rather than being invented by the grabber) - as the WP article freely acknowledges. Likewise you will never guarantee that the same content has the same ID without trying to match every incoming programme against some global database of all the programmes ever found by any grabber at any time ever. Is there any particular reason you want the id as a plain string rather than formatted as a locator? (Can't your downstream application parse out the string from the locator?) |
From: Ben B. <lin...@bu...> - 2014-05-22 14:42:24
|
hon...@gm... wrote, On 22.05.2014 16:28: > Unless you maintain some sort of central database to which all grabbers refer then you are never going to get a unique id across all grabbers. The spec proposal doesn't ask for an ID across all grabbers - that's what the "system" is for. The criteria (unique and stable) must be true only within the system. I assume that each grabber which uses <cid> will use its own "system" for the source, e.g. <cid system="tvmovie.de">7645487</cid>. Alternatively, the "tvmoviedb" or "IMDB" could be "system"s. What's important is that the grabber author verified that the IDs from the source fulfill these criteria. > Is there any particular reason you want the id as a plain string rather than formatted as a locator? For me, IDs are always just plain strings. I don't see a reason to make it more complicated and make an URI out of it. If you want to have a CRID, you can always do "crid://" + system + "/" + id . But that would be longer, and presumably make e.g. DB index searches slower, so I wouldn't do that in my app. Ben |
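The concatenation mentioned above, written out as a small Python sketch. The function name `cid_to_locator` is invented for illustration, and percent-encoding the ID with `urllib.parse.quote` is an extra precaution that is not part of the proposal; the `system` value is assumed to already be usable as the authority part.

```python
from urllib.parse import quote

def cid_to_locator(system, cid):
    """Build 'crid://<system>/<id>' from an opaque (system, id) pair."""
    # quote() leaves letters, digits and '_.-~' unchanged and escapes the rest.
    return "crid://" + system + "/" + quote(cid, safe="")

print(cid_to_locator("tvmovie.de", "7645487"))
```

An application that only needs a database key can keep the `(system, id)` pair as-is and build the locator only when something else asks for one.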
From: <hon...@gm...> - 2014-05-22 15:30:34
|
On Thu, 22 May 2014 16:42:17 +0200, Ben Bucksch wrote: > The spec proposal doesn't ask for an ID across all grabbers - that's > what the "system" is for. Ah right, my bad. I read your "That must be true globally for all programs" and interpreted "program" as grabber script, rather than "programme" (which the Americans misspell as "program") ;-) > What's important is that the grabber author verified that the IDs from > the source fulfill these criteria. So you're still going to need to maintain a database in the grabber to be able to do this. And obviously there's no way a grabber could verify, for example, an IMDb id was unique in the IMDb database. I think some things you have to take on trust; if the quality of the data source looks reliable then you have to assume the id they pass to the grabber is correct. Else don't use it. Rgds, Geoff |
From: Ben B. <lin...@bu...> - 2014-05-22 15:43:28
|
hon...@gm... wrote, On 22.05.2014 17:31: >> What's important is that the grabber author verified that the IDs from >> the source fulfill these criteria. > So you're still going to need to maintain a database in the grabber to be able to do this. The grabber source often has such a database, not the grabber. E.g. tvmovie.de has (had?) such IDs that come as part of their XML. > And obviously there's no way a grabber could verify, for example, an IMDb id was unique in the IMDb database. > > I think some things you have to take on trust; if the quality of the data source looks reliable then you have to assume the id they pass to the grabber is correct. Else don't use it. Exactly. I verify this with manual spot checks and then use the grabber over time and watch whether problems appear. |
From: <hon...@gm...> - 2014-05-23 08:04:12
|
On Thu, 22 May 2014 17:43:07 +0200, Ben Bucksch wrote: > > I think some things you have to take on trust; if the quality of the data source looks reliable then you have to assume the id they pass to the grabber is correct. > > Exactly. I verify this with manual spot checks and then use the grabber > over time and watch whether problems appear. Ok. So that just leaves: > * Stable - There must never be the same videos with 2 different IDs. > I.e. if the same video is broadcasted within days or weeks, the ID > MUST be the same. How do you propose the grabber proves this? (I don't think you can say "MUST" here.) |
From: Ben B. <lin...@bu...> - 2014-05-23 11:29:33
|
hon...@gm... wrote, On 23.05.2014 10:05: >> > * Stable - There must never be the same videos with 2 different IDs. >> > I.e. if the same video is broadcasted within days or weeks, the ID >> > MUST be the same. > How do you propose the grabber proves this? > > (I don't think you can say "MUST" here.) Same way, with spot checks and watching how it works over time. Why is this important? Applications like MythTV and Zeipis (mine) record airings based on abstract schedules like "All Star Trek". At least here, it is very common to air the new episode in the afternoon, and repeat it during the night or the next morning. Zeipis and co need a way to detect this and avoid re-recording the show. We can use the title/subtitle, and that works most of the time, but not always. Better would be a unique ID to know that this is the same content. In such a case, it is trivial to verify that the IDs are stable: If the re-airing has the same ID, the IDs are stable. If it doesn't, the ID (from the source) is useless for us and should be ignored. If a grabber were to add such non-stable IDs, and Zeipis would rely on them, Zeipis would re-record the same show over and over again. Zeipis would need to add hacks to avoid such broken and useless IDs, and they'd cause a big problem for users and developers. Therefore, it is critical that this criterion is met by the grabber and that's why it's a MUST. Ensuring (by the source) and verifying (by the grabber author) whether the ID is stable over years is a lot harder. It would still be very useful, though, so that a re-airing of an older movie that I have already recorded and stored is not recorded again. I really want that stability over years, but it's very hard to guarantee. This is why it's a SHOULD. Ben |
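The duplicate-avoidance use case described above might look roughly like this in a recording scheduler. This is an illustrative sketch only, not how MythTV or Zeipis actually work; the `Airing` class, the `plan_recordings` function and the sample data are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Airing:
    title: str
    start: str
    cid_system: Optional[str] = None
    cid: Optional[str] = None

def plan_recordings(airings, already_recorded=()):
    """Keep the first airing of each (system, id); skip repeats and stored content."""
    seen = set(already_recorded)
    plan = []
    for airing in airings:
        if airing.cid is None:
            plan.append(airing)  # no ID: cannot deduplicate by ID
            continue
        key = (airing.cid_system, airing.cid)
        if key in seen:
            continue             # same content already recorded or planned
        seen.add(key)
        plan.append(airing)
    return plan

airings = [
    Airing("Star Trek", "20140523160000", "example.source", "1001"),
    Airing("Star Trek", "20140524020000", "example.source", "1001"),  # night repeat
    Airing("Star Trek", "20140524160000", "example.source", "1002"),  # next episode
]
for airing in plan_recordings(airings):
    print(airing.start, airing.cid)
```

If the source handed out a fresh ID for the night repeat, the second airing would be recorded again, which is exactly the failure the stability requirement is meant to rule out.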
From: <hon...@gm...> - 2014-05-23 13:16:03
|
On Fri, 23 May 2014 13:29:27 +0200, Ben Bucksch wrote: > Same way, with spot checks and watching how it works over time. I understand why it's important; I've been running PVRs using EPG data for over 12 years. Consequently I am well aware of the many limitations and data quality issues with source data providers. The problem is your use of the word "MUST". To say that occasional spot checks will detect programmes which fail this test, breaks your requirement for "MUST". At the time the grabber is writing out the record it cannot guarantee the id is unique and/or stable and therefore it fails the "MUST" test and so cannot write it. No ids will ever pass your requirement for "MUST". |
From: Ben B. <lin...@bu...> - 2014-05-23 13:40:02
|
hon...@gm... wrote, On 23.05.2014 15:16: > On Fri, 23 May 2014 13:29:27 +0200, Ben Bucksch wrote: > >> Same way, with spot checks and watching how it works over time. > I understand why it's important; I've been running PVRs using EPG data for over 12 years. Consequently I am well aware of the many limitations and data quality issues with source data providers. > > The problem is your use of the word "MUST". To say that occasional spot checks will detect programmes which fail this test, breaks your requirement for "MUST". At the time the grabber is writing out the record it cannot guarantee the id is unique and/or stable and therefore it fails the "MUST" test and so cannot write it. > > No ids will ever pass your requirement for "MUST". I guess we have a different idea of "MUST". It doesn't mean the grabber author must guarantee against all odds, bugs, and future changes. It just says he needs to verify that it's true - as far as he reasonably can - and he MUST act when he comes to know that it's not true anymore. A SHOULD allows a knowing violation, with reason. A MUST does not allow knowing violations. No MUST will guarantee against bugs or future changes. I *do* want to make sure that no grabber author comes with a different idea of what an ID is and says "but the spec allows it". For example, IDs for whole series (not episodes) should not be added as cid. Likewise, IDs that are generated purely based on title (e.g. "Titanic" as doc and movie) should not be added. The grabber source (not the grabber itself) must have some sort of database on their end, and the ID provides that critical link between my database and theirs. Ben |
From: <hon...@gm...> - 2014-05-23 13:59:18
|
On Fri, 23 May 2014 15:39:55 +0200, Ben Bucksch wrote: > I guess we have a different idea of "MUST". It doesn't mean the grabber > author must guarantee against all odds, bugs, and future changes. It > just says he needs to verify that it's true - as far as he reasonably > can - and he MUST act when he comes to know that it's not true anymore. See RFC2119. MUST is absolute; it doesn't mean PROBABLY ;-) http://tools.ietf.org/html/rfc2119 |
From: Ben B. <lin...@bu...> - 2014-05-23 14:16:46
|
hon...@gm... wrote, On 23.05.2014 15:59: > See RFC2119. MUST is absolute; it doesn't mean PROBABLY;-) > > http://tools.ietf.org/html/rfc2119 I know very well what MUST means, I've implemented IMAP/POP/SMTP/MIME protocol parts in Thunderbird. The 2 criteria I wrote for IDs *are* an "absolute requirement of the specification" (that's all RFC 2119 says about "MUST"). I do mean it in this way. If you violate this, you're not XMLTV. But no implementation of anything (not even Thunderbird's MIME handling) can guarantee absence of bugs or know about future changes that might break things. MUST means: 'If you violate this rule, you are violating this specification, and you must fix it. There's no debate about it, no excuses, you cannot intentionally deviate from it for any reason.'. This is how I mean it: If you violate these criteria, you have to fix it immediately or you're out. SHOULD means: 'You really need to do this. But there may' (quote) "exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course." I do *not* want every new grabber author coming and arguing that his IDs don't match these criteria, but they are useful anyway for this or that reason, and that's why he added them here, and if I don't like it, I can ignore them in my app, and whatever. I want to make sure right here what we mean with "ID", and that it will have serious consequences when these assumptions are not met, so that we don't repeat this argument every year with new devs every time. Ben |
From: Ben B. <lin...@bu...> - 2014-05-23 13:57:14
|
Ben Bucksch wrote, On 22.05.2014 17:43: > The grabber source often has such a database, not the grabber. E.g. > tvmovie.de has (had?) such IDs that come as part of their XML. To expand on this: Every editorial office needs this, in their own interest, so that they don't have to re-write the description for every airing. Similarly, they need to verify the match, so that the TV magazine doesn't write "best-selling movie of all time" for a documentary titled "Titanic". If they are so nice to give us their DB ID, that's very valuable, because it allows me to make this link between their DB and mine, even if the description differs slightly. Some sources do give these IDs in the data. I'd like to capture these, and use them. To have a proper abstraction of grabbers (international apps using them need it, that's why we have XMLTV), we need to nail down what exactly we mean with "ID", so that we don't have conflicting interpretations and assumptions, with resulting problems downstream. Ben |