
Ticket #38 (closed enhancement: wontfix)

Opened 3 years ago

Last modified 2 years ago

Enhance Support for MIME entities and email.

Reported by: phreedom_ Owned by: m4db0b
Priority: critical Milestone: next-release
Component: ontology-nie Version: 1.1
Keywords: Cc: Philip, Van, Hoof, <spam@…>, phreedom_, m4db0b, mylka

Description

Email is definitely a message. However it's a bit too elaborate and convoluted compared to other message types.

Our first attempt at representing email was to try and make it look like a typical message by pretending it's just a text with possibly some files attached. It makes sense to keep this representation for compatibility with all other message types.

However there's a demand to also keep the true structure of email as a tree of MIME entities. This can be easily done by representing MIME entities as a nie:DataObject to store useful properties such as encoding, content-disposition and representing MIME entity contents using a nie:InformationElement child appropriate for the content type. Multipart MIME entities should be treated as containers.
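As a rough illustration of what "a tree of MIME entities" means in practice, here is a minimal Python sketch using only the standard library `email` package. The message it builds and the `mime_tree` helper are hypothetical and not part of the proposal; each row it returns corresponds to one entity that would become a nie:DataObject under this scheme.

```python
from email.message import EmailMessage

def mime_tree(part, depth=0):
    """Return one (depth, content_type, disposition) row per MIME entity, parent first."""
    rows = [(depth, part.get_content_type(), part.get_content_disposition())]
    if part.is_multipart():
        # Multipart entities act as containers: recurse into their children.
        for child in part.get_payload():
            rows.extend(mime_tree(child, depth + 1))
    return rows

# Hypothetical message: text and HTML alternatives plus one PDF attachment.
msg = EmailMessage()
msg["Subject"] = "hypothetical example"
msg.set_content("plain body")
msg.add_alternative("<p>html body</p>", subtype="html")
msg.add_attachment(b"%PDF-", maintype="application", subtype="pdf",
                   filename="report.pdf")

for depth, ctype, disposition in mime_tree(msg):
    print("  " * depth + ctype, disposition)
```

The flat "text with some files attached" view keeps only the leaves; the tree view also keeps the multipart containers and the per-entity headers.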

nmo:Email can be linked to its MIME representation using a dedicated property (like the proposed nmo:hasContent).

Alternatively, nmo:Email can be subclassed from a MIME multipart/container and can directly contain MIME entities,
serving as the topmost multipart MIME entity container typically found in email.

The first approach has an advantage in that it keeps the structure of email intact in all cases.

The second approach can save us an extra indirection (by not having a separate resource for the topmost container) at the price of being unable to represent the simplest case, a plaintext email, without workarounds.

In this case, Email has to be treated as a container that contains a single MIME entity representing the plaintext content. The number of triples and resources needed for these use cases however stays the same regardless of the approach taken.

I'll leave it to others to flame as to the merits of these two approaches :)

Proposed changes (1st approach)

Please pay attention to comments.

# better name?
nmo:hasContent
  a rdf:Property ;
  rdfs:domain nmo:Email ;
  rdfs:comment "Links nmo:Email with its topmost MIME entity" ;
  rdfs:label "hasContent" ;
  nrl:maxCardinality "1" ;
  rdfs:range nmo:MimePart .

# name = MIMEMultipart, MimePartContainer?
nmo:Multipart
  a rdfs:Class ;
  rdfs:comment "A MIME entity container. Corresponds to multipart/* MIME types." ;
  rdfs:label "Multipart" ;
  rdfs:subClassOf nfo:DataContainer .

# name=mimePartBoundary?
nmo:partBoundary a rdf:Property ;
  rdfs:domain nmo:Multipart ;
  rdfs:comment "Boundary string used to delimit MIME entities contained in a multipart/* MIME entity." ;
  rdfs:label "partBoundary" ;
  rdfs:range xsd:string .

# Supersedes MimeEntity. Should we use MimeEntity name instead?
nmo:MimePart
  a rdfs:Class ;
  rdfs:comment "A MIME entity, as defined in RFC2045, Section 2.4." ;
  rdfs:label "MimePart" ;
  rdfs:subClassOf nfo:EmbeddedFileDataObject .

# Content-disposition header is encountered in emails ONLY, not necessarily in other MIME use cases
nmo:contentDisposition
  a rdf:Property ;
  rdfs:domain nmo:MimePart ;
  rdfs:comment "Content-disposition header value." ;
  rdfs:label "contentDisposition" ;
  nrl:maxCardinality "1" ;
  rdfs:range xsd:string .

# It's not very nice to steal MessageHeader for this, but
# defining another equivalent class doesn't make sense as well as renaming the existing one.
nmo:mimeHeader
  a rdf:Property ;
  rdfs:domain nmo:MimePart ;
  rdfs:comment "MIME entity header." ;
  rdfs:label "mimeHeader" ;
  rdfs:range nmo:MessageHeader .

Deprecate this

This is broken. Should be a nie:DataObject. Superseded by nmo:MimePart:

nmo:MimeEntity
  a       rdfs:Class ;
  rdfs:comment "A MIME entity, as defined in RFC2045, Section 2.4." ;
  rdfs:label "MimeEntity" ;
  rdfs:subClassOf nie:InformationElement .

This isn't needed since the contents of an email are either text or something that's barely representable as a single mimetype. Putting the mimetypes of all contained resources here is just as reasonable as putting the mimetypes of all files in a folder into the mimeType property of said folder.

nmo:contentMimeType a rdf:Property ;
	rdfs:domain nmo:Email ;
	rdfs:range xsd:string ;
	rdfs:subPropertyOf nie:mimeType .

We should use a nie:hasLogicalPart child to point to real attachments (as opposed to inlined text and whatnot) to enable mail clients to easily list the things users consider attachments, not everything that's embedded.

nmo:hasAttachment
  a       rdf:Property ;
  rdfs:comment "Links a message with files that were sent as attachments." ;
  rdfs:domain nmo:Message ;
  rdfs:label "hasAttachment" ;
  rdfs:range nfo:Attachment ;
  rdfs:subPropertyOf nie:hasPart .

The second approach

Would drop nmo:hasContent and add

nmo:Email rdfs:subClassOf nmo:Multipart .

Change History

  Changed 3 years ago by phreedom_

  • cc Philip, Van, Hoof, <spam@…>, phreedom_ added; Philip Van Hoof <spam@…> removed

  Changed 3 years ago by pvanhoof

follow-up: ↓ 7   Changed 3 years ago by leo_sauermann

  • cc m4db0b added
  • owner changed from leo_sauermann to m4db0b
  • status changed from new to assigned

I would agree to this change if the following were elaborated further:
"However there's a demand to also keep the true structure of email as a tree of MIME entities."

Which application has this demand, and for what scenario/use case?

As this is an NMO ticket, RobertoGuido should comment; he has volunteered to maintain NMO (OntologyMaintenance).

follow-ups: ↓ 8 ↓ 9   Changed 3 years ago by m4db0b

At first glance I disagree with mapping MIME messages 1:1 into the ontology, but clearly this gives enough flexibility to handle any kind of incoming message and need.

Some questions:

  • the "second approach" seems far easier to understand; what kind of "workarounds" may be required to handle plain text mails?
  • what about nmo:htmlMessageContent and nmo:plainTextMessageContent in nmo:Message? For nmo:Email, will those overlap with nmo:MimePart?
  • is nmo:mimeHeader really required? nmo:contentDisposition has already been made explicit and the mimetype may be stored elsewhere, so what kind of information does this have to represent?
  • the above description is unclear about the recommendations for nmo:hasAttachment

follow-ups: ↓ 6 ↓ 10   Changed 3 years ago by pvanhoof

- What is that "the second approach" (what do you mean with that?)?

- nmo:htmlMessageContent and nmo:plainTextMessageContent could inherit nmo:MimePart, indeed

- nmo:mimeHeader isn't really required. The problem is that headers aren't at the level of messages, but at the level of MIME parts. The standard headers are included as well defined properties indeed, but any MIME part can have an arbitrary number of extra headers. The top MIME part (the E-mail itself) is just a MIME part like any other, and that's what the E-mail's headers are (the headers of the top MIME part). So nmo:messageHeader is actually incorrect. Mimetype is indeed already a well defined property called nmo:contentMimeType, I think.

- nmo:hasAttachment ain't fixed in this proposal yet. The recommendation right now is to use it alongside the hasPart stuff (have two ways to indicate that the E-mail has attachments, via the structural mapping and via a one-level hasAttachment property).

The scenario or use-case for the full structure of the E-mails is an "E-mail as a desktop service" service and the kind of complicated queries that you can find in the document that I referred to earlier.

in reply to: ↑ 5   Changed 3 years ago by m4db0b

Replying to pvanhoof:

- What is that "the second approach" (what do you mean with that?)?

The part labelled "the second approach" in the above description:

nmo:Email rdfs:subClassOf nmo:Multipart .

- nmo:htmlMessageContent and nmo:plainTextMessageContent could inherit nmo:MimePart, indeed

Wouldn't a nmo:MimePart with a specified nmo:contentMimeType be better? Just to avoid proliferation of (too) specific subtypes...
At the current stage, where is the content of other kinds of nmo:Message (cf. nmo:IMMessage) supposed to be saved?

Mimetype is indeed already a well defined property called nmo:contentMimeType, I think.

Ok, I agree. If nmo:mimeHeader isn't targeting other kinds of information, it can be dropped.

have two ways to indicate that the E-mail has attachments

Two different ways to hold the same information doesn't seem an optimal solution... I'm trying to figure out how nmo:hasAttachment is used by other kinds of nmo:Message, so as to find a convergence and adopt a single way to represent the "attachment" concept.

in reply to: ↑ 3   Changed 3 years ago by phreedom_

Replying to leo_sauermann:

I would agree to this change if the following were elaborated further:
"However there's a demand to also keep the true structure of email as a tree of MIME entities."

Which application has this demand, and for what scenario/use case?

  • Using Nepomuk as a primary storage for emails. pvanhoof is a better person to explain this in detail, but he seems to be too shy to tell his story here ;)
  • nmo:MimeEntity is already in the ontology and it's broken.

in reply to: ↑ 4 ; follow-up: ↓ 11   Changed 3 years ago by phreedom_

Replying to m4db0b:

Some questions:
* what about nmo:htmlMessageContent and nmo:plainTextMessageContent in nmo:Message? For nmo:Email, will those overlap with nmo:MimePart?

To keep compatibility with other messages, both *MessageContent properties have to be used along with nmo:MimePart. Alternatively we might consider changing *MessageContent properties somehow to avoid duplication.

* is nmo:mimeHeader really required? nmo:contentDisposition has already been made explicit and the mimetype may be stored elsewhere, so what kind of information does this have to represent?

In theory MIME entities are also used in areas other than email, e.g. HTTP provides responses that are MIME entities, so there can be some other uses and some extra headers.

BTW if we ever want to represent HTTP stuff as MIME, we are better off moving MIME to another ontology such as NFO or NMIME.

* the above description is unclear about the recommendations for nmo:hasAttachment

I think that the generic nmo:hasAttachment should be a subproperty of nie:hasLogicalPart.

We have two very similar, highly-correlated and, as a consequence, confusing concepts of embedding and attachment:

  • HTML page can have an embedded stylesheet or svg picture, but it's "attached" from the POV of the user.
  • Email can contain/embed PGP signature, but it's not an attachment from the POV of the user.
  • Email can contain some document and it's an attachment from the POV of the user.
  • HTML page like this ticket can have an attachment from the POV of the user, but it can reside on another server, so no containment.

@leo_sauermann: am I stepping on PIMO toes with this example? This requires the HTML document to be treated as a ticket first. OTOH if we had a bug tracker ontology, we wouldn't have to reinterpret some other object as a ticket?

The key here is "POV of the user": it's not about how it's stored, it's about what role it's intended to play. Attachment is a relation between InformationElements.

Embedding is storage of "helper" resources which are used by the containing resource, but may be and often are useless all by themselves = a kind of physical containment.

in reply to: ↑ 4   Changed 3 years ago by phreedom_

Replying to m4db0b:

* the "second approach" seems far easier to understand; what kind of "workarounds" may be required to handle plain text mails?

Better to illustrate this with examples. I'm omitting lots of obvious properties for clarity. Any of the email contents can be a nmo:Multipart instead, but it works the same for both approaches, so I didn't elaborate on this in the examples.

1st approach: plaintext email


user:Email
  a nmo:Email;
  nmo:hasContent user:EmailContent.

user:EmailContent
  a nmo:MimePart, nfo:PlainTextDocument.

2nd approach: plaintext email


user:Email
  a nmo:Email;
  nie:hasPart user:EmailContent.

user:EmailContent
  a nmo:MimePart, nfo:PlainTextDocument.

1st approach: multipart email


user:Email
  a nmo:Email;
  nmo:hasContent user:EmailMultipartContent.

user:EmailMultipartContent
  a nmo:MimePart, nmo:Multipart;
  nie:hasPart user:EmailTextContent, user:EmailHtmlContent.

user:EmailTextContent
  a nmo:MimePart, nfo:PlainTextDocument.

user:EmailHtmlContent
  a nmo:MimePart, nfo:HtmlDocument.

2nd approach: multipart email


user:Email
  a nmo:Email;
  nie:hasPart user:EmailTextContent, user:EmailHtmlContent.

user:EmailTextContent
  a nmo:MimePart, nfo:PlainTextDocument.

user:EmailHtmlContent
  a nmo:MimePart, nfo:HtmlDocument.

in reply to: ↑ 5   Changed 3 years ago by phreedom_

Replying to pvanhoof:

- nmo:htmlMessageContent and nmo:plainTextMessageContent could inherit nmo:MimePart, indeed

Not so easy. These properties apply to all messages. The only way to fix this is to make their range nfo:PlainTextDocument and nfo:HtmlDocument respectively. Then you could point them to MIME parts or whatever is necessary. Currently these are literals.

One more nastiness is that actual email text can consist of several parts. Such as inlined signatures added by mailing lists and whatnot. Although usually this isn't such a big deal.

- nmo:mimeHeader isn't really required. The problem is that headers aren't at the level of messages, but at the level of MIME parts. The standard headers are included as well defined properties indeed, but any MIME part can have an arbitrary number of extra headers. The top MIME part (the E-mail itself) is just a MIME part like any other, and that's what the E-mail's headers are (the headers of the top MIME part). So nmo:messageHeader is actually incorrect.

nmo:messageHeader applies to all messages so the question is rather: do other messages need arbitrary header-value pairs? If not, and if we follow the 1st approach, we can drop them.

The scenario or use-case for the full structure of the E-mails is an "E-mail as a desktop service" service and the kind of complicated queries that you can find in the document that I referred to earlier.

You could have been more elaborate ;)

in reply to: ↑ 8 ; follow-up: ↓ 13   Changed 3 years ago by m4db0b

Replying to phreedom_:

Alternatively we might consider changing *MessageContent properties somehow to avoid duplication.

As you use in those examples, adoption of already existing nfo:PlainTextDocument and nfo:HtmlDocument may help.

BTW if we ever want to represent HTTP stuff as MIME, we are better off moving MIME to another ontology such as NFO or NMIME.

I'm not sure about this: I cannot think of a situation in which HTTP content would need to be saved while maintaining MIME information; the generic "multipart" concept is already expressed in NIE (:hasPart, :hasLogicalPart...) and more specific hierarchies are required case by case.
We can discuss moving the MIME representation later...

I think that the generic nmo:hasAttachment should be a subproperty of nie:hasLogicalPart.

This makes sense.

Replying to phreedom_:

nmo:MimeEntity is already in the ontology and it's broken.

But modifying it would break nmo:IMMessage.

To summarize: a solution may be to have a nmo:Message property which holds the content of simple plain messages (e.g. nmo:IMMessage, a hypothetical future nmo:SMS and so on), and specific properties for those messages (nmo:Email) composed of many parts. I agree with using subproperties of nie:hasPart and nie:hasLogicalPart to build relations among those parts; we just need to find a way to express it that maintains coherence across all nmo:Message types.

follow-up: ↓ 14   Changed 3 years ago by phreedom_

Looks like there was another related discussion: ticket:6

in reply to: ↑ 11 ; follow-up: ↓ 15   Changed 3 years ago by phreedom_

Replying to m4db0b:

Replying to phreedom_:

Alternatively we might consider changing *MessageContent properties somehow to avoid duplication.

As you use in those examples, adoption of already existing nfo:PlainTextDocument and nfo:HtmlDocument may help.

This would break all messages. Also, for the purpose of fulltext search it's advantageous that the plainTextContent property is assigned to the "master" or topmost resource.

BTW if we ever want to represent HTTP stuff as MIME, we are better off moving MIME to another ontology such as NFO or NMIME.

I'm not sure about this: I cannot think of a situation in which HTTP content would need to be saved while maintaining MIME information; the generic "multipart" concept is already expressed in NIE (:hasPart, :hasLogicalPart...) and more specific hierarchies are required case by case.
We can discuss moving the MIME representation later...

This would involve breakage of ontologies. Not something we want. We are better off breaking stuff now and then making a pause.

Replying to phreedom_:

nmo:MimeEntity is already in the ontology and it's broken.

But modifying it would break nmo:IMMessage.

Not sure how IMMessage ended up as a MimeEntity child. What's the point/use case for this? Antoni?

in reply to: ↑ 12   Changed 3 years ago by m4db0b

Replying to phreedom_:

Looks like there was another related discussion: ticket:6

And I'm quite a bad maintainer...
Previous ticket closed as duplicate, with notice of the migration of the discussion.

We are better off breaking stuff now and then making a pause.

If so, please open a new ticket to dive into the specific question.

in reply to: ↑ 13 ; follow-up: ↓ 17   Changed 3 years ago by phreedom_

antonimylka replying to phreedom_:

nmo:MimeEntity is already in the ontology and it's broken.

But modifying it would break nmo:IMMessage.

Not sure how IMMessage ended up as a MimeEntity child. What's the point/use case for this? Antoni?

AFAIR IMMessage is not related to MimeEntity; they don't even have a common superclass (I think). The idea was to have one nmo:Message which would be subclassed to email and immessage.

In general IMMessage wasn't supposed to be a MimeEntity.

Don't know how this MimeEntity appeared as a superclass.

  Changed 3 years ago by mylka

It seems that it was I who did it,

http://dev.nepomuk.semanticdesktop.org/changeset/5428

I don't remember any use case in Aperture for IMMessage. Aperture didn't support any Instant Messaging archives (which means that nepomuk didn't either). We don't use the IMMessage class for anything. It must have been an elaborate typo. As far as I'm concerned this subClassOf relationship can be deleted.

in reply to: ↑ 15   Changed 3 years ago by m4db0b

Replying to phreedom_:

Don't know how this MimeEntity appeared as a superclass.

Wow, we've opened Pandora's jar...

To summarize, and avoid blockers: please formulate a proposal for moving "multipart" management into NFO and submit it as a dedicated ticket.
In the meantime, MimeEntity on nmo:IMMessage will be investigated and, if appropriate, reverted, as suggested by mylka.

  Changed 3 years ago by mylka

I'll try to be as brief as possible

  1. to m4db0b: I also disagree with mapping the MIME structure 1:1 to the ontology unless someone REALLY needs it
  2. to phreedom: It's impossible to distinguish automatically what is or isn't an attachment from the POV of the user. We've gone through this and ended up with an implementation where the first part of multipart/mixed is the message itself and every other part is an attachment, including keys, and nameless plaintext adverts added by SF mailing lists. Trying to make this distinction at the ontology level will lead to inconsistencies
  3. to m4db0b: I don't think there is a need for a specific 'multipart' treatment in NIE; there is no need for everything that 'contains' something to belong to a specific class. The idea is that EVERY INFORMATION ELEMENT can contain more data objects. A word document can contain OLE objects (nie:hasPart), a website can contain images (in this case probably nie:hasLogicalPart would be better). No need for a generic 'multipart' class.
  4. to pvanhoof and others: could you please point us to a more detailed description of what you want to achieve with this? Why is it a good idea to use an RDF store for the primary storage of emails instead of a usual mbox file, or an embedded IMAP server backed by that mbox file? And if so, what application really needs the full MIME structure instead of the simplified one (multipart/alternative as a single object, the first part of multipart/mixed being 'the' email, everything else an attachment)? This was not the intention of NIE at the beginning. I'm not saying that it's a bad idea, just that this is important for me, because changing the representation of emails will break aperture and all downstream applications, which will mean quite some work for me and the respective maintainers. I just want some justification when I defend myself against the outrage that will surely ensue.

That being said, if this change is really important for you and there are killer applications being held back by this, I'll go for the second approach, without the need to introduce an artificial 'content' node for a single-part plaintext email:

Email is a subclass of MimePart (or entity)

Single-part plain-text email (approach 2)

user:Email
  # we can attach many types to a single URI
  a nmo:Email, nfo:PlainTextDocument; 
  # since it's a PlainTextDocument, we can
  # discard plainTextMessageContent and use
  # plainTextContent instead
  nie:plainTextContent "whatever" .

Multipart/alternative (plain-text and HTML versions of the same email)

user:Email
  # email is a part, but not a Multipart, we need to
  # add the nmo:Multipart explicitly
  a nmo:Email, nmo:Multipart;
  nie:hasPart user:EmailTextContent, user:EmailHtmlContent.

user:EmailTextContent
  a nmo:MimePart, nfo:PlainTextDocument.

user:EmailHtmlContent
  a nmo:MimePart, nfo:HtmlDocument.

Aperture, however, will not represent an email like this; it will discard the HTML part and present such an email as a single plain-text one like this:

user:Email
  a nmo:Email, nfo:PlainTextDocument .

A multipart/mixed could be then represented with

user:Email
  a nmo:Email, nmo:Multipart;
  nie:hasPart user:EmailTextContent, user:PdfAttachment.

user:EmailTextContent a nmo:MimePart, nfo:PlainTextDocument.
user:PdfAttachment a nmo:MimePart, nfo:PdfDocument .

Though Aperture will probably represent it like this (the first part of a multipart/mixed is promoted to 'be' the email itself, only remaining parts are reported as attachments):

user:Email
  a nmo:Email, nmo:Multipart, nfo:PlainTextDocument;
  nie:hasPart user:PdfAttachment.

user:PdfAttachment a nmo:MimePart, nfo:PdfDocument .

You could also make the hasAttachment a subproperty of hasPart (which is the case actually), and everyone should be happy.

This was a long thread and it's already 4 am. Could someone bake an exact proposal for change: what classes/properties should be added, what exactly should be changed? It might be easier to discuss when we can edit a concrete proposal on a wiki before doing the commit. I'll try to review it all once more sometime in the near future.

  Changed 3 years ago by pvanhoof

1: I do

RE 2: That's wrong. In MIME what is an attachment and what isn't an attachment can be distinguished by using the content-disposition property of it. Your implementation leads to inconsistencies and is *very* wrong.

RE 3. Please take a look at how NIE handles archives. The NMO proposal is modeled after this. That's because an E-mail is an archive.

RE 4. The RDF store is not the primary store for e-mail. The RDF store is the primary store for E-mail's metadata. Which is a legitimate use for it. You are misrepresenting the use-case by claiming that the RDF store is becoming a primary store for E-mail, which has never been said, which isn't the case and this NMO proposal isn't trying to turn NMO into an ontology suitable for this purpose.

I again want to point to how archives are done in NIE/NFO and then ask you why, if this isn't the intention of NIE, it was done the same way for archives. I would also like to point out and I'll repeat (because this is important) that an E-mail is a kind of archive.

An E-mail by itself has, just like a .zip file, no meaning for a user other than transportation uses (it keeps the relevant things together).

What has meaning for the user? That's the message, that's the message inside of a forward, that's the attachments, that's the attachments inside of a forward. Those things do have meaning for the user.

Best way to compare an "E-mail" in the real world is "an envelope" that contains pieces of "paper" (the mime parts) that contain a message (the contents of the mime parts). Attached to those pieces of paper you can have printed photos (a JPEG attachment) or other materials that can be attached.

Everybody who sees an E-mail as a file, is wrong and doesn't get the nature of E-mail. Stop looking at how *your* broken E-mail client represents *your* E-mails, and start looking at the use-cases. Think out of the box.

ps. And yes there's a really important killer application being held back by this. I refer to it as E-mail as a desktop service. Which will basically remove the need for "E-mail clients" and make each and every application a potential consumer of E-mail metadata content. This is being developed by multiple developers at this moment, by the way.

  Changed 3 years ago by pvanhoof

Also note that the proposal doesn't require big changes if you don't want to store full representation of MIME in the RDF store. We're ourselves not doing it for our current support for NMO through a so-called Push plugin driven by KMail and Evolution's processes (those processes push changes and e-mail metadata into our RDF store):

http://git.gnome.org/cgit/tracker/tree/src/plugins/evolution/tracker-evolution-registrar.c

In fact, we didn't have to make any changes at all and yet the RDF triples that get stored were still compatible with this proposal.

It's when you want to store a richer metadata experience that you have to start caring about the parts inside of an E-mail.

The only change that *will* affect you, but this change has not been agreed upon and this proposal ain't proposing to change it yet, is the nmo:hasAttachment changes. Phreedom's reasoning is that with this proposal nmo:hasAttachment becomes redundant.

My vote is on keeping nmo:hasAttachment but then fixing it (it has several very big problems, which me and Phreedom have discussed already on IRC).

Just don't do the nmo:hasContent and the nmo:MimePart stuff? It'll be compatible, but yeah, it'll be better for the consumers of your RDF data if you do it (of course).

  Changed 3 years ago by pvanhoof

To give an example where a MimePart is an "image/png", is part of a root E-mail, and yet isn't an attachment:

Embedded photos like a "bullet.png" to glorify a bulleted list have as header Content-Disposition: INLINE (it doesn't matter how much you hate such HTML E-mails, by the way. That dislike ain't the point).

Those MimeParts are NOT attachments. Meanwhile the HTML E-mail sender might have actually attached photos that he took with his camera. Those will, if the writer's E-mail client gets it right (and nowadays most do), have Content-Disposition: ATTACHMENT. Those ARE attachments.

So the E-mail will contain:

Multipart {
  Multipart as Alternative {
    MimePart { text/plain}
    MimePart { text/html }
  }
  MimePart { bullet.png, Content-Disposition: INLINE }
  MimePart { DSC0001.jpeg, Content-Disposition: ATTACHMENT }
  MimePart { DSC0002.jpeg, Content-Disposition: ATTACHMENT }
  MimePart { DSC0003.jpeg, Content-Disposition: ATTACHMENT }
}

Semantically this means that the E-mail has THREE attachments (and NOT EIGHT, nor FOUR but only THREE). So this means that nmo:hasAttachment has to be used ONLY THREE times.
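The count can be checked mechanically. Below is a minimal Python sketch (standard library `email` only) that rebuilds the structure above with placeholder file contents and counts the parts whose disposition type is `attachment`. The byte strings are dummies and the message is hypothetical; only the filenames and dispositions are taken from the example.

```python
from email.message import EmailMessage

# Rebuild the example: text/HTML alternatives, one inline image, three attached photos.
msg = EmailMessage()
msg.set_content("plain text body")
msg.add_alternative("<ul><li>html body</li></ul>", subtype="html")
msg.add_attachment(b"\x89PNG", maintype="image", subtype="png",
                   filename="bullet.png", disposition="inline")
for name in ("DSC0001.jpeg", "DSC0002.jpeg", "DSC0003.jpeg"):
    msg.add_attachment(b"\xff\xd8", maintype="image", subtype="jpeg", filename=name)

# walk() visits every MIME entity, containers included.
all_entities = list(msg.walk())

# Only parts whose Content-Disposition type is "attachment" count as attachments.
attachments = [part.get_filename() for part in all_entities
               if part.get_content_disposition() == "attachment"]

print(len(all_entities))
print(attachments)
```

Walking the tree visits eight entities in total, but only the three JPEG parts carry the `attachment` disposition; bullet.png has a filename and is still not an attachment.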

This is also possible:

Multipart {
  Multipart as Alternative {
    MimePart { text/plain}
    MimePart { text/html }
  }
  MimePart { DSC0004.jpeg, Content-Disposition: ATTACHMENT }
  rfc822 {
    Multipart {
      Multipart as Alternative {
        MimePart { text/plain}
        MimePart { text/html }
      }
      MimePart { bullet.png, Content-Disposition: INLINE }
      MimePart { DSC0001.jpeg, Content-Disposition: ATTACHMENT }
      MimePart { DSC0002.jpeg, Content-Disposition: ATTACHMENT }
      MimePart { DSC0003.jpeg, Content-Disposition: ATTACHMENT }
    }
  }
}

This last example has only ONE attachment AND it has a forward that has THREE attachments.
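A sketch of how a consumer could keep the two levels apart, again in Python with the standard library `email` package. The helpers `photo_mail`, `own_parts` and `attachment_names` are hypothetical names introduced for illustration; the inline bullet.png is left out of the forwarded message for brevity, and the forward is given an `inline` disposition here as an assumption, since the example does not state one.

```python
from email.message import EmailMessage

def photo_mail(names):
    """Build a hypothetical mail: text+HTML body plus one JPEG attachment per name."""
    mail = EmailMessage()
    mail.set_content("plain text body")
    mail.add_alternative("<p>html body</p>", subtype="html")
    for name in names:
        mail.add_attachment(b"\xff\xd8", maintype="image", subtype="jpeg",
                            filename=name)
    return mail

def own_parts(message):
    """Yield the MIME entities below `message` without entering forwarded messages."""
    if not message.is_multipart():
        return
    for child in message.get_payload():
        yield child
        # A message/rfc822 part is a border: its subparts belong to the forward.
        if child.get_content_type() != "message/rfc822":
            yield from own_parts(child)

def attachment_names(message):
    """Filenames of the parts of `message` itself that are marked as attachments."""
    return [p.get_filename() for p in own_parts(message)
            if p.get_content_disposition() == "attachment"]

forwarded = photo_mail(["DSC0001.jpeg", "DSC0002.jpeg", "DSC0003.jpeg"])
outer = photo_mail(["DSC0004.jpeg"])
outer.add_attachment(forwarded, disposition="inline")  # becomes a message/rfc822 part

forwards = [p.get_payload(0) for p in own_parts(outer)
            if p.get_content_type() == "message/rfc822"]

print(attachment_names(outer))
print([attachment_names(f) for f in forwards])
```

The outer E-mail reports only DSC0004.jpeg; the three other photos are found only by asking the forwarded message itself.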

The current NMO can't cope with either of the two use-cases that I gave. And shockingly, those two use-cases are the two most common kinds of E-mails that you receive (in your INBOX, not in your mailinglists, because mailinglist E-mails are shockingly not representative and shockingly simple compared to most other real-life E-mails).

  Changed 3 years ago by pvanhoof

Note that on IMAP bodystructure does give you the content-disposition. I have explained how IMAP's bodystructure works, and how you can use it, in a super high resolution here (but you have to read it, of course):

http://live.gnome.org/Tracker/Documentation/EmailSparql

You can find an even higher resolution that talks about the technical and engineering aspects of communicating with an IMAP server in order to get bodystructure in a most efficient way here (again you have to read it, of course):

http://pvanhoof.be/files/email-metadata-0-0-4.pdf.bz2

And I wrote a little parser for bodystructure that I have explicitly stripped from any license for everybody to use:

http://svn.tinymail.org/svn/tinymail/trunk/libtinymail-camel/camel-lite/bs/bodystruct.c

Unlike what mylka claims, and I'll quote him: "It's impossible to distinguish automatically what is an attachment from POV of the user or not", it actually IS possible to distinguish automatically, using content-disposition, whether a part is an attachment or isn't an attachment.

My examples above should illustrate this already. But to be perfectly sure that everybody reading this, if anybody, which I'm starting to doubt, is VERY WELL aware of the content-disposition MIME header:

http://www.ietf.org/rfc/rfc2183.txt

This is not an attachment:

2.1  The Inline Disposition Type

   A bodypart should be marked `inline' if it is intended to be
   displayed automatically upon display of the message.  Inline
   bodyparts should be presented in the order in which they occur,
   subject to the normal semantics of multipart messages.

This is an attachment:

2.2  The Attachment Disposition Type

   Bodyparts can be designated `attachment' to indicate that they are
   separate from the main body of the mail message, and that their
   display should not be automatic, but contingent upon some further
   action of the user.  The MUA might instead present the user of a
   bitmap terminal with an iconic representation of the attachments, or,
   on character terminals, with a list of attachments from which the
   user could select for viewing or storage.

This does NOT make a part an attachment. Notice the "may want to suggest".

2.3  The Filename Parameter

   The sender may want to suggest a filename to be used if the entity is
   detached and stored in a separate file.

This is an attachment when the value is 'attachment' (just like above).

A forwarded e-mail is a child in a multipart. The child has a content-disposition set. This can be either 'inline' or 'attachment'. ONLY when it's 'attachment', is the forwarded E-mail an attachment. If the value is 'inline' it isn't an attachment. Not semantically, not realistically, not pragmatically, not imaginary, no nothing.

2.9  Content-Disposition and Multipart

   If a Content-Disposition header is used on a multipart body part, it
   applies to the multipart as a whole, not the individual subparts.
   The disposition types of the subparts do not need to be consulted
   until the multipart itself is presented.  When the multipart is
   displayed, then the dispositions of the subparts should be respected.

   If the `inline' disposition is used, the multipart should be
   displayed as normal; however, an `attachment' subpart should require
   action from the user to display.

   If the `attachment' disposition is used, presentation of the
   multipart should not proceed without explicit user action.  Once the
   user has chosen to display the multipart, the individual subpart
   dispositions should be consulted to determine how to present the
   subparts.

As you can see, the RFC specification is NOT ambiguous. What IS true is that VERY FEW people ever read this stuff, and they then come up with the explanation "that it's impossible to know". But this isn't accurate. Of course.
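To make the rule concrete, here is a minimal Python sketch of the check, using only the standard library `email` package. `part_with` is a hypothetical helper that creates an empty MIME part carrying a given Content-Disposition header; the header values are made up for illustration.

```python
from email.message import EmailMessage

def is_attachment(part):
    """True only when the disposition type itself is 'attachment' (RFC 2183, 2.2)."""
    return part.get_content_disposition() == "attachment"

def part_with(disposition=None):
    """Hypothetical helper: an empty MIME part with the given Content-Disposition."""
    part = EmailMessage()
    if disposition is not None:
        part["Content-Disposition"] = disposition
    return part

# The disposition type decides; a filename parameter alone does not (RFC 2183, 2.3).
print(is_attachment(part_with('attachment; filename="DSC0001.jpeg"')))
print(is_attachment(part_with('INLINE; filename="bullet.png"')))
print(is_attachment(part_with()))
```

The disposition type is compared case-insensitively, so `ATTACHMENT` and `attachment` are treated alike, and a part with no Content-Disposition header at all is not reported as an attachment.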

I hope I made this very clear now.

  Changed 3 years ago by mylka

Content-Disposition is the answer indeed. I should have researched the issue better.

As for the structure. I can only repeat what I said. I don't see the need for a separate 'content' node and the 'hasContent' property. I'd be for the 2nd approach. So far there is no concrete proposal (I mean written in 2KB of turtle, not 500KB of English) for the exact way the NMO would look with the 2nd approach. My concern ATM is making sure I understand exactly what consequences this would have for Aperture users. A turtle file would make it easier for me (or a link to one, in case I overlooked it).

As for the use cases there are two issues:

  • simplified vs. full mime tree - little has been said about things that can only be made possible by storing the full mime tree instead of a simplified one (other than that email clients that show emails in this way are "broken")
  • google shows your usage scenario (Email as a service) only on your own blog post (link). Is there anything else? It's interesting; what's the current implementation status, is there interest from end-user app developers? I'd like to know more, so please post some more links.

  Changed 3 years ago by mylka

  • cc mylka added

added myself to cc

  Changed 3 years ago by pvanhoof

RE: The "content" property got added to keep the separation between IE and DO in place. Otherwise you'd mix IE and DO, and that is against the philosophy of Nepomuk in general. You could also experiment with nie:isStoredAs instead, of course. Check out ticket #46 for more information on that.

Note that I'm not going to make Turtle files for you if it just means removing a property and changing another property. Do that yourself, it's too easy to just take the existing Turtle files and perform that trivial change. You can download an NMO Turtle in raw format here btw:

http://git.gnome.org/cgit/tracker/tree/data/ontologies/34-nmo.ontology

  Changed 3 years ago by m4db0b

Let's try to recap this discussion and check a solution...

About the "mail as a service" mentioned by pvanhoof: this approach seems to be at least as old as Evolution Data Server, Akonadi project is going in the same direction, and I'm aware about a pair of other projects planning that architecture (Lobotomy and Itsme. Disclosure: I'm involved in both). So I agree with a more fine-grained handling of mail messages, in perspective of future evolutions.

pvanhoof and/or phreedom: please fulfil mylka's request and submit a complete formal proposal that can be easily evaluated, updated with the latest considerations.

  Changed 2 years ago by phreedom_

I'm splitting the discussion into smaller and more specific issues:

Related ticket: #56. This is a generalization of alternative representations of email(as both message and mime tree).

Related ticket: #53. Attachment vs embedding.

  Changed 2 years ago by m4db0b

  • status changed from assigned to closed
  • resolution set to wontfix

Closed with no action due to lack of agreement and the previous split of the issue into more detailed tickets (#53 and #56).
