Fragmented messages

  • Enrique Marcote Peña

    I'm adding support for message fragmentation to OSERL. We are all probably splitting messages on our own, so I think it might save us some work if OSERL provides some kind of support for it.

    Some utilities for splitting and concatenating messages will be included, but there are a couple of design decisions I'd like to hear your comments about.

    a) Shall we split and join messages auto-magically? 

    If so, every time a short_message larger than ?SM_MAX_SIZE is submitted, OSERL will split it and send it fragmented into several submissions.
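
    As an illustration of the splitting step with the UDH method, here is a minimal sketch. The module and function names (frag_sketch, split/3) are hypothetical and not part of OSERL, and the message is assumed to be a binary. Each segment is prefixed with the 6-octet concatenation header (UDHL 5, IEI 0, IEDL 3, reference, total, sequence number), so only MaxSize - 6 octets of text fit in each one. The submitting code would also have to set the UDHI bit (16#40) in esm_class, which is not shown.

    ```erlang
    -module(frag_sketch).
    -export([split/3]).

    %% Hypothetical helper, not part of OSERL.
    %% Splits Msg into segments of at most MaxSize octets each, UDH included.
    %% Ref is the 8-bit concatenation reference shared by all segments.
    split(Msg, _Ref, MaxSize) when byte_size(Msg) =< MaxSize ->
        [Msg];                                 % fits in one PDU, no UDH needed
    split(Msg, Ref, MaxSize) when MaxSize > 6 ->
        Parts = chunk(Msg, MaxSize - 6),       % 6 octets are taken by the UDH
        Total = length(Parts),
        true = Total =< 255,                   % segment count must fit one octet
        [<<5, 0, 3, Ref, Total, Seq, Part/binary>>
         || {Seq, Part} <- lists:zip(lists:seq(1, Total), Parts)].

    %% Cuts a binary into pieces of at most Size octets.
    chunk(Bin, Size) when byte_size(Bin) =< Size ->
        [Bin];
    chunk(Bin, Size) ->
        <<Head:Size/binary, Rest/binary>> = Bin,
        [Head | chunk(Rest, Size)].
    ```

    With MaxSize = 140 (an assumed value for ?SM_MAX_SIZE), a 300-octet message would go out as three submissions carrying 134, 134 and 32 octets of text.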

    On the other peer, whenever a fragmented message arrives, OSERL will wait for all fragments, join them and forward the concatenated message up. Notice that only one PDU will be forwarded to the callback module, although several are received at the session layer. Notice also that the resulting PDU (after concatenation) does not correspond to any PDU received from the other peer. It will be assembled in our session and will contain the same parameters as the first PDU of the sequence (or the last one), but with the concatenation of all segments as its short_message parameter.
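
    The receiving side could look roughly like the sketch below, which assumes UDH concatenation with an 8-bit reference and keeps the pending parts in a map. The names (reasm_sketch, add/2) are hypothetical. A real session would also have to key on the source and destination addresses, decide which PDU parameters to keep, and check the UDHI bit of esm_class instead of guessing from the first octets.

    ```erlang
    -module(reasm_sketch).
    -export([add/2]).

    %% Hypothetical helper, not part of OSERL.
    %% Pending maps {Ref, Total} to the parts received so far, keyed by Seq.
    %% Returns {complete, Msg, Pending1} once every part of a message is in,
    %% otherwise {incomplete, Pending1}.
    add(<<5, 0, 3, Ref, Total, Seq, Part/binary>>, Pending)
      when Seq >= 1, Seq =< Total ->
        Key = {Ref, Total},
        Parts = maps:put(Seq, Part, maps:get(Key, Pending, #{})),
        case maps:size(Parts) of
            Total ->
                %% Join the parts in sequence order, whatever order they came in.
                Msg = iolist_to_binary(
                        [maps:get(N, Parts) || N <- lists:seq(1, Total)]),
                {complete, Msg, maps:remove(Key, Pending)};
            _ ->
                {incomplete, maps:put(Key, Parts, Pending)}
        end;
    add(Msg, Pending) ->
        %% No valid concatenation header: forward the message unchanged.
        {complete, Msg, Pending}.
    ```

    Only the {complete, ...} result would be forwarded to the callback module; the incomplete case just updates the session state.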

    By using this approach, fragmentation will be transparent for higher layers...Sounds good, but might be undesirable for some people... can you think of any case where undesired?

    If we decide to automate concatenation, the fragmentation method also needs to be selected. In most cases UDH will be used, but people connected to an SMSC supporting SAR TLVs might prefer those. Maybe we could set the concatenation method as a session config option. Something like:

    Option = {concatenation, Method}
    Method = udh | tlv | none

    Default value = udh

    sm_max_size could be another option (Default = ?SM_MAX_SIZE)
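
    A sketch of how a session might read these two options, falling back to the proposed defaults. opts_sketch and concat_opts/1 are made-up names, and the value of ?SM_MAX_SIZE here is an assumption for the example only.

    ```erlang
    -module(opts_sketch).
    -export([concat_opts/1]).

    -define(SM_MAX_SIZE, 140).   % assumed value, for illustration only

    %% Hypothetical helper: reads the two proposed options from a session
    %% option list and crashes early on values outside the proposed set.
    concat_opts(Opts) ->
        Method = proplists:get_value(concatenation, Opts, udh),
        MaxSize = proplists:get_value(sm_max_size, Opts, ?SM_MAX_SIZE),
        true = lists:member(Method, [udh, tlv, none]),
        true = is_integer(MaxSize) andalso MaxSize > 0,
        {Method, MaxSize}.
    ```

    With tlv selected, each segment would carry the sar_msg_ref_num, sar_total_segments and sar_segment_seqnum optional parameters instead of a UDH.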

    On the other hand, one might prefer to explicitly call a different submission function for splitting the messages (see submit_sm/3 and submit_multi/3 in the latest gen_esme.erl in CVS).

    b) Do fragmentation at the session layer or at the ESME/SMSC layer? The first one seems better, but which response shall be returned? In response to a gen_esme:submit_sm/2 call, the result {ok, ResponsePdu} | {error, CommandStatus} is returned. What if gen_esme:submit_sm/2 implies several submit_sm operations? At the session layer several submit_sm_resp PDUs are received. We'd probably need to return a list of results from the functions submit_sm, submit_multi, data_sm and deliver_sm. Do you find it messy?

    Suggestions are welcome.  Thank you,


    • Anders Nygren

      Anders Nygren - 2005-07-20

      This is a tricky subject. I have spent quite some time thinking about this for a project I will start soon.

      What I need is, (only ESME at this time)
      - send messages longer than 160 bytes
      - I don't expect to receive segmented messages, so I haven't considered that case.
      - segmentation of messages using UDH
      - registered_delivery for all messages
      - scheduled_delivery for all messages
      - on the application level I want to receive ONE delivery receipt for the complete long message.
      - but for charging purposes I also need to know how many segments were sent, and the delivery status for each segment.
      - cancel messages with cancel_sm.

      So what I would like is an interface between my application and gen_esme that lets me
      - submit_sm with a message >160 bytes

      - the submit_sm should return something like {GlobalMessageId,SegmentMessageIds} where
      GlobalMessageId = A possibly local message id used for the "long message"
      SegmentMessageIds = a list of message_ids returned by the SMSC for the segments of the "long message"

      - cancel a "long message" using cancel_sm(GlobalMessageId); gen_esme, or gen_esme_session, then automagically issues cancel_sm for each segment
      - if some segments have already been delivered maybe the cancel_sm should do nothing, I am not sure about this right now.

      - when a delivery_receipt is received gen_esme, or gen_esme_session, collects the delivery receipts for all segments in the long message and does one delivery_receipt callback to the application. This callback should give the {GlobalMessageId,FinalDeliveryStatus,FinalDeliveryTime,SegmentDeliveryStatusList}
      FinalDeliveryStatus is delivered if all segments were delivered, and failed if one or more segments failed.
      FinalDeliveryTime is the delivery time for the last segment to be delivered.
      SegmentDeliveryStatusList is a list of {SegmentMessageId,SegmentDeliveryStatus,SegmentDeliveryTime}
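
      The aggregation rule for the callback can be sketched as follows. The function name and the delivered | failed atoms are assumptions, delivery times are assumed to be comparable terms such as calendar:datetime() tuples, and the list is assumed to be non-empty.

      ```erlang
      -module(receipt_sketch).
      -export([final/2]).

      %% Hypothetical helper. Segments is the list described above:
      %% [{SegmentMessageId, SegmentDeliveryStatus, SegmentDeliveryTime}],
      %% with one entry per segment once all receipts are in.
      final(GlobalMessageId, Segments) ->
          AllDelivered =
              lists:all(fun({_, S, _}) -> S =:= delivered end, Segments),
          FinalStatus = case AllDelivered of
                            true  -> delivered;
                            false -> failed   % one or more segments failed
                        end,
          %% The latest segment time is the time of the whole message.
          FinalTime = lists:max([T || {_, _, T} <- Segments]),
          {GlobalMessageId, FinalStatus, FinalTime, Segments}.
      ```

      The per-segment list is passed through unchanged, so the application still has the segment count and individual statuses for charging.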

      Since this means that there has to be a table mapping GlobalMessageId to SegmentMessageIds, there has to be a cleanup function that removes old mappings in case some delivery_receipts are lost. When a mapping is removed, a delivery_receipt callback should be made indicating which segments have unknown delivery status.
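
      One possible shape for that cleanup, as a sketch only: the table layout, the unknown atom and the expire/3 name are assumptions, and times are plain seconds.

      ```erlang
      -module(cleanup_sketch).
      -export([expire/3]).

      %% Hypothetical helper. Table maps GlobalMessageId to
      %% {SubmitTime, [{SegmentMessageId, Status}]}, where Status is
      %% delivered | failed | unknown.
      %% Returns {Expired, Table1}. Expired lists, for each removed mapping,
      %% the segments whose receipt never arrived.
      expire(Table, Now, MaxAge) ->
          Old = maps:filter(fun(_, {T, _}) -> Now - T > MaxAge end, Table),
          Expired = [{Id, [Seg || {Seg, unknown} <- Segs]}
                     || {Id, {_, Segs}} <- maps:to_list(Old)],
          {Expired, maps:without(maps:keys(Old), Table)}.
      ```

      Each entry of Expired would then trigger one delivery_receipt callback reporting the unknown segments.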

      So to answer your question
      "By using this approach, fragmentation will be transparent for higher layers...Sounds good, but might be undesirable for some people... can you think of any case where undesired? "

      For me it would not be practical if segmentation is completely hidden. I think that it will also be impractical for applications using query_sm, cancel_sm and replace_sm.
      (btw, I suspect that cancel_sm and replace_sm can be difficult to use with segmented messages. What happens if one or more segments have already been delivered when the cancel_sm or replace_sm is issued?)

      How to select which segmentation method to use, UDH or TLV?
      I would prefer to have a configuration parameter, rather than a submit_sm parameter.

      Where to do the segmentation? ESME/SMSC or session layer?
      I think it is better to do it at the gen_esme/gen_smsc level. That would keep the gen_*_session clean, only doing the SMPP protocol.


    • Anders Nygren

      Anders Nygren - 2005-07-21

      One more comment, regarding the code in CVS for segmenting messages.

      In gen_esme:submit_sm/3 and gen_esme:submit_multi/3 the message has to be in short_message. I think it would be better to require that long messages are sent in message_payload, and to have a gen_esme configuration parameter that specifies which method to use for sending long messages: UDH | TLV | message_payload. That way the application does not have to know which method the SMSC supports.


    • Enrique Marcote Peña

      I see one drawback with that. You will always need to check whether the messages you are sending are longer than 160 or not. Most SMSCs do not support message_payload, so you will be putting the data into message_payload only if the message is long, and otherwise using short_message, since that is the only parameter the SMSC supports.
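
      The check in question might look like this sketch. The parameter list representation and the body_param/2 name are hypothetical, not OSERL's actual PDU format. When message_payload carries the text, short_message is left empty, since SMPP does not allow both to be used at once.

      ```erlang
      -module(payload_sketch).
      -export([body_param/2]).

      %% Hypothetical helper: picks the PDU parameter for the message body.
      %% Short messages stay in short_message; longer ones go to
      %% message_payload, which the SMSC must support for this to work.
      body_param(Msg, MaxSize) when byte_size(Msg) =< MaxSize ->
          [{short_message, Msg}];
      body_param(Msg, _MaxSize) ->
          [{short_message, <<>>}, {message_payload, Msg}].
      ```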

      I definitely like setting the concatenation method as a setup option (and sm_max_size), but I'm not sure about always using message_payload.

    • Anders Nygren

      Anders Nygren - 2005-07-22

      Automatic reassembly on the SMSC side is not a good idea, for several reasons.

      1, Since long messages are sent segmented from the SMSC to the MSE they have to be split again.
      2, If registered_delivery is used it will be impossible for the SMSC to generate correct delivery_receipts.
      3, cancel_sm, query_sm and replace_sm will be more difficult or even impossible.

      Automatic reassembly on the ESME side could be useful, but may lead to some tricky error cases.

      Consider what should be done if the connection is lost, or the system restarts, before all segments have been received. To cope with that it will be necessary to store all segments on disk as they are received, so that the message can be reassembled once the system is up again and the remaining segments arrive.
      Another case is when one or more segments are never received.


