
#192 Generate RSS summary in archives

Milestone: Mailman 2.2 / 3.0
Status: open
Owner: nobody
Labels: Pipermail (42)
Priority: 9
Updated: 2003-09-02
Created: 2002-12-23
Private: No

Here's a first-draft patch. Things that need fixing:

* The generated RSS feed needs to be validated. (It passed the
W3C's RDF validator, but RSS validators still need to be checked.)

* The date should be given in YYYY-MM-DD format, which requires
parsing the .fromdate attribute.

* How do I get the URL for an archived message? The generated RSS
currently just uses the filename, which is wrong. How do I get
at the PUBLIC_ARCHIVE_URL setting?

* Getting the most recent N postings is inefficient; the code loops through all of the archived messages and takes the last N of them.
We could add .last() and .prev() methods to the Database class, but that's more ambitious for 2.1beta than I like. (Would be nice to get this into 2.1final...)

* The list index page should have a LINK element pointing to
the RSS file.
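
A minimal sketch of the "take the last N" step described above, assuming the index can only be walked in increasing date order. It uses `collections.deque`, which is newer than the Python this patch targeted, so it illustrates the idea rather than the patch's own code; the `(date, msgid)` pairs are made up.

```python
from collections import deque

def last_n(entries, n):
    """Return the final n items of any iterable, in original order."""
    # A deque with maxlen discards the oldest item on each append once
    # it is full, so only n items are ever held in memory.
    return list(deque(entries, maxlen=n))

# Hypothetical (date, msgid) pairs, as if read in increasing date order.
index = [(1, "<a@x>"), (2, "<b@x>"), (3, "<c@x>"), (4, "<d@x>")]
recent = last_n(index, 2)
print(recent)
```

This still visits every entry; it only avoids keeping them all around, which is why .last()/.prev() on the Database class would be the real fix.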

Please make any comments you have, and I'll rework the patch accordingly.

Discussion

1 2 > >> (Page 1 of 2)
  • A.M. Kuchling

    A.M. Kuchling - 2002-12-23

    Generate an RSS summary for lists

     
  • A.M. Kuchling

    A.M. Kuchling - 2002-12-23

    Logged In: YES
    user_id=11375

    Argh; SF choked on the file upload. Attaching the patch again...

     
  • Barry Warsaw

    Barry Warsaw - 2002-12-23

    Logged In: YES
    user_id=12800

    Deferring until post-2.1

     
  • Barry Warsaw

    Barry Warsaw - 2002-12-23
    • milestone: --> Mailman 2.2 / 3.0
     
  • captain larry

    captain larry - 2002-12-23

    Logged In: YES
    user_id=147905

    Just voting for support here. This is *great* thanks for
    the patch and I hope the maintainers include it as soon as
    it's appropriate :)

    Adam.

     
  • A.M. Kuchling

    A.M. Kuchling - 2002-12-23

    Logged In: YES
    user_id=11375

    Updated patch:

    * Dates are now rendered as ISO-8601 (date only, not the time of the message)

    * By hard-wiring 2002-December, I got the RSS to validate using Mark Pilgrim's validator.
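
For illustration, a date-only ISO-8601 conversion might look like the sketch below. It assumes the stored date is a ctime-style string such as "Mon Dec 23 14:05:09 2002"; that format is an assumption here, not something confirmed in this thread, and `iso_date` is a hypothetical helper name.

```python
import time

def iso_date(fromdate):
    """Convert a ctime-style date string to YYYY-MM-DD (date only)."""
    # Assumed input layout: weekday, month name, day, time, year.
    parsed = time.strptime(fromdate, "%a %b %d %H:%M:%S %Y")
    return time.strftime("%Y-%m-%d", parsed)

print(iso_date("Mon Dec 23 14:05:09 2002"))
```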

     
  • A.M. Kuchling

    A.M. Kuchling - 2002-12-23

    Updated patch

     
  • Uche Ogbuji

    Uche Ogbuji - 2003-03-18

    Logged In: YES
    user_id=38966

    I'd like to add my vote to this item. This is a fantastic
    idea, Andrew. Thanks.

    --Uche

     
  • Justin Mason

    Justin Mason - 2003-03-26

    Logged In: YES
    user_id=935

    big thumbs up from me too. Much better solution than
    http://taint.org/mmrss/ ;)

     
  • Barry Warsaw

    Barry Warsaw - 2003-03-27
    • priority: 5 --> 7
     
  • Barry Warsaw

    Barry Warsaw - 2003-04-18

    Logged In: YES
    user_id=12800

    Andrew, to get the url for the archived message use
    mlist.GetBaseArchiveURL(), which knows about private vs.
    public archives, the host name and the list name. From
    there you should be able to tack on just the part of the
    path under "archives/private/listname". See
    Mailman/Handlers/Scrubber.py for an example.

    Only other minor comment: NUM_ARTICLES can probably go in
    Defaults.py.in
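
A rough sketch of the URL construction suggested above. In Mailman the base would come from mlist.GetBaseArchiveURL(); here it is a plain string so the example runs on its own, and the host, list name, period, and file name are all hypothetical.

```python
def article_url(base_archive_url, archive, filename):
    """Join the list's base archive URL with an archive-relative path."""
    # Strip any trailing slash so the join never produces "//".
    return "/".join([base_archive_url.rstrip("/"), archive, filename])

# Hypothetical values standing in for what Mailman would supply.
base = "http://lists.example.org/pipermail/mylist"
url = article_url(base, "2002-December", "000123.html")
print(url)
```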

     
  • Dan Brickley

    Dan Brickley - 2003-06-22

    Logged In: YES
    user_id=7830

    Does anyone have a patch to remove the hardwiring of
    "2002-December" and get the appropriate date from mailman
    somehow?

     
  • Dan Brickley

    Dan Brickley - 2003-06-22

    Logged In: YES
    user_id=7830

    I thought I'd have a look at this myself, though I have only
    modest knowledge of both Python and Mailman.

    In the course of trying to patch the patch, I tried running
    the archiver over just the last couple of messages, to speed
    things along:
    "../../bin/arch -s 4390 rdfweb-dev".
    Traceback (most recent call last):
      File "../../bin/arch", line 187, in ?
        main()
      File "../../bin/arch", line 177, in main
        archiver.close()
      File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 310, in close
        self.write_TOC()
      File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 1082, in write_TOC
        rss.write(self.RSS())
      File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 769, in RSS
        date, msgid = self.database.dateIndex.first()
    AttributeError: HyperDatabase instance has no attribute 'dateIndex'

    Not sure what's going on there, but this seemed as good a
    place as any to keep note of it.

    Investigating...

     
  • Dan Brickley

    Dan Brickley - 2003-06-22

    Logged In: YES
    user_id=7830

    OK, I've regenerated the patch with some code which works
    for me.

    http://rdfweb.org/2003/06/mailman-rss/rsspatch

    Health warning:

    * I suspect it may fail in conditions when get_archives()
    returns a list not a string (does this ever happen?).
    * See also problems mentioned below; regenerating partial
    archives seems tricky.

    Hope this is useful anyways...

    Dan <danbri@w3.org>

     
  • A.M. Kuchling

    A.M. Kuchling - 2003-07-11

    Logged In: YES
    user_id=11375

    Here at last is an updated version of the patch that's crawling closer to being complete. There's now an RSS_NUM_ARTICLES setting in Defaults.py, the generated URLs are correct, and I modified the English template to link to the RSS file.

    Remaining things: check the generated RSS for correctness; edit all of the other language templates to include the RSS file (I may ask for CVS write access to do that). It would be really nice if the Mailman upgrade script could update existing general list information pages to include the LINK element; any suggestion about how to go about that?

     
  • A.M. Kuchling

    A.M. Kuchling - 2003-07-11

    Logged In: YES
    user_id=11375

    Attaching correct version of the patch.

     
  • A.M. Kuchling

    A.M. Kuchling - 2003-07-11

    July 2003 version of the patch

     
  • A.M. Kuchling

    A.M. Kuchling - 2003-07-15

    Logged In: YES
    user_id=11375

    OK, done!

    This patch is now ready to go in: some people have looked at
    the RSS and haven't spotted any problems. Barry, can I
    please get CVS write access to check this in?

     
  • Barry Warsaw

    Barry Warsaw - 2003-09-02

    Logged In: YES
    user_id=12800

    Bumping priority.

     
  • Barry Warsaw

    Barry Warsaw - 2003-09-02
    • priority: 7 --> 9
     
  • Michael Weber

    Michael Weber - 2003-09-10

    Logged In: YES
    user_id=863445

    So far I have applied the patch (by the way: I hope that
    Defaults.py.in in the patch *means* Defaults.py) and
    restarted mailman. I added the two lines to listinfo.html
    (under /de/, because we have German-speaking lists) and
    looked for the xml file.
    After searching the whole device (just to be sure) I can say:
    there is no such file. Is another patch needed first?
    Is there another setup step to perform? I can't find any hint
    here, so I have to ask. But the idea is great; if it works on
    my lists, it's genius.
    regards, Michael
    running version 2.1.1

     
  • Roy M. Silvernail

    Logged In: YES
    user_id=670974

    I'm trying to enrich the RSS output by adding a proper
    [description] and a [content:encoded] module, but I am
    having the devil's own time locating the raw message text.
    Be happy to contribute a patch if you can point me to the
    raw content (without the italics markup for quoting).

    Thanks!

     
  • Richard Barrett

    Richard Barrett - 2004-12-06

    Logged In: YES
    user_id=75166

    The following is based on the July 2003 version of the patch file posted
    on sourceforge.

    The RSS patch adds the RSS() function as member function of the
    HyperArchive class defined in HyperArch.py.

    It has been reported that the following statement in RSS():

    date, msgid = self.database.dateIndex.first()

    may generate an AttributeError exception:

    AttributeError: HyperDatabase instance has no attribute 'dateIndex'

    The RSS patch appears to make the assumption that whenever the
    RSS() function is called from the write_TOC() member function of the
    HyperArchive class the __openIndices() function has already been called
    on the latest period archive associated with the list, whose TOC page is
    being generated by write_TOC(), and that no intervening call to
    __closeIndices() has been made.

    If the assumption were correct then whenever the RSS() function was
    called on a HyperArchive instance, the xxxxxIndices attributes of the
    HyperDatabase instance "owned" by the HyperArchive instance would be
    pointing to valid instances of DumbBTree.

    Unfortunately, this assumption is not correct. In order to do its work,
    write_TOC() does not itself need to perform any call to the
    __openIndices() function for the list/archive/database whose TOC page is
    to be recreated. It just happens that in some circumstances, some of the
    code which might call write_TOC may have called the __openIndices()
    function at some prior point and left the HyperDatabase instance with a
    valid set of xxxxxIndices attributes in place when write_TOC() is called.

    For the RSS patch to work reliably the code in the RSS() function has
    to be changed so that it ensures that the conditions it wants prevail when
    it executes the statement giving the problem.

    The following is an untested code change but if part of the RSS()
    function's code definition in HyperArch.py is modified from:

    <quote>
    # Get the most recent messages. The only index operation
    # we can count on is traversal by increasing date, so
    # we end up traversing all of the entries and remembering the last
    # N of them. Sigh.
    items = []
    try:
        date, msgid = self.database.dateIndex.first()
        items.append(msgid)
    except KeyError:
        pass

    while 1:
        try:
    </quote>

    to read:

    <quote>
    # Get the most recent messages. The only index operation
    # we can count on is traversal by increasing date, so
    # we end up traversing all of the entries and remembering the last
    # N of them. Sigh.
    items = []
    got_first = 0
    try:
        msgid = self.database.first(self.archives[0], 'date')
        if msgid:
            items.append(msgid)
            got_first = 1
    except KeyError:
        pass

    while got_first and 1:
        try:
    </quote>

    this should fix the exception problem.
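
To make the intent concrete, here is a self-contained sketch of the traversal using a stand-in database. StubDatabase and recent_msgids are hypothetical names, and the stub assumes that first() and next() take an archive name and an index name and return a message id, or None once the index is exhausted; the real HyperDatabase may behave differently.

```python
class StubDatabase:
    """Stand-in for HyperDatabase; first()/next() return a msgid or None."""

    def __init__(self, msgids):
        self._msgids = list(msgids)
        self._pos = 0

    def first(self, archive, index):
        # The real method is expected to open the indices for `archive`
        # itself, which is the point of going through the database API.
        self._pos = 0
        return self.next(archive, index)

    def next(self, archive, index):
        if self._pos >= len(self._msgids):
            return None
        msgid = self._msgids[self._pos]
        self._pos += 1
        return msgid


def recent_msgids(database, archive, limit):
    """Walk the date index in order and keep the last `limit` ids."""
    items = []
    msgid = database.first(archive, 'date')
    while msgid:
        items.append(msgid)
        msgid = database.next(archive, 'date')
    # items[-0:] would return everything, so guard the zero case.
    return items[-limit:] if limit > 0 else []


db = StubDatabase(["<a@x>", "<b@x>", "<c@x>"])
print(recent_msgids(db, "2002-December", 2))
```

The caller never touches dateIndex directly, so it no longer depends on some earlier code path having opened the indices.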

     
  • Jeff Schoby

    Jeff Schoby - 2006-01-19

    Logged In: YES
    user_id=198250

    Does anyone have any idea why when I run ~mailman/bin/arch
    <listname> it will generate the rss.xml file, but when new
    emails come in to said list it doesn't do ANYTHING with the
    xml file?

    I'd like for the rss feed to update itself every time a new
    post comes into a list. Right now it isn't doing that.

     
  • Cesar Fernandez

    Cesar Fernandez - 2010-06-05

    Based on your patch, I've developed another patch for Mailman.
    It includes the body of the messages as well as the author, published date, link to the pipermail archives, and thread information.
    It's part of a bigger project which adds voting capabilities to Mailman: http://votemm.libresoft.es/votemm/

     