Here's a first-draft patch. Things that need fixing:
* The generated RSS feed needs to be validated. (It passed the
W3C's RDF validator, but RSS validators still need to be checked.)
* The date should be given in YYYY-MM-DD format, which requires
parsing the .fromdate attribute.
* How do I get the URL for an archived message? The generated RSS
currently just uses the filename, which is wrong. How do I get
at the PUBLIC_ARCHIVE_URL setting?
* Getting the most recent N postings is inefficient; the code loops through all of the archived messages and takes the last N of them.
We could add .last() and .prev() methods to the Database class, but that's more ambitious for 2.1beta than I like. (Would be nice to get this into 2.1final...)
* The list index page should have a LINK element pointing to
the RSS file.
Please make any comments you have, and I'll rework the patch accordingly.
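For reference, the "walk everything, keep the last N" approach described above can be sketched in a few lines. This is only an illustration, not the code in the patch: `last_n` is a hypothetical helper name, and the plain iterable stands in for a forward-only traversal of the archive's date index.

```python
from collections import deque


def last_n(entries, n):
    """Return the final n items of an iterable that can only be read forwards.

    A deque with maxlen=n discards older items as newer ones arrive, so
    memory stays bounded even though every entry is still visited once.
    """
    if n <= 0:
        return []
    return list(deque(entries, maxlen=n))


# Stand-in for message ids yielded in increasing date order (made-up data).
message_ids = ["msg%03d" % i for i in range(1, 11)]
recent = last_n(message_ids, 3)
```

This still costs one pass over the whole index; only .last()/.prev() style methods on the database would avoid that.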
Generate an RSS summary for lists
Logged In: YES
user_id=11375
Argh; SF choked on the file upload. Attaching the patch again...
Logged In: YES
user_id=12800
Deferring until post-2.1
Logged In: YES
user_id=147905
Just voting for support here. This is *great*; thanks for
the patch, and I hope the maintainers include it as soon as
it's appropriate :)
Adam.
Logged In: YES
user_id=11375
Updated patch:
* Dates are now rendered as ISO-8601 (date only, not the time of the message)
* By hard-wiring 2002-December, I got the RSS to validate using Mark Pilgrim's validator.
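A minimal sketch of the date conversion, under a loudly stated assumption: it treats the article's .fromdate value as a ctime-style string such as "Mon Dec  2 10:15:30 2002". If the attribute holds something else, the format string needs to change. `iso_date_from_ctime` is a hypothetical name, not a function in the patch.

```python
import time


def iso_date_from_ctime(fromdate):
    """Convert a ctime-style date string to an ISO-8601 date (YYYY-MM-DD).

    Assumption: fromdate looks like "Mon Dec  2 10:15:30 2002".
    The time of day is parsed but deliberately dropped from the result.
    """
    parsed = time.strptime(fromdate, "%a %b %d %H:%M:%S %Y")
    return time.strftime("%Y-%m-%d", parsed)


example = iso_date_from_ctime("Mon Dec  2 10:15:30 2002")
```

Note that %a and %b are locale-dependent, so this assumes English day and month names.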
Updated patch
Logged In: YES
user_id=38966
I'd like to add my vote to this item. This is a fantastic
idea, Andrew. Thanks.
--Uche
Logged In: YES
user_id=935
Big thumbs up from me too. Much better solution than
http://taint.org/mmrss/ ;)
Logged In: YES
user_id=12800
Andrew, to get the url for the archived message use
mlist.GetBaseArchiveURL(), which knows about private vs.
public archives, the host name and the list name. From
there you should be able to tack on just the part of the
path under "archives/private/listname". See
Mailman/Handlers/Scrubber.py for an example.
Only other minor comment: NUM_ARTICLES can probably go in
Defaults.py.in
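Roughly, the URL assembly could look like the sketch below. It is an illustration only: `article_url` is a hypothetical helper, `base_url` stands for whatever mlist.GetBaseArchiveURL() returns, and the example host, list name and filename are made up.

```python
def article_url(base_url, archive, filename):
    """Join an archive base URL, an archive period and an article filename.

    base_url -- assumed to be the value of mlist.GetBaseArchiveURL()
    archive  -- the period directory, e.g. "2002-December"
    filename -- the article's HTML file within that directory
    Stray slashes are trimmed so the pieces are joined by exactly one "/".
    """
    parts = [base_url.rstrip("/"), archive.strip("/"), filename.lstrip("/")]
    return "/".join(parts)


# Made-up values, for illustration only.
url = article_url("http://lists.example.org/pipermail/mylist/",
                  "2002-December", "000123.html")
```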
Logged In: YES
user_id=7830
Does anyone have a patch to remove the hardwiring of
"2002-December" and get the appropriate date from Mailman
somehow?
Logged In: YES
user_id=7830
I thought I'd have a look at this myself, though I have only modest
knowledge of both Python and Mailman.
In the course of trying to patch the patch, I tried running
the archiver over just the last couple of messages, to speed
things along:
"../../bin/arch -s 4390 rdfweb-dev".
Traceback (most recent call last):
  File "../../bin/arch", line 187, in ?
    main()
  File "../../bin/arch", line 177, in main
    archiver.close()
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 310, in close
    self.write_TOC()
  File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 1082, in write_TOC
    rss.write(self.RSS())
  File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 769, in RSS
    date, msgid = self.database.dateIndex.first()
AttributeError: HyperDatabase instance has no attribute 'dateIndex'
Not sure what's going on there, but this seemed as good a
place as any to keep note of it.
Investigating...
Logged In: YES
user_id=7830
OK, I've regenerated the patch with some code which works
for me.
http://rdfweb.org/2003/06/mailman-rss/rsspatch
Health warning:
* I suspect it may fail when get_archives() returns a list rather
than a string (does this ever happen?).
* See also the problem mentioned in my previous comment; regenerating
partial archives seems tricky.
Hope this is useful anyways...
Dan <danbri@w3.org>
Logged In: YES
user_id=11375
Here at last is an updated version of the patch that's crawling closer to being complete. There's now an RSS_NUM_ARTICLES setting in Defaults.py, the generated URLs are correct, and I modified the English template to link to the RSS file.
Remaining things: check the generated RSS for correctness; edit all of the other language templates to include the RSS file (I may ask for CVS write access to do that). It would be really nice if the Mailman upgrade script could update existing general list information pages to include the LINK element; any suggestion about how to go about that?
Logged In: YES
user_id=11375
Attaching correct version of the patch.
July 2003 version of the patch
Logged In: YES
user_id=11375
OK, done!
This patch is now ready to go in: some people have looked at
the RSS and haven't spotted any problems. Barry, can I
please get CVS write access to check this in?
Logged In: YES
user_id=12800
Bumping priority.
Logged In: YES
user_id=863445
So far I have applied the patch (by the way, I hope that
Defaults.py.in in the patch *means* Defaults.py) and
restarted Mailman. I added (correctly, I hope) the two lines to
listinfo.html (under /de/, because we have German-speaking lists)
and looked for the XML file.
After searching the whole machine (just to be sure) I can say:
there is no such file. Is another patch needed first?
Is there some other setup step? I can't find any hint here, so I
have to ask. But the idea is great; if it works on my lists
it's genius.
regards, Michael
running version 2.1.1
Logged In: YES
user_id=670974
I'm trying to enrich the RSS output by adding a proper
[description] and a [content:encoded] module, but I am
having the devil's own time locating the raw message text.
Be happy to contribute a patch if you can point me to the
raw content (without the italics markup for quoting).
Thanks!
Logged In: YES
user_id=75166
The following is based on the July 2003 version of the patch file posted
on sourceforge.
The RSS patch adds the RSS() function as member function of the
HyperArchive class defined in HyperArch.py.
It has been reported that the following statement in RSS():
date, msgid = self.database.dateIndex.first()
may generate an AttributeError exception:
AttributeError: HyperDatabase instance has no attribute 'dateIndex'
The RSS patch appears to make the assumption that whenever the
RSS() function is called from the write_TOC() member function of the
HyperArchive class the __openIndices() function has already been called
on the latest period archive associated with the list, whose TOC page is
being generated by write_TOC(), and that no intervening call to
__closeIndices() has been made.
If the assumption were correct then whenever the RSS() function was
called on a HyperArchive instance, the xxxxxIndices attributes of the
HyperDatabase instance "owned" by the HyperArchive instance would be
pointing to valid instances of DumbBTree.
Unfortunately, this assumption is not correct. In order to do its work,
write_TOC() does not itself need to perform any call to the
__openIndices() function for the list/archive/database whose TOC page is
to be recreated. It just happens that in some circumstances, some of the
code which might call write_TOC may have called the __openIndices()
function at some prior point and left the HyperDatabase instance with a
valid set of xxxxxIndices attributes in place when write_TOC() is called.
For the RSS patch to work reliably, the code in the RSS() function has
to be changed so that it ensures that the conditions it needs prevail when
it executes the statement causing the problem.
The following is an untested code change but if part of the RSS()
function's code definition in HyperArch.py is modified from:
<quote>
# Get the most recent messages. The only index operation
# we can count on is traversal by increasing date, so
# we end up traversing all of the entries and remembering the last
# N of them. Sigh.
items = []
try:
    date, msgid = self.database.dateIndex.first()
    items.append(msgid)
except KeyError:
    pass
while 1:
    try:
</quote>
to read:
<quote>
# Get the most recent messages. The only index operation
# we can count on is traversal by increasing date, so
# we end up traversing all of the entries and remembering the last
# N of them. Sigh.
items = []
got_first = 0
try:
    msgid = self.database.first(self.archives[0], 'date')
    if msgid:
        items.append(msgid)
        got_first = 1
except KeyError:
    pass
while got_first and 1:
    try:
</quote>
this should fix the exception problem.
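To make the intended control flow concrete, here is a self-contained sketch of the guarded traversal. It does not use Mailman itself: `ToyIndex` is a made-up stand-in that assumes the date index offers first() and next() and signals "empty" or "no more entries" by raising KeyError, and `most_recent` is a hypothetical name.

```python
class ToyIndex:
    """Made-up stand-in for a date index; not a Mailman class.

    Assumption: first() and next() return (date, msgid) pairs and raise
    KeyError when there is nothing (more) to return.
    """

    def __init__(self, pairs):
        self._pairs = sorted(pairs)
        self._pos = 0

    def first(self):
        if not self._pairs:
            raise KeyError("empty index")
        self._pos = 0
        return self._pairs[0]

    def next(self):
        if self._pos + 1 >= len(self._pairs):
            raise KeyError("end of index")
        self._pos += 1
        return self._pairs[self._pos]


def most_recent(index, n):
    """Walk the index in date order and return the last n message ids."""
    items = []
    try:
        date, msgid = index.first()
    except KeyError:
        return items  # empty archive: skip the loop entirely
    while True:
        items.append(msgid)
        try:
            date, msgid = index.next()
        except KeyError:
            break
    return items[-n:] if n > 0 else []


newest = most_recent(ToyIndex([(3, "c"), (1, "a"), (2, "b")]), 2)
```

The early return plays the role of the got_first flag: the loop is never entered when the index has no first entry.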
Logged In: YES
user_id=198250
Does anyone have any idea why when I run ~mailman/bin/arch
<listname> it will generate the rss.xml file, but when new
emails come in to said list it doesn't do ANYTHING with the
xml file?
I'd like for the rss feed to update itself every time a new
post comes into a list. Right now it isn't doing that.
Based on your patch, I've developed another patch for Mailman.
It includes the body of the messages as well as the author, published date, link to the pipermail archives, and thread information.
It's part of a bigger project which adds voting capabilities to Mailman: http://votemm.libresoft.es/votemm/