|
From: seanh <sea...@gm...> - 2009-07-17 17:37:19
|
Hi,

I believe that PyBlosxom (and blosxoms generally) have a fairly serious issue with permalinks.

Correct me if I'm wrong, but PyBlosxom provides two ways to link directly to a particular post: by category or by date. One or the other of these is what's usually referred to as a "permalink" in PyBlosxom blogs. But neither of these things is permanent at all. Re-categorise a post, which doesn't seem unusual or unreasonable, and that permalink is broken. Edit a post and, since PyBlosxom updates the date, that permalink is broken.

This is terrible, right? Broken links everywhere.

As a file-system based weblog engine the file path seems like the natural thing to use as a unique identifier. But I feel that this is probably a mistake in the same way that it's a mistake to use the mtime as the date of a post even if it seems like the natural thing to do. There are several plugins that do dates in a different way, but I don't see any to fix permalinks.

Solutions?

Never re-categorise a post. You could use tags instead, and can change those without breaking the permalinks. I think additional tools are necessary to make tags manageable though. PyBlosxom isn't very good at tags because it's filesystem-based and the filesystem doesn't do tags. Without a lot of custom scripting you aren't going to get tag auto-completion and suggestion, splitting, joining, renaming and deleting tags, etc.

Jekyll [1] has a neat way of doing it. The user can specify, in the configuration file, what the format of permalinks should be based on various metadata about the post: date, title, and categories.

[1]: http://wiki.github.com/mojombo/jekyll/permalinks

Use unchanging "published" dates. PyBlosxom can already do this. In combination with one of the plugins that prevents the date from being modified when you edit a post, that ought to fix it. I think PyBlosxom could do with a plugin that lets a post have both a published date and a modified date. The original published date never changes and can be used as a permalink; the modified date changes and can be used to inform readers of when the post was last modified.

WordPress seems to put original publication dates in permalinks also, followed by a user-configurable "slug", e.g. 2009/05/23/hello-world. "hello-world" is not the title of the post (or the categories or tags), it's a slug that's there purely for the purpose of the permalink. Change the slug and (in theory) you've broken the permalink, although in fact if you visit an old slug WordPress seems to be able to redirect you to the new one. Still, even without the redirect, this enables you to edit, rename and re-categorise a post; the only thing you mustn't change is the slug.

PyBlosxom could do slugs using a #slug metadata line in each post file. A slight problem is that you're expecting the user to come up with unique slugs, but you have no way of telling them, when they're just typing the slug in their text editor, that it's not unique. Another problem is that when a visitor requests a slug PyBlosxom would have to inspect the contents of every post file, so clearly some sort of index or cache is going to be needed.
|
|
From: seanh <sea...@gm...> - 2009-07-17 17:50:45
|
On Fri, Jul 17, 2009 at 6:36 PM, seanh<sea...@gm...> wrote:
>
> PyBlosxom could do slugs using a #slug metadata line in each post
> file. A slight problem is that you're expecting the user to come up
> with unique slugs, but you have no way of telling them, when they're
> just typing the slug in their text editor, that it's not unique.
> Another problem is that when a visitor requests a slug PyBlosxom would
> have to inspect the contents of every post file, so clearly some sort
> of index or cache is going to be needed.
>

It occurred to me after writing this that the filename (minus extension) is equivalent to a slug so you wouldn't need to add a metadata line. Now to imitate wordpress permalinks we'd have to be able to combine pyblosxom's date-based URLs with filenames (without categories) to form URLs like: /2009/Jul/17/filename.
|
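A minimal sketch of the date-plus-filename URL scheme described above, written in modern Python rather than the Python 2 that PyBlosxom used at the time. The function name `date_permalink` and the `MONTHS` table are hypothetical and not part of PyBlosxom; the publication date is assumed to come from wherever the blog stores a fixed "published" date.

```python
import os

# English month abbreviations, hard-coded so the URL does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def date_permalink(entry_path, published):
    """Build a /YYYY/Mon/DD/filename URL for an entry.

    entry_path: path of the entry file; its category directories are ignored.
    published:  (year, month, day, ...) tuple or time.struct_time.
    """
    year, month, day = published[0], published[1], published[2]
    # The filename minus its extension plays the role of the slug.
    slug = os.path.splitext(os.path.basename(entry_path))[0]
    return "/%04d/%s/%02d/%s" % (year, MONTHS[month - 1], day, slug)


print(date_permalink("tech/python/hello-world.txt", (2009, 7, 17)))
```

Because the category directories are dropped, moving the file between categories leaves the URL unchanged; two entries with the same filename published on the same day would still collide.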
|
From: Steve H. <sho...@gm...> - 2009-07-17 18:02:43
|
On Fri, Jul 17, 2009 at 12:50 PM, seanh<sea...@gm...> wrote:
>
> It occurred to me after writing this that the filename (minus
> extension) is equivalent to a slug so you wouldn't need to add a
> metadata line. Now to imitate wordpress permalinks we'd have to be
> able to combine pyblosxom's date-based URLs with filenames (without
> categories) to form URLs like: /2009/Jul/17/filename.

What I'd really like to do is organize my files in folders like: /2009/Jul/17/filename. That would give me true permalinks and prevent the duplicate slug problem. But doing this interferes with PyBlosxom's date based archives, so it's not possible right now.

Steve
|
|
From: will <wi...@bl...> - 2009-07-17 19:42:57
|
Steve Hoelzer wrote:
> On Fri, Jul 17, 2009 at 12:50 PM, seanh<sea...@gm...> wrote:
>> It occurred to me after writing this that the filename (minus
>> extension) is equivalent to a slug so you wouldn't need to add a
>> metadata line. Now to imitate wordpress permalinks we'd have to be
>> able to combine pyblosxom's date-based URLs with filenames (without
>> categories) to form URLs like: /2009/Jul/17/filename.
>
> What I'd really like to do is organize my files in folders like:
> /2009/Jul/17/filename. That would give me true permalinks and prevent
> the duplicate slug problem. But doing this interferes with PyBlosxom's
> date based archives, so it's not possible right now.

Are you sure about that? I'm pretty sure we solved that issue a few versions ago.
|
|
From: will <wi...@bl...> - 2009-07-17 19:50:17
|
seanh wrote:
>
> I believe that PyBlosxom (and blosxoms generally) have a fairly
> serious issue with permalinks.
>
> Correct me if I'm wrong, but PyBlosxom provides two ways to link
> directly to a particular post: by category or by date. One or the
> other of these is what's usually referred to as a "permalink" in
> PyBlosxom blogs. But neither of these things is permanent at all.
> Re-categorise a post, which doesn't seem unusual or unreasonable, and
> that permalink is broken. Edit a post and, since PyBlosxom updates the
> date, that permalink is broken.
>
> This is terrible, right? Broken links everywhere.
>

PyBlosxom lets you use whatever permalink format you want to use. The format you want to use depends on how you do your blog. If categories are fixed, then you can use a category based permalink. If mtimes are fixed, use that. If you want some third thing, you can write a plugin to do it however you want.

PyBlosxom has no internal notion of a permalink. It's just a url in the story template. It can be composed however you want or even not included if that suits your fancy. Blosxom is the same way.

So... it sounds like you see this as an architectural flaw, but I see this as a freeing sort of thing since it allows different people with different needs to implement whatever works for them.

/will
|
|
From: Steve H. <sho...@gm...> - 2009-07-17 21:22:48
|
On Fri, Jul 17, 2009 at 2:42 PM, will<wi...@bl...> wrote:
> Steve Hoelzer wrote:
>>
>> What I'd really like to do is organize my files in folders like:
>> /2009/Jul/17/filename. That would give me true permalinks and prevent
>> the duplicate slug problem. But doing this interferes with PyBlosxom's
>> date based archives, so it's not possible right now.
>
> Are you sure about that? I'm pretty sure we solved that issue a few
> versions ago.

No, I'm not sure. I tried this a long time ago with PyBlosxom 1.4.1 (I think) and it didn't work then. I'm still on 1.4.1 now but I'll upgrade if this issue is fixed. If I get some time I'll experiment and report back.

Steve
|
|
From: seanh <sea...@gm...> - 2009-07-20 13:11:44
|
On Fri, Jul 17, 2009 at 8:50 PM, will<wi...@bl...> wrote:
> seanh wrote:
> PyBlosxom lets you use whatever permalink format you want to use. The
> format you want to use depends on how you do your blog. If categories are
> fixed, then you can use a category based permalink. If mtimes are fixed,
> use that. If you want some third thing, you can write a plugin to do it
> however you want.

Actually, can you even link to a particular post by date? I think you can link to a day, month or year, but not a post. So it looks like it's either categories as permalinks or writing a plugin to do it another way.

I'm not sure how best to write a plugin for this. Let's say we want some sort of permalink IDs that are independent of either category or date. We could use the filename or a #slug metadata line. Either way I see two difficulties:

1. the user could potentially create non-unique IDs, and
2. I think the plugin would need to build an index of the IDs in order to be able to respond to permalink requests quickly enough.

One way it could work is similar to the rdate utility. Before publishing content to your blog you run a 'slug' script that adds unique, short #slug metadata lines to any of your posts that don't already have them. These could be numeric or based on the post title or filename (with numbers added for uniqueness if necessary). The script also creates an index of slugs. A pyblosxom plugin then uses this index to handle requests for posts via their slugs.

I dunno, feels inelegant. Perhaps keeping categories fixed is better.
|
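The 'slug' script idea above could be sketched roughly as follows. This is only an illustration in modern Python: `slugify` and `build_slug_index` are hypothetical names, slugs are derived from filenames (one of the options mentioned), and the sketch only builds the in-memory index; writing #slug lines back into the entry files and saving the index to disk are left out.

```python
import os
import re


def slugify(text):
    """Lower-case text and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-") or "post"


def build_slug_index(entry_paths, existing=None):
    """Return {slug: path}, keeping existing slugs and adding unique new ones.

    entry_paths: iterable of entry file paths relative to the entries dir.
    existing:    a previously built {slug: path} index, if any.
    """
    index = dict(existing or {})
    already_indexed = set(index.values())
    for path in sorted(entry_paths):
        if path in already_indexed:
            continue  # this post already has a slug, so never change it
        base = slugify(os.path.splitext(os.path.basename(path))[0])
        slug, n = base, 2
        while slug in index:
            # Append a number until the slug is unique.
            slug = "%s-%d" % (base, n)
            n += 1
        index[slug] = path
    return index


index = build_slug_index(["tech/hello.txt", "life/hello.txt"])
for slug in sorted(index):
    print(slug, "->", index[slug])
```

A plugin would then only need to load the saved index and look up the requested slug, instead of opening every entry file on each request.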
|
From: seanh <sea...@gm...> - 2009-07-20 13:26:16
|
On Mon, Jul 20, 2009 at 2:11 PM, seanh<sea...@gm...> wrote:
>
> I dunno, feels inelegant. Perhaps keeping categories fixed is better.
>

I noticed that in wordpress if you change the unique slug of a post and then visit the old slug you get re-directed to the new one. I wonder if you could do something like that in pyblosxom? Use categories+filename as permalinks but if you move a post leave a file behind with an HTML redirect in it.

It could get messy. If you moved a whole category you might have to leave a whole lot of HTML redirects, and if you moved the same post more than once you would get a chain of redirects. If you just move the odd post one time I guess it would work.
|
|
From: will <wi...@bl...> - 2009-07-20 13:51:19
|
seanh wrote:
> On Mon, Jul 20, 2009 at 2:11 PM, seanh<sea...@gm...> wrote:
>> I dunno, feels inelegant. Perhaps keeping categories fixed is better.
>
> I noticed that in wordpress if you change the unique slug of a post
> and then visit the old slug you get re-directed to the new one. I
> wonder if you could do something like that in pyblosxom? Use
> categories+filename as permalinks but if you move a post leave a file
> behind with an HTML redirect in it.

Off the top of my head, I think there are a few ways you could solve the moving categories problem:

First is to use a version control system for keeping track of entries in your blog. I think there are a couple out there already that use the checkin time for the mtime for a file. If the version control system knows about file moves, then you could write a plugin that checks the version control system and provides the redirects.

Second is to write a shell script that moves a post from one category to another and make sure you only use that shell script for moving posts. It'd move the post but also add an entry to some redirect-index file. Then write a plugin to check that file for redirects and provide them as needed.

You mentioned the problem where moving a file a few times provides multiple redirects, but I think you could squash multiple moves into one move in the plugin without having to do multiple redirects.

Alternatively, I think it's probably easier in the long run to give each entry a unique identifier and use the unique identifiers as the permalink. Then you have a cron job that compiles all the unique identifiers into an index file that's read by a plugin whenever a permalink slug is requested.

Maybe hashing the original category/filename/title is a good unique identifier? So long as you store it in the metadata of the entry and it stays the same, it probably doesn't matter what the original composing criteria are.

/will
|
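The "squash multiple moves into one move" step could look something like the sketch below. It assumes the redirect-index file has already been parsed into a list of (old_path, new_path) pairs in the order the moves happened; that format and the function name `squash_redirects` are assumptions made for illustration, not an existing PyBlosxom feature.

```python
def squash_redirects(moves):
    """Collapse a history of moves into direct old -> final redirects.

    moves: list of (old_path, new_path) pairs, oldest first.
    Returns {old_path: final_path}, so no lookup needs more than one hop.
    """
    final = {}
    for old, new in moves:
        # Every earlier redirect that pointed at `old` now points at `new`.
        for src, dst in list(final.items()):
            if dst == old:
                final[src] = new
        final[old] = new
    # A post moved back to where it started needs no redirect from itself.
    return {src: dst for src, dst in final.items() if src != dst}


history = [("tech/post", "python/post"), ("python/post", "code/post")]
print(squash_redirects(history))
```

With this, a request for either of the old locations can be answered with a single redirect to the post's current location.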
|
From: seanh <sea...@gm...> - 2009-08-08 21:23:08
|
On Mon, Jul 20, 2009 at 2:51 PM, will<wi...@bl...> wrote:
> seanh wrote:
> Off the top of my head, I think there are a few ways you could solve the
> moving categories problem:
>
> First is to use a version control system for keeping track of entries in
> your blog. I think there are a couple out there already that use the
> checkin time for the mtime for a file. If the version control system knows
> about file moves, then you could write a plugin that checks the version
> control system and provide the redirects.

I actually use git to track the files in my blog already, although I don't integrate pyblosxom with git at all. What you suggest wouldn't work though: git doesn't always know when you move a file. If you both edit and move a file in the same commit and forget to use git mv, then often git records it as deleting the file and adding a new one. Ok, you could just not do this, it's bad behaviour, but I've been using VCS's to track my personal notes for years and I always do this sort of thing, so I know I can't rely on myself.

I've had a change of heart about this whole issue after reading an article by Cory Doctorow about how he used tags to index a large collection of notes:

http://www.locusmag.com/Perspectives/2009/05/cory-doctorow-extreme-geek.html (see 2. Research: Twitter meets notekeeping)

I used to prefer pyblosxom categories over the tags plugin because they seemed more core to the pyblosxom way of doing things (use the filesystem). But now I think paths should be used as permalinks only: paths are what make sense as permalinks in pyblosxom and conversely permalinks are really all that paths are good for. As a means of organising and browsing your posts paths are really brittle and inflexible: permalinks break, a post can only be in one category at a time, and the category views only show the first n posts in the category anyway (although I think this got fixed for 1.5).
I've realised that tags are much better for categorising because they're so lightweight and flexible: the existing tag plugin works well, tag pages show all posts with the tag, they don't break permalinks when you change them, and a post can have multiple tags. Tags, if you pay attention to them and use them carefully, let you build up an index of your site that facilitates a directed yet serendipitous browsing (to paraphrase Cory).

But the problem with tags in pyblosxom (I've used them in the past) is that they can easily become a mess: you end up with multiple different tags for the same thing, often just slightly different spellings or something, posts with no tags, and lots of tags you barely use. So I think pyblosxom requires an additional utility to let you manage your tags with care.

I've already written this script but never really put it to work. I'll have to dust it off to revive the tags for my notes, so I'll try to post the source when I do. It's a nearly trivial command-line program in python that walks your entries dir and can:

* Print out a list of all your tags and the no. of posts each tag has, sorted by popularity or alphanumerically.
* Print out all the posts that have a particular tag or tags (you could potentially allow AND, OR and NOT etc. but I don't know how useful that would be)
* Print out all the tags that a particular post has
* Rename a tag
* Delete a tag
* Split one tag into two
* Join two tags into one

All of these operations require reading all of the files in your entries dir, and the last few require modifying the #tags lines of lots of files (e.g. when I say rename a tag, I mean rename that tag in every entry file that it occurs in).

This script is a standalone module, not technically a pyblosxom plugin, and it's independent of the tags plugin, but it occurs to me that the two could probably share much of their code.
|
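A rough sketch of two of the operations listed above (counting tags and renaming a tag), in modern Python. This is not seanh's script, which was not posted in this thread. It assumes each entry carries a single "#tags a,b,c" metadata line with comma-separated tags, and it works on entry text held in memory so that walking the entries dir and writing files back can be added separately. All function names are hypothetical.

```python
from collections import Counter

TAGS_PREFIX = "#tags "  # assumed metadata line format: "#tags a,b,c"


def read_tags(entry_text):
    """Return the tags on the first #tags line of an entry, or [] if none."""
    for line in entry_text.splitlines():
        if line.startswith(TAGS_PREFIX):
            raw = line[len(TAGS_PREFIX):].split(",")
            return [tag.strip() for tag in raw if tag.strip()]
    return []


def count_tags(entries):
    """entries: {path: entry_text}. Return a Counter of tag -> number of posts."""
    counts = Counter()
    for text in entries.values():
        counts.update(set(read_tags(text)))
    return counts


def rename_tag(entry_text, old, new):
    """Return entry_text with tag `old` renamed to `new` on its #tags line.

    Renaming onto a tag the post already has merges the two, which also
    covers the "join two tags into one" operation.
    """
    out, replaced = [], False
    for line in entry_text.splitlines(keepends=True):
        if not replaced and line.startswith(TAGS_PREFIX):
            tags = []
            for tag in read_tags(line):
                tag = new if tag == old else tag
                if tag not in tags:
                    tags.append(tag)
            ending = line[len(line.rstrip("\r\n")):]  # keep the line ending
            line = TAGS_PREFIX + ",".join(tags) + ending
            replaced = True
        out.append(line)
    return "".join(out)


entries = {
    "a.txt": "First post\n#tags python, blog\nBody text\n",
    "b.txt": "Second post\n#tags blog\nMore text\n",
}
print(count_tags(entries).most_common())
print(rename_tag(entries["a.txt"], "blog", "weblog"))
```

Deleting or splitting a tag would follow the same pattern: rewrite the tag list on the #tags line of every entry that contains the tag.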
|
From: Steve H. <sho...@gm...> - 2009-09-09 21:14:45
|
On Fri, Jul 17, 2009 at 4:22 PM, Steve Hoelzer<sho...@gm...> wrote:
> On Fri, Jul 17, 2009 at 2:42 PM, will<wi...@bl...> wrote:
>> Steve Hoelzer wrote:
>>>
>>> What I'd really like to do is organize my files in folders like:
>>> /2009/Jul/17/filename. That would give me true permalinks and prevent
>>> the duplicate slug problem. But doing this interferes with PyBlosxom's
>>> date based archives, so it's not possible right now.
>>
>> Are you sure about that? I'm pretty sure we solved that issue a few
>> versions ago.
>
> No, I'm not sure. I tried this a long time ago with PyBlosxom 1.4.1 (I
> think) and it didn't work then. I'm still on 1.4.1 now but I'll
> upgrade if this issue is fixed. If I get some time I'll experiment and
> report back.

I finally gave this a try and it does not work for me using the current trunk (r1333). I had a very simple test case with three entries in /2009/09/, py["num_entries"] = 2, and date archives with month numbers turned on.

$ python pyblosxom-cmd staticrender
pyblosxom-cmd version 1.5 dev
Trying to import the config module....
Performing static rendering.
rendering 3 entries.
rendering 3 category indexes.
rendering 3 date indexes.
rendering 0 arbitrary urls.
(before) building 9 files.
building 9 files.
rendering '/2009/09/firstpost.html' ...
rendering '/2009/09/secondpost.html' ...
rendering '/2009/09/thirdpost.html' ...
rendering '/index.html' ...
rendering '/2009/index.html' ...
rendering '/2009/09/index.html' ...
rendering '/2009/index.html' ...
rendering '/2009/09/index.html' ...
rendering '/2009/09/09/index.html' ...

All of the index.html files contain the 2nd and 3rd post but not the first. Date based archives should not be limited by py["num_entries"].

It also seems wasteful to render some pages twice (ex: /2009/index.html). It would be nice to detect the duplicates and get rid of them.

Steve
|
|
From: will <wi...@bl...> - 2009-09-09 21:23:24
|
Steve Hoelzer wrote:
> On Fri, Jul 17, 2009 at 4:22 PM, Steve Hoelzer<sho...@gm...> wrote:
>> On Fri, Jul 17, 2009 at 2:42 PM, will<wi...@bl...> wrote:
>>> Steve Hoelzer wrote:
>>>> What I'd really like to do is organize my files in folders like:
>>>> /2009/Jul/17/filename. That would give me true permalinks and prevent
>>>> the duplicate slug problem. But doing this interferes with PyBlosxom's
>>>> date based archives, so it's not possible right now.
>>> Are you sure about that? I'm pretty sure we solved that issue a few
>>> versions ago.
>> No, I'm not sure. I tried this a long time ago with PyBlosxom 1.4.1 (I
>> think) and it didn't work then. I'm still on 1.4.1 now but I'll
>> upgrade if this issue is fixed. If I get some time I'll experiment and
>> report back.
>
> I finally gave this a try and it does not work for me using the
> current trunk (r1333). I had a very simple test case with three
> entries in /2009/09/, py["num_entries"] = 2, and date archives with
> month numbers turned on.

Can you show the relevant parts of your config.py file? I don't understand how you have things set up. Alternatively, tar up your example and send it to me so that I can reproduce what you're seeing.

> It also seems wasteful to render some pages twice (ex:
> /2009/index.html). It would be nice to detect the duplicates and get
> rid of them.

If you can implement this, feel free to do so and send in a patch.

/will
|
|
From: Steve H. <sho...@gm...> - 2009-09-10 04:19:50
|
On Wed, Sep 9, 2009 at 4:23 PM, will<wi...@bl...> wrote:
> Steve Hoelzer wrote:
>>
>> On Fri, Jul 17, 2009 at 4:22 PM, Steve Hoelzer<sho...@gm...> wrote:
>>>
>>> On Fri, Jul 17, 2009 at 2:42 PM, will<wi...@bl...> wrote:
>>>>
>>>> Steve Hoelzer wrote:
>>>>>
>>>>> What I'd really like to do is organize my files in folders like:
>>>>> /2009/Jul/17/filename. That would give me true permalinks and prevent
>>>>> the duplicate slug problem. But doing this interferes with PyBlosxom's
>>>>> date based archives, so it's not possible right now.
>>>>
>>>> Are you sure about that? I'm pretty sure we solved that issue a few
>>>> versions ago.
>>>
>>> No, I'm not sure. I tried this a long time ago with PyBlosxom 1.4.1 (I
>>> think) and it didn't work then. I'm still on 1.4.1 now but I'll
>>> upgrade if this issue is fixed. If I get some time I'll experiment and
>>> report back.
>>
>> I finally gave this a try and it does not work for me using the
>> current trunk (r1333). I had a very simple test case with three
>> entries in /2009/09/, py["num_entries"] = 2, and date archives with
>> month numbers turned on.
>
> Can you show the relevant parts of your config.py file? I don't understand
> how you have things set up. Alternatively, tar up your example and send it
> to me so that I can reproduce what you're seeing.

I had three entries:

/2009/09/firstpost.txt
/2009/09/secondpost.txt
/2009/09/thirdpost.txt

And minor changes in config.py:

py["num_entries"] = 2
py["static_monthnames"] = 0
py["static_monthnumbers"] = 1

Let me know if you need more info.

Steve
|
|
From: will <wi...@bl...> - 2009-09-10 13:56:17
|
Steve Hoelzer wrote:
>
> I had three entries:
>
> /2009/09/firstpost.txt
> /2009/09/secondpost.txt
> /2009/09/thirdpost.txt
>
> And minor changes in config.py:
>
> py["num_entries"] = 2
> py["static_monthnames"] = 0
> py["static_monthnumbers"] = 1
>
> Let me know if you need more info.
Seems like you don't have truncate_date set. Add this:
py["truncate_date"] = 1
/will
|
|
From: Steve H. <sho...@gm...> - 2009-09-10 14:03:33
|
On Wed, Sep 9, 2009 at 4:23 PM, will <wi...@bl...> wrote:
> Steve Hoelzer wrote:
>> It also seems wasteful to render some pages twice (ex:
>> /2009/index.html). It would be nice to detect the duplicates and get
>> rid of them.
>
> If you can implement this, feel free to do so and send in a patch.
I'm not sure if you care about the order that pages are rendered. If
not, here's a simple solution. At the end of the "walk" function,
instead of this:
return _walk_internal(root, recurse, pattern, ignorere, return_folders)
Use sorted(set(...)) like this:
return sorted(set(_walk_internal(root, recurse, pattern, ignorere,
return_folders)))
Or make it explicit:
result = _walk_internal(root, recurse, pattern, ignorere, return_folders)
result = sorted(set(result)) # remove duplicates from list
return result
Steve
|
|
From: Steve H. <sho...@gm...> - 2009-09-10 14:11:24
|
On Thu, Sep 10, 2009 at 8:55 AM, will <wi...@bl...> wrote:
> Steve Hoelzer wrote:
>>
>> I had three entries:
>>
>> /2009/09/firstpost.txt
>> /2009/09/secondpost.txt
>> /2009/09/thirdpost.txt
>>
>> And minor changes in config.py:
>>
>> py["num_entries"] = 2
>> py["static_monthnames"] = 0
>> py["static_monthnumbers"] = 1
>>
>> Let me know if you need more info.
>
> Seems like you don't have truncate_date set. Add this:
>
> py["truncate_date"] = 1

I didn't know about that. It's not in the example config.py.

Setting truncate_date to 1 or 0 didn't seem to have any effect for me. /2009/09/index.html is rendered with the 2nd and 3rd entries only. I expected to see all 3.

Steve
|
|
From: Steve H. <sho...@gm...> - 2009-09-10 15:08:13
|
On Thu, Sep 10, 2009 at 9:03 AM, Steve Hoelzer <sho...@gm...> wrote:
> On Wed, Sep 9, 2009 at 4:23 PM, will <wi...@bl...> wrote:
>> Steve Hoelzer wrote:
>>> It also seems wasteful to render some pages twice (ex:
>>> /2009/index.html). It would be nice to detect the duplicates and get
>>> rid of them.
>>
>> If you can implement this, feel free to do so and send in a patch.
>
> I'm not sure if you care about the order that pages are rendered. If
> not, here's a simple solution. At the end of the "walk" function,
> instead of this:
>
> return _walk_internal(root, recurse, pattern, ignorere, return_folders)
>
> Use sorted(set(...)) like this:
>
> return sorted(set(_walk_internal(root, recurse, pattern, ignorere,
> return_folders)))
Umm, nope. This is wrong. That only applies to the categories and
doesn't check if date archives result in an identical path.
The real fix is to use sorted(set(...)) on 'renderme' near the end of
runstaticrenderer(). The middle lines are new:
print "building %s files." % len(renderme)
# date archives and categories may give identical paths, so remove duplicates
renderme = sorted(set(renderme))
for url, q in renderme:
Steve
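To illustrate what sorted(set(...)) does to a list like 'renderme', here is a small standalone sketch in modern Python. The (url, query) pairs are made up to mirror the staticrender output earlier in the thread, and the query part is assumed to be a plain string. The second function, `dedupe_keep_order`, is a hypothetical alternative for the case where rendering order does matter; it is not part of the fix that was checked in.

```python
def dedupe_sorted(renderme):
    """The approach from the fix: drop duplicates and accept sorted order."""
    return sorted(set(renderme))


def dedupe_keep_order(renderme):
    """Alternative: drop duplicates but keep the first-seen order."""
    seen = set()
    result = []
    for item in renderme:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Illustrative data: category and date indexes that resolve to the same URLs.
renderme = [
    ("/index.html", ""),
    ("/2009/index.html", ""),
    ("/2009/09/index.html", ""),
    ("/2009/index.html", ""),     # duplicate from the date archive
    ("/2009/09/index.html", ""),  # duplicate from the date archive
    ("/2009/09/09/index.html", ""),
]

print("building %s files." % len(dedupe_sorted(renderme)))
for url, q in dedupe_keep_order(renderme):
    print("rendering %r ..." % url)
```

Both versions rely on the items being hashable, which holds for tuples of strings.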
|
|
From: will <wi...@bl...> - 2009-09-10 23:01:36
|
Steve Hoelzer wrote:
>
> Umm, nope. This is wrong. That only applies to the categories and
> doesn't check if date archives result in an identical path.
>
> The real fix is to use sorted(set(...)) on 'renderme' near the end of
> runstaticrenderer(). The middle lines are new:
>
> print "building %s files." % len(renderme)
>
> # date archives and categories may give identical paths, so remove duplicates
> renderme = sorted(set(renderme))
>
> for url, q in renderme:

Nice! I tested it out and it looks good to me. Checked in as r1337.

Thank you!

/will
|
|
From: will <wi...@bl...> - 2009-09-10 15:17:23
|
Steve Hoelzer wrote:
>
> I didn't know about that. It's not in the example config.py.

It's new. I thought I wrote an email to pyblosxom-devel to see if the change satisfied the needs, but I can't find it. Regardless, it hasn't been documented yet except in the commit comments.

> Setting truncate_date to 1 or 0 didn't seem to have any effect for me.
> /2009/09/index.html is rendered with the 2nd and 3rd entries only. I
> expected to see all 3.

Bah... It's a bug. I fixed it, but I screwed up my repository somehow, so I have to fix that before I can check it in. I'll try to get to it today.

/will
|
|
From: will <wi...@bl...> - 2009-09-10 18:54:59
|
will wrote:
> Steve Hoelzer wrote:
>> Setting truncate_date to 1 or 0 didn't seem to have any effect for me.
>> /2009/09/index.html is rendered with the 2nd and 3rd entries only. I
>> expected to see all 3.
>
> Bah... It's a bug. I fixed it, but I screwed up my repository somehow,
> so I have to fix that before I can check it in. I'll try to get to it
> today.

It's checked in now as r1334.

To summarize, there are three truncate_* config variables which will affect how truncation works:

truncate_date - whether or not date-based archives are truncated to num_entries.

truncate_category - whether or not category-based archives are truncated to num_entries.

truncate_frontpage - whether or not the front page is truncated to num_entries.

I added the last one figuring someone's going to ask for it some day and I needed to add a check for the situation anyhow.

I haven't updated the documentation, yet.

/will
|
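For reference, a config.py fragment using the three variables summarized above might look like this. The variable names come from the message itself; the values, and the reading that 1 means "truncate" and 0 means "show everything", are assumptions based on the descriptions, since the defaults were not yet documented at the time.

```python
py = {}  # in a real config.py this dict already exists

py["num_entries"] = 2

# Assumed meaning: 1 = limit the view to num_entries, 0 = show all entries.
py["truncate_frontpage"] = 1  # front page shows only the newest num_entries
py["truncate_category"] = 1   # category archives are limited as well
py["truncate_date"] = 0       # date archives show every entry for the period
```

Under that reading, this would give Steve's expected result of all three entries appearing in /2009/09/index.html while the front page stays at two.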