#24 Sanitize Path/File Names

Chuck Rhode

CDDB data is raw and can contain any text, including text that is inappropriate for names in most file systems.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=567780 at least complains about filename handling. There may be more here and there, and I would have filed my complaint as a bug except that it's not your fault. In fact I can't put a decent scope around my problem, but here it is anyway.

It is a good thing that *ripperx* allows users to modify proposed path (directory) and file names. That works when users realize what is appropriate as a file name in the destination environment, but this is not always the case. I believe -- Oh, Lordy! I believe -- that there is a lowest common denominator for file names that would work in most modern directory schemes: 32 ASCII lowercase and uppercase characters with hyphens, spaces, and some other punctuation, but I could be wrong. Limits may actually be much looser: 64 UTF-8 bytes. I don't know, and I don't want to experiment -- hence my problem defining scope. However, there are characters that pose problems in most schemes: probably colon, comma, question mark, and linefeed, but probably not ampersand and dollar sign. It would be nice for *ripperx* to provide an automated way to convert them all to ... something ... maybe a hyphen.
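To make the conversion idea concrete, here is a minimal sketch in Python (not *ripperx*'s own code). The function name `sanitize_name` and the constant `PROBLEM_CHARS` are hypothetical, and the character set is just the four characters guessed at above, so treat it as an illustration rather than a vetted list.

```python
import re

# Assumption: the characters named above as likely trouble
# (colon, comma, question mark, linefeed). Not a verified list.
PROBLEM_CHARS = ":,?\n"


def sanitize_name(name: str, replacement: str = "-") -> str:
    """Replace each run of problem characters with one replacement string."""
    pattern = "[" + re.escape(PROBLEM_CHARS) + "]+"
    return re.sub(pattern, replacement, name)


if __name__ == "__main__":
    print(sanitize_name("Symphony No. 5: Allegro, con brio?"))
```

Ampersand and dollar sign pass through untouched, matching the guess that they are probably harmless.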

The itch I am trying to scratch is this: I'm converting semi-classical CDs to *.mp3s and transferring them to an Android phone. Many track names use "foreign" characters and library-reference-type punctuation and have long lists of artist attributions.

Some Linux utility (that *ripperx* calls) balls up on long file names, and *ripperx* (or this other process) then can't delete the *.wav file. I get around this by manually shortening file names.
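The manual shortening could in principle be automated. The following Python sketch is only an illustration: `clip_name` is a hypothetical helper, and the 64-byte default is the loose guess from earlier in this request, not a known limit of any utility.

```python
import os


def clip_name(filename: str, max_bytes: int = 64) -> str:
    """Shorten filename so its UTF-8 form fits in max_bytes, keeping the extension."""
    stem, ext = os.path.splitext(filename)
    budget = max_bytes - len(ext.encode("utf-8"))
    if budget <= 0:
        raise ValueError("max_bytes is too small for the extension")
    # Cut the encoded bytes, then drop any character that was split in half.
    clipped = stem.encode("utf-8")[:budget].decode("utf-8", errors="ignore")
    # Avoid leaving a dangling space in front of the extension.
    return clipped.rstrip() + ext


if __name__ == "__main__":
    print(clip_name("a" * 100 + ".wav"))
```

Note that `os.path.splitext` treats whatever follows the last dot as the extension, so this assumes names already end in something like `.mp3` or `.wav`.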

I'm able to move files whose names contain colons to my Android phone and see them there, but after that I'm not able to freely add files whose names DON'T contain colons! It's strange; the Android directory is messed up in some way. Reformatting the SD card is to no avail, but deleting the directory containing the bad file names, removing colons from all file names on the Linux side, and transferring them to a new directory on the Android side fixes the problem. How about that? Again I can't say whether or not the scope of this issue ought to encompass Linux, Android, USB support, or other MP3 players.

Suffice it to say I wish *ripperx* could optionally reduce path/filenames to arbitrarily low common denominators. Give me a button that would do that. Give me two -- one to clip filenames and one to transform "special" characters, or give me three for eliding characters at various "danger levels."
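As a rough picture of the "danger levels" idea, here is a Python sketch. Everything in it is an assumption for illustration: the `LEVELS` table, the function `elide`, and especially which characters sit at which level. Level 2 uses the punctuation that FAT-family file systems do not allow in names; the other two levels are guesses.

```python
import re

# Hypothetical levels; the grouping of characters is a guess, not a standard.
LEVELS = {
    1: r"[/\x00-\x1f]",              # path separator and control characters
    2: r'[/\x00-\x1f:?*"<>|\\]',     # plus punctuation FAT-family file systems reject
    3: r"[^A-Za-z0-9 ._-]",          # anything outside a small ASCII set
}


def elide(name: str, level: int, replacement: str = "-") -> str:
    """Replace every character that is 'dangerous' at the given level."""
    return re.sub(LEVELS[level], replacement, name)


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, elide("Dvořák: Symphony No. 9", level))
```

Each button would map to one level, so a user could pick the mildest setting that the destination device accepts.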

Gee, thanks for your attention. -ccr-