I was wondering what aspect of the scanning for media is the slowest. I would imagine grabbing the metadata is the slowest part. Or maybe extracting the album art.
Well, not quite. Getting the metadata does take some time, but first, there is nothing we can do about it because it is handled by external libraries, and second, it's not the main bottleneck.
The virtual layout has the biggest impact. I do not mean scripting in particular (using JS instead of the builtin layout is only insignificantly slower); the main slowdown happens when adding many items. Because of the layout, one physical file is added 7 or more times in the virtual structure (in Artists, Albums, Genre, etc.). Turn the layout off, or use a JS import script with a simplified layout (similar to what boilerjt posted), and you will see a huge improvement in import speed. Some people reported 1 hour instead of 1 day on NAS devices.
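To make the fan-out concrete, here is a minimal sketch of how one audio file turns into several virtual entries. The struct, the function names and the container names are all made up for illustration; they are not the real layout code or the exact default container names.

```cpp
#include <string>
#include <vector>

// Hypothetical tag set for one audio file (illustration only).
struct AudioTags {
    std::string title, artist, album, genre, year;
};

// Illustrative "full" layout: every returned path is one virtual entry,
// so one physical file costs one insert per path.
std::vector<std::string> fullLayoutPaths(const AudioTags& t) {
    return {
        "/Audio/All Audio/" + t.title,
        "/Audio/All - full name/" + t.artist + " - " + t.album + " - " + t.title,
        "/Audio/Artists/" + t.artist + "/All Songs/" + t.title,
        "/Audio/Artists/" + t.artist + "/" + t.album + "/" + t.title,
        "/Audio/Albums/" + t.album + "/" + t.title,
        "/Audio/Genres/" + t.genre + "/" + t.title,
        "/Audio/Year/" + t.year + "/" + t.title,
    };
}

// Illustrative simplified layout: a single virtual entry per file.
std::vector<std::string> simplifiedLayoutPaths(const AudioTags& t) {
    return {"/Audio/Artists/" + t.artist + "/" + t.album + "/" + t.title};
}
```

With a layout like the first one, a library of N files needs about 7N inserts plus the container lookups that go with them; the simplified one needs N.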
For 0.12 Leo added caching so that the number of requests to the database is reduced. During our original tests it seemed to improve the import speed (we were still not very happy, but it was definitely an improvement). Currently something seems broken, so we need to retest the caching feature before releasing, but at least some improvements should be there.
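The general idea of such a cache can be sketched as follows. This is not the actual 0.12 implementation; the class name, the lookup callback and the choice to cache container IDs by virtual path are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a database query that maps a virtual
// container path (e.g. "/Audio/Artists/Foo") to its numeric ID.
using ContainerLookup = std::function<int(const std::string&)>;

// Remembers container IDs so that repeated lookups for the same
// virtual path do not go to the database again.
class ContainerCache {
public:
    explicit ContainerCache(ContainerLookup lookup) : lookup_(std::move(lookup)) {
        assert(lookup_ && "a lookup function is required");
    }

    int getId(const std::string& virtualPath) {
        auto it = ids_.find(virtualPath);
        if (it != ids_.end()) {
            ++hits_;               // served from memory, no database request
            return it->second;
        }
        ++misses_;                 // first time we see this path: ask the database
        int id = lookup_(virtualPath);
        ids_.emplace(virtualPath, id);
        return id;
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    ContainerLookup lookup_;
    std::unordered_map<std::string, int> ids_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

Since many files of one album end up in the same few containers, most lookups during an import would be cache hits under this assumption.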
That being said, we are still not happy and clearly need to improve things, so work on that will continue.
I did some quick benchmarking and I get a huge improvement if I set ignore-unknown=yes in the config file. I noticed it was taking forever to process my iTunes m4p files (which I know are DRMed and can't be played on Linux or my PS3). I left the m4p extension out of the config file so that those files are ignored. It was taking 5 seconds to "evaluate" 1 m4p file, whereas my mp3 files took practically no time to process.
I also added the following code in content_manager.cc in the addFile function so I could tell when the import finished which I found useful. I'm sure the -D debug mode does something similar but I didn't want to see all the other mess.
    if (recursive && IS_CDS_CONTAINER(obj->getObjectType()))
        addRecursive(path, hidden, task);

    // Added this
    log_info("Done Adding %s\n", path.c_str());
Well, according to this http://manuals.playstation.net/document/en/ps3/current/music/filetypes.html you should be able to play your mp4 audio files that are without DRM, but for the DRM crap there is of course no nice solution.
The import of mp4 files may be slow because of libmp4v2. As far as I know, it reads the whole file when parsing metadata, so it's indeed not the fastest library and adds additional overhead.
In my previous post I was referring to a configuration without libmp4v2.
Yes, I am still using the libmp4v2 lib to process the m4a files (mp4 with no DRM). That works well and suffers no performance loss and yes they do play on the PS3.
Processing 10GB of mp3/m4a (mp4) files and about 6GB of jpgs, it took about 3 hours with ignore-unknown=no. With ignore-unknown=yes and having all of my extensions defined in the config xml (but not m4p files), it took 12 minutes.
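The difference between the two runs (3 hours vs. 12 minutes, roughly 15x) comes from deciding by extension alone and never touching the unknown files. A minimal sketch of that decision is below; the function names, the enum and the "inspect content" fallback are assumptions for illustration and not the real import code.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// What to do with a file found during the scan (hypothetical names).
enum class ImportDecision { UseMapping, InspectContent, Skip };

// Returns the lower-cased extension of a path, or "" if there is none.
std::string extensionOf(const std::string& path) {
    const auto dot = path.rfind('.');
    const auto slash = path.rfind('/');
    if (dot == std::string::npos) return "";
    if (slash != std::string::npos && dot < slash) return "";  // dot belongs to a directory
    std::string ext = path.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// extensionMap: extension -> MIME type, as listed in the config.
// ignoreUnknown: models the effect of ignore-unknown=yes.
ImportDecision decideImport(const std::string& path,
                            const std::map<std::string, std::string>& extensionMap,
                            bool ignoreUnknown) {
    if (extensionMap.count(extensionOf(path)) > 0)
        return ImportDecision::UseMapping;          // cheap: MIME type known from the map
    // Unknown extension: either skip right away or fall back to a
    // slower look at the file itself (assumed here).
    return ignoreUnknown ? ImportDecision::Skip : ImportDecision::InspectContent;
}
```

With mp3 and m4a in the map and m4p left out, an m4p file is skipped immediately when ignoreUnknown is true, which matches the behaviour I saw.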
...well, of course if you import less data it will be faster, that's obvious. So if you are trying to add containers where you have lots of other, unrelated files and not only your media, then it will impact the import speed.
However this part is not really important - what content is being added is up to the users, our goal is to make the import of *any* content faster.