On Thu, 26 Oct 2000, Todd Holbrook wrote:
> On Thu, 26 Oct 2000, Daniel Lee wrote:
> > I found Freiburger's comments more intriguing. He may have a
> > good point about most items only being used once or twice.
>
> I've heard that argument a lot, but have yet to see stats on it. I'm just
> finishing up some modifications to the statistics part of our ILL system
> and should be able to pull up some of that kind of data for the last year
> of requests. I'm curious what I'll find.
He's right, of course, and I'd love to see those stats, Todd (sounds
like a mess of a system... now I remember you telling me about that).
But his assessment doesn't account for a few other things, namely
purpose and changed market conditions. First, Napster isn't perfect
either. You can find some obscure stuff sometimes, but more often than
not you're limited to fairly popular material. If something like
docster takes off, it will be useful precisely for eliminating the
redundant scanning and copying and faxing we're all doing for the most
heavily used articles. Perhaps only 1% of requests are repeated more
than 100 times, but every repeat means another scan. That's probably a
huge proportion of the actual content flowing through the system, and
even if it's only 10% of the total, that's still an incredible time
savings for our staff.
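
Just to make that back-of-the-envelope arithmetic concrete, here's a
quick sketch -- every number in it is invented for illustration, not
from any real ILL data:

    # Rough model of a year's ILL traffic (all numbers invented):
    # a few "hot" articles repeated many times, plus a long tail of
    # items used only once or twice.
    hot_articles = 100          # say, 1% of 10,000 distinct titles
    hot_requests_each = 150     # each repeated well over 100 times
    tail_articles = 9900
    tail_requests_each = 1.5    # most items used once or twice

    hot_total = hot_articles * hot_requests_each
    total = hot_total + tail_articles * tail_requests_each

    # Every repeat of a hot article is a scan a shared copy would avoid.
    avoided = hot_total - hot_articles
    print("hot share of traffic: %.0f%%" % (100 * hot_total / total))
    print("scans avoided:        %.0f%%" % (100 * avoided / total))

With those made-up numbers, the 1% of hot titles account for roughly
half of all traffic, and nearly all of those scans are redundant.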
Second, he isn't considering the added value individuals provide when
they allow external services to track their interests. I know there's
an ugly Big Brother angle here, but who among you hasn't benefited from
the amazon.com recommendation system? It's great! Imperfect, but still
remarkably on target.
Now imagine you're doing obscure research in a biochemistry lab. You're
keeping a library of bibliographic records and fulltext for your
articles, using something like docster with citation management bits
thrown in. Maybe 5% of the articles you'll reference in your original
piece will be widely known, seminal pieces. Maybe 30-50% will be lesser
known but still from prominent publications. The rest is likely to be
fairly obscure for a given topic.
But the fact that you've co-located those obscure articles with the
prominent ones is immensely interesting to everybody else in the world,
just like the amazon "those who bought this also bought..." feature. If
we could harvest that metadata -- even just the metadata -- articles
that used to remain mildly obscure (say, requested only 2-10 times a
year) would suddenly start appearing as suggested readings in metadata
environments that track such things. What researcher wouldn't welcome a
truly good suggestion of something they haven't read?
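
If you want a feel for how simple that harvesting could be, here's a
rough sketch -- plain co-occurrence counting over personal libraries,
with hypothetical article IDs and sample data, not any real docster
protocol:

    from collections import Counter
    from itertools import combinations

    # Each researcher's citation library, reduced to article IDs.
    # (Hypothetical sample data.)
    libraries = [
        {"seminal-1", "obscure-7", "obscure-9"},
        {"seminal-1", "obscure-7", "obscure-12"},
        {"seminal-1", "prominent-3", "obscure-7"},
    ]

    # Count how often each pair of articles is co-located in a library.
    cooccur = Counter()
    for lib in libraries:
        for a, b in combinations(sorted(lib), 2):
            cooccur[(a, b)] += 1

    def suggest(article, top=3):
        """'Researchers who kept this also kept...' by co-location."""
        scores = Counter()
        for (a, b), n in cooccur.items():
            if a == article:
                scores[b] += n
            elif b == article:
                scores[a] += n
        return [title for title, _ in scores.most_common(top)]

    # The obscure article kept alongside the seminal one in every
    # library surfaces ahead of the merely prominent one.
    print(suggest("seminal-1"))

That's the whole trick: the obscure stuff rides along on the coattails
of the seminal stuff it gets shelved next to.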
Good research is, at some point, exhaustive to some degree. Good
researchers know they have to take advantage of every useful tool to
make sure their work is thorough.
It certainly would provide an incentive to leave your computer on. And
if you knew that people querying your content were forced to be honest
about making payments, perhaps fewer people would be 'free riders' as
described in the Gnutella context.
This would be a huge boon for researchers, and it would skew the numbers
away from what the NLM findings suggest.