On Jun 23, 2011, at 11:57 AM, Wes Hardaker wrote:
> JJ> I've sent this patch 3 times now. I'd like a clear call that you
> JJ> WILL integrate the patch, and then I can/will generate the patch and
> JJ> the "portable" script and even take on the tedious explaining of a
> JJ> deep dark innocuous change as needed.
> WH> I'd happily agree to it, if it solves the issues I'm worried about.
> JJ> Make a list of issues please so I can respond.
> So, after brainstorming and discussing things a bit, these are the
> results of the thinking:
> 1) We'll switch happily to a directory based "cache". Everyone has
> agreed this is a better solution if it avoids DB locking issues.
This is the crucial issue imho. If we have consensus here (and net-snmp
leadership), the rest of the ducks will fall in a row.
> 2) But we can't depend on cron. Thus we need to be able to execute any
> resulting script or rpm invocation from within the agent itself (in
> the background). This could be:
> - fire once at start up
> - fire it once a day, or via some other default period
> - check the directory stat time against the rpm stat time and only
> regenerate when needed (this, IMHO, is preferred over once/day).
> 3) Needs to be configurable so that the system can do the "install in
> cron" thing. Rough config tokens needed:
> RPMDirectory /path/to/it
> RPMFrequency 1d (or directory to stat?)
> RPMCommand /path/to/script/or/whathaveyou
These goals seem opposed imho. The cron script is proposed
solely for a _COMPLETE_ solution and is otherwise handled
better directly in RPM to synchronously maintain the directory
based store. Note that a rpm --rebuilddb can/will recreate
every file in the directory store to populate, and --rebuilddb
is a fairly common operation.
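
For reference, the stat-based check from item 2 and the tokens from item 3
could be sketched roughly as below. This is Python purely for illustration,
not agent code. The token names are the ones proposed above; the helper
names (parse_frequency, parse_config, needs_regeneration), the s/m/h/d
suffix set, and the idea of comparing against the mtime of a single rpmdb
file are all my assumptions.

```python
import os
import re

# Suffix multipliers for values like "1d"; this suffix set is an assumption.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_frequency(text):
    """Convert a value such as '1d' or '30m' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError("bad frequency: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_config(lines):
    """Collect the three proposed tokens from config lines; ignore the rest."""
    wanted = {"RPMDirectory", "RPMFrequency", "RPMCommand"}
    config = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in wanted:
            config[parts[0]] = parts[1].strip()
    return config


def needs_regeneration(cache_dir, rpmdb_path):
    """Return True when the rpmdb file is newer than the cache directory."""
    try:
        cache_mtime = os.stat(cache_dir).st_mtime
    except FileNotFoundError:
        return True  # no cache yet, so it has to be generated
    return os.stat(rpmdb_path).st_mtime > cache_mtime
```

With synchronous maintenance inside RPM the check would rarely fire, which
is the point: the periodic command becomes a fallback rather than the
normal path.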
> 4) The default location needs to be in a space we control, such as
> /var/net-snmp/hrmib until the RPM folks or "whoever" are willing to
> make a standard place for it (it's beyond our scope to dictate
> something somewhere else).
Yup: no one has a clue what "FHS compliance" means, and yet "control"
is needed somehow. So the directory path becomes a run-time configurable
somehow to keep _EVERYONE_ happy.
> 5) Because the cache may be old, take a long time to update via the
> script, and launched in the background we can't wait before replying.
> So *if* a query comes in then we'll simply answer from an old cache
> (either from disk or memory). This isn't ideal, but we're willing to
> live with the damage. Anytime we go to an on-disk cache instead of
> directly linking to the library we're going to run into this problem
> one way or another. Perfect data synchronization is lost, but
> hopefully not horribly so.
There is no "old" if the directory entries are maintained synchronously
with rpmdb add/remove.
There is no "long time" either because the cron script is not the
typical implementation (a patch into RPM is preferred imho).
There are some obscure corner cases where the directory store
and what is in an rpmdb briefly diverge. Most of these corner
cases already cause failure modes with "locking" and the locking
isn't per-RPM-update but rather de facto "concurrent access" from
Berkeley DB. You SHOULD end up with more predictable HR-MIB info
if/when RPM is patched to add/remove entries in the directory store
synchronously for various technical reasons. Note that RPM does no
locking whatsoever when run as non-root and so "concurrent" can/will lead
to a segfault (of net-snmp) and obscure mis-behaviors if net-snmp is run
as non-root. These corner cases are not currently seen because packages
are seldom updated while net-snmp is monitoring the HR-MIB, and net-snmp
is often run as root.
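
To make "synchronously" concrete: the add/remove side amounts to creating
one file per package at install and unlinking it at erase. A rough sketch
follows, in Python for brevity (the real thing would be a patch inside RPM
itself). The function names, the name-version-release.arch file naming, and
the use of the file mtime to carry the install time are assumptions for
illustration only.

```python
import os
import tempfile


def add_entry(store_dir, nevra, install_time):
    """Publish one package entry; the file mtime carries the install time."""
    os.makedirs(store_dir, exist_ok=True)
    # Create under a dot-prefixed temporary name, set the timestamp, then
    # rename into place. A rename within one directory is atomic on POSIX,
    # so a reader sees either no entry or a complete one.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=store_dir)
    os.close(fd)
    os.utime(tmp_path, (install_time, install_time))
    os.replace(tmp_path, os.path.join(store_dir, nevra))


def remove_entry(store_dir, nevra):
    """Drop the entry at erase time; a missing entry is not an error."""
    try:
        os.unlink(os.path.join(store_dir, nevra))
    except FileNotFoundError:
        pass
```

Neither operation takes a lock or touches Berkeley DB, so a non-root
reader is never exposed to the rpmdb at all.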
I'm pretty sure that -- with the directory cache -- one can follow a live
upgrade directly using net-snmp monitoring with no lock contention whatsoever.
So "perfect data synchronization" has always been a desirable fiction,
and I claim that the directory store will increase, not decrease, reliability
of "perfect data synchronization".
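
For illustration, the reader side of that claim could look roughly like
the sketch below (Python, not the actual agent code). The function name
scan_store, the convention of skipping dot-prefixed names as in-progress
temporaries, and reading the install time from the file mtime are my
assumptions. The only race a reader has to tolerate is an entry vanishing
between the directory listing and the stat, and that is handled by leaving
the entry out.

```python
import os


def scan_store(store_dir):
    """Return sorted (name, mtime) pairs for the entries currently present."""
    entries = []
    try:
        names = os.listdir(store_dir)
    except FileNotFoundError:
        return entries  # no store yet: the table is simply empty
    for name in names:
        if name.startswith("."):
            continue  # skip in-progress temporary files (assumed naming)
        try:
            mtime = os.stat(os.path.join(store_dir, name)).st_mtime
        except FileNotFoundError:
            continue  # erased between listdir and stat; leave it out
        entries.append((name, int(mtime)))
    return sorted(entries)
```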
Sure there's some very weird exceptional behaviors like
that will lead to confusions. But HR-MIB is reporting exactly what
is present there, and those failures (and lack of perfect data sync)
already exist for all current directory schemes like on Debian and Solaris.
So don't do
should suffice as an answer (but there's no pleasing everyone all the time).
> Dave has checked in code to do the basic read-and-respond-from-path code
> (I haven't read it yet), but not the above needed configuration features
> to go with it. IE, right now since we're not doing either cron or
> run-ourselves support, the data table will be blank.
> Sound good?
Sounds perfect if we are still in agreement and there is consensus.
Jan Safranek: Agreed or not?
If agreed, I'll post a patch for rpm-4.9.x to synchronously handle
/var/cache/hrmib entries here so that rpm.org can choose to ignore.
73 de Jeff