upd_stats utilities submitted by Douglas Wilson<dougw@cpan.org>
Version dated: Mar 22, 2005
----------------------------------------------------------------------------
Generic unix utilities:
dirsz - Give size(s) of directorie(s) with some nifty options. I used to
write this sort of utility in shell/awk, but this is better/faster.
notail - like 'tail' except return all BUT last 'n' lines
tailf - like 'tail -f' but exit upon reaching some given regexp(s).
tac - my perl implementation of 'tac' utility, cat file in reverse order.
sendfile - send email with file attachment(s). Requires Mail::Sender module.
Informix specific utilities:
onfileschk - check permissions of Informix files against what's in onfiles file.
locks - info on who has what type of locks on what.
wholock - info on who has locks on a certain table and what rows are locked.
itags - create 'tags' file for 4gl files in current directory.
tuser - Shell script to report on locks on Standard Engine tables.
Fourgen specific utilities:
db - set DBSRC path (from libs in Makefile) and start 4gl debugger.
litags - create 'tags' file for library functions of Fourgen generated modules.
----------------------------------------------------------------------------
Fixed wholock which didn't always use the correct table number.
Added db debugger utility for Fourgen users.
----------------------------------------------------------------------------
Updated sendfile to work better with later version of library
Updated tuser to output file name for table
tuser wasn't reporting on the ppid of the locking process if there was more
than one lock on the file.
----------------------------------------------------------------------------
Updated litags to skip files it can't open, and added tuser to archive.
Updated some perl scripts to silence warnings that weren't complaining in
earlier versions of perl.
----------------------------------------------------------------------------
Utility: updstats
Synopsis: Generates optimal update statistics statements according to
the Informix 7.3 Performance Guide, and optionally executes them.
Author: Douglas Wilson <dougw@cpan.org>
Version(s) supported: only 7.xx has been tested
File(s): updstats
Comments: Requires perl (5.005 or later), and the perl modules DBI (1.13 or
later) and DBD::Informix (0.95 or later) modules (available from
www.cpan.org). DBD::Informix installation requires ESQL/C (maybe not, read
the docs).
You may have to change the path to perl at the top line of the script,
depending on where your perl is installed, and a default database is
hardcoded in the script which you may want to change.
The 'connect' statements in the program may need a username and password
depending on your database setup. I just have the empty strings '' for those
which works for me.
The output of the script by default goes to stdout, so you may redirect
it to a file, or you can immediately execute it with the -x option,
which just pipes the output to xupdstats (which must be in your PATH).
If you want to simultaneously output to a file and execute it, thats
what 'tee' is for.
Yes, this does pretty much the same thing (I hope) as Art's dostats.ec
utility (BTW thanks for the help Art), and without as many options (so far),
(ok, now there is a bunch of options which do all sorts of obscure things)
but I just wanted to do it in perl, and I wanted to automatically execute
the statements in parallel (which gets done in xupdstats).
Release Notes:
Version 1.1: Allows multiple database selection with '-d' option, allows
host selection with '-h' option, removed '-a' ansi option which is
unneccessary since according to DBD::Informix man pages, all databases
are in 'AutoCommit' mode by default. No longer outputs column list for
MEDIUM updates when there are no indexes on the table.
Single column indexes were generating update LOW statements, but no longer
(except with -F option).
Added -F option to disable single column optimization.
Version 1.2: Added -p option to update statistics for stored procedures.
Added optional argument to -x option to start n sub-processes.
Updated usage message.
Version 1.22: '-t all' was not updating all tables, now it is.
Version 1.23: added '-pt' option to work with xupdstats '-t' option.
Version 1.24: The Performance Guide says to update high on tables with
a small number of rows. It neglects to say what is a small number. So I
added an option to update high on tables with less than or equal to
whatever number of rows you would like to define as a small number. The
number of rows is retrieved from 'systables.nrows' so this will be
inaccurate for tables which have never had statistics updated, but will
save a bit of time over doing a count(*) (and saves me programming time).
Maybe I'll change it or add an option to actually do a count(*) on the
table in the future since a count(*) doesn't really take that much time.
Also, changed all options that take an argument to actually require an
argument.
Version 1.25: No longer generates duplicate UPDATE LOW statements for
indexes with duplicate column lists.
Version 1.26: Added support for standard engine with '-s' option. This
is untested though since I don't have access to a standard engine database.
Added -R,-r,and -C options to specify resolution and confidence for
HIGH and MEDIUM updates. No error checking is done on the values supplied,
you're expected to read the manual on such things. Also since the
Performance guide suggests different resolution on large tables, I added
a -rn option to only add the resolution clause if the number of rows is
>= some number.
Added -c option to perform a count(*) on all the tables instead of using
'systables.nrows', this about doubles the execution time (from ~12
to ~24 seconds on my database of about 1000 tables), but I also streamlined
the code in the process (saving about 1 second on the aforementioned
database).
Version 1.30: Now the '-s' option for SE should actually work, also
the '-pt' option, with the help of patches from Roderick Schertler
<roderick@argon.org> (Thanks again).
Since there's no sysmaster database in SE, I use a DBI function (data_sources)
to get the available databases if needed (e.g. if you say 'all' databases or
use a wildcard in the database name), but the function only works for the
local host, so using '-h' will not get the database NAMES from the specified
host for SE. I thought about aborting in this case, but who knows it
might still behave the way you want if you have duplicate database names
on systems.
Version 1.31: Slight bug if you had wildcards in the database name.
You will still be out of luck if you use '-h' and also have a database name
that contains an explicit host name, e.g. 'db@host'; this will probably cause
an error. Or if you use 'all' to get all available databases and also
include a 'db@host' database arg, then you will lose the 'db@host' arg. I may
remedy that later if its desired and after I think about it more.
----------------------------------------------------------------------------
Utility: xupdstats
Synopsis: Executes the sql output from updstats utility. Several child
processes are spawned, and each batch of statements with a common tablename
are sent to a process. As each batch finishes in each process, another is
sent.
Author: Douglas Wilson <dougw@cpan.org>
Version(s) supported: only 7.xx has been tested
File(s): xupdstats
Comments: Requires perl (5.005 or later?), and the perl DBI and DBD::Informix
modules (available from www.cpan.org). DBD::Informix installation
requires ESQL/C.
You may have to change the path to perl at the top line of the script,
depending on where your perl is installed, and a default database is
hardcoded in the script which you may want to change.
The 'connect' statements in the program may need a username and password
depending on your database setup. I just have the empty strings '' for those
which works for me.
Release Notes:
Version 1.1: Allows multiple databases via 'database' statements
in the sql script. Connection to a new database is made in the subprocesses
only when neccessary.
Allows host selection with '-h' option, removed '-a' ansi option which is
unneccessary since according to DBD::Informix man pages, all databases are
in 'AutoCommit' mode by default.
Version 1.2: Allow update statistics for stored procedures.
Set lock mode to wait for the sql connections since the system tables being
updated have page level locking.
Version 1.22: Stored procedure statistics were not really being updated,
now they are.
Added an option to execute all statements for a table in parallel. I think
think is ok since we're generating complete column lists and adding
'DISTRIBUTIONS ONLY' whenever possible.
Added some signal handling, so that this program is automatically nohup'd
if you run it in the background but if you kill any processes
(with -15 (SIGTERM), please) then they will all die after finishing the
current sql statement. I'm open to suggestions on how to kill the
processes more cleanly when one of them is TERM'd. None of that matters
if you just let it run all the way through anyway.
Version 1.23: Program waits for update stats on tables to finish before
starting on stored procedures.
Version 1.24: Last modification fixed. We were only waiting for tables to
finish on the first database; after that, we were waiting, but in the wrong
place.
Also, changed all options that take an argument to require an argument.
Version 1.25: Somehow a bug sneaked in where -t was being done for all
tables even if -t was not specified. Fixed it.
Version 1.30: Doing a 'ps' when this is running should be more informative
thanks to patch from Roderick (see above).
Version 1.31: Better error reporting. Main program exits w/error status
if any child sql statement fails. Again thanks to Roderick for patch.
Version 1.32: Sql is distributed better to child processes.
Version 1.33: Was unnecessarily reconnecting to database in last
batch of sql statements.
----------------------------------------------------------------------------