From: walter h. <wh...@bf...> - 2013-11-01 18:10:34
On 01.11.2013 18:39, Tim Mooney wrote:
> In regard to: Welcome to lprng, walter harms said (at 3:00pm on Nov 1, 2013):
>
>> Hello Tim Mooney,
>
> Hi Walter & lprng-devel!
>
> I'm new to the list, but not LPRng. The university where I work has
> been using LPRng since the mid 1990s. We used its predecessor, PLP,
> before that. You'll find my name scattered throughout the CHANGES
> document for various patches, especially to the documentation, and
> suggestions over the years.
>
> I went looking for a fork of LPRng a few years ago, after it was clear
> that Patrick Powell was no longer doing maintenance on it. I found your
> lprng SourceForge project at that time, but almost no changes had been
> made to the source to that point, so I didn't follow the project closely.
>
> I found the project again because I was looking to see if anyone that
> was still using it was experiencing the kinds of issues that we do,
> specifically segfaults that appear to be memory corruption.
>
> We're using Patrick Powell's 3.8.33 (I see that he released a 3.8.35,
> but the changes are tiny), and had the same issues under 3.8.24 before
> that.
>
> We use LPRng with GoPrint, a commercial print management and cost-recovery
> system primarily targeted at universities and libraries. All queues are
> set up with LPRng as hold queues. Users interact with GoPrint at
> touch-screen kiosks near the printers, using their campus ID card to check
> their printing balance against our ID system and select and release print
> jobs they've submitted. It's GoPrint that decides whether they should
> be able to actually print the job, and it's that software that ultimately
> issues the "lpc move" and "lpc release" commands to LPRng, to get the
> selected print jobs to release to the printer. The system receives
> anywhere from a few hundred to more than 8,000 print jobs per day, and
> page totals printed per day range from 5,000 to more than 35,000.
>
> The system itself is quite slick, but unfortunately LPRng has been the
> weak link. Our system logs and dmesg output on the RHEL 5.10 print
> server that hosts GoPrint and LPRng are filled with segfaults from lpd
> workers:
>
> $ dmesg | egrep 'lpd.*segfault' | wc -l
> 1498
>
> I've just recently spent some time on it, and have been able to identify
> a common stack trace:
>
> Program terminated with signal 11, Segmentation fault.
> #0  0x0078905e in malloc_consolidate () from /lib/libc.so.6
> (gdb) where
> #0  0x0078905e in malloc_consolidate () from /lib/libc.so.6
> #1  0x0078b2e7 in _int_malloc () from /lib/libc.so.6
> #2  0x0078d27a in calloc () from /lib/libc.so.6
> #3  0x0070c80b in _dl_new_object () from /lib/ld-linux.so.2
> #4  0x00708011 in _dl_map_object_from_fd () from /lib/ld-linux.so.2
> #5  0x00709f71 in _dl_map_object () from /lib/ld-linux.so.2
> #6  0x00713d41 in dl_open_worker () from /lib/ld-linux.so.2
> #7  0x007100d6 in _dl_catch_error () from /lib/ld-linux.so.2
> #8  0x00713742 in _dl_open () from /lib/ld-linux.so.2
> #9  0x0082cbf2 in do_dlopen () from /lib/libc.so.6
> #10 0x007100d6 in _dl_catch_error () from /lib/ld-linux.so.2
> #11 0x0082cda5 in __libc_dlopen_mode () from /lib/libc.so.6
> #12 0x008099b9 in init () from /lib/libc.so.6
> #13 0x00809b53 in backtrace () from /lib/libc.so.6
> #14 0x007829b1 in __libc_message () from /lib/libc.so.6
> #15 0x0078ad35 in _int_free () from /lib/libc.so.6
> #16 0x0078eda9 in free () from /lib/libc.so.6
> #17 0x0805b0a3 in Set_str_value (l=0xffbb5120, key=0x80b214e "user",
>     value=0x823aff2 "er=er=er.er.er.en.ensens8ns83s83")
>     at ./common/linelist.c:1053
> #18 0x0807cc7f in Perm_check_to_list (list=0xffbb5120, check=0x80bac00)
>     at ./common/permission.c:704
> #19 0x08092dd6 in Do_queue_control (user=0x823bbe0 "goprint", action=20,
>     sock=0xffbb5340, tokens=0xffbb5230, error=0xffbb517c "", errorlen=180)
>     at ./common/lpd_control.c:448
> #20 0x08093b43 in Job_control (sock=0xffbb5340, input=<value optimized out>)
>     at ./common/lpd_control.c:184
> #21 0x0806435e in Service_lpd (talk=5, from_addr=0xffbb537c "127.0.0.1 port 0")
>     at ./common/lpd_dispatch.c:341
> #22 0x0806482c in Service_connection (args=0xffbb54b4)
>     at ./common/lpd_dispatch.c:310
> #23 0x080575a6 in Do_work (name=0x80a60d2 "server", args=0xffbb54b4)
>     at ./common/linelist.c:3853
> #24 0x080587d8 in Make_lpd_call (name=0x80a60d2 "server",
>     passfd=0xffbb54c0,
>     args=0xffbb54b4) at ./common/linelist.c:3823
> #25 0x0805cd8c in Start_worker (name=0x80a60d2 "server", parms=0xffbb54fc,
>     fd=12) at ./common/linelist.c:3882
> #26 0x0804a989 in Accept_connection (sock=7, lpd_socket=0, unix_socket=1)
>     at ./common/lpd.c:1015
> #27 0x0804bf61 in main (argc=Cannot access memory at address 0x0
> ) at ./common/lpd.c:693
>
>
> The segfault is triggered by the call to free() in Set_str_value (frame
> #16 and #17), and it's "value" that's corrupt, but I haven't yet had time
> to track down where the initial corruption is happening.
>
> I see that since I first looked at your lprng fork, you have made
> significant changes, dropping a lot of the cruft that had accumulated
> over the years. That seems like a good direction to take.
>
> I'll review your changes document, but how compatible is lprng with
> the older LPRng, from a config file and accepted options perspective? Is
> it essentially a drop-in replacement for LPRng, or has there been enough
> divergence that some things now work differently or e.g. the control
> file format is now different?
>

The current lprng should be a drop-in replacement. We have done some
cleanup work. I use it at several sites, some of them with high volume
(100 jobs/min), without major problems.

It would be interesting to see whether the current code causes the same
problems.

re,
 wh