sigsafe-devel Mailing List for sigsafe
Status: Pre-Alpha
Brought to you by:
slamb
|
From: Scott L. <sl...@sl...> - 2006-05-11 01:46:57
|
On May 10, 2006, at 3:25 PM, Marcin 'Qrczak' Kowalczyk wrote:
> Scott Lamb <sl...@sl...> writes:
>> You're right on all points. I've made the fixes to the head revision
>> in Subversion. When I get a chance to test everywhere, I'll make a
>> new release.
>
> Thank you! I was worried that the project is dead :-)
It's not dead but not as lively as I'd like. If any open source project
of mine becomes hugely popular, I fear it won't be one designed to help
Unix developers avoid subtle race conditions.
The documentation's been a lot more successful than the code. I've
occasionally noticed in my referrer logs people talking about it in
languages I don't understand. (It was weird to see a bunch of _my_ code
examples floating in a sea of Japanese text.) I've been meaning to bulk
it out more.
Probably the easiest thing to do would be just to introduce more
asynchronous IO people to the self-pipe trick. It makes a lot of sense
there.
On the other hand, using sigsafe's code is much easier if you are doing
synchronous operations (or you are writing a library which can be used
for either). Reliable thread cancellation is probably the best example,
especially considering that POSIX thread cancellation really doesn't
work.
Regards,
Scott
--
Scott Lamb <http://www.slamb.org/>
|
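For readers who have not met the self-pipe trick mentioned above, here is a minimal sketch in C. It is not sigsafe code; the names selfpipe_init and wait_readable_or_signal are hypothetical, and it assumes a single-threaded program that waits with select(). The handler does nothing but write one byte to a non-blocking pipe, so the signal becomes an ordinary readable file descriptor and cannot slip in between a flag check and the blocking call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int selfpipe[2] = { -1, -1 };   /* [0] = read end, [1] = write end */

/* Handler: uses only async-signal-safe calls and preserves errno. */
static void selfpipe_handler(int signo)
{
    int saved_errno = errno;
    unsigned char b = (unsigned char)signo;
    ssize_t n = write(selfpipe[1], &b, 1);  /* a full pipe drops the byte */
    (void)n;
    errno = saved_errno;
}

static int set_nonblock_cloexec(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl == -1 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1)
        return -1;
    fl = fcntl(fd, F_GETFD);
    if (fl == -1 || fcntl(fd, F_SETFD, fl | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}

/* Create the pipe and route signo into it. Returns 0 or -1. */
int selfpipe_init(int signo)
{
    struct sigaction sa;
    assert(signo > 0 && signo <= 255);  /* must fit in the byte we write */
    if (pipe(selfpipe) == -1)
        return -1;
    if (set_nonblock_cloexec(selfpipe[0]) == -1 ||
        set_nonblock_cloexec(selfpipe[1]) == -1)
        return -1;
    sa.sa_handler = selfpipe_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Wait until fd is readable or a routed signal has arrived.
   Returns 1 if fd is readable, 0 if a signal arrived (number stored in
   *signo), or -1 on error. */
int wait_readable_or_signal(int fd, int *signo)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EBADF;
        return -1;
    }
    for (;;) {
        fd_set rfds;
        int maxfd = fd > selfpipe[0] ? fd : selfpipe[0];
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        FD_SET(selfpipe[0], &rfds);
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;       /* the byte is in the pipe; loop picks it up */
            return -1;
        }
        if (FD_ISSET(selfpipe[0], &rfds)) {
            unsigned char b;
            if (read(selfpipe[0], &b, 1) == 1) {
                *signo = b;
                return 0;
            }
        }
        if (FD_ISSET(fd, &rfds))
            return 1;
    }
}
```

A caller would run selfpipe_init(SIGTERM) once at startup and then use wait_readable_or_signal() wherever it would otherwise block in select(). This fits event loops well; for plain synchronous read()/write() calls it costs an extra system call per operation.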
|
From: Marcin 'Q. K. <qr...@kn...> - 2006-05-10 22:25:39
|
Scott Lamb <sl...@sl...> writes:
> You're right on all points. I've made the fixes to the head revision
> in Subversion. When I get a chance to test everywhere, I'll make a
> new release.
Thank you! I was worried that the project is dead :-)
--
__("< Marcin Kowalczyk
\__/ qr...@kn...
^^ http://qrnik.knm.org.pl/~qrczak/
|
|
From: Scott L. <sl...@sl...> - 2006-05-05 05:31:36
|
Marcin,
Your email got lost in my mailbox. I just happened upon it today.
You're right on all points. I've made the fixes to the head revision in
Subversion. When I get a chance to test everywhere, I'll make a new
release.
Thanks a lot for the feedback! My apologies for not responding until
now.
Regards,
Scott
On Nov 17, 2005, at 6:50 AM, Marcin 'Qrczak' Kowalczyk wrote:
> Hello.
>
> I added conditional usage of sigsafe in the runtime of my compiler of
> my language Kogut <http://kokogut.sourceforge.net/>. Thanks for the
> library, it seems to work, I hope it's stable. Here are various issues
> with sigsafe I encountered:
>
> - It compiles with debug=1 by default, which causes signal handlers
>   to write [S] to stderr. It's not even documented, and changing it
>   seems to require editing SConstruct.
>
> - sigsafe_install_tsd must be called after sigsafe_install_handler
>   has been called at least once. This fact is not documented.
>   I think sigsafe_install_tsd should be usable when called first too.
>
> - sigsafe_read and sigsafe_write return int instead of ssize_t.
>
> - There is no binding for open, which may block in case of a fifo.
>
> - There are no bindings for send, sendto, recv, and recvfrom. I would
>   not use them, because I use sockets in non-blocking mode only, but
>   perhaps somebody else would.
>
> - When looking at sources, I saw:
>   | static void sigsafe_init(void) {
>   | /* "volatile" so our seemingly-useless references aren't
>   |    optimized away. */
>   | volatile void *fp;
>   Here volatile applies to the target of the pointer, not to the
>   pointer itself, so it doesn't change anything.
>
> - The documentation says that read might trigger SIGPIPE. I believe
>   this is false, SIGPIPE is generated only for writing.
>
> --
> __("< Marcin Kowalczyk
> \__/ qr...@kn...
> ^^ http://qrnik.knm.org.pl/~qrczak/
--
Scott Lamb <http://www.slamb.org/>
|
|
From: Marcin 'Q. K. <qr...@kn...> - 2005-11-17 14:50:21
|
Hello.
I added conditional usage of sigsafe in the runtime of my compiler of
my language Kogut <http://kokogut.sourceforge.net/>. Thanks for the
library, it seems to work, I hope it's stable. Here are various issues
with sigsafe I encountered:
- It compiles with debug=1 by default, which causes signal handlers
  to write [S] to stderr. It's not even documented, and changing it
  seems to require editing SConstruct.
- sigsafe_install_tsd must be called after sigsafe_install_handler
  has been called at least once. This fact is not documented.
  I think sigsafe_install_tsd should be usable when called first too.
- sigsafe_read and sigsafe_write return int instead of ssize_t.
- There is no binding for open, which may block in case of a fifo.
- There are no bindings for send, sendto, recv, and recvfrom. I would
  not use them, because I use sockets in non-blocking mode only, but
  perhaps somebody else would.
- When looking at sources, I saw:
  | static void sigsafe_init(void) {
  | /* "volatile" so our seemingly-useless references aren't
  |    optimized away. */
  | volatile void *fp;
  Here volatile applies to the target of the pointer, not to the
  pointer itself, so it doesn't change anything.
- The documentation says that read might trigger SIGPIPE. I believe
  this is false, SIGPIPE is generated only for writing.
--
__("< Marcin Kowalczyk
\__/ qr...@kn...
^^ http://qrnik.knm.org.pl/~qrczak/
|
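The point about volatile can be made concrete with a small C sketch. This is an illustration only, not the sigsafe source: probe and touch_symbols are hypothetical names, and a function-pointer type stands in for the void pointer in the quoted snippet. The qualifier binds to whatever sits to its left of the variable name, so placing it after the `*` (or after a pointer typedef) is what makes the pointer variable itself volatile.

```c
#include <assert.h>

typedef int (*probe_fn)(void);

/* Hypothetical stand-in for a function whose address must stay referenced. */
static int probe(void) { return 42; }

int touch_symbols(void)
{
    static int dummy;

    /* Pointer to volatile data: only accesses through *a are volatile.
       Stores to the variable a itself are ordinary, so the compiler may
       drop them if a is never used. */
    volatile int *a = &dummy;
    *a = 1;                     /* this store through a must be performed */

    /* Volatile pointer: the qualifier applies to fp itself, so the store
       of probe's address and the later load cannot be optimized away. */
    probe_fn volatile fp = probe;

    assert(*a == 1);
    return fp();
}
```

With a plain object pointer the two spellings would be `volatile void *p` (pointer to volatile) and `void *volatile p` (volatile pointer); the second is the one that keeps seemingly useless references alive.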
|
From: Scott L. <sl...@sl...> - 2004-02-24 22:55:16
|
sigsafe 0.1.1 is up on sourceforge. It focuses on documentation
improvements and an improved race condition checker.
<http://prdownloads.sourceforge.net/sigsafe/sigsafe-0.1.1.tar.gz?download>
The complete release notes are below.
Scott
$Id: RELEASE_NOTES 662 2004-02-24 21:43:54Z slamb $
sigsafe version 0.1.1 release notes:
* The documentation is improved, but it still has a ways to go:
  - The "Background" section stops abruptly. There are a lot more
    topics to cover.
  - The "Pattern reference" could use more explanation and should be
    adapted to use the same terminology.
  - The "Goal reference" doesn't really exist yet.
  - Some other sections don't even have skeletons yet.
  If you spot anything wrong with what's already there, please let me
  know. If it's unclear to you, odds are it is to someone else, too.
  And if it seems wrong to you, let me know so I can correct or
  justify it.
* There are no additional platforms; still only Linux/x86 and
  Darwin/ppc are supported. I originally intended this release to
  support more, but from the feedback I've gotten, the documentation
  improvements are more important now. The next couple releases will
  probably also be focused on documentation.
* The race checker is in pretty good shape now on Linux/x86. Try it out
  with "tests/race_checker/race_checker -qm" for a quick test.
* The race checker still doesn't work under Darwin. It will give
  results like this:
  $ ./race_checker -qm
  Running sigsafe_read test
  attached
  stepped once
  ERROR: Child was killed/dump from signal 1.
  ...
  Test                Result          Expected
  ----------------------------------------------------------------
  * sigsafe_read      misc. failure   success
  * racebefore_read   misc. failure   ignored signal
  * raceafter_read    misc. failure   forgotten result
  * - These 3 tests did not return the expected result.
  The child isn't exiting from signal 1; the Mach tracing code I'm
  using just isn't working. As far as I know, sigsafe works fine under
  Darwin.
|
|
From: Scott L. <sl...@sl...> - 2004-02-19 19:12:53
|
On Feb 19, 2004, at 12:20 PM, Shantanu Bhardwaj wrote:
> Hi,
>
> you mentioned something about needing help with your project, i just
> glanced through the home page but id like to help, so if you can give
> me some more information to make a decision. what kind of help do you
> need .... whats done ... whats to be done etc etc ...
Hey,
(I've cc'd this to the mailing list; I hope you don't mind.)
Glad to hear from you, especially so quickly. What's done:
- the Linux/x86 and Darwin/ppc (OS X / Power Macintosh) implementations
appear to be mostly working
- the race condition checker works on Linux/x86
- good code organization, some comments
What needs to be done:
- additional platforms.
I've got (and you can have) access to oodles of platforms: systems
running Linux/alpha, Linux/ia64, HP-UX/PA-RISC, Solaris/sparc,
NetBSD/alpha, NetBSD/x86, FreeBSD/x86, OpenVMS/alpha,
Windows+Cygwin/x86, and probably others I'm forgetting. (All of these
through SourceForge's compile farm or HP's test drive.)
There are notes in the top-level README about porting to new platforms.
If they're not clear (and they're probably not), feel free to ask
anything.
It's kind of fun to write the assembly. And it's easier than you might
think to find the necessary information. For the open source platforms,
you can look at the libc source code. Processor manuals. Google helps a
lot. And if all else fails, you can try something like "disassemble
read" in the debugger to find out how libc does it.
It's also very rewarding to see the race condition checker and
microbenchmark tell you with certainty that your code is working as
intended. With a lot of projects, it's not so clear-cut.
- case studies.
Example 1: Apache uses a way of implementing socket timeouts that I bet
is really slow (though correct), especially on platforms like Darwin
that have slow system calls. I'd like to try adapting their code to use
sigsafe and see what effect it has on performance. Even if they don't
accept the code, something like "we modified the world's most powerful
webserver to use sigsafe and found a XX% performance increase on
platform X" would be a powerful statement.
Example 2: In my atoms project, I have a signal document that mentions
use of signals in FreeBSD userspace. I basically did a
$ find . -name '*.[ch]' | xargs egrep '\<(signal|sigaction)\>'
and started marking down what they did with signals and whether I
thought it was safe or not. It'd be nice to finish that (with the same
terminology as everything else) and provide patches to make everything
safe.
- documentation! I've started a little bit, but from the responses I've
gotten so far, I need a lot more. So I'd like to present the
information in a lot of different ways:
1. an introduction section that explains all the concepts and goes over
a few of the patterns in depth. (This is started, but it doesn't
mention lots of things: realtime signals, blocked signals,
thread-directed vs. process-directed signals, asynchronous vs.
synchronous signals.)
2. a pattern reference showing patterns you might find in existing code
and good/bad things about them related to safety, portability, ease of
use, etc. (Also started, but I wrote it _before_ the introduction
section; it will have to at least be updated to use the same
terminology and style.) Maybe names for each pattern and a table up
above with really simplified versions of these notes.
3. a more goal-based reference. Stuff like "I want to wait for either a
keyboard interrupt or a child to exit, how should I do that?" with a
preferred method or two, either stated again or just through links to
the pattern reference.
4. doxygen docs on each relevant "normal" system call. Both links to
the relevant standards (largely the Single UNIX Specification version
3) and additional notes. The setitimer manual page, for example,
doesn't even bother to mention that alarms are directed to the
_process_, not an individual thread. That's a huge difference, and it
would be nice to have it actually said _right_there_.
5. performance graphs. I've got a benchmark for it, but it would be fun
to make an actual graph with gnuplot on various platforms. And useful -
it would give people a visual idea of how sigsafe compares, and how
important system call overhead is on various platforms.
6. a couple other tables, like showing different platform
characteristics (if it uses SA_RESTART by default, if it supports the
realtime extensions, etc.), a copy of the list of async signal-safe
functions from SUSv3, etc.
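As a taste of what an entry in the goal-based reference (item 3) might look like, here is a minimal C sketch for "wait for either a keyboard interrupt or a child to exit". It shows one well-known approach, blocking the signals and collecting them with sigwait(); it is not necessarily the method the finished documentation would recommend. The function names are hypothetical, and the sketch assumes a single-threaded process (a threaded program would use pthread_sigmask() before creating any threads).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* SIGCHLD is ignored by default, and POSIX lets a system discard an
   ignored signal even while it is blocked. Installing a do-nothing
   handler makes sure it stays pending for sigwait(). */
static void noop_handler(int signo) { (void)signo; }

/* Block SIGINT and SIGCHLD and fill *set with them. Call this before
   forking any children. Returns 0 on success, -1 on error. */
int block_int_and_chld(sigset_t *set)
{
    struct sigaction sa;

    assert(set != NULL);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    sigemptyset(set);
    sigaddset(set, SIGINT);
    sigaddset(set, SIGCHLD);
    return sigprocmask(SIG_BLOCK, set, NULL);
}

/* Sleep until one of the blocked signals is pending and consume it.
   Returns the signal number, or -1 on error. */
int wait_for_int_or_chld(const sigset_t *set)
{
    int signo = 0;
    int err = sigwait(set, &signo);
    return err == 0 ? signo : -1;
}
```

Because the signals are blocked before anything can generate them, there is no window in which one can arrive unnoticed. On SIGCHLD the caller would then reap children with waitpid(-1, &status, WNOHANG) in a loop, since several exits can collapse into one pending signal.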
If you feel like you know enough to write any of these sections, great.
If you don't, that makes your services as a proof-reader that much more
valuable - I'm trying to explain things, and I can't know if I've
accomplished that if everyone proof-reading already understands the
material.
I'd like the documentation to become the authoritative reference for
how to use signals well, useful even without the sigsafe code.
>
> -Shantanu
Thanks,
Scott
|
|
From: Scott L. <sl...@sl...> - 2004-02-18 01:03:33
|
On Feb 17, 2004, at 3:53 PM, Scott Lamb wrote:
> Is that convincing or do I need to do a different benchmark? I suppose
> I could take a real thread-based webserver and butcher it to double
> the number of IO system calls it makes. I'm fairly confident I'd find
> a substantial change in throughput.
Or fix it to use half as many system calls. I just remembered that
Apache always does a select() before read() with their APR read with
timeout code. This would probably be a good case study for sigsafe. I
might just do it...
>> Also, I think you might be wrong about the setjmp() trick (which
>> would be the proper trick for read()), read() will return EINTR only
>> if the I/O operation did not complete
>
> I've added (since the 0.1.0 release) a test for this to race_checker.
> And it returns the result I expected - the jump can happen after the
> system call completes. (With the setjmp() method, it doesn't really
> use EINTR at all.) I also added some more description of this to the
> documentation.
Ahh, now I think I see what you are talking about here. My example of
the longjmp method checks for EINTR return; that's unnecessary and
confusing. That path will only be taken if a _different_ signal arrives
(one which does not jump), which is not what I want to focus on. I've
removed it from the example.
Thanks,
Scott Lamb
|
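The select()-before-read() pattern discussed above looks roughly like the following C sketch. This is not APR's actual code; read_with_timeout is a hypothetical name, and the sketch assumes a descriptor below FD_SETSIZE. It makes the cost visible: every successful read takes two system calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Read up to len bytes from fd, waiting at most timeout_ms for data.
   Returns the byte count, 0 at end of file, or -1 with errno set
   (ETIMEDOUT if nothing arrived in time). */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int ready;

    assert(timeout_ms >= 0);
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EBADF;
        return -1;
    }

    do {
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* Simplification: an interrupted wait restarts the full timeout. */
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        ready = select(fd + 1, &rfds, NULL, NULL, &tv);  /* system call 1 */
    } while (ready == -1 && errno == EINTR);

    if (ready == -1)
        return -1;
    if (ready == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    return read(fd, buf, len);                           /* system call 2 */
}
```

A signal-based timeout that interrupts a single blocking read() would halve the call count, which is the saving a case study would try to measure.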
|
From: Scott L. <sl...@sl...> - 2004-02-17 21:57:54
|
On Feb 17, 2004, at 1:18 PM, Pierre Phaneuf wrote:
> Scott Lamb wrote:
>
>> If you're not using thread cancellation, I think that trick is safe.
>> But if you're using a thread per connection (and not already using
>> poll()), adding a poll() before every read() or write() would be
>> slower - twice the system call overhead. It might be too slow to be
>> acceptable.
>
> If you don't use poll(), I suppose you're kind of screwed. But
> personally, I think that if you block on read() and use threads to
> cope with that, you're already screwed, for other reasons. :-)
Well, you won't have the fastest code around, but a lot of tremendously
popular code does it this way: Apache, most Java-based servers, etc. I
imagine many of their authors and users would disagree with you. ;)
> System call overhead nowadays is almost nil, so that's not a very good
> reason.
In my tests, it seems significant on some platforms. In the
tests/bench_read_* microbenchmark, I found that:
- On Linux 2.6 with a 1GHz Athlon, bench_read_raw takes ~9 seconds and
bench_read_select takes ~21.
- On OS X 10.3 with a 1GHz G4, bench_read_raw takes ~140 seconds and
bench_read_select takes ~200.
So I think that supports your assertion on Linux - it's more than a
twofold slowdown, but it did an awful lot of system calls in that time,
enough so that I don't think it's significant in the context of the
larger program.
But on OS X with a slightly-faster processor, both tests ran an order
of magnitude slower. So I think that extra 40% is quite significant.
These results match my expectations - Linux is well-known for having
fast system calls and context switches, while OS X is based on the Mach
microkernel.
Is that convincing or do I need to do a different benchmark? I suppose
I could take a real thread-based webserver and butcher it to double the
number of IO system calls it makes. I'm fairly confident I'd find a
substantial change in throughput.
Arguably you just shouldn't use platforms so slow, but again, people do
anyway. (You can have my Mac when you pry it from my cold, dead
hands...) Besides, if you start saying "my platform had better do X
really well", before long you'll be left with no platforms, because in
my experience no platform does _everything_ well.
> Also, I think you might be wrong about the setjmp() trick (which would
> be the proper trick for read()), read() will return EINTR only if the
> I/O operation did not complete
I've added (since the 0.1.0 release) a test for this to race_checker.
And it returns the result I expected - the jump can happen after the
system call completes. (With the setjmp() method, it doesn't really use
EINTR at all.) I also added some more description of this to the
documentation.
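To make the race concrete, here is a minimal C sketch of the setjmp() method using sigsetjmp()/siglongjmp() with a SIGALRM timeout. It is neither the example from the sigsafe documentation nor the race_checker test; read_or_jump and the other names are hypothetical, and it assumes a single-threaded program. The comment after read() marks the window in which the jump can discard a completed result.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

static sigjmp_buf jump_env;
static volatile sig_atomic_t jump_armed;

static void alarm_handler(int signo)
{
    (void)signo;
    if (jump_armed)
        siglongjmp(jump_env, 1);    /* abandon whatever was interrupted */
}

/* Read from fd, giving up after timeout_ms. Returns the byte count, or
   -1 with errno set (ETIMEDOUT when the handler jumped). */
ssize_t read_or_jump(int fd, void *buf, size_t len, int timeout_ms)
{
    struct sigaction sa, old_sa;
    struct itimerval timer = { { 0, 0 }, { 0, 0 } };
    struct itimerval off = { { 0, 0 }, { 0, 0 } };
    ssize_t n;
    int saved_errno;

    assert(timeout_ms > 0);
    sa.sa_handler = alarm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) == -1)
        return -1;

    if (sigsetjmp(jump_env, 1) != 0) {
        /* We arrive here from the handler; the signal mask is restored. */
        jump_armed = 0;
        sigaction(SIGALRM, &old_sa, NULL);
        errno = ETIMEDOUT;
        return -1;
    }

    jump_armed = 1;
    timer.it_value.tv_sec = timeout_ms / 1000;
    timer.it_value.tv_usec = (timeout_ms % 1000) * 1000;
    setitimer(ITIMER_REAL, &timer, NULL);   /* one-shot timer */

    n = read(fd, buf, len);
    /* RACE: if SIGALRM is delivered here, after read() has returned but
       before jump_armed is cleared, the handler jumps away and the bytes
       already consumed from fd are silently lost. */
    jump_armed = 0;

    saved_errno = errno;
    setitimer(ITIMER_REAL, &off, NULL);     /* cancel the timer */
    sigaction(SIGALRM, &old_sa, NULL);
    errno = saved_errno;
    return n;
}
```

No EINTR check appears anywhere: the timeout path is taken by jumping, not by read() failing. Closing the window needs help at the level of the system call instruction itself, which is why this cannot be fixed purely in portable C.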
> (look at the implementation of fread() or fwrite() in libc).
Which libc are you talking about? glibc? I don't see any code that uses
longjmp in the libc/libio, which is where fread seems to live. I see a
lot of IO_jump_t stuff, but that seems to just be a virtual function
table rather than a jump buffer for longjmp()/setjmp(). In libioP.h:
* The _IO_FILE type is used to implement the FILE type in GNU libc,
* as well as the streambuf class in GNU iostreams for C++.
* These are all the same, just used differently.
* An _IO_FILE (or FILE) object is allows followed by a pointer to
* a jump table (of pointers to functions). The pointer is accessed
* with the _IO_JUMPS macro. The jump table has a eccentric format,
* so as to be compatible with the layout of a C++ virtual function
* table.
* (as implemented by g++). When a pointer to a streambuf object is
* coerced to an (_IO_FILE*), then _IO_JUMPS on the result just
* happens to point to the virtual function table of the streambuf.
* Thus the _IO_JUMPS function table used for C stdio/libio does
* double duty as the virtual function table for C++ streambuf.
...
/* The 'sysread' hook is used to read data from the external file into
an existing buffer. It generalizes the Unix read(2) function.
It matches the streambuf::sys_read virtual function, which is
specific to this implementation. */
typedef _IO_ssize_t (*_IO_read_t) __PMT ((_IO_FILE *, void *,
_IO_ssize_t));
#define _IO_SYSREAD(FP, DATA, LEN) JUMP2 (__read, FP, DATA, LEN)
#define _IO_WSYSREAD(FP, DATA, LEN) WJUMP2 (__read, FP, DATA, LEN)
I did see this one comment in fileops.c:
/* These must be set before the sysread as we might longjmp out
waiting for input. */
but that doesn't lead me to believe they put a lot of thought into it.
> If you're using the other calls you mention, you should be using the
> write() trick and you already don't have the race. The trick doesn't
> work in Cygwin, but handily, Windows doesn't have signals! So you're
> still good (just put the code in and watch it never do the unsupported
> thing by never getting a signal). But seeing that you actually report
> that the trick doesn't work with Cygwin would kind of assume that you
> can get signals on Cygwin, so I might be wrong here.
You can get signals on cygwin. They do it with some weird tricks I
don't really understand. If you're curious, it's described here:
<http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/winsup/cygwin/how-signals-work.txt?rev=1.11&content-type=text/x-cvsweb-markup&cvsroot=src>
But if you look around in that same directory, it's pretty obvious that
it's not safe to longjmp() out of a signal handler during select() as
well as other functions. They use the C++ guard/monitor pattern, so
they'd probably have to implement forced unwinding on longjmp to make
it work right.
>> btw, feel free to ask anything like this on the sigsafe-devel mailing
>> list. I'd like to get a discussion going on there.
>
> I sent this to the list, but I'm not really interested in joining
> that list (I'm on too many already). You can keep on Cc'ing me on this
> thread though.
Works for me.
>
> --
> Pierre Phaneuf
> http://advogato.org/person/pphaneuf/
> "I am denial, guilt and fear -- and I control you"
Thanks. I appreciate the feedback.
Scott Lamb
|
|
From: Pierre P. <pp...@lu...> - 2004-02-17 19:23:49
|
Scott Lamb wrote:
> If you're not using thread cancellation, I think that trick is safe.
> But if you're using a thread per connection (and not already using
> poll()), adding a poll() before every read() or write() would be
> slower - twice the system call overhead. It might be too slow to be
> acceptable.
If you don't use poll(), I suppose you're kind of screwed. But
personally, I think that if you block on read() and use threads to
cope with that, you're already screwed, for other reasons. :-)
System call overhead nowadays is almost nil, so that's not a very good
reason.
Also, I think you might be wrong about the setjmp() trick (which would
be the proper trick for read()), read() will return EINTR only if the
I/O operation did not complete (look at the implementation of fread()
or fwrite() in libc).
If you're using the other calls you mention, you should be using the
write() trick and you already don't have the race. The trick doesn't
work in Cygwin, but handily, Windows doesn't have signals! So you're
still good (just put the code in and watch it never do the unsupported
thing by never getting a signal). But seeing that you actually report
that the trick doesn't work with Cygwin would kind of assume that you
can get signals on Cygwin, so I might be wrong here.
> btw, feel free to ask anything like this on the sigsafe-devel mailing
> list. I'd like to get a discussion going on there.
I sent this to the list, but I'm not really interested in joining
that list (I'm on too many already). You can keep on Cc'ing me on this
thread though.
--
Pierre Phaneuf
http://advogato.org/person/pphaneuf/
"I am denial, guilt and fear -- and I control you"
|
|
From: Scott L. <sl...@sl...> - 2004-02-17 08:02:53
|
Testing. Hopefully having a message will kickstart the archives, which haven't shown up yet. |