From: Joe M. <jo...@mi...> - 2009-09-01 13:29:34
|
> > At the C level, we've got that right now with Tcl_GetChannelHandle() and > have done for many years. Do you also want that at the Tcl script level? > I'm not 100% sure; however, it would make migration easier for any scripts that happen to depend on the current behavior. We could add the handle introspection functionality into the [chan] ensemble (e.g. [chan handle]) and have that be a simple wrapper for the Tcl C API function you mention. -- Joe Mistachkin <jo...@mi...> |
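A script-level [chan handle] is only being floated here, but the "simple wrapper" Joe has in mind could plausibly look like the sketch below. Only Tcl_GetChannelHandle() and the other C API calls are existing Tcl functions; the subcommand name, its registration, and the error handling are assumptions.

    #include <tcl.h>
    #include <stdint.h>

    /* Hypothetical [chan handle $channelName] implementation: a thin wrapper
     * around Tcl_GetChannelHandle(), returning the OS fd/handle as an integer. */
    static int
    ChanHandleObjCmd(ClientData dummy, Tcl_Interp *interp,
                     int objc, Tcl_Obj *const objv[])
    {
        Tcl_Channel chan;
        ClientData handle;
        int mode;

        (void) dummy;
        if (objc != 2) {
            Tcl_WrongNumArgs(interp, 1, objv, "channelName");
            return TCL_ERROR;
        }
        chan = Tcl_GetChannel(interp, Tcl_GetString(objv[1]), &mode);
        if (chan == NULL) {
            return TCL_ERROR;
        }
        /* Ask for the readable side first, then fall back to the writable side. */
        if (Tcl_GetChannelHandle(chan, TCL_READABLE, &handle) != TCL_OK &&
                Tcl_GetChannelHandle(chan, TCL_WRITABLE, &handle) != TCL_OK) {
            Tcl_SetObjResult(interp,
                    Tcl_NewStringObj("channel has no OS handle", -1));
            return TCL_ERROR;
        }
        Tcl_SetObjResult(interp, Tcl_NewWideIntObj((Tcl_WideInt)(intptr_t) handle));
        return TCL_OK;
    }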
From: Donal K. F. <don...@ma...> - 2009-09-01 12:37:52
|
Joe Mistachkin wrote: > I agree that this should be done for both Unix and Windows. Can we > somewhow preserve the ability to introspect for the underlying file > descriptor (Unix) and file handle (Windows)? At the C level, we've got that right now with Tcl_GetChannelHandle() and have done for many years. Do you also want that at the Tcl script level? Donal. |
From: Joe M. <jo...@mi...> - 2009-09-01 10:58:23
|
> > This TIP proposes to replace the file descriptor in unix channel names > by an ever-increasing, process-unique counter. > I agree that this should be done for both Unix and Windows. Can we somewhow preserve the ability to introspect for the underlying file descriptor (Unix) and file handle (Windows)? -- Joe Mistachkin <jo...@mi...> |
From: Alexandre F. <ale...@gm...> - 2009-09-01 10:22:17
|
TIP #355: STOP FAST RECYCLING OF CHANNEL NAMES ON UNIX
========================================================
  Version:      $Revision: 1.1 $
  Author:       Alexandre Ferrieux <alexandre.ferrieux_at_gmail.com>
  State:        Draft
  Type:         Project
  Tcl-Version:  8.7
  Vote:         Pending
  Created:      Tuesday, 01 September 2009
  URL:          http://purl.org/tcl/tip/355.html
  WebEdit:      http://purl.org/tcl/tip/edit/355
  Post-History:
-------------------------------------------------------------------------

ABSTRACT
==========

This TIP proposes to put an end to the unix-specific habit of naming channels after the underlying file descriptor, by using a much longer-lived incremented counter instead.

BACKGROUND
============

Tcl (non-reflected) channel names are of the general form $/KIND/$/HANDLE/ on all OSes, $/KIND/ being something like "*file*", "*pipe*", etc, and $/HANDLE/ being the OS file descriptor or handle. This is clearly a cost-effective way of guaranteeing process-wide unicity at any given time.

However, on unix, file descriptors are in a "compact table", i.e. they are small integers that are reused as quickly as possible, to keep the range small for efficiency reasons (and also constraints like the select() API). And as witnessed by [Bug 2826430][<URL:https://sourceforge.net/support/tracker.php?aid=2826430>], channel name recycling is dangerous. Quite possibly a bunch of applications running today get occasionally hit, with very hard to decipher symptoms, and an even harder to reproduce setup.

PROPOSED CHANGE
=================

This TIP proposes to replace the file descriptor in unix channel names by an ever-increasing, process-unique counter.

RATIONALE
===========

This change would bring unix channels in line with the rest of the crowd, since Windows handles seem to have a very long cycle, and reflected channels already use a counter. It could even be more ambitious in that (1) even Windows channels use the counter instead of relying on the OS's lifetime guarantee, and (2) reflected channels use the same counter instead of their own. Choice left open for discussion.

The implementation is trivial: a new public function, *Tcl_GetProcessUniqueNum*, returns a global integer counter which is incremented under mutex. (It will wrap at MAXINT and I am listening to whoever thinks it is still a problem...).

REFERENCE IMPLEMENTATION
==========================

The patch attached to the aforementioned bug [<URL:https://sourceforge.net/support/tracker.php?aid=2826430>] adds this stub entry, and updates all unix channel naming schemes to use it.

COPYRIGHT
===========

This document has been placed in the public domain.

-------------------------------------------------------------------------
TIP AutoGenerator - written by Donal K. Fellows |
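The counter the TIP proposes is simple enough to sketch in full. The following is only an illustration of the described mechanism, not the patch attached to Bug 2826430; the mutex and helper names are assumptions.

    #include <tcl.h>
    #include <stdio.h>

    static Tcl_Mutex uniqueNumMutex;          /* hypothetical mutex name */

    /* Proposed public function: a process-wide, ever-increasing counter,
     * incremented under mutex. */
    int
    Tcl_GetProcessUniqueNum(void)
    {
        static int counter = 0;
        int result;

        Tcl_MutexLock(&uniqueNumMutex);
        result = ++counter;                   /* wraps at MAXINT, as the TIP notes */
        Tcl_MutexUnlock(&uniqueNumMutex);
        return result;
    }

    /* A unix channel name would then be built from the counter rather than
     * from the file descriptor, e.g. "file17", "sock18", ... */
    static void
    MakeChannelName(char buf[32], const char *kind)   /* hypothetical helper */
    {
        snprintf(buf, 32, "%s%d", kind, Tcl_GetProcessUniqueNum());
    }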
From: Donal K. F. <don...@ma...> - 2009-09-01 08:23:19
|
Alexandre Ferrieux wrote: > Q: Is there any reason not to add an explicit: > > * tab-width: 8 > > so that the source will be laid out correctly in all emacses > regardless of their local tab-width ? Go on, add it. It might be best to only do so when you're editing the file for other purposes too. Donal. |
From: Joe E. <jen...@fl...> - 2009-08-31 22:42:28
|
> Donal K. Fellows wrote: > > TIP #354: MINOR PRODUCTION-DRIVEN TCLOO REVISIONS > > This is a Call For Votes on this TIP. TIP#354: YES --Joe English jen...@fl... |
From: Alexandre F. <ale...@gm...> - 2009-08-31 15:51:46
|
Hi Daniel, I am not aware of your answer to the following. I'm insisting because I'm about to try the pthread_kill method... > > my main concern was portability (compared to the existing solution, > > which may well be the lowest common denominator but as a consequence > > does work everywhere...), along with inability to detect at configure > > time which of the various alternative options work correctly (and > > perform better) on a given platform. > > See for example how libev has to decide at compile-time&runtime which > > of its options to use on a given system due to various OS brokenness > > http://cvs.schmorp.de/libev/ev.c?revision=1.313&view=markup > > > Following this links I see many ifdefs indeef, but they rather seem to > be geared towards making the best use of the given platform (select vs > poll, kevents vs signalfd, etc). I don't see anything pertaining to > pthread_kill being unsupported or buggy. Googling also yields reports > about a few pthread_kill issues, but they seem to be restricted to > corner cases (like sending to a dead thread or to self, or using the > KILL signal) which are very far from the use we'd be making of it > (pthread_kill to a living thread, different from the sender, of a > benign signal, with an empty handler). Do you have more precise > evidence ? -Alex |
From: Alexandre F. <ale...@gm...> - 2009-08-31 15:32:30
|
Hi,

Sorry in advance for a cosmetic question. The Tcl source contains the following Emacs-tuning trailer at the bottom of (nearly) every C file:

    /*
     * Local Variables:
     * mode: c
     * c-basic-offset: 4
     * fill-column: 78
     * End:
     */

Moreover, many of the files contain mixed space (0x20) and tab (0x09) to implement indentation. And, it turns out the layout is okay only if the local tab-width is 8. This of course is because most maintainers use 8 locally.

Q: Is there any reason not to add an explicit:

     * tab-width: 8

so that the source will be laid out correctly in all emacses regardless of their local tab-width ?

TIA,
-Alex |
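For reference, the amended trailer being asked about would presumably read as follows; only the tab-width line comes from the message, and its placement before "End:" is dictated by Emacs' file-local-variable syntax.

    /*
     * Local Variables:
     * mode: c
     * c-basic-offset: 4
     * fill-column: 78
     * tab-width: 8
     * End:
     */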
From: Larry W. V. <lv...@gm...> - 2009-08-31 14:29:58
|
I was noticing on the wiki's Tcl 8.6 roadmap page < http://wiki.tcl.tk/20966 > that there was hope that Tcl/Tk 8.6 would be released by Tcl 2009 (which of course is September 30 - October 2, 2009). I was wondering if that goal is still reachable? Thank you to everyone working so hard on 8.6! -- Tcl - The glue of a new generation. http://wiki.tcl.tk/ Larry W. Virden http://www.xanga.com/lvirden/ Even if explicitly stated to the contrary, nothing in this posting should be construed as representing my employer's opinions. |
From: Kevin K. <kev...@gm...> - 2009-08-31 12:35:24
|
Donal K. Fellows wrote: > Donal K. Fellows wrote: >> TIP #354: MINOR PRODUCTION-DRIVEN TCLOO REVISIONS TIP#354: YES I think that Donal would be astonished at anything else; we'd discussed these changes extensively in advance of the TIP. -- 7e de ke9tv/2, Kevin |
From: Donal K. F. <don...@ma...> - 2009-08-31 10:55:18
|
Donal K. Fellows wrote: > TIP #354: MINOR PRODUCTION-DRIVEN TCLOO REVISIONS This is a Call For Votes on this TIP. As I've noted before, these are really small changes that have proved necessary in deployments (and are actually deployed now) so there's absolutely no question of whether they work. :-) The vote will last a week (until 12:00 in my timezone next Monday, i.e., [clock format 1252321200]) and will be conducted by sending messages in the usual fashion to this mailing list. My vote follows: TIP#354: YES Donal. |
From: Jan N. <nij...@us...> - 2009-08-31 09:21:10
|
2009/8/31 <n_n...@ia...>: > Sorry for this newbie question. Please let me know if this question is not > related to this mailing list. This is, indeed, not a question for tcl-core, but for the answer see: <http://wiki.tcl.tk/52> Regards, Jan Nijtmans |
From: Arjen M. <arj...@de...> - 2009-08-31 09:18:24
|
Hello Narges, you better ask this sort of questions on comp.lang.tcl - this mailing list is for discussion Tcl's implementation itself. But I can suggest an answer to your question: tclkit and starkits and starpacks can be used such a situation. (There are other solutions too, like freewrap, but I will stick to this one). Tclkit is a standalone run-time executable for Tcl and Tk. It contains all that is needed to run a Tcl program. It acts much like tclsh or wish, but it does not require an installation directory. Starkits and starpacks are packaging solutions for your applications: they contain the source code and require no installation either, just copy the file or files to the right place and all is done. You can find a lot of information about them on the wiki: http://wiki.tcl.tk - look for tclkit or starkit. Regards, Arjen On 2009-08-31 14:42, n_n...@ia... wrote: > Dear all, > > I want to run some Tcl codes on a machine which doesn't have Tcl and even > a C compiler installed. The administrator says that these softwares won't > be installed and we should use static libraries. > > But I have told that even compiling with a static Tcl library won't help, > as Tcl also needs some scripts it loads at start up. So what can I do? Is > it possible to install Tcl on a machine which doesn't have a C compiler > installed. > > Sorry for this newbie question. Please let me know if this question is not > related to this mailing list. > |
From: <n_n...@ia...> - 2009-08-31 09:11:43
|
Dear all, I want to run some Tcl codes on a machine which doesn't have Tcl and even a C compiler installed. The administrator says that these softwares won't be installed and we should use static libraries. But I have told that even compiling with a static Tcl library won't help, as Tcl also needs some scripts it loads at start up. So what can I do? Is it possible to install Tcl on a machine which doesn't have a C compiler installed. Sorry for this newbie question. Please let me know if this question is not related to this mailing list. Many thanks, Narges Nikoofard |
From: David G. <dav...@po...> - 2009-08-27 13:59:13
|
Roman Puls wrote: > @David: does the current IOCP support worker threads, or is it a single worker thread? What do you think? iocpsock uses a single (hidden) worker thread for the entire application blocked on GetQueuedCompletionStatus() herein known as GQCS. When the completion packet comes in to HandleIO() <http://iocpsock.cvs.sourceforge.net/viewvc/iocpsock/iocpsock/iocpsock_lolevel.c?revision=1.116&view=markup#l_2273> it is minimally processed and sent to the thread responsible for that channel. iocpsock does support multiple Tcl threads in the same way that channels are transferable. Note that blocking calls are emulated to behave that way. If you want to start a worker thread and block on [read], go ahead. I'm of the camp that doesn't believe in one thread per client connection is the best model. A thread that is idle 99.9997% of its lifetime is a huge waste of the memory, etc. it locks. IOCP with 70k concurrent TCP sockets on like Windows Server 2008 is easily doable. 70k threads is NOT. Thread pooling is a better approach for controlling resources. Have the main thread do the listening and initial recv processing. Pass the channel to a free worker thread when a block is ready for processing such as an SQL query. Maybe return the channel to the main thread between blocks? Maybe not transfer at all if thread::send is sufficient for any singular replies. tclhttpd uses this method as it was written before channels were transferable. If you've read the docs on IOCP, you've probably seen how the worker threads "drive" the processing. This model is a bit different when viewed from Tcl running iocpsock as it has been tcl'ized back to the event loop. The large pink elephant in the room that no one talks about with IOCP is that multiple threads in use with GQCS, though wonderfully efficient, has a one-to-many-to-one problem regarding processing order. iocpsock avoids the problem by only using a single thread blocked on GQCS. The minimal processing in HandleIO() is done in such a way as to never let our single _golden_ thread block on anything -- pure feed-forward. Not even an EnterCriticalSection() is used as I found it really wrecked things. Well, actually, there is one but didn't want to venture into lock-free linked lists, but did play around a little and there is some experimental code that isn't in use @ http://iocpsock.cvs.sourceforge.net/viewvc/iocpsock/iocpsock/linkedlist.c?revision=1.4&view=markup#l_14 Performance is found on how quickly you can process the other side of the event sources. IOW, how quickly can the next [fileevent] script be run. Want to spread the "meat" of each over threads? Sure! A long time ago I tested Tcl_DoOneEvent() to see how fast is was at processing existing but null events under profiling. I forget the exact setup. On a 500MHz pIII it clocked 1500 iteration per sec. HandleIO() did over 3500 if I remember right, so feel comfort knowing iocpsock can fill Tcl faster than it can be drained. |
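For readers unfamiliar with the pattern David describes, a bare-bones completion thread parked on GetQueuedCompletionStatus() looks roughly as follows. This is a generic Win32 sketch, not iocpsock's HandleIO(); the per-channel dispatch step is only indicated by a comment.

    #include <windows.h>

    /* Generic sketch of one thread draining an I/O completion port.
     * Real code (e.g. iocpsock's HandleIO) does per-channel bookkeeping here
     * and forwards the packet to the Tcl thread that owns the channel. */
    static DWORD WINAPI
    CompletionThread(LPVOID arg)
    {
        HANDLE port = (HANDLE) arg;    /* created with CreateIoCompletionPort() */
        DWORD nbytes;
        ULONG_PTR key;                 /* per-socket context set at association time */
        LPOVERLAPPED ovl;

        for (;;) {
            BOOL ok = GetQueuedCompletionStatus(port, &nbytes, &key, &ovl, INFINITE);
            if (!ok && ovl == NULL) {
                break;                 /* the port itself went away; shut down */
            }
            /* Minimal, non-blocking processing only: classify the completion
             * (recv/send/accept), then hand it off to the owning thread's
             * event queue.  Never block this thread. */
            /* ... dispatch(key, ovl, nbytes, ok) ... */
        }
        return 0;
    }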
From: Roman P. <pu...@x-...> - 2009-08-27 11:19:01
|
Hi Alex, thanks for coming back on my points. I already sent an answer to Daniel discussing the application scenario with some hundred threads. Regarding the broadcast channel: well, that might lead to http://en.wikipedia.org/wiki/Thundering_herd_problem - so maybe it's not the best idea. What do you think? Back to the screaming metal: I guess this is sort of theological discussion wether to server each client with a single thread, or to have one process serve all clients (Tcl), or have thread pools with each thread serving many clients. My believe (of course) is the latter, especially to avoid stack space waste, so I am currently working on libevent 2.x as an extension for Tcl, having SSL, buffer events, multi-threading, epoll and IOCP on windows (when it's available in libevent). My first tests however show an incredible performance boost, having moderate CPU usage serving some hundred clients. As for the Tcl-core: a relatively easy step to increase the total number of sockets (and performance) would be to replace select by epoll/poll, where available. With pthread_kill I remember that signal handling can be quite expensive (time) and overruns may occur. Personally, I'll stay with pipes (which can overrun as well) ;) Finally, and as asked before in the email to Daniel, are there any general plans on modifying / extending the io handling in the core? Is there a place for a wish list? Let me know! @David: does the current IOCP support worker threads, or is it a single worker thread? What do you think? Thanks & take care! Roman PS: what I am doing right now is to have libevent live in separate threads and have inter-thread communication over fd's (to the tcl core). Whenever a worker thread has decrypted and re-assembled a complete message, it simply places an event to the main tcl thread. That's nice in terms of that the application logic remains single-threaded, but having worker threads which can share the ssl encryption etc. > -----Original Message----- > From: Alexandre Ferrieux [mailto:ale...@gm...] > Sent: Thursday, August 27, 2009 10:56 AM > To: Roman Puls > Cc: tcl...@li...;David Gravereaux > Subject: Re: [TCLCORE] multithreaded Tcl IO / select vs. > epoll and IOCP > > On 8/26/09, Roman Puls <pu...@x-...> wrote: > > > > Hi Alex, > > > > > > > About the extra-fd solution: indeed we had thought of > one > extra > > pipe per thread for interthread communication, but it > is > judged Too > > Fscking Expensive by many... Now you're saying > "shared > between the > > threads"... > > > what do you have in mind ? One single pipe, with N readers > > > selecting on it ? Is this supposed to work ? Also, this would > > > awaken everybody whenever a thread::send is done; I'm > concerned > > about the efficiency. > > > Please elaborate. > > > > > > Well, it depends on the application or architecture: if > we're talking about io-centric worker threads (which > typically reflect the number of CPUs or cores), the overhead > for pipes will not be large. Have one fd per thread, and > you're done writing to it. Also, for Tcl, that pipe could > handle all sort of event notification. > > The problem is that a generic thing like the Tcl core cannot > make restrictive assumptions on the app :-}. I remember > people talking about thousands of threads were one pipe for > each one would be unthinkable; all the more so if > inter-thread communication is *not* used in the app... which > is notably often the case in the category of high-throughput > TCP servers ! 
> > > > However, reading the man pages for 'eventfd', 'signalfd', > 'timerfd_create', select/epoll seems to be the way on how to > realize that sort of event multiplexing / inter-thread > communication. But of course, these are for kernels >= 2.6.22 > only, and linux only. > > See http://lwn.net/Articles/225714/: "It brings Linux > toward a single interface for event delivery without the need > for a new, complex API." > > > > Very interesting... for our children :) (I have 2.4 machines around) > > > For the rest of the unix world, still pipes seem to be the only way. > > Yes. Pipes *or* pthread_kill when applicable. > > > And, as you were asking: select/epoll'ing the same handle > will probably not work (or lead to unwanted results), but > duping the pipe handles will do. But as written before, the > cleanest approach for me is to have one pipe per thread. > > Just checked: both duping and sharing work with select() on my 2.6.9. > So we do have a cheap broadcast channel after all, even > though it doesn't fit the purpose. > > > About the 'screaming metal': well, for the server side, I > need one accept socket (true), but N reading/writing sockets > (with let's say N around 512). Your proposal was probably not > to have 512 threads sitting on a blocked select, right? > > My proposal was to have no select at all: > > while(1) { > s2=accept(...); > pthread_create(worker,s2); > } > > void worker(int fd) { > while(1) { > n=read(fd,...) > if (n<=0) break; > ... > } > close(fd); > } > > > > > The only clean way to handle it is to have either one big > IO/loop (what I do now, with the known limits) or to have > several IO/loops, each sitting on its own (worker)thread. > > Yup. The latter. But no notifier needed. > > > Being curious if Daniel has something to add, and - what > do the windows guys think? Who's actually taking care of that > part? Do you plan to integrate IOCP? > > IIRC David Gravereaux (Cced) maintains a branch where he > rewrote all the Windows socket code using IOCP. Maybe he can step in ? > > -Alex > > |
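The one-pipe-per-worker wake-up Roman says he will stay with is essentially the classic self-pipe trick; a minimal POSIX sketch follows. All names are illustrative, and the cross-thread message handling is only hinted at in comments.

    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/select.h>

    /* Illustrative self-pipe wake-up: each worker owns one pipe; any other
     * thread can interrupt its select() by writing a single byte. */
    typedef struct Worker {
        int wakeRead;    /* read end, added to this thread's select() set */
        int wakeWrite;   /* write end, used by other threads to wake it */
    } Worker;

    static int WorkerInit(Worker *w) {
        int fds[2];
        if (pipe(fds) != 0) return -1;
        fcntl(fds[0], F_SETFL, O_NONBLOCK);
        fcntl(fds[1], F_SETFL, O_NONBLOCK);
        w->wakeRead = fds[0];
        w->wakeWrite = fds[1];
        return 0;
    }

    /* Called from any thread that wants this worker to re-examine its queues. */
    static void WorkerWake(Worker *w) {
        char c = 'w';
        (void) write(w->wakeWrite, &c, 1);   /* overrun is harmless: one byte is enough */
    }

    /* Inside the worker's loop: include wakeRead alongside the client sockets. */
    static void WorkerWaitOnce(Worker *w, int clientFd) {
        fd_set rset;
        int maxFd;

        FD_ZERO(&rset);
        FD_SET(w->wakeRead, &rset);
        FD_SET(clientFd, &rset);
        maxFd = (w->wakeRead > clientFd ? w->wakeRead : clientFd) + 1;
        if (select(maxFd, &rset, NULL, NULL, NULL) > 0 && FD_ISSET(w->wakeRead, &rset)) {
            char buf[64];
            while (read(w->wakeRead, buf, sizeof(buf)) > 0) { /* drain */ }
            /* ... process cross-thread messages here ... */
        }
    }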
From: Roman P. <pu...@x-...> - 2009-08-27 10:55:06
|
Hi Daniel, > > Well, it depends on the application or architecture: if > we're talking > > about io-centric worker threads (which typically reflect > the number of > > CPUs or cores), the overhead for pipes will not be large. > Have one fd > > per thread > > AFAIR the objection to such a scheme came from users like > AOLServer that have 1000s of threads, so that one extra fd > per thread would be very problematic... > absolutely true! That's why I meant "it depends on the application or architecture". My suggestions were made with my type of application in mind, and with the current solution of having one jumbo select, no benefits arise from the usage of multiple threads, so one can (basically) stick with the single-threaded version. But of course, it's still possible to attach an extension which brings in the power of multithreaded IOCP/epoll/kqueue. If Tcl (in future) wants to address performant, high-scale IO applications (like network servers), the current model has IMHO reached it's limits. It might be worth to think about a model where the user can freely decide wether or not a jumbo select is done or if a select/epoll should only take care for the thread specific fd's. BTW: are there any plans regarding IO for future releases? Have we got a roadmap? Cheers, Roman PS: thanks for the poll notifier link! |
From: Alexandre F. <ale...@gm...> - 2009-08-27 09:53:59
|
On 8/26/09, Daniel A. Steffen <da...@us...> wrote:
> On Wed, Aug 26, 2009 at 15:05, Alexandre Ferrieux <ale...@gm...> wrote:
> > I remember Daniel A. Steffen (TCTus emeritus) waving red flags about
> > pthread_kill(), and that stopped me in my tracks. Maybe he can
> > elaborate on the problems ? Daniel ?
>
> my main concern was portability (compared to the existing solution,
> which may well be the lowest common denominator but as a consequence
> does work everywhere...), along with inability to detect at configure
> time which of the various alternative options work correctly (and
> perform better) on a given platform.
> See for example how libev has to decide at compile-time&runtime which
> of its options to use on a given system due to various OS brokenness
> http://cvs.schmorp.de/libev/ev.c?revision=1.313&view=markup

Following this link I see many ifdefs indeed, but they rather seem to be geared towards making the best use of the given platform (select vs poll, kevents vs signalfd, etc). I don't see anything pertaining to pthread_kill being unsupported or buggy. Googling also yields reports about a few pthread_kill issues, but they seem to be restricted to corner cases (like sending to a dead thread or to self, or using the KILL signal), which are very far from the use we'd be making of it (pthread_kill to a living thread, different from the sender, of a benign signal, with an empty handler). Do you have more precise evidence ?

> [...] writing a bug-free notifier
> is significantly harder than you might assume and takes a long time
> I can only assume that writing a notifier that works correctly on
> multiple platforms can only be worse ;-)

Yes, but the idea here is not to reinvent the wheel: bring back the (unmodified, heavily tested) unthreaded notifier into each thread, and only add the *possibility* of doing interthread communication by a pthread_kill -> EINTR. The key is that (1) all single-threaded uses are guaranteed to work and (2) all non-thread::send uses too...

-Alex |
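The mechanism Alexandre sketches in words - a benign signal with an empty handler, delivered via pthread_kill() so that the target thread's select() returns EINTR - would look roughly like this in C. The signal choice (SIGUSR1) and all names are assumptions, not proposed notifier code.

    #include <signal.h>
    #include <pthread.h>
    #include <errno.h>
    #include <sys/select.h>

    /* Empty handler: its only job is to make a blocking select() return EINTR. */
    static void WakeHandler(int sig) { (void) sig; }

    static void InstallWakeSignal(void) {
        struct sigaction sa;
        sa.sa_handler = WakeHandler;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;                 /* no SA_RESTART: we *want* EINTR */
        sigaction(SIGUSR1, &sa, NULL);   /* SIGUSR1 is an assumption */
    }

    /* Another thread wakes a target thread that is parked in select(). */
    static void WakeThread(pthread_t target) {
        pthread_kill(target, SIGUSR1);
    }

    /* Per-thread wait: on EINTR, fall through and service queued cross-thread work. */
    static int WaitForEvents(int fd, fd_set *readable) {
        int n;
        FD_ZERO(readable);
        FD_SET(fd, readable);
        n = select(fd + 1, readable, NULL, NULL, NULL);
        if (n < 0 && errno == EINTR) {
            return 0;    /* woken by pthread_kill(): check inter-thread mailboxes */
        }
        return n;
    }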
From: Alexandre F. <ale...@gm...> - 2009-08-27 08:56:01
|
On 8/26/09, Roman Puls <pu...@x-...> wrote:
> Hi Alex,
>
> > About the extra-fd solution: indeed we had thought of one
> > extra pipe per thread for interthread communication, but it
> > is judged Too Fscking Expensive by many... Now you're saying
> > "shared between the threads"...
> > what do you have in mind ? One single pipe, with N readers
> > selecting on it ? Is this supposed to work ? Also, this would
> > awaken everybody whenever a thread::send is done; I'm
> > concerned about the efficiency.
> > Please elaborate.
>
> Well, it depends on the application or architecture: if we're talking about io-centric worker threads (which typically reflect the number of CPUs or cores), the overhead for pipes will not be large. Have one fd per thread, and you're done writing to it. Also, for Tcl, that pipe could handle all sort of event notification.

The problem is that a generic thing like the Tcl core cannot make restrictive assumptions on the app :-}. I remember people talking about thousands of threads where one pipe for each one would be unthinkable; all the more so if inter-thread communication is *not* used in the app... which is notably often the case in the category of high-throughput TCP servers !

> However, reading the man pages for 'eventfd', 'signalfd', 'timerfd_create', select/epoll seems to be the way on how to realize that sort of event multiplexing / inter-thread communication. But of course, these are for kernels >= 2.6.22 only, and linux only.
> See http://lwn.net/Articles/225714/: "It brings Linux toward a single interface for event delivery without the need for a new, complex API."

Very interesting... for our children :) (I have 2.4 machines around)

> For the rest of the unix world, still pipes seem to be the only way.

Yes. Pipes *or* pthread_kill when applicable.

> And, as you were asking: select/epoll'ing the same handle will probably not work (or lead to unwanted results), but duping the pipe handles will do. But as written before, the cleanest approach for me is to have one pipe per thread.

Just checked: both duping and sharing work with select() on my 2.6.9. So we do have a cheap broadcast channel after all, even though it doesn't fit the purpose.

> About the 'screaming metal': well, for the server side, I need one accept socket (true), but N reading/writing sockets (with let's say N around 512). Your proposal was probably not to have 512 threads sitting on a blocked select, right?

My proposal was to have no select at all:

    while (1) {
        s2 = accept(...);
        pthread_create(worker, s2);
    }

    void worker(int fd) {
        while (1) {
            n = read(fd, ...);
            if (n <= 0) break;
            ...
        }
        close(fd);
    }

> The only clean way to handle it is to have either one big IO/loop (what I do now, with the known limits) or to have several IO/loops, each sitting on its own (worker)thread.

Yup. The latter. But no notifier needed.

> Being curious if Daniel has something to add, and - what do the windows guys think? Who's actually taking care of that part? Do you plan to integrate IOCP?

IIRC David Gravereaux (Cced) maintains a branch where he rewrote all the Windows socket code using IOCP. Maybe he can step in ?

-Alex |
From: Donal K. F. <don...@ma...> - 2009-08-27 08:37:14
|
Donal K. Fellows wrote: > TIP #354: MINOR PRODUCTION-DRIVEN TCLOO REVISIONS > =================================================== [...] > This TIP describes a few small changes required for solving issues that > have been found when using TclOO in production. To be clear, this TIP describes features that are already deployed. The change of 'forward' resolution makes them more usable at all (before, they were really not fit for purpose), [info object namespace] is just what's needed to enable a whole range of neat features in scripts that aren't quite what the standard model does, and Tcl_GetObjectName is just something that Kevin wanted and which I already had internally. All have had rather a lot of in-production road-testing now. Because these are deployed features, I'm not going to hang around a lot before calling the vote and the vote period will be short. The TIP is really just there to act as documentation of what was done. (OK, I think this is slightly naughty by the rules, but I ask forgiveness.) Donal. |
From: Daniel A. S. <da...@ca...> - 2009-08-26 16:13:07
|
On Wed, Aug 26, 2009 at 17:51, Roman Puls<pu...@x-...> wrote: > Well, it depends on the application or architecture: if we're talking about io-centric worker threads (which typically reflect the number of CPUs or cores), the overhead for pipes will not be large. Have one fd per thread AFAIR the objection to such a scheme came from users like AOLServer that have 1000s of threads, so that one extra fd per thread would be very problematic... Cheers, Daniel |
From: Daniel A. S. <da...@us...> - 2009-08-26 16:08:28
|
On Wed, Aug 26, 2009 at 15:05, Alexandre Ferrieux<ale...@gm...> wrote: > I remember Daniel A. Steffen (TCTus emeritus) waving red flags about > pthread_kill(), and that stopped me in my tracks. Maybe he can > elaborate on the problems ? Daniel ? my main concern was portability (compared to the existing solution, which may well be the lowest common denominator but as a consequence does work everywhere...), along with inability to detect at configure time which of the various alternative options work correctly (and perform better) on a given platform. See for example how libev has to decide at compile-time&runtime which of its options to use on a given system due to various OS brokenness http://cvs.schmorp.de/libev/ev.c?revision=1.313&view=markup If it is purely a matter of plugging-in a better-performing notifier on one platform, Linux, you should indeed be able to do this by calling Tcl_SetNotifier() before Tcl_Init() from an embedding executable without needing changes in the core (at some point in 8.5 history I fixed some issues with Tcl_SetNotifier so that all notifier functions should now be cleanly hookable, if not, please file a bug, that's definitely something that should work). You might be able to start with the following poll()/pthread_kill() based alternative notifier: http://sourceforge.net/tracker/index.php?func=detail&aid=1470152&group_id=10894&atid=110894 Some real-world experience with non-select()-based notifiers would indeed be very welcome and provide much better data for any future decisions about the default unix notifier in tcl than theoretical discussion about which API combination "should work" and "should perform better"... Having implemented an alternative notifier for one platform (tcl/macosx/tclMacOSXNotify.c), my experience was that the devil is very much in the details, that the tcl testsuite will ferret out many (but not all) tricky cornercases, and that writing a bug-free notifier is significantly harder than you might assume and takes a long time (just fixed an obscure bug in tclMacOSXNotify.c a few days ago and the initial version dates from 2005...) I can only assume that writing an notifier that works correctly on multiple platforms can only be worse ;-) Cheers, Daniel |
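The hook-in pattern Daniel describes might look like the skeleton below when done from an embedding executable. The hook functions are placeholders (bodies omitted), and the Tcl_NotifierProcs field names should be verified against the tcl.h that ships with the Tcl version in use.

    #include <tcl.h>
    #include <string.h>

    /* Placeholder hooks for a custom notifier (e.g. a poll()/epoll()-based
     * wait and a pthread_kill()-based alert); bodies are omitted here. */
    extern Tcl_WaitForEventProc      MyWaitForEvent;
    extern Tcl_AlertNotifierProc     MyAlertNotifier;
    extern Tcl_CreateFileHandlerProc MyCreateFileHandler;
    extern Tcl_DeleteFileHandlerProc MyDeleteFileHandler;

    int
    main(int argc, char **argv)
    {
        Tcl_NotifierProcs np;
        Tcl_Interp *interp;

        memset(&np, 0, sizeof(np));
        /* Field names as found in Tcl 8.5/8.6 tcl.h -- verify locally.
         * Hooks left NULL are intended to fall back to the built-in
         * notifier (check tclNotify.c for your version). */
        np.waitForEventProc      = MyWaitForEvent;
        np.alertNotifierProc     = MyAlertNotifier;
        np.createFileHandlerProc = MyCreateFileHandler;
        np.deleteFileHandlerProc = MyDeleteFileHandler;

        /* Install the replacement before any interpreter is initialized. */
        Tcl_SetNotifier(&np);

        Tcl_FindExecutable(argv[0]);
        interp = Tcl_CreateInterp();
        if (Tcl_Init(interp) != TCL_OK) {
            return 1;
        }
        /* ... application code / event loop ... */
        return 0;
    }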
From: Roman P. <pu...@x-...> - 2009-08-26 15:52:15
|
Hi Alex, > About the extra-fd solution: indeed we had thought of one > extra pipe per thread for interthread communication, but it > is judged Too Fscking Expensive by many... Now you're saying > "shared between the threads"... > what do you have in mind ? One single pipe, with N readers > selecting on it ? Is this supposed to work ? Also, this would > awaken everybody whenever a thread::send is done; I'm > concerned about the efficiency. > Please elaborate. Well, it depends on the application or architecture: if we're talking about io-centric worker threads (which typically reflect the number of CPUs or cores), the overhead for pipes will not be large. Have one fd per thread, and you're done writing to it. Also, for Tcl, that pipe could handle all sort of event notification. However, reading the man pages for 'eventfd', 'signalfd', 'timerfd_create', select/epoll seems to be the way on how to realize that sort of event multiplexing / inter-thread communication. But of course, these are for kernels >= 2.6.22 only, and linux only. For the rest of the unix world, still pipes seem to be the only way. See http://lwn.net/Articles/225714/: "It brings Linux toward a single interface for event delivery without the need for a new, complex API." And, as you were asking: select/epoll'ing the same handle will probably not work (or lead to unwanted results), but duping the pipe handles will do. But as written before, the cleanest approach for me is to have one pipe per thread. About the 'screaming metal': well, for the server side, I need one accept socket (true), but N reading/writing sockets (with let's say N around 512). Your proposal was probably not to have 512 threads sitting on a blocked select, right? The only clean way to handle it is to have either one big IO/loop (what I do now, with the known limits) or to have several IO/loops, each sitting on its own (worker)thread. Even if I am motivated to contribute to the topic (also with code, if needed), I'll probably leave the Tcl-Loop here and run my own loops in separate threads, just letting Tcl know if there's a complete message (as Tcl_Obj). Unfortunately, this will break TLS stacking (and I hate that weird openssl documentation...) Being curious if Daniel has something to add, and - what do the windows guys think? Who's actually taking care of that part? Do you plan to integrate IOCP? Cheers! Roman > -----Original Message----- > From: Alexandre Ferrieux [mailto:ale...@gm...] > Sent: Wednesday, August 26, 2009 3:05 PM > To: Roman Puls > Cc: tcl...@li... > Subject: Re: [TCLCORE] multithreaded Tcl IO / select vs. > epoll and IOCP > > On 8/26/09, Roman Puls <pu...@x-...> wrote: > > > > Hi Alexandre, > > > > thanks for the fast reply and your extensive answers, and > questions ;) ! > > > > > > Let's start with the SSL/TLS empty reads: yes, we see > them, and silently ignore them. To get a better picture: > we're having around 500 clients per server, establishing a > TLS connection to it which is held over lifetime of the > clients. Stalling does apparently not occur, but we're > running a higher layer keep-alive over the sockets, and we > cannot see any freezing clients so far. Possibly we don't see > the effect as the connections keep open for a very long time? > As for the empty reads, it happens once in a while, and we > tend to live with it - for now. > > OK. Be sure to come back to the bugreport if you get into > trouble again. 
> > > About the select 'bottleneck': from my general > understanding, I would have expected to have a thread only to > poll it's own fd's. Having a jumbo select on ALL available > fd's of course does eliminate the benefit of IO-centric > worker threads. However, having a thread locked in a > select/epoll/poll sys-call, we're no longer responsive to > inter-thread communication from other threads. pthread_kill > might be an option with pselect, epoll_pwait (or with the > plain version, restoring the masks afterwards). Another > option may be to have special fd's shared between the > threads, so a write operation would wake up the others. > > About the extra-fd solution: indeed we had thought of one > extra pipe per thread for interthread communication, but it > is judged Too Fscking Expensive by many... Now you're saying > "shared between the threads"... > what do you have in mind ? One single pipe, with N readers > selecting on it ? Is this supposed to work ? Also, this would > awaken everybody whenever a thread::send is done; I'm > concerned about the efficiency. > Please elaborate. > > > Talking about performance, yes, we've got ~35 user, and > 55% kernel CPU usage. Right now, to be honest, I've got no > clue on how to monitor a single (p)thread inside a process. > Any tools out there? > > Use the 'H' flag in ps: > > ps axuwwwH > > it has the effect of giving each thread a separate line of output. > For some reason all seem to have the process's pid (on my old > CentOS 4.7), but a complimentary strace can disambiguate. > > You can also go directly to /proc/$PID/task/$TPID/stat and > monitor the [c][us]time values to reimplement a smarter ps. > You can even do it in Tcl ;-) > > > Some words about epoll, by the way: "The Linux-specific > epoll(7) API provides an interface that is more efficient > than select(2) and poll(2) when monitoring large numbers of > file descriptors." My experience is to get a much faster > feedback with many fd's, check out > http://www.monkey.org/~provos/libevent/libevent-benchmark.jpg. > And regardless of select, epoll, kqueue, poll, it seems as > the event-loop architecture needs to interrupt those system > calls when it comes to inter-thread communication. These > days, select is the lowest common approach, and the other > mechanisms exist for good reasons. > > While I agree in general, it would seem that in your case: > > - the incoming-TCP part just needs a select on one fd, the > server socket. Or even none, just a blocking accept() in a > loop. Do you confirm ? > > - the individual mid-life IO also need a single fd. > Possibly no select either, just a blocking read(). > > So if you're upping your sleeves to make the metal scream (as > it seems you're preparing to), why not streamline this part > to accept() on one side and read() on the other, without any > intervention of the notifier, hence no select() nor epoll() > nor libevent ? (Note that you don't need to replace the > notifier: just refrain from calling [vwait]/Tcl_DoOneEvent). > > > Are you aware of an existing implementation of an extern > (replacement) loop? Or could I plug-in a modified version, > which does implement pthread_kill? > > I remember Daniel A. Steffen (TCTus emeritus) waving red > flags about pthread_kill(), and that stopped me in my tracks. > Maybe he can elaborate on the problems ? Daniel ? > > -Alex > > |
From: Alexandre F. <ale...@gm...> - 2009-08-26 13:05:33
|
On 8/26/09, Roman Puls <pu...@x-...> wrote:
> Hi Alexandre,
>
> thanks for the fast reply and your extensive answers, and questions ;) !
>
> Let's start with the SSL/TLS empty reads: yes, we see them, and silently ignore them. To get a better picture: we're having around 500 clients per server, establishing a TLS connection to it which is held over lifetime of the clients. Stalling does apparently not occur, but we're running a higher layer keep-alive over the sockets, and we cannot see any freezing clients so far. Possibly we don't see the effect as the connections keep open for a very long time? As for the empty reads, it happens once in a while, and we tend to live with it - for now.

OK. Be sure to come back to the bugreport if you get into trouble again.

> About the select 'bottleneck': from my general understanding, I would have expected to have a thread only to poll its own fd's. Having a jumbo select on ALL available fd's of course does eliminate the benefit of IO-centric worker threads. However, having a thread locked in a select/epoll/poll sys-call, we're no longer responsive to inter-thread communication from other threads. pthread_kill might be an option with pselect, epoll_pwait (or with the plain version, restoring the masks afterwards). Another option may be to have special fd's shared between the threads, so a write operation would wake up the others.

About the extra-fd solution: indeed we had thought of one extra pipe per thread for interthread communication, but it is judged Too Fscking Expensive by many... Now you're saying "shared between the threads"... what do you have in mind ? One single pipe, with N readers selecting on it ? Is this supposed to work ? Also, this would awaken everybody whenever a thread::send is done; I'm concerned about the efficiency. Please elaborate.

> Talking about performance, yes, we've got ~35 user, and 55% kernel CPU usage. Right now, to be honest, I've got no clue on how to monitor a single (p)thread inside a process. Any tools out there?

Use the 'H' flag in ps:

    ps axuwwwH

it has the effect of giving each thread a separate line of output. For some reason all seem to have the process's pid (on my old CentOS 4.7), but a complementary strace can disambiguate. You can also go directly to /proc/$PID/task/$TPID/stat and monitor the [c][us]time values to reimplement a smarter ps. You can even do it in Tcl ;-)

> Some words about epoll, by the way: "The Linux-specific epoll(7) API provides an interface that is more efficient than select(2) and poll(2) when monitoring large numbers of file descriptors." My experience is to get a much faster feedback with many fd's, check out http://www.monkey.org/~provos/libevent/libevent-benchmark.jpg. And regardless of select, epoll, kqueue, poll, it seems as the event-loop architecture needs to interrupt those system calls when it comes to inter-thread communication. These days, select is the lowest common approach, and the other mechanisms exist for good reasons.

While I agree in general, it would seem that in your case:

- the incoming-TCP part just needs a select on one fd, the server socket. Or even none, just a blocking accept() in a loop. Do you confirm ?

- the individual mid-life IO also needs a single fd. Possibly no select either, just a blocking read().

So if you're rolling up your sleeves to make the metal scream (as it seems you're preparing to), why not streamline this part to accept() on one side and read() on the other, without any intervention of the notifier, hence no select() nor epoll() nor libevent ? (Note that you don't need to replace the notifier: just refrain from calling [vwait]/Tcl_DoOneEvent.)

> Are you aware of an existing implementation of an extern (replacement) loop? Or could I plug-in a modified version, which does implement pthread_kill?

I remember Daniel A. Steffen (TCTus emeritus) waving red flags about pthread_kill(), and that stopped me in my tracks. Maybe he can elaborate on the problems ? Daniel ?

-Alex |
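Acting on the /proc suggestion above is straightforward; below is a Linux-only, illustrative sketch that prints utime/stime for every thread of a process (stat fields 14 and 15, per proc(5)). All names are placeholders.

    #include <stdio.h>
    #include <dirent.h>

    /* Print per-thread user/system CPU time (in clock ticks) for a given pid,
     * by scanning /proc/<pid>/task/<tid>/stat.  Linux-only, illustrative. */
    static void PrintThreadTimes(long pid)
    {
        char path[64];
        DIR *dir;
        struct dirent *ent;

        snprintf(path, sizeof(path), "/proc/%ld/task", pid);
        dir = opendir(path);
        if (dir == NULL) return;

        while ((ent = readdir(dir)) != NULL) {
            char statPath[512];
            char comm[64], state;
            unsigned long utime = 0, stime = 0;
            FILE *fp;

            if (ent->d_name[0] == '.') continue;        /* skip "." and ".." */
            snprintf(statPath, sizeof(statPath), "%s/%s/stat", path, ent->d_name);
            fp = fopen(statPath, "r");
            if (fp == NULL) continue;

            /* Fields: pid (comm) state, then 10 fields we skip, then utime stime. */
            if (fscanf(fp, "%*d (%63[^)]) %c %*d %*d %*d %*d %*d"
                           " %*u %*lu %*lu %*lu %*lu %lu %lu",
                       comm, &state, &utime, &stime) == 4) {
                printf("tid %s (%s): utime=%lu stime=%lu ticks\n",
                       ent->d_name, comm, utime, stime);
            }
            fclose(fp);
        }
        closedir(dir);
    }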
From: Roman P. <pu...@x-...> - 2009-08-26 11:39:30
|
Hi Alexandre, thanks for the fast reply and your extensive answers, and questions ;) ! Let's start with the SSL/TLS empty reads: yes, we see them, and silently ignore them. To get a better picture: we're having around 500 clients per server, establishing a TLS connection to it which is held over lifetime of the clients. Stalling does apparently not occur, but we're running a higher layer keep-alive over the sockets, and we cannot see any freezing clients so far. Possibly we don't see the effect as the connections keep open for a very long time? As for the empty reads, it happens once in a while, and we tend to live with it - for now. About the select 'bottleneck': from my general understanding, I would have expected to have a thread only to poll it's own fd's. Having a jumbo select on ALL available fd's of course does eliminate the benefit of IO-centric worker threads. However, having a thread locked in a select/epoll/poll sys-call, we're no longer responsive to inter-thread communication from other threads. pthread_kill might be an option with pselect, epoll_pwait (or with the plain version, restoring the masks afterwards). Another option may be to have special fd's shared between the threads, so a write operation would wake up the others. Talking about performance, yes, we've got ~35 user, and 55% kernel CPU usage. Right now, to be honest, I've got no clue on how to monitor a single (p)thread inside a process. Any tools out there? Some words about epoll, by the way: "The Linux-specific epoll(7) API provides an interface that is more efficient than select(2) and poll(2) when monitoring large numbers of file descriptors." My experience is to get a much faster feedback with many fd's, check out http://www.monkey.org/~provos/libevent/libevent-benchmark.jpg. And regardless of select, epoll, kqueue, poll, it seems as the event-loop architecture needs to interrupt those system calls when it comes to inter-thread communication. These days, select is the lowest common approach, and the other mechanisms exist for good reasons. Let me come to your questions (quoted now): > 1a) Since you're doing all this socket event dispatching in > C, are you sure Tcl is at the proper place ? Sure. It's the framework plus scripting on the higher layer. > 1b) IOW, do you observe a performance boost by doing this > instead of script level and unmodified tclsh ? Yep, we see an improvement of at least 35%, depeding on the current load and number of channels. > 2a) In the target application (not the minimized case), what is the > bottleneck: connection creation/destruction, or mid-life I/O > and computations ? Hehe, I knew somebody would ask that ;) Well, it's actually connection setup (ssl needs some heavy math calculations), and then mid-life I/O (encryption, copying data around). The server itself is written as a pure async I/O multiplexer, sending messages across the clients and to some backend systems. Also, there's no long term operation, and the only time-consuming action is to write files (which is cached, raid-10 and quite fast and around 200MB/sec). > 2b) IOW, what are the figures of the number of new > connections per second and average connection lifetime ? new connections are usually 1/minute, but may get bursty after network failures. Average connection lifetime is days to weeks. > 3) What do the worker threads need to share ? Could they be > processes instead ? Unlikely. They need to share a lot of state information. 
The server sends out requests to the clients and async waits for completion of the jobs. Of course, it would be technically possible to do another multiplexing between processes, but I'd strongly prefer to have it inside one process. > 4) have you considered letting an Apache or equivalent handle > the connections, and use FastCGI or other as a gateway to > your worker process ? So we did, but the nature of long-term connections and the need of a multiplexer/fragmenters (like BEEP) inside each tls-channel makes it more than hard. Well, after reading again, I ask myself if "Tcl_SetNotifier" might be an option. Would it be possible to replace the existing select loop by something different, like the above mentionend libevent, which supports timers, signals, epoll? Are you aware of an existing implementation of an extern (replacement) loop? Or could I plug-in a modified version, which does implement pthread_kill? Finally: > > Do we have sort of round-robin in the core, so that every > socket gets > > the same amount of "activity"? > > Yes, unless you file a bug report showing the contrary ;-) I had the *feeling* that some connections are slower and that connections bursting data get more notifications. But that may already happen on the tcp layer, so I'll investigate on that first! Thanks so far! Roman > -----Original Message----- > From: Alexandre Ferrieux [mailto:ale...@gm...] > Sent: Wednesday, August 26, 2009 11:11 AM > To: Roman Puls > Cc: tcl...@li... > Subject: Re: [TCLCORE] multithreaded Tcl IO / select vs. > epoll and IOCP > > Hi Roman > > On Wed, Aug 26, 2009 at 9:04 AM, Roman Puls<pu...@x-...> wrote: > > Dear Core-Developers, > > > > we're using Tcl (8.6b1.1) quite heavy for our server applications, > > appreciating the simplicity of the event loop, using > stacked channels > > (ssl) and the ability to run it both on linux/windows. We do most > > parts in C (using the Tcl-API heavily) and glue the > components on the script level. > > A selfish side question, since you're mentioning it: are you > using fileevents on ssl-capped channels ? If yes, and you > have no bugs to report in this area, then your expertise > would be very helpful to proceed with TLS's nasty stalls and > empty reads (see Tcl bug 1945538). > > > These days, I asked myself how to improve the performance (as we're > > using ssl quite heavy) and tried to make a quick prototype > to test for > > multithreading to take advantage of the Multi-CPU core. The server > > accepts incoming connections, and echos back all data it > receives to the socket. > > [...] > > Having 4 threads, we do not take advantage of all CPUs > (load <= 100%). > > Is that a limitation of the Tcl_DoOneEvent / select() call, > or do you > > see any problems with the code above? > > There is indeed a bottleneck in the current unix notifier: it > is centralized. The reason is the poor marriage between > select() and pthreads, which makes it hard to efficiently > awaken another thread's select(). > Basically things work this way: > > - a central Notifier thread does a jumbo select() for > everybody else. > - the Tcl-interp-coupled threads themselves wait on a > condition variable > - when an fd awakens the central select, the notifier pings > the proper cond var(s). > - interthread communication (thread::send) also uses the cond vars. > > This is a long-term dream of ours (well, at least mine) to > replace this with a more distributed architecture. 
The reason > is not only performance, but also better symmetry with the > unthreaded case, avoiding the presence of that extra hidden > thread for all single-threaded uses of a threaded Tcl. In the > end this would allow to make the threaded core the default in > unix with confidence. > > (FWIW, one possibility to do this would be to use > pthread_kill as an interthread awakening mechanism, each > thread having his own select() just as in the unthreaded core) > > Now I am not 100% sure this is _your_ bottleneck. But it > could well be ;-) > > To investigate, you could first break down the CPU > consumption between user and kernel (I suspect high kernel, > because of the futex syscalls); then get a per-thread > breakdown of the consumption; then attach strace to the hungriest one. > > > We're unable to establish more than 1024 socket connections > (ulimit is > > set to 8192 file handles). Having 1024 sockets, we can > still open regular files. > > Is it possible to bypass that socket limit? Can we somehow > specify the > > backlog for the listening socket? > > Two things here: > > (1) the FD_SETSIZE limit in this case is lower than the max > number of open fds. This is directly linked with the > centralized select() issue, hinting at the fact that the OS > doesn't expect a 2048-fd select (and yes, epoll() is one > modern answer). > (2) the backlog of the listening socket is an entirely > separate issue. If you're hitting it, you must be seeing > ECONNREFUSED in the clients. Is it the case ? If yes, the > limit can be upped in /etc/sysctl.conf through > net.ipv4.tcp_max_syn_backlog. > > > As for one of the next major tcl-releases (yes, I know that > we're not > > having the 'typical' tcl application and that linux is just > one *nix), > > would it make sense to implement epoll for linux (e.g. > > http://www.monkey.org/~provos/libevent/ ) and let's say IOCP > > (http://sourceforge.net/projects/iocpsock) for the windows port? > > I don't have the full picture in mind, so just one remark > here: AFAIK epoll would just allow bigger sets of waited-for > fds, but not ease the pthread interaction. So it is > orthogonal to the decentralization effort. > > > Have you got any other suggestion on how to improve > performance / take > > advantage of multi-processor architectures with the existing core > > implementation? > > Random questions: > > 1a) Since you're doing all this socket event dispatching in > C, are you sure Tcl is at the proper place ? > 1b) IOW, do you observe a performance boost by doing this > instead of script level and unmodified tclsh ? > 2a) In the target application (not the minimized case), what is the > bottleneck: connection creation/destruction, or mid-life I/O > and computations ? > 2b) IOW, what are the figures of the number of new > connections per second and average connection lifetime ? > 3) What do the worker threads need to share ? Could they be > processes instead ? > > Based on answers to these, it might or might not be worth > asking this one: > > 4) have you considered letting an Apache or equivalent handle > the connections, and use FastCGI or other as a gateway to > your worker process ? > > > Do we have sort of round-robin in the core, so that every > socket gets > > the same amount of "activity"? > > Yes, unless you file a bug report showing the contrary ;-) > > -Alex > > |