From: Todd J. F. <to...@in...> - 2007-09-15 15:16:21
As I mentioned in my vrjuggler-info post, we need to be able to assign
processor affinity to draw threads in VR Juggler to keep performance up
on multi-CPU configurations with CPU-heavy draw threads, as in OSG-based
or other scenegraph-based applications.

A logical place to start adding support is in VPR. This code will
assign CPU affinity to the current thread on a Linux system:

    #include <sched.h>        // cpu_set_t, CPU_ZERO, CPU_SET, sched_setaffinity
    #include <sys/syscall.h>  // __NR_gettid
    #include <unistd.h>       // sysconf, syscall

    void setCPUAffinity(int cpu)
    {
       // Only act if the requested CPU index exists on this machine.
       if(sysconf(_SC_NPROCESSORS_CONF) > cpu)
       {
          // Kernel thread ID of the calling thread.
          pid_t mythread = syscall(__NR_gettid);
          if (mythread)
          {
             cpu_set_t cpumask;
             CPU_ZERO( &cpumask );
             CPU_SET( cpu, &cpumask );
             sched_setaffinity( mythread, sizeof(cpumask), &cpumask );
          }
       }
    }

I can add similar code to ThreadPosix.cpp if you would like me to submit
a patch.

The other side of the equation is how to easily assign this affinity
through VR Juggler. Would adding a configuration element option make
sense? It could default to "unassigned" in vrjconfig and allow the user
to specify which window (or viewport?) is rendered by which CPU.

Another thought I had is that we could instead try to make the draw
thread initialization semi-intelligently select a CPU. We could have a
default behavior (0,1,2,3, unassigned) and also allow an environment
variable to define the order in which threads are assigned affinity.
Something like DRAW_THREAD_AFFINITY="0 2 1 3" on a dual-dual-core system
would assign the first two draw threads to physically different CPUs (0 &
2) and the remaining draw threads to the second cores on those CPUs (1 & 3).

Thoughts?

-Todd
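For readers who want to experiment with the idea above, here is a minimal sketch of the same Linux technique with the result reported to the caller. It is not VPR code: the names pinCallingThreadToCpu and firstAllowedCpu are hypothetical, and the sketch assumes a glibc-based Linux system where passing a pid of 0 to sched_setaffinity() applies the mask to the calling thread.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // exposes the CPU_* macros in <sched.h> on glibc
#endif
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET, CPU_ISSET, sched_*affinity

// Hypothetical helper (not part of VPR): pin the calling thread to one CPU.
// Returns true on success and false if the index is out of range or the
// kernel rejects the request.
bool pinCallingThreadToCpu(int cpu)
{
   if (cpu < 0 || cpu >= CPU_SETSIZE)
   {
      return false;
   }

   cpu_set_t mask;
   CPU_ZERO(&mask);
   CPU_SET(cpu, &mask);

   // On Linux, a pid of 0 means "the calling thread".
   return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}

// Hypothetical helper: lowest CPU index the calling thread may currently
// run on, or -1 if the affinity mask cannot be read.
int firstAllowedCpu()
{
   cpu_set_t mask;
   CPU_ZERO(&mask);
   if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
   {
      return -1;
   }

   for (int i = 0; i < CPU_SETSIZE; ++i)
   {
      if (CPU_ISSET(i, &mask))
      {
         return i;
      }
   }
   return -1;
}
```

Returning a bool lets the caller log a warning and carry on without affinity when the request fails, which seems preferable to failing silently in a draw thread.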
From: Patrick H. <pa...@in...> - 2007-09-15 15:34:38
Todd J. Furlong wrote:
> As I mentioned in my vrjuggler-info post, we need to be able to assign
> processor affinity to draw threads in VR Juggler to keep performance up
> on multi-CPU configurations with CPU-heavy draw threads like OSG-based
> or other scenegraph-based applications.
>
> A logical place to start adding support is in VPR. This code will
> assign CPU-affinity to the current thread on a Linux system:
> void setCPUAffinity(int cpu)
> {
>    if(sysconf(_SC_NPROCESSORS_CONF) > cpu)
>    {
>       pid_t mythread = syscall(__NR_gettid);
>       if (mythread)
>       {
>          cpu_set_t cpumask;
>          CPU_ZERO( &cpumask );
>          CPU_SET( cpu, &cpumask );
>          sched_setaffinity( mythread, sizeof(cpumask), &cpumask );
>       }
>    }
> }
>
> I can add similar code to ThreadPosix.cpp if you would like me to submit
> a patch.

That shouldn't be necessary. I think that the above would work fine in
vpr::ThreadPosix::setRunOn(). I haven't looked yet, but does OpenThreads
have a Windows equivalent of this implementation? Fortunately, we are using
native Windows threads again and can make this happen easily if the Windows
APIs are there to do it. I doubt that NSPR is going to offer cross-platform
CPU affinity, but I haven't looked recently.

> The other side of the equation is how to easily assign this affinity
> through VR Juggler. Would adding a configuration element option make
> sense? It could default to "unassigned" in vrjconfig and allow the user
> to specify which window (or viewport?) is rendered by which CPU.
>
> Another thought I had is that we could instead try to make the draw
> thread initialization semi-intelligently select a CPU. We could have a
> default behavior (0,1,2,3, unassigned) and also allow an environment
> variable to define the order in which threads are assigned affinity.
> Something like DRAW_THREAD_AFFINITY="0 2 1 3" on a dual-dual-core system
> would assign the first two draw threads to physically different CPUs (0 &
> 2) and remaining draw threads to the second cores on those CPUs (1 & 3).
>
> Thoughts?

I will have to ponder this for a while. My first reaction is that the
configuration approach would be simpler to deal with in general. However,
the need for CPU affinity would probably vary at the application level more
than at the configuration (i.e., site) level. In that case, the decision is
usually to leave that sort of thing up to the application programmer. But
then we have to consider that the application programmer does not have an
easy way to detect when multiple render threads are being used. That is
dictated by the configuration and would probably be handled most cleanly in
vrj::GlPipe (vrj::opengl::Pipe).

This still being my initial reaction with only a minute or two of thought,
it seems as though the environment variable approach would strike the best
balance. However, I don't like the use of an environment variable for two
reasons:

   1. We have been working really hard to reduce or eliminate the need for
      environment variables in VR Juggler application execution. Strictly
      speaking, this one would be wholly optional, but it is still another
      thing to put in a list (thinking of the documentation) that is already
      too long.
   2. The syntactic nature of your proposal has pitfalls, thus meaning that
      the error handling of the parsing code would need to be fairly smart
      in order to offer reasonable behavior in the event of (inevitable)
      operator error. It is, however, flexible in a way that does address
      the situation without a lot of hassle for users.

That being said, I don't have a better idea to offer at the moment. This is
all very interesting, though. Thanks for pursuing this so diligently.
Perhaps others on this list have suggestions that will refine these ideas
further?

 -Patrick

-- 
Patrick L. Hartling
Senior Software Engineer, Priority 5
http://www.priority5.com/
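The parsing pitfalls raised in point 2 can be made concrete with a small sketch. The function below is hypothetical and not part of VR Juggler. It assumes the value is a whitespace-separated list of zero-based CPU indices, as in the "0 2 1 3" example, and it rejects the whole value on the first malformed, out-of-range, or repeated entry so that the caller can fall back to assigning no affinity at all.

```cpp
#include <cstddef>
#include <exception>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parser for a value such as "0 2 1 3".
// On success, fills 'out' with the CPU order and returns true.
// On any error, leaves 'out' empty and returns false.
bool parseAffinityOrder(const std::string& value, int numCpus,
                        std::vector<int>& out)
{
   out.clear();

   std::istringstream in(value);
   std::vector<int> order;
   std::set<int> seen;
   std::string token;

   while (in >> token)
   {
      std::size_t used = 0;
      int cpu = 0;
      try
      {
         cpu = std::stoi(token, &used);
      }
      catch (const std::exception&)
      {
         return false;   // not a number, or too large for an int
      }

      const bool wholeToken = (used == token.size());
      const bool inRange = (cpu >= 0 && cpu < numCpus);
      if (!wholeToken || !inRange || !seen.insert(cpu).second)
      {
         return false;   // trailing junk, bad index, or duplicate CPU
      }
      order.push_back(cpu);
   }

   out.swap(order);
   return true;
}
```

An empty value parses successfully to an empty list, which a caller could treat the same as an unset variable. Rejecting duplicates is a design choice made for this sketch; nothing in the thread says that two draw threads may not share a CPU.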
From: Todd J. F. <to...@in...> - 2007-09-15 20:11:07
Patrick Hartling wrote:

[snip]

> That shouldn't be necessary. I think that the above would work fine in
> vpr::ThreadPosix::setRunOn(). I haven't looked yet, but does OpenThreads
> have a Windows equivalent of this implementation? Fortunately, we are using
> native Windows threads again and can make this happen easily if the Windows
> APIs are there to do it. I doubt that NSPR is going to offer cross-platform
> CPU affinity, but I haven't looked recently.

It looks like OpenThreads uses SetThreadAffinityMask on Windows.

[snip]

> That being said, I don't have a better idea to offer at the moment. This is
> all very interesting, though. Thanks for pursuing this so diligently.
> Perhaps others on this list have suggestions that will refine these ideas
> further?
>
> -Patrick

It's definitely easy to set an environment variable in a script on
Linux, but that doesn't translate well to Windows. If we go the
configuration route instead, I'd probably want the ability to have a
jconf file that only specifies affinity. Then I could choose to include
either a different set of affinities or none at all for the same display
configuration depending on the application. However, that would mean
separating affinity from GlPipe, which doesn't make sense. So, I'm open
to suggestions there.

-Todd
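As a rough illustration of the Windows side, the sketch below pins the calling thread with the Win32 SetThreadAffinityMask() function. It is an assumption about how such a call could be wrapped, not the OpenThreads or VPR implementation; the names affinityMaskForCpu and pinCallingThreadWin32 are hypothetical. On non-Windows builds the wrapper simply reports failure.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: build a one-bit affinity mask for a zero-based CPU
// index. Returns 0 (an invalid mask) when the index does not fit in 64 bits.
std::uint64_t affinityMaskForCpu(int cpu)
{
   if (cpu < 0 || cpu >= 64)
   {
      return 0;
   }
   return std::uint64_t(1) << cpu;
}

// Hypothetical wrapper: pin the calling thread to one CPU on Windows.
// Always returns false on other platforms.
bool pinCallingThreadWin32(int cpu)
{
#ifdef _WIN32
   // DWORD_PTR is 32 bits wide on 32-bit Windows, so high indices truncate
   // to 0 there and are rejected by the check below.
   const DWORD_PTR mask = static_cast<DWORD_PTR>(affinityMaskForCpu(cpu));
   if (mask == 0)
   {
      return false;
   }
   // SetThreadAffinityMask returns the previous mask, or 0 on failure.
   return SetThreadAffinityMask(GetCurrentThread(), mask) != 0;
#else
   (void) cpu;
   return false;
#endif
}
```

The mask-based interface is the main difference from the Linux cpu_set_t approach: Windows takes a bit mask directly, so a single-CPU request is just one shifted bit.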
From: Patrick H. <pa...@in...> - 2007-09-16 19:17:03
Todd J. Furlong wrote:

[snip]

> It looks like they use SetThreadAffinityMask in Windows.

I have been looking into this, and we have a problem. Because
vpr::BaseThread and its subclasses make (unnecessary) use of virtual
functions, adding vpr::ThreadWin32::[gs]etRunOn() on the 2.2 branch is not
an option. Doing so would break binary compatibility. The Linux-specific
code for vpr::ThreadPosix can go into the 2.2 branch, though.

So, here is what I am thinking: add the Linux-specific support on the 2.2
branch and make whatever extensions are appropriate to vrj::GlPipe also be
specific to Linux. For VR Juggler 3.0, vpr::BaseThread and its subclasses
will be cleaned up further (a task that I started in February 1998 and which
is still going on today... ugh). As part of that, vpr::ThreadWin32 will get
the CPU affinity functions, and vrj::opengl::Pipe can have cross-platform
CPU affinity capabilities.

Thoughts?

[snip]

> It's definitely easy to set an environment variable in a script in
> Linux, but that doesn't translate well to Windows.

Maestro will deal with that sort of thing in the cluster launching case--if
people are using Maestro. Otherwise, batch files for running things are
certainly not uncommon.

> If we go the
> configuration route instead, I'd probably want the ability to have a
> jconf file that only specifies affinity. Then I could choose to include
> either a different set of affinities or none at all for the same display
> configuration depending on the application. However, that would mean
> separating affinity from GlPipe, which doesn't make sense. So, I'm open
> to suggestions there.

Any configuration-based approach based on that sort of flexibility will
probably prove to be too complicated for most people to use. The current
decoupling of pipes and windows is already tricky. Trying to tie in CPU
affinity to the render threads without packaging that up with the pipe
configuration would get difficult quickly. An even bigger issue is that
being able to swap in different display_system config elements without
changing the display_window elements means that the display_window
configurations all need to be set up in advance to allow for that.

An environment variable still seems to be the cleanest approach so far.

 -Patrick
From: Todd J. F. <to...@in...> - 2007-09-16 21:24:26
Patrick Hartling wrote:

[snip]

> So, here is what I am thinking: add the Linux-specific support on the 2.2
> branch and make whatever extensions are appropriate to vrj::GlPipe also be
> specific to Linux. For VR Juggler 3.0, vpr::BaseThread and its subclasses
> will be cleaned up further (a task that I started in February 1998 and which
> is still going on today... ugh). As part of that, vpr::ThreadWin32 will get
> the CPU affinity functions, and vrj::opengl::Pipe can have cross-platform
> CPU affinity capabilities.
>
> Thoughts?

That will meet my needs in the near term, so it sounds OK.

[snip]

> An environment variable still seems to be the cleanest approach so far.
>
> -Patrick

OK, what's a good name for the environment variable? I can add the
code tomorrow and submit a patch.

-Todd
From: Patrick H. <pa...@in...> - 2007-09-16 23:20:23
Todd J. Furlong wrote:

[snip]

> OK, what's a good name for the environment variable? I can add in the
> code tomorrow and submit a patch.

I didn't mean to suggest that the environment variable approach is the
solution. No one else has chimed in with alternatives yet, and I haven't
really given this a lot of thought so far. I have been trying to avoid
working on the weekend for a change.

One other thing that concerns me is the notion of having default CPU
affinity assignment for render threads. I don't see the value in doing that,
if for no other reason than that a fundamental design philosophy of VR
Juggler is to avoid making assumptions such as that. The render threads,
assuming more than one is configured, are probably the threads doing the
most work, but the OS kernel thread scheduler may still be able to do a
better job of arranging things than we would. As far as I understand this,
your testing is fairly qualitative and is for a single application and
single configuration. Broad conclusions about how things should function
should not be drawn from that if that is indeed the case.

 -Patrick
From: Todd J. F. <to...@in...> - 2007-09-16 23:58:54
Patrick Hartling wrote: > Todd J. Furlong wrote: >> Patrick Hartling wrote: >>> Todd J. Furlong wrote: >>>> Patrick Hartling wrote: >>>>> Todd J. Furlong wrote: > > [snip] > >>>>>> The other side of the equation is how to easily assign this affinity >>>>>> through VR Juggler. Would adding a configuration element option make >>>>>> sense? It could default to "unassigned" in vrjconfig and allow the user >>>>>> to specify which window (or viewport?) is rendered by which CPU. >>>>>> >>>>>> Another thought I had is that we could instead try to make the draw >>>>>> thread initialization semi-intelligently select a CPU. We could have a >>>>>> default behavior (0,1,2,3, unassigned) and also allow an environment >>>>>> variable to define the order in which threads are assigned affinity. >>>>>> Something like DRAW_THREAD_AFFINITY="0 2 1 3" on a dual-dual-core system >>>>>> would assign the first to draw threads to physically different CPUs (0 & >>>>>> 2) and remaining draw threads to the second cores on those CPUs (1 & 3). >>>>>> >>>>>> Thoughts? >>>>> I will have to ponder this for a while. My first reaction is that the >>>>> configuration approach would be simpler to deal with in general. However, >>>>> the need for CPU affinity would probably vary at the application level more >>>>> than at the configuration (i.e., site) level. In that case, the decision is >>>>> usually to leave that sort of thing up to the application programmer. But >>>>> then we have to consider that the application programmer does not have an >>>>> easy way to detect when multiple render threads are being used. That is >>>>> dictated by the configuration and would probably be handled most cleanly in >>>>> vrj::GlPipe (vrj::opengl::Pipe). >>>>> >>>>> This still being my initial reaction with only a minute or two of thought, >>>>> it seems as though the environment variable approach would get the best >>>>> balance. 
However, I don't like the use of an environment variable for two >>>>> reasons: >>>>> >>>>> 1. We have been working really hard to reduce or eliminate the need for >>>>> environment variables in VR Juggler application execution. Strictly >>>>> speaking, this one would be wholly optional, but it is still another >>>>> thing to put in a list (thinking of the documentation) that is already >>>>> too long. >>>>> 2. The syntactic nature of your proposal has pitfalls, thus meaning that >>>>> the error handling of the parsing code would need to be fairly smart >>>>> in order to offer reasonable behavior in the event of (inevitable) >>>>> operator error. It is, however, flexible in a way that does address >>>>> the situation without a lot of hassle for users. >>>>> >>>>> That being said, I don't have a better idea to offer at the moment. This is >>>>> all very interesting, though. Thanks for pursuing this so diligently. >>>>> Perhaps others on this list have suggestions that will refine these ideas >>>>> further? >>>>> >>>>> -Patrick >>>> It's definitely easy to set an environment variable in a script in >>>> Linux, but that doesn't translate well to Windows. >>> Maestro will deal with that sort of thing in the cluster launching case--if >>> people are using Maestro. Otherwise, batch files for running things are >>> certainly not uncommon. >>> >>>> If we go the >>>> configuration route instead, I'd probably want the ability to have a >>>> jconf file that only specifies affinity. Then I could choose to include >>>> either a different set of affinities or none at all for the same display >>>> configuration depending on the application. However, that would mean >>>> separating affinity from GlPipe, which doesn't make sense. So, I'm open >>>> to suggestions there. >>> Any configuration-based approach based on that sort of flexibility will >>> probably prove to be too complicated for most people to use. The current >>> decoupling of pipes and windows is already tricky. 
Trying to tie in CPU >>> affinity to the render threads without packaging that up with the pipe >>> configuration would get difficult quickly. An even bigger issue is that >>> being able to swap in different display_system config elements without >>> changing the display_window elements means that the display_window >>> configurations all need to be set up in advance to allow for that. >>> >>> An environment variable still seems to be the cleanest approach so far. >>> >>> -Patrick >>> >> OK, what's a good name for the environment variable? I can add in the >> code tomorrow and submit a patch. > > I didn't mean to suggest that the environment variable approach is the > solution. No one else has chimed in with alternatives yet, and I haven't > really given this a lot of thought so far. I have been trying to avoid > working on the weekend for a change. Hmm, I should try that sometime... For now, an environment variable makes enough sense to me to give it a try. I'm on the hook for getting other applications up to speed, so I either have to do it for every application or make alterations to VR Juggler. > One other thing that concerns me is the notion of having default CPU > affinity assignment for render threads. I don't see the value in doing that, > if for no other reason that a fundamental design philosophy of VR Juggler is > to avoid making assumptions such as that. The render threads, assuming more > than one is configured, are probably the threads doing the most work, but > the OS kernel thread scheduler may still be able to do a better job of > arranging things than we would. As far as I understand this, your testing is > fairly qualitative and is for a single application and single configuration. > Broad conclusions about how things should function should not be drawn from > that if that is indeed the case. > > -Patrick What I'd like to implement is no affinity assignment unless the environment variable is set. 
If my changes make it into proper VR Juggler, then we can suggest how to assign affinity for different configurations in the docs. The need for it is definitely application- and configuration-specific, and it is probably even content-specific as well. However, I think we'll see the issue more as people start configuring clusters with fewer nodes that have more horsepower per node. -Todd |
From: Doug M. <mc...@ia...> - 2007-09-17 01:17:17
|
> Patrick Hartling wrote: >> Todd J. Furlong wrote: >>> Patrick Hartling wrote: >>>> Todd J. Furlong wrote: >>>>> Patrick Hartling wrote: >>>>>> Todd J. Furlong wrote: >> >> [snip] >> >>>>>>> The other side of the equation is how to easily assign this >>>>>>> affinity >>>>>>> through VR Juggler. Would adding a configuration element option >>>>>>> make >>>>>>> sense? It could default to "unassigned" in vrjconfig and allow the >>>>>>> user >>>>>>> to specify which window (or viewport?) is rendered by which CPU. >>>>>>> >>>>>>> Another thought I had is that we could instead try to make the draw >>>>>>> thread initialization semi-intelligently select a CPU. We could >>>>>>> have a >>>>>>> default behavior (0,1,2,3, unassigned) and also allow an >>>>>>> environment >>>>>>> variable to define the order in which threads are assigned >>>>>>> affinity. >>>>>>> Something like DRAW_THREAD_AFFINITY="0 2 1 3" on a dual-dual-core >>>>>>> system >>>>>>> would assign the first to draw threads to physically different CPUs >>>>>>> (0 & >>>>>>> 2) and remaining draw threads to the second cores on those CPUs (1 >>>>>>> & 3). >>>>>>> >>>>>>> Thoughts? >>>>>> I will have to ponder this for a while. My first reaction is that >>>>>> the >>>>>> configuration approach would be simpler to deal with in general. >>>>>> However, >>>>>> the need for CPU affinity would probably vary at the application >>>>>> level more >>>>>> than at the configuration (i.e., site) level. In that case, the >>>>>> decision is >>>>>> usually to leave that sort of thing up to the application >>>>>> programmer. But >>>>>> then we have to consider that the application programmer does not >>>>>> have an >>>>>> easy way to detect when multiple render threads are being used. That >>>>>> is >>>>>> dictated by the configuration and would probably be handled most >>>>>> cleanly in >>>>>> vrj::GlPipe (vrj::opengl::Pipe). 
>>>>>> >>>>>> This still being my initial reaction with only a minute or two of >>>>>> thought, >>>>>> it seems as though the environment variable approach would get the >>>>>> best >>>>>> balance. However, I don't like the use of an environment variable >>>>>> for two >>>>>> reasons: >>>>>> >>>>>> 1. We have been working really hard to reduce or eliminate the >>>>>> need for >>>>>> environment variables in VR Juggler application execution. >>>>>> Strictly >>>>>> speaking, this one would be wholly optional, but it is still >>>>>> another >>>>>> thing to put in a list (thinking of the documentation) that is >>>>>> already >>>>>> too long. >>>>>> 2. The syntactic nature of your proposal has pitfalls, thus >>>>>> meaning that >>>>>> the error handling of the parsing code would need to be fairly >>>>>> smart >>>>>> in order to offer reasonable behavior in the event of >>>>>> (inevitable) >>>>>> operator error. It is, however, flexible in a way that does >>>>>> address >>>>>> the situation without a lot of hassle for users. >>>>>> >>>>>> That being said, I don't have a better idea to offer at the moment. >>>>>> This is >>>>>> all very interesting, though. Thanks for pursuing this so >>>>>> diligently. >>>>>> Perhaps others on this list have suggestions that will refine these >>>>>> ideas >>>>>> further? >>>>>> >>>>>> -Patrick >>>>> It's definitely easy to set an environment variable in a script in >>>>> Linux, but that doesn't translate well to Windows. >>>> Maestro will deal with that sort of thing in the cluster launching >>>> case--if >>>> people are using Maestro. Otherwise, batch files for running things >>>> are >>>> certainly not uncommon. >>>> >>>>> If we go the >>>>> configuration route instead, I'd probably want the ability to have a >>>>> jconf file that only specifies affinity. Then I could choose to >>>>> include >>>>> either a different set of affinities or none at all for the same >>>>> display >>>>> configuration depending on the application. 
However, that would mean >>>>> separating affinity from GlPipe, which doesn't make sense. So, I'm >>>>> open >>>>> to suggestions there. >>>> Any configuration-based approach based on that sort of flexibility >>>> will >>>> probably prove to be too complicated for most people to use. The >>>> current >>>> decoupling of pipes and windows is already tricky. Trying to tie in >>>> CPU >>>> affinity to the render threads without packaging that up with the pipe >>>> configuration would get difficult quickly. An even bigger issue is >>>> that >>>> being able to swap in different display_system config elements without >>>> changing the display_window elements means that the display_window >>>> configurations all need to be set up in advance to allow for that. >>>> >>>> An environment variable still seems to be the cleanest approach so >>>> far. >>>> >>>> -Patrick >>>> >>> OK, what's a good name for the environment variable? I can add in the >>> code tomorrow and submit a patch. >> >> I didn't mean to suggest that the environment variable approach is the >> solution. No one else has chimed in with alternatives yet, and I haven't >> really given this a lot of thought so far. I have been trying to avoid >> working on the weekend for a change. > > Hmm, I should try that sometime... For now, an environment variable > makes enough sense to me to give it a try. I'm on the hook for getting > other applications up to speed, so I either have to do it for every > application or make alterations to VR Juggler. > >> One other thing that concerns me is the notion of having default CPU >> affinity assignment for render threads. I don't see the value in doing >> that, >> if for no other reason that a fundamental design philosophy of VR >> Juggler is >> to avoid making assumptions such as that. 
The render threads, assuming >> more >> than one is configured, are probably the threads doing the most work, >> but >> the OS kernel thread scheduler may still be able to do a better job of >> arranging things than we would. As far as I understand this, your >> testing is >> fairly qualitative and is for a single application and single >> configuration. >> Broad conclusions about how things should function should not be drawn >> from >> that if that is indeed the case. >> >> -Patrick > > What I'd like to implement is no affinity assignment unless the > environment variable is set. If my changes make it into proper VR > Juggler, then we can suggest how to assign affinity for different > configurations in the docs. The need for it is definitely application- > and configuration-specific, and it is probably even content-specific as > well. I agree. > However, I think we'll see the issue more as people start > configuring clusters with fewer nodes that have more horsepower per node. > I am not convinced of the above statement. With more work being moved from the CPU to the GPU, the ability to overload the render threads with CPU-intensive computations may go down on large workstations for some applications. I suppose this is reiterated in your previous statement, though. In my testing and installations we have not seen this issue yet, but then again we are using GPU-based occlusion culling. Even when our frame rates go down due to CPU-intensive operations, it is typically not due to culling, which may also account for some of the differences in my testing. Doug |
From: Patrick H. <pa...@in...> - 2007-09-17 02:26:35
Attachments:
signature.asc
|
Todd J. Furlong wrote: > Patrick Hartling wrote: >> Todd J. Furlong wrote: >>> Patrick Hartling wrote: >>>> Todd J. Furlong wrote: >>>>> Patrick Hartling wrote: >>>>>> Todd J. Furlong wrote: >> [snip] >> >>>>>>> The other side of the equation is how to easily assign this affinity >>>>>>> through VR Juggler. Would adding a configuration element option make >>>>>>> sense? It could default to "unassigned" in vrjconfig and allow the user >>>>>>> to specify which window (or viewport?) is rendered by which CPU. >>>>>>> >>>>>>> Another thought I had is that we could instead try to make the draw >>>>>>> thread initialization semi-intelligently select a CPU. We could have a >>>>>>> default behavior (0,1,2,3, unassigned) and also allow an environment >>>>>>> variable to define the order in which threads are assigned affinity. >>>>>>> Something like DRAW_THREAD_AFFINITY="0 2 1 3" on a dual-dual-core system >>>>>>> would assign the first two draw threads to physically different CPUs (0 & >>>>>>> 2) and remaining draw threads to the second cores on those CPUs (1 & 3). >>>>>>> >>>>>>> Thoughts? >>>>>> I will have to ponder this for a while. My first reaction is that the >>>>>> configuration approach would be simpler to deal with in general. However, >>>>>> the need for CPU affinity would probably vary at the application level more >>>>>> than at the configuration (i.e., site) level. In that case, the decision is >>>>>> usually to leave that sort of thing up to the application programmer. But >>>>>> then we have to consider that the application programmer does not have an >>>>>> easy way to detect when multiple render threads are being used. That is >>>>>> dictated by the configuration and would probably be handled most cleanly in >>>>>> vrj::GlPipe (vrj::opengl::Pipe).
>>>>>> >>>>>> This still being my initial reaction with only a minute or two of thought, >>>>>> it seems as though the environment variable approach would get the best >>>>>> balance. However, I don't like the use of an environment variable for two >>>>>> reasons: >>>>>> >>>>>> 1. We have been working really hard to reduce or eliminate the need for >>>>>> environment variables in VR Juggler application execution. Strictly >>>>>> speaking, this one would be wholly optional, but it is still another >>>>>> thing to put in a list (thinking of the documentation) that is already >>>>>> too long. >>>>>> 2. The syntactic nature of your proposal has pitfalls, thus meaning that >>>>>> the error handling of the parsing code would need to be fairly smart >>>>>> in order to offer reasonable behavior in the event of (inevitable) >>>>>> operator error. It is, however, flexible in a way that does address >>>>>> the situation without a lot of hassle for users. >>>>>> >>>>>> That being said, I don't have a better idea to offer at the moment. This is >>>>>> all very interesting, though. Thanks for pursuing this so diligently. >>>>>> Perhaps others on this list have suggestions that will refine these ideas >>>>>> further? >>>>>> >>>>>> -Patrick >>>>> It's definitely easy to set an environment variable in a script in >>>>> Linux, but that doesn't translate well to Windows. >>>> Maestro will deal with that sort of thing in the cluster launching case--if >>>> people are using Maestro. Otherwise, batch files for running things are >>>> certainly not uncommon. >>>> >>>>> If we go the >>>>> configuration route instead, I'd probably want the ability to have a >>>>> jconf file that only specifies affinity. Then I could choose to include >>>>> either a different set of affinities or none at all for the same display >>>>> configuration depending on the application.
However, that would mean >>>>> separating affinity from GlPipe, which doesn't make sense. So, I'm open >>>>> to suggestions there. >>>> Any configuration-based approach based on that sort of flexibility will >>>> probably prove to be too complicated for most people to use. The current >>>> decoupling of pipes and windows is already tricky. Trying to tie in CPU >>>> affinity to the render threads without packaging that up with the pipe >>>> configuration would get difficult quickly. An even bigger issue is that >>>> being able to swap in different display_system config elements without >>>> changing the display_window elements means that the display_window >>>> configurations all need to be set up in advance to allow for that. >>>> >>>> An environment variable still seems to be the cleanest approach so far. >>>> >>>> -Patrick >>>> >>> OK, what's a good name for the environment variable? I can add in the >>> code tomorrow and submit a patch. >> I didn't mean to suggest that the environment variable approach is the >> solution. No one else has chimed in with alternatives yet, and I haven't >> really given this a lot of thought so far. I have been trying to avoid >> working on the weekend for a change. > > Hmm, I should try that sometime... For now, an environment variable > makes enough sense to me to give it a try. I'm on the hook for getting > other applications up to speed, so I either have to do it for every > application or make alterations to VR Juggler. > >> One other thing that concerns me is the notion of having default CPU >> affinity assignment for render threads. I don't see the value in doing that, >> if for no other reason than that a fundamental design philosophy of VR Juggler is >> to avoid making assumptions such as that.
The render threads, assuming more >> than one is configured, are probably the threads doing the most work, but >> the OS kernel thread scheduler may still be able to do a better job of >> arranging things than we would. As far as I understand this, your testing is >> fairly qualitative and is for a single application and single configuration. >> Broad conclusions about how things should function should not be drawn from >> that if that is indeed the case. >> >> -Patrick > > What I'd like to implement is no affinity assignment unless the > environment variable is set. I guess I misunderstood this aspect of your original proposal. That sounds good to me. > If my changes make it into proper VR > Juggler, then we can suggest how to assign affinity for different > configurations in the docs. The need for it is definitely application- > and configuration-specific, and it is probably even content-specific as > well. However, I think we'll see the issue more as people start > configuring clusters with fewer nodes that have more horsepower per node. Perhaps. I have to wonder why people didn't observe this with multi-pipe SGIs, but maybe the difference was that there wasn't much for comparison. If you have patches to submit, or soon will, please go ahead. If not, I'll see if I can put something together tomorrow night. The only alternatives that I have come up with are to extend vrj::GlApp or to make it easier for user-level code to determine when multi-threaded rendering is in use. The environment variable approach gives application programmers the option to use it or to write application-specific performance tuning. -Patrick -- Patrick L. Hartling Senior Software Engineer, Priority 5 http://www.priority5.com/ |
From: Doug M. <mc...@ia...> - 2007-09-17 03:05:13
|
>> [snip] >> What I'd like to implement is no affinity assignment unless the >> environment variable is set. > > I guess I misunderstood this aspect of your original proposal. That > sounds > good to me. > >> If my changes make it into proper VR >> Juggler, then we can suggest how to assign affinity for different >> configurations in the docs. The need for it is definitely >> application- >> and configuration-specific, and it is probably even content- >> specific as >> well. However, I think we'll see the issue more as people start >> configuring clusters with fewer nodes that have more horsepower >> per node. > > Perhaps. I have to wonder why people didn't observe this with multi- > pipe > SGIs, but maybe the difference was that there wasn't much for > comparison. > > If you have patches to submit, or soon will, please go ahead. If > not, I'll > see if I can put something together tomorrow night. The only > alternatives > that I have come up with are to extend vrj::GlApp or to make it > easier for > user-level code to determine when multi-threaded rendering is in use. Could this also be utilized to help the recent OSG problems with setting thread safety? > The > environment variable approach gives application programmers the > option to > use it or to write application-specific performance tuning. > Doug |
From: Patrick H. <pa...@in...> - 2007-09-17 11:54:44
Attachments:
signature.asc
|
Doug McCorkle wrote: > > [snip] > >>> What I'd like to implement is no affinity assignment unless the >>> environment variable is set. >> I guess I misunderstood this aspect of your original proposal. That >> sounds >> good to me. >> >>> If my changes make it into proper VR >>> Juggler, then we can suggest how to assign affinity for different >>> configurations in the docs. The need for it is definitely >>> application- >>> and configuration-specific, and it is probably even content- >>> specific as >>> well. However, I think we'll see the issue more as people start >>> configuring clusters with fewer nodes that have more horsepower >>> per node. >> Perhaps. I have to wonder why people didn't observe this with multi- >> pipe >> SGIs, but maybe the difference was that there wasn't much for >> comparison. >> >> If you have patches to submit, or soon will, please go ahead. If >> not, I'll >> see if I can put something together tomorrow night. The only >> alternatives >> that I have come up with are to extend vrj::GlApp or to make it >> easier for >> user-level code to determine when multi-threaded rendering is in use. > > Could this also be utilized to help the recent OSG problems with > setting thread safety? I doubt it. This would be at the OpenGL Draw Manager level and would not have knowledge of or access to OSG interfaces. -Patrick -- Patrick L. Hartling Senior Software Engineer, Priority 5 http://www.priority5.com/ |
From: Todd J. F. <to...@in...> - 2007-09-17 12:13:23
|
>> [snip] >> >>>> What I'd like to implement is no affinity assignment unless the >>>> environment variable is set. >>> I guess I misunderstood this aspect of your original proposal. That >>> sounds >>> good to me. >>> >>>> If my changes make it into proper VR >>>> Juggler, then we can suggest how to assign affinity for different >>>> configurations in the docs. The need for it is definitely >>>> application- >>>> and configuration-specific, and it is probably even content- >>>> specific as >>>> well. However, I think we'll see the issue more as people start >>>> configuring clusters with fewer nodes that have more horsepower >>>> per node. >>> Perhaps. I have to wonder why people didn't observe this with multi- >>> pipe >>> SGIs, but maybe the difference was that there wasn't much for >>> comparison. I know Performer was very thread- & CPU-aware, but I never used OSG on SGI myself. I was a bit surprised to run into this and have it be such a big problem. I'll try to gather some data to illustrate. I'm thinking that some of the scenes we are rendering could be reorganized to be more cull-friendly, but I can't count on people to be aware of that for new content. >>> If you have patches to submit, or soon will, please go ahead. If >>> not, I'll >>> see if I can put something together tomorrow night. The only >>> alternatives >>> that I have come up with are to extend vrj::GlApp or to make it >>> easier for >>> user-level code to determine when multi-threaded rendering is in use. >> Could this also be utilized to help the recent OSG problems with >> setting thread safety? > > I doubt it. This would be at the OpenGL Draw Manager level and would not > have knowledge of or access to OSG interfaces. > > -Patrick I'll be working on this today, and I'll send along my results. Do you know offhand where in the code the draw threads are created? -Todd |
From: Todd J. F. <to...@in...> - 2007-09-17 14:46:20
|
>> [snip] >> >>>> What I'd like to implement is no affinity assignment unless the >>>> environment variable is set. >>> I guess I misunderstood this aspect of your original proposal. That >>> sounds >>> good to me. >>> >>>> If my changes make it into proper VR >>>> Juggler, then we can suggest how to assign affinity for different >>>> configurations in the doc's. The need for it is definitely >>>> application- >>>> and configuration-specific, and it is probably even content- >>>> specific as >>>> well. However, I think we'll see the issue more as people start >>>> configuring clusters with fewer nodes that have more horsepower >>>> per node. >>> Perhaps. I have to wonder why people didn't observe this with multi- >>> pipe >>> SGIs, but maybe the difference was that there wasn't much for >>> comparison. >>> >>> If you have patches to submit, or soon will, please go ahead. If >>> not, I'll >>> see if I can put something together tomorrow night. The only >>> alternatives >>> that I have come up with are to extend vrj::GlApp or to make it >>> easier for >>> user-level code to determine when multi-threaded rendering is in use. >> Could this also be utilized to help the recent OSG problems with >> setting tread safety? > > I doubt it. This would be at the OpenGL Draw Manager level and would not > have knowledge of or access to OSG interfaces. > > -Patrick Patrick, I want to make sure not to break anything, so I need to ask: where can I add functionality? It looks like GlDrawManager is a good place to check the environment variable and create a std::vector of Affinities to assign to each GlPipe on creation or start. The actual "SetRunOn" call looks like it would fit well near the beginning of GlPipe::controlLoop. Can I override GlPipe::start to accept an integer? Or is there a better place to pass this information to GlPipe? -Todd |
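The per-pipe assignment described in this message can be sketched as a small lookup. This is only an illustration under stated assumptions: `affinityForPipe` and `kNoAffinity` are hypothetical names, not VR Juggler APIs, and the rule that pipes beyond the end of the list stay unassigned follows the "(0,1,2,3, unassigned)" idea from earlier in the thread.

```cpp
#include <cstddef>
#include <vector>

// Sentinel meaning "leave this thread to the OS scheduler" (assumed here).
const int kNoAffinity = -1;

// Hypothetical helper, not a VR Juggler API: picks the CPU for the pipe
// created at position pipeIndex. An empty list (variable unset) or an
// index past the end of the list yields kNoAffinity.
int affinityForPipe(const std::vector<int>& affinities,
                    const std::size_t pipeIndex)
{
   if (pipeIndex < affinities.size())
   {
      return affinities[pipeIndex];
   }
   return kNoAffinity;
}
```

With the list parsed from "0 2 1 3", the second pipe would get CPU 2 and a fifth pipe would get no affinity at all, which keeps the feature strictly opt-in.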
From: Patrick H. <pa...@in...> - 2007-09-17 15:05:38
Attachments:
signature.asc
|
Todd J. Furlong wrote: >>> [snip] >>> >>>>> What I'd like to implement is no affinity assignment unless the >>>>> environment variable is set. >>>> I guess I misunderstood this aspect of your original proposal. That >>>> sounds >>>> good to me. >>>> >>>>> If my changes make it into proper VR >>>>> Juggler, then we can suggest how to assign affinity for different >>>>> configurations in the docs. The need for it is definitely >>>>> application- >>>>> and configuration-specific, and it is probably even content- >>>>> specific as >>>>> well. However, I think we'll see the issue more as people start >>>>> configuring clusters with fewer nodes that have more horsepower >>>>> per node. >>>> Perhaps. I have to wonder why people didn't observe this with multi- >>>> pipe >>>> SGIs, but maybe the difference was that there wasn't much for >>>> comparison. >>>> >>>> If you have patches to submit, or soon will, please go ahead. If >>>> not, I'll >>>> see if I can put something together tomorrow night. The only >>>> alternatives >>>> that I have come up with are to extend vrj::GlApp or to make it >>>> easier for >>>> user-level code to determine when multi-threaded rendering is in use. >>> Could this also be utilized to help the recent OSG problems with >>> setting thread safety? >> I doubt it. This would be at the OpenGL Draw Manager level and would not >> have knowledge of or access to OSG interfaces. >> >> -Patrick > > Patrick, > > I want to make sure not to break anything, so I need to ask: where can I > add functionality? It looks like GlDrawManager is a good place to check > the environment variable and create a std::vector of Affinities to > assign to each GlPipe on creation or start. That is how I imagined that it would be done. It should be quite easy to extract the integers from the environment variable (VJ_DRAW_THREAD_AFFINITY should be the name, btw) using boost::split() and std::transform().
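A minimal sketch of that extraction step follows. To stay dependency-free it swaps boost::split() and std::transform() for std::istringstream and std::strtol, so it is not the approach named above, only an equivalent one. The function name `parseAffinityList` is hypothetical, and rejecting the whole value on the first bad token is an assumed policy for handling operator error.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses a whitespace-separated CPU list such as
// "0 2 1 3". Returns false (and clears 'cpus') if any token is not a
// non-negative integer, so the caller can fall back to no affinity.
bool parseAffinityList(const std::string& value, std::vector<int>& cpus)
{
   cpus.clear();
   std::istringstream stream(value);
   std::string token;

   while (stream >> token)
   {
      char* end = nullptr;
      errno = 0;
      const long cpu = std::strtol(token.c_str(), &end, 10);

      // Reject trailing junk ("2x"), negatives, and out-of-range values.
      if (*end != '\0' || errno == ERANGE || cpu < 0 || cpu > INT_MAX)
      {
         cpus.clear();
         return false;
      }
      cpus.push_back(static_cast<int>(cpu));
   }
   return true;
}
```

A caller would pass the result of std::getenv("VJ_DRAW_THREAD_AFFINITY") only after checking it for a null pointer; an unset or empty variable then yields an empty list, which means no affinity is assigned.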
> The actual "SetRunOn" call > looks like it would fit well near the beginning of GlPipe::controlLoop. > Can I override GlPipe::start to accept an integer? Or is there a > better place to pass this information to GlPipe? Is "override" the word that you want here? I think you mean that you want to change the signature of vrj::GlPipe::start() so that it expects a const int. Then, that would be passed through to vrj::GlPipe::controlLoop() so that the first thing that the pipe's thread does is call mActiveThread->setRunOn() with that integer. However, there will have to be some care taken to ensure that mActiveThread has a valid value when vrj::GlPipe::controlLoop() is running. There is a race condition there, but it can be fixed with relative ease. vrj::Kernel has to do the same thing. -Patrick -- Patrick L. Hartling Senior Software Engineer, Priority 5 http://www.priority5.com/ |
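The race mentioned above (the new thread running before its creator has stored the thread handle) can be illustrated with a small handoff. This is a sketch under assumptions: `PipeSketch` is a hypothetical class built on C++11 std::thread rather than vpr::Thread, and it is not how vrj::GlPipe or vrj::Kernel actually solve the problem. It only shows one common way to make the loop wait until the handle is valid.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a pipe that owns its render thread. It uses
// C++11 standard threads, not vpr::Thread, purely for illustration.
class PipeSketch
{
public:
   PipeSketch() : mThreadReady(false), mCpuAffinity(-1), mApplied(-1) {}

   // Spawns the loop thread and hands it the requested CPU (-1 = none).
   void start(const int cpuAffinity)
   {
      mCpuAffinity = cpuAffinity;
      std::thread worker(&PipeSketch::controlLoop, this);
      {
         std::lock_guard<std::mutex> guard(mMutex);
         mThread = std::move(worker);   // The handle is valid from here on.
         mThreadReady = true;
      }
      mReadyCond.notify_one();
   }

   void join() { mThread.join(); }

   // Only meaningful after join() has returned.
   int appliedAffinity() const { return mApplied; }

private:
   void controlLoop()
   {
      {
         // Wait until start() has published the thread handle.
         std::unique_lock<std::mutex> lock(mMutex);
         mReadyCond.wait(lock, [this] { return mThreadReady; });
      }
      // mThread may be used safely here. A real pipe would make its
      // setRunOn()-style call now; the sketch only records the value.
      mApplied = mCpuAffinity;
   }

   std::mutex mMutex;
   std::condition_variable mReadyCond;
   bool mThreadReady;
   std::thread mThread;
   int mCpuAffinity;
   int mApplied;
};
```

The caller is expected to call join() before the object is destroyed; the affinity value is written before the thread is created, so the loop can read it without extra locking.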
From: Todd J. F. <to...@in...> - 2007-09-18 20:29:17
Attachments:
cpu_affinity.patch
|
Patrick Hartling wrote: > Todd J. Furlong wrote: >>>> [snip] >>>> >>>>>> What I'd like to implement is no affinity assignment unless the >>>>>> environment variable is set. >>>>> I guess I misunderstood this aspect of your original proposal. That >>>>> sounds >>>>> good to me. >>>>> >>>>>> If my changes make it into proper VR >>>>>> Juggler, then we can suggest how to assign affinity for different >>>>>> configurations in the doc's. The need for it is definitely >>>>>> application- >>>>>> and configuration-specific, and it is probably even content- >>>>>> specific as >>>>>> well. However, I think we'll see the issue more as people start >>>>>> configuring clusters with fewer nodes that have more horsepower >>>>>> per node. >>>>> Perhaps. I have to wonder why people didn't observe this with multi- >>>>> pipe >>>>> SGIs, but maybe the difference was that there wasn't much for >>>>> comparison. >>>>> >>>>> If you have patches to submit, or soon will, please go ahead. If >>>>> not, I'll >>>>> see if I can put something together tomorrow night. The only >>>>> alternatives >>>>> that I have come up with are to extend vrj::GlApp or to make it >>>>> easier for >>>>> user-level code to determine when multi-threaded rendering is in use. >>>> Could this also be utilized to help the recent OSG problems with >>>> setting tread safety? >>> I doubt it. This would be at the OpenGL Draw Manager level and would not >>> have knowledge of or access to OSG interfaces. >>> >>> -Patrick >> Patrick, >> >> I want to make sure not to break anything, so I need to ask: where can I >> add functionality? It looks like GlDrawManager is a good place to check >> the environment variable and create a std::vector of Affinities to >> assign to each GlPipe on creation or start. > > That is how I imagined that it would be done. It should be quite easy to > extract the integers from the environment variable (VJ_DRAW_THREAD_AFFINITY > should be the name, btw) using boost::split() and std::transform(). 
> >> The actual "SetRunOn" call >> looks like it would fit well near the beginning of GlPipe::controlLoop. >> Can I override GlPipe::start to accept an integer? Or is there a >> better place to pass this information to GlPipe? > > Is "override" the word that you want here? I think you mean that you want to > change the signature of vrj::GlPipe::start() so that it expects a const int. > Then, that would be passed through to vrj::GlPipe::controlLoop() so that the > first thing that the pipe's thread does is call mActiveThread->setRunOn() > with that integer. However, there will have to be some care taken to ensure > that mActiveThread has a valid value when vrj::GlPipe::controlLoop() is > running. There is a race condition there, but it can be fixed with relative > ease. vrj::Kernel has to do the same thing. > > -Patrick > Here is a patch that adds CPU affinity via an environment variable to VR Juggler 2.2 from SVN. This implementation does not assign affinity unless the environment variable is specified. It doesn't add Allen's suggestions, but I think it is a good starting point. It does not break binary compatibility. I'm running this code now and testing my software, and it seems to be working OK. -Todd |
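For reference, the Linux call underneath this kind of change can be sketched as below. It is not taken from the attached patch, and `pinCurrentThreadToCpu` is a hypothetical name. Unlike the snippet that opened the thread, it passes 0 to sched_setaffinity() to mean "the calling thread" instead of looking up the thread ID, and it reports failure through its return value.

```cpp
#include <sched.h>   // sched_setaffinity, cpu_set_t (Linux with glibc)

// Pins the calling thread to a single CPU and returns true on success.
// Linux-only sketch; glibc exposes these names under _GNU_SOURCE, which
// g++ defines by default.
bool pinCurrentThreadToCpu(const int cpu)
{
   if (cpu < 0 || cpu >= CPU_SETSIZE)
   {
      return false;
   }

   cpu_set_t mask;
   CPU_ZERO(&mask);
   CPU_SET(cpu, &mask);

   // A pid of 0 means "the calling thread".
   return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

The call can still fail for a CPU number that is in range, for example when the process is confined to a subset of CPUs, so a caller should treat a false return as "affinity not applied" rather than as a fatal error.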
From: Patrick H. <pa...@in...> - 2007-09-19 11:39:51
Attachments:
signature.asc
|
Todd J. Furlong wrote: > Patrick Hartling wrote: >> Todd J. Furlong wrote: >>>>> [snip] >>>>> >>>>>>> What I'd like to implement is no affinity assignment unless the >>>>>>> environment variable is set. >>>>>> I guess I misunderstood this aspect of your original proposal. >>>>>> That sounds >>>>>> good to me. >>>>>> >>>>>>> If my changes make it into proper VR >>>>>>> Juggler, then we can suggest how to assign affinity for different >>>>>>> configurations in the docs. The need for it is definitely >>>>>>> application- >>>>>>> and configuration-specific, and it is probably even content- >>>>>>> specific as >>>>>>> well. However, I think we'll see the issue more as people start >>>>>>> configuring clusters with fewer nodes that have more horsepower >>>>>>> per node. >>>>>> Perhaps. I have to wonder why people didn't observe this with >>>>>> multi-pipe >>>>>> SGIs, but maybe the difference was that there wasn't much for >>>>>> comparison. >>>>>> >>>>>> If you have patches to submit, or soon will, please go ahead. If >>>>>> not, I'll >>>>>> see if I can put something together tomorrow night. The only >>>>>> alternatives >>>>>> that I have come up with are to extend vrj::GlApp or to make it >>>>>> easier for >>>>>> user-level code to determine when multi-threaded rendering is in use. >>>>> Could this also be utilized to help the recent OSG problems with >>>>> setting thread safety? >>>> I doubt it. This would be at the OpenGL Draw Manager level and would >>>> not >>>> have knowledge of or access to OSG interfaces. >>>> >>>> -Patrick >>> Patrick, >>> >>> I want to make sure not to break anything, so I need to ask: where >>> can I add functionality? It looks like GlDrawManager is a good place >>> to check the environment variable and create a std::vector of >>> Affinities to assign to each GlPipe on creation or start. >> >> That is how I imagined that it would be done.
It should be quite easy = to >> extract the integers from the environment variable >> (VJ_DRAW_THREAD_AFFINITY >> should be the name, btw) using boost::split() and std::transform(). >> >>> The actual "SetRunOn" call looks like it would fit well near the >>> beginning of GlPipe::controlLoop. Can I override GlPipe::start to >>> accept an integer? Or is there a better place to pass this >>> information to GlPipe? >> >> Is "override" the word that you want here? I think you mean that you >> want to >> change the signature of vrj::GlPipe::start() so that it expects a >> const int. >> Then, that would be passed through to vrj::GlPipe::controlLoop() so >> that the >> first thing that the pipe's thread does is call mActiveThread->setRunO= n() >> with that integer. However, there will have to be some care taken to >> ensure >> that mActiveThread has a valid value when vrj::GlPipe::controlLoop() i= s >> running. There is a race condition there, but it can be fixed with >> relative >> ease. vrj::Kernel has to do the same thing. >> >> -Patrick >> >=20 > Here is a patch that adds CPU affinity via an environment variable to V= R > Juggler 2.2 from SVN. This implementation does not assign affinity if > the environment variable is specified. It doesn't add Allen's > suggestions, but I think it is a good starting point. It does not brea= k > binary compatibility. I'm running this code now and testing my > software, and it seems to be working OK. Could you please resubmit this as output from running 'svn diff' against = the source tree? Either that, or please use the diff command to make unified = or context diffs. This patch does not contain the information about what fil= es are to be modified. Aside from that, context or unified diffs are much easier to read--at least for me. -Patrick --=20 Patrick L. Hartling Senior Software Engineer, Priority 5 http://www.priority5.com/ |
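A minimal sketch of the parsing step discussed above, assuming the variable holds a whitespace-separated list of CPU indices such as "0 2 1 3". It uses std::istringstream from the standard library in place of the boost::split() and std::transform() pairing that Patrick mentions. The names parseAffinityList and readDrawThreadAffinity are hypothetical helpers, not VR Juggler functions.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of VR Juggler): converts a whitespace-
// separated list such as "0 2 1 3" into CPU indices. Parsing stops at the
// first token that is not an integer, so "0 x 1" yields {0}.
std::vector<int> parseAffinityList(const std::string& value)
{
   std::vector<int> cpus;
   std::istringstream stream(value);
   int cpu;
   while (stream >> cpu)
   {
      cpus.push_back(cpu);
   }
   return cpus;
}

// Hypothetical helper: reads VJ_DRAW_THREAD_AFFINITY from the environment.
// An unset variable yields an empty vector, meaning "assign no affinity".
std::vector<int> readDrawThreadAffinity()
{
   const char* value = std::getenv("VJ_DRAW_THREAD_AFFINITY");
   return value != nullptr ? parseAffinityList(value) : std::vector<int>();
}
```

Returning an empty vector for an unset variable matches the behavior agreed on in the thread: no affinity assignment unless the environment variable is set.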
From: Todd J. F. <to...@in...> - 2007-09-19 12:06:37
Attachments:
cpu_affinity.svn.patch
|
[snip]
>> Here is a patch that adds CPU affinity via an environment variable to VR
>> Juggler 2.2 from SVN. This implementation does not assign affinity if
>> the environment variable is not specified. It doesn't add Allen's
>> suggestions, but I think it is a good starting point. It does not break
>> binary compatibility. I'm running this code now and testing my
>> software, and it seems to be working OK.
>
> Could you please resubmit this as output from running 'svn diff' against the
> source tree? Either that, or please use the diff command to make unified or
> context diffs. This patch does not contain the information about what files
> are to be modified. Aside from that, context or unified diffs are much
> easier to read--at least for me.
>
> -Patrick

Sorry about that. Here is an svn diff. I removed two sections for files
that I modified that are unrelated, so let me know if it doesn't work.

-Todd
|
From: Patrick H. <pa...@in...> - 2007-09-19 19:18:54
Attachments:
signature.asc
|
Todd J. Furlong wrote:
> [snip]
>
> Here is a patch that adds CPU affinity via an environment variable to VR
> Juggler 2.2 from SVN. This implementation does not assign affinity if
> the environment variable is not specified. It doesn't add Allen's
> suggestions, but I think it is a good starting point. It does not break
> binary compatibility. I'm running this code now and testing my
> software, and it seems to be working OK.

I just checked in changes to the trunk (see r20824) that add draw thread CPU
affinity capabilities in a generalized way that can be used quite easily by
any (multi-threaded) Draw Manager implementation. I implemented it using the
Strategy Pattern as Allen and I were thinking. User-level code can change
the CPU affinity determination mechanism by using the new method
vrj::opengl::DrawManager::setCpuAffinityStrategy(). The best time to call
this new method is before vrj::Kernel::start() is called.

I will port the changes back to the 2.2 branch tonight. The implementation
will be different in the following ways:

   * It will only be available when VPR is compiled to use the POSIX
     subsystem
   * It will only be available to the OpenGL Draw Manager so that libvrj
     does not change in a way that would break binary compatibility

I had already put the CPU affinity calls for Linux into the 2.2
implementation of vpr::ThreadPosix::setRunOn(). Have you tested those
changes, Todd? The way that I implemented that is somewhat different than
the patch that you posted.

-Patrick

--
Patrick L. Hartling
Senior Software Engineer, Priority 5
http://www.priority5.com/
|
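To illustrate the Strategy Pattern idea, here is a rough sketch of what an affinity strategy could look like: a callable that maps a draw thread index to a CPU index, with -1 meaning "leave unassigned". The CpuAffinityStrategy typedef and makeListStrategy are illustrative assumptions only. The actual type accepted by setCpuAffinityStrategy() is not shown in this thread and should be checked against the VR Juggler sources.

```cpp
#include <functional>
#include <vector>

// Assumed shape of a strategy, for illustration only: given the index of a
// draw thread, return the CPU to pin it to, or -1 for "no affinity".
typedef std::function<int(unsigned int)> CpuAffinityStrategy;

// Hypothetical factory: the i-th draw thread gets the i-th entry of the
// list; threads beyond the end of the list are left unassigned.
CpuAffinityStrategy makeListStrategy(const std::vector<int>& cpus)
{
   return [cpus](unsigned int pipeIndex) -> int
   {
      return pipeIndex < cpus.size() ? cpus[pipeIndex] : -1;
   };
}
```

With the ordering "0 2 1 3" from the original proposal, the first two draw threads would be sent to CPUs 0 and 2, the next two to CPUs 1 and 3, and any further threads would stay unpinned.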
From: Todd J. F. <to...@in...> - 2007-09-19 20:18:24
|
[snip]
> I had already put the CPU affinity calls for Linux into the 2.2
> implementation of vpr::ThreadPosix::setRunOn(). Have you tested those
> changes, Todd? The way that I implemented that is somewhat different than
> the patch that you posted.
>
> -Patrick

Patrick,

I just looked at ThreadPosix.cpp, and your code looks more like the
OpenThreads implementation than the code I sent you. The OpenThreads
code didn't work for me (and it may not work for them if an OpenThreads
thread is really just a wrapped pthread). My app actually crashed
the second time I called that code.

sched_setaffinity with the first argument set to zero sets affinity for
the calling process, but you need to pass in the result of
"syscall(__NR_gettid)" as the first argument to set the affinity for the
calling pthread. Similarly, you need that thread ID in sched_getaffinity
to query the affinity.

I think the only other significant difference is checking the lower bound
on the CPU ID in setRunOn. I put that in because I don't know if the other
functions will exit gracefully if given a negative number.

-Todd
|
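A minimal Linux-only sketch of the approach Todd describes: obtain the kernel thread ID through syscall(2) and pass it to sched_setaffinity() and sched_getaffinity(). The function names are hypothetical, and this is not the code in vpr::ThreadPosix::setRunOn(). Note that current sched_setaffinity(2) documentation describes a pid of zero as referring to the calling thread; passing the explicit thread ID, as below, should be valid either way.

```cpp
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Returns the kernel thread ID of the calling thread. syscall(2) is used
// because, as quoted in this thread, glibc shipped no gettid() wrapper.
static pid_t currentThreadId()
{
   return static_cast<pid_t>(syscall(SYS_gettid));
}

// Hypothetical helper: pins the calling thread to a single CPU. Returns
// false if the index is out of range or the kernel rejects the request.
bool pinCallingThread(int cpu)
{
   if (cpu < 0 || cpu >= CPU_SETSIZE || cpu >= sysconf(_SC_NPROCESSORS_CONF))
   {
      return false;
   }

   cpu_set_t mask;
   CPU_ZERO(&mask);
   CPU_SET(cpu, &mask);
   return sched_setaffinity(currentThreadId(), sizeof(mask), &mask) == 0;
}

// Hypothetical helper: returns true if the calling thread's affinity mask
// currently includes the given CPU.
bool callingThreadMayRunOn(int cpu)
{
   if (cpu < 0 || cpu >= CPU_SETSIZE)
   {
      return false;
   }

   cpu_set_t mask;
   CPU_ZERO(&mask);
   if (sched_getaffinity(currentThreadId(), sizeof(mask), &mask) != 0)
   {
      return false;
   }
   return CPU_ISSET(cpu, &mask) != 0;
}
```

The explicit lower-bound check covers the negative-number case Todd raises, and the query function gives a simple way to confirm from inside a draw thread that the requested affinity took effect.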
From: Patrick H. <pa...@in...> - 2007-09-19 20:35:52
Attachments:
signature.asc
|
Todd J. Furlong wrote:
> [snip]
>
> Patrick,
>
> I just looked at ThreadPosix.cpp, and your code looks more like the
> OpenThreads implementation than the code I sent you. The OpenThreads
> code didn't work for me (and it may not work for them if an OpenThreads
> thread is really just a wrapped pthread). My app actually crashed
> the second time I called that code.

Are the OpenThreads maintainers aware of this problem?

> sched_setaffinity with the first argument set to zero sets affinity for
> the calling process, but you need to pass in the result of
> "syscall(__NR_gettid)" as the first argument to set the affinity for the
> calling pthread. Similarly, you need that thread ID in sched_getaffinity
> to query the affinity.

OK. I didn't see the note at the end of the sched_setaffinity(2) man page
regarding threads. Why does your code use syscall(2) instead of gettid(2)?
Is the value returned from syscall(__NR_gettid) different than what
gettid(2) returns? I saw one example implementation of gettid(2) that simply
returns the value from calling syscall(__NR_gettid), but I don't know how
correct it is. Did you devise the code for this that you posted, or did you
adapt it from another source? Basically, without comments in the code
explaining the significance of doing it this way, I have no easy way of
getting a complete understanding of why you are doing things the way that
you are.

> I think the only other significant difference is checking the lower bound
> on the CPU ID in setRunOn. I put that in because I don't know if the other
> functions will exit gracefully if given a negative number.

I'm not sure what you mean here. The code that I checked in verifies that
the CPU value is greater than or equal to 0.

-Patrick

--
Patrick L. Hartling
Senior Software Engineer, Priority 5
http://www.priority5.com/
|
From: Todd J. F. <to...@in...> - 2007-09-19 20:47:23
|
Patrick Hartling wrote:
> [snip]
>
> Are the OpenThreads maintainers aware of this problem?

I wasn't sure it was a problem for them, so I haven't emailed anyone
about it.

> OK. I didn't see the note at the end of the sched_setaffinity(2) man page
> regarding threads. Why does your code use syscall(2) instead of gettid(2)?
> Is the value returned from syscall(__NR_gettid) different than what
> gettid(2) returns? I saw one example implementation of gettid(2) that simply
> returns the value from calling syscall(__NR_gettid), but I don't know how
> correct it is. Did you devise the code for this that you posted, or did you
> adapt it from another source? Basically, without comments in the code
> explaining the significance of doing it this way, I have no easy way of
> getting a complete understanding of why you are doing things the way that
> you are.

See the notes in the gettid man page: "Glibc does not provide a wrapper
for this system call; call it using syscall(2)"

I started with code that looked like the OpenThreads calls, and then I
saw the gettid reference in the sched_setaffinity man page.

> I'm not sure what you mean here. The code that I checked in verifies that
> the CPU value is greater than or equal to 0.

I just looked at the trunk. It looks like you check the lower bound in
Pipe::controlLoop. The rest of what I saw looks good to me, but I
haven't compiled & run with it.

-Todd
|
From: Patrick H. <pa...@in...> - 2007-09-20 00:58:05
Attachments:
signature.asc
|
Todd J. Furlong wrote:
> [snip]
>
>> Are the OpenThreads maintainers aware of this problem?
>
> I wasn't sure it was a problem for them, so I haven't emailed anyone
> about it.

So the code that crashed was your own? I thought you meant that the affinity
assignment call in OpenThreads fails.

> See the notes in the gettid man page: "Glibc does not provide a wrapper
> for this system call; call it using syscall(2)"

Hmm, well, I guess it's not a big deal. What I see indicates that gettid(2)
has been available since the 2.4.20 kernel in early 2003. However, other
versions of the man page from 2007 include the quote that you reference.

-Patrick

--
Patrick L. Hartling
Senior Software Engineer, Priority 5
http://www.priority5.com/
|
From: Todd J. F. <to...@in...> - 2007-09-20 01:42:54
|
Patrick Hartling wrote:
> [snip]
>
> So the code that crashed was your own? I thought you meant that the affinity
> assignment call in OpenThreads fails.

I didn't use OpenThreads directly; I just found their POSIX
implementation through Google and tried it in my code. When it crashed, I
hit on the gettid call, which did the trick in VR Juggler. There may be
another way to initialize pthreads to avoid the need for the gettid
call, but I didn't dig any deeper once I found something that worked.

> Hmm, well, I guess it's not a big deal. What I see indicates that gettid(2)
> has been available since the 2.4.20 kernel in early 2003. However, other
> versions of the man page from 2007 include the quote that you reference.
>
> -Patrick

I think this code may have hit on a difference in NPTL versus
LinuxThreads. LinuxThreads were assigned a unique PID, but NPTL threads
aren't. So, process-level affinity was probably the same as
thread-level in the past. Maybe there's a way to initialize pthreads in
Linux to have unique PIDs? Like IRIX had the "system scope" attribute?
Then you could use more portable code and avoid syscall.

-Todd
|
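On the question of avoiding syscall: glibc offers the non-portable extension pthread_setaffinity_np(), which takes a pthread_t rather than a kernel thread ID. The sketch below assumes Linux with glibc; the helper names are hypothetical, and the _np suffix marks the calls as outside the POSIX standard, so this is not a cross-platform answer. Whether it suits VR Juggler's supported platforms is not something this thread settles.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: pins the calling thread to one CPU through the
// glibc extension pthread_setaffinity_np(), so no kernel thread ID is
// needed. Returns false for an out-of-range index or a failed call.
bool pinSelfViaPthread(int cpu)
{
   if (cpu < 0 || cpu >= CPU_SETSIZE)
   {
      return false;
   }

   cpu_set_t mask;
   CPU_ZERO(&mask);
   CPU_SET(cpu, &mask);
   return pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask) == 0;
}

// Hypothetical helper: returns true if the calling thread's affinity mask,
// as reported by pthread_getaffinity_np(), includes the given CPU.
bool selfMayRunOn(int cpu)
{
   if (cpu < 0 || cpu >= CPU_SETSIZE)
   {
      return false;
   }

   cpu_set_t mask;
   CPU_ZERO(&mask);
   if (pthread_getaffinity_np(pthread_self(), sizeof(mask), &mask) != 0)
   {
      return false;
   }
   return CPU_ISSET(cpu, &mask) != 0;
}
```

On older glibc versions this may need to be linked with the pthread library.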
From: Doug M. <mc...@ia...> - 2007-09-20 11:55:09
|
On Sep 20, 2007, at 6:33 AM, Patrick Hartling wrote:
> Todd J. Furlong wrote:
>> [snip]
>
> OK. Thanks for the background information.
>
>> I think this code may have hit on a difference in NPTL versus
>> LinuxThreads. LinuxThreads were assigned a unique PID, but NPTL threads
>> aren't. So, process-level affinity was probably the same as
>> thread-level in the past. Maybe there's a way to initialize pthreads in
>> Linux to have unique PIDs? Like IRIX had the "system scope" attribute?
>> Then you could use more portable code and avoid syscall.
>
> The system scope attribute is part of the POSIX threads standard. It's up to
> the OS to decide how to implement and schedule such threads. Assigning them
> unique process IDs is not necessarily part of that. That sort of thing would
> be platform-specific. The only cross-platform way to have a thread ID is to
> use the pthread_t opaque type with some pthread_*(3) function. CPU affinity
> is not part of the current standard, unfortunately.
>
> Anyway, the CPU affinity changes are on the trunk and the 2.2 branch now.
> I've only done testing on Windows (which is unusual, but that's what is
> convenient for me right now). Feedback on the Linux changes would be welcome.

What is the proper way to test this on Linux to make sure things are
working properly?

Doug
|
From: Patrick H. <pa...@in...> - 2007-09-20 12:06:52
Attachments:
signature.asc
|
Doug McCorkle wrote:
> On Sep 20, 2007, at 6:33 AM, Patrick Hartling wrote:

[snip]

>> Anyway, the CPU affinity changes are on the trunk and the 2.2
>> branch now.
>> I've only done testing on Windows (which is unusual, but that's
>> what is
>> convenient for me right now). Feedback on the Linux changes would
>> be welcome.
>
> What is the proper way to test this on Linux to make sure things are
> working properly?

Use a multi-pipe/multi-threaded configuration and set the environment
variable VJ_DRAW_THREAD_AFFINITY to a space-separated list of integer
identifiers corresponding to the processors in the machine. Either that or
modify your application to call vrj::GlDrawManager::setCpuAffinityStrategy()
(before vrj::Kernel::start() is called) and use your own CPU affinity
selection mechanism.

-Patrick

--
Patrick L. Hartling
Senior Software Engineer, Priority 5
http://www.priority5.com/
|