From: Reed, R. W <rob...@in...> - 2008-03-24 23:46:57
Hmmm. There are no non-commercial versions that I know of, only non-commercial licenses. It certainly sounds from your description that there is a problem with the VTune analyzer on your Kubuntu system. I have no knowledge of how different Kubuntu 7.10 is from Ubuntu 7.04 (kernel 2.6.20-15-generic). Do you know what version of the VTune analyzer you tried? Does the installation find a driver or successfully build one if necessary? Were you su-ed to root when you installed it?

I looked through the cavalcade of caveats listed with the release notes for VTune Analyzer 9.0 Update 7, but saw nothing that seems to apply to the case you describe. With the VTune event sampler, execution should not be noticeably slower, though there is some sampling overhead.

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Monday, March 24, 2008 5:54 AM
To: for people interested in developing TBB itself
Subject: Re: [Tbb-develop] What Do You Develop TBB On?

I'm using non-commercial versions, and the Thread Checker and Thread Profiler versions available didn't show up as being available for Ubuntu. I'm using Kubuntu 7.10, and VTune doesn't seem to play nicely with my system: during a profiling run it will basically stop responding. Using strace on the java process reveals a barrage of segfaults. I realize execution will proceed slower, but this is rather extreme, and it fails for very small inputs as well. I tried both icc and gcc. I should get a book on VTune today from my multi-core book pack, so maybe I can debug this further then.

I don't have a Windows machine. When I'm ready to see the results of Thread Profiler, I'll see what I can do. I might be able to build my own tool to visualize the data, or just go through the data given. Right now I'm leaning toward dual-booting with an Intel-supported OS for when I'm ready to do benchmarks.
From: Adrien G. <aj....@gm...> - 2008-03-24 17:28:42
I can get a processor-heavy system and go cheap on the memory (something with a slower frequency), or I can get fewer processors with very fast memory. A friend has suggested that I get a dual-processor dual-core for now, on a motherboard that supports better parts; then, when quad-core prices drop, eBay the dual-core processors and get the quad-cores. This way I can invest in the fast memory now by trading processor cores. Really this system is just for TBB development, but maxing out the number of cores might be expensive right now, so I kind of like the idea of going cheap on cores now and getting more later.

Any advice on memory access times for TBB development? In some sense, if I get a high-throughput system I can also run real simulations in good time. It might be interesting to document the system I wind up building, along with my experiences developing TBB on it. I'm basically looking for something that will really let me explore the machine. Of course, what would be really nice is an Itanium 2 :-D

AJ

On Mon, Mar 24, 2008 at 11:48 AM, Robison, Arch <arc...@in...> wrote:

> The TBB group has access to many different machines, ranging from 1 to 64 cores.
>
> The general trend for budget-minded parallel programming research is to favor many cheap slow cores over a few expensive fast ones. So I'd go for the cheapest quad core I could find, perhaps even a refurbished one. (Warning: this advice comes from someone who bought an Apple 6200 Performa off eBay two years ago and is still running it.)
>
> -Arch
>
> ------------------------------
> From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
> Sent: Friday, March 21, 2008 10:42 AM
> To: for people interested in developing TBB itself
> Subject: [Tbb-develop] What Do You Develop TBB On?
>
> I'm considering getting an Intel Core 2 quad-core PC for my TBB/YetiSim development. This would be a bit of a purchase for me, and I'm curious what others are using for their TBB development. Does anyone have suggestions? I know Intel makes TBB, but AMD has cheaper quad-core processors, although I have heard that due to cache design the Intel ones are better. Any comments?
>
> I have access to both a 128-processor Itanium 2 system and a dual-processor quad-core Xeon, but I can't really run profiling tools on these systems to analyze performance since they are a shared resource, and these systems are usually under high load anyway. The main point of a new system is analyzing performance.
>
> Thanks,
>
> AJ
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Tbb-develop mailing list
> Tbb...@li...
> https://lists.sourceforge.net/lists/listinfo/tbb-develop
From: Adrien G. <aj....@gm...> - 2008-03-24 16:41:34
Thanks, I have a lot to read about now and to learn from your detailed responses. I'm hoping to receive my multi-core programming book pack soon, and that it will have sufficient detail for me to learn more about these issues.

AJ

On Mon, Mar 24, 2008 at 12:06 PM, Robison, Arch <arc...@in...> wrote:

> To Robert's excellent summary, I'll add pipeline effects. With a Pentium 4 or Core 2 processor, all instructions before an atomic operation must complete before any instructions after the atomic operation can start. For long pipelines, that can be a major hit. And there are pipeline replays involved, so the penalty is a multiple of the pipeline length.
>
> For classic PRAM <http://pages.cs.wisc.edu/~tvrdik/2/html/Section2.html#PRAM> machines, lock-free algorithms are faster. The trouble is, there are no PRAM machines in real life.
>
> For priority queues, we've tried both skiplist- and heap-based versions. Google "priority queue concurrent" and you will find ample literature. Be warned that some of the literature reports results for grossly oversubscribed machines, or for older machines where caches were less of an issue.
>
> - Arch
>
> ------------------------------
> From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Reed, Robert W
> Sent: Friday, March 21, 2008 4:29 PM
> To: 'tbb...@li...'; tbb...@li...
> Subject: Re: [Tbb-develop] TBB Priority Queue
>
> It all has to do with cache lines, and it is probably a more appropriate topic for tbb-users; I've added it to the list. If we continue this discussion, we should probably do it over there.
>
> The hit from atomic operations is not as severe as when using a mutex, but it is not negligible. A cache line on current Intel(R) Architecture is 64 bytes. If Processing Element A does an atomic operation on 4 bytes of Cache Line 1, it doesn't need to write the 4 bytes all the way to memory. By the MESI protocol for cache coherence, all it has to have is Exclusive access to the cache line. The "atomic" part is the guarantee that PE-A either already has Line 1 in Modified state or can convert it from Exclusive or Shared state to Modified without any other intervening operations. In MESI, the process for converting from Shared to Modified involves a broadcast called Read For Ownership, which essentially forces all other PEs, if they have that cache line in Shared state, to mark it Invalid. All this involves various snoop traffic as the PEs exchange information about their cache lines. If Processing Element B wants to twiddle any of the bytes in Cache Line 1, it must contend with PE-A for ownership of the cache line. In MESI, PE-B's RFO would be aborted so that PE-A could grab the bus and write Line 1 back to memory. PE-B would try again, doing another RFO as the first step in its read-modify-write cycle. And there are other complications, like write-through versus write-back caches. So contention is still an issue.
>
> _______________________________________________________________________
> Robert Reed   Rob...@in...
> Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting
> Dept. JF1-15, 2111 NE 25th Ave   phone 503-264-9624
> Hillsboro, OR 97124   mobile 503-830-1530   fax 503-264-9227
>
> The only thing that saves us from the bureaucracy is inefficiency. An efficient bureaucracy is the greatest threat to liberty.
> --Eugene McCarthy
> ________________________________________________________________________
>
> ------------------------------
> From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
> Sent: Friday, March 21, 2008 12:42 PM
> To: for people interested in developing TBB itself
> Subject: Re: [Tbb-develop] TBB Priority Queue
>
> The part that puzzled me was how multiple atomic operations could cause performance bottlenecks.
>
> As for priority queues by their nature having internal dependencies which prohibit efficient concurrent behavior, it's quite possible. I am currently devoting some effort to the study of efficient parallel containers. I haven't searched for texts on the subject yet; rather, I am waiting for my multi-core programming books from Intel Press to understand more about what the containers should look like and how they should behave for good performance.
>
> Once I understand what I'm looking for, I will attempt to design some better-behaved containers, and perhaps understand the TBB containers better.
>
> AJ
>
> On Fri, Mar 21, 2008 at 3:22 PM, Reed, Robert W <rob...@in...> wrote:
>
> I'm not sure I agree, Adrien. Publishing bad code doesn't necessarily provide any lessons to the reader. If there were a demonstration of how to avoid the shoals of the bad code provided in such a release, there'd be a positive lesson there. If the answer is that priority queues by their very nature have too many internal dependencies to permit concurrent thread access, then you might have to accept that priority queue accesses are serial events: you put a global lock around your STL priority queue and try to minimize the use of it. The only reason I could see for wanting such failed implementations is the expectation that you could do better. Personally, I don't see the value there.
>
> _______________________________________________________________________
> Robert Reed   Rob...@in...
> Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting
> Dept. JF1-15, 2111 NE 25th Ave   phone 503-264-9624
> Hillsboro, OR 97124   mobile 503-830-1530   fax 503-264-9227
>
> Imagine the Creator as a low comedian, and at once the world becomes explicable.
> --H.L. Mencken
> ________________________________________________________________________
>
> ------------------------------
> From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
> Sent: Friday, March 21, 2008 8:36 AM
> To: for people interested in developing TBB itself
> Subject: [Tbb-develop] TBB Priority Queue
>
> It was mentioned on the forums that a TBB priority queue was implemented before, but that the performance was horrible due to many atomic operations. Could we have this code made public so that we can learn from your mistakes without making them ourselves?
>
> AJ
From: Robison, A. <arc...@in...> - 2008-03-24 16:14:15
To Robert's excellent summary, I'll add pipeline effects. With a Pentium 4 or Core 2 processor, all instructions before an atomic operation must complete before any instructions after the atomic operation can start. For long pipelines, that can be a major hit. And there are pipeline replays involved, so the penalty is a multiple of the pipeline length.

For classic PRAM <http://pages.cs.wisc.edu/~tvrdik/2/html/Section2.html#PRAM> machines, lock-free algorithms are faster. The trouble is, there are no PRAM machines in real life.

For priority queues, we've tried both skiplist- and heap-based versions. Google "priority queue concurrent" and you will find ample literature. Be warned that some of the literature reports results for grossly oversubscribed machines, or for older machines where caches were less of an issue.

- Arch

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Reed, Robert W
Sent: Friday, March 21, 2008 4:29 PM
To: 'tbb...@li...'; tbb...@li...
Subject: Re: [Tbb-develop] TBB Priority Queue

It all has to do with cache lines, and it is probably a more appropriate topic for tbb-users; I've added it to the list. If we continue this discussion, we should probably do it over there.

The hit from atomic operations is not as severe as when using a mutex, but it is not negligible. A cache line on current Intel(R) Architecture is 64 bytes. If Processing Element A does an atomic operation on 4 bytes of Cache Line 1, it doesn't need to write the 4 bytes all the way to memory. By the MESI protocol for cache coherence, all it has to have is Exclusive access to the cache line. The "atomic" part is the guarantee that PE-A either already has Line 1 in Modified state or can convert it from Exclusive or Shared state to Modified without any other intervening operations. In MESI, the process for converting from Shared to Modified involves a broadcast called Read For Ownership, which essentially forces all other PEs, if they have that cache line in Shared state, to mark it Invalid. All this involves various snoop traffic as the PEs exchange information about their cache lines. If Processing Element B wants to twiddle any of the bytes in Cache Line 1, it must contend with PE-A for ownership of the cache line. In MESI, PE-B's RFO would be aborted so that PE-A could grab the bus and write Line 1 back to memory. PE-B would try again, doing another RFO as the first step in its read-modify-write cycle. And there are other complications, like write-through versus write-back caches. So contention is still an issue.

_______________________________________________________________________
Robert Reed   Rob...@in...
Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting
Dept. JF1-15, 2111 NE 25th Ave   phone 503-264-9624
Hillsboro, OR 97124   mobile 503-830-1530   fax 503-264-9227

The only thing that saves us from the bureaucracy is inefficiency. An efficient bureaucracy is the greatest threat to liberty.
--Eugene McCarthy
________________________________________________________________________

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Friday, March 21, 2008 12:42 PM
To: for people interested in developing TBB itself
Subject: Re: [Tbb-develop] TBB Priority Queue

The part that puzzled me was how multiple atomic operations could cause performance bottlenecks.

As for priority queues by their nature having internal dependencies which prohibit efficient concurrent behavior, it's quite possible. I am currently devoting some effort to the study of efficient parallel containers. I haven't searched for texts on the subject yet; rather, I am waiting for my multi-core programming books from Intel Press to understand more about what the containers should look like and how they should behave for good performance.

Once I understand what I'm looking for, I will attempt to design some better-behaved containers, and perhaps understand the TBB containers better.

AJ

On Fri, Mar 21, 2008 at 3:22 PM, Reed, Robert W <rob...@in...> wrote:

I'm not sure I agree, Adrien. Publishing bad code doesn't necessarily provide any lessons to the reader. If there were a demonstration of how to avoid the shoals of the bad code provided in such a release, there'd be a positive lesson there. If the answer is that priority queues by their very nature have too many internal dependencies to permit concurrent thread access, then you might have to accept that priority queue accesses are serial events: you put a global lock around your STL priority queue and try to minimize the use of it. The only reason I could see for wanting such failed implementations is the expectation that you could do better. Personally, I don't see the value there.

_______________________________________________________________________
Robert Reed   Rob...@in...
Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting
Dept. JF1-15, 2111 NE 25th Ave   phone 503-264-9624
Hillsboro, OR 97124   mobile 503-830-1530   fax 503-264-9227

Imagine the Creator as a low comedian, and at once the world becomes explicable.
--H.L. Mencken
________________________________________________________________________

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Friday, March 21, 2008 8:36 AM
To: for people interested in developing TBB itself
Subject: [Tbb-develop] TBB Priority Queue

It was mentioned on the forums that a TBB priority queue was implemented before, but that the performance was horrible due to many atomic operations. Could we have this code made public so that we can learn from your mistakes without making them ourselves?

AJ
From: Robison, A. <arc...@in...> - 2008-03-24 15:53:00
The TBB group has access to many different machines, ranging from 1 to 64 cores.

The general trend for budget-minded parallel programming research is to favor many cheap slow cores over a few expensive fast ones. So I'd go for the cheapest quad core I could find, perhaps even a refurbished one. (Warning: this advice comes from someone who bought an Apple 6200 Performa off eBay two years ago and is still running it.)

-Arch

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Friday, March 21, 2008 10:42 AM
To: for people interested in developing TBB itself
Subject: [Tbb-develop] What Do You Develop TBB On?

I'm considering getting an Intel Core 2 quad-core PC for my TBB/YetiSim development. This would be a bit of a purchase for me, and I'm curious what others are using for their TBB development. Does anyone have suggestions? I know Intel makes TBB, but AMD has cheaper quad-core processors, although I have heard that due to cache design the Intel ones are better. Any comments?

I have access to both a 128-processor Itanium 2 system and a dual-processor quad-core Xeon, but I can't really run profiling tools on these systems to analyze performance since they are a shared resource, and these systems are usually under high load anyway. The main point of a new system is analyzing performance.

Thanks,

AJ
From: Robison, A. <arc...@in...> - 2008-03-24 15:30:54
As far as I know, the only way in pthreads to query the state of a lock is to use pthread_mutex_trylock. TBB has the same capability. I think this is similarly true for Windows <http://msdn2.microsoft.com/en-us/library/ms686360(VS.85).aspx>. Likewise for C++ 200x <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html>. Since none of these other libraries has any other way to query the state of a lock, and all are designed by threading experts, I conclude that adding another query function is a bad idea. I concur with Wooyoung's suggestion.

- Arch

_____________________________________________
From: Kim, Wooyoung
Sent: Monday, March 24, 2008 9:38 AM
To: Voss, Michael J; Kukanov, Alexey; INNL TBB; Huson, Chris; Poulsen, Dave; Robison, Arch
Cc: 'tbb...@li...'
Subject: RE: Making upgrade_to_writer not fail if already locked for writing?

If an application needs to know the state of a lock, it can declare a separate variable and keep track of the state itself: when the lock is acquired, it raises a flag, and before it calls release() it lowers the flag. Better yet, it writes the thread ID onto it, so that it knows which thread has acquired the lock. Given this option, is the overhead of supporting such functionality in the mutexes themselves justified? By doing so, we would charge other applications that do not need such functionality.

best regards,
-------------------------
Wooyoung Kim  +1 217-621-9968
wooyoung dot kim at intel dot com

_____________________________________________
From: Voss, Michael J
Sent: Friday, March 21, 2008 10:45 AM
To: Kukanov, Alexey; INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch
Cc: 'tbb...@li...'
Subject: RE: Making upgrade_to_writer not fail if already locked for writing?

Isn't this related to the general question of whether a recursive mutex is a good thing? We had this discussion some time ago and eventually decided to supply a recursive_mutex class. It seems that the trade-offs are the same here: pragmatism vs. idealism. We did, however, decide to create a separate class instead of making all of our mutex classes recursive. The thought was that we could appropriately document that using the recursive variant might indicate a bad design. So, following our previous decisions, maybe we provide a method to query the current state, and then document that using it probably indicates that you have a bad design.

Mike

_____________________________________________
From: Kukanov, Alexey
Sent: Friday, March 21, 2008 3:32 AM
To: INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch; Voss, Michael J
Cc: tbb...@li...
Subject: Making upgrade_to_writer not fail if already locked for writing?

There was a request at the forum to make upgrade_to_writer work (and not run into an assertion) if the call is made when a lock is already held for writing. Here is a quotation:

"If you try to call the method on a lock which is currently in the active_writer state, however, one will run into an assertion. If one additional check were introduced to the method (ok, which would mean some overhead) to see if the lock is already in the requested state and simply return without doing anything, my problem would be fixed without changing the interface."

Do you think there is any harm in such an extension?

Another suggestion discussed in the thread was to provide a method that returns the current state of the mutex. I personally find this not a good idea; in my opinion, a good design should not rely on methods requesting mutex state. Is there similar functionality for mutexes in the pthread or Win32 API?

Regards,
- Alexey
From: Kim, W. <woo...@in...> - 2008-03-24 14:50:06
If an application needs to know the state of a lock, it can declare a separate variable and keep track of the state itself: when the lock is acquired, it raises a flag, and before it calls release() it lowers the flag. Better yet, it writes the thread ID onto it, so that it knows which thread has acquired the lock. Given this option, is the overhead of supporting such functionality in the mutexes themselves justified? By doing so, we would charge other applications that do not need such functionality.

best regards,
-------------------------
Wooyoung Kim  +1 217-621-9968
wooyoung dot kim at intel dot com

_____________________________________________
From: Voss, Michael J
Sent: Friday, March 21, 2008 10:45 AM
To: Kukanov, Alexey; INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch
Cc: 'tbb...@li...'
Subject: RE: Making upgrade_to_writer not fail if already locked for writing?

Isn't this related to the general question of whether a recursive mutex is a good thing? We had this discussion some time ago and eventually decided to supply a recursive_mutex class. It seems that the trade-offs are the same here: pragmatism vs. idealism. We did, however, decide to create a separate class instead of making all of our mutex classes recursive. The thought was that we could appropriately document that using the recursive variant might indicate a bad design. So, following our previous decisions, maybe we provide a method to query the current state, and then document that using it probably indicates that you have a bad design.

Mike

_____________________________________________
From: Kukanov, Alexey
Sent: Friday, March 21, 2008 3:32 AM
To: INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch; Voss, Michael J
Cc: tbb...@li...
Subject: Making upgrade_to_writer not fail if already locked for writing?

There was a request at the forum to make upgrade_to_writer work (and not run into an assertion) if the call is made when a lock is already held for writing. Here is a quotation:

"If you try to call the method on a lock which is currently in the active_writer state, however, one will run into an assertion. If one additional check were introduced to the method (ok, which would mean some overhead) to see if the lock is already in the requested state and simply return without doing anything, my problem would be fixed without changing the interface."

Do you think there is any harm in such an extension?

Another suggestion discussed in the thread was to provide a method that returns the current state of the mutex. I personally find this not a good idea; in my opinion, a good design should not rely on methods requesting mutex state. Is there similar functionality for mutexes in the pthread or Win32 API?

Regards,
- Alexey
From: Adrien G. <aj....@gm...> - 2008-03-24 13:00:12
I'm using non-commercial versions, and the Thread Checker and Thread Profiler versions available didn't show up as being available for Ubuntu. I'm using Kubuntu 7.10, and VTune doesn't seem to play nicely with my system: during a profiling run it will basically stop responding. Using strace on the java process reveals a barrage of segfaults. I realize execution will proceed slower, but this is rather extreme, and it fails for very small inputs as well. I tried both icc and gcc. I should get a book on VTune today from my multi-core book pack, so maybe I can debug this further then.

I don't have a Windows machine. When I'm ready to see the results of Thread Profiler, I'll see what I can do. I might be able to build my own tool to visualize the data, or just go through the data given. Right now I'm leaning toward dual-booting with an Intel-supported OS for when I'm ready to do benchmarks.
From: Reed, R. W <rob...@in...> - 2008-03-24 06:54:32
I don't have an Ubuntu installation to compare notes with, but according to the latest release notes, the VTune(tm) analyzer and Intel(R) Thread Checker are both qualified on Ubuntu 7.0 (7.04 for the VTune analyzer). So is the Intel(R) Thread Profiler command-line data collector. It's true there's no Linux viewer for Intel Thread Profiler data: the product was developed on Windows, and so far that's the only place you can browse the data. But tprofile_cl is shipped in the remote data collector package and should allow you to test YetiSim on Ubuntu, though you will need a Windows machine to view the results. Move the data over, or better yet, set up a Samba server so you can see the files from a Windows machine (the browser will need the build and source trees plus the generated data files).

Did you report the problem you had with Intel Thread Checker anywhere? Is it something you can reproduce and would be willing to submit through Premier.intel.com? If you've tried it on Ubuntu 7.0 and had segmentation faults there, please do so if you can.

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Saturday, March 22, 2008 10:48 AM
To: for people interested in developing TBB itself
Subject: Re: [Tbb-develop] What Do You Develop TBB On?

I have played around with VTune, but I can't get it to work well on Ubuntu. The Thread Checker segfaults on Ubuntu, and Thread Profiler doesn't seem to be available for Linux. On my new system, I will dual-boot with one of the supported Linux versions so that the Ubuntu compatibility problem goes away.

I have used open-source profilers, but VTune looks a lot better. I qualify for the non-commercial licenses from Intel, so I intend to take advantage of that fact and use them. I've decided to go with Intel processors for my development system. I'm curious as to the number of cores used by the TBB team in their work.

AJ
From: Adrien G. <aj....@gm...> - 2008-03-22 17:47:26
I have played around with VTune, but I can't get it to work well on Ubuntu. The Thread Checker segfaults on Ubuntu, and Thread Profiler doesn't seem to be available for Linux. On my new system, I will dual-boot with one of the supported Linux versions so that the Ubuntu compatibility problem goes away.

I have used open-source profilers, but VTune looks a lot better. I qualify for the non-commercial licenses from Intel, so I intend to take advantage of that fact and use them. I've decided to go with Intel processors for my development system. I'm curious as to the number of cores used by the TBB team in their work.

AJ
From: Reed, R. W <rob...@in...> - 2008-03-21 21:35:07
It all has to do with cache lines, and it is probably a more appropriate topic for tbb-users; I've added it to the list. If we continue this discussion, we should probably do it over there.

The hit from atomic operations is not as severe as when using a mutex, but it is not negligible. A cache line on current Intel(R) Architecture is 64 bytes. If Processing Element A does an atomic operation on 4 bytes of Cache Line 1, it doesn't need to write the 4 bytes all the way to memory. By the MESI protocol for cache coherence, all it has to have is Exclusive access to the cache line. The "atomic" part is the guarantee that PE-A either already has Line 1 in Modified state or can convert it from Exclusive or Shared state to Modified without any other intervening operations. In MESI, the process for converting from Shared to Modified involves a broadcast called Read For Ownership, which essentially forces all other PEs, if they have that cache line in Shared state, to mark it Invalid. All this involves various snoop traffic as the PEs exchange information about their cache lines. If Processing Element B wants to twiddle any of the bytes in Cache Line 1, it must contend with PE-A for ownership of the cache line. In MESI, PE-B's RFO would be aborted so that PE-A could grab the bus and write Line 1 back to memory. PE-B would try again, doing another RFO as the first step in its read-modify-write cycle. And there are other complications, like write-through versus write-back caches. So contention is still an issue.

_______________________________________________________________________
Robert Reed   Rob...@in...
Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting
Dept. JF1-15, 2111 NE 25th Ave   phone 503-264-9624
Hillsboro, OR 97124   mobile 503-830-1530   fax 503-264-9227

The only thing that saves us from the bureaucracy is inefficiency. An efficient bureaucracy is the greatest threat to liberty.
--Eugene McCarthy
________________________________________________________________________

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Friday, March 21, 2008 12:42 PM
To: for people interested in developing TBB itself
Subject: Re: [Tbb-develop] TBB Priority Queue

The part that puzzled me was how multiple atomic operations could cause performance bottlenecks.

As for priority queues by their nature having internal dependencies which prohibit efficient concurrent behavior, it's quite possible. I am currently devoting some effort to the study of efficient parallel containers. I haven't searched for texts on the subject yet; rather, I am waiting for my multi-core programming books from Intel Press to understand more about what the containers should look like and how they should behave for good performance.

Once I understand what I'm looking for, I will attempt to design some better-behaved containers, and perhaps understand the TBB containers better.

AJ

On Fri, Mar 21, 2008 at 3:22 PM, Reed, Robert W <rob...@in...> wrote:

I'm not sure I agree, Adrien. Publishing bad code doesn't necessarily provide any lessons to the reader. If there were a demonstration of how to avoid the shoals of the bad code provided in such a release, there'd be a positive lesson there. If the answer is that priority queues by their very nature have too many internal dependencies to permit concurrent thread access, then you might have to accept that priority queue accesses are serial events: you put a global lock around your STL priority queue and try to minimize the use of it. The only reason I could see for wanting such failed implementations is the expectation that you could do better. Personally, I don't see the value there.

_______________________________________________________________________
Robert Reed   Rob...@in...
Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting
Dept. JF1-15, 2111 NE 25th Ave   phone 503-264-9624
Hillsboro, OR 97124   mobile 503-830-1530   fax 503-264-9227

Imagine the Creator as a low comedian, and at once the world becomes explicable.
--H.L. Mencken
________________________________________________________________________

________________________________
From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon
Sent: Friday, March 21, 2008 8:36 AM
To: for people interested in developing TBB itself
Subject: [Tbb-develop] TBB Priority Queue

It was mentioned on the forums that a TBB priority queue was implemented before, but that the performance was horrible due to many atomic operations. Could we have this code made public so that we can learn from your mistakes without making them ourselves?

AJ
From: Reed, R. W <rob...@in...> - 2008-03-21 19:59:24
|
To state my biases up front, I work for Intel. You can make of that what you will.

You suggest that you want the new machine for doing performance profiling. The obvious next question: what profiling tools do you plan to employ? Intel provides a set of tools for performance profiling and threading analysis. Their most rudimentary functions will work on any x86 architecture, but Intel processors have a rich collection of performance-tuning events that are matched with those tools: you can select events in VTune(tm) analyzer to dig into cache misses, bus saturation, or a variety of other indicators that might provide significant clues. With your VTune analyzer license, you can grab a copy of PTU from Whatif.intel.com and start playing with some of the advanced features for looking at performance via basic blocks, or experiment with data access analysis. Or you can validate your threaded code using Intel(r) Thread Checker, or dig deeply into the timeline of your threaded application and study its concurrency, or lack of it, using Intel(r) Thread Profiler. These threading analysis tools may work to some extent on non-Intel processors, but the only way to guarantee their operation is on Intel hardware.

There's some stuff you can do with oprofile or gprof, and there may be other tools in the open source or free categories, but mostly what I've seen are limited to hot-spot analysis and call graphing. It's been a while since I've looked to see what's available. But the most advanced tools I've seen, I must say, come from the company I work for.

_______________________________________________________________________ Robert Reed Rob...@in... Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting Dept. JF1-15, 2111 NE 25th Ave phone 503-264-9624 Hillsboro, OR 97124 mobile 503-830-1530 fax 503-264-9227 The vast majority of Iraqis want to live in a peaceful, free world. And we will find these people and we will bring them to Justice. 
--George W Bush ________________________________________________________________________ ________________________________ From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon Sent: Friday, March 21, 2008 8:42 AM To: for people interested in developing TBB itself Subject: [Tbb-develop] What Do You Develop TBB On? I'm considering getting an Intel Core 2 Quad-Core PC for my TBB/YetiSim development. This would be a bit of a purchase for me, and I'm curious what others are using for their TBB development. Does anyone have suggestions? I know Intel makes TBB... but AMD has cheaper Quad-core processors, although I have heard that due to cache design the Intel ones are better. Any comments? I have access to both a 128 processor Itanium-2 system, and a dual-processor quad-core Xeon... but I can't really run profiling tools on these systems to analyze performance since they are a shared resource... and these systems are usually under high load anyways. The main point of a new system is for analyzing performance. Thanks, AJ |
From: Adrien G. <aj....@gm...> - 2008-03-21 19:55:58
|
The part that puzzled me was how multiple atomic operations could cause performance bottlenecks. As for priority queues by their nature having internal dependencies which prohibit efficient concurrent behavior, it's quite possible. I am currently devoting some effort to the study of efficient parallel containers. I haven't searched for texts on the subject yet, rather I am waiting for my multi-core programming books from Intel Press to understand more about what the containers should look like and how they should behave for good performance. Once I understand what I'm looking for, I will attempt to design some better behaved containers, and perhaps understand the TBB containers better. AJ On Fri, Mar 21, 2008 at 3:22 PM, Reed, Robert W <rob...@in...> wrote: > I'm not sure I agree, Adrien. Publishing bad code doesn't necessarily > provide any lessons to the reader. If there was a demonstration of how to > avoid the shoals of bad code provided in such a release, there'd be a > positive lesson there. If the answer is that priority queues by their very > nature have too many internal dependencies to permit concurrent thread > access, then you might have to accept that priority queue accesses are > serial events. You put a global lock around your STL priority queue and try > to minimize the use of it. The only reason I could see for wanting such > failed implementations is the expectation that you could do better. > Personally I don't see the value there. > > _______________________________________________________________________ > > Robert Reed Rob...@in... > > Intel SSG/ Developer Products Division/ Performance, Analysis and > > Threading Lab/ Technical Consulting > > Dept. JF1-15, 2111 NE 25th Ave phone 503-264-9624 > > Hillsboro, OR 97124 mobile 503-830-1530 fax 503-264-9227 > > > > Imagine the Creator as a low comedian, and at once the world becomes > > explicable. > > --H.L. 
Mencken > > ________________________________________________________________________ > ------------------------------ > > *From:* tbb...@li... [mailto: > tbb...@li...] *On Behalf Of *Adrien Guillon > *Sent:* Friday, March 21, 2008 8:36 AM > *To:* for people interested in developing TBB itself > *Subject:* [Tbb-develop] TBB Priority Queue > > > > It was mentioned on the forums that a TBB priority queue was implemented > before, but that the performance was horrible due to many atomic operations. > > Could we have this code made public so that we can learn from your > mistakes without making them ourselves? > > AJ > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2008. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Tbb-develop mailing list > Tbb...@li... > https://lists.sourceforge.net/lists/listinfo/tbb-develop > > |
From: Reed, R. W <rob...@in...> - 2008-03-21 19:22:17
|
I'm not sure I agree, Adrien. Publishing bad code doesn't necessarily provide any lessons to the reader. If there were a demonstration of how to avoid the shoals of bad code provided in such a release, there'd be a positive lesson there. If the answer is that priority queues by their very nature have too many internal dependencies to permit concurrent thread access, then you might have to accept that priority queue accesses are serial events: you put a global lock around your STL priority queue and try to minimize the use of it. The only reason I could see for wanting such failed implementations is the expectation that you could do better. Personally, I don't see the value there.

_______________________________________________________________________ Robert Reed Rob...@in... Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting Dept. JF1-15, 2111 NE 25th Ave phone 503-264-9624 Hillsboro, OR 97124 mobile 503-830-1530 fax 503-264-9227 Imagine the Creator as a low comedian, and at once the world becomes explicable. --H.L. Mencken ________________________________________________________________________

________________________________ From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Adrien Guillon Sent: Friday, March 21, 2008 8:36 AM To: for people interested in developing TBB itself Subject: [Tbb-develop] TBB Priority Queue

It was mentioned on the forums that a TBB priority queue was implemented before, but that the performance was horrible due to many atomic operations. Could we have this code made public so that we can learn from your mistakes without making them ourselves?

AJ |
From: Reed, R. W <rob...@in...> - 2008-03-21 19:01:26
|
When I saw this last night, it struck me as a quasi-recursive locking scheme: quasi- because it doesn't require reference counting the nesting and a matching number of unlocks for the locks. I don't think it's good practice to provide a method which most likely would be used only in implementing bad designs.

However, could Florian's request be satisfied by simply replacing the !is_writer assertion in upgrade_to_writer with something like "if (is_writer) return false" in spin_rw_mutex? I'm not well versed enough in the code yet to know whether this opens any holes, but it seems to me that as long as you avoid a second call to internal_upgrade, things should be fine. The story initially looks a little more complicated for queuing_rw_mutex, but it might be susceptible to a similar ploy: adding an "if" at the head of queuing_rw_mutex::scoped_lock::upgrade_to_writer() to check whether state == STATE_WRITER. It would only be a slight relaxation of the current rules, making upgrade_to_writer a no-op if the write lock is already acquired, but asserting as normal if the caller is not even a reader, etc.

_______________________________________________________________________ Robert Reed Rob...@in... Intel SSG/ Developer Products Division/ Performance, Analysis and Threading Lab/ Technical Consulting Dept. JF1-15, 2111 NE 25th Ave phone 503-264-9624 Hillsboro, OR 97124 mobile 503-830-1530 fax 503-264-9227 You can fool all the people all the time if the advertising is right and the budget is big enough. --Joseph E. Levine ________________________________________________________________________

________________________________ From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Voss, Michael J Sent: Friday, March 21, 2008 8:45 AM To: Kukanov, Alexey; INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch Cc: tbb...@li... Subject: Re: [Tbb-develop] Making upgrade_to_writer not fail if already locked for writing?

Isn't this related to the general question of whether a recursive mutex is a good thing? We had this discussion some time ago and eventually decided to supply a recursive_mutex class. It seems that the tradeoffs are the same here: pragmatism vs. idealism. We did, however, decide to create a separate class instead of making all of our mutex classes recursive. The thought was that we could appropriately document that using the recursive variant might indicate a bad design. So, following our previous decisions, maybe we provide a method to query the current state, and then document that using it probably indicates that you have a bad design.

Mike

_____________________________________________ From: Kukanov, Alexey Sent: Friday, March 21, 2008 3:32 AM To: INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch; Voss, Michael J Cc: tbb...@li... Subject: Making upgrade_to_writer not fail if already locked for writing?

There was a request at the forum to make upgrade_to_writer work (and not run into an assertion) if the call is made when a lock is already held for writing. Here is a quotation: "If you try to call the method on a lock which is currently in the active_writer state however, one will run into an assertion. If one additional check would be introduced to the method (ok, which would mean some overhead) to see if the lock is already in the requested state and simply return without doing anything, my problem would be fixed without changing the interface." Do you think there is any harm in such an extension?

Another suggestion discussed in the thread was to provide a method that returns the current state of the mutex. I personally don't find this a good idea; in my opinion, a good design should not rely on methods requesting mutex state. Is there similar functionality for mutexes in pthread or the Win32 API?

Regards, - Alexey |
From: Voss, M. J <Mic...@in...> - 2008-03-21 15:46:08
|
Isn't this related to the general question of whether a recursive mutex is a good thing? We had this discussion some time ago and eventually decided to supply a recursive_mutex class. It seems that the tradeoffs are the same here: pragmatism vs. idealism. We did, however, decide to create a separate class instead of making all of our mutex classes recursive. The thought was that we could appropriately document that using the recursive variant might indicate a bad design. So, following our previous decisions, maybe we provide a method to query the current state, and then document that using it probably indicates that you have a bad design.

Mike

_____________________________________________ From: Kukanov, Alexey Sent: Friday, March 21, 2008 3:32 AM To: INNL TBB; Huson, Chris; Kim, Wooyoung; Poulsen, Dave; Robison, Arch; Voss, Michael J Cc: tbb...@li... Subject: Making upgrade_to_writer not fail if already locked for writing?

There was a request at the forum to make upgrade_to_writer work (and not run into an assertion) if the call is made when a lock is already held for writing. Here is a quotation: "If you try to call the method on a lock which is currently in the active_writer state however, one will run into an assertion. If one additional check would be introduced to the method (ok, which would mean some overhead) to see if the lock is already in the requested state and simply return without doing anything, my problem would be fixed without changing the interface." Do you think there is any harm in such an extension?

Another suggestion discussed in the thread was to provide a method that returns the current state of the mutex. I personally don't find this a good idea; in my opinion, a good design should not rely on methods requesting mutex state. Is there similar functionality for mutexes in pthread or the Win32 API?

Regards, - Alexey |
From: Adrien G. <aj....@gm...> - 2008-03-21 15:41:53
|
I'm considering getting an Intel Core 2 Quad-Core PC for my TBB/YetiSim development. This would be a bit of a purchase for me, and I'm curious what others are using for their TBB development. Does anyone have suggestions? I know Intel makes TBB, but AMD has cheaper quad-core processors, although I have heard that due to cache design the Intel ones are better. Any comments?

I have access to both a 128-processor Itanium 2 system and a dual-processor quad-core Xeon, but I can't really run profiling tools on these systems to analyze performance since they are a shared resource, and they are usually under high load anyway. The main point of a new system is for analyzing performance.

Thanks, AJ |
From: Adrien G. <aj....@gm...> - 2008-03-21 15:35:29
|
It was mentioned on the forums that a TBB priority queue was implemented before, but that the performance was horrible due to many atomic operations. Could we have this code made public so that we can learn from your mistakes without making them ourselves? AJ |
From: Adrien G. <aj....@gm...> - 2008-03-21 15:33:40
|
Personally, I think querying whether or not a mutex is in writer mode is an indication of a design flaw. It implies to me that the scoped_lock is itself being passed as an argument, the function body is massive, or the mutex is otherwise referenced elsewhere, and this seems to open a can of worms: the user might copy a scoped_lock when they meant to pass a reference.

The adjustment of upgrade_to_writer's behavior to do nothing if a write lock is already held might be a good idea; I'm kinda split on this one. In some sense, it means the user has lost track of where they are in their code. Personally, I have tended to upgrade to writer only when really required, then downgrade back to reader or release the mutex. On the other hand, if you are writing a long, complicated function it might be useful, although someone reading the code might find another call to upgrade_to_writer nonintuitive. Personally, I use mutexes only in short bursts so far.

AJ |
From: Kukanov, A. <Ale...@in...> - 2008-03-21 08:39:41
|
There was a request at the forum to make upgrade_to_writer work (and not run into an assertion) if the call is made when a lock is already held for writing. Here is a quotation: "If you try to call the method on a lock which is currently in the active_writer state however, one will run into an assertion. If one additional check would be introduced to the method (ok, which would mean some overhead) to see if the lock is already in the requested state and simply return without doing anything, my problem would be fixed without changing the interface." Do you think there is any harm in such an extension?

Another suggestion discussed in the thread was to provide a method that returns the current state of the mutex. I personally don't find this a good idea; in my opinion, a good design should not rely on methods requesting mutex state. Is there similar functionality for mutexes in pthread or the Win32 API?

Regards, - Alexey

-------------------------------------------------------------------- Closed Joint Stock Company Intel A/O Registered legal address: Krylatsky Hills Business Park, 17 Krylatskaya Str., Bldg 4, Moscow 121614, Russian Federation This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. |
From: Kukanov, A. <Ale...@in...> - 2008-03-17 16:42:24
|
I have found an interesting post in one of the Intel forums:

=====
hello, i want to get the time spent by any thread on win32. when i use GetThreadTimes(user, kernel) the minimum resolution is 1ms. how do i get deltas finer than 1ms? if i call GetThreadTimes() many times within that 1ms, the function always returns the same time!!!!!! QueryPerformanceCounter() uses high-resolution timing but is not correct for thread time. i want thread time (user & kernel) with high-resolution timing: 100ns. anyone know or have a solution????? Thanks
=====

It might be interesting and useful to extend TBB timing capabilities with a class that returns high-resolution user and kernel times for a thread. But do the operating systems of interest provide the necessary capabilities?

Regards, - Alexey

-------------------------------------------------------------------- Closed Joint Stock Company Intel A/O Registered legal address: Krylatsky Hills Business Park, 17 Krylatskaya Str., Bldg 4, Moscow 121614, Russian Federation This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. |
From: Kukanov, A. <Ale...@in...> - 2008-03-06 22:32:14
|
Hi everyone,

There was a problem reported recently in the TBB forum (see the post below). The user wanted to change the data item in Body::operator(), but passing the item by reference caused compilation errors. The user was misled by an error in the TBB documentation which said that Body::operator() takes items to process by reference.

That post got me thinking whether such a usage model is something TBB should be able to handle, and if so, what the best way to handle it is. For example, should parallel_do be extended or reworked for it, or would it be better implemented as a special algorithm? What do you think?

Regards, - Alexey Kukanov TBB developer @ Intel

________________________________ Original forum post: Posted By: devbisme in Intel(r) Threading Building Blocks Subject: problems using parallel_do View the complete topic at: http://softwarecommunity.intel.com/isn/Community/en-US/forums/thread/30250244.aspx

I'm trying to use parallel_do (using tbb20_20080226oss), but I'm getting an error from the compiler (MS VC++ 2005). I am trying to process some data in a simple structure maintained in a list:

    typedef struct { double v1, v2, v3; } my_item_t;
    typedef list<my_item_t> vec_list;

    class Body {
    public:
        typedef my_item_t argument_type;
        Body() {};
        // overload () so it does a vector multiply
        void operator() (argument_type& it) const { it.v3 = it.v1 * it.v2; }
    };

The () operator takes a reference to the simple structure as shown in the documentation for parallel_do. However, the compiler cannot find a () operator that matches the prototype it wants. If I change the operator by removing the reference, then it compiles with no errors:

    void operator() (argument_type it) const { it.v3 = it.v1 * it.v2; }

But this doesn't work because any changes made by the operator only occur in the local copy and don't get passed back into the main list.

I can get around this problem by making a list of pointers that point to the data structures, but it seems like the code shown above should work, and I want to understand where I'm making my mistake. Thanks for any enlightenment anyone can give.

-------------------------------------------------------------------- Closed Joint Stock Company Intel A/O Registered legal address: Krylatsky Hills Business Park, 17 Krylatskaya Str., Bldg 4, Moscow 121614, Russian Federation This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. |
From: Adrien G. <aj....@gm...> - 2008-03-05 18:51:18
|
Hi TBBers! I posted to the forum with this also, but the Google Summer of Code is on, and applications are due from sponsoring organizations soon! The deadline for application is March 12, 2008.

The Google Summer of Code program is an opportunity for students to participate in a sponsoring open-source project and to contribute with the assistance of a mentor. Here is a summary from the FAQ:

"*Google Summer of Code* (GSoC) is a program that offers student developers stipends to write code for various open source projects. Google will be working with several open source, free software, and technology-related groups to identify and fund several projects over a three-month period. Historically, the program has brought together over 1,500 students with over 130 open source projects to create millions of lines of code. The program, which kicked off in 2005 <http://code.google.com/soc/2005/>, is now in its fourth year."

I would be interested in participating, and I'm sure others would be too!

AJ |
From: Kukanov, A. <Ale...@in...> - 2008-02-15 09:21:05
|
We have received this report from several independent sources. It will be fixed soon.

Regards, - Alexey

-----Original Message----- From: tbb...@li... [mailto:tbb...@li...] On Behalf Of Roberto C. Sanchez Sent: Friday, February 15, 2008 02:35 To: tbb...@li... Subject: [Tbb-develop] Building with g++-4.3

A bug [0] was filed against the Debian package of TBB, since it fails to build with g++-4.3. I was able to correct the problem by adding:

    #include <cstring>

at line 34 of src/tbb/concurrent_vector.cpp. It would be great if this could be incorporated upstream.

Regards, -Roberto

[0] http://bugs.debian.org/465617

-- Roberto C. Sánchez http://people.connexer.com/~roberto http://www.connexer.com

-------------------------------------------------------------------- Closed Joint Stock Company Intel A/O Registered legal address: 125252, Moscow, Russian Federation, Chapayevsky Per, 14. This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. |
From: Roberto C. S. <ro...@co...> - 2008-02-14 23:34:30
|
A bug [0] was filed against the Debian package of TBB, since it fails to build with g++-4.3. I was able to correct the problem by adding:

    #include <cstring>

at line 34 of src/tbb/concurrent_vector.cpp. It would be great if this could be incorporated upstream.

Regards, -Roberto

[0] http://bugs.debian.org/465617

-- Roberto C. Sánchez http://people.connexer.com/~roberto http://www.connexer.com |