From: Konstantin S. <kon...@gm...> - 2009-06-02 07:50:06
On Mon, Jun 1, 2009 at 9:54 PM, Bart Van Assche
<bar...@gm...> wrote:
> On Mon, Jun 1, 2009 at 10:24 AM, Konstantin Serebryany
> <kon...@gm...> wrote:
>> On Sat, May 30, 2009 at 3:35 PM, Bart Van Assche
>> <bar...@gm...> wrote:
>>> A few remarks about the semantics of the ANNOTATE_* macros:
>>> * I do not really like ANNOTATE_PUBLISH_MEMORY_RANGE. The comment
>>> above this macro says more or less that any other thread may access
>>> the published memory range safely after it has been published.
>>> However, no matter which synchronization instructions have been issued
>>> by the publishing thread, a consumer thread may only access the
>>> published memory safely after proper synchronization with the
>>> publishing thread. So my proposal is to remove this annotation and to
>>> use ANNOTATE_MUTEX_IS_USED_AS_CONDVAR instead.
>>
>> ANNOTATE_MUTEX_IS_USED_AS_CONDVAR is a big hammer, as it essentially
>> makes the detection pure happens-before (h-b).
>> PUBLISH_MEMORY_RANGE() is needed for hybrid mode.
>>
>> I am not a great expert in lock-less synchronization but I believe
>> that an object could be published safely in a way that does not
>> require any action by a consumer.
>> You can publish an object with just one CAS (at least on x86?). No?
>> So, you can use this annotation in a situation where you don't have
>> locks at all.
>
> The annotations should be general enough such that these are useful
> for any modern memory architecture. It's not entirely clear to me what
> the intended semantics of PUBLISH_MEMORY_RANGE() are. How does it, e.g.,
> map onto the memory barrier instructions as defined by the Alpha
> architecture or the acquire/release labels as defined by the Itanium
> architecture? On these two architectures making sure that all the
> store operations performed on one CPU are visible on another CPU
> requires the following:
> * First CPU modifies an object as necessary.
> * First CPU issues a memory barrier and sets a flag (Alpha) or updates
> a flag via a store operation that has release semantics (Itanium).
> * Second CPU observes that the flag has been set and issues a memory
> barrier (Alpha) or observes that the flag has been modified through a
> load with acquire semantics (Itanium).
> * Second CPU loads object data.
> The update of the flag is necessary to make sure that all store
> operations performed by the first CPU will be observed by the second
> CPU: many memory consistency models allow stores to be reordered if
> not explicitly prevented. My point is that on a multiprocessor with
> sufficiently weak ordering guarantees, you can't just publish memory
> modifications. Cooperation of the consumer is needed to make sure that
> the intended semantics are realized.
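
For reference, the four steps listed above can be sketched roughly as follows. This is only an illustration, assuming C++11 std::atomic as a stand-in for the raw Alpha barriers and Itanium acquire/release operations; the names Payload, g_payload, g_ready, Producer, Consumer and RunOnce are made up for the example and do not come from any of the tools discussed here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical object data; plain (non-atomic) on purpose.
struct Payload { int value = 0; };

Payload g_payload;
std::atomic<bool> g_ready{false};  // the "flag" from the steps above

void Producer() {
  g_payload.value = 42;                            // step 1: modify the object
  g_ready.store(true, std::memory_order_release);  // step 2: release store to the flag
}

int Consumer() {
  // Step 3: observe the flag through a load with acquire semantics.
  while (!g_ready.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }
  return g_payload.value;                          // step 4: load object data
}

// Runs both sides once and returns what the consumer observed.
int RunOnce() {
  g_payload.value = 0;
  g_ready.store(false, std::memory_order_relaxed);
  int seen = -1;
  std::thread consumer([&seen] { seen = Consumer(); });
  std::thread producer(Producer);
  producer.join();
  consumer.join();
  assert(g_ready.load(std::memory_order_relaxed));  // the producer has run
  return seen;
}
```

The point of the sketch is that both halves are needed: the release store alone does not order anything for a consumer that reads the flag without acquire semantics.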
Let's not argue about lock-less synchronization for now.
Just think about hybrid detectors:
  Object *o = NULL;

  void Thread1() {
    Object *t = new Object;
    ScopedMutexLock lock(&mu);
    o = t;
    ANNOTATE_PUBLISH_MEMORY_RANGE(o, sizeof(*o));
  }

  void Thread2() {
    ScopedMutexLock lock(&mu);
    if (o) o->UseMe();
  }

  ...
  // we have 999 different places where 'o' is used.

  void Thread999() {
    ScopedMutexLock lock(&mu);
    if (o) o->UseMeInSomeOtherWay();
  }
Here, without the annotation, a hybrid detector will report a false
positive because the object was constructed outside the mutex.
The same thing could be done with ANNOTATE_HAPPENS_*, but it would
require 1000 annotations (one where 'o' is published plus one at each
of the 999 uses) instead of just one.
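
To make the cost concrete, here is a rough sketch of what the ANNOTATE_HAPPENS_* variant of the example might look like. It assumes the one-argument forms ANNOTATE_HAPPENS_BEFORE(obj) and ANNOTATE_HAPPENS_AFTER(obj); the macros are stubbed out as no-ops so the snippet builds without the annotations header, std::mutex stands in for ScopedMutexLock, and the Object body is invented for illustration.

```cpp
#include <cassert>
#include <mutex>

// Assumption: the real macros come from the dynamic annotations header.
// They are defined as no-ops here only so that this sketch compiles alone.
#ifndef ANNOTATE_HAPPENS_BEFORE
#define ANNOTATE_HAPPENS_BEFORE(obj) ((void)(obj))
#define ANNOTATE_HAPPENS_AFTER(obj) ((void)(obj))
#endif

// Hypothetical Object; the counter exists only to make the uses visible.
struct Object {
  int uses = 0;
  void UseMe() { ++uses; }
  void UseMeInSomeOtherWay() { uses += 10; }
};

Object *o = nullptr;
std::mutex mu;

void Thread1() {
  Object *t = new Object;        // constructed outside the mutex
  ANNOTATE_HAPPENS_BEFORE(&o);   // one annotation on the publishing side
  std::lock_guard<std::mutex> lock(mu);
  assert(o == nullptr);          // the sketch publishes only once
  o = t;
}

void Thread2() {
  std::lock_guard<std::mutex> lock(mu);
  if (o) {
    ANNOTATE_HAPPENS_AFTER(&o);  // one more annotation at this use site
    o->UseMe();
  }
}

void Thread999() {
  std::lock_guard<std::mutex> lock(mu);
  if (o) {
    ANNOTATE_HAPPENS_AFTER(&o);  // and again here, and at every other use site
    o->UseMeInSomeOtherWay();
  }
}
```

Every function that touches 'o' needs its own HAPPENS_AFTER, which is exactly the repetition a single PUBLISH_MEMORY_RANGE call avoids.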
>
> Looking at unit test 92 I get the impression that the semantics of
> PUBLISH_MEMORY_RANGE() are similar to those of the happens before /
> happens after annotations but only for an address range instead of all
> memory locations?
>
PUBLISH_MEMORY_RANGE creates the same h-b arc as any other h-b
annotation or as e.g. sem_post/sem_wait.
The difference is that there is just one annotation.
The h-b edge is created between the call to PUBLISH_MEMORY_RANGE(mem,
size) and subsequent accesses to memory within the range [mem,
mem+size).
Once the memory [mem, mem+size) is freed, the range is no longer
treated as published and no further h-b edges are created for it.
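
The range and lifetime rule can be pictured with a toy model like the one below. This is not how any detector implements it; it is only bookkeeping to illustrate the half-open range [mem, mem+size) and the effect of freeing. The class PublishedRanges and its methods are hypothetical, and the model assumes published ranges do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Toy bookkeeping only: maps the start of each published range to its end.
class PublishedRanges {
 public:
  // Models a call to PUBLISH_MEMORY_RANGE(mem, size).
  void Publish(const void *mem, std::size_t size) {
    assert(size > 0);
    std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(mem);
    ranges_[begin] = begin + size;
  }

  // Models freeing the memory: the range stops being published.
  void Free(const void *mem) {
    ranges_.erase(reinterpret_cast<std::uintptr_t>(mem));
  }

  // True if addr lies inside some published range [mem, mem+size).
  bool Covers(const void *addr) const {
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(addr);
    auto it = ranges_.upper_bound(a);  // first range starting after a
    if (it == ranges_.begin()) return false;
    --it;                              // last range starting at or before a
    return a < it->second;             // the end of the range is exclusive
  }

 private:
  std::map<std::uintptr_t, std::uintptr_t> ranges_;
};
```

In this model an access gets the h-b edge only while Covers() is true for its address: mem+size-1 is inside the range, mem+size is not, and nothing is covered once Free() has been called.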
--kcc