A colleague gave me this code to convert two characters read from a serial port to a float (well, double actually, I changed it but didn't update the function name):
There is an implied decimal place in the concatenated characters, thus the multiply-by-one-tenth operation. It seems like it would only work if a short int is 16 bits though. Is that always the case, or is it OS and/or compiler dependent? Is there a more explicit way to declare a 16-bit signed integer, or a better way to do the calculation?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
-
2007-10-31
It is architecture dependent. That said, due to the manner in which integer types must relate - each of char, short, int, long and long long in that order must be as large as or lager then the previous one - you will find that short is probably invariably 16 bits. In all 8, 16 and 32 bit compilers I have ever used, and all 64 bit compilers I am aware of, this is the case.
All that said, you are right to be concerned, and there is an very simple way of allaying your fears - <stdint.h>. This is standard in ISO C99 and available in most newer C and C++. This defines a number of size-specific integer types. There are a number types defined but the basic ones are:
As an aside, I can see no reason to make the variable 'answer' static. It achieves nothing while rendering the function non-reentrant. Which might be a concern in a multi-threaded application. To be honest I would remove the variable completely and simply have:
So when exactly do you need static when returning a value?
you do not "static" to calculate the result.
the returned value type if defined by function definition. there is double.
double returned by value.
so compiler at function exit will place a result on registers, or on stack.
if you have multithreading, this is safe.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
-
2007-11-01
I am sure most of your questions have been answered, but since they were directed at my response let me address them:
>> So when exactly do you need static when returning a value?
A static exists in memory permanently. Even when it is out of scope. This means two things: you can reference the data externally via a pointer or reference, you can 'remember' information between calls, such as a count of the number of calls for example, or recursion depth (if using a recursive function). All of these uses should be used with caution. In your code the return value is copied to the callers recieving variable or expressions, so it is entirely independent of the static.
>> In a multi-threaded application, what happens when a second thread tries to call that function?
The static variable will be common between threads - i.e. shared memory. Say thread A calls this function and calculates answer variable, but just before the return statement thread B pre-empts A and calls the function and it runs to completion, when the context returns to thread A the answer variable will contain the value calculated in thread B. Recursion is another instance of reentrancy, but one over which you have deterministic control - if you get it wrong it will probably fail every time, whereas the multi-threaded problem will fail probably very infrequently and in very unpredictable ways - unfortunately it will probably do this in the field rather than in development, which is far more expensive, both to fix, and in terms of your reputation.
>> I'm curious, why do you hate macros?
(not directed at my response but here's my opinion)
Macros are expanded by the pre-processor before compilation. If there is a problem, the compiler message refers the expansion not the original source, so they can be hard to interpret. If you do not take a great deal of care the expansion of a macro in the context of its surrounding text may be semantically different that you intended, but in the debugger for example, you will only see the macro not the expansion. Furthermore in a debugger you cannot step into a macro to debug it. There are techniques for writing 'safe' macros, when you know them and have tried to use them you will see that they are more trouble than the are worth. For almost all legitimate uses of a macro there is a far better alternative in C++.
Thanks Clifford. I hadn't come across that site you mentioned - it looks like it has a lot of good information. And between your comments and Soma's, I have a much better understanding of the problem with my TwoCharactersToFloat function as well as some potential problems with other code in my program that I hadn't even considered.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
OK so I have revised another piece of my bad code to be more thread-safe (hopefully). Please tell me if this is OK:
Old, bad:
char my_function(MY_STRUCT info)
{
static char x[1000];
// do stuff to x based on info
return &x[0];
}
New (MY_STRUCT now has a new member, char x[1000]):
char my_function(MY_STRUCT info)
{
char* temp;
temp = info->x;
// do stuff to temp based on info
return &temp[0];
}
I think I would still have problems if another function tried to modify info->x but that can be prevented in other ways. Did I at least improve it? I realize temp is probably not needed but it makes it easier for me to read.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
-
2007-11-01
&x[0] and &temp[0] are identical to simply x and temp respectively and does not really add clarity.
The variable temp is entirely redundant, you can simply return info->x. That said, having the temp variable is useful if there is an error condition where you might return null, and also as a shorthand.
If info did not really need the large array for anything else, making every instance of info carry it just for this might be inefficient. It may be better to pass the buffer as a separate parameter so you only create one when you need it.
I don't know where I picked up the habit of returning &x[0] instead of x. There was some case where the compiler threw a warning with just x but I can't remember what. I just remember changing to &x[0] and it worked. Is that possible or did I have another issue and I'm remembering wrong?
I do return NULL sometimes, but temp was really just because I don't like the way info->x looks in code. I guess I haven't used structures a lot and it looks odd to me for some reason. Also, I get paranoid about referencing the wrong thing within the struct by accident, whereas with a simple array of characters I'm more comfortable. I should get over it and learn the proper technique.
As for the size of the buffer, all of my info structures do need it (although maybe not that big in all cases) but either way it's only a dozen or so info structs so I'm wasting 12000 bytes. If it were 1000 info structs with 1000 byte buffers, I'd be more careful I suppose.
As always, thanks for the advice!
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I hate C macros because they are C macros. I hate them even more in C++. I hate them in C++ because they are still C macros. There are a handful of other kinds of macros that could have been used, but no, they had to design a simple minded preprocessor to do unchecked macro expansion meaning that every single rule of the compilation phase is ignored by the preprocessor. This is ridiculous, atrocious, and it adds nothing to the language that checked macros wouldn't have.
Soma
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
-
2007-11-02
>> ...but temp was really just because I don't like the way info->x looks in code...
Fine, but 'temp' is a really bad name for a variable. It means nothing, in the context of my work it could easily be assumed to mean temperature! It may be temporary, but that is not its purpose, that is just an indicator of its useful life. If it points to somthing, the name should probably reflect what it points to - even if it is temprorary.
>> ...I guess I haven't used structures a lot and it looks odd to me for some reason...
The you are really going to struggle with classes! ;-)
>> As for the size of the buffer, all of my info structures do need it...
In that case the static buffer version would never have been correct because there was only ever one buffer. Are you sure about this!?
Clifford
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The names info->x and temp were just in used in my dummy example. My actual code uses device_info->result and serial_port_message. Oddly enough this is a temperature logging application, so temp would indeed be a bad name for a temporary variable. :)
Classes scare the hell out of me. That's probably why I've stuck with C for so long.
The static version was horrible, right - that's what I was trying to improve. In the original version, if two devices tried to access their serial ports at the same time, that static buffer would get corrupted. Now at least each device can simultaneously do a serial access (or interpret a serial result) in separate threads. If two threads tried to access one device's port, I would still have a problem but I can prevent that at the user-interface level. To be truly safe I should probably use a mutex or something to block a device from calling the function twice, but for this application it might be overkill.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Anonymous
-
2007-11-02
>> Classes scare the hell out of me.
Worth the effort in the long term. If you have more that one temperature input to log from multiple ports, then this is screaming out to be a class. I guess one step at a time though, trying to use classes without first understanding why you'd want to is a recipe for getting it wrong.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
*Note: Clifford used the word 'reentrant' to describe a threading issue rather than a stack issue, but I have heard it used that way. To that end, I have avoided the word in my discussion.
"So when exactly do you need static when returning a value?"
Never. You seem to have confused the issue. Returning a value has nothing to do with static instances. You only need 'static' when setting aside an instance for future use. Whether or not you return anything from the function is irrelevant.
"you do not [...] this is safe."
I'm not sure what this person said, but he is very likely wrong. (Of course, I may have misunderstood.) C and C++ can only pass parameters and return values "by value". (When passing or returning pointers you are sending the value of the pointer "by value".) That is to say, the method by which C and C++ operate can't protect you from irregular access. Further, any 'static' instance used, both read and written, is never thread safe unless it is protected from irregular access by the appropriate mechanisms.
"Just when it's a pointer you're returning rather than the actual value?"
Returning a pointer has, again, nothing to do with static instances. You can safely, and sagely I should add, return a pointer passed to a function. Of course, if you do return a reference to a static instance via pointer or a C++ reference you can not maintain thread safety.
For example, you might write a function like this (that isn't thread safe):
It isn't thread safe for several reasons. What's more, because of the implementation, it can't even be made thread safe by using it carefully. It is simply broken in a threaded environment. I'll explain later. (For the record, this is why you shouldn't use 'strpbrk' and 'strtok' from the standard library. Some people think that one or the other is thread safe. The source for both may have been written to use the static pointer so even with checked access on your part they may not be thread safe.)
You might instead write a function like this (that still isn't thread safe):
This code is thread safe. For the record, it may produce incorrect results because of threading issues, but, for the purposes of a single execution, it is not going to break like the others will. And with careful use, unlike the others, it is perfectly thread safe and will produce correct results.
You might write a function like this (that isn't thread safe):
<code>
const char * itoa(int i)
{
static char string[32];
// I'm lazy, but obviously the conversion should now take place.
return(string);
}
</code>
That the return value is a pointer to a static instance implies a great deal. It can not be made thread safe, and worse, no number of checking or locking can make it thread safe. It must be rewritten. If you return a pointer to a static instance you are writing an unsound implementation.
You should instead write a function like this (that is thread safe):
<code>
const char * itoa(char * string, int length, int i)
{
// I'm lazy, but obviously the conversion should now take place.
return(string);
}
</code>
You might have tried using a lock and a few extra variables to fix the threading issues as you did in the first code, but that will not work. The problem with this code was that the instance itself was shared. You could not limit access to the returned pointer to the instance and so could no guarantee that the result would be valid.
"In a multi-threaded application, [...] to call that function?"
Here are the relevant bits from your own first post:
First off, even without reorganizing the execution, you have a problem in that 'answer' can change in the "time" it takes the return to take place. Consider what may happen when threads 'A' and 'B' call 'TwoCharactersToFloat'. (That this is unlikely is irrelevant. It could happen and so the code is unsound.)
(Understand this is intended to be a higher level view of the actual execution. The actual instructions and memory aspects are irrelevant. You should only notice that the address '$[004B8CD0]' is used in both codes.)
Thread 'A' executes the following:
<code>
mov flnm, @0.10 // set the multiply register to 0.10
mwc fln1, $[66014692] // move the integer 'temp' into a register
mlf fln1 // perform the multiplication
mov $[004B8CD0], flnr // store the result in 'answer'
mov fln1, $[004B8CD0] // copy the answer to the return register
</code>
Thread 'B' executes the following:
<code>
mov flnm, @0.10 // set the multiply register to 0.10
mwc fln1, $[66101049] // move the integer 'temp' into a register
mlf fln1 // perform the multiplication
mov $[004B8CD0], flnr // store the result in 'answer'
mov fln1, $[004B8CD0] // copy the answer to the return register
</code>
Do you see the problem now? (Once again, considering only the return statement.)
Thread 'A' can get to this instruction:
<code>
mov $[004B8CD0], flnr // store the result in 'answer'
</code>
at the same "time" as thread 'B' gets to this instruction:
<code>
mov fln1, $[004B8CD0] // copy the answer to the return register
</code>
The result, thread 'B' gets the answer 'A' was supposed to get and the answer thread 'B' was expecting is lost forever.
The compiler may optimize some uses of certain variables, as I have assumed by using 'answer' only once. The thing is, the compiler may also not optimize some uses of certain variables.
I see at least eight ways thread 'A' could trample on the results thread 'B' was expecting and visa-versa. And of course, depending on "where" each thread is, the results may be invalid for bother thread 'A' and 'B'.
Soma
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
OK, so I followed most of what you had to say, and I do understand why it's bad, for instance, to return a char pointer to a static char x[] within a function. What is still unclear is whether my function can be written to be thread-safe. It seems like at the end you were basically saying no. But is Clifford's version acceptable, where it never even stores answer at all but simply returns the result of the calculation?
Am I better off turning it into a macro, since it's pretty small anyway and only occurs three times in my entire program?
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
What is still unclear is whether my function can be written to be thread-safe.
If you need to have thread safe function, which "returns" something complex, simply put to a function additional parameter, and use only local variables in a function.
if you need, put all external values to a function by values, as parameters.
so function should work only with parameters, local variables and return result into buffer allocated by caller function.
//here we do the return of result to external buffer and exit.
some_copy_function(local_var, result);
}
and never use global variables(static are global too, but accessable(compiler checks it) only in a code block, where are defined).
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
"What is still unclear is whether my function can be written to be thread-safe."
I particularly addressed this by example. I suggest you read it again.
"It seems like at the end you were basically saying no. But is Clifford's version acceptable, where it never even stores answer at all but simply returns the result of the calculation?"
Of course, in this case, you are actually making a problem where there need not be one. If you remove the static keyword your example is fine--exactly as Clifford said.
"Am I better off turning it into a macro, since it's pretty small anyway and only occurs three times in my entire program?"
No. It should be a function. I hate macros. My own library has some 300,000 lines and uses exactly one macro per header (inclusion guard), one 'tracer' macro (a debug macro), one description macro (a 'named' macro), four portability macros (shared library import/export wrapper, restrict wrapper, inline wrapper, extern wrapper), and one multiple implementation macro (include the same header more than once to different effect). I don't see myself adding any more.
Soma
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I'm curious, why do you hate macros? And in a case like this, what makes a function better than a macro? I'm not a fan of macros, but I don't have valid reason for that I guess.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I'm not sure what this person said, but he is very likely wrong. (Of course, I may have misunderstood.) C and C++ can only pass parameters and return values "by value". (When passing or returning pointers you are sending the value of the pointer "by value".) That is to say, the method by which C and C++ operate can't protect you from irregular access. Further, any 'static' instance used, both read and written, is never thread safe unless it is protected from irregular access by the appropriate mechanisms.
I simply said that if you have types in a language, not scalar, but complex, for example arrays, structures, etc... as return by value(i.e. return by copiing of some data to private caller space or by data granted to private access to caller), as return by reference(there only addrres of potetially shared common data is granted to caller) are legal terms. Because this term are different.
if you have as a result of some function call - only address of some shared data, but not an adress of your private copy - this potentially leads to crash in multithreading.
adress of data really means a "tag", which caller code can use to have effective access to values stored in this "data".
to have thread safed code there are few simlpe rules.
1. if code works with private copy of data, and another thread cannot have access to this data - this is safe. local procedure variables or variables created(and disposed) on a heap, by this thread, if pointers to this heap areas are not visible to another threads - is safe also.
2. if you have common variables, and few threads can have access to this data - you must use some synchronization methods, like mutexes, critical sections, etc. to guarantee noninterrupted access of current thread to shared data. Usually it is called as atomic operation. So your operation with shared data must be atomic.
its enough.
Anyone can speak a lot about thread safety, but the matter of fact is - if two threads work with same variable in nonatomic operation - this is crash.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
A colleague gave me this code to convert two characters read from a serial port to a float (well, double actually, I changed it but didn't update the function name):
double TwoCharactersToFloat(unsigned char hi, unsigned char lo)
{
static double answer;
short int temp;
temp = (hi << 8) | (lo);
answer = temp * 0.10;
return answer;
}
There is an implied decimal place in the concatenated characters, thus the multiply-by-one-tenth operation. It seems like it would only work if a short int is 16 bits though. Is that always the case, or is it OS and/or compiler dependent? Is there a more explicit way to declare a 16-bit signed integer, or a better way to do the calculation?
It is architecture dependent. That said, due to the manner in which integer types must relate - each of char, short, int, long and long long in that order must be as large as or lager then the previous one - you will find that short is probably invariably 16 bits. In all 8, 16 and 32 bit compilers I have ever used, and all 64 bit compilers I am aware of, this is the case.
All that said, you are right to be concerned, and there is an very simple way of allaying your fears - <stdint.h>. This is standard in ISO C99 and available in most newer C and C++. This defines a number of size-specific integer types. There are a number types defined but the basic ones are:
int8_t
int16_t
int32_t
int64_t
uint8_t
uint16_t
uint32_t
uint64_t
In your case you need int16_t.
As an aside, I can see no reason to make the variable 'answer' static. It achieves nothing while rendering the function non-reentrant. Which might be a concern in a multi-threaded application. To be honest I would remove the variable completely and simply have:
include <stdint.h>
double TwoCharactersToFloat( uint8_t hi, uint8_t lo )
{
int16_t temp = (hi << 8) | lo ;
return temp * 0.10 ;
}
Clifford
So when exactly do you need static when returning a value? Just when it's a pointer you're returning rather than the actual value?
In a multi-threaded application, what happens when a second thread tries to call that function?
So when exactly do you need static when returning a value?
you do not "static" to calculate the result.
the returned value type if defined by function definition. there is double.
double returned by value.
so compiler at function exit will place a result on registers, or on stack.
if you have multithreading, this is safe.
I am sure most of your questions have been answered, but since they were directed at my response let me address them:
>> So when exactly do you need static when returning a value?
A static exists in memory permanently. Even when it is out of scope. This means two things: you can reference the data externally via a pointer or reference, you can 'remember' information between calls, such as a count of the number of calls for example, or recursion depth (if using a recursive function). All of these uses should be used with caution. In your code the return value is copied to the callers recieving variable or expressions, so it is entirely independent of the static.
>> In a multi-threaded application, what happens when a second thread tries to call that function?
The static variable will be common between threads - i.e. shared memory. Say thread A calls this function and calculates answer variable, but just before the return statement thread B pre-empts A and calls the function and it runs to completion, when the context returns to thread A the answer variable will contain the value calculated in thread B. Recursion is another instance of reentrancy, but one over which you have deterministic control - if you get it wrong it will probably fail every time, whereas the multi-threaded problem will fail probably very infrequently and in very unpredictable ways - unfortunately it will probably do this in the field rather than in development, which is far more expensive, both to fix, and in terms of your reputation.
>> I'm curious, why do you hate macros?
(not directed at my response but here's my opinion)
Macros are expanded by the pre-processor before compilation. If there is a problem, the compiler message refers the expansion not the original source, so they can be hard to interpret. If you do not take a great deal of care the expansion of a macro in the context of its surrounding text may be semantically different that you intended, but in the debugger for example, you will only see the macro not the expansion. Furthermore in a debugger you cannot step into a macro to debug it. There are techniques for writing 'safe' macros, when you know them and have tried to use them you will see that they are more trouble than the are worth. For almost all legitimate uses of a macro there is a far better alternative in C++.
Take a look at 32.4 to 39.6 here: http://www.new-brunswick.net/workshop/c++/faq/misc-technical-issues.html#faq-39.4
Clifford
Thanks Clifford. I hadn't come across that site you mentioned - it looks like it has a lot of good information. And between your comments and Soma's, I have a much better understanding of the problem with my TwoCharactersToFloat function as well as some potential problems with other code in my program that I hadn't even considered.
OK so I have revised another piece of my bad code to be more thread-safe (hopefully). Please tell me if this is OK:
Old, bad:
char my_function(MY_STRUCT info)
{
static char x[1000];
// do stuff to x based on info
return &x[0];
}
New (MY_STRUCT now has a new member, char x[1000]):
char my_function(MY_STRUCT info)
{
char* temp;
temp = info->x;
// do stuff to temp based on info
return &temp[0];
}
I think I would still have problems if another function tried to modify info->x but that can be prevented in other ways. Did I at least improve it? I realize temp is probably not needed but it makes it easier for me to read.
&x[0] and &temp[0] are identical to simply x and temp respectively and does not really add clarity.
The variable temp is entirely redundant, you can simply return info->x. That said, having the temp variable is useful if there is an error condition where you might return null, and also as a shorthand.
If info did not really need the large array for anything else, making every instance of info carry it just for this might be inefficient. It may be better to pass the buffer as a separate parameter so you only create one when you need it.
char my_function(MY_STRUCT info, char* x )
{
bool error = false ;
// do stuff to x based on info
// set error to true on failure
// Return NULL on error otherwise x
return error ? 0 : x ;
}
Which might then be called thus:
char buffer[1000] ;
if( my_function( info, buffer ) == 0 )
{
// handle error
}
else
{
// use data in buffer
}
To secure against buffer overrun you might need ot pass the buffer size to the function:
char my_function(MY_STRUCT info, char* x, size_t len ) ...
called thus:
if( my_function( info, buffer, sizeof(buffer) ) == 0 ) ...
Clifford
I don't know where I picked up the habit of returning &x[0] instead of x. There was some case where the compiler threw a warning with just x but I can't remember what. I just remember changing to &x[0] and it worked. Is that possible or did I have another issue and I'm remembering wrong?
I do return NULL sometimes, but temp was really just because I don't like the way info->x looks in code. I guess I haven't used structures a lot and it looks odd to me for some reason. Also, I get paranoid about referencing the wrong thing within the struct by accident, whereas with a simple array of characters I'm more comfortable. I should get over it and learn the proper technique.
As for the size of the buffer, all of my info structures do need it (although maybe not that big in all cases) but either way it's only a dozen or so info structs so I'm wasting 12000 bytes. If it were 1000 info structs with 1000 byte buffers, I'd be more careful I suppose.
As always, thanks for the advice!
I hate C macros because they are C macros. I hate them even more in C++. I hate them in C++ because they are still C macros. There are a handful of other kinds of macros that could have been used, but no, they had to design a simple minded preprocessor to do unchecked macro expansion meaning that every single rule of the compilation phase is ignored by the preprocessor. This is ridiculous, atrocious, and it adds nothing to the language that checked macros wouldn't have.
Soma
>> ...but temp was really just because I don't like the way info->x looks in code...
Fine, but 'temp' is a really bad name for a variable. It means nothing, in the context of my work it could easily be assumed to mean temperature! It may be temporary, but that is not its purpose, that is just an indicator of its useful life. If it points to somthing, the name should probably reflect what it points to - even if it is temprorary.
>> ...I guess I haven't used structures a lot and it looks odd to me for some reason...
The you are really going to struggle with classes! ;-)
>> As for the size of the buffer, all of my info structures do need it...
In that case the static buffer version would never have been correct because there was only ever one buffer. Are you sure about this!?
Clifford
The names info->x and temp were just in used in my dummy example. My actual code uses device_info->result and serial_port_message. Oddly enough this is a temperature logging application, so temp would indeed be a bad name for a temporary variable. :)
Classes scare the hell out of me. That's probably why I've stuck with C for so long.
The static version was horrible, right - that's what I was trying to improve. In the original version, if two devices tried to access their serial ports at the same time, that static buffer would get corrupted. Now at least each device can simultaneously do a serial access (or interpret a serial result) in separate threads. If two threads tried to access one device's port, I would still have a problem but I can prevent that at the user-interface level. To be truly safe I should probably use a mutex or something to block a device from calling the function twice, but for this application it might be overkill.
>> Classes scare the hell out of me.
Worth the effort in the long term. If you have more that one temperature input to log from multiple ports, then this is screaming out to be a class. I guess one step at a time though, trying to use classes without first understanding why you'd want to is a recipe for getting it wrong.
*Note: Clifford used the word 'reentrant' to describe a threading issue rather than a stack issue, but I have heard it used that way. To that end, I have avoided the word in my discussion.
"So when exactly do you need static when returning a value?"
Never. You seem to have confused the issue. Returning a value has nothing to do with static instances. You only need 'static' when setting aside an instance for future use. Whether or not you return anything from the function is irrelevant.
"you do not [...] this is safe."
I'm not sure what this person said, but he is very likely wrong. (Of course, I may have misunderstood.) C and C++ can only pass parameters and return values "by value". (When passing or returning pointers you are sending the value of the pointer "by value".) That is to say, the method by which C and C++ operate can't protect you from irregular access. Further, any 'static' instance used, both read and written, is never thread safe unless it is protected from irregular access by the appropriate mechanisms.
"Just when it's a pointer you're returning rather than the actual value?"
Returning a pointer has, again, nothing to do with static instances. You can safely, and sagely I should add, return a pointer passed to a function. Of course, if you do return a reference to a static instance via pointer or a C++ reference you can not maintain thread safety.
For example, you might write a function like this (that isn't thread safe):
<code>
char * tbrk(char * s, char p)
{
static char * string;
if(0 != s)
{
string = s;
}
while(p != string && 0 != string)
{
++string;
}
return(string);
}
</code>
It isn't thread safe for several reasons. What's more, because of the implementation, it can't even be made thread safe by using it carefully. It is simply broken in a threaded environment. I'll explain later. (For the record, this is why you shouldn't use 'strpbrk' and 'strtok' from the standard library. Some people think that one or the other is thread safe. The source for both may have been written to use the static pointer so even with checked access on your part they may not be thread safe.)
You might instead write a function like this (that still isn't thread safe):
<code>
char * tbrk(char * s, char p)
{
Mutex * lock = ObtainMutex(/?/);
static char * string;
if(0 != s)
{
string = s;
}
while(p != string && 0 != string)
{
++string;
}
FreeMutex(lock);
return(string);
}
</code>
This ignorant approach fails for exactly the same reasons.
You should instead, at the very least, write a function like this (that still isn't thread safe):
<code>
char * tbrk(char * s, char p)
{
static char * staticstring;
char * string;
Mutex * lock = ObtainMutex(/?/);
if(0 != s)
{
staticstring = s;
}
string = staticstring;
FreeMutex(lock);
while(p != string && 0 != string)
{
++string;
}
return(string);
}
</code>
This code is thread safe. For the record, it may produce incorrect results because of threading issues, but, for the purposes of a single execution, it is not going to break like the others will. And with careful use, unlike the others, it is perfectly thread safe and will produce correct results.
You might write a function like this (that isn't thread safe):
<code>
const char * itoa(int i)
{
static char string[32];
// I'm lazy, but obviously the conversion should now take place.
return(string);
}
</code>
That the return value is a pointer to a static instance implies a great deal. It can not be made thread safe, and worse, no number of checking or locking can make it thread safe. It must be rewritten. If you return a pointer to a static instance you are writing an unsound implementation.
You should instead write a function like this (that is thread safe):
<code>
const char * itoa(char * string, int length, int i)
{
// I'm lazy, but obviously the conversion should now take place.
return(string);
}
</code>
You might have tried using a lock and a few extra variables to fix the threading issues as you did in the first code, but that will not work. The problem with this code was that the instance itself was shared. You could not limit access to the returned pointer to the instance and so could no guarantee that the result would be valid.
"In a multi-threaded application, [...] to call that function?"
Here are the relevant bits from your own first post:
<code>
static double answer;
answer = temp * 0.10;
return answer;
</code>
First off, even without reorganizing the execution, you have a problem in that 'answer' can change in the "time" it takes the return to take place. Consider what may happen when threads 'A' and 'B' call 'TwoCharactersToFloat'. (That this is unlikely is irrelevant. It could happen and so the code is unsound.)
(Understand this is intended to be a higher level view of the actual execution. The actual instructions and memory aspects are irrelevant. You should only notice that the address '$[004B8CD0]' is used in both codes.)
Thread 'A' executes the following:
<code>
mov flnm, @0.10 // set the multiply register to 0.10
mwc fln1, $[66014692] // move the integer 'temp' into a register
mlf fln1 // perform the multiplication
mov $[004B8CD0], flnr // store the result in 'answer'
mov fln1, $[004B8CD0] // copy the answer to the return register
</code>
Thread 'B' executes the following:
<code>
mov flnm, @0.10 // set the multiply register to 0.10
mwc fln1, $[66101049] // move the integer 'temp' into a register
mlf fln1 // perform the multiplication
mov $[004B8CD0], flnr // store the result in 'answer'
mov fln1, $[004B8CD0] // copy the answer to the return register
</code>
Do you see the problem now? (Once again, considering only the return statement.)
Thread 'A' can get to this instruction:
<code>
mov $[004B8CD0], flnr // store the result in 'answer'
</code>
at the same "time" as thread 'B' gets to this instruction:
<code>
mov fln1, $[004B8CD0] // copy the answer to the return register
</code>
The result, thread 'B' gets the answer 'A' was supposed to get and the answer thread 'B' was expecting is lost forever.
The compiler may optimize some uses of certain variables, as I have assumed by using 'answer' only once. The thing is, the compiler may also not optimize some uses of certain variables.
Look at your code again in a new light:
<code>
static double answer;
answer = temp;
answer *= 0.10;
return answer;
</code>
I see at least eight ways thread 'A' could trample on the results thread 'B' was expecting and visa-versa. And of course, depending on "where" each thread is, the results may be invalid for bother thread 'A' and 'B'.
Soma
grumble
"You should instead, at the very least, write a function like this (that still isn't thread safe):"
This is exactly what I get for writing my response out of order...
Soma
OK, so I followed most of what you had to say, and I do understand why it's bad, for instance, to return a char pointer to a static char x[] within a function. What is still unclear is whether my function can be written to be thread-safe. It seems like at the end you were basically saying no. But is Clifford's version acceptable, where it never even stores answer at all but simply returns the result of the calculation?
Am I better off turning it into a macro, since it's pretty small anyway and only occurs three times in my entire program?
What is still unclear is whether my function can be written to be thread-safe.
If you need to have thread safe function, which "returns" something complex, simply put to a function additional parameter, and use only local variables in a function.
if you need, put all external values to a function by values, as parameters.
so function should work only with parameters, local variables and return result into buffer allocated by caller function.
void func(...., MY_COMPLEX_TYPE* result)
{
MY_COMPLEX_TYPE local_var;
bla-bla-bla...
//here we do the return of result to external buffer and exit.
some_copy_function(local_var, result);
}
and never use global variables(static are global too, but accessable(compiler checks it) only in a code block, where are defined).
"What is still unclear is whether my function can be written to be thread-safe."
I particularly addressed this by example. I suggest you read it again.
"It seems like at the end you were basically saying no. But is Clifford's version acceptable, where it never even stores answer at all but simply returns the result of the calculation?"
Of course, in this case, you are actually making a problem where there need not be one. If you remove the static keyword your example is fine--exactly as Clifford said.
"Am I better off turning it into a macro, since it's pretty small anyway and only occurs three times in my entire program?"
No. It should be a function. I hate macros. My own library has some 300,000 lines and uses exactly one macro per header (inclusion guard), one 'tracer' macro (a debug macro), one description macro (a 'named' macro), four portability macros (shared library import/export wrapper, restrict wrapper, inline wrapper, extern wrapper), and one multiple implementation macro (include the same header more than once to different effect). I don't see myself adding any more.
Soma
I'm curious, why do you hate macros? And in a case like this, what makes a function better than a macro? I'm not a fan of macros, but I don't have valid reason for that I guess.
I'm not sure what this person said, but he is very likely wrong. (Of course, I may have misunderstood.) C and C++ can only pass parameters and return values "by value". (When passing or returning pointers you are sending the value of the pointer "by value".) That is to say, the method by which C and C++ operate can't protect you from irregular access. Further, any 'static' instance used, both read and written, is never thread safe unless it is protected from irregular access by the appropriate mechanisms.
I simply said that if you have types in a language, not scalar, but complex, for example arrays, structures, etc... as return by value(i.e. return by copiing of some data to private caller space or by data granted to private access to caller), as return by reference(there only addrres of potetially shared common data is granted to caller) are legal terms. Because this term are different.
if you have as a result of some function call - only address of some shared data, but not an adress of your private copy - this potentially leads to crash in multithreading.
adress of data really means a "tag", which caller code can use to have effective access to values stored in this "data".
to have thread safed code there are few simlpe rules.
1. if code works with private copy of data, and another thread cannot have access to this data - this is safe. local procedure variables or variables created(and disposed) on a heap, by this thread, if pointers to this heap areas are not visible to another threads - is safe also.
2. if you have common variables, and few threads can have access to this data - you must use some synchronization methods, like mutexes, critical sections, etc. to guarantee noninterrupted access of current thread to shared data. Usually it is called as atomic operation. So your operation with shared data must be atomic.
its enough.
Anyone can speak a lot about thread safety, but the matter of fact is - if two threads work with same variable in nonatomic operation - this is crash.