|
From: Liam W. <lw...@ca...> - 2005-05-16 19:04:54
|
I am writing a program to parse a certain webpage and I am checking to
see if the first characters returned are 'HTTP' (which is what the
standard specifies).
When I call my parsing function I pass it a pointer to a char and malloc
memory inside the parsing function for the char. However, when I return
from the parsing function the char that I passed it doesn't point to the
data parsed.
Basically I have a char*
char* dataToken
then I have a function
parseData(dataToken, data)
If I malloc data for dataToken before calling parseData, the function
works as expected and I can use dataToken after the call to parseData.
However, if I pass in a simple char* and malloc memory for dataToken
inside parseData, when parseData returns dataToken contains seemingly
random data.
Is malloced memory, created in a function call, released after the
function has executed, in the same way as the function's local variables
are released once the function is executed?
Here is code from my project.
int checkHTTPData(char* HTTPData)
{
    int err = 0;
    if (strlen(HTTPData) < 12)
    {
        printf("HTTP Data too small");
        return AP_HTTP_DATA_TOO_SMALL;
    }

    char* dataToken;
    int tokenSize;
    tokenSize = TOKEN_STRING_SIZE;
    /* if I include this next block of commented out code and reserve memory
       for dataToken
       before I pass it as an argument to getDataToken everything works fine.
    dataToken = (char*) malloc(tokenSize);
    if (dataToken == '\0')
    {
        return AP_MALLOC_ERROR;
    }
    memset(dataToken, 0, tokenSize);
    */
    /* here is the function that does the parsing dataToken is the char* */
    err = getDataToken(HTTPData, dataToken, 0);
    /* after this function is called dataToken seems to point at random memory*/
    printf("dataToken:%s", dataToken);

    if (strncmp(dataToken, "HTTP", 4) == 0)
    {
        printf("It is HTTP data");
    }
    return err;
}
/* Here is the getDataTokenFunction */
int getDataToken(char* data, char* dataToken, int tokenNbr)
{
    char* tempToken;
    char currentChar;
    int tokernPos;
    int whiteSpaceNbr= 0;
    int tokenPlace = 0;
    int skipWhiteSpaceReturn;
    int startOfChar;
    int stringPos;
    int tokenSize;
    int i = 0;
    int j = 0;
    /*this function skips over any white space at the beginning of the
      token*/
    skipWhiteSpaceReturn = skipWhiteSpace(data, i, &startOfChar);
    /* this skips over continuous groups of characters separated by
       whitespace until the desired token is reached */
    for (i=startOfChar; i < strlen(data); i++)
    {
        if ((data[i] == ' ') || (data[i] == '\t') || (data[i] == '\n'))
        {
            if (whiteSpaceNbr == tokenNbr)
            {
                i = strlen(data);
            }
            else
            {
                whiteSpaceNbr++;

                // set the string position to the start of the new token
                skipWhiteSpaceReturn = skipWhiteSpace(data, i, &startOfChar);

                //startOfChar points to the new string after the white space has been skipped
                i = startOfChar;
            }
        }
    }
    // set string position to the start of the first non-whitespace character
    stringPos = startOfChar;
    /* if I pass in dataToken already malloced from checkHTTPData I don't
       include this next bit
       However if I pass dataToken in as a simple char* without mallocing
       in checkHTTPData then I include
       this part
    tokenSize = TOKEN_STRING_SIZE;
    dataToken = (char*) malloc(tokenSize);
    if (token == '\0')
    {
        return AP_MALLOC_ERROR;
    }
    memset(token, 0, tokenSize);
    */

    while ((data[stringPos] != ' ') && (data[stringPos] != '\0'))
    {
        if (tokenPlace >= tokenSize)
        {
            tokenSize = TOKEN_STRING_SIZE * ((tokenSize / TOKEN_STRING_SIZE) + 1);
            tempToken = (char*) realloc(dataToken, tokenSize);

            if (tempToken == '\0')
            {
                free(token);
                return AP_REALLOC_ERROR;
            }

            datatoken = tempToken;
        }

        /* here is where the string gets copied when I run through this
           function in gdb the correct data
           gets assigned to dataToken however once the function returns the
           data is no longer there
           and dataToken points to random data (or it seems random anyway)
           if however I malloced dataToken before passing it to this
           function everything works as expected.
        */

        dataToken[tokenPlace] = data[stringPos];
        dataTokenPlace++;
        stringPos++;
    }

    dataToken[tokenPlace] = '\0';
    return 0;
}
|
|
From: Greg C. <chi...@co...> - 2005-05-16 19:32:26
|
On 2005-05-16 15:04 PM, Liam Whalen wrote:
>
> Is malloced memory, created in a function call, released after the
> function has executed; in the same way as the function's local variables
> are released once the function is executed?
No, memory allocated with malloc() remains allocated until you free() it.
It's not implicitly deallocated when you return from the function that
called malloc().
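To illustrate the point that malloc()ed memory outlives the function that allocated it, here is a minimal sketch. The helper name copyString is hypothetical and not part of the posted code; it only shows that a heap block stays valid after the allocating function returns, as long as the pointer value is handed back to the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns a
 * heap-allocated copy of s, or NULL if malloc fails. The block remains
 * valid after this function returns; the caller must free() it. */
char *copyString(const char *s)
{
    size_t len = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);

    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    return copy;                  /* the pointer value is handed back */
}
```

A caller would use it as `char *p = copyString("HTTP/1.1");` and later call `free(p);`. Only the local variable `copy` disappears on return, not the block it pointed to.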
|
From: John F. <jo...@ti...> - 2005-05-16 23:06:38
|
This failure is due to C being a 'call by value' language, not a 'call
by reference' language. That is, each function you call with an argument
list gets the current value of those arguments passed to it on the stack.
So you allocate a char * variable local to the function checkHTTPData,
but before you call your getDataToken function you don't initialize it
to anything, which means it just has random contents of the memory words
that get allocated as the stack frame for checkHTTPData. Then when you
call getDataToken, that random value is passed as an argument. That
random value is overwritten inside getDataToken when you call malloc,
but only for the local values within getDataToken, within its stack
frame. When you return from getDataToken, its stack frame is discarded
and the value that checkHTTPData knows about is still its uninitialized,
random value on the frame for checkHTTPData.
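The call-by-value behaviour described above can be reproduced in a few lines. This is only a sketch with made-up names (allocByValue, callerPointerUnchanged); it assumes nothing about the original project beyond the pattern of assigning to a pointer parameter.

```c
#include <stdlib.h>

/* Hypothetical: assigns to its own copy of the pointer only. */
void allocByValue(char *p)
{
    p = malloc(16);   /* changes the local copy; the caller never sees it */
    free(p);          /* freed here only so the sketch does not leak */
}

/* Returns 1 if the caller's pointer is unchanged after the call. */
int callerPointerUnchanged(void)
{
    char *token = NULL;

    allocByValue(token);   /* passes the value NULL, not the variable */
    return token == NULL;
}
```

In the posted code the caller's pointer starts out uninitialized rather than NULL, so after the call it still holds whatever happened to be on the stack.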
If you want to malloc inside getDataToken the way to do it is pass a
pointer to dataToken (now of type char **) as an argument to
getDataToken, and let getDataToken modify the value up in the
checkHTTPData frame. I guess I would just continue to malloc in
checkHTTPData if it were my code.
John
|
|
From: Stephen R. <st...@mr...> - 2005-05-16 23:11:34
|
Liam Whalen wrote:
...snip
> When I call my parsing function I pass it a pointer to a char and malloc
> memory inside the parsing function for the char. However when I return
> from the parsing program the char that I passed it doesn't point to the
> data parsed.
>
My understanding is that parameters to a function cannot be modified,
ie. you cannot pass a pointer to a function and change the location the
pointer points to. Instead, you must pass a pointer to a pointer.
So then, you would have something like
char **dataToken;
then you call
int someFunction(dataToken){
in the function you would
*dataToken = (char*) malloc(...)
and then later you can use *dataToken to reference the malloc'ed memory.
If any of this is wrong, I apologize. It's been awhile since I've
actually used any C.
Stephen
|
|
From: John F. <jo...@ti...> - 2005-05-16 23:38:44
|
Well, close but not quite.
If you were to define dataToken as char ** in the calling function and
then call
someFunction(dataToken);
you'd still be using an undefined value to call the function, since you
haven't assigned the variable dataToken to anything yet. This would most
likely result in a segmentation fault down in the called function when
you dereference the pointer.
If the malloc just has to be located down in the called function, I
would have in the calling function:
char *dataToken;
...
someFunction(&dataToken);
then the called function would be as you have it,
int someFunction(char **dataTokenPtr) {
...
*dataTokenPtr = (char*) malloc(...)
...
strcmp("HTTP",*dataTokenPtr);
...
So then in the calling function the argument &dataToken (which is of
type char **) is certainly defined, and is simply the address of a
variable on the local stack, that variable being of type char *. The
malloc statement in the called function would then modify the value of
(char *) dataToken back up in the stack frame of the calling function.
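Filled in as a complete function, the pattern might look like the sketch below. The name firstToken and its return codes are assumptions made for illustration; it is not the original getDataToken, and it only extracts the first whitespace-delimited word.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical: stores a newly allocated copy of the first
 * whitespace-delimited word of data in *tokenPtr.
 * Returns 0 on success, -1 if malloc fails. */
int firstToken(const char *data, char **tokenPtr)
{
    size_t start = strspn(data, " \t\n");          /* skip leading whitespace */
    size_t len = strcspn(data + start, " \t\n");   /* length of the word */
    char *token = malloc(len + 1);

    if (token == NULL)
        return -1;

    memcpy(token, data + start, len);
    token[len] = '\0';
    *tokenPtr = token;   /* writes into the caller's variable */
    return 0;
}
```

The caller declares `char *dataToken;`, calls `firstToken(HTTPData, &dataToken)`, and is responsible for `free(dataToken)` afterwards.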
But it becomes real hard to keep track of your memory allocation when
you have memory malloced in a different function than where you should
be freeing it. This is really a recipe for memory leaks when the program
gets big. This is why I would suggest mallocing in the same function
that decides that the string isn't needed anymore and frees it.
John
|
|
From: Michael G. <mg...@te...> - 2005-05-17 01:55:02
|
Hi Liam!
You have a misconception about passing pointers to subroutines.
You have
    int getDataToken(char* data, char* dataToken, int tokenNbr)
which means getDataToken gets a _local_copy_ of dataToken (which initially
points to the same memory as the copy in the invoking code). Later in your
function you change the value of the copy by calling realloc. Now you have
changed the copy but not the original. In fact you may also have
invalidated the memory the original points to, since realloc is allowed to
move the block.
To achieve what you are after you'll have to pass a _pointer_ to the
pointer you wish to change:
    int getDataToken(char* data, char** dataToken, int tokenNbr)
which is invoked by
    err = getDataToken(HTTPData, &dataToken, 0);
Inside of getDataToken you have to replace all dataToken by (*dataToken).
Doing that will yield the desired result (i.e. not regarding possible other
programming errors ;-)
HTH,
best,
Michael
--
Vote against SPAM - see http://www.politik-digital.de/spam/
Michael Gerdau email: mg...@te...
GPG-keys available on request or at public keyserver
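The realloc case is where the pointer-to-pointer matters most, because realloc may move the block. Below is a sketch of a growing buffer that keeps the caller's pointer up to date through a char**. The function appendChar, its parameters, and the doubling growth policy are assumptions for illustration and are not taken from the posted code.

```c
#include <stdlib.h>

/* Hypothetical: appends c to the string in *buf, growing it as needed.
 * *buf may be NULL with *cap == 0 on the first call.
 * Returns 0 on success, -1 if realloc fails (the old buffer is then
 * left intact and still owned by the caller). */
int appendChar(char **buf, size_t *len, size_t *cap, char c)
{
    if (*len + 2 > *cap) {                 /* room for c plus '\0' */
        size_t newCap = (*cap == 0) ? 16 : *cap * 2;
        char *tmp = realloc(*buf, newCap);

        if (tmp == NULL)
            return -1;

        *buf = tmp;                        /* update the caller's pointer */
        *cap = newCap;
    }

    (*buf)[(*len)++] = c;
    (*buf)[*len] = '\0';
    return 0;
}
```

Assigning the realloc result to a temporary first means a failed call does not lose the original block.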
|
From: John G. <jo...@jo...> - 2005-05-17 03:13:12
|
Liam Whalen wrote:
> When I call my parsing function I pass it a pointer to a char and
> malloc memory inside the parsing function for the char. However when
> I return from the parsing program the char that I passed it doesn't
> point to the data parsed.
You should always allocate memory before calling the function that acts on
it. For example, check how sprintf() works. Give a pointer and a buffer
length, or assume (and document) that the buffer is a specific length.
Check for null pointers.
I did a quick check of your code and found a few inconsistencies. If you
still cannot get it to work the way you want, please narrow it down to the
shortest, most concise example that still does not work and post that. I
think this is more of a design issue, however.
--
John Gaughan
http://www.johngaughan.net/
jo...@jo...
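A sketch of the caller-allocates design suggested above, with a pointer plus an explicit buffer length and null checks. The name copyFirstToken and its return codes are hypothetical; the function never allocates, so the caller keeps full ownership of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: copies the first whitespace-delimited word of data into
 * out, which holds outSize bytes. Returns 0 on success, -1 on a NULL
 * argument or if the token plus its '\0' does not fit. */
int copyFirstToken(const char *data, char *out, size_t outSize)
{
    size_t start, len;

    if (data == NULL || out == NULL || outSize == 0)
        return -1;

    start = strspn(data, " \t\n");          /* skip leading whitespace */
    len = strcspn(data + start, " \t\n");   /* length of the word */

    if (len + 1 > outSize)
        return -1;                          /* refuse to overflow out */

    memcpy(out, data + start, len);
    out[len] = '\0';
    return 0;
}
```

With this shape the caller can use a plain `char buf[16];` on the stack and pass `sizeof buf`, so there is nothing to free.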
|
From: Tuomo L. <dj...@mb...> - 2005-05-17 10:46:57
|
Liam Whalen wrote:
> if I malloc data for dataToken before calling parseData the function works as expected
> and I can use dataToken after the call to parseData. However if I pass in a simple char*
> and malloc memory for dataToken inside parseData when parseData returns dataToken
> contains seemingly random data.
As others have already stated, this is because the change is not
passed back to the calling function. You are only changing the variable
inside the local scope. The solution is to use pointer to the
pointer (char **dataToken) as parameter.
> tokenSize = TOKEN_STRING_SIZE;
...
> dataToken = (char*) malloc(tokenSize);
I hope you are aware of the fact that you generally need to
allocate (string_length + 1) because the string is terminated
with zero character ('\0', ascii control character NUL).
AFAIK all libc string routines depend on this.
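As a small illustration of the extra byte, here is a sketch that joins two strings. The helper joinStrings is hypothetical; the point is only the size passed to malloc.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical: returns a newly allocated string holding a followed by
 * b, or NULL if malloc fails. The caller must free() the result. */
char *joinStrings(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *out = malloc(la + lb + 1);   /* one extra byte for '\0' */

    if (out == NULL)
        return NULL;

    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);       /* copies b's terminator too */
    return out;
}
```

Allocating only `la + lb` bytes would leave no room for the terminator, and routines such as strlen() or printf("%s") would read past the end of the block.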
--
Tuomo
... Adding 100 to 486 made 585.999874653, so they named it "Pentium"
|