Tokenize Function

Alexsandro
2009-03-05
2012-09-26
  • Alexsandro

    Alexsandro - 2009-03-05

    Dear all,

    I have built a function to tokenize a sentence and return the words to my main function. It seems OK, but it's not working the way I intend.

    The code is:

    main()
    {
    char str[50];
    char *array[50];
    printf("Token 0.1\nDeveloped by Alexsandro Meireles\nType");
    printf(" the sentence: ");
    gets(str);
    tokenizar(str);
    printf("\n");
    printf("Word #%d is %s\n",0, array[0]); //PROBLEM IS HERE!!!!!
    system("pause");
    }

    I need access to the words returned by the function tokenizar, but what I get as a result, for example for "a casa pegou fogo.", is:

    Word #0 is 'a'
    Word #1 is 'casa'
    Word #2 is 'pegou'
    Word #3 is 'fogo'

    AND THEN A LOOP

    Word #0 is Word #0 is Word #0 is Word #0 is ...

    IN OTHER WORDS, I CAN'T HAVE ACCESS TO THE ARRAY OF WORDS CREATED INSIDE 'tokenizar'.

    What am I doing wrong?

    Thanks in advance!

    --
    Prof. Dr. Alexsandro Meireles, linguist
    Federal University of Espírito Santo
    Departamento de Línguas e Letras
    Av. Fernando Ferrari, 514. Campus Universitário
    Goiabeiras. 29075-910
    Vitória-ES. Brazil
    meirelesalex@gmail.com
    +55-27-41021734

     
    • cpns

      cpns - 2009-03-05

      The code you posted certainly does not produce that output. The line:

      > printf("Word #%d is %s\n",0, array[0]); //PROBLEM IS HERE!!!!!

      is not in a loop of any kind so could not possibly produce that output.

      The variable 'array' in main() is never assigned, so it would not contain anything printable in any case.

      Without also seeing the code for 'tokenizar()' there is no possible way of helping you.

      Regard gets() as bad for the health of your computer. It has no way of knowing how large the buffer is, so any input longer than the buffer overruns it. Code containing this function can be targeted by malware, but even apart from that perhaps unlikely scenario, it is very easy to make your code crash horribly.
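      As a minimal sketch of bounded input, assuming you want the line without its trailing newline: fgets() never writes more than the size you give it, but it keeps the newline, so it has to be stripped by hand. The helper name read_line is hypothetical and is not part of your code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): read one line into buf,
   storing at most size - 1 characters, and strip the trailing newline.
   Returns 1 on success, 0 on end of file or read error. */
int read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && size > 0 && in != NULL);

    if (fgets(buf, (int)size, in) == NULL)
        return 0;

    /* strcspn() gives the index of the first '\n', or of the terminating
       '\0' if the line was truncated, so this write is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

      It would be called as read_line( str, sizeof(str), stdin ). Input longer than the buffer is simply cut off, and the remainder stays in the stream for the next read.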

      Consider the following:

      #include <stdio.h>
      #include <string.h>
      #include <stdlib.h>

      int tokenizar( char* str, char** tokens );

      int main()
      {
          char str[50];
          char* array[50];
          int word_count ;
          int word ;

          printf("Token 0.1\nDeveloped by CPNS for Alexsandro Meireles\n");
          printf("Type the sentence: ");

          // Safe alternative to gets(): never writes more than sizeof(str) bytes
          if( fgets( str, sizeof(str), stdin ) == NULL )
          {
              return 1 ;
          }

          // The function fills the caller's array and returns the word count
          word_count = tokenizar( str, array );

          printf("\n");
          for( word = 0; word < word_count; word++ )
          {
              printf("Word #%d is %s\n", word, array[word]);
          }

          system("pause");
          return 0 ;
      }

      // Splits str in place and stores a pointer to each word in tokens.
      // Returns the number of words found.
      int tokenizar( char* str, char** tokens )
      {
          // Whitespace plus punctuation, used for every strtok() call.
          // '£' and '¬' assume a single-byte character set.
          static const char delimiters[] = " \t\n!\"£$%^&*()-\\|<>,.?/+=_`¬{}[]@'~#" ;
          int index = 0 ;

          tokens[index] = strtok( str, delimiters ) ;
          while( tokens[index] != 0 )
          {
              index++ ;
              tokens[index] = strtok( 0, delimiters ) ;
          }

          return index ;
      }

      The list of delimiters is not exhaustive, and I would suggest that strtok() may not be the most appropriate solution in your case.
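      One possible alternative, as a minimal sketch rather than a definitive implementation: strspn() and strcspn() can locate each word without writing into the input string, which strtok() cannot do, and without strtok()'s hidden static state. The function name next_token and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): find the next token at
   or after *cursor without modifying the string. Returns a pointer to the
   first character of the token and stores its length in *len, or returns
   NULL when no tokens remain. *cursor is advanced past the token. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL && delims != NULL && len != NULL);

    /* Skip any leading delimiters. */
    const char *start = *cursor + strspn(*cursor, delims);
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }

    *len = strcspn(start, delims);  /* token runs up to the next delimiter */
    *cursor = start + *len;         /* resume here on the next call */
    return start;
}
```

      Because the tokens are not null-terminated, they would be printed with a length, for example printf("Word is '%.*s'\n", (int)len, token). For "a casa pegou fogo." with the delimiters " \t\n." this yields a, casa, pegou and fogo.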

      Clifford

       
    • Alexsandro

      Alexsandro - 2009-03-05

      Thank you very much, Clifford!

      I appreciate your general comments about security. With your advice I finally managed to solve my problems.

      Best,

      Alexsandro.

       
