I want to build a training corpus for NameFind. As I understand it, each line must contain only one <START> <END> tag pair to learn from. Example:
I want to go to <START> New York <END>.
New York is a place. I am building my training corpus from articles, so can I use this example:
Simon wants to go to <START> New York <END>.
This line contains both a person, "Simon", and a place, "New York", but I tagged only the place entity because I want the model to learn places. Can I use that kind of example? Thank you...
Best regards,
Dee
Hi,
On one line there are only examples of one specific type: person, location, etc. If there are multiple people on a single line, that is fine. So using your example, you would use data like:
Simon wants to go to <START> New York <END> .
to train the location model, and data like:
<START> Simon <END> wants to go to New York .
to train the person model.
Hope this helps...Tom
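The one-type-per-model split in the reply above can be sketched in a few lines of Python. This is only an illustration and not part of OpenNLP: the `mark_entities` helper and the `(start, end, type)` span format are assumptions made up for this example.

```python
def mark_entities(tokens, spans, entity_type):
    """Return one training line with <START>/<END> around spans of entity_type.

    tokens: list of words; spans: list of (start, end, type), end exclusive.
    Both the helper and the span format are hypothetical, for illustration only.
    """
    starts = {s for s, e, t in spans if t == entity_type}
    ends = {e for s, e, t in spans if t == entity_type}
    out = []
    for i, tok in enumerate(tokens):
        if i in starts:
            out.append("<START>")
        out.append(tok)
        if i + 1 in ends:
            out.append("<END>")
    # Joining with single spaces keeps every tag separated from the words.
    return " ".join(out)


tokens = ["Simon", "wants", "to", "go", "to", "New", "York", "."]
spans = [(0, 1, "person"), (5, 7, "location")]

location_line = mark_entities(tokens, spans, "location")  # for the location model
person_line = mark_entities(tokens, spans, "person")      # for the person model
print(location_line)
print(person_line)
```

The same sentence yields one line for each training file, and each line tags only the entities of that file's type.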
Another example:
1. Simon, Susan, and Susie want to go to New York.
---- Since these are three different people, not one person with a first-middle-last name, I will have 3 training lines. Is that right?
The training file is:
<START> Simon <END>, Susan, and Susie want to go to New York.
Simon, <START> Susan <END>, and Susie want to go to New York.
Simon, Susan, and <START> Susie <END> want to go to New York.
2. I put all of my money in Hongkong Shanghai Bank Corporation (HSBC).
---- To train the organization entity, which tagging should I use? (training file 1 or training file 2)
training file 1 :
line 1 : I put all of my money in <START> Hongkong Shanghai Bank Corporation (HSBC) <END>.
training file 2 :
line 1 : I put all of my money in <START> Hongkong Shanghai Bank Corporation <END> (HSBC).
line 2 : I put all of my money in Hongkong Shanghai Bank Corporation (<START> HSBC <END>).
Thank you and regards. Dee.
Hi,
It should be one line, since they are all the same type (person):
<START> Simon <END> , <START> Susan <END> , and <START> Susie <END> want to go to New York .
I don't know what the annotation guidelines are for the bank case. I would go with training file 1, or:
I put all of my money in <START> Hongkong Shanghai Bank Corporation <END> ( <START> HSBC <END> ) .
Be sure to put spaces between tags and words. Hope this helps...Tom
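A quick way to catch tags that are glued to words or punctuation is to split each line on whitespace and check the tags. The following is a minimal sketch of that idea. The `extract_names` function is a hypothetical checker written for this thread, not OpenNLP's own parser, and the rules it enforces are assumptions based on the advice above.

```python
def extract_names(line):
    """Split a training line on whitespace and collect the tagged names.

    Raises ValueError if a tag is glued to a word or the tags do not pair up.
    Hypothetical checker for illustration, not OpenNLP's own parser.
    """
    names, current = [], None
    for tok in line.split():
        if tok == "<START>":
            if current is not None:
                raise ValueError("nested <START>")
            current = []
        elif tok == "<END>":
            if not current:
                raise ValueError("<END> without a name before it")
            names.append(" ".join(current))
            current = None
        elif "<START>" in tok or "<END>" in tok:
            raise ValueError(f"tag not separated by spaces: {tok!r}")
        elif current is not None:
            current.append(tok)
    if current is not None:
        raise ValueError("missing <END>")
    return names


good = ("I put all of my money in <START> Hongkong Shanghai Bank Corporation <END> "
        "( <START> HSBC <END> ) .")
print(extract_names(good))

bad = "I put all of my money in Hongkong Shanghai Bank Corporation (<START> HSBC <END>)."
try:
    extract_names(bad)
except ValueError as err:
    print("rejected:", err)
```

The second line is rejected because `(<START>` and `<END>).` are not standalone tokens, which is the spacing problem mentioned above.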