Because the bsml2chado loader appends Organism/Strain to the Organism genus+species, the strain text is duplicated if it appears in both the species and the strain element.
The proposed fix is to remove the strain from the end of the organism name before creating the genus and species strings.
As of revision 6012, the script will remove strain information from the organism name unless that would remove the entire organism name or the entire species.
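The stripping rule described above can be sketched roughly as follows. This is only an illustration in Python, not the script's actual code: the function names strip_strain and split_genus_species are hypothetical, and it assumes the genus is the first whitespace-separated word and the species is everything after it.

```python
def strip_strain(organism, strain):
    """Remove a trailing strain from an organism name, if that is safe.

    Hypothetical helper: the original is returned unchanged whenever
    stripping would leave no organism name or no species part.
    """
    organism = organism.strip()
    strain = strain.strip()
    if not strain or not organism.endswith(strain):
        return organism
    remainder = organism[:-len(strain)]
    # Only strip when the strain is a separate trailing token,
    # not the tail of a longer word (and not the whole name).
    if not remainder or not remainder[-1].isspace():
        return organism
    remainder = remainder.rstrip()
    # Assumption: genus is the first word, species is the rest.
    # Fewer than two words would mean the species was removed entirely.
    if len(remainder.split()) < 2:
        return organism
    return remainder


def split_genus_species(organism):
    """Split an organism name into (genus, species) at the first space."""
    genus, _, species = organism.partition(" ")
    return genus, species
```

For example, strip_strain("Escherichia coli K12", "K12") gives "Escherichia coli", while strip_strain("Escherichia K12", "K12") leaves the name untouched because nothing would remain for the species.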
There is a special regex to handle a peculiarity in how Influenza H#N# strain information is stored: the organism name carries the strain and the serotype together in a trailing parenthetical. From GenBank record CY003441:
SOURCE Influenza A virus (A/New York/425/1999(H3N2))
ORGANISM Influenza A virus (A/New York/425/1999(H3N2))
/organism="Influenza A virus (A/New York/425/1999(H3N2))"
/mol_type="genomic RNA"
/strain="A/New York/425/1999"
/serotype="H3N2"
The script explicitly looks for the H\dN\d pattern in the organism name and handles it accordingly.
On further consideration, it might be better to parse the /serotype qualifier and check for its presence instead.
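One way the /serotype-based approach could look is sketched below. This is an assumption-laden illustration, not the script's implementation: strip_influenza_strain is a hypothetical name, the suffix layout "(strain(serotype))" is inferred from the CY003441 excerpt only, and the fallback pattern uses \d+ rather than \d so that multi-digit subtypes would also match.

```python
import re


def strip_influenza_strain(organism, strain, serotype=None):
    """Remove a trailing "(strain(serotype))" block from an organism name.

    Hypothetical helper. If no /serotype value is supplied, fall back to
    looking for an H#N# pattern at the end of the organism name.
    """
    if not serotype:
        match = re.search(r"\((H\d+N\d+)\)\)$", organism)
        if not match:
            return organism
        serotype = match.group(1)
    suffix = "({0}({1}))".format(strain, serotype)
    if organism.endswith(suffix):
        remainder = organism[:-len(suffix)].rstrip()
        # Never strip the entire organism name.
        if remainder:
            return remainder
    return organism
```

With the values from the record above, strip_influenza_strain("Influenza A virus (A/New York/425/1999(H3N2))", "A/New York/425/1999", "H3N2") gives "Influenza A virus".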