From: <m0...@ya...> - 2025-06-11 19:09:25
Dear Bob and team,
I suggest three possible improvements to the 1-letter protein sequence output of print {alpha}.sequence.all:
1. When there are alternate locations, the residue should be listed only once. This is already true for "select protein; show sequence".
2. Non-standard residues should be X not ?, because ? is not accepted in a BLAST job at UniProt ("error: the sequence is invalid"), but X is accepted.
3. Less important but maybe a good idea? MSE (seleno-methionine) could be given as M not X, since MSE is used to replace MET in X-ray experiments. The MSE is not a natural part of the protein sequence.
Example: 3HTL. MSE (seleno-methionine) is shown as "?". Three residues have two alternate locations each, and each of them is duplicated in the listing. The first of these is an MSE, so it appears as "??".
ERTSIAVHAL??GLPTSIGLPKVDG?SFTLYRVNEIDLTTQAGWDAASKIKLEELYTNGHPTDKVTKVATKKTEGGVAKFDNLTPALYLVVQELNGAEAVVRSQPFLVAAPQTNPTGDGWLQDVHVYPKHQALSEPVKTAVDPDATQPGFSVGENVKYRVATKIPEIASNTKFEGFTVADKLPAELGKPDTNKITVTLGGKPINSTDVSVQTYQVGDRTVLSVQLAGATLQSLDQHKDQELVVEFEAPVTKQPENGQLDNQAWVLPSNPTAQWDPEESGDAALRG?PSSRVSSKFGQITIEKSFDGNTPGADRTATFQLHRCCEADGSLVKSDPPISLDGKQEFVTGQDGKAVLSGIHLGTLQLESNV?KYTDAWAGKGTEFCLVETATASGYELLPKPVIVKLEANESTNNVLVEQKVKIDNKK
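To make the three suggestions concrete, here is a rough Python sketch of the behaviour I have in mind. It is not Jmol code and says nothing about how Jmol builds the sequence internally. The function name, the record layout (chain, residue number, insertion code, altloc, residue name) and the demo records are all made up for illustration.

```python
# Hypothetical sketch of the proposed output rules; not Jmol's implementation.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def one_letter_sequence(residues, mse_as_met=True):
    """Build a 1-letter sequence from residue records.

    residues: iterable of (chain, resno, insertion_code, altloc, resname)
    tuples in file order (an assumed layout, chosen for this example).
    """
    seen = set()
    letters = []
    for chain, resno, icode, altloc, resname in residues:
        # Suggestion 1: a residue with several alternate locations is
        # listed once. The first altloc encountered decides the letter.
        key = (chain, resno, icode)
        if key in seen:
            continue
        seen.add(key)
        name = resname.upper()
        if mse_as_met and name == "MSE":
            # Suggestion 3: MSE stands in for MET, so report M.
            letters.append("M")
        else:
            # Suggestion 2: anything non-standard becomes X, never "?".
            letters.append(THREE_TO_ONE.get(name, "X"))
    return "".join(letters)


# Made-up records (not taken from 3HTL): residue 11 is an MSE with two
# alternate locations, residue 13 is a non-standard residue.
demo = [
    ("A", 10, "", "", "LEU"),
    ("A", 11, "", "A", "MSE"),
    ("A", 11, "", "B", "MSE"),
    ("A", 12, "", "", "GLY"),
    ("A", 13, "", "", "SEP"),
]
print(one_letter_sequence(demo))                    # LMGX
print(one_letter_sequence(demo, mse_as_met=False))  # LXGX
```

The mse_as_met switch is only there because suggestion 3 is the optional one. With it turned off, MSE falls through to X like any other non-standard residue.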
-Eric