From: Emig, R. <Rob...@pi...> - 2006-03-28 17:53:31
I like your idea of checking one parser, then the next, and so on. That may
just solve all the problems. However, in my opinion, if we want to make
something that works for as many people as possible, we cannot rely on
extensions alone. Maybe I'm using a sledgehammer for a nail, but this is one
area where I find most people get VERY frustrated. File extensions work as
long as the user knows what's going on. VectorNTI has a bad habit of
exporting sequences to a .gb file that is actually a fasta file. In addition,
some websites allow the download of fasta files but add a .txt extension by
default, resulting in .fasta.txt. This causes a problem when users try to
open a file they see as .fasta, because Windows hides the .txt extension by
default.

One thing to keep in mind is that sometimes the same file can be opened in
different ways. Examples:

  PDB file - as a sequence file
  .fasta   - is this a group of sequences, or an alignment?
  .msf     - normally an alignment, but what if I want just a group of
             sequences?
  .seq     - maybe I want to edit the file as a text file (e.g. replace X's
             with N's, inspect it to find out why parsing doesn't work, etc.)

2c,
-Robin

-----Original Message-----
From: bio...@li...
[mailto:bio...@li...] On Behalf Of Ola Spjuth
Sent: Tuesday, March 28, 2006 9:25 AM
To: Mark Southern
Cc: Bioclipse-devel ML
Subject: RE: [Bioclipse-devel] BiojavaResource

Hi,

I have been thinking a lot about this and decided:

For Bioclipse version 1.0:
  Use file extensions and levels to decide the original type
  On parse, decide the final type, which could be other than the original type

For future versions we could:
  Implement a scanner that reads e.g. the first 10 rows and decides FAST the
  actual file format/BioResourceType

I don't really see how your proposal differs from my implementation, given
that the parser can handle multiple types. If the parser cannot parse a file,
it returns null and the next parser (with a lower level) can give it a try.

> Only some types of file perhaps? A .pdb file is pretty much always a pdb
> file but a .seq file can be swissprot, fasta or several others.

This is exactly the way it works now.

I think we already have the functionality implemented in Bioclipse. Let me
know if I have misunderstood things.

Cheers,

.../Ola

On Tue, 2006-03-28 at 11:59 -0500, Mark Southern wrote:
>
> -----Original Message-----
> From: Mark Southern
> Sent: Tuesday, March 28, 2006 11:59 AM
> To: 'Emig, Robin'
> Subject: RE: [Bioclipse-devel] BiojavaResource
>
> I like this idea too. Meta data about the user's files would need to be
> stored though, else the next time they start bioclipse they'd be asked
> the same questions.
>
> Only some types of file perhaps? A .pdb file is pretty much always a pdb
> file but a .seq file can be swissprot, fasta or several others.
>
>
> Mark.
>
> -----Original Message-----
> From: bio...@li...
> [mailto:bio...@li...] On Behalf Of Emig,
> Robin
> Sent: Monday, March 27, 2006 6:46 PM
> To: eg...@us...; bio...@li...
> Subject: RE: [Bioclipse-devel] BiojavaResource
>
> I'm a big fan of the way MOE and Mesquite load files
>
> Allow user to open ANY file
> Then go to another dialog to actually parse it
>
> Text version of parse dialog....
> We can use the biojava stuff to guess the file, and then read in....
> ------------------------------------------------------------
> - File Data:           | Open file as...            | Example format
> -                      | Sequence: Fasta            | >sequencename annotations
> - >sequence blah blah  | Sequence: MSF              | sequence
> - ATGAMGIALMD          | Alignment: MSF             | sequence
> - KLKLKLKPRLKTLK       | Sequence: Genbank          |
> -                      | Recommend: Sequence:Fasta  |
> ------------------------------------------------------------
>
> -Robin
>
> -----Original Message-----
> From: bio...@li...
> [mailto:bio...@li...] On Behalf Of Egon
> Willighagen
> Sent: Sunday, March 26, 2006 9:33 PM
> To: bio...@li...
> Subject: Re: [Bioclipse-devel] BiojavaResource
>
> On Sunday 26 March 2006 21:36, Mark Southern wrote:
> > I propose that we use org.biojava.bio.seq.io.SeqIOTools to guess the
> > type of sequence file and autoload it. Something like:
>
> CDK uses such an algorithm for chem files too.
>
> > The guessFileType() method IS deprecated in biojava but in many years of
> > using it, it has not failed me.
>
> You could set up a number of test cases.
>
> > In a gui application such as bioclipse
> > it makes sense to try to best deal with what ever the user throws at
> > you!
>
> Egon
>
> --
> eg...@us...
> Blog: http://chem-bla-ics.blogspot.com/
> GPG: 1024D/D6336BA6
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by xPML, a groundbreaking scripting language
> that extends applications into web and mobile media. Attend the live webcast
> and join the prime developer group breaking into this new coding territory!
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
> _______________________________________________
> Bioclipse-devel mailing list
> Bio...@li...
> https://lists.sourceforge.net/lists/listinfo/bioclipse-devel
>
> This communication is for use by the intended recipient and contains
> information that may be Privileged, confidential or copyrighted under
> applicable law. If you are not the intended recipient, you are hereby
> formally notified that any use, copying or distribution of this e-mail,
> in whole or in part, is strictly prohibited. Please notify the sender by
> return e-mail and delete this e-mail from your system. Unless explicitly
> and conspicuously designated as "E-Contract Intended", this e-mail does
> not constitute a contract offer, a contract amendment, or an acceptance
> of a contract offer. This e-mail does not constitute a consent to the
> use of sender's contact information for direct marketing purposes or for
> transfers of data to third parties.
>
> Francais Deutsch Italiano Espanol Portugues Japanese Chinese Korean
>
> http://www.DuPont.com/corp/email_disclaimer.html
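
For readers following the thread, the fallback chain discussed here (try the parsers registered for the file's extension first, highest level first; a parser that cannot handle the content returns null and the next one gets a try) might look roughly like the sketch below. It is written in Python for brevity and is not Bioclipse or BioJava code: the names `PARSERS`, `open_resource`, `parse_fasta` and `parse_genbank`, the level numbers, and the crude first-line checks are all assumptions made up for illustration.

```python
import os
from typing import Callable, List, Optional, Tuple

# Hypothetical result type: (format name, record names found in the file).
ParseResult = Tuple[str, List[str]]
Parser = Callable[[str], Optional[ParseResult]]


def _non_blank_lines(text: str) -> List[str]:
    return [ln for ln in text.splitlines() if ln.strip()]


def parse_fasta(text: str) -> Optional[ParseResult]:
    """Crude check: a FASTA file starts with a '>' header line."""
    lines = _non_blank_lines(text)
    if not lines or not lines[0].startswith(">"):
        return None  # cannot parse: let the next parser try
    return ("fasta", [ln[1:].strip() for ln in lines if ln.startswith(">")])


def parse_genbank(text: str) -> Optional[ParseResult]:
    """Crude check: a GenBank flat file starts with a 'LOCUS' line."""
    lines = _non_blank_lines(text)
    if not lines or not lines[0].startswith("LOCUS"):
        return None  # cannot parse: let the next parser try
    names = [ln.split()[1] for ln in lines
             if ln.startswith("LOCUS") and len(ln.split()) > 1]
    return ("genbank", names)


# (level, extension, parser). The levels are invented for this sketch.
PARSERS: List[Tuple[int, str, Parser]] = [
    (20, ".gb", parse_genbank),
    (10, ".fasta", parse_fasta),
]


def open_resource(filename: str, text: str) -> Optional[ParseResult]:
    """Use the extension only as a hint, then fall back to the other parsers."""
    ext = os.path.splitext(filename)[1].lower()
    # Parsers matching the extension come first; within each group,
    # higher levels are tried before lower ones.
    ordered = sorted(PARSERS, key=lambda p: (p[1] != ext, -p[0]))
    for _level, _ext, parser in ordered:
        result = parser(text)
        if result is not None:
            return result
    return None  # nothing matched; a GUI could now ask the user
```

With this arrangement a FASTA file saved as `export.gb` or `x.fasta.txt` still ends up in the FASTA parser, which is the mislabelled-extension case raised above. A file that no parser accepts yields `None`, the point at which an "Open file as..." dialog could take over.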