From: joerg.wegner <joe...@we...> - 2007-01-30 21:39:06
Dear all virtual chemistry fans,

As you might know, Wikipedia provides a lot of chemical information and special projects, such as:

Chemistry portal
http://en.wikipedia.org/wiki/Portal:Chemistry

WikiProject Drugs
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Drugs

Since we chemists are used to working with molecules and searching for substructures and similar molecules, some people would like to see SMILES support and substructure search services in public databases for Wikipedia articles. As usual, this caused some discussion, and we realized that those approaches must be tackled at a higher technical level, in MediaWiki itself.

If you would like to see substructure searches and proper SMILES in Wikipedia, it would push the priority if you 'Vote for this bug' (actually a feature request) at
http://bugzilla.wikimedia.org/show_bug.cgi?id=7514
Unfortunately you have to create an account, so it's up to you whether you really want to do that.

History and details:
http://en.wikipedia.org/wiki/Template_talk:Drugbox#Substructure_search_in_eMolecules_and_PubChem_added

Very kind regards,
Joerg Kurt Wegner
me...@ch...
From: Craig A. J. <cj...@em...> - 2007-02-14 00:43:44
Joerg,

joerg.wegner wrote:
> If you would like to see substructure searches and proper SMILES in
> Wikipedia it would push the priority, if you 'Vote for this bug' (actually
> feature request) at
> http://bugzilla.wikimedia.org/show_bug.cgi?id=7514
> Unfortunately you have to create an account, so it's up to you if you really
> want to do that.

I agree wholeheartedly that substructure searching of Wikipedia is an excellent idea. But in my opinion, Wikipedia itself is not the place to do it.

The web is already divided into content providers (most web sites) and search engines. Wikipedia is primarily an information repository, not a search engine. Even for text searches, many users choose Google, Yahoo, MSN, Jeeves or another good text search engine to find Wikipedia articles. Why not do the same for chemistry?

I suggest an alternative approach: provide a standardized format for submission of chemical structures to Wikipedia, and let the chemistry search engines and projects provide the search service. If every structure in Wikipedia had a standard chemical identifier (InChI, SMILES, SDF, etc.), identified by a chemical MIME type, then projects like PubChem, MDL's DiscoveryGate, Chemical Abstracts Service, and my company, eMolecules.com, would be able to find the Wikipedia entries and provide excellent substructure searching of Wikipedia.

Building a cheminformatics system is a very big job, and requires ongoing, active participation by a cheminformatics expert. (Actually, this should be a chemical registry system, which is even more complex.) Wikipedia could become the largest single public source of chemical data in the world. Maintaining such a cheminformatics system is no small job. The effort could be better spent standardizing the MIME types and rewriting the existing Wikipedia pages, so that web robots can find the chemical identifiers and add them to their search engines.
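
To make the robot side of this proposal concrete, here is a minimal sketch of how a crawler might pull tagged identifiers out of a page. It rests on assumptions that are not part of the proposal itself: the `data-chem-type` attribute is a hypothetical markup convention, the function and class names are invented for illustration, and the two `chemical/x-*` MIME type strings are the unofficial ones in common use, not a registered standard. It also assumes each tagged element contains only plain text.

```python
from html.parser import HTMLParser

# Assumed, unofficial chemical MIME types mapped to a short format name.
# Whatever the community standardizes would replace this table.
CHEMICAL_TYPES = {
    "chemical/x-daylight-smiles": "SMILES",
    "chemical/x-inchi": "InChI",
}


class ChemicalIdentifierParser(HTMLParser):
    """Collect (format, identifier) pairs from elements that carry the
    hypothetical data-chem-type attribute."""

    def __init__(self):
        super().__init__()
        self.identifiers = []
        self._format = None  # format name while inside a tagged element
        self._tag = None     # tag name that opened the tagged element
        self._text = []      # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        mime = dict(attrs).get("data-chem-type")
        if mime in CHEMICAL_TYPES:
            self._format = CHEMICAL_TYPES[mime]
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._format is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._format is not None and tag == self._tag:
            identifier = "".join(self._text).strip()
            self.identifiers.append((self._format, identifier))
            self._format = None


def extract_identifiers(html):
    """Return every tagged chemical identifier found in an HTML string."""
    parser = ChemicalIdentifierParser()
    parser.feed(html)
    parser.close()
    return parser.identifiers


# Example page fragment for ethanol, using the hypothetical markup.
page = (
    '<p>Ethanol: '
    '<span data-chem-type="chemical/x-daylight-smiles">CCO</span> '
    '<span data-chem-type="chemical/x-inchi">'
    'InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3</span> '
    '<span>not a chemical identifier</span></p>'
)
print(extract_identifiers(page))
```

Nothing in the sketch is specific to one search engine: any indexer that knows the agreed type names could run the same extraction and feed the results into its own substructure search.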
Indeed, even if you decide to build substructure search into Wikipedia itself, it will require this very same thing: well-documented MIME types and formats that Wikipedia supports. So this is something the Wikipedia community needs to address either way. And once you do this, the web robots can automatically start indexing Wikipedia for you.

Furthermore, standardized MIME types are neutral -- they don't favor any one search engine, company, or technology over another. Any company that thinks it can provide a decent search service is welcome to try, and may the best one win.

The web is about collaboration. Let Wikipedia remain what it is -- an information repository for the world's largest collaborative encyclopedia. But in addition, *standardize* the chemical identifiers so that chemistry search engines can find them.

Craig James
CTO, eMolecules.com

P.S. I don't subscribe to all of the lists to which you sent your original message. If you think this email is worth further discussion, could you please forward it to the other lists? I only subscribe to OpenBabel. Thanks.

--
+==================================================
| Craig A. James
| Chief Technology Officer, eMolecules, Inc.
| PO Box 2790, Del Mar, CA 92014-5790, USA
| cell: 760-212-9201 fax: 858-605-9605
| cj...@em... http://www.emolecules.com
+==================================================