From: Christoph S. <chr...@un...> - 2024-01-24 10:26:17
Dear all,

we are organising a week-long retreat on deep learning in molecular informatics as part of a workshop series called International Workshop on Open Molecular Informatics (IWOMI), from 13 to 17 May 2024.

The meeting place is located in the mountains next to Bolzano in Italy and is a wonderful place to retreat and enjoy hands-on tutorials, hackathons, and scientific and non-scientific conversations.

This 2024 meeting will feature scientific talks and a hands-on session to learn how to train deep learning models using the Google Cloud and how to re-train GPT models for your specific scientific application. We will also have time for hackathons and a hike in the mountains.

If you are interested, please register at https://www.iwomi.net/ and let me know if you would like to contribute a talk, tutorial, or a hackathon session. The registration deadline is the end of February.

Kind regards,

Chris

--
Prof. Dr. Christoph Steinbeck
Vice President for Digitalisation of the Friedrich-Schiller-University Jena

Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Team Assistant: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/0000-0001-6966-0814

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..

From: News Z. <ne...@zb...> - 2023-12-06 11:47:39
Dear Friends of Science,

The international EOSC-Symposium <https://eosc.eu/events/eosc-symposium-2024/> will take place in Berlin in 2024. 450 experts from research and research infrastructures as well as representatives of European science policy are expected in Berlin to present concrete results, discuss new ideas and promote the future of open science in Europe.

This outstanding event as well as the high thematic synergies have inspired the *Open Science Conference* <https://www.open-science-conference.eu/> to join forces with the EOSC Symposium to present the topic of Open Science in Berlin in 2024.

We can't wait to welcome you to this exciting gathering of European Open Science stakeholders *from October 21-23, 2024 in Berlin*.

Under the patronage of the Federal Minister of Education and Research, Bettina Stark-Watzinger, we would like to cordially invite you to this important symposium.

Please save this date. Further information on the program and registration will be provided at a later date. See: https://www.open-science-conference.eu/

Save the date: October 21-23, 2024, Berlin
#OSC2024 #EOSCsymposium24

Kind regards
Romy Rimpler

ROMY RIMPLER
Event Management
ZBW – Leibniz Information Centre for Economics
Duesternbrooker Weg 120
24105 Kiel
Germany
T: +49 431 8814-635
E: r.r...@zb... <d.s...@zb...>

*Due to the change of our e-mail system you can currently reach me via this e-mail address.*

From: Christoph S. <chr...@un...> - 2023-04-12 12:30:42
Dear all,

AstraZeneca in Cambridge, UK, is looking for a hit discovery data scientist to help maximise the value of their internal datasets and build predictive models to better inform present/future campaigns and compounds collections for screening. This may be of interest for small molecule data science and ML/AI colleagues reaching the end of contract or looking for new career possibilities.

Applications are open until the 19th April.

AZ careers website: https://careers.astrazeneca.com/job/cambridge/senior-data-scientist-hit-discovery/7684/46662940032
Linkedin: https://www.linkedin.com/posts/clairemayo_datascience-hitdiscovery-smallmolecules-activity-7046799144765534209-ITA0

Kind regards,

Chris

--
Prof. Dr. Christoph Steinbeck
Vice President for Digitalisation of the Friedrich-Schiller-University Jena

Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Team Assistant: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/0000-0001-6966-0814

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..

From: Suliman S. <sha...@gm...> - 2023-04-01 22:31:36
I submitted one too. Gonna be fun. See y'all there.

-Sul

On Sat, Apr 1, 2023, 2:07 PM Geoffrey Hutchison <geo...@gm...> wrote:
> For what it’s worth, I’ve submitted a talk.
>
> Hope to see a bunch of people in SF.
>
> -Geoff

From: Geoffrey H. <geo...@gm...> - 2023-04-01 18:06:33
For what it’s worth, I’ve submitted a talk.

Hope to see a bunch of people in SF.

-Geoff

From: Egon W. <ego...@gm...> - 2023-04-01 08:55:11
Hi all,

this seems interesting:

---------- Forwarded message ---------
From: Susi Lehtola susi.lehtola*_*alumni.helsinki.fi <own...@cc...>
Date: Wed, 22 Mar 2023 at 17:09
Subject: CCL: Call for Papers: Free and Open Source Software symposium at ACS Fall 2023 Meeting

Sent to CCL by: Susi Lehtola [susi.lehtola]

Hi,

I would like to inform the list that we are organizing a symposium "Free and Open Source Software: Harnessing the Power of Data" in the COMP division at the 2023 Fall ACS meeting held in San Francisco, California, USA on August 13-17, 2023. The symposium is jointly organized by the Molecular Sciences Software Institute (molssi.org) and the ACS Open Source Software Convergent Research Community.

Abstract submission is currently open, with the deadline of April 4, 2023. To find our symposium, please look under the Computers in Chemistry (COMP) division symposium list. We look forward to receiving abstracts related to both applications and development of free and open source software.

Thank you,
Ashley Ringer McDonald, Cal Poly San Luis Obispo
Susi Lehtola, University of Helsinki
T. Daniel Crawford, Virginia Tech

PS. Personally, I am especially looking forward to hear talks on what sets free and open source software aside from alternatives, such as in enabling new innovations in industry and academia, but all kinds of abstracts related to the topic of the symposium will be considered.

---------- </ Forwarded message > ---------

--
Predicting binding affinities can be predicted for each protein variant with a new QSAR model that takes into account the amino acid change: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-023-00701-3

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: https://egonw.github.io/
Blog: https://chem-bla-ics.blogspot.com/
Mastodon: https://scholar.social/@egonw
PubList: https://orcid.org/0000-0001-7542-0286

From: Peter Murray-R. <pm...@ca...> - 2023-01-19 12:44:10
That's really useful Sul, I wasn't aware of some of these tools.

> I am also interested in this work. I haven't tried ChemBERT. I should give
> some a shot and do a little comparison. Would be a good lecture too.

That would be very useful.

> I was using molminer a little while ago built on ORSA.

==OSRA? Last time I looked (several years ago) OSRA had to be compiled or you could pay for a binary. (That's partly because the compilation wasn't trivial).

> https://github.com/gorgitko/molminer

This looks a useful package (haven't used it)

> and I think I took a divergent path from automated tooling for now. I'm
> working on this mapping for Cannabis Sativa I haven't figured out how to
> map the relationship to the phenology perhaps by country of origin?
> functional group?
> I did it manually. I use it as a reference index here:
>
> https://github.com/Sulstice/global-chem/blob/development/global_chem/global_chem/medicinal_chemistry/cannabinoids/constituents_of_cannabis_sativa.py
>
> I was thinking I could use this list as a master name as indexes in
> searching other papers.

Most frequently occurring compounds are now in well maintained repos such as CHEBI, PubChem, Wikidata, etc. You shouldn't have to create SMILES for these as you can download them. (Also you have a few proteins - it's not normally useful to create SMILES for those.)

> Let me know any thoughts.

There are roughly two approaches:
* supervised - which requires lists of chemicals, annotated/labelled data, etc
* unsupervised where we look for patterns in the data including word embedding

P.

> Cheers,
> -Sul

--
"I always retain copyright in my papers, and nothing in any contract I sign with any publisher will override that fact. You should do the same".

Peter Murray-Rust
Reader Emeritus in Molecular Informatics
Yusuf Hamied Department of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-336432

From: Suliman S. <sha...@gm...> - 2023-01-19 12:21:53
Hey Peter,

I am also interested in this work. I haven't tried ChemBERT. I should give some a shot and do a little comparison. Would be a good lecture too.

I was using molminer a little while ago built on ORSA.

https://github.com/gorgitko/molminer

and I think I took a divergent path from automated tooling for now. I'm working on this mapping for Cannabis Sativa I haven't figured out how to map the relationship to the phenology perhaps by country of origin? functional group?

I did it manually. I use it as a reference index here:

https://github.com/Sulstice/global-chem/blob/development/global_chem/global_chem/medicinal_chemistry/cannabinoids/constituents_of_cannabis_sativa.py

I was thinking I could use this list as a master name as indexes in searching other papers.

Let me know any thoughts.

Cheers,
-Sul

--
*Suliman Sharif*
Ph.D. Candidate Pharmaceutical Sciences | University of Maryland, School of Pharmacy
M.Sc Medicinal Chemistry | University of California, Riverside School of Medicine
B.Sc. Biochemistry | University of Texas at Austin
sha...@gm...

From: Peter Murray-R. <pm...@ca...> - 2023-01-19 11:26:12
What are the current Open Source tools for recognising chemical entities in text? OSCAR still runs but is probably somewhat overtaken by more recent language models. I see that HuggingFace has "ChemBERT" - does anyone have experience?

More generally we want to extract triples of the form:
<chemical> <relationship> <plant>
We plan to do chemicals and plants and then look for relationships. But maybe people have already done this.

TIA

P.

--
"I always retain copyright in my papers, and nothing in any contract I sign with any publisher will override that fact. You should do the same".

Peter Murray-Rust
Reader Emeritus in Molecular Informatics
Yusuf Hamied Department of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-336432

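The "chemicals and plants first, then relationships" plan in the message above can be sketched as a dictionary-based co-occurrence pass. This is only a minimal illustration, not a tool anyone in the thread mentions: the two term lists, the function names `find_terms` and `extract_triples`, and the sample sentences are all assumptions made for the example. A real pipeline would presumably load its lexicons from a curated source and use a trained entity recogniser rather than exact string matching.

```python
import re

# Hypothetical lexicons, for illustration only; a real run would load
# curated lists of chemical and plant names instead.
CHEMICALS = {"cannabidiol", "limonene", "caffeine"}
PLANTS = {"cannabis sativa", "citrus limon", "coffea arabica"}


def find_terms(sentence, lexicon):
    """Return sorted (start, end, term) tuples for lexicon terms in the sentence."""
    hits = []
    lowered = sentence.lower()
    for term in lexicon:
        # Whole-word, case-insensitive match of the exact term.
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((match.start(), match.end(), term))
    return sorted(hits)


def extract_triples(text):
    """Return (chemical, relationship, plant) for terms sharing a sentence."""
    triples = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        chems = find_terms(sentence, CHEMICALS)
        plants = find_terms(sentence, PLANTS)
        for c_start, c_end, chem in chems:
            for p_start, p_end, plant in plants:
                # Crude relationship: the words between the two mentions,
                # whichever of them comes first in the sentence.
                if c_end <= p_start:
                    between = sentence[c_end:p_start]
                else:
                    between = sentence[p_end:c_start]
                relation = " ".join(between.split()).strip(" ,;:")
                triples.append((chem, relation, plant))
    return triples


if __name__ == "__main__":
    sample = (
        "Cannabidiol is found in Cannabis sativa. "
        "Limonene was isolated from Citrus limon peel."
    )
    for triple in extract_triples(sample):
        print(triple)
```

The relationship here is just the text between the two mentions, so it is noisy and ignores direction when the plant is named first. It is meant as a baseline to compare a language-model approach against, not as a replacement for one.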
From: Christoph S. <chr...@un...> - 2023-01-08 10:00:21
Hi Suliman,

> Is there some travel scholarship I can apply for to attend this workshop from the US?

not from any organisation affiliated with this meeting. But there are travel grants from learned societies such as the ACS, AFAIK. There are also the travel grants from the CSA Trust that you could give a try (https://csa-trust.org/awards-and-grants/grants/).

Kind regards,

Chris

--
Prof. Dr. Christoph Steinbeck
Vice President for Digitalisation of the Friedrich-Schiller-University Jena

Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Team Assistant: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/0000-0001-6966-0814

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..

From: Suliman S. <sha...@gm...> - 2023-01-07 13:28:04
Hey All,

Is there some travel scholarship I can apply for to attend this workshop from the US?

-Sul

--
*Suliman Sharif*
Ph.D. Candidate Pharmaceutical Sciences | University of Maryland, School of Pharmacy
M.Sc Medicinal Chemistry | University of California, Riverside School of Medicine
B.Sc. Biochemistry | University of Texas at Austin
sha...@gm...

From: Christoph S. <chr...@un...> - 2023-01-07 12:36:09
Dear all,

we are organising a meeting on cross-toolkit structure registration and normalization as part of a new workshop series called IWOMI.

You can find more information about this year's International Workshop on Open Molecular Information (IWOMI) at https://www.iwomi.net.
If you are interested in joining, please register on the workshop webpage.

Kind regards,

Chris

--
Prof. Dr. Christoph Steinbeck
Vice President for Digitalisation of the Friedrich-Schiller-University Jena

Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Team Assistant: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/0000-0001-6966-0814

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..

From: Christoph S. <chr...@un...> - 2022-02-04 08:58:29
Hi Tim,

sorry for confusing you. I took your email as an occasion to update the google group that we created for the very first attempt to get this meeting organised in 2020 when the whole crap started. Let’s move the discussion there.

Kind regards,

Chris

--
Prof. Dr. Christoph Steinbeck
Analytical Chemistry - Cheminformatics and Chemometrics
Friedrich-Schiller-University Jena, Germany
Phone Secretariat: +49-3641-948171
http://cheminf.uni-jena.de
http://orcid.org/0000-0001-6966-0814

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..

> On 3. Feb 2022, at 20:53, Tim Dudgeon <tdu...@gm...> wrote:
>
> Where is the right place for discussions on this event to happen? I'm pretty confused (but registered, and booked).
>
> On Wed, Jan 19, 2022 at 10:31 AM Christoph Steinbeck <chr...@un...> wrote:
> Dear all,
>
> as announced last November on this list, the CDK 20th anniversary workshop will be held on Schloss Korb near Bolzano, Italy, from 4. - 8. April 2022.
>
> We are reaching the end of the registration period with the hotel, which is tomorrow. They might be able to hold the rooms for a tiny little longer, but there is no guarantee. At the moment, there are 20 rooms left.
> The closer we get to the event, the more we compete with the regular guests for rooms.
> Please consider booking ASAP if you are interested to join. It is really nice there :)
>
> For more information including how to register please visit
>
> https://www.eventbrite.co.uk/e/cdk-20th-anniversary-symposium-tickets-215520175647
>
> Kind regards,
>
> Chris
>
> _______________________________________________
> Cdk-user mailing list
> Cdk...@li...
> https://lists.sourceforge.net/lists/listinfo/cdk-user

From: Andrew D. <da...@da...> - 2022-01-20 12:00:31
On Jan 20, 2022, at 10:05, Peter Murray-Rust via Blueobelisk-discuss <blu...@li...> wrote:
> I think it's wonderful that we have an un-organization that is still going strong.

What would it look like if the organization were not going strong, but existed mostly from inertia and a lack of a better alternative?

> I highlight the great work done voluntarily by many people, without central management, in tackling the log4j problem.

Do you highlight how people were not compensated for taking time off during the Christmas holidays? Egon writes "It totally messed up my schedule." I wonder how many people thanked him for his work.

Do you highlight the urgent need for organizations to set aside budget to support these projects, either through direct financial support, or by paying people to maintain the software and assure its continued fitness for purpose?

Since Egon took care of the log4j problem for CDK, and Egon is one of the core CDK developers, doesn't that make him part of central management of CDK?

Jo Freeman long ago pointed out in "The Tyranny of Structurelessness":

"once the movement no longer clings tenaciously to the ideology of structurelessness, it will be free to develop those forms of organisation best suited to its healthy functioning. This does not mean that we should go to the other extreme and blindly imitate the traditional forms of organisation. But neither should we blindly reject them all. Some traditional techniques will prove useful, albeit not perfect; some will give us insights into what we should not do to obtain certain ends with minimal costs to the individuals in the movement. Mostly, we will have to experiment with different kinds of structuring and develop a variety of techniques to use for different situations."

I submit that Blue Obelisk *has a structure*. It's exactly the informal structure Freeman described, that happens even when its members assert there is no structure.

Here's how I know there's an informal structure - who decides who gets a Blue Obelisk award? If I award one to myself, do I get to update the Wikipedia entry?

For that matter, who gets to decide that Blue Obelisk is an un-organization without central management?

I assert that Blue Obelisk as an organization is moribund. Its existence, combined with the informal structure which dictates it must have no formal centralized authority, prevents more effective organizations from forming.

Look at our cousin, the Open Bioinformatics Foundation, to see what a more effective organization looks like, with yearly meetings on FOSS in bioinformatics and acting as a contact point for Google Summer of Code and other projects.

To emphasize by repetition: "This does not mean that we should go to the other extreme and blindly imitate the traditional forms of organisation".

Regards,

Andrew
da...@da...

From: Peter Murray-R. <pet...@go...> - 2022-01-20 09:05:35
I think it's wonderful that we have an un-organization that is still going strong. I highlight the great work done voluntarily by many people, without central management, in tackling the log4j problem.

Some of my colleagues are interested in DAOs (https://en.wikipedia.org/wiki/Decentralized_autonomous_organization). Does this have any role for BO? Do we have algorithms for governance that make sense for BO?

I am continuing to keep a hand in chemistry, particularly materials. I was asked to speak at two meetings last year (OMDI2021) and am also invited to a CECAM meeting next month MADICES2022. (I've also been changing to Python as it has a better and more integrated set of libraries than Java - which is increasingly enterprise).

We need new approaches and I'm trying to work some of these out - particularly a mixture of un-natural language processing with defined objects rooted in Wikidata. Wikidata is a gamechanger and Egon is one of the leaders.

I'm also building software to extract data from diagrams. This includes plots (such as electrochemistry) and biochemical reaction pathways. There are much better general libraries for images and I think we can make good progress. I'd be interested in anyone who wants to extract data from spectra, reaction pathways, plots (scatter, bar, ...)

I applaud Christoph and his group for Decimer ML to identify chemical diagrams and would be happy to explore.

I am particularly keen on extracting knowledge from preprints (and have been talking with bio/medrxiv people). Being able to have an immediate preprint reader - e.g. minutes after the document was loaded - would be independent of publishers. Also the PDFs are much more tractable than the awful 2-column PDFs of the "version of record".

In short I think there are huge opportunities for modern tools to innovate in chemistry again and take semantic control.

P.

--
Peter Murray-Rust
Founder ContentMine.org
and
Reader Emeritus in Molecular Informatics
Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

From: Robert H. <ha...@st...> - 2022-01-20 03:05:22
|
yeah -- well, Happy New Year 2022! Seems I picked up and replied to a message from 2015!!! NO idea how that happened. On Mon, Jan 17, 2022 at 11:58 PM Egon Willighagen < ego...@gm...> wrote: > > > On Tue, 18 Jan 2022 at 06:53, Robert Hanson via Blueobelisk-discuss < > blu...@li...> wrote: > >> Great idea! >> > > Okay, no idea where this message is coming from :) > > It was indeed a great idea (imho). The first Blue Obelisk paper ( > https://pubs.acs.org/doi/10.1021/ci050400b) has had a CC-BY license for > almost 7 years now: > https://sourceforge.net/p/blueobelisk/mailman/message/33816652/ (thx to > funding by Chris). > > Egon > > -- > ---- > BiGCaT received a NWO Open Science grant to support our research into > interoperability of biological data and knowledge: > https://www.nature.com/articles/d41586-021-03418-1 and > https://www.nwo.nl/en/researchprogrammes/open-science/open-science-fund/open-science-fund-2021-awarded-grants > > ----- > E.L. Willighagen > Department of Bioinformatics - BiGCaT > Maastricht University (http://www.bigcat.unimaas.nl/) > Twitter/Mastodon: @egonwillighagen <https://twitter.com/egonwillighagen> > / @egonw <https://scholar.social/@egonw> > Homepage: http://egonw.github.io/ > Blog: http://chem-bla-ics.blogspot.com/ > PubList: https://www.zotero.org/egonw > ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286> > ImpactStory: https://impactstory.org/u/egonwillighagen > -- Robert M. Hanson Professor of Chemistry St. Olaf College Northfield, MN http://www.stolaf.edu/people/hansonr If nature does not answer first what we want, it is better to take what answer we get. -- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900 *We stand on the homelands of the Wahpekute Band of the Dakota Nation. We honor with gratitude the people who have stewarded the land throughout the generations and their ongoing contributions to this region. 
We acknowledge the ongoing injustices that we have committed against the Dakota Nation, and we wish to interrupt this legacy, beginning with acts of healing and honest storytelling about this place.* |
From: Robert H. <ha...@st...> - 2022-01-19 20:47:17
|
Does anyone know a way to help PubChem fix/update OECHEM toolkit to properly create cumulene 3D structures? (allene is planar). Is this just a version issue? Bob ---------- Forwarded message --------- From: NLM Support <nlm...@nl...> Date: Wed, Jan 19, 2022 at 12:23 PM Subject: Re: case #CAS-843334-G6D6X8: PubChem Question - https://pubchem.ncbi.nlm.nih.g... TRACKING:000412000010244 To: Robert Hanson <ha...@st...> Hi, A follow up: This is an issue with the OEChem toolkit adopted by PubChem workflow to process chemical structures and to generate 3D coordinates. It has problems with allenic structures. The team is exploring other toolkits, which will take time. Another point - currently there is no existing mechanism to do any manual update due to human resource constraint. Your patience and understanding will be greatly appreciated! Regards, Tao Tao, PhD NCBI User Services https://go.usa.gov/x647S Case Information: Case #: CAS-843334-G6D6X8 Customer Name: Robert Hanson Customer Email: ha...@st... Case Created: 1/17/2022, 1:31:27 PM Summary: PubChem Question - https://pubchem.ncbi.nlm.nih.gov/compound/12590904 Details: https://pubchem.ncbi.nlm.nih.gov/compound/12590904 Allene 3d is planar; should be twisted. |
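For readers who want to check a 3D record for the problem described above, a minimal sketch follows. It computes the H-C...C-H dihedral across the two terminal carbons of an allene: roughly 90 degrees means the expected twisted geometry, roughly 0 or 180 degrees means the planar one reported in the case. The coordinates are idealized values made up for illustration (assumptions, not taken from PubChem or produced by OEChem), and the helper names are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central axis b1.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Idealized allene, C=C=C along the z axis (hypothetical coordinates, in angstroms).
C1 = (0.0, 0.0, -1.31)
C3 = (0.0, 0.0, 1.31)
H_ON_C1 = (0.93, 0.0, -1.87)
H_ON_C3_TWISTED = (0.0, 0.93, 1.87)   # CH2 planes perpendicular (expected)
H_ON_C3_PLANAR = (0.93, 0.0, 1.87)    # CH2 planes coplanar (the reported problem)

print(dihedral(H_ON_C1, C1, C3, H_ON_C3_TWISTED))
print(dihedral(H_ON_C1, C1, C3, H_ON_C3_PLANAR))
```

The same function could be pointed at the atom coordinates of any downloaded 3D structure, provided the terminal carbons and one hydrogen (or substituent) on each are picked out by hand.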
From: Egon W. <ego...@gm...> - 2022-01-19 11:38:49
|
Thanks for the reminder! Egon On Wed, 19 Jan 2022 at 11:30, Christoph Steinbeck < chr...@un...> wrote: > Dear all, > > as announced last November on this list, the CDK 20th anniversary workshop > will be held on Schloss Korb near Bolzano, Italy, from 4. - 8. April 2022. > > We are reaching the end of the registration period with the hotel, which > is tomorrow. They might be able to hold the rooms for a tiny little longer, > but there is no guarantee. At the moment, there are 20 rooms left. > The closer we get to the event, the more we compete with the regular > guests for rooms. > Please consider booking ASAP if you are interested to join. It is really > nice there :) > > For more information including how to register please visit > > > https://www.eventbrite.co.uk/e/cdk-20th-anniversary-symposium-tickets-215520175647 > > Kind regards, > > Chris > > — > Prof. Dr. Christoph Steinbeck > Analytical Chemistry - Cheminformatics and Chemometrics > Friedrich-Schiller-University Jena, Germany > Phone Secretariat: +49-3641-948171 > http://cheminf.uni-jena.de > http://orcid.org/0000-0001-6966-0814 > > What is man but that lofty spirit - that sense of enterprise. > ... Kirk, "I, Mudd," stardate 4513.3.. > > > > _______________________________________________ > Cdk-devel mailing list > Cdk...@li... > https://lists.sourceforge.net/lists/listinfo/cdk-devel > -- ---- BiGCaT received a NWO Open Science grant to support our research into interoperability of biological data and knowledge: https://www.nature.com/articles/d41586-021-03418-1 and https://www.nwo.nl/en/researchprogrammes/open-science/open-science-fund/open-science-fund-2021-awarded-grants ----- E.L. 
Willighagen Department of Bioinformatics - BiGCaT Maastricht University (http://www.bigcat.unimaas.nl/) Twitter/Mastodon: @egonwillighagen <https://twitter.com/egonwillighagen> / @egonw <https://scholar.social/@egonw> Homepage: http://egonw.github.io/ Blog: http://chem-bla-ics.blogspot.com/ PubList: https://www.zotero.org/egonw ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286> ImpactStory: https://impactstory.org/u/egonwillighagen |
From: Christoph S. <chr...@un...> - 2022-01-19 10:30:45
|
Dear all, as announced last November on this list, the CDK 20th anniversary workshop will be held on Schloss Korb near Bolzano, Italy, from 4. - 8. April 2022. We are reaching the end of the registration period with the hotel, which is tomorrow. They might be able to hold the rooms for a tiny little longer, but there is no guarantee. At the moment, there are 20 rooms left. The closer we get to the event, the more we compete with the regular guests for rooms. Please consider booking ASAP if you are interested to join. It is really nice there :) For more information including how to register please visit https://www.eventbrite.co.uk/e/cdk-20th-anniversary-symposium-tickets-215520175647 Kind regards, Chris — Prof. Dr. Christoph Steinbeck Analytical Chemistry - Cheminformatics and Chemometrics Friedrich-Schiller-University Jena, Germany Phone Secretariat: +49-3641-948171 http://cheminf.uni-jena.de http://orcid.org/0000-0001-6966-0814 What is man but that lofty spirit - that sense of enterprise. ... Kirk, "I, Mudd," stardate 4513.3.. |
From: Egon W. <ego...@gm...> - 2022-01-18 05:59:06
|
On Tue, 18 Jan 2022 at 06:53, Robert Hanson via Blueobelisk-discuss < blu...@li...> wrote: > Great idea! > Okay, no idea where this message is coming from :) It was indeed a great idea (imho). The first Blue Obelisk paper ( https://pubs.acs.org/doi/10.1021/ci050400b) has had a CC-BY license for almost 7 years now: https://sourceforge.net/p/blueobelisk/mailman/message/33816652/ (thx to funding by Chris). Egon -- ---- BiGCaT received a NWO Open Science grant to support our research into interoperability of biological data and knowledge: https://www.nature.com/articles/d41586-021-03418-1 and https://www.nwo.nl/en/researchprogrammes/open-science/open-science-fund/open-science-fund-2021-awarded-grants ----- E.L. Willighagen Department of Bioinformatics - BiGCaT Maastricht University (http://www.bigcat.unimaas.nl/) Twitter/Mastodon: @egonwillighagen <https://twitter.com/egonwillighagen> / @egonw <https://scholar.social/@egonw> Homepage: http://egonw.github.io/ Blog: http://chem-bla-ics.blogspot.com/ PubList: https://www.zotero.org/egonw ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286> ImpactStory: https://impactstory.org/u/egonwillighagen |
From: Robert H. <ha...@st...> - 2022-01-18 00:20:11
|
Great idea! |
From: Egon W. <ego...@gm...> - 2022-01-05 09:47:06
|
On Sat, Jan 1, 2022 at 1:10 PM Egon Willighagen <ego...@gm...> wrote: > I intend to push this Euclid 2 and CMLXOM 4, so increasing the major > version. > After Peter's reply, I pushed these to Maven Central. Egon -- ---- BiGCaT received a NWO Open Science grant to support our research into interoperability of biological data and knowledge: https://www.nature.com/articles/d41586-021-03418-1 and https://www.nwo.nl/en/researchprogrammes/open-science/open-science-fund/open-science-fund-2021-awarded-grants ----- E.L. Willighagen Department of Bioinformatics - BiGCaT Maastricht University (http://www.bigcat.unimaas.nl/) Twitter/Mastodon: @egonwillighagen <https://twitter.com/egonwillighagen> / @egonw <https://scholar.social/@egonw> Homepage: http://egonw.github.io/ Blog: http://chem-bla-ics.blogspot.com/ PubList: https://www.zotero.org/egonw ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286> ImpactStory: https://impactstory.org/u/egonwillighagen |
From: Egon W. <ego...@gm...> - 2022-01-01 12:10:35
|
Hi Peter, all, Most of the dependencies of Euclid/CMLXOM have been updated, with Log4j giving quite some opportunity to pull in updates for other libraries. One update still needs to happen: the update from XOM 1.2 to XOM 1.3. This requires a few API changes: https://github.com/BlueObelisk/euclid/pull/19 The changes are pretty straightforward and are just about return values not being the more generic Node but the more precise Attribute/Element. I intend to push this as Euclid 2 and CMLXOM 4, so increasing the major version. What do you think? Egon -- ---- BiGCaT received a NWO Open Science grant to support our research into interoperability of biological data and knowledge: https://www.nature.com/articles/d41586-021-03418-1 and https://www.nwo.nl/en/researchprogrammes/open-science/open-science-fund/open-science-fund-2021-awarded-grants ----- E.L. Willighagen Department of Bioinformatics - BiGCaT Maastricht University (http://www.bigcat.unimaas.nl/) Twitter/Mastodon: @egonwillighagen <https://twitter.com/egonwillighagen> / @egonw <https://scholar.social/@egonw> Homepage: http://egonw.github.io/ Blog: http://chem-bla-ics.blogspot.com/ PubList: https://www.zotero.org/egonw ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286> ImpactStory: https://impactstory.org/u/egonwillighagen |
From: Andrew D. <da...@da...> - 2021-12-21 12:29:02
|
Hi Suliman, On Dec 19, 2021, at 05:51, Suliman Sharif <sha...@gm...> wrote: >> When was the current state of machine representation figured out? > > I would say the 1980s was after the invention of SMILES where they used something somewhat "readable", they got it started and now we continue is my thought there. As additional factors to think about, 1980s SMILES didn't handle chirality or isotopes. Those were added in the 1990s. Computer databases in the 1970s, like MACCS, could already store those. Indeed, lack of stereochemistry support in WLN was one of the factors which led to its decline around 1980. (Stereochemistry was given in human-readable notes.) What could SMILES handle that MACCS's connection tables couldn't in 1980? I don't think the 1980s SMILES representation exceeds that of MCC (mechanical chemical code - https://pubs.acs.org/doi/pdf/10.1021/c160027a002 ), which includes isotopes but not stereochemistry. >> Does cheminformatics include its roots in library science? Or are those now different fields? > > I like chem + informatic because it's one character shorter and in my opinion sounds cooler. I mean you could say it's around the time we started constructing the IUPAC language trying to turn what's going on chemistry wish to a language representation and it's a part of library science. But anything is a part of a library science since we all record scientific information in some format. I don't think I expressed my question well enough. The current journal "Journal of Chemical Information and Modeling" was previously "Journal of Chemical Information and Computer Sciences", which was previously "Journal of Chemical Documentation". Before J. Chem. Doc., papers were published in American Documentation or the Journal of Chemical Education. The word "Documentation" is used in those earlier journals because documentation science is the precursor to information science, coming out of the work of Otlet and La Fontaine. 
See Traité de Documentation (1934) and their work on the Mundaneum (1910). "Documentation" was the hot topic in the mid-20th century. Chemistry was one of the biggest data sets around (after legal cases), and much of the field we now call "cheminformatics" arose during the post-war era as a way to mechanize documentation management, first through punched cards and then through computers. Terms like "chemical descriptor" come directly out of this era, and the same researcher who coined both "descriptor" and "chemical descriptor" also coined the term "information retrieval", for an ACS conference. So I don't mean the abstract "we all record scientific information in some format", but I mean the historic evolution of this field as a branch of library science, with practitioners who work in libraries, and published articles on how to manage their collections. (E.g., "The Charter: A "Must" for Effective Information System Planning and Design", http://dx.doi.org/10.1021/c160012a004 "It is the product of research work by information center managers, information system supervisors, technical report file custodians, and others who undertook information storage and retrieval efforts".) On the other hand, cheminformatics can also be interpreted as the field which (among other things) uses methods of chemical information originally developed for documentation management in order to model chemical behavior. That's the "... and Modeling" of JCIM. Someone can have a successful career in that aspect of cheminformatics without knowing anything about the connections to library science. Which means a book about cheminformatics has to decide what "cheminformatics" means, hence my question. > Maybe we should teach IUPAC first again, Again, what is your purpose? What topics do you de-emphasize in order to teach more about IUPAC? And from what I hear, IUPAC has recently changed. > Check out Morgan's paper and some slides I made from that paper in teaching. I have read Morgan's paper. 
Amusingly, the ACS included it in the final report of the NSF-funded work they did to develop and expand a computer-based Chemical Registry System, which means it's not behind a paywall. https://eric.ed.gov/?id=ED032214 , Appendix D, starting on PDF page 134. I also looked at your text at https://sharifsuliman1.medium.com/understanding-morgan-f70186b172f6 . Since the slides are a bit ambiguous about a few concepts, here are some other things to consider: "Well to do that he first decided he needed to come up with a rank ordering system, a way to sequentially at atoms in some sort of list for example for acetone:" He didn't come up with a rank ordering system. He came up with a unique rank. A non-unique rank ordering was in use in, e.g., Ray and Kirsch's 1957 computer substructure search implementation, and in Mooers' 1951 theoretical description. "He chose to implement an old method of a Search Tree" I think you should point out that these concepts were new at the time. "Morgan decided the information would be stored in a series of 5 lists" One of the things that makes that paper difficult to understand is how it uses the compact connection table, which is a representation I think no one uses these days. Those 5 lists are part of that specific representation, but not essential to the algorithm. This representation came from Gluck's work at Du Pont ("A Chemical Structure Storage and Search System Developed at Du Pont", https://pubs.acs.org/doi/pdf/10.1021/c160016a008 , presented 1964, published 1965). Now, Gluck also had a canonicalization method, described in that paper as "The atom numbers in the bond columns are the newly assigned rank positions. The two Atoms No. 4 have different atom ranks associated with their single bonds. The iterative procedure which follows the initial ordering break ties according to the magnitudes of the atoms to which the tied atoms are bonded. .. 
This iterative process of reordering according to the new rank of the atoms in the bond columns continues until all atoms are uniquely ranked, in which case the compound is in canonical form, or until no further reordering is possible until ties still remain." You can see ties with the Morgan approach; Gluck then went to work at CAS with Morgan. The main problem was that Gluck's algorithm wasn't actually canonical. In "A Collection of Algorithms for Searching Chemical Compound Structure Analogs" at https://archive.org/details/DTIC_AD0460819/page/n19/mode/2up you can see Lehman's counter-example showing how the algorithm failed. The Morgan algorithm resolved that problem. "Essentially what you can do is start with a Radius of 0 around the atom." I'm concerned that you've mixed up the "Morgan invariant", as it's described for ECFP-like fingerprints, with the algorithm that Morgan described in the paper. If you look at your radius=2 example, you'll see that 17 = 3*3 + (3+3+2), that is, the invariant for the initial carbon, squared, plus the sum of the invariants for the atoms at R=2 away. It no longer includes the R=1 invariants. You can see that even if the neighboring -OH has an initial invariant of 1,000, that value won't be part of the initial carbon's invariant. Instead, for purposes of teaching I would start with Penny codes, which are described in the paper immediately following Morgan's in the same issue, at https://pubs.acs.org/doi/pdf/10.1021/c160017a019 . On page 11 of that same "A Collection of Algorithms for Searching Chemical Compound Structure Analogs" link at https://archive.org/details/DTIC_AD0460819/page/n19/mode/2up you can read about Penny codes. Penny, in a recent paper, recognizes correctly that atom and bonding considerations alone are in some cases inadequate for distinguishing compounds. His method is concerned with enumerating the simple connectivity in the neighborhood of each atom. 
As he states, "it is a unique expression of the atomic network within the immediate neighborhood of the subject atom and is an attribute of the atom as much as its chemical identity". Page 12 then goes into more detail. You'll see these are much more in line with your description. I personally think RDKit's use of "Morgan" fingerprint should be "Penny" fingerprint, but I know that's a predilection of mine. > It's weird to me that data structures is not a core requirement for cheminformatic folk. Like all interdisciplinary fields, cheminformatics uses only a subset of the larger topic of "data structures", and has some specialized needs not covered by normal introductory classes. I have a CS degree. Data structures as taught by computer scientists include many topics I have not yet needed in cheminformatics. I've never needed to care about B-tree implementations. Or red-black balanced binary trees. I don't think I've even had to use Dijkstra's algorithm, which is pretty molecular-graph-adjacent. Meanwhile, intro data structure classes don't teach substructure isomorphism algorithms. And I think Bloom filters (conceptually related to molecular fingerprints) are also a more advanced topic. On the other hand, I have used concepts I learned in automata theory. So while I completely support the idea that a cheminformatics textbook should include a deeper treatment of graph theory than, say, the 5 pages Gasteiger gives in his textbook, I also completely support the idea that a semester-long general-purpose programming course, plus a semester-long data structures course, isn't appropriate. Cheers, Andrew da...@da... |
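To make the distinction in the message above a little more concrete, here is a minimal sketch of the extended-connectivity step as it is usually summarized in textbooks: start from each atom's heavy-atom degree, repeatedly replace each value with the sum of its neighbours' values, and stop once the number of distinct values no longer grows. This is an assumption-laden teaching sketch, not Morgan's full procedure (it omits the numbering pass and the compact connection table discussed above), and it is not the ECFP-style invariant used for fingerprints. The function name and the adjacency-dict input format are hypothetical.

```python
def extended_connectivity(neighbors):
    """Return extended-connectivity values for a hydrogen-suppressed graph.

    neighbors maps each atom index to a list of bonded atom indices.
    The values returned are those of the last round in which the number
    of distinct values increased.
    """
    # Round 0: the connectivity (number of heavy-atom neighbours).
    values = {atom: len(nbrs) for atom, nbrs in neighbors.items()}
    n_classes = len(set(values.values()))
    while True:
        # Each atom's new value is the sum of its neighbours' current values.
        new_values = {atom: sum(values[n] for n in nbrs)
                      for atom, nbrs in neighbors.items()}
        new_classes = len(set(new_values.values()))
        if new_classes <= n_classes:
            # No further discrimination gained; keep the previous round.
            return values
        values, n_classes = new_values, new_classes

# n-pentane as a chain of five carbons: 0-1-2-3-4
PENTANE = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(extended_connectivity(PENTANE))
```

For the pentane chain the degrees (1, 2, 2, 2, 1) give two classes, one summation round gives three, and the next round falls back to two, so the sketch keeps the middle round. The remaining ties are between symmetry-equivalent atoms, which is the kind of tie the later numbering stage has to handle.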
From: Suliman S. <sha...@gm...> - 2021-12-19 04:51:48
|
Sorry for being late.....back in this bloody grad school...don't have complete freedom yet. Advisor takes precedence. Dalke, > When was the current state of machine representation figured out? I would say the 1980s was after the invention of SMILES where they used something somewhat "readable", they got it started and now we continue is my thought there. > Does cheminformatics include its roots in library science? Or are those now different fields? I like chem + informatic because it's one character shorter and in my opinion sounds cooler. I mean you could say it's around the time we started constructing the IUPAC language trying to turn what's going on chemistry wish to a language representation and it's a part of library science. But anything is a part of a library science since we all record scientific information in some format. The first chapter wouldn't be SMILES, it would be the Hill system, then chemical formulas, and only then SMILES. Maybe we should teach IUPAC first again, since I think it still stands as the legacy language and predates Hill, and most people forget it after the first chapter of organic chemistry. But instead of just basic nomenclature, expand into other dialects: polymers and such. Then we talk about the representation of the elements and how the Hill system introduced rank ordering of the atoms. Carbon being first, then hydrogens, then other heteroatoms, and why that was important and perhaps easy to read? Personally, I liked Morgan's way of representing data and his paper; I found it easy to add my own hippie spin on it and teach it to the molecular dynamics folk. It was easy for me to learn how to play around with it, and the concept of a chemical environment and how to represent it numerically. Seems like some of them got it, most of my department tends to be 35-50+. Haven't tried on younger folk. Check out Morgan's paper and some slides I made from that paper in teaching. 
As an aside to the list, in a hypothetical textbook, I would think 2D-based input is an important component. What's the FOSS equivalent to ChemDraw? ChemDoodle Web Component? JSME is not distributed in "the preferred form of the work for making modifications to it" (quoting the GPL). Yeah I agree, ChemDraw I guess has been what I've been using since I was a kid as well (like 8-9 years now)....I use their online free version: https://chemdrawdirect.perkinelmer.cloud/js/sample/index.html. I don't have a good answer for that one. New software comes and goes, what stands the test of time. Geoffrey, Over the years, Rich Apodaca has also blogged about the many limitations to the common connection-table type representation. (Helicene chirality, delocalized bonding systems, weird aromaticity all jump to mind.) e.g. https://depth-first.com/articles/2021/07/14/the-trouble-with-huckel/ Aromaticity is tricky. And metal ions yeah, that's going to take some time. Metals aren't done well, if at all, in the charmm forcefield. I don't have answer to this, apodaca is right. Will need to re-read the blog again and think a bit. Regards to teaching, It's weird to me that data structures is not a core requirement for cheminformatic folk. I would rather make undergraduate students take 2 semesters of organic chemistry and then intro to programming, and data structures before tackling an intro to cheminformatics. There should be a barrier to entry into the field, I feel like the jupyter notebook is a good way to get instant gratification fast. But you want to 'refine education', which is far broader, and that's not an effort I feel like doing. I'm not dead yet, so I'll try. -Sul On Fri, Dec 3, 2021 at 1:44 PM Geoffrey Hutchison <geo...@gm...> wrote: > As a reminder, there's an infinite number of structures that SMILES can't > handle. (Endohedral fullerenes and catenanes, to name two.) 
> > > Over the years, Rich Apodaca has also blogged about the many limitations > to the common connection-table type representation. (Helicene chirality, > delocalized bonding systems, weird aromaticity all jump to mind.) > e.g. > https://depth-first.com/articles/2021/05/04/of-zero-order-bonds-and-bonding-systems/ > > https://depth-first.com/articles/2021/07/14/the-trouble-with-huckel/ > > And 3D structures are at best a snapshot of a flexible structure. > > > When I started at Pitt, part of my "teaching statement" was about how our > common 2D (textbook, slides, printouts, articles) representations hide both > the 3D and dynamic nature of real molecules. > > I also learned by watching students that they have a "hard sphere" concept > of lone pairs, orbitals, etc. > > I'm too jaded, in that I feel like I've seen all these ideas before and > participated in some of them, and they never made an impact outside the > core community. > > > Not to disillusion Suliman, but I would suggest there are a variety of > resources out there, e.g. > : Cheminformatics Online Collaborative Course http://olcc.ccce.divched.org - > https://pubs.acs.org/doi/10.1021/acs.jchemed.0c01035 > : TeachCADD - https://volkamerlab.org/projects/teachopencadd/ > > > -- *Suliman Sharif* M.Sc Medicinal Chemistry | University of California, Riverside B.Sc. Biochemistry | University of Texas at Austin sha...@gm... |
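The Hill ordering mentioned in the message above (carbon first, then hydrogen, then the other elements) is simple enough to sketch in a few lines. This is a minimal illustration under the usual statement of the convention: when carbon is present, C and H lead and the remaining symbols follow alphabetically; when carbon is absent, everything including H is alphabetical. The function name and the dict-of-counts input are hypothetical choices for the example.

```python
def hill_formula(counts):
    """Build a Hill-order molecular formula from {element symbol: count}."""
    counts = {el: n for el, n in counts.items() if n > 0}
    if "C" in counts:
        # Carbon first, then hydrogen, then everything else alphabetically.
        order = ["C"]
        if "H" in counts:
            order.append("H")
        order += sorted(el for el in counts if el not in ("C", "H"))
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        order = sorted(counts)
    # A count of 1 is written without a number.
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "")
                   for el in order)

print(hill_formula({"O": 1, "H": 6, "C": 3}))  # acetone
print(hill_formula({"H": 1, "Cl": 1}))         # hydrogen chloride
```

The second call shows the less familiar half of the rule: without carbon, hydrogen gets no special position.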