From: Karol L. <kar...@gm...> - 2019-02-05 15:59:13
Well, the parsers are not organized by method or whatnot. There's a base class with a parse method (https://github.com/cclib/cclib/blob/master/cclib/parser/logfileparser.py#L281), and the various parsers inherit from that and implement an extract method, which is called by the parse method (for NWChem, for example: https://github.com/cclib/cclib/blob/master/cclib/parser/nwchemparser.py#L42). So there's really no single function that would be replaced. It's the whole parsing that would be imitated by the machine-learned model. Of course, as a first step one could try to learn to parse rather easy things like the number of atoms and the charge.

HTH,
Karol

On Tue, Feb 5, 2019 at 3:32 AM Aditya Kamath <adt...@gm...> wrote:

> Hi Karol,
> Thank you for responding to my message. In that case, the problem becomes
> information extraction. I think it is possible using Deep Learning. Can you
> tell me some examples of cclib parsing functions you feel can be replaced
> with ML?
> Best Regards,
> Aditya
>
> On Wed, Jan 30, 2019 at 1:12 AM Karol Langner <kar...@gm...>
> wrote:
>
>> Hi Aditya,
>>
>> My intention with the idea was solely data extraction from log files, so
>> parsing. But if you see other applications of ML within the scope of cclib,
>> we're definitely interested. Please note other projects under the
>> OpenChemistry umbrella also have ML ideas, and many of those are more
>> straightforward. Here, with parsing, things will be much more researchy.
>>
>> HTH,
>> Karol
>>
>> On Tue, Jan 29, 2019, 2:13 AM Aditya Kamath <adt...@gm...> wrote:
>>
>>> Dear Karol,
>>> I am Aditya. I read your GSoC project idea to possibly implement machine
>>> learning to compete with cclib as an efficient data parser. From what I
>>> understand, you wish to train a machine learning model to handle and
>>> convert data between various software outputs.
>>>
>>> I suggest that the role of machine learning is not to handle or parse
>>> data but rather to analyze it. cclib can benefit from backend trained ML
>>> models to do tasks like classify file data, identify and extract
>>> information from files. It can also perform very accurate regression and
>>> emulate complex function maps which could benefit any calculation methods
>>> used by cclib.
>>>
>>> We can use algorithms like CRFs to label and identify data in data
>>> files or use neural networks or any other regression methods to complement
>>> calculations.
>>>
>>> I am a final year student, looking for a prospective GSoC project to
>>> work with. I have previously worked with a research group implementing
>>> machine learning for ODE solvers to compete with Gaussian software
>>> calculations, ab initio calculations. I would be happy to discuss further
>>> on how we can work with cclib functionalities. I look forward to hearing
>>> from you.
>>>
>>> Best Wishes,
>>> Aditya Kamath
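
For readers following the thread, the structure described in the reply at the top (a base class whose parse method drives the file loop, with each program-specific parser supplying an extract method) might look roughly like the sketch below. This is a simplified illustration, not cclib's actual code: the class names `BaseLogParser` and `ToyProgramParser`, the `data` dictionary, and the log format are all invented here. The two extracted values are the "easy" ones mentioned in the reply, the number of atoms and the charge.

```python
import io


class BaseLogParser:
    """Toy stand-in for a shared base class (hypothetical, not cclib's)."""

    def parse(self, stream):
        # The base class owns the loop over the file and hands every
        # line to the extract method supplied by the subclass.
        data = {}
        for line in stream:
            self.extract(stream, line, data)
        return data

    def extract(self, stream, line, data):
        raise NotImplementedError("subclasses implement extract")


class ToyProgramParser(BaseLogParser):
    """Parser for an invented log format, for illustration only."""

    def extract(self, stream, line, data):
        # Each branch recognizes one kind of line and stores one value.
        if line.startswith("Number of atoms:"):
            data["natom"] = int(line.split(":")[1])
        elif line.startswith("Total charge:"):
            data["charge"] = int(line.split(":")[1])


log = io.StringIO("Toy program v0\nNumber of atoms: 3\nTotal charge: -1\n")
result = ToyProgramParser().parse(log)
print(result)
```

Seen this way, a learned model would have to stand in for the whole chain of line-matching branches rather than for any one function, which is why starting with a couple of simple attributes is the natural first experiment.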