This repository provides the implementation, data, and documentation of a Natural Language Processing (NLP) Template Engine (TE), [Wk1], that utilizes Question Answering Systems (QAS), [Wk2], and Machine Learning (ML) classifiers.
The current implementation of the repository's NLP-TE is heavily based on the Wolfram Language (WL) built-in function FindTextualAnswer, [WRI1].
In the future, we plan to utilize other -- both WL and non-WL -- QAS implementations.
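To make the role of FindTextualAnswer more concrete, here is a minimal sketch of the general idea: ask questions about a command, then place the answers into a code template. It is not the package's actual implementation. The command text, the questions, the slot names, and the template string are all hypothetical and chosen only for illustration. The answers depend on the underlying model, so no output is shown.

```mathematica
(* Hypothetical command, questions, and template; not taken from the package *)
command = "Extract 20 topics from the text corpus aAbstracts using the method NNMF.";

questions = <|
  "nTopics" -> "How many topics?",
  "method" -> "Which method?",
  "corpus" -> "Which text corpus?"
|>;

(* Ask each question against the command text; Map keeps the association keys *)
answers = Map[FindTextualAnswer[command, #] &, questions];

(* Fill the named slots of a code template with the extracted answers *)
template = StringTemplate[
  "LSAMonUnit[`corpus`]⟹LSAMonExtractTopics[\"NumberOfTopics\" -> `nTopics`, Method -> \"`method`\"]"];

TemplateApply[template, answers]
```

The result is a string of code, so a poorly extracted answer is visible before anything is evaluated.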
We want to have a system (i.e. TE) that:
1. Generates relevant, correct, executable programming code based on natural language specifications of computational workflows
2. Can automatically recognize the workflow types
3. Can generate code for different programming languages and related software packages
The points above are given in order of importance; the most important are placed first.
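The second goal, recognizing the workflow type, can be sketched with the built-in Classify function. This is only an illustration under stated assumptions: the training commands below are made up, the class labels reuse the workflow names that appear in this README, and the repository's own classifier and training data may differ. With so few examples the predictions are not reliable, so no output is shown.

```mathematica
(* Hypothetical training commands labeled with workflow types *)
trainingData = {
  "Extract 20 topics from the text corpus using NNMF" -> "LSAMon",
  "Show statistical thesaurus for the words neural and function" -> "LSAMon",
  "Make a document-term matrix and extract topics" -> "LSAMon",
  "Compute quantile regression with probabilities 0.4 and 0.6" -> "QRMon",
  "Fit regression quantiles with interpolation order 2" -> "QRMon",
  "Plot the regression quantiles and relative errors" -> "QRMon",
  "Create a random dataset with 30 rows and 8 columns" -> "RandomDataset",
  "Generate random tabular data with 10 columns" -> "RandomDataset",
  "Make a random data frame using column names generator RandomWord" -> "RandomDataset"
};

(* Train a text classifier; Classify picks the method automatically *)
cf = Classify[trainingData];

(* Predict the workflow type of a new command *)
cf["Compute quantile regression for the dataset dfTempBoston"]

(* Inspect the class probabilities to judge how confident the prediction is *)
cf["Compute quantile regression for the dataset dfTempBoston", "Probabilities"]
```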
Load the package NLPTemplateEngine.m with:
Import["https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/Packages/WL/NLPTemplateEngine.m"]
lsaCommand = "Extract 20 topics from the text corpus aAbstracts using
the method NNMF. Show statistical thesaurus with the words neural, function, and notebook";
lsaRes = Concretize["LSAMon", lsaCommand]
Hold[lsaObj =
LSAMonUnit[aAbstracts]⟹
LSAMonMakeDocumentTermMatrix["StemmingRules" -> Automatic, "StopWords" -> Automatic]⟹
LSAMonEchoDocumentTermMatrixStatistics["LogBase" -> 10]⟹
LSAMonApplyTermWeightFunctions["GlobalWeightFunction" -> "IDF", "LocalWeightFunction" -> "None", "NormalizerFunction" -> "Cosine"]⟹
LSAMonExtractTopics["NumberOfTopics" -> 20, Method -> "NNMF", "MaxSteps" -> 16, "MinNumberOfDocumentsPerTerm" -> 20]⟹
LSAMonEchoTopicsTable["NumberOfTerms" -> 10]⟹
LSAMonEchoStatisticalThesaurus["Words" -> {"neural", "function", "notebook"}];
]
qrCommand =
"Compute quantile regression with probabilities 0.4 and 0.6, with interpolation order 2, for the dataset dfTempBoston.";
qrRes = Concretize[qrCommand]
Hold[
qrObj =
QRMonUnit[dfTempBoston]⟹
QRMonEchoDataSummary[]⟹
QRMonQuantileRegression[12, {0.4, 0.6}, InterpolationOrder -> 2]⟹
QRMonPlot["DateListPlot" -> False, PlotTheme -> "Detailed"]⟹
QRMonErrorPlots["RelativeErrors" -> False, "DateListPlot" -> False, PlotTheme -> "Detailed"];
]
rtdCommand =
"Create a random dataset with 30 rows, 8 columns, and 60 values using column names generator RandomWord.";
res = Concretize["RandomDataset", rtdCommand]
Hold[
ResourceFunction["RandomTabularDataset"][{30, 8},
"ColumnNamesGenerator" -> RandomWord, "Form" -> "Wide",
"MaxNumberOfValues" -> 60, "MinNumberOfValues" -> 60,
"RowKeys" -> False]
]
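The results shown above are wrapped in Hold, so the generated code is not evaluated until requested. The following sketch shows one way to inspect and then run such a result. The held expression here is a simple, hypothetical stand-in rather than actual Concretize output, so the snippet runs without the package or the datasets.

```mathematica
(* Stand-in for a Concretize result; a hypothetical, simpler held expression *)
held = Hold[Total[Range[10]]];

(* Inspect the generated code as a string without evaluating it *)
ToString[held, InputForm]
(* "Hold[Total[Range[10]]]" *)

(* Evaluate the generated code once it has been reviewed *)
ReleaseHold[held]
(* 55 *)
```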
Here is an interactive interface that gives "online" access to the functionalities discussed: "DSL evaluations interface".
(In order to try out the repository's TE, the "Question Answering System" radio button has to be selected.)
Import["https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/Packages/WL/NLPTemplateEngine.m"]
dsSendMailTemplateEngineData = ResourceFunction["ImportCSVToDataset"]["https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/TemplateData/dsQASParameters-SendMail.csv"];
Dimensions[dsSendMailTemplateEngineData]
(* {43, 5} *)
NLPTemplateEngineAddData[dsSendMailTemplateEngineData] // Keys
(* {"Questions", "Templates", "Defaults", "Shortcuts"} *)
Concretize["SendMail", "Send email to joedoe@gmail.com with content RandomReal[343], and the subject this is a random real call.", PerformanceGoal -> "Speed"]
(* Hold[
SendMail[
Association["To" -> {"joedoe@gmail.com"},
"Subject" -> "a random real call", "Body" -> RandomReal,
"AttachedFiles" -> None]]] *)
[JL1] Jérôme Louradour, "New in the Wolfram Language: FindTextualAnswer", (2018), blog.wolfram.com.
[Wk1] Wikipedia entry, Template processor.
[Wk2] Wikipedia entry, Question answering.
[AAr1] Anton Antonov, DSL::Shared::Utilities::ComprehensiveTranslation Raku package, (2020), GitHub/antononcube.
[ECE1] Edument Central Europe s.r.o., https://cro.services.
[WRI1] Wolfram Research, FindTextualAnswer, (2018), Wolfram Language function, (updated 2020).
[AAv1] Anton Antonov, "NLP Template Engine, Part 1", (2021), Simplified Machine Learning Workflows at YouTube.