NLP Template Engine

In brief

This repository provides the implementation, data, and documentation of a Natural Language Processing (NLP)
Template Engine (TE), [Wk1],
that utilizes
Question Answering Systems (QASs), [Wk2],
and Machine Learning (ML) classifiers.

The current implementation of the repository's NLP-TE is heavily based on the Wolfram Language (WL)
built-in function
FindTextualAnswer,
[WRI1].
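
To give a concrete, self-contained feel for the mechanism (this is only an illustration; the engine's actual questions live in the template data files, and the question wording below is ours), FindTextualAnswer can extract a parameter value directly from a free-text computation specification:

FindTextualAnswer[
 "Compute quantile regression with probabilities 0.4 and 0.6 for the dataset dfTempBoston.",
 "Which dataset is used?"]

(* expected answer: the dataset name, "dfTempBoston" *)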

In the future, we plan to utilize other -- both WL and non-WL -- QAS implementations.

Problem formulation

We want to have a system (i.e. TE) that:

  1. Generates relevant, correct, executable programming code based on natural language specifications of computational workflows

  2. Can automatically recognize the workflow types

  3. Can generate code for different programming languages and related software packages

The points above are given in order of importance: the most important comes first.


Examples

Load the package
NLPTemplateEngine.m
with:

Import["https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/Packages/WL/NLPTemplateEngine.m"]

Latent Semantic Analysis

lsaCommand = "Extract 20 topics from the text corpus aAbstracts using the method NNMF. Show statistical thesaurus with the words neural, function, and notebook";
lsaRes = Concretize["LSAMon", lsaCommand]
Hold[lsaObj =
    LSAMonUnit[aAbstracts]⟹
    LSAMonMakeDocumentTermMatrix["StemmingRules" -> Automatic, "StopWords" -> Automatic]⟹
    LSAMonEchoDocumentTermMatrixStatistics["LogBase" -> 10]⟹
    LSAMonApplyTermWeightFunctions["GlobalWeightFunction" -> "IDF", "LocalWeightFunction" -> "None", "NormalizerFunction" -> "Cosine"]⟹
    LSAMonExtractTopics["NumberOfTopics" -> 20, Method -> "NNMF", "MaxSteps" -> 16, "MinNumberOfDocumentsPerTerm" -> 20]⟹
    LSAMonEchoTopicsTable["NumberOfTerms" -> 10]⟹
    LSAMonEchoStatisticalThesaurus["Words" -> {"neural", "function", "notebook"}];
]
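
The generated pipeline uses LSAMon functions, which come from the MonadicLatentSemanticAnalysis package of the MathematicaForPrediction project. To actually run the held code, load that package first (the raw-file URL below is our assumption of its usual location; verify against the MathematicaForPrediction repository), assign a corpus to aAbstracts, and then evaluate ReleaseHold[lsaRes]:

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/MonadicLatentSemanticAnalysis.m"]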

Quantile Regression

qrCommand =
  "Compute quantile regression with probabilities 0.4 and 0.6, with interpolation order 2, for the dataset dfTempBoston.";
qrRes = Concretize[qrCommand]
Hold[
 qrObj =
   QRMonUnit[dfTempBoston]⟹
   QRMonEchoDataSummary[]⟹
   QRMonQuantileRegression[12, {0.4, 0.6}, InterpolationOrder -> 2]⟹
   QRMonPlot["DateListPlot" -> False, PlotTheme -> "Detailed"]⟹
   QRMonErrorPlots["RelativeErrors" -> False, "DateListPlot" -> False, PlotTheme -> "Detailed"];
]
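
Similarly, this pipeline relies on the QRMon functions from the MonadicQuantileRegression package of MathematicaForPrediction, and on data being assigned to dfTempBoston. A typical load line (again, the URL is our assumption of the usual raw-file location) is:

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/MonadicQuantileRegression.m"]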

Random tabular data generation

rtdCommand =
"Create a random dataset with 30 rows, 8 columns, and 60 values using column names generator RandomWord.";
res = Concretize["RandomDataset", rtdCommand]
Hold[ 
  ResourceFunction["RandomTabularDataset"][{30, 8}, 
    "ColumnNamesGenerator" -> RandomWord, "Form" -> "Wide", 
    "MaxNumberOfValues" -> 60, "MinNumberOfValues" -> 60, 
    "RowKeys" -> False]
]
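
Unlike the previous two examples, this generated expression does not depend on user data or extra packages, so it can be evaluated directly by releasing the hold (an internet connection is needed the first time, to fetch the resource function):

ReleaseHold[res]

(* a random Dataset with 30 rows and 8 columns *)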

Interactive interface

Here is an interactive interface that gives "online" access to the functionalities discussed:
"DSL evaluations interface".

[Image: "DSL evaluations interface" with a Quantile Regression specification for the QAS]

(In order to try out the repository's TE, the "Question Answering System" radio button has to be selected.)


Bring your own templates

  1. Load the NLP-Template-Engine WL package:
Import["https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/Packages/WL/NLPTemplateEngine.m"]
  2. Get the "training" templates data (from a CSV file you have created or changed) for a new workflow
    ("SendMail"):
dsSendMailTemplateEngineData = ResourceFunction["ImportCSVToDataset"]["https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/TemplateData/dsQASParameters-SendMail.csv"];
Dimensions[dsSendMailTemplateEngineData]

(* {43, 5} *)
  3. Add the ingested data for the new workflow (from the CSV file) into the NLP-Template-Engine:
NLPTemplateEngineAddData[dsSendMailTemplateEngineData] // Keys

(* {"Questions", "Templates", "Defaults", "Shortcuts"} *)
  4. Parse a natural language specification with the newly ingested and onboarded workflow ("SendMail"):
Concretize["SendMail", "Send email to joedoe@gmail.com with content RandomReal[343], and the subject this is a random real call.", PerformanceGoal -> "Speed"]

(* Hold[
 SendMail[
  Association["To" -> {"joedoe@gmail.com"}, 
   "Subject" -> "a random real call", "Body" -> RandomReal, 
   "AttachedFiles" -> None]]] *)
  5. Experiment with running the generated code! (A sketch follows below.)
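
For example, here is a sketch of that experiment (the variable name sendRes is just for illustration). Releasing the hold evaluates WL's built-in SendMail, which attempts to actually send the message with the currently configured mail settings, so inspect and, if needed, edit the generated code first:

sendRes = Concretize["SendMail", "Send email to joedoe@gmail.com with content RandomReal[343], and the subject this is a random real call.", PerformanceGoal -> "Speed"];

(* ReleaseHold[sendRes]  -- uncomment to actually send the email *)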

References

Articles

[JL1] Jérôme Louradour,
"New in the Wolfram Language: FindTextualAnswer",
(2018),
blog.wolfram.com.

[Wk1] Wikipedia entry, Template processor.

[Wk2] Wikipedia entry, Question answering.

Functions, packages, repositories

[AAr1] Anton Antonov,
DSL::Shared::Utilities::ComprehensiveTranslation Raku package,
(2020),
GitHub/antononcube.

[ECE1] Edument Central Europe s.r.o.,
https://cro.services.

[WRI1] Wolfram Research,
FindTextualAnswer,
(2018),
Wolfram Language function, (updated 2020).

Videos

[AAv1] Anton Antonov,
"NLP Template Engine, Part 1",
(2021),
Simplified Machine Learning Workflows at YouTube.