
ConversationalAgents

Articles, designs, and code for making conversational agents.

Conversational agents for Machine Learning workflows

Currently the primary focus of this repository is making grammars and interpreters that
generate programming code for Machine Learning (ML) workflows from sequences of natural language commands.

The code generation is done through dedicated grammar parsers, ML software monads, and interpreters that map
the parser-derived abstract syntax trees into corresponding ML monads.
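As a rough illustration of that flow, here is a minimal Python sketch (hypothetical rule table and function names, not the repository's actual Raku grammars): each natural command is matched against a pattern and mapped to the corresponding monad function call in the target language.

```python
# Minimal sketch of command-to-code generation (illustration only; the
# repository does this with Raku grammars and action classes, not regexes).
import re

# Hypothetical rule table: pattern over a natural command -> R code template.
RULES = [
    (re.compile(r"create from (\w+)"), "LSAMonUnit({0})"),
    (re.compile(r"extract (\d+) topics"), "LSAMonExtractTopics( numberOfTopics = {0})"),
]

def interpret(command: str) -> str:
    """Map one natural command to a target-language call, if a rule matches."""
    for pattern, template in RULES:
        m = pattern.search(command)
        if m:
            return template.format(*m.groups())
    raise ValueError(f"no rule for: {command}")

print(interpret("create from aText"))   # LSAMonUnit(aText)
```

A real grammar-based interpreter builds an abstract syntax tree first and walks it, which is what allows the same command sequence to target R, WL, or Python.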

Here is a diagram for the general, "big picture" approach:

[Diagram: Monadic making of ML conversational agents]

The primary targets are the programming languages R and Wolfram Language (WL).
(Some of the Raku packages generate Python code, but at this point that is just for illustration purposes.
There are no actual implementations for the generated Python pipelines.)

Example

The following example shows monadic pipeline generation of Latent Semantic Analysis (LSA) workflows
in both R and WL, using the interpreter functions to_LSAMon_R, to_LSAMon_WL, and to_LSAMon_Python.

Note that:

  • the sequences of natural commands are the same;

  • the generated R and WL code pipelines are similar because the corresponding packages have similar implementations.


This Raku (Perl 6) command assigns a sequence of natural commands to a variable:

my $command = '
create from aText;
make document term matrix with no stemming and automatic stop words;
echo data summary;
apply lsi functions global weight function idf, local term weight function none, normalizer function cosine;
extract 12 topics using method NNMF and max steps 12;
show topics table with 12 columns and 10 terms;
show thesaurus table for sing, left, home;
';
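The commands in the sequence are delimited by semicolons. As a plain-Python illustration (not part of the repository's parsing machinery), splitting on ';' recovers the seven individual steps that the grammars interpret one by one:

```python
# The natural-command sequence from above, semicolon-delimited.
# (Python illustration only; the repository parses this with Raku grammars.)
command = """
create from aText;
make document term matrix with no stemming and automatic stop words;
echo data summary;
apply lsi functions global weight function idf, local term weight function none, normalizer function cosine;
extract 12 topics using method NNMF and max steps 12;
show topics table with 12 columns and 10 terms;
show thesaurus table for sing, left, home;
"""

# Split into individual commands, dropping empty fragments.
steps = [s.strip() for s in command.split(";") if s.strip()]
print(len(steps))   # 7
print(steps[0])     # create from aText
```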

This Raku (Perl 6) command:

say to_LSAMon_R($command);

generates this R code:

LSAMonUnit(aText) %>%
LSAMonMakeDocumentTermMatrix( stemWordsQ = NA, stopWords = NULL) %>%
LSAMonEchoDocumentTermMatrixStatistics( ) %>%
LSAMonApplyTermWeightFunctions(globalWeightFunction = "IDF", localWeightFunction = "None", normalizerFunction = "Cosine") %>%
LSAMonExtractTopics( numberOfTopics = 12, method = "NNMF",  maxSteps = 12) %>%
LSAMonEchoTopicsTable(numberOfTableColumns = 12, numberOfTerms = 10) %>%
LSAMonEchoStatisticalThesaurus( words = c("sing", "left", "home"))

This Raku (Perl 6) command:

say to_LSAMon_WL($command);

generates this WL code:

LSAMonUnit[aText] ⟹
LSAMonMakeDocumentTermMatrix[ "StemmingRules" -> None, "StopWords" -> Automatic] ⟹
LSAMonEchoDocumentTermMatrixStatistics[ ] ⟹
LSAMonApplyTermWeightFunctions["GlobalWeightFunction" -> "IDF", "LocalWeightFunction" -> "None", "NormalizerFunction" -> "Cosine"] ⟹
LSAMonExtractTopics["NumberOfTopics" -> 12, Method -> "NNMF", "MaxSteps" -> 12] ⟹
LSAMonEchoTopicsTable["NumberOfTableColumns" -> 12, "NumberOfTerms" -> 10] ⟹
LSAMonEchoStatisticalThesaurus[ "Words" -> { "sing", "left", "home" }]

This Raku (Perl 6) command:

say to_LSAMon_Python($command);

generates this Python code:

obj = LSAMonUnit(aText);
obj = LSAMonMakeDocumentTermMatrix( lsaObj = obj, stemWordsQ = NA, stopWords = NULL);
obj = LSAMonEchoDocumentTermMatrixStatistics( lsaObj = obj );
obj = LSAMonApplyTermWeightFunctions( lsaObj = obj, globalWeightFunction = "IDF", localWeightFunction = "None", normalizerFunction = "Cosine");
obj = LSAMonExtractTopics( lsaObj = obj, numberOfTopics = 12, method = "NNMF",  maxSteps = 12);
obj = LSAMonEchoTopicsTable( lsaObj = obj, numberOfTableColumns = 12, numberOfTerms = 10);
obj = LSAMonEchoStatisticalThesaurus( lsaObj = obj, words = c("sing", "left", "home"))

Note that the Python code above shows how the R and WL monadic pipelines can be interpreted
as sequences of imperative commands.
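The correspondence between the two styles can be sketched in a few lines of runnable Python (function names and state layout are illustrative only; there is no actual Python LSAMon implementation):

```python
# Sketch of the imperative interpretation of a monadic pipeline: each step
# is a function that takes the accumulated state ("the monad object") and
# returns it updated. Names and state layout are illustrative only.

def lsa_mon_unit(text):
    return {"text": text}              # initial monad state

def lsa_mon_extract_topics(obj, number_of_topics):
    obj["topics"] = number_of_topics   # placeholder for the real computation
    return obj

# The R pipeline  LSAMonUnit(aText) %>% LSAMonExtractTopics(...)
# becomes a sequence of reassignments, exactly as in the generated code:
obj = lsa_mon_unit("aText")
obj = lsa_mon_extract_topics(obj, number_of_topics=12)
print(obj)
```

Each `obj = f(obj, ...)` line plays the role of one `%>%` (R) or `⟹` (WL) step, threading the pipeline state explicitly instead of through an operator.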