From: Matthew F. <mat...@gm...> - 2025-07-28 14:20:13
|
You could use the CoreML IR, but (as a whole-program compiler), MLton will always give you the CoreML IR for the whole program, so you'll need to "find" your datatypes within there. I'm not sure about the complexity of SML datatype definitions that you expect to encounter, but I suspect that it might be simpler to just write your own tiny parser for SML type definitions (and assume that the complexities of type definitions aren't really used). I kind of see the appeal of the type-level "marking" of types in the datatypes, but I wonder if a simple custom comment syntax might suffice: ``` datatype 'a tree = Leaf | Node of 'a tree * 'a * 'a tree type iitree = (*#1*) ((*#0*) int tree) tree ``` I note that in your original example, you used a `type` definition to describe the type of data to be generated, but note that elaboration to CoreML has completely expanded all `type` definitions, so there wouldn't be one to find in the CoreML program. I suppose that in a larger application, it would be convenient to have all of the datatype definitions available, but you will still need to parse the CoreML sufficiently to find all of the involved types. -Matthew On Fri, Jul 25, 2025 at 9:41 PM Arnav Sabharwal <arn...@an...> wrote: > Hello Prof. Matthew, Yawar, > > Thank you for your response and advice! Apologies, I miscommunicated > earlier: my OCaml "generator" (not generator - generator) takes in an SML > type and the user specified target sizes for each involved inductive type > and yields an SML term of said type, which meets the size constraints. > > I was considering the approach of stopping MLton at **some** point since: > > 1. I was hoping I could get MLton to unique-ify type variables for me > 2. MLton could simplify the grammar of type and datatype declarations > for me, making parsing easier (as compared to say, parsing SML from source) > > Seemingly, the Core-ML IR seems to have these properties? Please correct > me if I'm wrong! > > Thanks, > Arnav. > > On Thu, Jul 24, 2025 at 9:21 PM Matthew Fluet <mat...@gm...> > wrote: > >> Aranav isn’t subscribed, so I needed to moderate/accept the email before >> it was sent to the whole list. >> >> On Thu, Jul 24, 2025 at 8:03 PM 'Yawar Raza (RIT Student)' via MLton-user >> <mlt...@ml...> wrote: >> >>> Nevermind to my first paragraph. I just received the original email. >>> Apparently they can get sent to me out of order... >>> >>> On Thursday, July 24, 2025, Yawar Raza (RIT Student) <ys...@ri...> >>> wrote: >>> >>>> Arnav, I think you might need to use "Reply-All" for the reply to >>>> remain on the mailing list, as only Prof. Fluet got your second email (I >>>> didn't get it, at least). >>>> >>>> Anyway, I don't think "stopping the MLton compiler part way through" is >>>> the best way to do the sort of thing you're trying to do. >>>> >>>> Something you _could_ look into is this one family of SML tricks >>>> centered around http://mlton.org/Fold, possibly more specifically >>>> http://mlton.org/VariableArityPolymorphism. I don't really understand >>>> these libraries though, and I'm personally very averse to this sort of code. >>>> >>>> I don't really get what you're trying to do. Why do you have a >>>> generator-generator? Is the generator (output by the generator-generator) >>>> in SML itself or does it just output SML? Does it output just datatype >>>> declarations or complete programs? I think saying what each level inputs >>>> and outputs would be helpful. >>>> >>>> In general, you should probably end up doing one of the following: >>>> - Push arity-polymorphism or similarly un-simple stuff up one generator >>>> level, generating different SML source code text for each (rather than >>>> trying to extract MLton compiler output). >>>> - Give up on perfect type safety relating to arities, and just have the >>>> different arities of some things share a single general type. >>>> - Use a different language that either has dependent types (e.g. Idris) >>>> or has more featureful metaprogramming (e.g. F# with its type providers). >>>> >>>> But maybe I'm just misunderstanding something. Feel free to ignore >>>> those last two paragraph if so, but I hope those two MLton links above were >>>> useful to learn exist, at least. >>>> >>>> >>>> On Thursday, July 24, 2025, Matthew Fluet <mat...@gm...> >>>> wrote: >>>>> >>>>> Well, if you just want a somewhat normalized representation of the >>>>> types, then you might just use the CoreML representation immediately after >>>>> elaboration (`-keep-pass parseAndElaborate`). >>>>> >>>>> For example: >>>>> ``` >>>>> $ cat z.sml >>>>> datatype zero = Zero >>>>> datatype 'n succ = Succ >>>>> >>>>> datatype ('elem, 'index) my_list = Nil | Cons of 'elem * ('elem, >>>>> 'index) my_list >>>>> val z : ((int, zero) my_list, zero succ) my_list = raise Fail "todo" >>>>> $ mlton -keep-pass parseAndElaborate z.sml >>>>> $ tail z.parseAndElaborate.post.core-ml >>>>> (* deadCode: true *) >>>>> (* deadCode: true *) >>>>> (* deadCode: false *) >>>>> datatype zero_21 = Zero_0 >>>>> datatype 'a_4547 succ_2 = Succ_0 >>>>> datatype ('a_4549, 'a_4548) my_list_0 = Nil_0 >>>>> | Cons_0 of 'a_4549 >>>>> * ('a_4549, >>>>> 'a_4548) my_list_0 >>>>> val z_26: ((int32, zero_21) my_list_0, zero_21 succ_2) my_list_0 = >>>>> raise (Fail_0 "todo") >>>>> ``` >>>>> >>>>> But, note, you'll still need to see through the uniquification of all >>>>> identifiers. >>>>> >>>>> -Matthew >>>>> >>>>> On Thu, Jul 24, 2025 at 10:48 AM Arnav Sabharwal < >>>>> arn...@an...> wrote: >>>>> >>>>>> Hello Matthew, >>>>>> >>>>>> Thank you for your response! My larger goal here is to make a random >>>>>> generator for (regular) SML inductive types that allows the user to >>>>>> configure "target sizes" for every inductive structure within their type. >>>>>> The type labels are meant to be a way for the user and the tool to agree on >>>>>> a common language for referring to said inductive types. >>>>>> >>>>>> For example, say the user wants to generate 2-d lists where the outer >>>>>> list has length 3 and each inner list has length 2. Then, they'd define >>>>>> their list type like so, building towards their monomorphic "final" type: >>>>>> ``` >>>>>> datatype ('elem, 'index) my_list = Nil | Cons of 'elem * ('elem, >>>>>> 'index) my_list >>>>>> type final = ((int, zero) my_list, zero succ) my_list >>>>>> ``` >>>>>> This way, the user could later provide the lengths [2, 3] and the >>>>>> tool would unambiguously be able to generate the type with the >>>>>> specified constraints. >>>>>> >>>>>> Oh, and I'm writing the "random generator"-generator tool in OCaml. >>>>>> So, I suppose one way could be to parse out the data-type and type >>>>>> declarations from source, while these tags still hang around. But then I'd >>>>>> have to "evaluate" type applications to bring them into a normal form, due >>>>>> to the lingering type variables...which is why I was considering delaying >>>>>> the point at which I "parse into an OCaml representation" as much as >>>>>> possible. >>>>>> >>>>>> Happy to clarify more if necessary, would greatly appreciate your >>>>>> advice! >>>>>> >>>>>> Arnav. >>>>>> >>>>>> On Thu, Jul 24, 2025 at 8:57 AM Matthew Fluet < >>>>>> mat...@gm...> wrote: >>>>>> >>>>>>> Arnav, >>>>>>> >>>>>>> Yes, in SXML ("Simply-typed" XML), there is no polymorphism >>>>>>> whatsoever (for functions or datatypes) and type parameters that do not >>>>>>> contribute to the representation of a datatype will have been eliminated. >>>>>>> >>>>>>> Actually, the SimplfyTypes (http://mlton.org/XMLSimplifyTypes) XML >>>>>>> optimization will have already eliminated these unused type parameters, >>>>>>> before monomorphisation. The motivation for SimplifyTypes is to avoid code >>>>>>> duplication by monomorphisation that is unnecessary, due to type parameters >>>>>>> that may be present for high-level API reasons, but do not affect the >>>>>>> dynamics because they do not influence the data representation. >>>>>>> >>>>>>> It may be that if you drop SimplifyTypes (compile with `-drop-pass >>>>>>> xmlSimplifyTypes`), then monomorphisation will force different "versions" >>>>>>> of your tagged datatypes to be separately monmorphised, though with the >>>>>>> current heuristics for naming monomorphised variants, you probably won't >>>>>>> get any evidence of the `zero` or `succ` type parameters remaining in the >>>>>>> SXML program. >>>>>>> >>>>>>> Could you say a bit more about your goals? That might help to >>>>>>> inform whether there is a better way to accomplish what you are trying to >>>>>>> do. >>>>>>> >>>>>>> -Matthew >>>>>>> >>>>>>> On Thu, Jul 24, 2025 at 5:24 AM Arnav Sabharwal < >>>>>>> arn...@an...> wrote: >>>>>>> >>>>>>>> Hello, >>>>>>>> >>>>>>>> I'm trying to tag every instance of a recursive type in a type >>>>>>>> definition with a tag. This tag is essentially a "type-level natural >>>>>>>> number". By way of example, here's what I mean: >>>>>>>> >>>>>>>> datatype zero = Zero >>>>>>>> datatype 'n succ = Succ >>>>>>>> >>>>>>>> (* example: rose tree *) >>>>>>>> datatype ('tree, 'list) rose_tree = Node of int * ('list, 'tree) rose_list >>>>>>>> and ('list, 'tree) rose_list = Nil | Cons of int * ('tree, 'list) rose_list >>>>>>>> >>>>>>>> (* example: tree with int list data *) >>>>>>>> datatype 'list my_list = Nil | Cons of int * 'list my_list >>>>>>>> and ('tree, 'list) my_tree = Leaf | Node of ('tree, 'list) my_tree * 'list my_list * ('tree, 'list) my_tree >>>>>>>> >>>>>>>> As seen, each type is given as many type parameters as is necessary >>>>>>>> to instantiate its own tag as well as the tags of its "recursive >>>>>>>> sub-structures". The first type argument is meant for the tag of the >>>>>>>> recursive type currently being defined. >>>>>>>> >>>>>>>> Now, when I use MLton to compile this, the tag types seem to be >>>>>>>> removed during monomorphization. Thus, >>>>>>>> >>>>>>>> (zero, zero succ) rose_tree >>>>>>>> and >>>>>>>> >>>>>>>> (zero succ succ, zero succ succ succ) rose_tree) >>>>>>>> appear to be the same type in SXML. What are my options here, if >>>>>>>> I'd like to maintain such a type level distinction? Is XML necessarily the >>>>>>>> furthest down I can go in the compiler pipeline? I'd greatly appreciate >>>>>>>> some advice here! >>>>>>>> >>>>>>>> Thank you, >>>>>>>> Arnav. >>>>>>>> , >>>>>>>> >>>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>>> send an email to mlt...@ml.... >>>>>>>> _______________________________________________ >>>>>>>> MLton-user mailing list >>>>>>>> MLt...@li...; mlt...@ml... >>>>>>>> https://lists.sourceforge.net/lists/listinfo/mlton-user >>>>>>>> >>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>> send an email to mlt...@ml.... >>>>> >>>> >>>> >>>> -- >>>> -- Yawar Raza >>>> >>>> >>> >>> -- >>> -- Yawar Raza >>> >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to mlt...@ml.... >>> _______________________________________________ >>> MLton-user mailing list >>> MLt...@li...; mlt...@ml... >>> https://lists.sourceforge.net/lists/listinfo/mlton-user >>> >> To unsubscribe from this group and stop receiving emails from it, send an email to mlt...@ml.... |