Vesta Configuration Management System / Feature Requests / #147 print full path dot

Greg Czajkowski - 2008-08-12

Logged In: YES
user_id=393776
Originator: YES

There are several human-readable, text-based formats available for data serialization. Here are some examples:

Generalized:
  JSON
  Turtle, a subset of N3
  XML
  YAML
Language specific:
  SDL has _print
  Perl has Data::Dumper
  Python has the pprint module

There are also text-based, platform-independent formats that are not human-readable:
  bencode

The added refinement is an output format that is suitable for grep searches, which means flattening the hierarchy.

This would be answered by the format. Let’s say we generalize the problem to the JSON data types and compare them to SDL's:

JSON    <-> SDL     : Similarity
Number  <-> int     : High
String  <-> text    : High
Boolean <-> bool    : High
Array   <-> list    : High
Object  <-> binding : High
Null    <-> N/A     : JSON only
N/A     <-> file    : Vesta only
N/A     <-> closure : Vesta only
N/A     <-> err     : Vesta only (deprecated)

Then take inspiration from Python and the Document Object Model, but only in how a value is accessed by the full path to the object, i.e. flattened.

Let’s see how the flattening could be applied to JSON:

A list of ints:
"a_list_of_ints"[0] : 4,
"a_list_of_ints"[1] : 0x20,
"a_list_of_ints"[2] : 07531,
"a_list_of_ints"[3] : -20,

A list of lists of ints:
"a_list_of_lists_of_ints"[0],"a_list_of_ints"[0] : 4,
"a_list_of_lists_of_ints"[0],"a_list_of_ints"[1] : 0x20,
"a_list_of_lists_of_ints"[0],"a_list_of_ints"[2] : 07531,
"a_list_of_lists_of_ints"[0],"a_list_of_ints"[3] : -20,
"a_list_of_lists_of_ints"[1],"b_list_of_ints"[0] : 0,
"a_list_of_lists_of_ints"[1],"b_list_of_ints"[1] : 50,

An object of mixed types:
"a_object"{"int"} : 4,
"a_object"{"string"} : "hello",
"a_object"{"bar"} : true,
"a_object"{"int"} : -20,
"a_object"{"escaped\"string"} : "escaped\"string",
"a_object"{"list"}[0] : 1,
"a_object"{"list"}[1] : 0,
"a_object"{"list"}[2],"b_list"[0] : 4,
"a_object"{"list"}[2],"b_list"[1] : "string",
"a_object"{"list"}[3],"c_escaped_\"id_object"{"foo"} : "beef"
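
The flattening shown above can be sketched in a few lines of Python. This is only an illustration, not the prototype mentioned in this thread: the function name flatten and the sample data are made up, and it only covers plain JSON values (it does not reproduce the named nested lists such as "b_list" in the hand-written example).

```python
import json

def flatten(name, value):
    """Yield (path, leaf) pairs for a nested JSON-like value.

    Object keys are written as {"key"} and list indexes as [i],
    following the notation used in the examples above.
    """
    def walk(path, v):
        if isinstance(v, dict):
            for key, child in v.items():
                # json.dumps quotes the key and escapes embedded quotes
                yield from walk(path + "{" + json.dumps(key) + "}", child)
        elif isinstance(v, list):
            for i, child in enumerate(v):
                yield from walk(f"{path}[{i}]", child)
        else:
            yield path, v  # a leaf: number, string, boolean or null
    yield from walk(json.dumps(name), value)

# Hypothetical sample data, loosely based on the example above.
data = {"int": 4, "string": "hello", "bar": True, "list": [1, 0]}
lines = [f"{path} : {json.dumps(leaf)}" for path, leaf in flatten("a_object", data)]
print("\n".join(lines))
```

Every output line carries the complete path to its leaf, so a single line is meaningful on its own.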

Simplify and compress for readability:
a_object{int} : 4
a_object{string} : "hello"
a_object{bar} : true
a_object{int} : -20
a_object{"escaped\"string"} : "escaped\"string"
a_object{list}[0] : 1
a_object{list}[1] : 0
a_object{list}[2].b_list[0] : 4
a_object{list}[2].b_list[1] : "string"
a_object{list}[3]."c_escaped_\"id_object"{foo} : "beef"

Apply to SDL, compressing bindings using /:
a_object/int = 4
a_object/string = "hello"
a_object/bar = true
a_object/int = -20
a_object/"escaped\"string" = "escaped\"string"
a_object/list[0] = 1
a_object/list[1] = 0
a_object/list[2]/b_list[0] = 4
a_object/list[2]/b_list[1] = "string"
a_object/list[3]/d_binding/e_binding/f_binding/e_list[0] = "string"
a_object/list[4]/"c_escaped_\"binding"/foo = "beef"
a_object/file.o = <file: /nfs/sc/disks/bmpvsta3/sid/b19/0be/09>
a_object/"dash-file-name.o" = <file: /nfs/sc/disks/bmpvsta3/sid/b19/0be/09>
a_object/closure = <Closure /vesta/vestasys.org/bridges/generics/6/build.ves, line 90, char 14>
a_object/root=<Model /vesta/vesta.com/platforms/linux/suse/i686/components/glibc/2.3.3-98.61/1/root.ves>
./includes/boost/type_traits/is_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/8d8/46b/f5>
./includes/boost/python/enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/b0f/600/cb>
./includes/boost/mpl/aux_/preprocessor/enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/b1f/861/d0>
./includes/boost/numeric/conversion/int_float_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/ba>
./includes/boost/numeric/conversion/sign_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/bd>
./includes/boost/numeric/conversion/udt_builtin_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/bf>
./includes/boost/numeric/conversion/int_float_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/ba>
./includes/boost/numeric/conversion/sign_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/bd>
./includes/boost/numeric/conversion/udt_builtin_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/bf>
./includes/boost/numeric/conversion/int_float_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/ba>
./includes/boost/numeric/conversion/sign_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/bd>
./includes/boost/numeric/conversion/udt_builtin_mixture_enum.hpp = <file: /nfs/sc/disks/bmpvsta3/sid/a7b/651/bf>
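
As a rough sketch of the SDL-style variant, the Python below uses dicts as stand-ins for bindings and joins names with /. Several things here are assumptions made for illustration only: the names flatten_sdl and quote_name, the rule that a name made solely of letters, digits, underscores and periods needs no quotes, and printing booleans as true/false as in the example above. Files, closures and models are left out.

```python
import re

# Assumption: names consisting only of these characters are printed unquoted.
PLAIN_NAME = re.compile(r"[A-Za-z0-9_.]+")

def quote_text(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

def quote_name(name):
    """Quote a binding name only when it is not a plain name."""
    return name if PLAIN_NAME.fullmatch(name) else quote_text(name)

def flatten_sdl(prefix, value):
    """Yield one 'full/path = value' line per leaf."""
    if isinstance(value, dict):        # stand-in for an SDL binding
        for name, child in value.items():
            yield from flatten_sdl(f"{prefix}/{quote_name(name)}", child)
    elif isinstance(value, list):      # stand-in for an SDL list
        for i, child in enumerate(value):
            yield from flatten_sdl(f"{prefix}[{i}]", child)
    elif isinstance(value, bool):      # checked before int: bool is an int subclass
        yield f"{prefix} = {'true' if value else 'false'}"
    elif isinstance(value, str):
        yield f"{prefix} = {quote_text(value)}"
    else:
        yield f"{prefix} = {value}"

# Hypothetical sample data, modelled on the example above.
data = {
    "int": 4,
    "bar": True,
    "list": [1, 0, {"b_list": [4, "string"]}],
    "dash-file-name.o": 7,
}
sdl_lines = list(flatten_sdl("a_object", data))
print("\n".join(sdl_lines))
```

The only difference from the JSON-style flattening is how each path step is rendered; the traversal is the same.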


Kenneth C. Schalk - 2008-08-14

Logged In: YES
user_id=304837
Originator: NO

For the most part, I like the proposal you're making. There
is one concern I have though, and it goes back to the issue
of lists. Consider this output from your example:

a_object/list[2]/b_list[1] = "string"

The problem I see is that it's not really consistent with
the SDL syntax. You can't use square brackets to index into
a list. In fact, the only way to index into a list is with
the _elem primitive function.

I'm concerned that this could be confusing for novice or
casual users.

Obviously you can't always simply copy and paste the printed
output of the evaluator into your SDL code. I'm not saying
that we have to make some new output meet such a high (and
probably unreasonable) standard. However, it seems to me
that it's usually pretty obvious which portions of the
current printed representation aren't valid in SDL code. I
feel like the output format you're proposing would make
that less so.

Of course maybe the answer to this conundrum could be to add
some syntax for looking up an element in a list by index. I
think we could even make square brackets do that if we
wanted. It adds a bit of context-sensitivity to the
language grammar, but we already have that with the way the
less-than sign is sometimes a comparator and sometimes the
start of a list.

Another possibility, that should be a little easier to
implement, would be to use parentheses:

a_object/list(2)/b_list(1)

The benefit of this choice is that it would require no
changes to the language parser. You could think of it as
treating a list like a function that takes the index as an
argument.


Greg Czajkowski - 2008-08-15

Logged In: YES
user_id=393776
Originator: YES

Hi Ken,

>>> The problem I see is that it's not really consistent with the SDL syntax

It really is not meant to be. Its main purpose is to aid in the debugging and development of extremely large projects. The output

>>> Obviously you can't always simply copy and paste the printed output of the evaluator into your SDL code.

Precisely, only part of the output of _print(.) can be dropped back into SDL. So why is _print(.) or _print(some_binding) used?

Is it used to visually log the results of binding operations, functions, and models? Yes.
Is it used to visually inspect bindings during debugging? Yes.
Is it used to copy the results somewhere else for the purpose of debugging? Partially.

When the results are files, not really. Not much can be done with such a result except to see that it exists:
enum.hpp=<file: /nfs/sc/disks/bmpvsta3/sid/b0f/600/06>,

Closures are useful for searching and "discovery" of models/bridges/functions, and are thus sometimes used during development:
a_object/closure = <Closure /vesta/vestasys.org/bridges/generics/6/build.ves, line 90, char 14>

The same goes for models:
a_object/root=<Model /vesta/vesta.com/platforms/linux/suse/i686/components/glibc/2.3.3-98.61/1/root.ves>

But in all these cases you will mangle and manipulate the results by hand.

>> However, it seems to me that it's usually pretty obvious which portions of the
>> current printed representation aren't valid in SDL code.
>> I feel like the output format you're proposing would make that less so.

The same is true for this output format. It's really meant for debugging and development, and for grepping, which works best on flattened output.

>> Of course maybe the answer to this conundrum could be to add
>> some syntax for looking up an element in a list by index.

That's not the purpose of this RFE. But yes, I was really surprised there was no language syntax for array lookups.

>> I think we could even make square brackets do that if we
>> wanted. It adds a bit of context-sensitivity to the
>> language grammar, but we already have that with the way the
>> less-than sign is sometimes a comparator and sometimes the
>> start of a list.

The brackets are a natural choice. Surveying C/C++/Perl/PHP/Python/Java/C#/etc., square brackets are clearly the natural choice for both new and veteran users.

>> Another possibility, that should be a little easier to
>> implement, would be to use parentheses:
>>
>> a_object/list(2)/b_list(1)

With all due respect, when I saw this, I wanted to cry. This is so ambiguous with function calls that it would cause much head scratching and confusion. Please don't consider this at all, and especially not just because it is simple to implement.

Furthermore, array accesses would be in a completely separate RFE and are not directly part of this one.

We have only indirectly wandered into a syntax for array access.

BTW, here is an example of the flattening my Python prototype is doing; grepping it results in this:

./root/.WD/boost_mpl_aux__preprocessor_enum.hpp/<file: /nfs/sid/aa7/7fb/10>
./root/.WD/boost_type_traits_is_enum.hpp/<file: /nfs/sid/822/601/86>
./root/.WD/boost_mpl_aux__preprocessor_enum.hpp/<file: /nfs/sid/aa7/7fb/10>
./root/.WD/boost_type_traits_is_enum.hpp/<file: /nfs/sid/822/601/86>
./root/usr/include/boost/type_traits/is_enum.hpp/<file: /nfs/sid/8d8/46b/f5>
./root/usr/include/boost/python/enum.hpp/<file: /nfs/sid/b0f/600/cb>
./root/usr/include/boost/mpl/aux_/preprocessor/enum.hpp/<file: /nfs/sid/b1f/861/d0>
./root/usr/include/boost/numeric/conversion/int_float_mixture_enum.hpp/<file: /nfs/sid/a7b/651/ba>
./root/usr/include/boost/numeric/conversion/sign_mixture_enum.hpp/<file: /nfs/sid/a7b/651/bd>
./root/usr/include/boost/numeric/conversion/udt_builtin_mixture_enum.hpp/<file: /nfs/sid/a7b/651/bf>
./root/usr/include/boost/numeric/conversion/int_float_mixture_enum.hpp/<file: /nfs/sid/a7b/651/ba>
./root/usr/include/boost/numeric/conversion/sign_mixture_enum.hpp/<file: /nfs/sid/a7b/651/bd>
./root/usr/include/boost/numeric/conversion/udt_builtin_mixture_enum.hpp/<file: /nfs/sid/a7b/651/bf>
./root/usr/include/boost/numeric/conversion/int_float_mixture_enum.hpp/<file: /nfs/sid/a7b/651/ba>
./root/usr/include/boost/numeric/conversion/sign_mixture_enum.hpp/<file: /nfs/sid/a7b/651/bd>
./root/usr/include/boost/numeric/conversion/udt_builtin_mixture_enum.hpp/<file: /nfs/sid/a7b/651/bf>
./root/usr/include/boost/preprocessor/seq/enum.hpp/<file: /nfs/sid/b0f/600/06>
./root/usr/include/boost/preprocessor/list/enum.hpp/<file: /nfs/sid/88a/6fc/c5>
./root/usr/include/boost/preprocessor/repetition/enum.hpp/<file: /nfs/sid/88a/6fc/f2>
./root/usr/include/boost/preprocessor/enum.hpp/<file: /nfs/sid/88a/6fc/87>
./root/usr/include/boost/serialization/level_enum.hpp/<file: /nfs/sid/a86/af4/ca>
./root/usr/include/boost/serialization/tracking_enum.hpp/<file: /nfs/sid/a86/af4/d8>
./root/usr/include/boost/serialization/level_enum.hpp/<file: /nfs/sid/a86/af4/ca>
./root/usr/include/boost/serialization/tracking_enum.hpp/<file: /nfs/sid/a86/af4/d8>
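
To illustrate why flattened output suits line-oriented searching, here is a small Python sketch that filters such lines with a regular expression and drops repeats. The helper names grep and unique are made up for this example, and the input is a handful of lines copied from the listing above.

```python
import re

# A few lines copied from the flattened listing above.
flattened = [
    "./root/usr/include/boost/python/enum.hpp/<file: /nfs/sid/b0f/600/cb>",
    "./root/usr/include/boost/preprocessor/enum.hpp/<file: /nfs/sid/88a/6fc/87>",
    "./root/usr/include/boost/serialization/level_enum.hpp/<file: /nfs/sid/a86/af4/ca>",
    "./root/usr/include/boost/serialization/level_enum.hpp/<file: /nfs/sid/a86/af4/ca>",
]

def grep(pattern, lines):
    """Return the lines that match the regular expression, in order."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]

def unique(lines):
    """Drop repeated lines while keeping first-seen order."""
    return list(dict.fromkeys(lines))

# Only files named exactly enum.hpp, not level_enum.hpp and the like.
hits = unique(grep(r"/enum\.hpp/", flattened))
print("\n".join(hits))
```

Because every line already contains the whole path, the filter needs no knowledge of how the original value was nested.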

This would be a good task for an intern: incorporating these ideas into a utility function that is capable of handling all types.


Kenneth C. Schalk - 2008-08-15

Logged In: YES
user_id=304837
Originator: NO

> That's not the purpose of this RFE. [...] Furthermore
> array accesses would be in a completely separate RFE and
> are not directly part of this one.

I didn't mean to muddy the waters with that digression. I
agree that if we decide syntax for array indexing is
something we should add to the language then it should be
covered in a separate RFE. However I think causing
confusion with this new output format is still a valid
concern and worth discussing.

I think it's pretty clear that there's benefit to the user
in being as close as reasonable to valid SDL syntax. If
similarity with SDL code was a non-issue, we could use any
format, such as one of the JSON-inspired examples you gave.

I'm not opposed to using the format you suggested with
square brackets for list indexes. The same convention
(which most programmers are familiar with) is actually used
by one kind of debug output produced by the evaluator (when
using the -dependency-check command-line switch). That
doesn't concern me as much because it's usually only used by
people working on modifying the evaluator (who would
obviously be very, very familiar with what is and isn't
valid in SDL). Here we're talking about something that
could become a new default for the output produced by the
_print primitive function and the -result switch. I think
that warrants considering what expectations it might create
in the minds of users.

Let's talk about a few more practicalities of implementing
this.

Which portions of the evaluator's output should use this new
output format? I think the _print primitive function and
the -result switch are the obvious choices, but there are a
lot of other places that use the same code. Error messages,
debug messages, and the report produced by the -printtimings
switch are all examples that include printed representations
of values. In some cases, I don't think this new format
would be appropriate. We can probably just leave it at
_print and -result for now and leave the other code
using the existing value printing code. (The alternative
would be to audit all the code that uses the existing value
printing functions to come up with a complete list and make
choices about which method should be used for each.)

Since I think we're going to retain the current value
printing code for some cases, we need some way to specify
whether to use the "classic" printed representation for
values or the new proposed one for _print and -result. I
don't think we need a command-line switch, so a vesta.cfg
setting is probably the right choice. (We can make the new
output format the default if the setting isn't present in
the config file.)

Inside the evaluator's code, I think this new output format
would have to be produced by a parallel set of functions on
the ValC class and its sub-classes. (It doesn't seem
advisable to try and combine it into the existing PrintD
functions.) This could be an overloaded version of the
existing PrintD function. This new format will require
passing a prefix path string into each level of recursive
call.
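
The shape of such a parallel set of functions might look like the
following Python sketch. The evaluator itself is not written in
Python, and every class and method name here (Val, IntVal, TextVal,
ListVal, BindingVal, print_flat) is a hypothetical stand-in rather
than the real ValC hierarchy. The point is only to show the prefix
path being extended at each level of recursion.

```python
import io

class Val:
    """Hypothetical base class standing in for the evaluator's value class."""
    def print_flat(self, out, prefix):
        raise NotImplementedError

class IntVal(Val):
    def __init__(self, n):
        self.n = n
    def print_flat(self, out, prefix):
        out.write(f"{prefix} = {self.n}\n")

class TextVal(Val):
    def __init__(self, s):
        self.s = s
    def print_flat(self, out, prefix):
        out.write(f'{prefix} = "{self.s}"\n')

class ListVal(Val):
    def __init__(self, elems):
        self.elems = elems
    def print_flat(self, out, prefix):
        # Each element extends the prefix with its index.
        for i, elem in enumerate(self.elems):
            elem.print_flat(out, f"{prefix}[{i}]")

class BindingVal(Val):
    def __init__(self, fields):
        self.fields = fields  # dict mapping name -> Val
    def print_flat(self, out, prefix):
        # Each field extends the prefix with its name; an empty
        # prefix means the name starts the path.
        for name, val in self.fields.items():
            val.print_flat(out, f"{prefix}/{name}" if prefix else name)

value = BindingVal({
    "int": IntVal(4),
    "list": ListVal([IntVal(1),
                     BindingVal({"b_list": ListVal([TextVal("string")])})]),
})
out = io.StringIO()
value.print_flat(out, "a_object")
result = out.getvalue()
print(result, end="")
```

Only the leaf classes write anything; the composite classes just
compute the longer prefix and recurse.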

I think this new set of functions will need to support the
existing "verbose" feature of _print and ValC::PrintD.
That's mainly an issue for text values, where it's used to
force the printing of a text value even if it's stored in a
file (i.e. reading the file and printing it to standard
output). We might want to suppress the verbose printing of
closures with this new output format.

I think we'll want to add a new optional argument to _print
which will be the prefix string used to represent the path
to the value being printed. For example, when you talk
about the example of "_print(.)", you seem to want "."
included in the path that's being printed. In the simple
case of "_print(some_variable_name)", I think we can
reasonably default the prefix. However, the argument to
_print can be an arbitrarily complicated expression. I
think the prefix should default to the empty string for the
non-variable case.
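
One way to picture that defaulting rule is the Python sketch below.
It is an assumption-laden illustration: the function name
default_prefix, the pattern used to recognize a simple variable
name, and the choice to treat "." like a variable name are all
made up here and would need to follow the real SDL grammar.

```python
import re

# Assumption: a "simple" argument is either "." or a plain identifier.
SIMPLE_NAME = re.compile(r"\.|[A-Za-z_][A-Za-z0-9_]*")

def default_prefix(expr_source, explicit=None):
    """Choose the path prefix for a flattened _print call.

    An explicitly supplied prefix always wins. Otherwise the
    argument text is used when it is a simple name, and the empty
    string is used for any more complicated expression.
    """
    if explicit is not None:
        return explicit
    text = expr_source.strip()
    return text if SIMPLE_NAME.fullmatch(text) else ""

print(repr(default_prefix(".")))
print(repr(default_prefix("some_variable_name")))
print(repr(default_prefix("f(x)/y")))
print(repr(default_prefix("f(x)/y", explicit="result")))
```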


Kenneth C. Schalk - 2008-08-15

labels: 693271 --> Evaluator