
General architecture questions.

  1. 2002-08-21 20:06:56 PDT
    How is BRiX going to deal with 'agents' -- that is, code that is transferred between CPUs? Without this, BRiX will be difficult (perhaps not impossible, but very difficult) to cluster.

    What about self-modifying code? It would seem to me that, by definition, BRiX would disallow a mainstay of the AI field, namely self-modifying code.

    And how about interpreted languages? Somebody will, inevitably, implement an interpreter for some language on BRiX, if it is at all possible and BRiX gets popular enough. Where do you see scripting languages fitting into the architecture?
  2. 2002-08-21 22:03:30 PDT
    I'm going to take a stab at the second and third questions.

    Self-modifying code is a bit of a misnomer. I think you're referring to recursive code generation, where each iteration generates more code to execute - if this is the case, the code can be generated as bytecode and then compiled. I don't think this would require administrator privileges the way inserting fresh machine code would. If you mean manual modification of the current code segment, that is hardly necessary. I can't think of any application where the recursive generation solution wouldn't be cleaner, and it wouldn't raise the security issue.

    As for interpreted languages, you're forgetting that crush is (in a broad sense) an interpreted language itself. Scripting languages work by adding a layer of abstraction over machine code in the virtual machine; so scripting languages would simply run over crush. I really don't see how brix could fail to support them.
  3. 2002-08-22 05:41:42 PDT
    Recursive code generation is one aspect, but it isn't the proper term. For example, in Ruby, you can write code that actually modifies the program itself. A good example is the "once" method: a method that is called once and generates a value will thereafter return only that value. The code is self-modifying in two ways. The first is the 'once' call, which rewrites the method to have this new behavior. The second is on the method call, when the modified code executes itself and then modifies itself to return the stored value instead of repeating the calculation. Usage, in pseudocode, is e.g.:

    class Foo
      def some_method
        # some really long calculation that may
        # never be called, but if it is, the
        # calculation only needs to be performed
        # once for this instance of the class.
        # after the first time, the once-generated
        # value can be returned.
      end
      once :some_method
    end

    "once" is just an example, but I would say that "self-modifying" is a more accurate term than "recursive code generation".
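
    The two rewrites described above can be sketched in plain Ruby. This is only a minimal illustration under stated assumptions, not the library's actual implementation: the `Once` module, the `runs` counter, and the placeholder calculation are hypothetical names added for the example.

```ruby
# Hypothetical sketch of a `once` macro (assumption: not the real library code).
module Once
  def once(name)
    original = instance_method(name)   # capture the original method body

    # First rewrite: replace the method with a wrapper.
    define_method(name) do
      value = original.bind(self).call # first call runs the real calculation

      # Second rewrite: on this instance, replace the method with one
      # that simply returns the stored value, with no test involved.
      define_singleton_method(name) { value }
      value
    end
  end
end

class Foo
  extend Once

  attr_reader :runs                    # hypothetical counter, for illustration

  def some_method
    @runs = (@runs || 0) + 1           # count how often the calculation runs
    6 * 7                              # stands in for the long calculation
  end
  once :some_method
end

foo = Foo.new
foo.some_method   # => 42 (calculated)
foo.some_method   # => 42 (stored value)
foo.runs          # => 1
```

    Note that this version redefines the method at the Ruby level only; whether that counts as self-modification of the machine code is a separate question.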

    WRT interpreted languages, and with all due respect, Crush isn't interpreted at all. It is compiled in two passes. The first generates bytecode --- but this is never "interpreted". It is then compiled to native code. I think the confusion is because of JIT compilers, which operate on interpreted bytecode. However, (and tell me if I'm wrong about this, but) Crush bytecode is /never/ executed as bytecode instructions. It is /always/ compiled to native binary before it is executed. This isn't done dynamically, either. I mean, applications aren't stored on the computer as bytecode, are they? Interpretation implies that, as you get each instruction of interpreted code, you translate and execute it.

    Thanks for the response!
  4. 2002-08-22 10:18:09 PDT
    I agree that the code generation i described is not true self-modifying code, but i'm going to pick at your example :). I'm not familiar with ruby, but from your description the once method doesn't actually need to change the code segment. It wraps the function call in an if-statement, testing a static boolean that is set after the call and returning the already-computed value if it triggers. That's a much better solution to me, but each to his own ;)

    My interpretation is that applications /are/ stored as bytecode on disk, the documentation is a little shaky there :). I'm willing to agree that Crush isn't interpreted, but i don't see any restrictions on creating a virtual machine for languages like java and brainf*ck.
  5. 2002-08-22 20:50:39 PDT
    > I'm not familiar with ruby, but from your description the
    > once method doesn't actually need to change the code
    > segment. It wraps the function call in an if-statement,
    > testing a static boolean that is set after the call and
    > returning the already-computed value if it triggers.

    Yes, it /could/ do it that way, but that would be inefficient, so it doesn't. The 'once' method actually rewrites the method so that the method returns the value computed after the first run. There is no 'if' branch involved. Branches are relatively expensive operations -- if I remember correctly, they're the most expensive single operations in an instruction set, more so because they tend to defeat pipelining look-ahead. So it is more efficient, and more elegant, to self-modify the code. Put another way: if you know in advance that the result is static after the first run, then isn't it a waste to execute a test-branch every time the method is called?

    WRT bytecodes, you may be correct; I don't know. I thought I had read that the compilation process went Crush source -> Bytecode -> Native code, and that the second translation wasn't done 'on the fly'. As it appears that you and I are the only active members of this list (or the only ones interested in this topic), I guess we'll never know. I'm certainly not going to wade through a bunch of pseudo-Scheme code to find out ;-).

    I can't see any obvious limitations for interpreted languages. I also can't see any obvious limitations on compilers for arbitrary languages that produce BRiX bytecode; I'm guessing that all of the security checking occurs during the second compile, where it would be most obvious. The one hitch would be the possible case that the BRiX bytecode doesn't support certain features that some languages, especially highly dynamic languages, require. Runtime linking, in particular, is a very useful construction. Loosely typed languages are very useful for a large number of applications. I've already mentioned the need for self-modifying applications.

    Anyway, the architectural goals of BRiX are excellent. It seems that Brand has gotten all of the right buzzwords from a variety of other experimental OSes.
  6. 2002-08-23 11:12:46 PDT
    If ruby does that, i'm not sure that's the best solution. It is faster on average for a very large number of function calls, but not by anything you could measure. The pipeline was already broken after the function call (i'm assuming it doesn't go through the entire code base and insert code after the first function call to eliminate the transfer of control), so you're looking at a few clock cycles. And you have to actually change the code, which costs more clock cycles. If the timing of the function is so critical, i think pure assembler would be a better language preference ;)

    As for limitations on higher compiled languages, i agree... (see my post about infix operators ;)) I don't know what crush will prohibit, so i hope the bytecode isn't very far above pure assembler. I guess we'll wait and see :)
  7. 2002-08-23 18:22:15 PDT
    Ruby is an interpreted language. The 'once' method is implemented in thirteen lines of Ruby code, and executes in microseconds (on the order of 9.7e-5). Therefore, if x() is sufficiently expensive (which is the premise of the original post), the expense of the internal code re-write is entirely masked by the first call of the method, and the difference between the first call of 'once' and 'if'-style methods is unnoticeable -- and 'once' is 22% faster than 'if' on all subsequent calls.

    There's another side-effect. With the 'if' method, you must define a global variable to hold the value of the result, which is not necessary with the 'once' method. (A global variable does get defined with 'once', but it is hidden). This makes the 'once' code more elegant, and more 'pure'. Here's the code. Accepting that Ruby syntax may be strange to you, tell me which you think is more elegant, simple, and intuitive.

    class OnceDesign
    def x # x() evaluates to 3
    3
    end

    once :x # Only call x() once
    end

    class IfDesign
    def x # Only calculate x if it hasn't been already calculated
    unless defined? @x
    @x = 3
    else
    @x
    end
    end
    end
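
    For anyone who wants to repeat the timing comparison, here is a rough harness. It is a sketch under assumptions: the `Once` module is a hypothetical stand-in for the real 'once' implementation, `time_calls` is a helper made up for this example, and the numbers it prints depend on the Ruby version and machine, so no particular speedup should be expected from it.

```ruby
# Hypothetical stand-in for `once` (assumption: not the library's code).
module Once
  def once(name)
    original = instance_method(name)
    define_method(name) do
      value = original.bind(self).call
      define_singleton_method(name) { value } # later calls return the stored value
      value
    end
  end
end

class OnceDesign
  extend Once

  def x # x() evaluates to 3
    3
  end

  once :x # Only call x() once
end

class IfDesign
  def x # Only calculate x if it hasn't been already calculated
    unless defined? @x
      @x = 3
    else
      @x
    end
  end
end

# Time `calls` invocations of obj.x, after one warm-up call so that
# only the "subsequent calls" path is measured.
def time_calls(obj, calls)
  obj.x
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  calls.times { obj.x }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

once_time = time_calls(OnceDesign.new, 100_000)
if_time   = time_calls(IfDesign.new, 100_000)
puts format("once: %.4fs  if: %.4fs", once_time, if_time)
```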
  8. 2002-08-23 18:23:47 PDT
    Crap. Sourceforge sucks. The indentations got stripped. Oh well.
  9. 2002-08-24 15:00:25 PDT
    I agree that using the built-in feature is prettier than the identical hack ;).

    If there are only benefits for the self-modifying code you've outlined, then of course i can see that's the best way and would recommend it. However, I'm not convinced that the self-modifying code is faster in machine code (i guess i could write a few test programs and actually compare... hmm sounds like a good paper :)). However, ruby is modifying its own bytecode, not the machine code that is finally run (unless i'm mistaken, again). The point is that generating the machine code and overwriting pre-written code both raise really huge red flags as possible viruses. I'm guessing code like that would need administrator 'special approval', and could you really tell whether a 25000 line program only modifies itself in a benign manner? I would rather eschew that technique for program security in non-system work, so that all users can install/use it. As for code already laden with assembler, the best solution should be used.

    A way to make us both happy: embed a ruby-like 'once' into the library... i would be interested in seeing how to implement that, it's a lot trickier than it first sounds. Still, that would prevent possible malicious code and give you your 22% speed boost ;)
  10. 2002-09-04 13:40:41 PDT
    <rant>
    Sourceforge sucks big wieners. The more I use it, the more I realize just how /much/ it sucks. I can't honestly think of a portal that sucks more than Sourceforge. The tremendous amount of good that Sourceforge has done the open source community is significantly less than the amount that it sucks.
    </rant>

    My point wasn't to get "once" into BRiX or Crush. "Once" is merely an example -- a proof that such a need exists, if you will -- of useful self-modifying code. My original point was that there exist situations that are best (most efficiently, most elegantly) solved with self-modifying code.
  11. 2002-10-03 00:45:51 PDT
    I don't know how clustering works but new code would have to be compiled before it could be executed. BRiX does let trusted functions compile code, but I would think that each machine in the cluster would have a copy of the code and there would be no need to do any dynamic compiling.

    Crush does not allow self-modifying code (SMC). How many real programs can you name that use SMC? None? SMC causes the instruction cache to be flushed, and the processor will cache the modified code in the data cache, which pushes out other data and wastes space that could be used by real data.

    Crush is mostly a compiled language but will have a JIT interpreter. It is also possible to write interpreters for other languages as long as the interpreter is written in Crush or one of its layers. The bytecode is a binary encoding of the Crush syntax and is only used to store and transport source code.

    The Ruby example you gave is not SMC; you could do the same thing in BRiX by porting the Ruby interpreter. Interpreters store information about each variable/object, and that is most likely how Ruby handles your example, which is not SMC. Ruby (probably) initially links the variable/object symbol to the function and, once it has been called, points the symbol to a data object that represents the result. The IfDesign approach is slower because Ruby has to process that code, whereas the built-in 'once' method has a special path in the interpreter.

    NOTE: underscores == indentation

    deftype foo ()
    __ slot exec-once bar:int (some code that returns an integer)

    Crush supports slot accessor functions that can be called whenever a slot is read from or written to. The exec-once flag could add a hidden boolean slot in the type and the code would be inserted into a DEFGET accessor that uses an IF to test the boolean and call the lambda or just return the value.
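
    As a rough analogy only, written in Ruby because it is the other language used in this thread, the hidden-boolean accessor could look something like the following. Nothing here is Crush syntax: `ExecOnce`, `exec_once`, the hidden variable names, and the `computations` counter are all hypothetical, and the block stands in for the lambda attached to the slot.

```ruby
# Ruby analogy of the exec-once / DEFGET idea (all names are hypothetical).
module ExecOnce
  def exec_once(name, &compute)
    flag  = "@__#{name}_done"   # hidden boolean "slot"
    value = "@__#{name}_value"  # hidden storage for the computed result

    # Generated getter: test the hidden boolean on every read, and
    # call the lambda only the first time.
    define_method(name) do
      unless instance_variable_get(flag)
        instance_variable_set(value, instance_exec(&compute))
        instance_variable_set(flag, true)
      end
      instance_variable_get(value)
    end
  end
end

class Foo
  extend ExecOnce

  attr_reader :computations     # hypothetical counter, for illustration

  exec_once(:bar) do
    @computations = (@computations || 0) + 1
    42                          # stands in for "some code that returns an integer"
  end
end

foo = Foo.new
foo.bar           # => 42 (computed)
foo.bar           # => 42 (stored value, flag already set)
foo.computations  # => 1
```

    Unlike a self-modifying version, the test on the hidden flag runs on every read, but no code is rewritten after it has been compiled.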

    The exec-once and DEFGET/IF approaches would both compile to the same machine code and require the same time to execute but exec-once would be easier to read. Both methods will run a hundred times faster than the Ruby equivalent because it's compiled and not interpreted.

    If I were to use SMC for the exec-once then the DEFGET code would execute the first time, modify itself to return the value instead of testing, the processor would flush its cache causing a major stall and then future accesses would be faster because there is no test.

    Your example was not a valid reason for using SMC but I will think about adding the exec-once flag.