I have read about the HaikuVM and if I understand it correctly the HEX file to be flashed on Arduino contains HaikuVM-interpreter optimized for the Java main program (which is also flashed alongside the HaikuVM-interpreter).
If I am right, my question is:
Is there a possibility to build a HEX file with full-featured (non-optimized) HaikuVM-interpreter for a loader program, which then starts the main program either from EEPROM or MMC/SDC ?
This of course implies another question: Is there a file at some point in the build process that represents only the main program, i.e. without the HaikuVM-interpreter, or is there a simple way to split the final binary into 2 parts (HaikuVM-interpreter + main-program code)?
Thanks & Best Regards,
Oifan
yes, you are right. And I think you have a valid understanding of the
HaikuVM architecture.
1) As I tried to explain here: http://haiku-vm.sourceforge.net/#[[Your%20Project%20on%20Disk%20and%20at%20Runtime]]
all HaikuVM C code derived from your JAVA main-program only, is
bundled into the directory "utility". The rest of the C code is for
the HaikuVM-interpreter. As a consequence, all compiled object code of
your Java program, is bundled into the directory
"target/cross/HaikuVM/utility". It should be possible to make one lib
(a *.a file) out of this (let's call it main-program-lib) and another
lib from the rest (let's call it interpreter-lib).
2) I know how to produce these two libs but I have no idea how to make
two separate HEX files from these two libs. Let's call them
main-program-HEX file and interpreter-HEX file. (I have no deep
understanding of the possibilities of the AVR tool chain.)
3) Further on, I have no idea how to upload two different HEX files
into an AVR at the same time. (I have no deep understanding of the
possibilities of avrdude.)
4) It should be easy to produce (and freeze) a "full-featured
(non-optimized)" interpreter-lib, if you first write an artificial
JAVA main-program which uses every feature of HaikuVM (e.g. Threads,
Exceptions, type double and so on).
5) Also, it should be possible to patch any main-program-HEX file to
match the entry points of the given interpreter-HEX file using a/the
map file.
6) It's harder to solve when it comes to EEPROM or MMC/SDC, because AVRs
use a Harvard CPU architecture. In this case the
HaikuVM-interpreter has to be rewritten to access C-structs from other
memory segments.
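To illustrate point 6, here is a minimal Java sketch of the idea that the interpreter would fetch bytecode through an accessor rather than through a direct pointer, so the backing store (flash, EEPROM, SD-card) can be swapped. The names `CodeSource`, `ArrayCodeSource` and `FetchDemo` are hypothetical and not part of HaikuVM; the real interpreter is written in C, and the big-endian operand order is an assumption.

```java
/** Hypothetical abstraction: where the interpreter reads bytecode from. */
interface CodeSource {
    /** Returns the byte at the given address as a value in 0..255. */
    int readUnsignedByte(long address);
}

/** In-memory stand-in for a flash image; an SD-card or EEPROM source would implement the same interface. */
final class ArrayCodeSource implements CodeSource {
    private final byte[] image;

    ArrayCodeSource(byte[] image) {
        this.image = image.clone();
    }

    @Override
    public int readUnsignedByte(long address) {
        return image[(int) address] & 0xFF;
    }
}

final class FetchDemo {
    /** Reads a 16-bit operand (big-endian is an assumption) without knowing the memory type. */
    static int readU16(CodeSource src, long address) {
        return (src.readUnsignedByte(address) << 8) | src.readUnsignedByte(address + 1);
    }

    public static void main(String[] args) {
        CodeSource flash = new ArrayCodeSource(new byte[] {0x12, (byte) 0xAB});
        System.out.println(Integer.toHexString(readU16(flash, 0)));
    }
}
```

Every fetch then goes through one call, which is also where the speed cost of slower memory would show up.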
Ok, I gave no solution for your wish but my answer is hopefully of
some help for you.
BTW, I'm curious to know, why you are interested in HaikuVM?
Kind regards
Bob
You see, the main reason I even started to care about AVR bytecode interpreters (especially for ATmega328p) is my plan to write a quite complex program for it (Arduino Pro Mini board costs cca. $2.90 on ebay), which should be able to use lots of different modules (Onewire-compatible thermometers, humidity-meters, IR-sensors, relays, RTC, bluetooth, SD-card, proximity detectors, ...).
The problem is that with each new module (maybe except relay module) I have to add another #include in my main program (I use ArduinoIDE). This increases the size of the binary that should be flashed. I fear that soon the 32kB won't be enough. And if I want to add some diagnostics output strings, then it definitely won't be enough.
So shortly after realizing this obstacle I had this cool idea - what about interpreted languages - like Basic, or so --> interpreting the main program from SD-card (FAT16 has size limit 2GB)?
Of course, it is slow to interpret plain text, interpreting bytecode is much faster - so I searched for AVR bytecode interpreters. I found these 4:
HaikuVM - http://haiku-vm.sourceforge.net/ --> the fastest (110000 Instr/s on 16MHz), takes Java-6 source as input, highly optimized - but interpreter and main-program are always glued together into 1 binary
I hope this explains everything.
Best Regards,
Oifan
ok, interpreting the main program from SD-card is a completely different thing. In this case, I fear, HaikuVM will not reach 110000 Instr/s on 16MHz. (And I will need your help for this project.)
For now, by design, running on an n-bit architecture, HaikuVM main programs are limited to a size of 2^n bytes. For an ATmega328p n is 16, which is far below 2GB. (In contrast, for an x86-based computer, n is 32.)
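The size limit can be checked with a few lines of arithmetic. This is only an illustrative sketch; the class name `AddressSpaceDemo` is made up for the example.

```java
final class AddressSpaceDemo {
    /** Largest program size in bytes that an n-bit address can cover. */
    static long maxProgramBytes(int pointerBits) {
        return 1L << pointerBits;
    }

    public static void main(String[] args) {
        long avrLimit = maxProgramBytes(16);          // 65536 bytes = 64 KiB
        long fat16Limit = 2L * 1024 * 1024 * 1024;    // the 2GB FAT16 limit mentioned in the thread
        System.out.println("16-bit limit: " + avrLimit + " bytes");
        System.out.println("FAT16 limit is " + (fat16Limit / avrLimit) + " times larger");
    }
}
```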
Even if we manage to solve it, using lots of different modules (Onewire-compatible thermometers, humidity-meters, IR-sensors, relays, RTC, bluetooth, SD-card, ...) without pre-built C libraries means writing the drivers from scratch for JAVA/HaikuVM.
Again, I gave no solution for you but my answer hopefully shed some light on the architecture of HaikuVM.
I'm curious to know if you have experience with one of:
- EmbedVM
- HaikuVM
- NanoVM
- SBI
- uJ
?
Kind regards
Bob
it would be totally fine if the interpreting speed would be lower - let's say 60-80 instructions per ms.
The main benefit of this MMC/SD-card solution would be that the programs would not be limited by size - ok, maybe this 64kB limit, but this could be solved by splitting the main program into several classes and each of these would be compiled to separate bytecode-file (probably each of those will contain its Constants & Exceptions table and its Microkernel ?) - at least this way the standard JRE works - each class-file acts like a DLL (or Linux SO) that can be called from other class-files and also call other class-files within class-path.
The only problem / challenge I see here is the call-stack size - it needs to contain fully qualified class names (i.e. package + class name) - probably we should set some limit for the maximum length of that fully qualified name - let's say 32 chars.
I am willing to provide some help within my free time (I work 8 hrs / day for a big IT company) - I work mostly in Java.
The libraries I mentioned (that I used in Arduino-IDE) usually contain only some boilerplate code for initializing the modules - some constants, reading/writing specific values to specific I/O pins with specific timing - this should be easily portable to Java (and then compiled to byte-code).
Regarding all the VM solutions for ATmega I mentioned earlier, I have no practical experiences, since I am in the process of choosing the right VM for my needs - I only contacted the author of EmbedVM (I asked him how difficult it would be to add floating numbers + arithmetic -- it turned out it would mean a huge change, almost full rework of EmbedVM).
Yesterday I found also some functional language for ATmega - I think it has some potential - http://concurrency.cc.
Currently I am deciding between NanoVM and HaikuVM - but I like HaikuVM better, since it supports Exceptions and it's faster.
Best Regards,
Oifan
I was also thinking about inter-class calls in HaikuVM - I guess if everything is put into 1 binary, all Java methods can be addressed with 2 bytes (for binaries <= 64kB); but when splitting the application into multiple binaries (classes) placed on a MMC/SD card on a FAT16, a more complex reference system is required.
I would suggest that in each class each method will get an index (0..255) when compiled - I think 256 methods per class are sufficient. In the class header I would put a list of all these methods in this format:
1. method_address (2B) -> where the method bytecode starts within this class,
2. method_HASH_signature (9B)
In the calling class I would reference that method as follows:
1. method_index (1B) -> determines method's signature position in the header of the containing class,
2. method_HASH_signature (9B)
The method_HASH_signature will look like this:
1. method_name_and_modifiers_HASH (4B)
2. method_args_count (1B)
3. method_argTypes_and_retType_HASH (4B)
This 9-byte method_HASH_signature would more or less ensure that binary classes from 2 different builds will be binary compatible - so the same method_HASH_signature would be in the calling class - if these 2 method_HASH_signatures don't match, a NoSuchMethodError would be thrown.
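The 9-byte layout described above can be sketched in Java as follows. The thread does not name a hash function, so CRC32 is purely an assumption here, as are the class name `MethodHashSignature`, the way the strings are joined before hashing, and the big-endian byte order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

final class MethodHashSignature {
    static final int SIZE = 9; // 4B name hash + 1B arg count + 4B type hash

    /** 32-bit hash of a string. CRC32 is an assumption, not something HaikuVM prescribes. */
    static int hash32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return (int) crc.getValue();
    }

    /** Builds method_HASH_signature for one method. */
    static byte[] encode(String modifiers, String name, String[] argTypes, String retType) {
        if (argTypes.length > 255) {
            throw new IllegalArgumentException("method_args_count must fit into 1 byte");
        }
        ByteBuffer buf = ByteBuffer.allocate(SIZE);
        buf.putInt(hash32(modifiers + " " + name));                       // method_name_and_modifiers_HASH
        buf.put((byte) argTypes.length);                                  // method_args_count
        buf.putInt(hash32(String.join(",", argTypes) + "->" + retType));  // method_argTypes_and_retType_HASH
        return buf.array();
    }

    /** Compares the caller's copy with the callee's header entry. */
    static void check(byte[] callerSig, byte[] calleeSig) {
        if (!Arrays.equals(callerSig, calleeSig)) {
            throw new NoSuchMethodError("method_HASH_signature mismatch");
        }
    }
}
```

A caller would store `encode(...)` next to its 1-byte method_index, and the loader would run `check` against the entry at that index in the callee's class header.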
What do you think about this approach, would this be a good idea to implement in HaikuVM ?
Best Regards,
Oifan
I really like your thoughts. And I like your list of other VMs. I will put them as reference into the "Links" page of the HaikuVM home-page (if not already there).
Opening HaikuVM up to program sizes of more than 64k in 16-bit mode looks like a long way to go. And your statement about "free time (I work 8 hrs / day for a ... IT company) - I work mostly in Java" is true for me as well. ;-)
For now, my idea is to attack the 64k limit in these stages:
1) Stay with the 64k limit and stay with eager linking but try to decouple the interpreter from the user code.
2) Decouple even further by allowing the user code to stay on another memory segment/bus (e.g. EEPROM, SD-card).
3) Let a JAVA/HaikuVM user program browse the EEPROM or SD-card. Let it select an arbitrary HaikuVM-file and run it as new/next and independent HaikuVM user program.
4) Introduce lazy binding by using method signatures to overcome the 64k limit.
BTW, have you had some time to let HaikuVM run on your Atmega328p?
Kind regards
Bob
no unfortunately I didn't have time for HaikuVM, in the meanwhile I had to solder & test the 2 BT modules (http://www.ebay.com/itm/180923667447 + http://www.ebay.com/itm/141151348468) to give feedback - they work nicely, I was able to change the pairing PIN and baud-rate. And today I have also tested 2 humidity sensors DHT11 on my Arduino - they work well, too :-)
Currently I am in Germany (I live in Slovakia), but I brought the ATmega328p + SD-card + USB-ASP with me so I can play with it in the evenings - I would like also to try out the HaikuVM - probably something SD-card related :-)
Best Regards,
Oifan
last weekend I finally had enough time to play with HaikuVM - it is really interesting. I have built (using haiku.bat) around 10 examples (and discovered a missing @NativeCFunction in haikuVM\examples\src\main\java\arduino\tutorial\JNINativeLIBC.java, method sin(double)) and I have analyzed a bit the structure of the final binary that is to be flashed:
table of constants
table of exceptions, table of functions
bytecode - HaikuVM and user methods
bytecode labels
native code - JVM functions
I have also tested how the HaikuVM handles more than 256 methods in a class - it creates a bytecode (1 byte) instruction for around 158 of them, others are called using INVOKESTATIC instruction + absolute method address (1+2 bytes) - it is very optimized for size.
This brings me to this conclusion: Introducing multiple classes (each in its own file) and the method reference mechanism I proposed earlier in this thread would have a serious impact on bytecode execution speed (when calling a method from another class - there would have to be a mechanism to translate the method name to absolute address within the class file) - so I would leave the method reference mechanism as it is now - all classes and their methods should remain within a single file.
Considering the abovementioned things, I would propose these changes:
switch from 2-byte addressing to 4-byte addressing (to create and execute bytecode files > 64 kB),
reserve some byte-code prefix range for 2-byte JVM instructions (e.g. e0...ff would represent the 1st byte of a 2-byte instruction, so there could be 8192 user/system methods called using a 2-byte instruction) - calls using INVOKESTATIC instruction would occupy 1+4 bytes,
a full-featured non-optimized HaikuVM would be created (and flashed to ATmega328p), that would read the SD-card pin number (pin CS/SS) (1 byte) and the name of bytecode file in root of SD-card (max. 13 characters, can be a null-terminated string) from EEPROM,
each time user builds a Java project using haiku, it will only create the bytecode for this project which will use the full-featured HaikuVM,
there should be a utility to overwrite the SD-card pin number and the name of bytecode file (in root of SD-card) in EEPROM.
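The 2-byte call encoding from the list above (prefix bytes e0...ff followed by one more byte) could look roughly like this. It is a sketch of the proposal only, not of existing HaikuVM behaviour; `PrefixedCallCodec` and its method names are invented for the example.

```java
final class PrefixedCallCodec {
    static final int PREFIX_BASE = 0xE0;                                  // first prefix byte of the proposed range
    static final int MAX_METHODS = (0xFF - PREFIX_BASE + 1) * 256;         // 32 prefixes * 256 = 8192

    /** Encodes a method index (0..8191) as a 2-byte instruction. */
    static byte[] encode(int methodIndex) {
        if (methodIndex < 0 || methodIndex >= MAX_METHODS) {
            throw new IllegalArgumentException("method index out of range: " + methodIndex);
        }
        return new byte[] {
            (byte) (PREFIX_BASE + (methodIndex >> 8)), // high 5 bits go into the prefix byte
            (byte) methodIndex                         // low 8 bits go into the second byte
        };
    }

    /** True if the byte starts a 2-byte call instruction. */
    static boolean isPrefix(byte first) {
        return (first & 0xFF) >= PREFIX_BASE;
    }

    /** Recovers the method index from the two instruction bytes. */
    static int decode(byte first, byte second) {
        return (((first & 0xFF) - PREFIX_BASE) << 8) | (second & 0xFF);
    }
}
```

Methods beyond index 8191 would still need the longer INVOKESTATIC form with a 4-byte address.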
So what do you think about these proposed changes, do they seem appropriate / sufficient ?
Best Regards,
Oifan
Last edit: Oifan 2014-05-16
yesterday I have just tested the SD-card read speed using default Arduino SD-card & FAT16 libraries.
The speed of reading large files (from 256MB and 2GB SD-cards) using a 64-byte buffer was 180-190 kiB/s.
I guess the average JVM instruction is 2B long, so this gives circa 95 000 instructions per second - this is of course the ideal case with no seeks.
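The estimate above works out as follows. The 2 bytes per instruction figure is the guess stated in the post, not a measured value, and `SdThroughputEstimate` is just a name for this sketch.

```java
final class SdThroughputEstimate {
    /** Upper bound on instructions per second when every instruction is streamed from the card. */
    static long instructionsPerSecond(double kibPerSecond, int bytesPerInstruction) {
        return Math.round(kibPerSecond * 1024 / bytesPerInstruction);
    }

    public static void main(String[] args) {
        // Measured read speed was 180-190 kiB/s; average instruction length assumed to be 2 bytes.
        System.out.println(instructionsPerSecond(180, 2)); // 92160
        System.out.println(instructionsPerSecond(190, 2)); // 97280
    }
}
```

Both ends of the range bracket the "circa 95 000" figure, and both ignore seeks and interpreter overhead.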
But maybe these Arduino SD-card & FAT16 libraries are not optimized for speed - have you tried to implement some SD-card access in HaikuVM ?
Best Regards,
Oifan
sorry for my late reply but I'm currently very busy (until July).
Your proposals are really interesting. I have to think about them. Either way, I will need some help for this, because until today I have no experience with reading from SD-card. I need time for learning. Stay tuned.
I already shared these thoughts but got no direct response from you:
1) Stay with the 64k limit and stay with eager linking but try to decouple the interpreter from the user code.
2) Decouple even further by allowing the user code to stay on another memory segment/bus (e.g. EEPROM, SD-card).
3) Let a JAVA/HaikuVM user program browse the EEPROM or SD-card. Let it select an arbitrary HaikuVM-file and run it as new/next and independent HaikuVM user program.
4) Introduce lazy binding by using method signatures to overcome the 64k limit.
So, what do you think?
Kind regards
Bob
I too have lots of stuff going on now, so no problem.
I would like to respond to your 4 points:
@1,@4: The main reason for me to even bother with a VM is the possibility to execute a (byte)code OUTSIDE the built-in flash to overcome the size limit, so 32 or 64 kB is no significant difference for me. And if the resulting binary grows beyond 64 kB we definitely need a JVM JUMP instruction that takes a DWORD argument (instead of currently used WORD argument) - of course, they can be used in parallel - there can be short-jump (using WORD, to jump within the current 64-kB block) and long-jump (using DWORD, to jump elsewhere).
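A rough Java sketch of how short and long jumps could coexist is shown below. The opcode values, the big-endian operand order and the class name `JumpDecoder` are all assumptions made for illustration; HaikuVM's actual opcode table is not shown in this thread.

```java
final class JumpDecoder {
    // Hypothetical opcode values, chosen only for this example.
    static final int OP_JUMP_SHORT = 0x01; // WORD operand, stays inside the current 64-kB block
    static final int OP_JUMP_LONG = 0x02;  // DWORD operand, absolute target

    /**
     * Computes the jump target.
     * insn holds the instruction bytes (opcode first), pc is the absolute address of the instruction.
     */
    static long target(byte[] insn, long pc) {
        int op = insn[0] & 0xFF;
        if (op == OP_JUMP_SHORT) {
            long operand = ((insn[1] & 0xFFL) << 8) | (insn[2] & 0xFFL);
            return (pc & ~0xFFFFL) | operand; // keep the block number, replace the low 16 bits
        }
        if (op == OP_JUMP_LONG) {
            return ((insn[1] & 0xFFL) << 24) | ((insn[2] & 0xFFL) << 16)
                 | ((insn[3] & 0xFFL) << 8) | (insn[4] & 0xFFL);
        }
        throw new IllegalArgumentException("not a jump opcode: " + op);
    }
}
```

With this split, code that never leaves its 64-kB block pays only 3 bytes per jump, and only cross-block jumps need the 5-byte form.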
@3: Browse-ability of a JAVA/HaikuVM program is nice, but from a practical point of view it would be nice to have some non-interactive way to specify the location of the user bytecode - I would suggest the following fallback:
1. file autorun.txt in the root of FAT16 of SD-card
2. null-terminated string starting at address 0x0001 on EEPROM
Both locations would specify filename within bin directory of the SD-card (containing the bytecode).
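The fallback order above could be sketched like this, using a directory on the host as a stand-in for the SD-card root and a byte array as a stand-in for the EEPROM. `BytecodeLocator` and its methods are hypothetical names; treating an unterminated string (e.g. erased EEPROM full of 0xFF) as "not configured" is an added assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

final class BytecodeLocator {
    static final int EEPROM_NAME_OFFSET = 0x0001;

    /** First choice: file name stored in autorun.txt in the card's root directory. */
    static Optional<String> fromAutorun(Path sdRoot) throws IOException {
        Path autorun = sdRoot.resolve("autorun.txt");
        if (!Files.isRegularFile(autorun)) {
            return Optional.empty();
        }
        String name = new String(Files.readAllBytes(autorun), StandardCharsets.US_ASCII).trim();
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    /** Second choice: null-terminated string starting at EEPROM address 0x0001. */
    static Optional<String> fromEeprom(byte[] eeprom) {
        int end = EEPROM_NAME_OFFSET;
        while (end < eeprom.length && eeprom[end] != 0) {
            end++;
        }
        if (end == EEPROM_NAME_OFFSET || end >= eeprom.length) {
            return Optional.empty(); // empty name, or no terminator found
        }
        return Optional.of(new String(eeprom, EEPROM_NAME_OFFSET,
                end - EEPROM_NAME_OFFSET, StandardCharsets.US_ASCII));
    }

    /** Resolves the bytecode file inside the card's bin directory, trying autorun.txt first. */
    static Optional<Path> locate(Path sdRoot, byte[] eeprom) throws IOException {
        Optional<String> name = fromAutorun(sdRoot);
        if (!name.isPresent()) {
            name = fromEeprom(eeprom);
        }
        return name.map(n -> sdRoot.resolve("bin").resolve(n));
    }
}
```

If neither source yields a name, `locate` returns an empty result and the loader would have to stop or fall back to an interactive browser.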
Best regards,
Matej
Hi Bob,
this weekend I was still looking for a suitable VM for ATmega328p - it seems some people have done nice work with Forth - see newer http://sourceforge.net/projects/amforth/ or older http://www.avrfreaks.net/index.php?module=Freaks%20Academy&func=viewItem&item_id=626&item_type=project .
But I am afraid that the speed of Forth will be much lower than that of HaikuVM or NanoVM.
Best Regards,
Oifan