From: Paulo E. C. <pau...@gm...> - 2013-07-16 17:29:22
On 16/07/13 16:39, doug sanden wrote:
> Paulo,
>> I can confirm that the fixtures generated seem different per machine...
>> Even when testing on two similar "Linux" architectures the fixtures
>> seemed to differ.
>> I think this might be down to a combination of library versions, graphics
>> cards etc ...
> I have bit differences in 20% of comparisons on a single win32 platform,
> but they are scattered across the image -not header related- and larger
> color spreads on textured objects.
> That's why I use the graphicsMagick mean square error measure.

Ah, btw, I noticed that when running the test in the console it's best if
you're not interacting physically with it, as it seems to generate
different images here and there.

>> Further, I commented out the code that was writing the header of the
>> bmp file and that seemed to generate more stable results across runs.
>> At least I don't have any bit differences between runs.
> The header issue is puzzling to me. I should have zeroed the header and put only useful, stable numbers in it. There are a few gotchas that I didn't look into: struct alignment -if a compiler pads a struct to an even 8 bytes and I write the struct out as a binary blob, I'll get padding, which is bad- and intel vs motorola byte ordering -I'm not sure how to fix that.
>
> 64 BIT VS 32 BIT, INTEL VS MOTOROLA
>
> The padding can be fixed by writing each element of the struct out separately. And do I have the right data types to avoid 64-bit ints? One way to tell is to try the graphicsmagick or gimp readers on each platform and see if they can read my header generated on the various platforms, assuming gimp and graphicsmagick have the right idea.
> (Right now I have only 32-bit hardware and compilers, and opengl 2 only on one machine)

I mentioned before I was having trouble opening the .bmp files being
generated. Not even graphicsmagick would do the trick.
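Both gotchas doug mentions -struct padding and intel vs motorola byte order- go away if each header field is written out separately in an explicit byte order. A minimal sketch in Python (illustrative only, not FreeWRL's actual C snapshot code): `struct.pack` with the `<` prefix emits little-endian fields, which is what .bmp requires, and never inserts compiler padding.

```python
import struct

def write_bmp_header(f, width, height, bpp=24):
    """Write BITMAPFILEHEADER + BITMAPINFOHEADER field by field.

    The '<' prefix forces little-endian and disables alignment padding,
    so the 54-byte header is bit-identical on any platform, 32- or
    64-bit, intel or motorola.
    """
    row_size = (width * (bpp // 8) + 3) & ~3      # bmp rows pad to 4 bytes
    image_size = row_size * height
    # BITMAPFILEHEADER (14 bytes): magic, file size, reserved, pixel offset
    f.write(struct.pack('<2sIHHI', b'BM', 54 + image_size, 0, 0, 54))
    # BITMAPINFOHEADER (40 bytes): every field fixed-width and explicit,
    # nothing left to the compiler
    f.write(struct.pack('<IiiHHIIiiII',
                        40, width, height, 1, bpp,
                        0,                        # BI_RGB, uncompressed
                        image_size,
                        2835, 2835,               # ~72 DPI in pixels/metre
                        0, 0))
```

Readers like gimp or graphicsmagick can then be used to sanity-check the output on each platform, as doug suggests.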
But honestly I was not really interested in opening them; the point was
just to prove that one could generate the fixtures and test them for
comparison, even if restricted to the same machine they were generated on.

>> I used a different approach though ...
>>
>> First I commented out that time reduction hack so that I could record
>> things smoothly.
>> https://github.com/pecastro/freewrl/commit/4926a80062179dd1223296537eb45bf04b9a17eb
>>
>> Using recording mode, I manually created one master .fwplay whilst
>> moving in the scenegraph back and forth, rotating etc., and in between
>> each move taking some manual snapshots.
>> This master file was created using freewrl/tests/2.wrl
>> https://github.com/pecastro/freewrl/commit/044850835cc7d56e4f6c2b52360485134620e4d3
>>
>> Then I used this .fwplay file to test all of the .wrl/x3d files.
>> I iterate over the list of .wrl files to be tested and for each one I
>> amend the .fwplay file, substituting the scenefile for the current
>> file being tested.
>> I then run freewrl in playback mode for that specific .wrl file.
>> I kept all the headerless .bmp files generated and committed them to a
>> local branch.
>>
>> Then I built freewrl from scratch, ran exactly the same procedure of
>> running the script in playback mode, and in the end it was a matter of
>> asking my source control system if any of those .bmp files had changed.
> Great idea.
> So summary: this method is good for checking geometry rendering changes in detail, but doesn't allow for scene-specific mouse or keyboard input -except see below^- such as clicking something.
> What it's good for:
> a) inspecting scene rendering in finer detail
> b) using only one fwplay, with high-fidelity avatar movements, to view each scene using the same avatar movements for each scene

Yes, for scene-specific testing I suppose the methodology of having a
.fwplay per test would be preferable...
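The amend-and-replay loop described above might be scripted roughly like this. Both the .fwplay grammar assumed here (the scene named as a bare path on one line) and the `--playback` flag are guesses for illustration; the real recording format and freewrl's actual command line may differ.

```python
import re
import subprocess
from pathlib import Path

def amend_fwplay(fwplay_text, scene_path):
    """Swap the scene file named in a master .fwplay recording.

    Assumes -- hypothetically -- that the recording names its scene as a
    bare path on a line of its own; the real .fwplay grammar may differ.
    """
    return re.sub(r'^\S+\.(?:wrl|x3d)\s*$', str(scene_path),
                  fwplay_text, count=1, flags=re.MULTILINE)

def run_suite(master_fwplay, scene_files, freewrl='freewrl'):
    """Replay the one master recording against every scene under test.

    The snapshots dropped during playback become the fixtures; afterwards
    source control reports whether any of them changed.
    """
    text = Path(master_fwplay).read_text()
    for scene in scene_files:
        Path('current.fwplay').write_text(amend_fwplay(text, scene))
        # '--playback' is an assumed flag name, not freewrl's documented CLI
        subprocess.run([freewrl, '--playback', 'current.fwplay'], check=True)
```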
The point of this exercise was to prove the possibility and understand the requirements.

> ^It might also be very helpful when you have one giant test scene with everything in it -or everything that's of interest for your development changes- and you use just that one scene for testing. In that case you can do clicking and keyboarding for that scene. For example we have 52 tests, and you could put them all side by side in a single scene file, then navigate the avatar between them during a test run.

I think so, though I'm not familiar with how you'd accomplish that.

> Q. if you have 100 test scenes, how long does it take to run the test on all of them?
> (the degraded-frame-rate, one-fwplay-per-small-scene method takes 3 minutes to playback and compare 104 scenes on a 32 bit pentium)

Testing against all the freewrl/tests files takes ~30 min. On a headless
machine, that is, testing in a remote session, it would take up to 3
hours, as the rendering is done in software rather than by the graphics
card.
The time the test takes to run is of course related to the quantity of
things the .fwplay is doing. Also, the biggest chunk of time is spent
initializing a new instance of freewrl for each test file.
The real point of this was to rehearse the possibility of having a
headless machine running the test suite in a Continuous Integration
style. Hence, the time it takes to run the tests is not necessarily
important right now; the tests could run on a machine somewhere,
independent of local development, smoke-testing changes as they're
committed to the repository. This kind of testing is more like a safety
net against nefarious changes that would impact the perceived visual
rendering.
As a side note, my headerless bmp files differ between console snapshots
and remote-session snapshots.

> Thanks for the link. And the ideas. I wonder if the two methods can be combined somehow, perhaps as options / parameters, so developers can conveniently choose.
Doug, you were right when you said in reply to John that "This type of
testing won't say if something is right or wrong, only if something
changed."
This is just black-box testing and it's far away from any form of unit
testing... but that said, I do think it's better than nothing.

My suggestion for an approach going forward is the following:

Pick the most iconic .wrl/x3d files in terms of testing features
(movement, texture, spatial orientation, whatever), as I'm assuming that
some features are common amongst many of the test files.
Create a specific .fwplay file for each, using an extended version of the
methodology that I used (exercising the various modes, moving about
whilst stopping to take snapshots at points in time).
Use those .fwplay files to generate the fixtures: either correct BMPs, or
the headerless BMPs that only care about the pixel frame, or striking the
BMP file format altogether and just writing the pixel representation to a
file, which actually sounds similar to what commenting out the header was
achieving.
Once there's an agreed .fwplay for each of the most iconic .wrl files,
fixtures could be generated on the different platforms by running
playback mode, and committed or kept somewhere for reference.
Then it would be a matter of running the test script, which would go
about picking each .fwplay in playback mode, generating the snapshots
and, in the end, comparing them.

CONS:
- The overall size of the fixtures: 6 headerless bmp snapshots times 58 test files equals roughly 160MB on my machine (Fedora Linux).
- Not yet 100% sure of the feasibility of this kind of testing vs O.S. updates, package/library updates or other minor changes in freewrl.
- Not entirely sure this is the route you'd want to go as a team...

PROS:
- Some assurance against future changes.
- Most of it seems to be done; now it's just a matter of gluing the pieces together.
- Some automated testing rather than manual.
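The final "compare them" step need not go through source control at all. A byte-for-byte sketch (the directory layout and .bmp naming here are made up) that reports which committed fixtures no longer match a fresh playback run:

```python
import filecmp
from pathlib import Path

def changed_fixtures(reference_dir, fresh_dir):
    """List fixture .bmp files that differ from freshly generated ones.

    Byte-for-byte comparison, i.e. the same answer source control gives
    when asked whether the committed snapshot files changed.
    """
    changed = []
    for fixture in sorted(Path(reference_dir).glob('*.bmp')):
        candidate = Path(fresh_dir) / fixture.name
        if not candidate.exists() or not filecmp.cmp(fixture, candidate,
                                                     shallow=False):
            changed.append(fixture.name)
    return changed
```

An empty result means the rendering survived the change; any name in the list is a scene worth eyeballing.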
From: doug s. <hig...@ho...> - 2013-07-16 18:19:08
>> That's why I use the graphicsMagick mean square error measure.
> Ah, btw, I noticed that when running the test in the console it's best if
> you're not interacting physically with it, as it seems to generate
> different images here and there.

I noticed that too, on linux, and wondered if the mouse cursor steals a bit, and maybe it shows up as the low bit. If so, can you move the mouse cursor off before hitting the 'x' snapshot? The graphicsmagick mse measure shows 0.0 if it's just a few dozen low-order bits.

>>
>> 64 BIT VS 32 BIT, INTEL VS MOTOROLA
> I mentioned before I was having trouble opening the .bmp files being
> generated. Not even graphicsmagick would do the trick.
> But honestly I was not really interested in opening them; the point was
> just to prove that one could generate the fixtures and test them for
> comparison, even if restricted to the same machine they were being
> generated in.

If you -or developers who use your method- switch to the graphicsmagick mse, you'll need a proper image file format with a header (to give gm the height x width etc).

> As a side note, my headerless bmp files differ between console snapshots
> and remote session snapshots.

Interesting. I wonder what the difference is.

> Doug, you were right when you said in reply to John that "This type of
> testing won't say if something is right or wrong, only if something
> changed."
> This is just black box testing and it's far away from any form of unit
> testing... but that said, I do think it's better than nothing.
> My suggestion for an approach going forward is the following:
>
> To pick the most iconic .wrl/x3d file in terms of testing features (
> and committed or kept somewhere for reference.
> ...
> Then it would be a matter of running the test script that would go about
> picking each .fwplay in playback mode, generating the snapshots and in
> the end compare them.
OK, sounds generally good, although I don't have any comprehensive/iconic
scenes prepared - mostly hundreds of small test files (which I'd need to
spend days sorting through to find good, orthogonal ones). And we do make
improvements. Different developers will have their own special test
scenes, so developers need a convenient way to locally re-generate the
fixtures for their special/private scenes.
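For reference, the mean square error measure behaves exactly as described above. The sketch below is the standard normalised MSE formula in Python, not GraphicsMagick's own code: a handful of flipped low-order bits (e.g. mouse-cursor artifacts) produces a value that rounds to 0.0, while a genuine rendering change stands out.

```python
def mean_square_error(pixels_a, pixels_b, max_value=255):
    """Normalised MSE between two equal-length pixel sample buffers.

    0.0 means identical; 1.0 means every sample is maximally different.
    A few dozen low-bit flips contribute so little to the sum that the
    result rounds to 0.0, which is why the measure tolerates cursor noise.
    """
    if len(pixels_a) != len(pixels_b):
        raise ValueError('images differ in size')
    scale = len(pixels_a) * float(max_value) ** 2
    return sum((a - b) ** 2 for a, b in zip(pixels_a, pixels_b)) / scale
```

This is the same idea `gm compare -metric mse` reports, which is why gm needs a readable header: without height x width it cannot line the two sample buffers up.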