I've added some support for HUD and menu scaling in the software renderer for HoT, and during my adventures I've come across some code I have some unanswered questions about.
Just some screenshots of what I did.
Anyway, the first thing is a curiosity regarding some of the draw code for transparent images. This is from draw.c:
void Draw_TransPic (int x, int y, qpic_t *pic)
{
	byte	*source, tbyte;
	int	v, u;

	if (x < 0 || (x + pic->width) > vid.width ||
	    y < 0 || (y + pic->height) > vid.height)
	{
		Sys_Error("%s: bad coordinates", __thisfunc__);
	}

	source = pic->data;

	if (r_pixbytes == 1)
	{
		byte *dest = vid.buffer + y * vid.rowbytes + x;

		if (pic->width & 7)
		{	// general
			for (v = 0; v < pic->height; v++)
			{
				for (u = 0; u < pic->width; u++)
				{
					if ((tbyte = source[u]) != TRANSPARENT_COLOR)
						dest[u] = tbyte;
				}
				dest += vid.rowbytes;
				source += pic->width;
			}
		}
		else
		{	// unwound
			for (v = 0; v < pic->height; v++)
			{
				for (u = 0; u < pic->width; u += 8)
				{
					if ((tbyte = source[u]) != TRANSPARENT_COLOR)
						dest[u] = tbyte;
					if ((tbyte = source[u+1]) != TRANSPARENT_COLOR)
						dest[u+1] = tbyte;
					if ((tbyte = source[u+2]) != TRANSPARENT_COLOR)
						dest[u+2] = tbyte;
					if ((tbyte = source[u+3]) != TRANSPARENT_COLOR)
						dest[u+3] = tbyte;
					if ((tbyte = source[u+4]) != TRANSPARENT_COLOR)
						dest[u+4] = tbyte;
					if ((tbyte = source[u+5]) != TRANSPARENT_COLOR)
						dest[u+5] = tbyte;
					if ((tbyte = source[u+6]) != TRANSPARENT_COLOR)
						dest[u+6] = tbyte;
					if ((tbyte = source[u+7]) != TRANSPARENT_COLOR)
						dest[u+7] = tbyte;
				}
				dest += vid.rowbytes;
				source += pic->width;
			}
		}
	}
	else	/* r_pixbytes == 2 */
	{
		// FIXME: pretranslate at load time?
		unsigned short *dest = (unsigned short *)vid.buffer + y * (vid.rowbytes >> 1) + x;
		// FIXME: transparency bits are missing
		for (v = 0; v < pic->height; v++)
		{
			for (u = 0; u < pic->width; u++)
			{
				tbyte = source[u];
				if (tbyte != TRANSPARENT_COLOR)
				{
					dest[u] = d_8to16table[tbyte];
				}
			}
			dest += vid.rowbytes >> 1;
			source += pic->width;
		}
	}
}
What does r_pixbytes refer to? Does Hexen II ever use images with an r_pixbytes value equal to 2?
In the r_pixbytes == 1 case, why are there //general and //unwound cases, and is the //unwound case ever used?
As far as I could tell, using vanilla Hexen II resources, the only case used is the //general case (where r_pixbytes is 1), which would imply much of my modified code is entirely untested and perhaps even unnecessary.
r_pixbytes is pixel bytes: Hexen II currently supports only 8 bit output, which is r_pixbytes == 1. For 16 bit color, r_pixbytes is 2, and for 32 bits r_pixbytes is 4. Some additional effort is necessary for 16 and 32 bit color depth. As for the general case where the width is not aligned to 8, I honestly do not know, never tested it myself: an easy way to find out is just adding a debug print in there and observing. However, I bet that most images' widths are multiples of 8, therefore I'd expect the unwound case to be used mostly.
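For example, a throwaway print in each branch would settle it. This is only a rough sketch against the Draw_TransPic code quoted above; Con_Printf is the engine's usual console print, but any debug output would do:

	if (pic->width & 7)
	{	// general
		Con_Printf("Draw_TransPic: general path, width %d\n", pic->width);
		/* ... existing per-pixel loop ... */
	}
	else
	{	// unwound
		Con_Printf("Draw_TransPic: unwound path, width %d\n", pic->width);
		/* ... existing unrolled loop ... */
	}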
When you complete your code, I'd like to see it.
Cool. I thought r_pixbytes was for 8-bit and 16-bit modes, but seeing as Hexen II, as you said, only supports 8-bit output, I figured I could be mistaken.
I'll provide the changes I've made once I'm done. Because what I've done is in shared code, it's probably not currently compatible with the OpenGL renderer; I've simply not attempted to compile it yet. I have made use of the vid.conwidth and vid.conheight values (which I believe is what the OpenGL version uses for scaling), so I think it should be easy enough to fix.
I've also increased the maximum resolution available to the software renderer, and in doing so I think I may have uncovered a bug. Seeing as my code is modified, I decided not to report it as an actual bug. I've been getting an access violation on line 329 of d_sprite.c.
"pz" is a buffer created based on the screen resolution, and it's used when rendering 2D sprites. The crash occurs when a sprite is rendered extremely close to the player's view (most noticeable when you exit a teleporter). I think I've stopped it occurring with a simple band-aid solution (I added a bound check, when it fails break out the while loop). I was wondering if this is known already.
No, the access violation you describe isn't known to me.
You say that you hit the issue only when you increase the maximum resolution by changing MAXHEIGHT and MAXWIDTH in r_shared.c? If yes, I can try that and debug it sometime. Note that, if you do change those macros, you must make corresponding changes in the asm code too (but the access violation you report is in non-intel C-only code, so I assume you are compiling with asm disabled, yes?)
Yeah, all I changed to increase the maximum resolution was MAXHEIGHT to 1200 and MAXWIDTH to 1600 in r_shared.c (it was the very first thing I did, and I was sincerely surprised it was that easy). I'm using the h2-noasm project from the VS 2010 solution as I'm compiling it for 64bit Windows. I did attempt a fix by doubling the size of the buffer, but eventually the crash occurred again. So far my band-aid has prevented further crashes with no apparent visual errors.
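In case it helps, the whole resolution change really was just the two macros (going by the 1280x1024 maximum mentioned later, those are presumably the stock values being replaced):

/* r_shared.c -- presumably 1280 / 1024 in the stock source */
#define	MAXWIDTH	1600
#define	MAXHEIGHT	1200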
I'll have a look at the assembly, make the changes and see how a 32bit build works.
Can you give me a save so that I can try to reproduce the segfault myself?
https://dl.dropbox.com/u/29280841/s11.zip
Simply keep going through the teleporters. It's not frequent, I'd say roughly a 1 in 25 chance of it happening.
Ok, here is my modified source https://dl.dropbox.com/u/29280841/hexen2source-1.5.4m.zip
I think the best way to find the full list of changes would be to diff it against the 1.5.4 source.
I added _Scaled functions to draw.c and gl_draw.c (though the ones in gl are just duplicates that work anyway); I also made changes to sbar.c, menu.c and screen.c. This was all to get the HUD and menus to scale.
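To give an idea of the approach (a sketch only, with an illustrative name rather than the exact code from my patch): each _Scaled routine stretches the pic by nearest-neighbour from conwidth/conheight coordinates up to the real resolution, for the 8-bit path.

	/* Sketch of a nearest-neighbour scaled transparent pic draw.
	   Draw_TransPicScaled is an illustrative name; bounds checking and
	   the r_pixbytes == 2 path are omitted. */
	void Draw_TransPicScaled (int x, int y, qpic_t *pic)
	{
		int	sw, sh, sx, sy, u, v;
		byte	*dest, *src, t;

		sw = pic->width  * vid.width  / vid.conwidth;	/* on-screen size */
		sh = pic->height * vid.height / vid.conheight;
		sx = x * vid.width  / vid.conwidth;		/* on-screen position */
		sy = y * vid.height / vid.conheight;
		dest = vid.buffer + sy * vid.rowbytes + sx;

		for (v = 0; v < sh; v++, dest += vid.rowbytes)
		{
			src = pic->data + (v * pic->height / sh) * pic->width;
			for (u = 0; u < sw; u++)
			{
				t = src[u * pic->width / sw];
				if (t != TRANSPARENT_COLOR)
					dest[u] = t;
			}
		}
	}

Doing that arithmetic per pixel is also part of why it ends up so slow in practice.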
Other small changes: the maximum resolution change, WARPSIZE has been changed from 320x200 to 640x400, particles are smaller (by about half), the console appears and disappears much quicker, the status bar will appear between level loads (looks better imo), the palette no longer has forced black and white entries (this caused errors when palette translations are applied while you're under water/lava), and the band-aid solution to the segfault.
It's entirely possible there are other differences I simply don't remember. I started doing this entirely for my own benefit and didn't think I'd be sharing anything, so I've not been documenting anything. I hope this is useful to you.
Thank you very much. It is really good that the community is helping with coding. I will review this and merge the changes (I have a few things to sort out first, though.)
As for the segfault, I was able to reproduce it only once on linux, but I haven't managed to do it while running under the debugger yet. Will post back here.
It's a pleasure, nice to know what I've done can benefit people other than just me.
Things to note: I only made changes to vid_win.c; I assume similar changes would have to be made to the SDL equivalents.
As I said earlier, I've used vid.conwidth and vid.conheight, but I've not exposed them as cvars accessible from the console or menus like they are in the OpenGL version. I believe you could have them behave exactly as they do in that version. They're also hard coded to 320 and 200 when the resolution is set in vid_win.c.
Turns out you don't have to make any changes to the assembly files for the higher resolutions to work. Obviously the particle size change is only in the no-asm builds. Though a "shr eax, 1" in the right place makes it work for the assembly too.
For the MAXWIDTH/MAXHEIGHT change, I pushed the necessary asm fixes to svn as of rev. 4901: http://uhexen2.svn.sourceforge.net/viewvc/uhexen2?view=revision&revision=4901
As for the segfault, I'm still getting erratic results on linux, so I couldn't debug it yet. Out of curiosity: do you get any segfaults if you don't change MAXWIDTH/MAXHEIGHT and run at the max resolution of 1280x1024? (The question is based on the assumption that you get the segfault only in 1600x1200 when you change MAXWIDTH/MAXHEIGHT to 1600/1200.)
I will attempt running it at 1280x1024 as you've suggested and report back.
Perhaps my earlier statements were a little misleading. I did change MAXWIDTH/MAXHEIGHT to 1600x1200, but I actually play at 1440x1080, and that's the resolution I was using when the crashes occurred. I did test that 1600x1200 worked, but I didn't use it for long enough to observe any segfaults (I have a 1920x1080 screen, but I can still use taller resolutions).
Ok, unmodified 1.5.4 no-asm build on 64bit Windows at 1280x1024 and I still get the segfault. Guess it has nothing to do with my changes then. Anything else I can try?
OK, this is as I suspected: the code has something funky somewhere; where it is I don't know yet. How the pz pointer actually manages to overrun is what I am very much curious about.
The only other thing to try, I think, is to see whether any of the previous versions such as 1.5.3 or 1.5.0 have the same issue.
So far both 1.5.3 and 1.5.2 also get the segfault.
What I have observed, after running a debug build, is that izi also has what could be a bizarre value when the segfault occurs. I think in normal operation it's positive, but when the segfault occurs it's negative. It's possible that izi being negative is normal, but I'm wondering if a massively positive value is wrapping around to negative. izi isn't the only variable with a negative value at the point of the segfault. I'm also quite confused about the values of spancountminus1 being reported.
As a side note, the segfault also occurs in windowed mode at 800x600.
All the 1.5.x versions gave me the crash. The only older source (1.2.2c) seems like it only has linux build files. I'm not sure if my linux install is set up to compile this, but I could try.
1.2.x doesn't have 64 bit support, 1.4.2 and newer does, and win64 support begins with 1.4.4 (unreleased) or 1.5.0: saying this because you say that you are compiling for win64, and the issue probably manifests itself more easily with 64 bit builds (I have yet to test the issue under x86_64-linux..) I removed any source older than 1.5.0 from downloads, but I can give you any version if you want: just tell me.
I think I may have found something useful.
I put a watch on pspan->v and pspan->u and noticed that sometimes pspan->v is negative, but it doesn't always crash. Considering this: when pspan->v is negative, memory allocated before d_pzbuffer is potentially accessed. In most cases this will just get garbage, but in other cases it results in an access violation, especially if the pointer becomes negative (which I think results in some kind of overflow and an address way outside of the running program).
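For context, the per-span setup in D_SpriteDrawSpans computes pz from pspan->v, which is why a negative v lands in front of the buffer (quoting from memory, so treat the exact line as approximate):

	/* per-span setup in D_SpriteDrawSpans (paraphrased): with a negative
	   pspan->v this points before the start of d_pzbuffer */
	pz = d_pzbuffer + (d_zwidth * pspan->v) + pspan->u;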
It seems pspan->v is less than 0 when ymin is less than 0. So changing the code in D_DrawSprite (void) so that ymin can't go below zero will prevent the access violation. Is that a good solution though?
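In other words, the idea is just to keep ymin from going negative before the spans are generated, something like this (sketch only; the exact line I changed may differ):

	/* in D_DrawSprite, before the sprite spans are built (sketch) */
	if (ymin < 0)
		ymin = 0;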
Not sure about it being a good solution, but adding a print in there shows that I do hit negative ymin in my x86-linux builds. I will try understanding the thing and will possibly merge your solution if I can't come up with anything better.
It seems like I can't hit a negative ymin upon teleporting when I set chase_active to 1, so far. I also couldn't hit a negative ymin when I compiled using clang-3.1 yet, but builds made with gcc-3.4 or 4.5 do give me -1 or -2 ymin values. (Do you hit the crash using the stock hexen2-1.5.4-win64.zip, which was built using gcc-4.5?)
Yes, I also get the segfault with hexen2-1.5.4-win64.zip. I think the reason you don't get the crash when using the chasecam is because the teleport effect is rendered at the player origin. The segfault isn't specific to teleporting, it just happens to be a consistent way to get it. I've also experienced the crash after being shot by an archer.
Changing the optimization settings for the MS compiler results in different negative values coming out. I got a -4 at one point. Originally I thought the problem was in how the don't-draw cases are handled, but now I think it's a problem with whatever is constructing the sspan_s structs.
OK, I notified Steven and Sander about this issue to have their opinion.
In the meantime, can you please create a bug entry in our bug tracker with as much detail as possible as noted here, along with your patch suggestion: this way, the bug will be tracked more easily and won't get lost (because the actual topic of this thread is different.) Thanks.
No problem, bug report made.
Speaking of the original topic: my scaling implementation is actually painfully slow. I think if one were to implement such a thing for the official version of Hammer of Thyrion it would have to be faster, perhaps using an off-screen buffer which is scaled to the full screen resolution once all the 2D elements are drawn.
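Something along these lines is what I have in mind (a sketch only; draw2d_buffer is a made-up name, and a real version would want fixed-point stepping instead of per-pixel divides):

	/* Sketch: draw all 2D elements into an off-screen buffer at
	   conwidth x conheight, then stretch it once per frame onto the real
	   framebuffer with nearest-neighbour sampling. */
	void Draw_Blit2DBuffer (const byte *draw2d_buffer)
	{
		unsigned int	x, y;
		const byte	*src;
		byte		*dst, t;

		for (y = 0; y < vid.height; y++)
		{
			src = draw2d_buffer + (y * vid.conheight / vid.height) * vid.conwidth;
			dst = vid.buffer + y * vid.rowbytes;

			for (x = 0; x < vid.width; x++)
			{
				t = src[x * vid.conwidth / vid.width];
				if (t != TRANSPARENT_COLOR)	/* keep the 3D view visible underneath */
					dst[x] = t;
			}
		}
	}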