DOSBox / Patches / #54 Hq2x patch

Qbix - 2004-05-20

Logged In: YES
user_id=535630

You need to submit the diff again;
uploading files while submitting is broken.

Sjoerd - 2004-06-14

Logged In: YES
user_id=153968

Needs an update for the current cvs.

`Moe` - 2004-07-05

Logged In: YES
user_id=1045474

I have updated the patch for today's CVS. The recent changes in the
scaler code made life a lot easier for me.
The hq2x code can now be considered true beta quality: it is supposed
to work stably and correctly in all cases; it just needs testing, as I can
only do surface rendering with 16 or 32 bpp.
(I still left some GCC-isms inside, sorry. Non-GCC users need two lines:

#define __attribute__(x)
#define __builtin_expect(x,y) x

)

There is also a small proposed patch to ymf262.c inside. According to
oprofile, it reduces CPU load of the opl2/3 code by about 50%. Stuff
commented out with "#ifdef SMALL_CACHE" might or might not
improve things on some CPUs, but I'm lacking definite numbers for
now.

Sjoerd - 2004-07-05

Logged In: YES
user_id=153968

Hmm, is there still any need to keep the MMX distance
calculation code? Since we'll probably only ever need 8bpp
input sources, you can probably always do it faster with a
distance lookup table.
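
For readers following the lookup-table idea: with 8bpp sources there are only 256 palette entries, so every pairwise comparison can be precomputed once per palette change. The sketch below is only an illustration, not code from the patch; `Rgb`, `ColorsDiffer`, `DistanceTable`, the YUV approximation and the threshold values are all assumptions.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>
#include <vector>

// One palette entry (8-bit RGB). The struct name is illustrative.
struct Rgb { std::uint8_t r, g, b; };

// Assumed per-channel thresholds in YUV space; the patch's values may differ.
constexpr int kThreshY = 48;
constexpr int kThreshU = 7;
constexpr int kThreshV = 6;

// Returns true when two colours differ enough to count as an edge.
inline bool ColorsDiffer(const Rgb& a, const Rgb& b) {
    // Cheap integer approximation of an RGB -> YUV transform (an assumption).
    auto y = [](const Rgb& c) { return (c.r + c.g + c.b) / 4; };
    auto u = [](const Rgb& c) { return 128 + (c.r - c.b) / 4; };
    auto v = [](const Rgb& c) { return 128 + (2 * c.g - c.r - c.b) / 8; };
    return std::abs(y(a) - y(b)) > kThreshY ||
           std::abs(u(a) - u(b)) > kThreshU ||
           std::abs(v(a) - v(b)) > kThreshV;
}

// Precomputed "do these two palette indices differ?" flags, one byte per pair.
class DistanceTable {
public:
    explicit DistanceTable(const std::array<Rgb, 256>& palette)
        : table_(256 * 256) {
        for (int i = 0; i < 256; ++i)
            for (int j = 0; j < 256; ++j)
                table_[i * 256 + j] = ColorsDiffer(palette[i], palette[j]) ? 1 : 0;
    }

    // Per-pixel cost in the scaler is a single table read.
    bool Differ(std::uint8_t a, std::uint8_t b) const {
        return table_[a * 256 + b] != 0;
    }

private:
    std::vector<std::uint8_t> table_;
};
```

One byte per pair costs 64 KiB; packing the flags into bits would shrink that to 8 KiB at the price of extra shifting and masking.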

`Moe` - 2004-07-05

Logged In: YES
user_id=1045474

Oops, made a few mistakes. The same patch again, minus one
segfault, plus one "it actually compiles" and with added non-gcc
workiness.

I am keeping the MMX code, as the distance table is quite big. Since I
blew up my p3 laptop shortly after getting it, I haven't yet found time to
see if SSE's mmx can do better than the current code. The MMX code is
close enough in runtime, so some of the newer unpacking instructions
may pay off. The reduced cache thrashing may well pay off (see my
small ymf262.cpp patch, same principle).
The mmx code is all #defined away at the moment, but I'd like it to stay
in there until I get a better devel box and can decide on facts.

`Moe` - 2004-07-05

Logged In: YES
user_id=1045474

Oh, and I am pondering an adaptive distance calculation to catch even
more edges. I have ideas how to solve that with the table, but it's quite
possible MMX will win even more in that case. I fear my devel box is too
slow for that, however, so don't hold your breath.

`Moe` - 2004-07-06

Logged In: YES
user_id=1045474

Another version, hopefully fixing the MS VS Net problem reported in
the forums.

`Moe` - 2004-07-07

Logged In: YES
user_id=1045474

Another version. Now supports aspect ratio correction. Looks ugly, but
some people seem to like that.

Sjoerd - 2004-07-10

Logged In: YES
user_id=153968

Hmm, the adlib patches might be nice too. You might also be
able to split each waveform into a table of 4 pointers, use the
highest 2 bits of the frequency as an index into it, and have
the pointers point to small pieces of the sine wave.
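
One way to read this suggestion, sketched under heavy assumptions: keep the sine as four quarter-period pieces plus one shared block of silence, and describe each waveform as four pointers into those pieces. Everything here (`SineParts`, `Waveform`, `Sample`, the 256-sample quarter length, linear sample values, a 10-bit phase) is hypothetical and chosen for readability; the tables in the real emulator core are laid out differently.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

constexpr int kQuarterBits = 8;                  // assumed: 256 samples per quarter
constexpr int kQuarterLen = 1 << kQuarterBits;

// The shared building blocks: four quarter-period pieces and one zero block.
struct SineParts {
    std::array<std::array<std::int16_t, kQuarterLen>, 4> quarter{};
    std::array<std::int16_t, kQuarterLen> silence{};  // all zeros

    SineParts() {
        const double kTwoPi = 6.283185307179586;
        for (int q = 0; q < 4; ++q)
            for (int i = 0; i < kQuarterLen; ++i)
                quarter[q][i] = static_cast<std::int16_t>(std::lround(
                    4095.0 * std::sin(kTwoPi * (q * kQuarterLen + i) /
                                      (4.0 * kQuarterLen))));
    }
};

// A waveform is four pointers, one per quarter of the period.
// The pointers refer into a SineParts object, which must outlive the Waveform.
using Waveform = std::array<const std::int16_t*, 4>;

inline Waveform FullSine(const SineParts& p) {
    return Waveform{{p.quarter[0].data(), p.quarter[1].data(),
                     p.quarter[2].data(), p.quarter[3].data()}};
}

// Positive half only: the second half of the period reuses the zero block.
inline Waveform HalfSine(const SineParts& p) {
    return Waveform{{p.quarter[0].data(), p.quarter[1].data(),
                     p.silence.data(), p.silence.data()}};
}

// The top 2 bits of the 10-bit phase pick the piece, the low 8 bits the sample.
inline std::int16_t Sample(const Waveform& w, std::uint32_t phase) {
    const std::uint32_t pos = phase & (4 * kQuarterLen - 1);
    return w[pos >> kQuarterBits][pos & (kQuarterLen - 1)];
}
```

Because several waveforms share the same small pieces, the working set stays small, which is the same cache argument made for the ymf262 patch.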

`Moe` - 2004-08-04

Logged In: YES
user_id=1045474

Here's the next version, for current CVS.

New features: adaptive threshold calculation. It now scans the
surroundings of the current pixel to find the maximum and minimum
difference, and then sets the actual edge detect threshold to some
average value (the exact averaging ratio is configurable). The old
(static) threshold variant is also present, as both together give
noticeably better results.
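
The adaptive part described above could look roughly like this. It is a minimal sketch, not the patch's code: the function name, the use of plain luminance values, the 3x3 neighbourhood and the percent-based ratio are all assumptions, and how the result is combined with the static threshold is left out because the comment does not say.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdlib>

// Picks an edge-detect threshold for one pixel from its 3x3 neighbourhood.
// `n` holds nine luminance values row by row, with the centre at index 4.
// `ratio_percent` (0..100) selects a point between the smallest and the
// largest centre-to-neighbour difference (0 = minimum, 100 = maximum).
inline int AdaptiveThreshold(const std::array<int, 9>& n, int ratio_percent) {
    assert(ratio_percent >= 0 && ratio_percent <= 100);
    int lo = std::abs(n[0] - n[4]);
    int hi = lo;
    for (int i = 1; i < 9; ++i) {
        if (i == 4) continue;  // skip the centre pixel itself
        const int d = std::abs(n[i] - n[4]);
        lo = std::min(lo, d);
        hi = std::max(hi, d);
    }
    return lo + (hi - lo) * ratio_percent / 100;
}
```

With a ratio of 50, a neighbourhood whose differences range from 10 to 100 yields a threshold of 55, so only the stronger half of the local contrasts would be treated as edges.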

See forums for user manual.

It can be turned off via a #define, but the CPU cost is fairly low
compared to the static algorithm. The mmx code is still inactive, but
the user reports in the forum indicate it could have a big effect on
newer CPUs. I hope to get at a bigger devel machine soon.

`Moe` - 2004-09-26

Logged In: YES
user_id=1045474

Just a quick update: Looks like I am finally getting a decent box, so
expect some performance improvements for newer CPUs (mine will
probably be an athlon64) in the near future.

Also, the longer I use the current patch (and compare it to stock hq2x
as found in, e.g., scummvm), the more I like it. I'm quite confident that
it won't change anymore (feature-wise). Hq3x/4x may appear, if they
are noticeably better than just using 640x400 fullscreen - I'll give that a
shot.

`Moe` - 2005-03-03

Logged In: YES
user_id=1045474

So here it is, the latest variant.

This time, the patch includes 3 things at once: The already well-known
software Hq2x implementation, the hardware OpenGL-HQ scaler and
16-bit VESA SVGA support.

I didn't break them up as they depend on each other.

Changes in Hq2x:
- it now follows the template-style of the other scalers, though it's still
in its own file (two actually) in order to get the important
320-pixel-source-width optimization
- it has become a little bit slower due to using the more generic pixel
depth conversion; previous code has been better optimized but less
flexible
- the 32-bit interpolation optimization has been added to all render
templates, see comment in render_templates.h
- the GCC way of marking conditionals as quite unlikely has been
added for all scalers, correctly #ifdef'ed
- threshold values are shared with the OpenGL-HQ code
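
The "#ifdef'ed" branch hints mentioned in the list above usually take a form like the following. This is a generic sketch, not an excerpt from the patch: the macro names `SCALER_LIKELY` and `SCALER_UNLIKELY` and the helper `CountChanges` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Use the GCC builtin where available, and fall back to the plain
// expression on other compilers so the code still builds there.
#if defined(__GNUC__)
#  define SCALER_LIKELY(x)   __builtin_expect(!!(x), 1)
#  define SCALER_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define SCALER_LIKELY(x)   (x)
#  define SCALER_UNLIKELY(x) (x)
#endif

// Counts positions where a pixel differs from its left neighbour.
// Neighbouring pixels are assumed to be equal most of the time, so the
// "differs" branch is hinted as the unlikely one.
inline std::size_t CountChanges(const std::uint8_t* row, std::size_t width) {
    assert(row != nullptr || width == 0);
    std::size_t changes = 0;
    for (std::size_t x = 1; x < width; ++x) {
        if (SCALER_UNLIKELY(row[x] != row[x - 1])) ++changes;
    }
    return changes;
}
```

The hint only affects code layout and branch prediction assumptions; the result is identical with or without it.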

About OpenGL-HQ:
- all OpenGL code has been broken out of sdlmain.cpp and placed
into sdl_opengl.cpp
- opengl rendering is threaded as some OpenGL calls seem to block
the process until the hardware is done, which defies the purpose of
hardware acceleration
- see the comment in sdl_opengl.cpp for some general information
- IMPORTANT: the code NEEDS SDL 1.3 (which is the current CVS),
as it uses the new platform-independent render targets; if you don't
have SDL 1.3 installed, the OpenGL-HQ code is automatically
disabled; the traditional OpenGL modes continue to work, of course
- rendering is done in 3 passes; I've tried hard to reduce that to just 2
passes, but joining pass 1+2 is impossible on my hardware, and
joining 2+3 is slower than doing 3 passes, as pass 2 uses the source
resolution and pass 3 runs at the destination resolution.

About 16-bit VESA SVGA:
- I have absolutely no idea what I did. It's really just cut'n'paste with
some educated guessing.
- It works fine. I can finally play Schleichfahrt, even speed is
sort-of-acceptable.
- It is untested with anything else.
- OpenGL output will try to exploit hardware 16 bit support.
- adding SBPP=16 required small changes to the scaler templates
- I haven't added anything but normal scaler yet, I see no point in
others, as programs using 16bpp probably need a lot of performance
themselves
- if you own something >4GHz and think you could spare some cycles
for scaling, OpenGL-HQ is probably your best option

The code can be considered stable. I've been using it for quite a while
and haven't found any need for modifications.

If anything breaks, send patches.

Have fun!

`Moe` - 2005-03-04

Logged In: YES
user_id=1045474

Just an additional comment: You can apply the patch without SDL 1.3;
you will still get the updated (better-looking) Hq2x and the 16-bit VESA
support. OpenGL-HQ will silently be disabled, and multithreaded
rendering won't happen either.

Allustar - 2005-03-10

Logged In: YES
user_id=1039189

hey moe,

When I tried to build with the standard libSDL 1.2.8 that I've
been using, the build got stuck because certain OpenGL
functions are missing from 1.2.8 or differ in the 1.3.x
branch. So in order to build with OpenGL at all, you must
have the libSDL 1.3.x branch; otherwise the build fails.

/Ieremiou

`Moe` - 2005-03-13

Logged In: YES
user_id=1045474

Thanks for the report. I'll go and debug it on windows soon, resolving the crash reported in the forums and the 1.2 problem.

`Moe` - 2005-04-16

Logged In: YES
user_id=1045474

Another month goes by, another version is out.

Changes:
- applies to latest CVS
- compiles and runs on windows
- some minor bugs fixed
- moved all GUI calls into render thread for openglhq (windows is less forgiving than X11)
- compiles with SDL-1.2 (no openglhq, of course)
- VESA fix and extension by wd
- still untested on nvidia (the whole world around me seems to have ATI...)

I've also attached my win32 build including SDL-1.3 dll. Creating your own in mingw is trivial: Fetch the sources from SDL CVS (you have to use the branch option to get the 1.3 branch), optionally search for the directx fix in the forums, ./configure; make; make install.

`Moe` - 2005-04-16

Logged In: YES
user_id=1045474

sourceforge seems to have problems with file uploads. Get the files here:

http://garni.ch/~jwalt/dosbox-openglhq.diff [179k]

http://garni.ch/~jwalt/dosbox-openglhq-win32.zip [4.3M]

(sorry, seems like I forgot to strip the binary ;)

`Moe` - 2005-04-18

Logged In: YES
user_id=1045474

Uploads seem to work again, so here is the patch, in two different variants:

If you are a user who wants to compile his own full-featured version, use dosbox-fullhq.diff. It includes hq2x software-scaling, opengl-hq hardware-scaling and VESA 16bit support.

If you are a dosbox maintainer (hello qbix ;) ), you can use the three separate patches and integrate the ones that you already like.

As I've written before, the three patches overlap, so you can't apply all three without conflicts; that's why I provide the all-in-one patch. I will update the remaining patches if some of the code is included in CVS.

Today's patches already contain some cosmetic fixes and a possible fix for the VESA issue reported in the forums. They no longer contain the adlib optimization; I will open a separate item for that.

`Moe` - 2005-06-29

Logged In: YES
user_id=1045474

Here are two new patches:
- hq2x in a much simpler variant (the actual output has not changed), better suited for inclusion in CVS; for example, experience with the current trigger code has shown that key bindings or config file settings are not needed anymore
- VESA 16bit in a slightly updated variant from forum feedback

The OpenGL-HQ patch has officially been discontinued. The SDL version is much better (simpler, faster, looks the same). It now includes a patch to have it enabled via dosbox.conf. Get it at http://garni.ch/Software/dosbox/

`Moe` - 2005-06-29

hq2x simple version

dosbox-hq2x-20050629.diff

`Moe` - 2005-06-29

VESA VBE 16-bit support

dosbox-vesa16bit-20050629.diff
