Letting the server work in floating point?

2021-02-08
2021-02-10
  • David Kastrup

    David Kastrup - 2021-02-08

    Hi,

    I've taken a cursory look at ways to speed up server operation,
    and one thing that I thought relevant is that if the Opus libraries are
    not compiled using FIXED_POINT (FIXED_POINT would seem like a bad idea
    on typical general-purpose architectures and indeed does not seem
    enabled), then natively the Opus libraries deal with floating point and
    explicitly convert to fixed point when using the fixed point API.

    Now it turns out that floating point is a lot better suited to handling
    mixing (particularly once one involves AVX extensions) because it deals
    a lot more gracefully with temporary and permanent overflow and also has
    special SIMD instructions available that could greatly speed up
    operation.

    The obvious disadvantage is that a "natural" .wav dump when recording
    would end up double the size it already has. But letting the wav
    recorder reduce to int would seem like a sensible option, assuming that
    queuing up the floats does not end up slower than converting and queuing
    up the shorts.
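
    A minimal sketch of the reduction step the recorder would need, assuming
    float samples in the nominal range [-1.0, 1.0]. FloatToInt16 is a
    hypothetical helper for illustration, not an existing Jamulus function:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert float samples (nominally in [-1.0, 1.0])
// to 16-bit PCM. Out-of-range values saturate instead of wrapping around.
std::vector<std::int16_t> FloatToInt16(const std::vector<float>& in)
{
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (float s : in)
    {
        // Clamp first, then scale to the symmetric 16-bit range.
        const float clamped = std::max(-1.0f, std::min(1.0f, s));
        out.push_back(static_cast<std::int16_t>(std::lrint(clamped * 32767.0f)));
    }
    return out;
}
```

    Whether converting before or after queuing is cheaper would have to be
    measured; the sketch only shows that the conversion itself is a simple
    per-sample loop.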

    At any rate, at least GCC (and there may be a reasonable expectation for
    servers that they are compiled with GCC) offers a deluge of options for
    compiling using AVX and similar intrinsics in a manner where the ELF
    executables will pick the best version at runtime. So even in a binary
    distribution, it's feasible to use stuff that may not be available for
    all targeted platforms.
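
    One of those options is GCC's target_clones function attribute, which
    builds several versions of a function and selects one when the program
    is loaded. A sketch, assuming GCC on x86-64 Linux with glibc; AddScaled
    and MIX_CLONES are hypothetical names, and on other toolchains the macro
    expands to nothing so that a single plain version is built:

```cpp
#include <cstddef>

// Assumption: GCC on x86-64 Linux with glibc (the attribute relies on
// ifunc support). Anywhere else, fall back to one ordinary function.
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__) && \
    defined(__linux__) && defined(__GLIBC__)
#define MIX_CLONES __attribute__((target_clones("avx2", "avx", "default")))
#else
#define MIX_CLONES
#endif

// Hypothetical mixing kernel: acc[i] += gain * in[i].
// The loop is plain enough for the compiler's auto-vectorizer when
// optimization is enabled; each clone is vectorized for its own target.
MIX_CLONES
void AddScaled(float* acc, const float* in, float gain, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        acc[i] += gain * in[i];
    }
}
```

    Callers just call AddScaled; they do not need to know which clone was
    picked.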

    Has anybody experimented with converting at least the server-side
    operation to floating point?

    --
    David Kastrup

     
    • Gilgongo

      Gilgongo - 2021-02-08
       
      • David Kastrup

        David Kastrup - 2021-02-08

        It certainly is! It is presented as a matter of maintaining precision for sound cards delivering more than 16 bits, and doing so at a cost (1% of processing power). However, the way I see it this provides an entry into significantly more efficient processing using the AVX extension for SIMD processing of floating point values (namely 8 32-bit floats at a time). I'll have to look at the patch in question first, though: I don't see that there is a lot to be gained for client-side processing: the real (and reasonably low-hanging) payoff would be at the server side. I'll take a look at what the patch presented there purports to do and then come back to say how I think this would relate to what I propose.

         
      • David Kastrup

        David Kastrup - 2021-02-08

        Actually, the related issue is rather https://github.com/jamulussoftware/jamulus/pull/535/commits/1d7dec739a4a7a06cfe70e4f76d85e577ae24f7f

        And it would seem that the integrate_float2 branch is supposed to do something similar? The last time master was merged into it was in October. I'm not sure what the idea behind this branch is.

         
        • Gilgongo

          Gilgongo - 2021-02-08

          I seem to recall a discussion in which it was concluded that floats wouldn't make much difference, but I'm not sure where that is. Perhaps on another ticket?

           
          • David Kastrup

            David Kastrup - 2021-02-09

            I see https://github.com/jamulussoftware/jamulus/issues/544#issuecomment-753603959 but

            a) I don't get the argument "someone else does it so it is unnecessary" (it's not like a good idea is exhausted once somebody followed through with it)
            b) the biggest reason for me is that it would make the O(n^2) operation of mixing on the server amenable to SIMD via AVX (for x86-based servers) and thus could really speed up operations a lot with comparatively small code replacements (which I know how to do in GCC): a consideration which I really have not seen in the discussion
            c) it also would make clipping behave a lot more gracefully than the integer variant of wrapping around

            So essentially I don't share the conclusion.
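
            Points b) and c) can be sketched as follows. This is an illustration under assumptions, not the Jamulus mixer: MixAll, Frame and WrapToInt16 are hypothetical names, and the gain matrix stands in for whatever per-client levels the server applies.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Frame = std::vector<float>;

// Hypothetical server mix: client i receives its own weighted sum of all
// n inputs, so the work grows with n * n * frame length.
std::vector<Frame> MixAll(const std::vector<Frame>& in,
                          const std::vector<std::vector<float>>& gain)
{
    const std::size_t n = in.size();
    const std::size_t len = (n > 0) ? in[0].size() : 0;
    std::vector<Frame> out(n, Frame(len, 0.0f));

    for (std::size_t i = 0; i < n; ++i)
    {
        for (std::size_t j = 0; j < n; ++j)
        {
            const float g = gain[i][j];
            // Inner loop over contiguous floats: the part suited to SIMD.
            for (std::size_t k = 0; k < len; ++k)
                out[i][k] += g * in[j][k];
        }
        // Clip once, after summing. Temporary overshoot during the
        // accumulation is harmless in float.
        for (float& s : out[i])
            s = std::max(-1.0f, std::min(1.0f, s));
    }
    return out;
}

// For contrast: what an unchecked narrowing of an integer sum to 16 bits
// does. The value wraps around instead of saturating.
std::int16_t WrapToInt16(std::int32_t sum)
{
    const std::uint16_t u = static_cast<std::uint16_t>(sum); // modulo 65536
    return (u >= 0x8000u)
               ? static_cast<std::int16_t>(static_cast<std::int32_t>(u) - 0x10000)
               : static_cast<std::int16_t>(u);
}
```

            Two loud inputs of 30000 and 10000 sum to 40000, which WrapToInt16 turns into -25536, while the float path simply saturates at full scale.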

             
            • David Kastrup

              David Kastrup - 2021-02-09

              Actually I think that the focus of the FP patch is quite different: it appears to be intended as an architectural thing from driver to client to server, using float everywhere, and particularly enabling 24-bit operation in soundcards etc.

              That's not my interest: I was interested just in optimising the server operation. Since the transport is done compressed (with Opus), there is not much of a point in using more than 16 bits of resolution on the client side. On the server side, however, a float-only workflow allows using SIMD instructions for the mixing stage, and float is the "natural" format for the Opus decoder/encoder anyway.

              I think that the problems with the FP patch's audio behavior were client-side: I don't have the experience particularly with Windows to help there, but if I had chosen to code from scratch, I'd not have touched the client code anyway so those problems would not have been an issue.

               
              • David Kastrup

                David Kastrup - 2021-02-10

                Ok, one thing I did not keep in mind: server and client share a whole lot of code, particularly so because they are the same executable and the server does not need to run headless (like my server always does). So just making the server operate in float is somewhat more tricky. One way to do this may be to template all the respective classes so that they deal either in short or in float.

                That would cause code duplication and a bit of complication, but it would have the advantage of making it rather easy to conduct comparisons and switch operation back and forth (for example, when compiling a headless client or server for an architecture that is weak on floating point).

                It would also make it feasible to offer both float and short interfacing to the sound card without performance loss (though again at the cost of code duplication, most of which is done by the template engine).
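
                A rough sketch of what such templating could look like. MixBuffer is a hypothetical class for illustration and does not correspond to any existing Jamulus class:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical class: one source, instantiated for short or for float.
template <typename Sample>
class MixBuffer
{
public:
    // Float samples accumulate in float; integer samples accumulate in a
    // wider integer so that the sum itself cannot overflow 16 bits.
    using Acc = typename std::conditional<std::is_floating_point<Sample>::value,
                                          Sample, std::int32_t>::type;

    explicit MixBuffer(std::size_t len) : acc_(len, Acc(0)) {}

    // Add one input frame to the running sum.
    void Add(const std::vector<Sample>& in)
    {
        for (std::size_t i = 0; i < acc_.size() && i < in.size(); ++i)
            acc_[i] += in[i];
    }

    const std::vector<Acc>& Data() const { return acc_; }

private:
    std::vector<Acc> acc_;
};

// Switching back and forth then becomes a matter of choosing an alias.
using ShortMix = MixBuffer<std::int16_t>;
using FloatMix = MixBuffer<float>;
```

                The compiler generates both variants from the same source, which is the "duplication done by the template engine" mentioned above.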

                 
                • Gilgongo

                  Gilgongo - 2021-02-10

                  This feels like something to discuss on a new Github ticket perhaps. Maybe reference https://github.com/jamulussoftware/jamulus/issues/544?

                   