[cadcdev-svn-commits] [SCM] branch master updated. 33958108179edd3a73e21ef7437ed9a49b996fd9

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "A pseudo Operating System for the Dreamcast.".

The branch, master has been updated
       via  33958108179edd3a73e21ef7437ed9a49b996fd9 (commit)
       via  e7ebae1f6ac3c91fef4174d09dfda03733cc0862 (commit)
       via  c520b0b815c0e11006a4e95f207b146351b0358f (commit)
       via  d6c26d169d8db24cadcc6979ae93c2b806eee5ae (commit)
       via  df95ad288ae8bc5963158da63fe33af9037c12f9 (commit)
       via  12824439b7eddb9439cff8970a9e9a9f660570a2 (commit)
       via  a01d0e2cf7ee6b76cd995f668c86d460533979c6 (commit)
      from  65cc060af59a607a2a05c742089d08f8271b0eb9 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 33958108179edd3a73e21ef7437ed9a49b996fd9
Author: Paul Cercueil <pa...@cr...>
Date:   Wed Nov 6 14:01:42 2024 +0100

    pvr: Fix hang when enabled lists aren't submitted
    
    When two lists were enabled, one for direct-rendering and one for
    deferred (DMA) rendering, but the first one was not used, only the IRQ
    corresponding to the second list fired, causing the IRQ handler to not
    start the PVR CORE rendering step, as it waited for all enabled lists to
    be transferred.
    
    Address this issue by adding dummy vertices to enabled lists that were
    not used in the current scene.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>
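
The hang and its fix come down to bitmask bookkeeping. The following is a minimal host-side sketch, not KallistiOS code: the helper names `lists_needing_blank()` and `all_lists_transferred()` are hypothetical, and plain `unsigned` masks stand in for the `pvr_state` fields. The two expressions mirror the comparisons visible in the diff further down.

```c
#include <stdbool.h>

/* Hypothetical helper: bit i is set for every list that was enabled at
   init time but never closed in the current scene. The fix submits a
   blank polygon header for each of these lists. */
static unsigned lists_needing_blank(unsigned lists_enabled,
                                    unsigned lists_closed) {
    return lists_enabled & ~lists_closed;
}

/* Hypothetical helper mirroring the check made before a render starts:
   every enabled list must have reported completion. */
static bool all_lists_transferred(unsigned lists_transferred,
                                  unsigned lists_enabled) {
    return lists_transferred == lists_enabled;
}
```

With lists 0 and 2 enabled and only list 2 used, `all_lists_transferred()` stays false indefinitely, which is the hang described above; once the list reported by `lists_needing_blank()` is also submitted, the masks match.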

commit e7ebae1f6ac3c91fef4174d09dfda03733cc0862
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 22:17:07 2024 +0100

    pvr: Rework and improve IRQ handling
    
    The previous way of handling IRQs was weird at best. The PVR rendering
    was not started directly after all lists were transferred by the
    Tile Accelerator, but following the next VBLANK event.
    
    This caused some games to be throttled down, as they had to wait for the
    PVR rendering to be completed before starting a new scene, and the PVR
    rendering was not started as soon as possible.
    
    Address that by starting the PVR rendering as soon as all lists have
    been transferred. In the case where the previous rendered scene hasn't
    been flipped yet (which happens when the game tries to render faster
    than the refresh rate), the PVR rendering is instead started from the
    VBLANK interrupt handler.
    
    Note that for simplicity the VBLANK handling code has been separated
    from the main PVR IRQ handler.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>
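
The decision of when a render may start can be sketched as a pure predicate. This is a reduced host-side model under stated assumptions: `render_model_t` and `can_start_render()` are hypothetical names, and the struct keeps only the fields that the condition in `pvr_render_lists()` (see the diff below) reads.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the pvr_state fields used by the check. */
typedef struct {
    bool     ta_busy;           /* a scene has been handed to the TA */
    bool     render_busy;       /* a render is currently in progress */
    bool     render_completed;  /* a finished render has not been flipped yet */
    bool     to_texture;        /* the target is a texture, so no flip is needed */
    unsigned lists_transferred; /* (1 << idx) per list the TA has finished */
    unsigned lists_enabled;     /* (1 << idx) per list enabled at init time */
} render_model_t;

/* True when the render can be started right away. When it returns false
   only because render_completed is set, the render is deferred until the
   VBLANK handler has flipped the previous frame. */
static bool can_start_render(const render_model_t *s) {
    return s->ta_busy
        && !s->render_busy
        && (!s->render_completed || s->to_texture)
        && s->lists_transferred == s->lists_enabled;
}
```

In this model, the list-done IRQ path and the VBLANK path both evaluate the same predicate; whichever one sees it become true first starts the render.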

commit c520b0b815c0e11006a4e95f207b146351b0358f
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 22:03:14 2024 +0100

    pvr: Move PVR render code into its own pvr_render_lists() function
    
    No functional change (intended).
    
    This new function will be called at two different spots in the future.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>

commit d6c26d169d8db24cadcc6979ae93c2b806eee5ae
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:57:54 2024 +0100

    pvr: Always wait for TA ready before trying to use it again
    
    Attempting to use the Tile Accelerator for a new scene while it is busy
    processing the previous scene will almost always result in a crash.
    
    Call pvr_wait_ready() when starting a new scene to make sure that the TA
    is idle.
    
    Instead of calling it in pvr_scene_begin(), delay the call as much as
    possible, to leave time for the TA to complete its task. This means
    calling it in pvr_list_begin() for lists that use direct rendering, and
    right before uploading the DMA data for lists that use DMA.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>
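
The "wait as late as possible, but only once per scene" behaviour can be illustrated with a small host-side model. All names here (`scene_model_t`, `model_scene_begin()`, `model_start_ta_rendering()`) are hypothetical, and the `wait_calls` counter merely stands in for the potentially blocking call to `pvr_wait_ready()`.

```c
/* Hypothetical model of the per-scene bookkeeping. */
typedef struct {
    int ta_ready;    /* non-zero once the wait has been done for this scene */
    int ta_busy;     /* non-zero once data may have been sent to the TA */
    int wait_calls;  /* how many times the blocking wait would have run */
} scene_model_t;

/* A new scene starts with the "already waited" flag cleared. */
static void model_scene_begin(scene_model_t *m) {
    m->ta_ready = 0;
}

/* Called right before the first data of a list reaches the TA. */
static void model_start_ta_rendering(scene_model_t *m) {
    if(!m->ta_ready) {
        m->wait_calls++;   /* stands in for pvr_wait_ready() */
        m->ta_ready = 1;
    }

    /* From here on the TA has to be considered busy. */
    m->ta_busy = 1;
}
```

Calling `model_start_ta_rendering()` for several lists of the same scene increments `wait_calls` only once; the next `model_scene_begin()` re-arms it.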

commit df95ad288ae8bc5963158da63fe33af9037c12f9
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:48:12 2024 +0100

    pvr: Rework pvr_wait_ready()
    
    The previous implementation used a semaphore, which was somewhat
    problematic; it meant that there had to be a 1:1 ratio between the
    number of calls to pvr_wait_ready(), and the number of renders by the
    PVR. Having two successive calls, for instance, would cause the second
    one to time out.
    
    Rework the implementation so that pvr_wait_ready() will wait until the
    PVR is ready if needed, but will always return straight away if the PVR
    is free.
    
    Instead of a semaphore, this new implementation uses the genwait
    mechanism on the pvr_state.ta_busy object. This field has also changed
    in its meaning, as it is now a generic "Tile Accelerator is busy" flag,
    independently of whether or not the hybrid mode is in use.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>
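
The difference between the two schemes can be shown with a single-threaded model. This is only a sketch under stated assumptions: the `wait_model_*` names are hypothetical, no other thread or interrupt runs inside the model, and a wait that would have to block is therefore reported as a timeout (-1).

```c
#include <stdbool.h>

/* Hypothetical host-side model, not KallistiOS code. */
typedef struct {
    int  sem_count;  /* old scheme: value of the counting semaphore */
    bool ta_busy;    /* new scheme: "Tile Accelerator is busy" flag */
} wait_model_t;

/* A scene has been handed to the TA. */
static void wait_model_scene_started(wait_model_t *m) {
    m->ta_busy = true;
}

/* The IRQ path has started a render for that scene. */
static void wait_model_render_started(wait_model_t *m) {
    m->sem_count++;      /* old scheme: one signal per render */
    m->ta_busy = false;  /* new scheme: clear the flag and wake waiters */
}

/* Old scheme: each successful wait consumes one signal. */
static int wait_model_ready_sem(wait_model_t *m) {
    if(m->sem_count == 0)
        return -1;       /* nothing will signal in this model: timeout */

    m->sem_count--;
    return 0;
}

/* New scheme: returns at once whenever the TA is free. */
static int wait_model_ready_flag(const wait_model_t *m) {
    return m->ta_busy ? -1 : 0;
}
```

After one scene and one render, two successive calls to `wait_model_ready_sem()` return 0 and then -1, matching the timeout described above, while `wait_model_ready_flag()` returns 0 both times.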

commit 12824439b7eddb9439cff8970a9e9a9f660570a2
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:41:15 2024 +0100

    pvr: Start DMA for hybrid mode in pvr_scene_finish()
    
    There is no point in waiting for the next VBLANK to submit the data
    queued; it just causes unnecessary delays.
    
    Note that at that point it is guaranteed that the Tile Accelerator is
    free so we can remove the redundant checks.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>

commit a01d0e2cf7ee6b76cd995f668c86d460533979c6
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:30:39 2024 +0100

    pvr: Fix hybrid mode signaling TA completion too early
    
    The dma_next_list() function was setting the "list transferred" flag for
    the lists that used DMA to transfer the vertices and also for the other
    ones. It also signaled completion through the semaphore after the last
    transfer completed.
    
    The problem with that is that there is a difference between "the vertex
    data has been transferred to the TA" and "the TA is done processing the
    lists"; and the "list transferred" flags are to be set in the second
    case.
    
    Address this issue by leaving it to the IRQ handler to set the "list
    transferred" flags and handle the completion semaphore.
    
    Signed-off-by: Paul Cercueil <pa...@cr...>

-----------------------------------------------------------------------

Summary of changes:
 doc/CHANGELOG.md                                   |   1 +
 .../dreamcast/hardware/pvr/pvr_init_shutdown.c     |   8 +-
 kernel/arch/dreamcast/hardware/pvr/pvr_internal.h  |  12 +-
 kernel/arch/dreamcast/hardware/pvr/pvr_irq.c       | 175 ++++++++++-----------
 kernel/arch/dreamcast/hardware/pvr/pvr_scene.c     |  55 +++++--
 5 files changed, 138 insertions(+), 113 deletions(-)

diff --git a/doc/CHANGELOG.md b/doc/CHANGELOG.md
index 9c764a38..bda36885 100644
--- a/doc/CHANGELOG.md
+++ b/doc/CHANGELOG.md
@@ -12,6 +12,7 @@ Platform-specific changes are prefixed with the platform name, otherwise the cha
 - Add/Fixed stat() implementations for all filesystems [AB]
 - **Dreamcast**: Add network speedtest and pvr palette examples [AB]
 - **Dreamcast**: Cleaned up, documented, and enhanced BIOS font API [FG]
+- Rework PVR hybrid mode + IRQ handling [PC]
 
 ## KallistiOS version 2.1.0
 - Cleaned up generated stubs files on a make clean [Lawrence Sebald == LS]
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c b/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c
index 29103411..cbe77d70 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c
@@ -140,7 +140,7 @@ int pvr_init(pvr_init_params_t *params) {
     }
 
     /* Hook the PVR interrupt events on G2 */
-    pvr_state.vbl_handle = vblank_handler_add(pvr_int_handler, NULL);
+    pvr_state.vbl_handle = vblank_handler_add(pvr_vblank_handler, NULL);
     
     asic_evt_set_handler(ASIC_EVT_PVR_OPAQUEDONE, pvr_int_handler, NULL);
     asic_evt_enable(ASIC_EVT_PVR_OPAQUEDONE, ASIC_IRQ_DEFAULT);
@@ -194,9 +194,6 @@ int pvr_init(pvr_init_params_t *params) {
     mutex_init((mutex_t *)&pvr_state.dma_lock, MUTEX_TYPE_NORMAL);
     pvr_dma_init();
 
-    /* Setup our wait-ready semaphore */
-    sem_init((semaphore_t *)&pvr_state.ready_sem, 0);
-
     /* Set us as valid and return success */
     pvr_state.valid = 1;
 
@@ -245,8 +242,7 @@ int pvr_shutdown(void) {
     /* Invalidate our memory pool */
     pvr_mem_reset();
 
-    /* Destroy the semaphore */
-    sem_destroy((semaphore_t *)&pvr_state.ready_sem);
+    /* Destroy the mutex */
     mutex_destroy((mutex_t *)&pvr_state.dma_lock);
 
     /* Clear video memory */
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h b/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h
index 3db2278d..4c7c72da 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h
@@ -16,7 +16,6 @@
    code. If something is needed from this, an external interface should
    be added to dc/pvr.h. */
 
-#include <kos/sem.h>
 #include <kos/mutex.h>
 
 /**** State stuff ***************************************************/
@@ -169,7 +168,8 @@ typedef struct {
     uint32  lists_dmaed;                // (1 << idx) for each list which has been DMA'd (DMA mode only)
 
     mutex_t dma_lock;                   // Locked if a DMA is in progress (vertex or texture)
-    int     ta_busy;                    // >0 if a DMA is in progress and the TA hasn't signaled completion
+    int     ta_ready;                   // >0 if the TA is ready for the new scene
+    int     ta_busy;                    // >0 if a scene is ongoing and the TA hasn't signaled completion
     int     render_busy;                // >0 if a render is in progress
     int     render_completed;           // >1 if a render has recently finished
 
@@ -204,10 +204,6 @@ typedef struct {
     size_t   vtx_buf_used;               // Vertex buffer used size for the last frame
     size_t   vtx_buf_used_max;           // Maximum used vertex buffer size
 
-    /* Wait-ready semaphore: this will be signaled whenever the pvr_wait_ready()
-       call should be ready to return. */
-    semaphore_t ready_sem;
-
     // Handle for the vblank interrupt
     int     vbl_handle;
 
@@ -292,8 +288,10 @@ void pvr_blank_polyhdr_buf(int type, pvr_poly_hdr_t * buf);
 
 /**** pvr_irq.c *******************************************************/
 
-/* Interrupt handler for PVR events */
+/* Interrupt handlers for PVR events */
 void pvr_int_handler(uint32 code, void *data);
+void pvr_vblank_handler(uint32 code, void *data);
 
+void pvr_start_dma(void);
 
 #endif
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c b/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c
index 8e72fb94..9d456c32 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c
@@ -11,6 +11,8 @@
 #include <arch/cache.h>
 #include "pvr_internal.h"
 
+#include <kos/genwait.h>
+
 #ifdef PVR_RENDER_DBG
 #include <stdio.h>
 #endif
@@ -46,7 +48,6 @@ static void dma_next_list(void *data) {
                mark it as complete, so we skip trying to DMA it. */
             if(!b->base[i]) {
                 pvr_state.lists_dmaed       |= 1 << i;
-                pvr_state.lists_transferred |= 1 << i;
                 continue;
             }
 
@@ -74,22 +75,96 @@ static void dma_next_list(void *data) {
     if(!did) {
         //DBG(("dma_complete(buf %d)\n", pvr_state.ram_target ^ 1));
 
+        // If that was the last one, then free up the DMA channel.
+        pvr_state.lists_dmaed = 0;
+
         // Unlock
         mutex_unlock((mutex_t *)&pvr_state.dma_lock);
-        pvr_state.lists_dmaed = 0;
 
         // Buffers are now empty again
         pvr_state.dma_buffers[pvr_state.ram_target ^ 1].ready = 0;
+    }
+}
+
+void pvr_start_dma(void) {
+    pvr_sync_stats(PVR_SYNC_REGSTART);
+
+    mutex_lock((mutex_t *)&pvr_state.dma_lock);
+
+    // Begin DMAing the first list.
+    dma_next_list(0);
+}
+
+static void pvr_render_lists(void) {
+    int bufn = pvr_state.view_target ^ 1;
+
+    if(pvr_state.ta_busy
+       && !pvr_state.render_busy
+       && (!pvr_state.render_completed || pvr_state.to_texture[bufn])
+       && pvr_state.lists_transferred == pvr_state.lists_enabled) {
+
+        /* XXX Note:
+           For some reason, the render must be started _before_ we sync
+           to the new reg buffers. The only reasons I can think of for this
+           are that there may be something in the reg sync that messes up
+           the render in progress, or we are misusing some bits somewhere. */
+
+        // Begin rendering from the dirty TA buffer into the clean
+        // frame buffer.
+        //DBG(("start_render(%d -> %d)\n", pvr_state.ta_target, pvr_state.view_target ^ 1));
+        pvr_state.ta_target ^= 1;
+        pvr_begin_queued_render();
+        pvr_state.render_busy = 1;
+        pvr_sync_stats(PVR_SYNC_RNDSTART);
+
+        // Clear the texture render flag if we had it set.
+        pvr_state.to_texture[bufn] = 0;
+
+        // Switch to the clean TA buffer.
+        pvr_state.lists_transferred = 0;
+        pvr_sync_reg_buffer();
+
+        // The TA is no longer busy.
+        pvr_state.ta_busy = 0;
 
         // Signal the client code to continue onwards.
-        sem_signal((semaphore_t *)&pvr_state.ready_sem);
+        genwait_wake_all((void *)&pvr_state.ta_busy);
         thd_schedule(1, 0);
     }
 }
 
-void pvr_int_handler(uint32 code, void *data) {
-    int bufn = pvr_state.view_target;
+void pvr_vblank_handler(uint32 code, void *data) {
+    (void)code;
+    (void)data;
+
+    pvr_sync_stats(PVR_SYNC_VBLANK);
+
+    // If the render-done interrupt has fired then we are ready to flip to the
+    // new frame buffer.
+    if(pvr_state.render_completed) {
+        int bufn = pvr_state.view_target;
 
+        //DBG(("view(%d)\n", pvr_state.view_target ^ 1));
+
+        // Handle PVR stats
+        pvr_sync_stats(PVR_SYNC_PAGEFLIP);
+
+        // Switch view address to the "good" buffer
+        pvr_state.view_target ^= 1;
+
+        if(!pvr_state.to_texture[bufn])
+            pvr_sync_view();
+
+        // Clear the render completed flag.
+        pvr_state.render_completed = 0;
+    }
+
+    // We may have a pending render, that couldn't be done as the previous
+    // render wasn't flipped yet; do it now.
+    pvr_render_lists();
+}
+
+void pvr_int_handler(uint32 code, void *data) {
     (void)data;
 
     // What kind of event did we get?
@@ -117,9 +192,6 @@ void pvr_int_handler(uint32 code, void *data) {
             pvr_state.render_completed = 1;
             pvr_sync_stats(PVR_SYNC_RNDDONE);
             break;
-        case ASIC_EVT_PVR_VBLANK_BEGIN:
-            pvr_sync_stats(PVR_SYNC_VBLANK);
-            break;
     }
 
 #ifdef PVR_RENDER_DBG
@@ -159,91 +231,14 @@ void pvr_int_handler(uint32 code, void *data) {
         case ASIC_EVT_PVR_TRANSMODDONE:
         case ASIC_EVT_PVR_PTDONE:
 
-            if(pvr_state.lists_transferred == pvr_state.lists_enabled) {
-                pvr_sync_stats(PVR_SYNC_REGDONE);
-            }
-
-            return;
-    }
-
-    if(!pvr_state.to_texture[bufn]) {
-        // If it's not a vblank, ignore the rest of this for now.
-        if(code != ASIC_EVT_PVR_VBLANK_BEGIN)
-            return;
-    }
-    else {
-        // We don't need to wait for a vblank for rendering to a texture, but
-        // we really don't care about anything else unless we've actually gotten
-        // all the data submitted to the TA.
-        if(pvr_state.lists_transferred != pvr_state.lists_enabled &&
-           !pvr_state.render_completed)
-            return;
-    }
-
-    // If the render-done interrupt has fired then we are ready to flip to the
-    // new frame buffer.
-    if(pvr_state.render_completed) {
-        //DBG(("view(%d)\n", pvr_state.view_target ^ 1));
-
-        // Handle PVR stats
-        pvr_sync_stats(PVR_SYNC_PAGEFLIP);
-
-        // Switch view address to the "good" buffer
-        pvr_state.view_target ^= 1;
-
-        if(!pvr_state.to_texture[bufn])
-            pvr_sync_view();
+            if(pvr_state.lists_transferred != pvr_state.lists_enabled)
+                return;
 
-        // Clear the render completed flag.
-        pvr_state.render_completed = 0;
+            pvr_sync_stats(PVR_SYNC_REGDONE);
+            break;
     }
 
     // If all lists are fully transferred and a render is not in progress,
     // we are ready to start rendering.
-    if(!pvr_state.render_busy
-            && pvr_state.lists_transferred == pvr_state.lists_enabled) {
-        /* XXX Note:
-           For some reason, the render must be started _before_ we sync
-           to the new reg buffers. The only reasons I can think of for this
-           are that there may be something in the reg sync that messes up
-           the render in progress, or we are misusing some bits somewhere. */
-
-        // Begin rendering from the dirty TA buffer into the clean
-        // frame buffer.
-        //DBG(("start_render(%d -> %d)\n", pvr_state.ta_target, pvr_state.view_target ^ 1));
-        pvr_state.ta_target ^= 1;
-        pvr_begin_queued_render();
-        pvr_state.render_busy = 1;
-        pvr_sync_stats(PVR_SYNC_RNDSTART);
-
-        // Clear the texture render flag if we had it set.
-        pvr_state.to_texture[bufn] = 0;
-
-        // If we're not in DMA mode, then signal the client code
-        // to continue onwards.
-        if(!pvr_state.dma_mode) {
-            sem_signal((semaphore_t *)&pvr_state.ready_sem);
-            thd_schedule(1, 0);
-        }
-
-        // Switch to the clean TA buffer.
-        pvr_state.lists_transferred = 0;
-        pvr_sync_reg_buffer();
-
-        // The TA is no longer busy.
-        pvr_state.ta_busy = 0;
-    }
-
-    // If we're in DMA mode, the DMA source buffers are ready, and a DMA
-    // is not in progress, then we are ready to start DMAing.
-    if(pvr_state.dma_mode
-            && !pvr_state.ta_busy
-            && pvr_state.dma_buffers[pvr_state.ram_target ^ 1].ready
-            && mutex_trylock((mutex_t *)&pvr_state.dma_lock) >= 0) {
-        pvr_sync_stats(PVR_SYNC_REGSTART);
-
-        // Begin DMAing the first list.
-        pvr_state.ta_busy = 1;
-        dma_next_list(0);
-    }
+    pvr_render_lists();
 }
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c b/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c
index 922c6500..b7f18862 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c
@@ -9,6 +9,7 @@
 #include <assert.h>
 #include <stdio.h>
 #include <string.h>
+#include <kos/genwait.h>
 #include <kos/string.h>
 #include <kos/thread.h>
 #include <dc/pvr.h>
@@ -86,11 +87,25 @@ void pvr_vertbuf_written(pvr_list_t list, uint32 amt) {
     pvr_state.dma_buffers[pvr_state.ram_target].ptr[list] = val;
 }
 
+static void pvr_start_ta_rendering(void) {
+    // Make sure to wait until the TA is ready to start rendering a new scene
+    if(!pvr_state.ta_ready) {
+        pvr_wait_ready();
+        pvr_state.ta_ready = 1;
+    }
+
+    // Starting from that point, we consider that the Tile Accelerator
+    // might be busy.
+    pvr_state.ta_busy = 1;
+}
+
 /* Begin collecting data for a frame of 3D output to the off-screen
    frame buffer */
 void pvr_scene_begin(void) {
     int i;
 
+    pvr_state.ta_ready = 0;
+
     // Get general stuff ready.
     pvr_state.list_reg_open = -1;
 
@@ -164,8 +179,10 @@ int pvr_list_begin(pvr_list_t list) {
 
     pvr_list_dma = pvr_list_uses_dma(list);
 
-    if(!pvr_list_dma)
+    if(!pvr_list_dma) {
+        pvr_start_ta_rendering();
         sq_lock((void *)PVR_TA_INPUT);
+    }
 
     /* Ok, set the flag */
     pvr_state.list_reg_open = list;
@@ -304,12 +321,21 @@ int pvr_scene_finish(void) {
         b = pvr_state.dma_buffers + pvr_state.ram_target;
 
         for(i = 0; i < PVR_OPB_COUNT; i++) {
-            /* Check whether the current list type should be skipped:
-               A. We never enabled the list globally with pvr_init().
-               B. We never associated an in-RAM DMA vertex buffer with
-                  the given list type, because we're using hybrid
-                  rendering and submitted that list type directly. */
-            if(!(pvr_state.lists_enabled & (1 << i)) || !b->base[i])
+            /* We never enabled the list globally with pvr_init() - skip it */
+            if(!(pvr_state.lists_enabled & (1 << i)))
+                continue;
+
+            /* If any lists weren't used in this scene, submit blank ones now */
+            if(!(pvr_state.lists_closed & (1 << i))) {
+                pvr_list_begin(i);
+                pvr_blank_polyhdr(i);
+                pvr_list_finish();
+            }
+
+            /* We never associated an in-RAM DMA vertex buffer with the given
+               list type, because we're using hybrid rendering and submitted
+               that list type directly - skip it */
+            if(!b->base[i])
                 continue;
 
             // Make sure there's at least one primitive in each.
@@ -326,6 +352,8 @@ int pvr_scene_finish(void) {
             assert(b->ptr[i] <= b->size[i]);
         }
 
+        pvr_start_ta_rendering();
+
         // Flip buffers and mark them complete.
         o = irq_disable();
         pvr_state.dma_buffers[pvr_state.ram_target].ready = 1;
@@ -333,6 +361,8 @@ int pvr_scene_finish(void) {
         irq_restore(o);
 
         pvr_sync_stats(PVR_SYNC_BUFDONE);
+
+        pvr_start_dma();
     }
     else {
         /* If a list was open, close it */
@@ -355,11 +385,16 @@ int pvr_scene_finish(void) {
 }
 
 int pvr_wait_ready(void) {
-    int t;
+    int flags, t = 0;
 
     assert(pvr_state.valid);
 
-    t = sem_wait_timed((semaphore_t *)&pvr_state.ready_sem, 100);
+    flags = irq_disable();
+
+    if(pvr_state.ta_busy)
+        t = genwait_wait((void *)&pvr_state.ta_busy, "PVR wait ready", 100, NULL);
+
+    irq_restore(flags);
 
     if(t < 0) {
 #if 0
@@ -383,7 +418,7 @@ int pvr_wait_ready(void) {
 int pvr_check_ready(void) {
     assert(pvr_state.valid);
 
-    if(sem_count((semaphore_t *)&pvr_state.ready_sem) > 0)
+    if(!pvr_state.ta_busy)
         return 0;
     else
         return -1;


hooks/post-receive
-- 
A pseudo Operating System for the Dreamcast.