From: falcovorbis <fal...@us...> - 2024-11-06 13:30:41
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "A pseudo Operating System for the Dreamcast.".

The branch, master has been updated
       via  33958108179edd3a73e21ef7437ed9a49b996fd9 (commit)
       via  e7ebae1f6ac3c91fef4174d09dfda03733cc0862 (commit)
       via  c520b0b815c0e11006a4e95f207b146351b0358f (commit)
       via  d6c26d169d8db24cadcc6979ae93c2b806eee5ae (commit)
       via  df95ad288ae8bc5963158da63fe33af9037c12f9 (commit)
       via  12824439b7eddb9439cff8970a9e9a9f660570a2 (commit)
       via  a01d0e2cf7ee6b76cd995f668c86d460533979c6 (commit)
      from  65cc060af59a607a2a05c742089d08f8271b0eb9 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 33958108179edd3a73e21ef7437ed9a49b996fd9
Author: Paul Cercueil <pa...@cr...>
Date:   Wed Nov 6 14:01:42 2024 +0100

    pvr: Fix hang when enabled lists aren't submitted

    When two lists were enabled, one for direct-rendering and one for
    deferred (DMA) rendering, but the first one was not used, only the IRQ
    corresponding to the second list fired, causing the IRQ handler to not
    start the PVR CORE rendering step, as it waited for all enabled lists
    to be transferred.

    Address this issue by adding dummy vertices to enabled lists that were
    not used in the current scene.

    Signed-off-by: Paul Cercueil <pa...@cr...>

commit e7ebae1f6ac3c91fef4174d09dfda03733cc0862
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 22:17:07 2024 +0100

    pvr: Rework and improve IRQ handling

    The previous way of handling IRQs was weird at best. The PVR rendering
    was not started directly after all lists were transferred by the Tile
    Accelerator, but following the next VBLANK event.
    This caused some games to be throttled down, as they had to wait for
    the PVR rendering to be completed before starting a new scene, and the
    PVR rendering was not started as soon as possible.

    Address that by starting the PVR rendering as soon as all lists have
    been transferred. In the case where the previous rendered scene hasn't
    been flipped yet (which happens when the game tries to render faster
    than the refresh rate), the PVR rendering is instead started from the
    VBLANK interrupt handler.

    Note that for simplicity the VBLANK handling code has been separated
    from the main PVR IRQ handler.

    Signed-off-by: Paul Cercueil <pa...@cr...>

commit c520b0b815c0e11006a4e95f207b146351b0358f
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 22:03:14 2024 +0100

    pvr: Move PVR render code into its own pvr_render_lists() function

    No functional change (intended). This new function will be called at
    two different spots in the future.

    Signed-off-by: Paul Cercueil <pa...@cr...>

commit d6c26d169d8db24cadcc6979ae93c2b806eee5ae
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:57:54 2024 +0100

    pvr: Always wait for TA ready before trying to use it again

    Attempting to use the Tile Accelerator for a new scene while it is busy
    processing the previous scene will almost always result in a crash.

    Call pvr_wait_ready() when starting a new scene to make sure that the
    TA is idle.

    Instead of calling it in pvr_scene_begin(), delay the call as much as
    possible, to leave time for the TA to complete its task. This means
    calling it in pvr_list_begin() for lists that use direct rendering, and
    right before uploading the DMA data for lists that use DMA.

    Signed-off-by: Paul Cercueil <pa...@cr...>

commit df95ad288ae8bc5963158da63fe33af9037c12f9
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:48:12 2024 +0100

    pvr: Rework pvr_wait_ready()

    The previous implementation used a semaphore, which was somewhat
    problematic; it meant that there had to be a 1:1 ratio between the
    number of calls to pvr_wait_ready(), and the number of renders by the
    PVR. Having two successive calls, for instance, would cause the second
    one to time out.

    Rework the implementation so that pvr_wait_ready() will wait until the
    PVR is ready if needed, but will always return straight away if the PVR
    is free.

    Instead of a semaphore, this new implementation uses the genwait
    mechanism on the pvr_state.ta_busy object. This field has also changed
    in its meaning, as it is now a generic "Tile Accelerator is busy" flag,
    independently of whether or not the hybrid mode is in use.

    Signed-off-by: Paul Cercueil <pa...@cr...>

commit 12824439b7eddb9439cff8970a9e9a9f660570a2
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:41:15 2024 +0100

    pvr: Start DMA for hybrid mode in pvr_scene_finish()

    There is no point in waiting for the next VBLANK to submit the data
    queued; it just causes unnecessary delays.

    Note that at that point it is guaranteed that the Tile Accelerator is
    free so we can remove the redundant checks.

    Signed-off-by: Paul Cercueil <pa...@cr...>

commit a01d0e2cf7ee6b76cd995f668c86d460533979c6
Author: Paul Cercueil <pa...@cr...>
Date:   Sun Nov 3 21:30:39 2024 +0100

    pvr: Fix hybrid mode signaling TA completion too early

    The dma_next_list() function was setting the "list transferred" flag
    for the lists that used DMA to transfer the vertices and also for the
    other ones. It also signaled completion through the semaphore after the
    last transfer completed.
    The problem with that is that there is a difference between "the vertex
    data has been transferred to the TA" and "the TA is done processing the
    lists"; and the "list transferred" flags are to be set in the second
    case.

    Address this issue by leaving it to the IRQ handler to set the "list
    transferred" flags and handle the completion semaphore.

    Signed-off-by: Paul Cercueil <pa...@cr...>

-----------------------------------------------------------------------

Summary of changes:
 doc/CHANGELOG.md                                   |   1 +
 .../dreamcast/hardware/pvr/pvr_init_shutdown.c     |   8 +-
 kernel/arch/dreamcast/hardware/pvr/pvr_internal.h  |  12 +-
 kernel/arch/dreamcast/hardware/pvr/pvr_irq.c       | 175 ++++++++++-----------
 kernel/arch/dreamcast/hardware/pvr/pvr_scene.c     |  55 +++++--
 5 files changed, 138 insertions(+), 113 deletions(-)

diff --git a/doc/CHANGELOG.md b/doc/CHANGELOG.md
index 9c764a38..bda36885 100644
--- a/doc/CHANGELOG.md
+++ b/doc/CHANGELOG.md
@@ -12,6 +12,7 @@ Platform-specific changes are prefixed with the platform name, otherwise the cha
 - Add/Fixed stat() implementations for all filesystems [AB]
 - **Dreamcast**: Add network speedtest and pvr palette examples [AB]
 - **Dreamcast**: Cleaned up, documented, and enhanced BIOS font API [FG]
+- Rework PVR hybrid mode + IRQ handling [PC]

 ## KallistiOS version 2.1.0
 - Cleaned up generated stubs files on a make clean [Lawrence Sebald == LS]
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c b/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c
index 29103411..cbe77d70 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_init_shutdown.c
@@ -140,7 +140,7 @@ int pvr_init(pvr_init_params_t *params) {
     }

     /* Hook the PVR interrupt events on G2 */
-    pvr_state.vbl_handle = vblank_handler_add(pvr_int_handler, NULL);
+    pvr_state.vbl_handle = vblank_handler_add(pvr_vblank_handler, NULL);

     asic_evt_set_handler(ASIC_EVT_PVR_OPAQUEDONE, pvr_int_handler, NULL);
     asic_evt_enable(ASIC_EVT_PVR_OPAQUEDONE, ASIC_IRQ_DEFAULT);
@@ -194,9 +194,6 @@ int pvr_init(pvr_init_params_t *params) {
     mutex_init((mutex_t *)&pvr_state.dma_lock, MUTEX_TYPE_NORMAL);
     pvr_dma_init();

-    /* Setup our wait-ready semaphore */
-    sem_init((semaphore_t *)&pvr_state.ready_sem, 0);
-
     /* Set us as valid and return success */
     pvr_state.valid = 1;

@@ -245,8 +242,7 @@ int pvr_shutdown(void) {
     /* Invalidate our memory pool */
     pvr_mem_reset();

-    /* Destroy the semaphore */
-    sem_destroy((semaphore_t *)&pvr_state.ready_sem);
+    /* Destroy the mutex */
     mutex_destroy((mutex_t *)&pvr_state.dma_lock);

     /* Clear video memory */
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h b/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h
index 3db2278d..4c7c72da 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_internal.h
@@ -16,7 +16,6 @@
    code. If something is needed from this, an external interface should
    be added to dc/pvr.h. */

-#include <kos/sem.h>
 #include <kos/mutex.h>

 /**** State stuff ***************************************************/
@@ -169,7 +168,8 @@ typedef struct {
     uint32  lists_dmaed;        // (1 << idx) for each list which has been DMA'd (DMA mode only)

     mutex_t dma_lock;           // Locked if a DMA is in progress (vertex or texture)
-    int     ta_busy;            // >0 if a DMA is in progress and the TA hasn't signaled completion
+    int     ta_ready;           // >0 if the TA is ready for the new scene
+    int     ta_busy;            // >0 if a scene is ongoing and the TA hasn't signaled completion
     int     render_busy;        // >0 if a render is in progress
     int     render_completed;   // >1 if a render has recently finished

@@ -204,10 +204,6 @@ typedef struct {
     size_t  vtx_buf_used;       // Vertex buffer used size for the last frame
     size_t  vtx_buf_used_max;   // Maximum used vertex buffer size

-    /* Wait-ready semaphore: this will be signaled whenever the pvr_wait_ready()
-       call should be ready to return. */
-    semaphore_t ready_sem;
-
     // Handle for the vblank interrupt
     int     vbl_handle;

@@ -292,8 +288,10 @@ void pvr_blank_polyhdr_buf(int type, pvr_poly_hdr_t * buf);

 /**** pvr_irq.c *******************************************************/

-/* Interrupt handler for PVR events */
+/* Interrupt handlers for PVR events */
 void pvr_int_handler(uint32 code, void *data);
+void pvr_vblank_handler(uint32 code, void *data);
+void pvr_start_dma(void);

 #endif

diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c b/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c
index 8e72fb94..9d456c32 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_irq.c
@@ -11,6 +11,8 @@
 #include <arch/cache.h>
 #include "pvr_internal.h"

+#include <kos/genwait.h>
+
 #ifdef PVR_RENDER_DBG
 #include <stdio.h>
 #endif
@@ -46,7 +48,6 @@ static void dma_next_list(void *data) {
            mark it as complete, so we skip trying to DMA it. */
         if(!b->base[i]) {
             pvr_state.lists_dmaed |= 1 << i;
-            pvr_state.lists_transferred |= 1 << i;
             continue;
         }

@@ -74,22 +75,96 @@ static void dma_next_list(void *data) {
     if(!did) {
         //DBG(("dma_complete(buf %d)\n", pvr_state.ram_target ^ 1));

+        // If that was the last one, then free up the DMA channel.
+        pvr_state.lists_dmaed = 0;
+
         // Unlock
         mutex_unlock((mutex_t *)&pvr_state.dma_lock);
-        pvr_state.lists_dmaed = 0;

         // Buffers are now empty again
         pvr_state.dma_buffers[pvr_state.ram_target ^ 1].ready = 0;
+    }
+}
+
+void pvr_start_dma(void) {
+    pvr_sync_stats(PVR_SYNC_REGSTART);
+
+    mutex_lock((mutex_t *)&pvr_state.dma_lock);
+
+    // Begin DMAing the first list.
+    dma_next_list(0);
+}
+
+static void pvr_render_lists(void) {
+    int bufn = pvr_state.view_target ^ 1;
+
+    if(pvr_state.ta_busy
+       && !pvr_state.render_busy
+       && (!pvr_state.render_completed || pvr_state.to_texture[bufn])
+       && pvr_state.lists_transferred == pvr_state.lists_enabled) {
+
+        /* XXX Note:
+           For some reason, the render must be started _before_ we sync
+           to the new reg buffers. The only reasons I can think of for this
+           are that there may be something in the reg sync that messes up
+           the render in progress, or we are misusing some bits somewhere. */
+
+        // Begin rendering from the dirty TA buffer into the clean
+        // frame buffer.
+        //DBG(("start_render(%d -> %d)\n", pvr_state.ta_target, pvr_state.view_target ^ 1));
+        pvr_state.ta_target ^= 1;
+        pvr_begin_queued_render();
+        pvr_state.render_busy = 1;
+        pvr_sync_stats(PVR_SYNC_RNDSTART);
+
+        // Clear the texture render flag if we had it set.
+        pvr_state.to_texture[bufn] = 0;
+
+        // Switch to the clean TA buffer.
+        pvr_state.lists_transferred = 0;
+        pvr_sync_reg_buffer();
+
+        // The TA is no longer busy.
+        pvr_state.ta_busy = 0;

         // Signal the client code to continue onwards.
-        sem_signal((semaphore_t *)&pvr_state.ready_sem);
+        genwait_wake_all((void *)&pvr_state.ta_busy);
         thd_schedule(1, 0);
     }
 }

-void pvr_int_handler(uint32 code, void *data) {
-    int bufn = pvr_state.view_target;
+void pvr_vblank_handler(uint32 code, void *data) {
+    (void)code;
+    (void)data;
+
+    pvr_sync_stats(PVR_SYNC_VBLANK);
+
+    // If the render-done interrupt has fired then we are ready to flip to the
+    // new frame buffer.
+    if(pvr_state.render_completed) {
+        int bufn = pvr_state.view_target;

+        //DBG(("view(%d)\n", pvr_state.view_target ^ 1));
+
+        // Handle PVR stats
+        pvr_sync_stats(PVR_SYNC_PAGEFLIP);
+
+        // Switch view address to the "good" buffer
+        pvr_state.view_target ^= 1;
+
+        if(!pvr_state.to_texture[bufn])
+            pvr_sync_view();
+
+        // Clear the render completed flag.
+        pvr_state.render_completed = 0;
+    }
+
+    // We may have a pending render, that couldn't be done as the previous
+    // render wasn't flipped yet; do it now.
+    pvr_render_lists();
+}
+
+void pvr_int_handler(uint32 code, void *data) {
     (void)data;

     // What kind of event did we get?
@@ -117,9 +192,6 @@ void pvr_int_handler(uint32 code, void *data) {
             pvr_state.render_completed = 1;
             pvr_sync_stats(PVR_SYNC_RNDDONE);
             break;
-        case ASIC_EVT_PVR_VBLANK_BEGIN:
-            pvr_sync_stats(PVR_SYNC_VBLANK);
-            break;
     }

 #ifdef PVR_RENDER_DBG
@@ -159,91 +231,14 @@ void pvr_int_handler(uint32 code, void *data) {
         case ASIC_EVT_PVR_TRANSMODDONE:
         case ASIC_EVT_PVR_PTDONE:

-            if(pvr_state.lists_transferred == pvr_state.lists_enabled) {
-                pvr_sync_stats(PVR_SYNC_REGDONE);
-            }
-
-            return;
-    }
-
-    if(!pvr_state.to_texture[bufn]) {
-        // If it's not a vblank, ignore the rest of this for now.
-        if(code != ASIC_EVT_PVR_VBLANK_BEGIN)
-            return;
-    }
-    else {
-        // We don't need to wait for a vblank for rendering to a texture, but
-        // we really don't care about anything else unless we've actually gotten
-        // all the data submitted to the TA.
-        if(pvr_state.lists_transferred != pvr_state.lists_enabled &&
-                !pvr_state.render_completed)
-            return;
-    }
-
-    // If the render-done interrupt has fired then we are ready to flip to the
-    // new frame buffer.
-    if(pvr_state.render_completed) {
-        //DBG(("view(%d)\n", pvr_state.view_target ^ 1));
-
-        // Handle PVR stats
-        pvr_sync_stats(PVR_SYNC_PAGEFLIP);
-
-        // Switch view address to the "good" buffer
-        pvr_state.view_target ^= 1;
-
-        if(!pvr_state.to_texture[bufn])
-            pvr_sync_view();
+            if(pvr_state.lists_transferred != pvr_state.lists_enabled)
+                return;

-        // Clear the render completed flag.
-        pvr_state.render_completed = 0;
+            pvr_sync_stats(PVR_SYNC_REGDONE);
+            break;
     }

     // If all lists are fully transferred and a render is not in progress,
     // we are ready to start rendering.
-    if(!pvr_state.render_busy
-            && pvr_state.lists_transferred == pvr_state.lists_enabled) {
-        /* XXX Note:
-           For some reason, the render must be started _before_ we sync
-           to the new reg buffers. The only reasons I can think of for this
-           are that there may be something in the reg sync that messes up
-           the render in progress, or we are misusing some bits somewhere. */
-
-        // Begin rendering from the dirty TA buffer into the clean
-        // frame buffer.
-        //DBG(("start_render(%d -> %d)\n", pvr_state.ta_target, pvr_state.view_target ^ 1));
-        pvr_state.ta_target ^= 1;
-        pvr_begin_queued_render();
-        pvr_state.render_busy = 1;
-        pvr_sync_stats(PVR_SYNC_RNDSTART);
-
-        // Clear the texture render flag if we had it set.
-        pvr_state.to_texture[bufn] = 0;
-
-        // If we're not in DMA mode, then signal the client code
-        // to continue onwards.
-        if(!pvr_state.dma_mode) {
-            sem_signal((semaphore_t *)&pvr_state.ready_sem);
-            thd_schedule(1, 0);
-        }
-
-        // Switch to the clean TA buffer.
-        pvr_state.lists_transferred = 0;
-        pvr_sync_reg_buffer();
-
-        // The TA is no longer busy.
-        pvr_state.ta_busy = 0;
-    }
-
-    // If we're in DMA mode, the DMA source buffers are ready, and a DMA
-    // is not in progress, then we are ready to start DMAing.
-    if(pvr_state.dma_mode
-            && !pvr_state.ta_busy
-            && pvr_state.dma_buffers[pvr_state.ram_target ^ 1].ready
-            && mutex_trylock((mutex_t *)&pvr_state.dma_lock) >= 0) {
-        pvr_sync_stats(PVR_SYNC_REGSTART);
-
-        // Begin DMAing the first list.
-        pvr_state.ta_busy = 1;
-        dma_next_list(0);
-    }
+    pvr_render_lists();
 }
diff --git a/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c b/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c
index 922c6500..b7f18862 100644
--- a/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c
+++ b/kernel/arch/dreamcast/hardware/pvr/pvr_scene.c
@@ -9,6 +9,7 @@
 #include <assert.h>
 #include <stdio.h>
 #include <string.h>
+#include <kos/genwait.h>
 #include <kos/string.h>
 #include <kos/thread.h>
 #include <dc/pvr.h>
@@ -86,11 +87,25 @@ void pvr_vertbuf_written(pvr_list_t list, uint32 amt) {
     pvr_state.dma_buffers[pvr_state.ram_target].ptr[list] = val;
 }

+static void pvr_start_ta_rendering(void) {
+    // Make sure to wait until the TA is ready to start rendering a new scene
+    if(!pvr_state.ta_ready) {
+        pvr_wait_ready();
+        pvr_state.ta_ready = 1;
+    }
+
+    // Starting from that point, we consider that the Tile Accelerator
+    // might be busy.
+    pvr_state.ta_busy = 1;
+}
+
 /* Begin collecting data for a frame of 3D output to the off-screen
    frame buffer */
 void pvr_scene_begin(void) {
     int i;

+    pvr_state.ta_ready = 0;
+
     // Get general stuff ready.
     pvr_state.list_reg_open = -1;

@@ -164,8 +179,10 @@ int pvr_list_begin(pvr_list_t list) {

     pvr_list_dma = pvr_list_uses_dma(list);

-    if(!pvr_list_dma)
+    if(!pvr_list_dma) {
+        pvr_start_ta_rendering();
         sq_lock((void *)PVR_TA_INPUT);
+    }

     /* Ok, set the flag */
     pvr_state.list_reg_open = list;
@@ -304,12 +321,21 @@ int pvr_scene_finish(void) {
         b = pvr_state.dma_buffers + pvr_state.ram_target;

         for(i = 0; i < PVR_OPB_COUNT; i++) {
-            /* Check whether the current list type should be skipped:
-                A. We never enabled the list globally with pvr_init().
-                B. We never associated an in-RAM DMA vertex buffer with
-                   the given list type, because we're using hybrid
-                   rendering and submitted that list type directly. */
-            if(!(pvr_state.lists_enabled & (1 << i)) || !b->base[i])
+            /* We never enabled the list globally with pvr_init() - skip it */
+            if(!(pvr_state.lists_enabled & (1 << i)))
+                continue;
+
+            /* If any lists weren't used in this scene, submit blank ones now */
+            if(!(pvr_state.lists_closed & (1 << i))) {
+                pvr_list_begin(i);
+                pvr_blank_polyhdr(i);
+                pvr_list_finish();
+            }
+
+            /* We never associated an in-RAM DMA vertex buffer with the given
+               list type, because we're using hybrid rendering and submitted
+               that list type directly - skip it */
+            if(!b->base[i])
                 continue;

             // Make sure there's at least one primitive in each.
@@ -326,6 +352,8 @@ int pvr_scene_finish(void) {
             assert(b->ptr[i] <= b->size[i]);
         }

+        pvr_start_ta_rendering();
+
         // Flip buffers and mark them complete.
         o = irq_disable();
         pvr_state.dma_buffers[pvr_state.ram_target].ready = 1;
@@ -333,6 +361,8 @@ int pvr_scene_finish(void) {
         irq_restore(o);

         pvr_sync_stats(PVR_SYNC_BUFDONE);
+
+        pvr_start_dma();
     }
     else {
         /* If a list was open, close it */
@@ -355,11 +385,16 @@ int pvr_scene_finish(void) {
 }

 int pvr_wait_ready(void) {
-    int t;
+    int flags, t = 0;

     assert(pvr_state.valid);

-    t = sem_wait_timed((semaphore_t *)&pvr_state.ready_sem, 100);
+    flags = irq_disable();
+
+    if(pvr_state.ta_busy)
+        t = genwait_wait((void *)&pvr_state.ta_busy, "PVR wait ready", 100, NULL);
+
+    irq_restore(flags);

     if(t < 0) {
 #if 0
@@ -383,7 +418,7 @@ int pvr_wait_ready(void) {
 int pvr_check_ready(void) {
     assert(pvr_state.valid);

-    if(sem_count((semaphore_t *)&pvr_state.ready_sem) > 0)
+    if(!pvr_state.ta_busy)
         return 0;
     else
         return -1;


hooks/post-receive
--
A pseudo Operating System for the Dreamcast.