Thread: [Mlt-devel] Kdenlive A/V sync issue at some frame and sample rates [PATCH]

Brought to you by: ddennedy, lilo_booter

mlt-devel

[Mlt-devel] Kdenlive A/V sync issue at some frame and sample rates [PATCH]

From: <mi...@mi...> - 2009-07-26 14:58:09

Greetings.

I opened Kdenlive issue 1034 a couple of weeks ago.  The problem that Ive
been having is A/V drifting out of sync when rendering a long duration
project with a frame rate of 24000/1001 (approx. 23.976).  After some
investigation, I think the issue lies within the mlt_sample_calculator
function in MLT.

Specifically, at 24000/1001 FPS and 48000 Hz, there should be exactly 2002
audio samples per frame.  But, due to numerical inaccuracies in the
floating point calculation, the MLT function computes the number of
samples to be 2001.9999 which is then truncated to 2001 when the value is
assigned to an integer.  This one sample per frame error causes A/V to
drift out of sync by about 1.8 seconds per hour.

Rounding to the nearest integer instead of truncating would fix this
particular case, but could still result in as much as a half sample error
for other frame rate / sample rate combinations.  Ive confirmed that
there are still A/V sync issues at 44100 Hz and 32000 Hz when rounding to
the nearest integer.  Instead, I propose the fix in the patch at the end
of this message.  The computed samples/frame dithers on a frame by frame
basis to prevent accumulating errors due to rounding.  It seems to fix my
Kdenlive problems.

I occurs to me that the function mlt_sample_calculator_to_now might
benefit from a similar change.  I notice that fps is cast to an integer
before being used to compute the cumulative number of samples.  Such a
cast will also cause truncation of some frame rates.  I'm not sure where
this function is used but my Kdenlive sync issue still seems fixed if I
leave it untouched.

Finally, it seems to me that this change makes the special case handling
for 29.97 fps in both functions unnecessary.  Does it exist for another
reason beyond preventing accumulating errors?

What do you think?

Thanks,
Mike

http://www.kdenlive.org/mantis/view.php?id=1034

--- src/framework/mlt_frame.c.orig	2009-07-25 20:50:08.000000000 -0600
+++ src/framework/mlt_frame.c	2009-07-25 20:51:58.000000000 -0600
@@ -713,7 +713,18 @@
 	}
 	else if ( fps != 0 )
 	{
-		samples = frequency / fps;
+		/* Compute the cumulative number of samples until the start of this
frame and the
+		   cumulative number of samples until the start of the next frame. 
Round each to the
+		   nearest integer and take the difference to determine the number of
samples in
+		   this frame.
+
+		   This approach should prevent rounding errors that can accumulate
over a large number
+		   of frames causing A/V sync problems. */
+
+		int samples_at_this = (int)((double) position    * (double)frequency /
(double)fps + 0.5);
+		int samples_at_next = (int)((double)(position+1) * (double)frequency /
(double)fps + 0.5);
+
+		samples = samples_at_next - samples_at_this;
 	}

 	return samples;

Re: [Mlt-devel] Kdenlive A/V sync issue at some frame and sample rates [PATCH]

From: Dan D. <da...@de...> - 2009-07-26 18:16:10

On Sun, Jul 26, 2009 at 7:34 AM, <mi...@mi...> wrote:
> Greetings.
>
> I opened Kdenlive issue 1034 a couple of weeks ago.  The problem that I’ve
> been having is A/V drifting out of sync when rendering a long duration
> project with a frame rate of 24000/1001 (approx. 23.976).  After some
> investigation, I think the issue lies within the mlt_sample_calculator
> function in MLT.

Not surprising since 24000/1001 is not a special case handled in that function.

> Specifically, at 24000/1001 FPS and 48000 Hz, there should be exactly 2002
> audio samples per frame.  But, due to numerical inaccuracies in the
> floating point calculation, the MLT function computes the number of
> samples to be 2001.9999… which is then truncated to 2001 when the value is
> assigned to an integer.  This one sample per frame error causes A/V to
> drift out of sync by about 1.8 seconds per hour.
>
> Rounding to the nearest integer instead of truncating would fix this
> particular case, but could still result in as much as a half sample error
> for other frame rate / sample rate combinations.  I’ve confirmed that
> there are still A/V sync issues at 44100 Hz and 32000 Hz when rounding to
> the nearest integer.  Instead, I propose the fix in the patch at the end
> of this message.  The computed samples/frame dithers on a frame by frame
> basis to prevent accumulating errors due to rounding.  It seems to fix my
> Kdenlive problems.
>
> I occurs to me that the function mlt_sample_calculator_to_now might
> benefit from a similar change.  I notice that fps is cast to an integer
> before being used to compute the cumulative number of samples.  Such a
> cast will also cause truncation of some frame rates.  I'm not sure where
> this function is used but my Kdenlive sync issue still seems fixed if I
> leave it untouched.
>
> Finally, it seems to me that this change makes the special case handling
> for 29.97 fps in both functions unnecessary.  Does it exist for another
> reason beyond preventing accumulating errors?

It exists because the special cases for NTSC were the only ones cared
about when that function was written and because the integer math
worked fine for PAL frame rates for the only support sample rates at
that time (32000, 44100, 48000).

> What do you think?

In your patch do we need to be concerned with overflow and using
signed instead of unsigned?

> Thanks,
> Mike

Than you for your help to make this more generic and correct.

> http://www.kdenlive.org/mantis/view.php?id=1034
>
> --- src/framework/mlt_frame.c.orig      2009-07-25 20:50:08.000000000 -0600
> +++ src/framework/mlt_frame.c   2009-07-25 20:51:58.000000000 -0600
> @@ -713,7 +713,18 @@
>        }
>        else if ( fps != 0 )
>        {
> -               samples = frequency / fps;
> +               /* Compute the cumulative number of samples until the start of this
> frame and the
> +                  cumulative number of samples until the start of the next frame.
> Round each to the
> +                  nearest integer and take the difference to determine the number of
> samples in
> +                  this frame.
> +
> +                  This approach should prevent rounding errors that can accumulate
> over a large number
> +                  of frames causing A/V sync problems. */
> +
> +               int samples_at_this = (int)((double) position    * (double)frequency /
> (double)fps + 0.5);
> +               int samples_at_next = (int)((double)(position+1) * (double)frequency /
> (double)fps + 0.5);
> +
> +               samples = samples_at_next - samples_at_this;
>        }
>
>        return samples;
>
-- 
+-DRD-+

Re: [Mlt-devel] Kdenlive A/V sync issue at some frame and sample rates [PATCH]

From: <mi...@mi...> - 2009-07-27 12:44:05

> On Sun, Jul 26, 2009 at 7:34 AM, Mike wrote:
>> Greetings.
>>
>> I opened Kdenlive issue 1034 a couple of weeks ago.  The problem that
>> Ive
>> been having is A/V drifting out of sync when rendering a long duration
>> project with a frame rate of 24000/1001 (approx. 23.976).  After some
>> investigation, I think the issue lies within the mlt_sample_calculator
>> function in MLT.
>
> Not surprising since 24000/1001 is not a special case handled in that
> function.
>
>> Specifically, at 24000/1001 FPS and 48000 Hz, there should be exactly
>> 2002
>> audio samples per frame.  But, due to numerical inaccuracies in the
>> floating point calculation, the MLT function computes the number of
>> samples to be 2001.9999 which is then truncated to 2001 when the value
>> is
>> assigned to an integer.  This one sample per frame error causes A/V to
>> drift out of sync by about 1.8 seconds per hour.
>>
>> Rounding to the nearest integer instead of truncating would fix this
>> particular case, but could still result in as much as a half sample
>> error
>> for other frame rate / sample rate combinations.  Ive confirmed that
>> there are still A/V sync issues at 44100 Hz and 32000 Hz when rounding
>> to
>> the nearest integer.  Instead, I propose the fix in the patch at the end
>> of this message.  The computed samples/frame dithers on a frame by frame
>> basis to prevent accumulating errors due to rounding.  It seems to fix
>> my
>> Kdenlive problems.
>>
>> I occurs to me that the function mlt_sample_calculator_to_now might
>> benefit from a similar change.  I notice that fps is cast to an integer
>> before being used to compute the cumulative number of samples.  Such a
>> cast will also cause truncation of some frame rates.  I'm not sure where
>> this function is used but my Kdenlive sync issue still seems fixed if I
>> leave it untouched.
>>
>> Finally, it seems to me that this change makes the special case handling
>> for 29.97 fps in both functions unnecessary.  Does it exist for another
>> reason beyond preventing accumulating errors?
>
> It exists because the special cases for NTSC were the only ones cared
> about when that function was written and because the integer math
> worked fine for PAL frame rates for the only support sample rates at
> that time (32000, 44100, 48000).

NTSC-based frame rates are a real pain.

>> What do you think?
>
> In your patch do we need to be concerned with overflow and using
> signed instead of unsigned?

You raise good points.  Since I'm using regular ints to hold the
cumulative sample counts, there will be an overflow at 2^31 samples.  At
192,000 Hz (probably a reasonable upper bound on an audio sample rate)
that would overflow after only 3 hours.  Those should be int64_t instead.

Once that change is made, the limiting factor is probably the double
precision product of position and frequency.  A double precision floating
point number has the same precision as a 53 bit unsigned integer.  Again
assuming frequency=192000, we run out of precision after about 4.69E+10
frames.  It won't overflow, but rounding errors will start to appear. 
Assuming a maximum frame rate of 60 fps, it will take just under 25 years
before that limit is hit.  That seems like a reasonable limit to me.

I'm not sure what you mean by "using signed instead of unsigned."  My code
for rounding to the nearest integer doesnt work with negative numbers. 
Is there a potential for something to be negative here?  I apologize for
my ignorance but I've just barely dipped my toe into the MLT code. 
Negative frequency and fps don't make much sense to me.  But, I suppose a
negative position is plausible.  In that case my code will give a one
sample error with negative positions.  It shouldn't accumulate and cause
A/V sync problems, but it's still not quite right.  There's an easy fix.

I'm including two new patches.  You can pick the one you prefer, if
either.  The first is a replacement for the flawed patch in my original
post which changes just enough to fix my Kdenlive render problem.  The
second is, in my opinion, more complete.  It removes the special code for
29.97 fps since I think it's no longer needed and also tries to fix
mlt_sample_calculator_to_now.

Also, I saw this comment in the code:

/* Will this break when mlt_position is converted to double? -Zach */

Do I need to be concerned about it?  I don't quite understand it.  I've
been taking "position" to be a frame number.  Does this comment mean that
it will become a double?  The concept of a fractional frame number makes
my brain hurt.

>> Thanks,
>> Mike
>
> Than you for your help to make this more generic and correct.

I'm happy to be of any help I can.

>> http://www.kdenlive.org/mantis/view.php?id=1034

Here's the first patch that makes the minimal changes to fix my Kdenlive
problem:

--- src/framework/mlt_frame.c.orig	2009-07-25 20:50:08.000000000 -0600
+++ src/framework/mlt_frame.c	2009-07-26 16:41:34.000000000 -0600
@@ -713,7 +713,23 @@
 	}
 	else if ( fps != 0 )
 	{
-		samples = frequency / fps;
+		/* Compute the cumulative number of samples until the start of this
frame and the
+		   cumulative number of samples until the start of the next frame. 
Round each to the
+		   nearest integer and take the difference to determine the number of
samples in
+		   this frame.
+
+		   This approach should prevent rounding errors that can accumulate
over a large number
+		   of frames causing A/V sync problems. */
+
+		int64_t samples_at_this =
+			(int64_t)((double) position    * (double)frequency / (double)fps +
+			( position<0 ? -0.5 : 0.5 ));
+
+		int64_t samples_at_next =
+			(int64_t)((double)(position+1) * (double)frequency / (double)fps +
+			( position<0 ? -0.5 : 0.5 ));
+
+		samples = (int)(samples_at_next - samples_at_this);
 	}

 	return samples;



Or, you can use this one instead which I think is more complete:

--- src/framework/mlt_frame.c.orig	2009-07-25 20:50:08.000000000 -0600
+++ src/framework/mlt_frame.c	2009-07-26 19:06:09.000000000 -0600
@@ -677,43 +677,25 @@
 {
 	int samples = 0;

-	if ( ( int )( fps * 100 ) == 2997 )
+	if ( fps != 0 )
 	{
-		samples = frequency / 30;
+		/* Compute the cumulative number of samples until the start of this
frame and the
+		   cumulative number of samples until the start of the next frame. 
Round each to the
+		   nearest integer and take the difference to determine the number of
samples in
+		   this frame.
+
+		   This approach should prevent rounding errors that can accumulate
over a large number
+		   of frames causing A/V sync problems. */
+
+		int64_t samples_at_this =
+			(int64_t)((double) position    * (double)frequency / (double)fps +
+			( position<0 ? -0.5 : 0.5 ));
+
+		int64_t samples_at_next =
+			(int64_t)((double)(position+1) * (double)frequency / (double)fps +
+			( position<0 ? -0.5 : 0.5 ));

-		switch ( frequency )
-		{
-			case 48000:
-				if ( position % 5 != 0 )
-					samples += 2;
-				break;
-			case 44100:
-				if ( position % 300 == 0 )
-					samples = 1471;
-				else if ( position % 30 == 0 )
-					samples = 1470;
-				else if ( position % 2 == 0 )
-					samples = 1472;
-				else
-					samples = 1471;
-				break;
-			case 32000:
-				if ( position % 30 == 0 )
-					samples = 1068;
-				else if ( position % 29 == 0 )
-					samples = 1067;
-				else if ( position % 4 == 2 )
-					samples = 1067;
-				else
-					samples = 1068;
-				break;
-			default:
-				samples = 0;
-		}
-	}
-	else if ( fps != 0 )
-	{
-		samples = frequency / fps;
+		samples = (int)(samples_at_next - samples_at_this);
 	}

 	return samples;
@@ -723,26 +705,11 @@
 {
 	int64_t samples = 0;

-	// TODO: Correct rules for NTSC and drop the * 100 hack
-	if ( ( int )( fps * 100 ) == 2997 )
-	{
-		samples = ( ( double )( frame * frequency ) / 30 );
-		switch( frequency )
-		{
-			case 48000:
-				samples += 2 * ( frame / 5 );
-				break;
-			case 44100:
-				samples += frame + ( frame / 2 ) - ( frame / 30 ) + ( frame / 300 );
-				break;
-			case 32000:
-				samples += ( 2 * frame ) - ( frame / 4 ) - ( frame / 29 );
-				break;
-		}
-	}
-	else if ( fps != 0 )
+	if ( fps != 0 )
 	{
-		samples = ( ( frame * frequency ) / ( int )fps );
+		samples =
+			(int64_t)((double)frame * (double)frequency / (double)fps +
+			( frame<0 ? -0.5 : 0.5 ));
 	}

 	return samples;