Re: [xine-devel] [PATCH] new xine deinterlacer plugin (tvtime)

On Wed, 18 Jun 2003, Billy Biggs wrote:

> James Slorach (jr...@ja...):
> 
> > On 17 Jun 2003, Miguel Freitas wrote:
> > 
> > > On Tue, 2003-06-17 at 18:47, James Slorach wrote:
> > > > It makes a significant improvement, but still looks worse than the 
> > > > progressively upsampled version, especially on diagonal edges between red 
> > > > and green.
> > > 
> > > ok, i just changed it a bit to filter more aggressively. i don't know the
> > > response of this filter but it looks a little better. what do you think?
> > 
> > Slightly better than before, but still worse than the progressive 
> > upsample.
> > 
> > Here is a clearer example:
> > http://janx.org/chroma_upsample.html
> 
>   Ok, so let's try and get a better idea of what the problem is that
> we're seeing.
> 
>   So, we have content in 4:2:0 that is interlaced.  We're MPEG2 content,
> so our chroma looks like this:
> 
>    F1    F2
> 1  C1
> 2        C2
> 3   :
> 4         :

I was under the impression that 4:2:0 chroma samples were located between 
the lines: http://www.mir.com/DMG/chroma.html

I have a cunning plan. I shall work out the relationships between chroma 
samples in the up-sampled 4:2:2 image and the original. These 
relationships shall be expressed as a set of vertical, 1-dimensional 
convolution filters (written horizontally to save space). Different lines 
in the image may have different convolution filters (depending on the 
line number modulo 4).

Convolution filter:
a b c d e f g h j

C is a chroma sample in the original 4:2:2 image; C' is a chroma sample in
our 4:2:2 image up-sampled from a 4:2:0 image; i is the line number.

C'[i] = (a*C[i-4] + b*C[i-3] + c*C[i-2] + d*C[i-1] + e*C[i]
         + f*C[i+1] + g*C[i+2] + h*C[i+3] + j*C[i+4])
        / (a + b + c + d + e + f + g + h + j)
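
As a rough sketch of the formula above (Python; the function name 
apply_filter and the choice to copy edge lines through untouched are my 
own assumptions, not anything taken from the plugin), a single 9-tap 
filter could be applied to one column of chroma like this:

```python
def apply_filter(column, taps):
    """Apply a normalised 9-tap vertical filter to one column of chroma.

    column: chroma values, one per line (0-based list here).
    taps:   the nine weights a b c d e f g h j for lines i-4 .. i+4.
    Lines closer than 4 to either edge are copied through unchanged,
    since edge cases are ignored in this discussion.
    """
    assert len(taps) == 9
    total = sum(taps)
    out = list(column)
    for i in range(4, len(column) - 4):
        acc = sum(w * column[i + k - 4] for k, w in enumerate(taps))
        out[i] = acc / total
    return out


# The 1 2 1 blur written out as a 9-tap filter.
blur = [0, 0, 0, 1, 2, 1, 0, 0, 0]
print(apply_filter([0, 0, 0, 0, 0, 8, 8, 8, 8, 8], blur))
```

A real implementation would pick the taps per line according to the line 
number modulo 4; this sketch uses one set of taps for every line to keep 
it short.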

Assumptions:

Linear interpolation is used for down-sampling. (Does anybody know what 
the MPEG2 spec. says? Or what encoders actually do in practice?)

We ignore compression artifacts.

The ideal convolution is symmetric and weighted towards the centre, 
resembling a Gaussian blur such as 1 2 1  or  1 4 6 4 1. There will be a 
blurring effect due to the loss of resolution.

We ignore the edge cases.

Cb and Cr chroma samples are treated exactly the same way.

The two fields are from the same frame (progressive originally, but 
encoded interlaced). This means that chroma up-sampling must be performed 
after 3:2 pulldown detection has put the correct fields together into 
frames.

Notation:

c is a 4:2:2 chroma sample
d is a 4:2:0 chroma sample
c' is 4:2:2 up-sampled from 4:2:0
i is a line (numbered from 1)

   Top   Bottom
  Field  Field

1  c1
   d1
2         c2

3  c3
          d2
4         c4

5  c5
   d3
6         c6

7  c7
          d4
8         c8

Progressive down-sampling:

d1 = (c1 + c2)/2
d2 = (c3 + c4)/2
d3 = (c5 + c6)/2
d4 = (c7 + c8)/2

Progressive up-sampling by nearest neighbour:

c'3 = d2
c'4 = d2
c'5 = d3
c'6 = d3

c'3 = (c3 + c4)/2
c'4 = (c3 + c4)/2
c'5 = (c5 + c6)/2
c'6 = (c5 + c6)/2

odd  :  0  0  0  0  1  1  0  0  0
even :  0  0  0  1  1  0  0  0  0
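
The step from an expression in c to a row of filter weights can be done 
mechanically. Here is a small sketch, assuming a hypothetical helper 
to_taps that takes the output line number and a map of 
{original line: weight} and returns the nine integer weights a..j:

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_taps(line, weights):
    """Convert {source line: weight} for output line `line` into the
    nine taps a..j (offsets -4..+4), scaled to the smallest integers."""
    taps = [Fraction(0)] * 9
    for src, w in weights.items():
        taps[src - line + 4] += Fraction(w)
    # Multiply by the least common denominator so every tap is an integer.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (t.denominator for t in taps), 1)
    ints = [int(t * scale) for t in taps]
    common = reduce(gcd, ints)
    return [v // common for v in ints]


def d_progressive(k):
    """d_k = (c_{2k-1} + c_{2k}) / 2 as a map from line number to weight."""
    return {2 * k - 1: Fraction(1, 2), 2 * k: Fraction(1, 2)}


odd = to_taps(3, d_progressive(2))   # c'3 = d2
even = to_taps(4, d_progressive(2))  # c'4 = d2
print("odd  :", odd)
print("even :", even)
```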

Progressive up-sampling by linear interpolation:

c'3 = (d1 + 3d2)/4
c'4 = (3d2 + d3)/4
c'5 = (d2 + 3d3)/4
c'6 = (3d3 + d4)/4

c'3 = (c1 + c2 + 3c3 + 3c4)/8
c'4 = (3c3 + 3c4 + c5 + c6)/8
c'5 = (c3 + c4 + 3c5 + 3c6)/8
c'6 = (3c5 + 3c6 + c7 + c8)/8

odd  :  0  0  1  1  3  3  0  0  0
even :  0  0  0  3  3  1  1  0  0
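
The 1:3 and 3:1 weights come straight from the sample positions: d_k sits 
half-way between lines 2k-1 and 2k, i.e. at 2k - 1/2. A sketch of that 
calculation (the helper name linear_weights is hypothetical, and it 
assumes positions increase with the sample index):

```python
from fractions import Fraction


def linear_weights(line, sites):
    """Weights for linear interpolation at `line`.

    sites: {sample index: vertical position}, positions increasing
    with the index. Uses the two samples that bracket the line.
    """
    lo = max(k for k, pos in sites.items() if pos <= line)
    hi = min(k for k, pos in sites.items() if pos >= line)
    if lo == hi:
        return {lo: Fraction(1)}
    span = sites[hi] - sites[lo]
    return {lo: (sites[hi] - line) / span, hi: (line - sites[lo]) / span}


# Progressive 4:2:0 siting: d1..d4 at lines 1.5, 3.5, 5.5, 7.5.
sites = {k: Fraction(4 * k - 1, 2) for k in range(1, 5)}

for line in (3, 4, 5, 6):
    print(line, linear_weights(line, sites))
```

For line 3 this gives 1/4 of d1 and 3/4 of d2, matching 
c'3 = (d1 + 3d2)/4.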

Interlaced down-sampling:

d1 = (3c1 + c3)/4
d2 = (c2 + 3c4)/4
d3 = (3c5 + c7)/4
d4 = (c6 + 3c8)/4

Interlaced up-sampling by nearest neighbour:

c'3 = d1
c'4 = d2
c'5 = d3
c'6 = d4

c'3 = (3c1 + c3)/4
c'4 = (c2 + 3c4)/4
c'5 = (3c5 + c7)/4
c'6 = (c6 + 3c8)/4

i mod 4 = 3 :  0  0  3  0  1  0  0  0  0
i mod 4 = 0 :  0  0  1  0  3  0  0  0  0
i mod 4 = 1 :  0  0  0  0  3  0  1  0  0
i mod 4 = 2 :  0  0  0  0  1  0  3  0  0

This is the method currently used by the tvtime plugin. Clearly, c'3 and 
c'6 are heavily biased to one side, deviating wildly from our ideal 
symmetric convolution. This is the cause of these artifacts:

http://janx.org/dirty_pair_deinterlace_film_xshm_no_scale.png
http://janx.org/pioneer_deinterlace_film.png

Interlaced down-sample + progressive up-sample by nearest neighbour:

c'3 = (c2 + 3c4)/4
c'4 = (c2 + 3c4)/4
c'5 = (3c5 + c7)/4
c'6 = (3c5 + c7)/4

i mod 4 = 3 :  0  0  0  1  0  3  0  0  0
i mod 4 = 0 :  0  0  1  0  3  0  0  0  0
i mod 4 = 1 :  0  0  0  0  3  0  1  0  0
i mod 4 = 2 :  0  0  0  3  0  1  0  0  0

This is the method used when I hacked the plugin to use progressive 
up-sampling. Note that the bias from the interlaced up-sample is 
considerably reduced. Note also that the other two lines (c'4 and c'5) are 
identical to the interlaced method. This reduced bias, bringing the 
convolution closer to our ideal, makes this method generate a 4:2:2 image 
closer to the original, despite using the 'wrong' method:

http://janx.org/dirty_pair_deinterlace_film_prog_xshm_no_scale.png
http://janx.org/pioneer_deinterlace_film_prog.png
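
One crude way to put a number on the bias is the centre of mass of each 
filter, in lines relative to the output line. This measure is my own 
choice for illustration (it says nothing about how wide a filter is), 
and the function name centre_of_mass is hypothetical. The weights are the 
two nearest neighbour tables above:

```python
def centre_of_mass(taps):
    """Weighted mean offset (in lines) of a 9-tap filter; 0 is centred."""
    offsets = range(-4, 5)
    return sum(o * w for o, w in zip(offsets, taps)) / sum(taps)


# Keyed by i mod 4, copied from the two nearest neighbour tables.
interlaced_nn = {
    3: [0, 0, 3, 0, 1, 0, 0, 0, 0],
    0: [0, 0, 1, 0, 3, 0, 0, 0, 0],
    1: [0, 0, 0, 0, 3, 0, 1, 0, 0],
    2: [0, 0, 0, 0, 1, 0, 3, 0, 0],
}
mixed_nn = {
    3: [0, 0, 0, 1, 0, 3, 0, 0, 0],
    0: [0, 0, 1, 0, 3, 0, 0, 0, 0],
    1: [0, 0, 0, 0, 3, 0, 1, 0, 0],
    2: [0, 0, 0, 3, 0, 1, 0, 0, 0],
}

for name, kernels in (("interlaced", interlaced_nn), ("mixed", mixed_nn)):
    worst = max(abs(centre_of_mass(t)) for t in kernels.values())
    print(name, "worst offset:", worst, "lines")
```

By this measure the worst line is off by 1.5 lines with the interlaced 
up-sample and by 0.5 lines with the progressive up-sample.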

Interlaced up-sampling by linear interpolation:

c'3 = (5d1 + 3d3)/8
c'4 = (7d2 + d4)/8
c'5 = (d1 + 7d3)/8
c'6 = (3d2 + 5d4)/8

c'3 = (15c1 + 5c3 + 9c5 + 3c7)/32
c'4 = (7c2 + 21c4 + c6 + 3c8)/32
c'5 = (3c1 + c3 + 21c5 + 7c7)/32
c'6 = (3c2 + 9c4 + 5c6 + 15c8)/32

i mod 4 = 3 :  0  0 15  0  5  0  9  0  3
i mod 4 = 0 :  0  0  7  0 21  0  1  0  3
i mod 4 = 1 :  3  0  1  0 21  0  7  0  0
i mod 4 = 2 :  3  0  9  0  5  0 15  0  0
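
The /32 weights are just the /8 up-sampling weights multiplied through 
the /4 down-sampling weights. A sketch that performs the substitution 
with exact fractions (the table layout and the name compose are 
assumptions made for this example):

```python
from fractions import Fraction as F

# Interlaced down-sampling: {original line: weight} per 4:2:0 sample.
d = {
    1: {1: F(3, 4), 3: F(1, 4)},
    2: {2: F(1, 4), 4: F(3, 4)},
    3: {5: F(3, 4), 7: F(1, 4)},
    4: {6: F(1, 4), 8: F(3, 4)},
}

# Interlaced linear up-sampling: {4:2:0 sample: weight} per output line.
up = {
    3: {1: F(5, 8), 3: F(3, 8)},
    4: {2: F(7, 8), 4: F(1, 8)},
    5: {1: F(1, 8), 3: F(7, 8)},
    6: {2: F(3, 8), 4: F(5, 8)},
}


def compose(up_weights):
    """Substitute the down-sampling expressions into one up-sampled line."""
    out = {}
    for k, wu in up_weights.items():
        for line, wd in d[k].items():
            out[line] = out.get(line, 0) + wu * wd
    return out


for i in sorted(up):
    combined = compose(up[i])
    # Print the numerators over the common denominator 32.
    print(i, {line: int(w * 32) for line, w in sorted(combined.items())})
```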

This has a similar problem to the nearest neighbour interlaced up-sample: 
c'3 and c'6 put their largest weight two lines away from the centre.

Interlaced down-sample + progressive up-sample by linear interpolation:

c'3 = (3c1 + 3c2 + c3 + 9c4)/16
c'4 = (3c2 + 9c4 + 3c5 + c7)/16
c'5 = (c2 + 3c4 + 9c5 + 3c7)/16
c'6 = (9c5 + c6 + 3c7 + 3c8)/16

i mod 4 = 3 :  0  0  3  3  1  9  0  0  0
i mod 4 = 0 :  0  0  3  0  9  3  0  1  0
i mod 4 = 1 :  0  1  0  3  9  0  3  0  0
i mod 4 = 2 :  0  0  0  9  1  3  3  0  0

Still closer to the ideal convolution than the interlaced method.

It would be interesting to investigate the effect of different up-sampling 
methods (other than nearest neighbour and linear), and also of smoothing 
the chroma afterwards. I suspect that both interlaced and progressive 
methods would benefit from smoothing (since all the convolution filters 
appear to diverge from the ideal).

If the chroma up-sampling is performed after 3:2 pulldown detection, the 
smoothing and up-sampling could be performed in one step. This would avoid 
the need to do the smoothing in place.

If the progressive method is used, it must be performed after 3:2 pulldown 
detection. Using the progressive method with two fields from different 
frames will result in horrible artifacts as chroma leaks from one field
into the other.

Another possibility is using the nearest neighbour interlaced method, 
then, after 3:2 pulldown detection, swapping the bad lines (c'2 and c'3, 
c'6 and c'7, etc). The effect of this would be to shift the convolutions 1 
place left or right, bringing them closer to the symmetric centre weighted 
ideal. In the nearest neighbour case, the result of this will be identical 
to the progressive method (but the 3:2 detection code will not have to be 
modified to handle 4:2:0).
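
The equivalence in the nearest neighbour case can be checked on an 8-line 
frame. This is only a sketch: it tracks which 4:2:0 sample each line ends 
up with, not pixel values, and the helper name nearest is made up. 
Sample d_k is taken to sit at line 2k - 0.5, as in the diagram above.

```python
def nearest(line, ks):
    """Index of the 4:2:0 sample (from ks) closest to the given line."""
    return min(ks, key=lambda k: abs((2 * k - 0.5) - line))


N = 8                                   # lines in the frame
ks = range(1, N // 2 + 1)               # d1..d4
top = [k for k in ks if k % 2 == 1]     # samples carried by the top field
bottom = [k for k in ks if k % 2 == 0]  # samples carried by the bottom field

# Interlaced: each line only sees samples from its own field.
interlaced = {i: nearest(i, top if i % 2 else bottom)
              for i in range(1, N + 1)}
# Progressive: each line sees every sample.
progressive = {i: nearest(i, ks) for i in range(1, N + 1)}

# Swap the bad pairs: (2, 3), (6, 7), ...
swapped = dict(interlaced)
for i in range(2, N, 4):
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]

print("interlaced :", interlaced)
print("swapped    :", swapped)
print("progressive:", progressive)
print("identical  :", swapped == progressive)
```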

If the linear interpolation is too expensive to compute, it might be 
replaced by:

Progressive (this is equivalent to shifting the chroma up half a line):

c'3 = d2
c'4 = (d2 + d3)/2
c'5 = d3
c'6 = (d3 + d4)/2

Interlaced (this is equivalent to shifting the chroma up half a line for 
the top field and down half a line for the bottom field):

c'3 = (d1 + d3)/2
c'4 = d2
c'5 = d3
c'6 = (d2 + d4)/2
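
Both cheap variants only need a copy or an average of two samples per 
line. A sketch, with hypothetical function names, taking the 4:2:0 
samples as a list (d[0] is d1) and returning {line: value} for the lines 
that can be computed without special edge handling:

```python
def cheap_progressive(d):
    """Odd line 2k-1 copies d_k; even line 2k averages d_k and d_(k+1)."""
    out = {}
    for k in range(1, len(d)):          # k is the 1-based sample index
        out[2 * k - 1] = d[k - 1]
        out[2 * k] = (d[k - 1] + d[k]) / 2
    return out


def cheap_interlaced(d):
    """Top-field samples are treated as sitting on lines 1, 5, 9, ...
    and bottom-field samples on lines 4, 8, 12, ...; the line half-way
    between two samples of the same field gets their average."""
    out = {}
    for k in range(1, len(d) + 1):
        site = 2 * k - 1 if k % 2 else 2 * k
        out[site] = d[k - 1]
        if k + 2 <= len(d):             # next sample in the same field
            out[site + 2] = (d[k - 1] + d[k + 1]) / 2
    return out


d = [10, 20, 30, 40]                    # d1..d4
p = cheap_progressive(d)
q = cheap_interlaced(d)
print([p[i] for i in (3, 4, 5, 6)])
print([q[i] for i in (3, 4, 5, 6)])
```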

It might also be interesting to write some kind of filter for GIMP (or 
something) to simulate the effect of different methods in a controlled 
way (not relying on DVDs, which may have been encoded using unknown
methods and for which we do not have the original 4:2:2 image).
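
Short of a GIMP filter, the same kind of controlled test can be sketched 
on a single column of chroma. Everything here is an assumption made for 
illustration: the function names, the 8-line vertical ramp used as the 
"original" 4:2:2 data, and the sum of absolute differences as the error 
measure. The down-sample is the interlaced one given above and both 
up-samples are nearest neighbour.

```python
def downsample_interlaced(c):
    """c: {line: value} for lines 1..N, N a multiple of 4."""
    d = {}
    for k in range(1, len(c) // 2 + 1):
        if k % 2:   # top-field sample: (3*c[2k-1] + c[2k+1]) / 4
            d[k] = (3 * c[2 * k - 1] + c[2 * k + 1]) / 4
        else:       # bottom-field sample: (c[2k-2] + 3*c[2k]) / 4
            d[k] = (c[2 * k - 2] + 3 * c[2 * k]) / 4
    return d


def nearest(line, ks):
    """Index of the sample closest to `line`; d_k sits at 2k - 0.5."""
    return min(ks, key=lambda k: abs((2 * k - 0.5) - line))


def upsample_nn(d, n_lines, progressive):
    out = {}
    for i in range(1, n_lines + 1):
        # Interlaced: a line may only use samples from its own field.
        ks = sorted(d) if progressive else [k for k in d if k % 2 == i % 2]
        out[i] = d[nearest(i, ks)]
    return out


def total_error(a, b):
    return sum(abs(a[i] - b[i]) for i in a)


original = {i: 10 * i for i in range(1, 9)}     # vertical ramp
d = downsample_interlaced(original)
err_interlaced = total_error(original, upsample_nn(d, 8, progressive=False))
err_progressive = total_error(original, upsample_nn(d, 8, progressive=True))
print(err_interlaced, err_progressive)
```

On this ramp the interlaced up-sample accumulates twice the error of the 
progressive one (80 against 40). Other test patterns may separate the two 
methods less clearly, so this is a starting point rather than a result.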

[snip]
>   And that's what you're seeing as looking 'real nice' in your
> screenshots.  The only plausible reason I could see that this would be
> right is if someone did some sort of transcode from a progressive 4:2:0
> format to an interlaced one, like an MPEG2->MPEG2 transcode, and I don't
> think it's what's going on here.  However, it's closer to case 2 than
> doing things 'the right way' (case 1), so, maybe that's why it's better?

Hopefully, I have shown that the progressive method really does produce 
better results, even with correctly down-sampled DVDs (it would help if I 
knew precisely what the correct down-sample filter is).

>   So, those are what I think the possibilities are.  Filtering the
> chroma just makes things look smoother and I'd rather hold off on
> seriously doing that until we know what the situation is here.

I think that filtering the chroma is still helpful, whichever up-sampling 
method is used.

>   In my opinion, the only sane thing to do when given an interlaced
> MPEG2 frame as our source is to upsample using case 1 and filter our
> chroma nicely.  Case 2 will only look better if the encoder uses point
> sampling, and arguably no good encoder would (maybe we should compare
> with some Criterion DVDs?), and case 3 should, in my opinion, never ever
> happen.  But I'd like to know conclusively.

Wouldn't the Criterion DVDs be progressively encoded anyway?

> 
>   Please let me know if you think there's a case I'm missing.
> 
>   Hope this helps,
>   -Billy
> 

James