
#129 Firewire audio is captured at the wrong rate, slowing video capture

0.72
New
Tetsujin
None
Medium
Linux
Unspecified
Defect
2015-06-19
2015-06-19
Tetsujin
No

What steps will reproduce the problem?
1. Start capture from a Firewire (DV) camera
2. Wait a couple minutes

What goes wrong?
Instrumentation reveals that the clock time required to capture audio from the Firewire device increases dramatically.

The Operating system you are using (Linux, Windows etc)?
Linux

What version of WebcamStudio are you using?
0.73 (from SVN)

What version of Java are you using?
java version "1.8.0_45"
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)

What is your Webcamera vendor, model and version?
Sony DCR-TRV17 (DV Camcorder)

This is related to the various other issues I've been reporting with respect to DV and multiple sources: the feed not running fast enough, suddenly getting very choppy, severely lagging, and crashing. I think this report encapsulates one piece of the problem, and may indicate another class of problems that need to be addressed in the frame builder loop.

I added some code to MasterFrameBuilder that measures how much time is spent waiting for data to be read in from the individual sources. I found that the time taken to read a frame from a DV feed fluctuates quite a bit. It starts out low but pretty quickly climbs above 30ms. Profiling indicates that SourceDV.readNext() spends about 95% of its time in Capturer.getNextAudio(). Digging deeper, getNextAudio() is pretty much just a call to audioIn.readFully(audio) - which attempts to completely fill the buffer audio[] from the audioIn stream.
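
For reference, the instrumentation was roughly this shape (a sketch only, not a verbatim patch; source.readNext() stands in for the per-source read MasterFrameBuilder already performs, and the 15ms threshold is arbitrary):

    // Sketch of the timing instrumentation (names simplified): wrap the
    // per-source read with a wall-clock measurement so slow reads show up
    // in the console.
    long start = System.currentTimeMillis();
    source.readNext();                                  // SourceDV.readNext() for the DV feed
    long waitedMs = System.currentTimeMillis() - start;
    if (waitedMs > 15) {
        System.out.println(source + " read blocked for " + waitedMs + " ms");
    }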

My best guess at this point is that WCS is attempting to read audio (and perhaps video data as well) at slightly the wrong rate. I think that for every frame of video, gstreamer is producing (X) samples of audio, while at every frame WCS is trying to read in (Y) samples of audio, where (Y < X). I also think MasterFrameBuilder's timing loop may be incorrect: At each frame (n) of a 30fps stream, MasterFrameBuilder expects System.currentTimeMillis() to yield a value of approximately (t0 + 33ms * n) - in other words, the timing loop isn't truly 30fps, it's (1000ms / 33ms) = 30.3fps. Meanwhile, 30fps video usually isn't 30fps either, it's actually (30 / 1.001) = 29.97fps - if I'm right, then MasterFrameBuilder is actually always trying to read frame data faster than sources produce it.
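
A quick back-of-the-envelope check of how far apart those two rates drift (just arithmetic; the 33ms loop period is what the report above describes, and 1001/30 ms is the nominal NTSC frame period):

    // How fast a 33ms-per-frame loop pulls ahead of a 29.97fps source.
    double loopPeriodMs   = 33.0;              // timing loop: 1000/33 ≈ 30.3 fps
    double sourcePeriodMs = 1001.0 / 30.0;     // NTSC 29.97 fps ≈ 33.367 ms/frame
    double driftPerFrameMs  = sourcePeriodMs - loopPeriodMs;                  // ≈ 0.37 ms/frame
    double driftPerMinuteMs = driftPerFrameMs * (60_000.0 / sourcePeriodMs);  // ≈ 660 ms/min
    System.out.printf("loop gains ~%.2f ms per frame, ~%.0f ms per minute%n",
            driftPerFrameMs, driftPerMinuteMs);

If that's right, within a couple of minutes the loop has asked for well over a second's worth of data that the source hasn't produced yet, which lines up with when the slowdown starts to show.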

This has obvious implications for reading audio data: unlike video data, the size of audio data is determined per unit of time, rather than per frame, so a small change in frame rate means a small change in the size of audio data per frame as well:
At 29.97fps, a 22050Hz stereo audio stream should produce an average of around 2942.9 bytes of data per frame.
At 30.0fps, the same stream would produce around 2940 bytes of data per frame.
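
Spelling out that arithmetic (this assumes 16-bit samples, i.e. 4 bytes per stereo sample, which is what makes those figures come out):

    // Bytes of 22050Hz 16-bit stereo audio per video frame at the two rates.
    double bytesPerSecond = 22050 * 2 /*channels*/ * 2 /*bytes per sample*/;   // 88200
    double perFrameAt2997 = bytesPerSecond / (30.0 / 1.001);   // ≈ 2942.9 bytes/frame
    double perFrameAt30   = bytesPerSecond / 30.0;             // = 2940.0 bytes/frame
    System.out.printf("29.97fps: %.1f bytes/frame, 30fps: %.1f bytes/frame%n",
            perFrameAt2997, perFrameAt30);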

Somehow, I think SourceDV reaches a point where it is able to immediately read in the current frame of video, but some of the current frame's audio data has already been read, so readFully() winds up waiting around 30ms for the next frame of audio data to become available so it can read part of that, too. I'm not sure how that happens (perhaps the TCP buffer for the audio stream filled up and gstreamer dropped some data?), but otherwise it fits what I'm seeing: on most frames, SourceDV is only able to get part of the audio data immediately, and for the rest it has to wait for the next frame of data. The amount of data it can get without waiting increases by 4 bytes every few frames, at an average rate of about 2.9 bytes per frame, which is pretty close to the 29.97fps vs. 30.0fps discrepancy (2942.9 bytes per frame - 2940.0 bytes per frame = 2.9 bytes per frame).
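
Worth noting: 4 bytes is exactly one 16-bit stereo sample, so if gstreamer hands audio over in whole samples, the immediately-available amount would grow in 4-byte steps while averaging the ~2.9 bytes/frame rate difference (again just arithmetic, assuming 16-bit stereo):

    // The 4-byte step size is one whole 16-bit stereo sample; the ~2.9
    // bytes/frame average is the 29.97 vs 30.0 fps discrepancy.
    double driftPerFrame = 2942.9 - 2940.0;        // ≈ 2.9 bytes/frame
    double framesPerStep = 4.0 / driftPerFrame;    // ≈ 1.4 frames per 4-byte step
    System.out.printf("~%.1f bytes/frame, i.e. one extra stereo sample roughly every %.1f frames%n",
            driftPerFrame, framesPerStep);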

I'm not entirely sure how to address that - the current design is based on the assumption that we can read a certain number of bytes of audio data per frame, for every frame of video data. But the audio data in the DV cam feed is 2940 bytes on some frames, and 2944 bytes on others... I'll need to see if our design can accommodate a varying amount of available audio data per frame.
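
One possible shape for that (a sketch only, not the current design; the 22050Hz/16-bit/stereo figures come from above, while the class, method names, and hand-off are hypothetical): keep a fractional byte budget per frame so the long-run read rate matches the audio clock, and round each read down to whole 4-byte samples.

    // Sketch: pace the blocking reads to the audio clock instead of assuming a
    // fixed 2940 bytes per frame. Not WCS code; names are made up.
    import java.io.DataInputStream;
    import java.io.IOException;

    class PacedAudioReader {
        private static final double BYTES_PER_FRAME = 22050.0 * 2 * 2 * 1.001 / 30.0; // ≈ 2942.94
        private final DataInputStream audioIn;
        private final byte[] scratch = new byte[4096];
        private double budget = 0;

        PacedAudioReader(DataInputStream audioIn) { this.audioIn = audioIn; }

        /** Called once per video frame; returns how many audio bytes were read. */
        int readFrameAudio() throws IOException {
            budget += BYTES_PER_FRAME;
            int toRead = ((int) budget) & ~3;        // whole 4-byte stereo samples only
            audioIn.readFully(scratch, 0, toRead);   // still blocking, but for the right amount
            budget -= toRead;
            return toRead;
        }
    }

With those constants the reads come out as 2940 bytes on some frames and 2944 on others, matching what the DV feed actually delivers. It doesn't fix the timing-loop drift by itself, but it would at least stop readFully() from demanding the wrong byte count every frame.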

As a short-term fix I'm looking into methods to turn off audio capture (since I don't use it anyway). For a more permanent fix we'll want a solution that works properly when audio is on.

Discussion

