User guide

Mark Cooper Greg Wilsbacher

AEO-Light User Guide

What is AEO-Light?

AEO-Light is open-source software that extracts audio from the optical sound tracks of motion picture film. AEO-Light is produced at the University of South Carolina by a team comprising faculty and staff from the University Libraries’ Moving Image Research Collections (MIRC) and the College of Arts and Sciences’ Interdisciplinary Mathematics Institute (IMI). Project funding comes from the Preservation and Access Division of the National Endowment for the Humanities. AEO-Light is available through an open-source licensing agreement. The complete terms are available in the AEO-Light “ReadMe” file and in the “About” menu.

Using AEO-Light

AEO-Light extracts audio from film scans that meet the following requirements:

  • The scans must be made so that the scanned image includes the optical soundtrack in addition to the image-frame.
  • The scans must also be configured so that some information above and below each image-frame is included. The minimum amount of vertical overscan required for AEO-Light has yet to be determined. Users are encouraged to start with a generous vertical overscan.
  • The scans must contain enough resolution to provide meaningful audio information. The minimum resolution required to produce acceptable audio is as yet undetermined, although audio has been produced from 1024 x 768 scans of 16mm film. Users are encouraged to scan at the highest resolutions possible for initial tests.
  • AEO-Light is not designed to process optical-sound-only tracks (also known as double-system tracks), but additional testing is being done to improve the software’s ability to extract high-quality audio from such tracks.

AEO-Light Requirements

  • 64-bit Windows, Mac, or Linux
  • AEO-Light application
  • AEO-Light Unix executable file (Mac and Linux)
  • MATLAB Compiler Runtime (MCR) 2012b (v.8) for Mac or 2013a (v.8.1) for Windows
  • FFmpeg v. 0.11 or later (required for video export functionality). See http://ffmpeg.org for documentation and downloads. AEO-Light beta has been tested against the static builds provided by Tessus (Mac) at http://www.evermeet.cx/ffmpeg/ and Zeranoe (Win) at http://ffmpeg.zeranoe.com/builds/. Users unfamiliar with FFmpeg are encouraged to install one of the Windows or Mac static builds.
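
Before installing AEO-Light it can be useful to confirm that FFmpeg is reachable from the command line. The following is a minimal Python sketch, not part of AEO-Light; it assumes the executable is named "ffmpeg" and is on the system PATH. The function names are illustrative only.

```python
import shutil
import subprocess


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to FFmpeg if it is on PATH, otherwise None."""
    return shutil.which(executable)


def ffmpeg_version_line(path):
    """Return the first line printed by `ffmpeg -version`."""
    result = subprocess.run([path, "-version"],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()[0]
```

If find_ffmpeg() returns None, FFmpeg is not visible on the PATH; otherwise ffmpeg_version_line() reports the installed version so it can be compared with the v. 0.11 minimum.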

Installing AEO-Light

  1. Unzip and install the Matlab Compiler Runtime (MCR).
  2. Drag the aeolight.exe file into the Programs folder (Windows). Drag the aeolight.app and aeolight executable to the Applications folder (Mac).
  3. Install FFmpeg. Note: a guide for installing FFmpeg from the Tessus static build is included in the zipped application package for the Mac.

Basic Instructions

  1. Launch AEO-Light by clicking on the application (Windows) or by clicking on the ‘aeolight’ UNIX executable (Mac). Note: this release of AEO-Light is known to have application launch times of up to 30 seconds.
  2. On first launch users must agree to the terms of use in order to use the software.
  3. Select “New Project” from the dialog window.
  4. Select the source file(s) for processing. AEO-Light can read a variety of formats: DPX, TIF (full color and grayscale), AVI, and MOV. By default the source selection menu displays all file types. Users may choose to restrict the available files to a specific type (DPX, MOV, AVI, etc.).
    1. When importing a folder of frame scans (DPX or TIF images), navigate to the desired directory and select one of the individual frame files. AEO-Light will scan the directory to find all of the similarly-named files.
      • This directory must contain a single, contiguous sequence of DPX files whose names have a common prefix and differ only in a fixed-length index field occupying the positions immediately antecedent to the file extension. For example, f_00.dpx, f_01.dpx, ... , f_87.dpx is a valid sequence.
      • By comparison, f1_00.dpx, f2_00.dpx, ... , f8_00.dpx does not satisfy the criterion for file naming, nor does f_00a.dpx, f_01a.dpx, etc... AEO-Light will automatically load all files in the sequence after the frame selected.
    2. If importing a video file, select the AVI or MOV for processing.
    3. NOTE: Although external drives are supported, they are discouraged: slow data transfer will sharply reduce the per-minute rate of extraction and may cause the program to fail.
  5. AEO-Light reads the information and then displays the main window showing the video.
  6. Save the project by clicking on the “diskette” icon on the main window or selecting “save project” from the drop-down menu. AEO-Light project files are saved with the “.aeo” extension. The project file contains all of the settings for the project, including the location of the source file. The source file is not copied into the project file.
  7. By default AEO-Light will process the entire video sequence. However, users may select a portion of the video for processing by moving the slider to the desired locations and pressing the IN and OUT buttons. Users may also specify the frame number by directly typing a number into the display box and then pressing the IN and OUT button. Multiple IN and OUT points are not supported.
  8. Define the region from which the optical sound will be extracted by selecting the “Locate” button.
    1. Move the red bounding box over the optical sound track.
    2. Adjust the width of the bounding box so that the left and right edges fall on the edges of the track area. The software will use your selections to set the parameters used on all frames during processing.
      • Variable density tracks can be narrowed quite considerably; a narrow box may be used to avoid severe linear scratches in the track area.
      • Variable area tracks require greater caution when setting the bounding box to ensure that the audio peaks are not cut off or “clipped” by the bounding box.
    3. Double click with the pointer inside the bounding box. AEO-Light will randomly select a number of frames.
    4. Repeat this process for each selected frame. From the second frame on, only making the box narrower will affect the audio extraction process. All other changes are ignored. When AEO-Light has sampled enough frames the window will close. You may restart the “Locate Track” sequence at any time during this process by closing the window and returning to the main GUI.
    5. More than one bounding box may be defined by repeating this process (the benefits of multiple bounding boxes are discussed in the “Advanced User” section).
  9. Select the “Extract” button. This initiates the AEO-Light process. Depending on the configuration of the user’s computer and the type of input, extraction rates will vary from 3 to 14 frames per second. Once the extraction process is complete, a dialog box will notify the user.
  10. The default audio file may be sampled by selecting the “Play” button. AEO-Light saves the raw version automatically to the project’s render folders. Saving the project at this stage will allow a user to close AEO-Light and reopen the project without having to redo the audio extraction process.
  11. Select the “Export” button.
  12. To export audio only, select “audio only” from the drop-down menu, select “export” and specify a file name and location.
    1. By default AEO-Light exports 16-bit audio with a sampling rate derived from the resolution of the image input.
    2. Users may specify a particular bit depth and sampling rate by choosing from the menu options on the right of the export window. Resampling is done by FFmpeg. If FFmpeg is not installed, only the default setting can be used.
  13. To export synchronized audio and video, select “Video with audio” from the drop-down menu, define the video format and frame offset, select “export” and specify a file name and location. The export settings available depend upon the image input:
    1. If the film source was a video format, AEO-Light can synchronize the extracted audio with the original video file.
    2. If the film source was a frame format (e.g., DPX), then users must specify the encoding method for the video. AEO-Light defaults to H.264, but users may select ProRes (specifying a bit rate) or uncompressed video. If the “Preview or edit FFmpeg” option is selected, users may modify the FFmpeg command as desired.
    3. Specify the required frame offset, desired audio sampling rate, and bit depth.
  14. AEO-Light will automatically launch the audio (.wav) or video file using the default application as set by the operating system. If users experience difficulty with playback, a third-party player like VLC should be used to open the exported files.
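
The file-naming rule for frame sequences described in step 4 can be checked before a folder is loaded. The sketch below is one interpretation of that rule, written in Python for illustration; it is not AEO-Light's own loader, and the function name is_valid_sequence is hypothetical. It assumes the index field is made of decimal digits.

```python
import re

# prefix, then a run of digits, then ".extension" at the very end of the name
_NAME = re.compile(r"^(?P<prefix>.*?)(?P<index>\d+)\.(?P<ext>[^.]+)$")


def is_valid_sequence(filenames):
    """Check for one shared prefix and extension, a fixed-width numeric index
    directly before the extension, and no gaps in the index values."""
    parsed = []
    for name in filenames:
        match = _NAME.match(name)
        if match is None:
            return False  # e.g. f_00a.dpx: the index is not next to the extension
        parsed.append((match["prefix"], match["index"], match["ext"].lower()))
    if not parsed:
        return False
    prefixes = {prefix for prefix, _, _ in parsed}
    widths = {len(index) for _, index, _ in parsed}
    extensions = {ext for _, _, ext in parsed}
    if len(prefixes) != 1 or len(widths) != 1 or len(extensions) != 1:
        return False
    indices = sorted(int(index) for _, index, _ in parsed)
    # Contiguous means the sorted indices form an unbroken run.
    return indices == list(range(indices[0], indices[0] + len(indices)))
```

For example, the names f_00.dpx through f_87.dpx pass this check, while f1_00.dpx, f2_00.dpx, ... fail because the prefixes differ.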

Advanced Instructions (in progress)

Multiple Bounding Boxes and Stereo Tracks

AEO-Light supports the simultaneous extraction of two or more defined audio regions. This functionality supports the production of two-channel stereo tracks. It also allows multiple versions of a single mono track to be extracted, to compensate for damage to different parts of the optical track area.

The process for setting multiple bounding boxes differs little from setting a single box:

  1. Select “Locate” and follow the procedure for defining the first bounding box.
  2. Select “Locate” a second time and follow the procedure for defining the second bounding box.
  3. Select “Extract” to begin the audio extraction process.
    • Note: AEO-Light automatically processes each bounding box defined on the main menu regardless of which box is highlighted. The time to complete the process will increase as the number of bounding boxes increases.
  4. By default, the audio track produced by the highlighted bounding box is the audio played back when “Play” is selected.
  5. Highlight two bounding boxes and select “Export”.
  6. Follow the prompts to create a multichannel, stereo output.
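
AEO-Light builds the stereo file itself through the export prompts. For readers who want to see what a two-channel file is, the sketch below interleaves two separately exported mono .wav files using Python's standard wave module. It is an illustration only, it assumes both inputs share the same sampling rate and bit depth, and the function name is hypothetical.

```python
import wave


def merge_mono_to_stereo(left_path, right_path, out_path):
    """Interleave two mono WAV files into one two-channel WAV file."""
    with wave.open(left_path, "rb") as left, wave.open(right_path, "rb") as right:
        if left.getnchannels() != 1 or right.getnchannels() != 1:
            raise ValueError("both inputs must be mono")
        if (left.getframerate(), left.getsampwidth()) != (
                right.getframerate(), right.getsampwidth()):
            raise ValueError("inputs must share sampling rate and bit depth")
        width = left.getsampwidth()
        rate = left.getframerate()
        # Use the shorter track so every output frame has both channels.
        frames = min(left.getnframes(), right.getnframes())
        left_bytes = left.readframes(frames)
        right_bytes = right.readframes(frames)

    stereo = bytearray()
    for i in range(0, frames * width, width):
        stereo += left_bytes[i:i + width]   # left-channel sample
        stereo += right_bytes[i:i + width]  # right-channel sample

    with wave.open(out_path, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(width)
        out.setframerate(rate)
        out.writeframes(bytes(stereo))
```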

Project Settings

The project settings pane allows users to adjust the default settings for the core audio extraction process, as outlined below. The following summary of the audio extraction method will orient users to the impact certain variables have on the process. AEO-Light extraction has four main steps:

  1. Read frames from input,
  2. Create calibration mask,
  3. Register frame overlap, and
  4. Extract audio.

The quality of the extracted audio is heavily affected by steps 2 and 3, so the variables for those steps should be examined first when troubleshooting the process.

THE FOLLOWING SETTINGS APPLY TO STEP ONE:

  • Fast vs. Safe

The routine AEO-Light uses to extract frames from video files may be unreliable depending on the video source (for example, when the video is on a slow external drive). If there are problems reading video, switch to safe reading mode. Safe mode uses the same routine, but takes extra time to check the frames read to verify that they are reasonable. If the check fails, the frames are read and checked again. Subsequent reads typically have a better chance of succeeding due to caching.

This parameter has no effect when reading from a frame sequence (DPX files), since that doesn't require the video reader.
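
The read, check, and re-read pattern of safe mode can be sketched as follows. This is a Python illustration of the idea only. The callables read_batch and looks_reasonable are hypothetical stand-ins for AEO-Light's video reader and frame check, whose internals are not described in this guide, and the limit of three attempts is an assumption.

```python
def read_frames_safely(read_batch, looks_reasonable, start, count, max_attempts=3):
    """Read `count` frames beginning at `start`, re-reading if the check fails.

    `read_batch(start, count)` returns a list of frames and
    `looks_reasonable(frames)` returns True or False; both are supplied by
    the caller (hypothetical helpers, not AEO-Light functions).
    """
    for _ in range(max_attempts):
        frames = read_batch(start, count)
        if len(frames) == count and looks_reasonable(frames):
            return frames
    raise IOError(f"could not read frames {start}..{start + count - 1} "
                  f"after {max_attempts} attempts")
```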

  • Frames per Batch

When reading frames from a video file, AEO-Light can read several frames at once to reduce the overhead associated with opening the video. For example, reading two frames at a time halves the overhead compared with reading one frame at a time. The amount of speed that can be gained is limited by available memory, however. If more frames are read than can fit in memory, the system will swap them in and out to disk, crippling performance. (This is called disk thrashing.) Windows users are encouraged to use the Automatic option (below) to calculate the number of frames per batch based on available memory. The Automatic option is not available for Linux and Mac users because of a limitation in MATLAB. Mac and Linux users should experiment with frames-per-batch values to determine what is optimal for their systems. It is reasonable to test the system with a value of 20 frames per batch.

This parameter has no effect when reading from a frame sequence (DPX files), since the access overhead is incurred on each frame regardless of how many frames are read at a time.

  • Automatic

(Windows only). If checked, AEO-Light automatically sets the number of frames per batch to maximize the number of frames in memory without exceeding available memory (which would cause paging) at any stage of the algorithm. Due to limitations in MATLAB, this option is not available on OS X or Linux.
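
Mac and Linux users who must choose a frames-per-batch value by hand can start from a rough memory estimate. The Python sketch below shows one way to make that estimate. It is not the calculation the Automatic option performs, and the working_copies factor (room for intermediate arrays) is an assumption rather than a documented AEO-Light value.

```python
def frames_per_batch(width, height, channels, bytes_per_sample,
                     available_bytes, working_copies=3):
    """Estimate how many frames fit in memory at once.

    `working_copies` is an assumed safety factor for intermediate data
    created during processing; adjust it after experimenting.
    """
    bytes_per_frame = width * height * channels * bytes_per_sample
    return max(1, available_bytes // (bytes_per_frame * working_copies))
```

A value computed this way is only a starting point; the guide's advice to experiment still applies.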

THE FOLLOWING VARIABLES APPLY TO STEP TWO:

AEO-Light can attempt to adjust each frame's sound signal to compensate for the effects of uneven illumination during scanning. The calibration mask is constructed by smoothing an averaged sound signal. Each frame's sound signal is then divided by the mask. This step is optional. If quality audio can be extracted from scans without calibration skipping this step will expedite the extraction process.

  • Signals (Frames) per Mask

The number of sound signals to average together to produce a calibration mask. These signals are taken consecutively from the beginning of the frame sequence. To use all available signals, enter "inf" (infinite) or use any number larger than the number of frames.
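
The averaging and division steps of calibration can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AEO-Light's implementation: each sound signal is taken to be a list with one value per image row, and the smoothing step that turns the average into the final mask is left out here.

```python
import math


def average_signal(signals, signals_per_mask=math.inf):
    """Average the first `signals_per_mask` signals, row by row.

    Passing math.inf (or any number larger than the number of signals)
    uses every available signal.
    """
    count = len(signals) if signals_per_mask >= len(signals) else int(signals_per_mask)
    if count < 1:
        raise ValueError("at least one signal is required")
    used = signals[:count]
    return [sum(column) / count for column in zip(*used)]


def apply_mask(signal, mask):
    """Divide one frame's sound signal by the calibration mask, row by row.

    Assumes the mask contains no zeros.
    """
    return [value / m for value, m in zip(signal, mask)]
```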

  • Rows to Ignore

The maximum number of image rows in each frame that can be excluded from consideration. Excluded rows can form one block at the top and another block at the bottom of an image. The specified number puts a cap on the total in both blocks. The value can be expressed as number of rows directly or as a percentage of the total number of rows in each frame.

After creating the calibration mask, AEO-Light uses one of two methods to reduce defects in the calibration curve, thereby improving the quality of the calibration mask. User experimentation can determine which of the methods, Moving Average or Polynomial Fit, provides better defect modeling.

  1. Moving Average Options

  • Sweeps

The number of sweeps (passes) of the moving average to apply to the signal. The more sweeps that are applied, the smoother the calibration mask becomes.

  • Half-span

The span (in rows) in each direction to use in producing the calibration mask. The larger the half-span is, the smoother the calibration mask becomes.
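
The effect of the sweeps and half-span settings can be seen in a small Python sketch of a repeated moving average. This is an illustration, not AEO-Light's code; in particular, truncating the window at the ends of the signal is an assumed edge treatment.

```python
def moving_average(signal, half_span, sweeps=1):
    """Smooth `signal` with a centered moving average, repeated `sweeps` times.

    Each output row is the mean of the rows within `half_span` of it; the
    window is truncated at the two ends of the signal (an assumption).
    """
    smoothed = list(signal)
    for _ in range(sweeps):
        previous = smoothed
        smoothed = []
        for i in range(len(previous)):
            window = previous[max(0, i - half_span):i + half_span + 1]
            smoothed.append(sum(window) / len(window))
    return smoothed
```

Raising either setting flattens the curve further: a larger half-span averages over more rows in each pass, and each extra sweep averages the already smoothed result again.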

  2. Polynomial Fit Options

  • Polynomial of Degree

Use polynomials of the selected degrees to produce a calibration mask.

Specify the degree(s) of the polynomial to be fitted through the averaged sound signal. Degree 1 corresponds to a linear polynomial (a straight line). The polynomial from the specified class of polynomials that best fits the averaged sound signal is used as the calibration mask.
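
A polynomial calibration mask of a single degree can be sketched in Python with an ordinary least-squares fit. This is a generic textbook method (normal equations solved by Gaussian elimination) offered as an illustration; the guide does not state how AEO-Light performs its fit. The function name is hypothetical, and the sketch handles one degree at a time.

```python
def polynomial_mask(values, degree):
    """Return the least-squares polynomial of the given degree, evaluated at
    every row of `values` (the averaged sound signal)."""
    n = degree + 1
    if len(values) < n:
        raise ValueError("need at least degree + 1 rows")
    # Rescale row positions to [0, 1] to keep the equations well conditioned.
    scale = max(len(values) - 1, 1)
    xs = [i / scale for i in range(len(values))]
    # Augmented normal equations [A^T A | A^T y] for the monomial basis.
    m = [[sum(x ** (r + c) for x in xs) for c in range(n)]
         + [sum(y * x ** r for x, y in zip(xs, values))]
         for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution gives the coefficients, lowest power first.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
```

With degree 1 the mask is the best-fitting straight line through the averaged signal; higher degrees follow broader curvature in the illumination.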

THE FOLLOWING VARIABLES APPLY TO STEP THREE:

AEO-Light must register the overlapping portions of the sound signals in order to piece them together into a continuous audio track for the whole movie. This is done by looking at either the sound signals themselves ("Overlap by Sound") or the whole scanned frame ("Overlap by Image").

AEO-Light guesses a typical amount of overlap between any two consecutive frames. This box cannot be unchecked, because guessing is fast and very beneficial when reasonably correct.

  • Frames for overlap guess

AEO-Light first samples pairs of consecutive frames (chosen at random from the film) and registers them against each other to find the typical amount of overlap between scanned frames. This typical value is used as the starting point for registration for every pair of frames (reducing the time needed for registration, so long as the typical value is reliable).

Type in the number of frame pairs to sample to determine the typical overlap. We recommend using a minimum of 400 frame pairs.
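
The sampling idea behind the overlap guess can be sketched in Python as follows. This is an illustration only. The callable register_pair is a hypothetical stand-in for the registration of one frame pair, and using the median as the "typical" value is an assumption, since the guide does not say which statistic AEO-Light uses.

```python
import random
import statistics


def guess_typical_overlap(frame_count, register_pair, pairs_to_sample=400, seed=None):
    """Estimate the typical overlap (in rows) between consecutive frames.

    `register_pair(i)` must return the measured overlap between frame i and
    frame i + 1 (hypothetical helper supplied by the caller).
    """
    rng = random.Random(seed)
    candidates = range(frame_count - 1)  # index of the first frame in each pair
    sample_size = min(pairs_to_sample, len(candidates))
    chosen = rng.sample(candidates, sample_size)
    return statistics.median(register_pair(i) for i in chosen)
```

Sampling many pairs makes the estimate robust: a handful of badly registered pairs will not move the median.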

  • Overlap by Sound

Calculate the overlap between every pair of consecutive frames by using sound-like signals to represent them.

  • Overlap by Image

Calculate the overlap between every two consecutive frames by using the full frames.

If both "by sound" and "by image" are selected, both will be done but only the "by image" part will ultimately be used. The log file and visualizations will reflect both steps, however.

  • Initial search radius

This is the expected radius (range in each direction) the overlap guess might need to be adjusted for a given pair of frames. E.g., if a scanner rarely slips by more than 10 rows (up or down), then enter 10. AEO-Light finds the best fit within this range, and if that fit is validated (see below), it is used as the registration for that pair of frames.

  • Validation Radius

An overlap registration is considered reliable if it produces a better match than any other overlap within the validation radius of it.

  • Maximum Overlap

The maximal number of rows by which a valid overlap can differ from the overlap guess without requiring further investigation, e.g., if a scanner never slips by more than 20 rows (up or down), then enter 20.
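
How the initial search radius and validation radius interact can be sketched in Python. This is an illustration under stated assumptions, not AEO-Light's registration code: each frame is reduced to a one-dimensional signal, the match score is a mean squared difference, and the function names are hypothetical. The further investigation that follows a failed validation, bounded by Maximum Overlap, is not shown.

```python
def mismatch(upper, lower, overlap):
    """Mean squared difference between the last `overlap` rows of the upper
    frame's signal and the first `overlap` rows of the lower frame's signal."""
    tail, head = upper[-overlap:], lower[:overlap]
    return sum((a - b) ** 2 for a, b in zip(tail, head)) / overlap


def register_overlap(upper, lower, guess, search_radius, validation_radius):
    """Return (overlap, validated) for one pair of consecutive signals."""
    limit = min(len(upper), len(lower))

    def candidates(center, radius):
        return range(max(1, center - radius), min(limit, center + radius) + 1)

    # Best fit within the search radius of the overlap guess.
    best = min(candidates(guess, search_radius),
               key=lambda k: mismatch(upper, lower, k))
    best_cost = mismatch(upper, lower, best)
    # Valid only if it beats every other overlap within the validation radius.
    validated = all(mismatch(upper, lower, k) > best_cost
                    for k in candidates(best, validation_radius) if k != best)
    return best, validated
```

Because the validation window is centered on the best fit rather than on the guess, it can reach beyond the search window. A best fit found at the edge of the search window therefore fails validation when a better match lies just outside it.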

THE FOLLOWING VARIABLES APPLY TO STEP FOUR:

Having prepared the audio signal for extraction in steps 2 and 3, AEO-Light extracts the complete audio signal and encodes it as a .wav file.

  • Frame Rate

Type in the frame rate of the original film when reading from .dpx or .tif files. The frame rate of a video source is determined automatically.
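
The frame rate matters because it sets how many image rows pass per second, and therefore how fast the audio plays. As a worked example, assume (this is an assumption, not a documented formula) that each image row not shared with the previous frame contributes one audio sample. The native sampling rate is then the number of unique rows per frame multiplied by the frame rate:

```python
def native_sample_rate(frame_rows, overlap_rows, frames_per_second):
    """Samples per second, assuming one audio sample per unique image row."""
    unique_rows = frame_rows - overlap_rows
    return unique_rows * frames_per_second
```

Under that assumption, a scan with 2000 rows per frame, 200 of which overlap the next frame, played at 24 frames per second gives 1800 × 24 = 43,200 samples per second. Entering the wrong frame rate would change the playback speed and pitch accordingly.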

OTHER VARIABLES:

  • Create Visual Diagnostics

Save graphs and images representing intermediate results for presentation, diagnostics, and troubleshooting. Visualizations slow down the extraction process and cause rather bothersome blinking on the screen as the numerous figures are quickly drawn and deleted.

  • Run in parallel (main window)

Run in parallel if two or more processors (cores) are available. The maximal number of cores available for starting a pool of parallel workers is determined automatically. Typically, four or more cores are needed for noticeable speed-up. Running in parallel speeds up only the computational portions of the extraction process. Reading times remain unaffected.

Preservation Considerations

Although combined audio and video will be the goal for most users, those interested in long-term preservation of the content should anticipate that future technologies may affect the quality of the audio that can be extracted from digital image files and/or the ability to play back the muxed audio-visual file. The AEO-Light authors therefore recommend that users retain:

  1. The digital image source file(s)
  2. The extracted audio .wav file
  3. A record of the AEO-Light software version, audio file creation date and size upon creation, and the portion of the image file from which the audio signal has been created (in and out frames as well as track area).

To assist in recording this information, AEO-Light can produce a rudimentary PREMIS xml file. Simply click the check box when saving the audio file.

NOTE: This PREMIS file will record the audio filename as an identifier; best practice would employ a naming convention that encourages unique and persistent filenames. The file has been designed with the expectation that it will be supplemented by information specific to a digital repository on ingest. For more on PREMIS see: http://www.loc.gov/standards/premis/
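
Users who keep their own records alongside the PREMIS file can gather the recommended details in a simple sidecar record. The Python sketch below is an illustration only: it does not produce PREMIS XML, the field names are hypothetical, the track area is assumed to be a (left, right) pair of pixel columns, and the creation date is approximated by the file's modification time.

```python
import datetime
import json
import os


def build_extraction_record(audio_path, software_version, in_frame, out_frame, track_area):
    """Collect the details the guide recommends retaining for an extracted file."""
    stat = os.stat(audio_path)
    # Modification time is used as a stand-in for the creation date.
    created = datetime.datetime.fromtimestamp(stat.st_mtime, datetime.timezone.utc)
    return {
        "audio_file": os.path.basename(audio_path),
        "software": "AEO-Light",
        "software_version": software_version,
        "created_utc": created.isoformat(),
        "size_bytes": stat.st_size,
        "in_frame": in_frame,
        "out_frame": out_frame,
        "track_area": list(track_area),
    }


def save_record(record, json_path):
    """Write the record next to the audio file as human-readable JSON."""
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)
```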

Providing Feedback

AEO-Light Ver. 0.9 (beta) is provided to users for testing. All feedback is vital to the development of the software but we are keenly interested in reports on the following issues.

  • Performance. Data about PC configuration combined with the frames-per-minute processing rate; this information is displayed at the end of each audio extraction.
  • Quality. Subjective and objective evaluation of the audio quality, synchronization quality, etc.
  • Scanner configuration. AEO-Light is designed to be scanner and sensor neutral provided the scans meet the basic criteria outlined above. The development team values feedback about the types of scanners and sensors used to produce the DPX, TIF or video input processed by AEO-Light. Whenever possible, the team would like sample input to help with our evaluation of the software’s performance and to better contextualize the feedback provided on other issues. Unless permission for use is provided by the tester, any such scans will be used for internal evaluation only for the purposes of developing AEO-Light.

Share feedback with the community, http://sourceforge.net/p/aeolight/discussion/ OR
submit feedback via the web, http://imi.cas.sc.edu/mirc-feedback/

All Rights Reserved [License]
AEO-Light, Ver. 0.9 (Beta)
Greg Wilsbacher, Borislav Karaivanov, Pencho Petrushev, and Mark Cooper
Additional programming by L. Scott Johnson. Testing and support by Brittany Braddock. Logo design by Ashley Blewer.

