Sigma Designs
46501 Landing Parkway
Fremont CA 94538
Phone: 510-770-0100
Fax: 510-770-2693
MPEG Extension to AVI File Format
"Editable MPEG File Format"
Draft Version 1.1 of 5/13/94
The objective of this document is to describe an extension to
the AVI file format that supports a specific type of MPEG files
[Ref 1], called Editable MPEG. Editable MPEG files offer significant
compression of video data while preserving an image quality close
to the original, and allowing frame accurate editing as well as
multiple compression-decompression sequences. These are significant
advantages over normal MPEG files, at the expense of a higher
compressed data rate: typically 600 KByte/s (vs. 150 KByte/s).
In addition, thanks to extensions to the Wave format, Editable
MPEG files can be interleaved and may contain MPEG-compressed
Audio data, thus further reducing the file size.
Finally, hardware accelerators will shortly be available to perform
real-time compression as well as the transcoding from Editable
MPEG to fully compressed MPEG. This will allow users of personal
computers to capture video in real time, edit it and store it
efficiently on their desktop computer.
The main benefits of the Editable MPEG file format are summarized
below:
- AVI encapsulation ensures compatibility with existing authoring
systems
- Provides significant compression of video data, while maintaining
high image quality
- Frame accurate editing
- Multiple compression-decompression iterations
- Supported by hardware accelerators which will allow capture,
editing and storage on the desktop
- Supports interleaved MPEG Audio
The MPEG standard [Ref 1] specifies a technique for significant
data compression of audio and video data. It has emerged as the
leading compression technique for multimedia applications. While
quite efficient for data storage, the MPEG format is not conducive
to video editing since some frames (P and B types) require significant
amounts of computation to obtain the decompressed image, and since
successive decompression-recompression operations usually affect
the image quality.
On the other hand, almost all multimedia tools and packages today
rely on the AVI file format to handle audio and video data. As
stated in the introduction, this document describes an extension
to the AVI file format that handles a particular type of MPEG
files and thus allows editing and conversion to a normal MPEG
file format, while maintaining the image quality.
The figure below shows a simplified data flow for an Authoring
application. The authoring system interfaces with the MPEG Compression
& Decompression Accelerator through the Video Codec. For editing,
Editable MPEG files are decompressed through the Video Codec and
the Accelerator and returned to the Authoring System as bitmaps,
which can then be manipulated, displayed or mixed with bitmaps
originating from other types of AVI files. The resulting bitmap
is then fed back to the Accelerator for compression either into
an Editable MPEG file or into a distributable MPEG format. In
practice the Accelerator may be a collection of hardware and/or
software devices.
[Figure: Authoring Application data flow]
Once editing is completed, the Editable MPEG file is transcoded
by the Accelerator into a regular MPEG file.
Note also that with the recent extension of the Wave format to
support the MPEG Audio compression standard [Ref 5], compressed
audio streams can be interleaved with the MPEG video stream following
the normal AVI interleaved format [Ref 2].
The Editable MPEG files contain only Intra-coded frames (I-frames).
An MPEG I-frame has self-contained information to represent a picture.
Each frame will include the MPEG sequence header with the quantization
matrix information, in order to facilitate editing operations
without requiring decompression and recompression. Furthermore,
we encapsulate each frame in a Group of Pictures (GOP) and thus
provide each frame with a SMPTE time code.
An MPEG AVI file is an AVI file. It has the mandatory RIFF structure
of AVI files, but uses a new extension to the DIB format.
Following is a diagram describing the RIFF form of the MPEG AVI
format.
Only a few parameters in the Stream Header Chunk need to be assigned
specific values:
- The MPEG AVI file does not use the "strd"
chunk (this area is typically used by the installable compression/decompression
driver).
- The fccType field is set to "vids",
which identifies the stream as a video stream (here, MPEG video).
- The fccHandler field is set to "MPGI"
which will activate the installable compression/decompression
VIDC.MPGI driver in the Microsoft Windows environment.
If index chunks are being used, each frame should be flagged
as a "key frame", using the AVIIF_KEYFRAME flag.
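As an illustration of the stream header and index settings above, the following C sketch packs the four-character codes and builds an index entry flagged as a key frame. MAKE_FOURCC mirrors what mmioFOURCC does. IndexEntry and make_index_entry are hypothetical names that follow the usual "idx1" entry layout, and the AVIIF_KEYFRAME value is the one found in the Video for Windows headers. None of this is defined by this specification.

```c
#include <stdint.h>

/* Pack four characters into a little-endian FOURCC, as mmioFOURCC does. */
#define MAKE_FOURCC(a, b, c, d) \
    ((uint32_t)(uint8_t)(a) | ((uint32_t)(uint8_t)(b) << 8) | \
     ((uint32_t)(uint8_t)(c) << 16) | ((uint32_t)(uint8_t)(d) << 24))

#define AVIIF_KEYFRAME 0x00000010u /* value from the Video for Windows headers */

/* Hypothetical mirror of one "idx1" index entry. */
typedef struct {
    uint32_t ckid;          /* chunk id, for example "00dc" */
    uint32_t dwFlags;
    uint32_t dwChunkOffset;
    uint32_t dwChunkLength;
} IndexEntry;

/* Every frame of an Editable MPEG stream is an I-frame,
 * so every index entry carries the key frame flag. */
IndexEntry make_index_entry(uint32_t ckid, uint32_t offset, uint32_t length)
{
    IndexEntry e = { ckid, AVIIF_KEYFRAME, offset, length };
    return e;
}
```

With these definitions, fccType would be MAKE_FOURCC('v','i','d','s') and fccHandler would be MAKE_FOURCC('M','P','G','I').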
The stream video format "strf" of an AVI stream
is an MPEG DIB extension. The MPEG DIB extension (EXBMINFOHEADER)
format contains the standard DIB header (BITMAPINFOHEADER) followed
by a structure (MPEGINFOHEADER) defined to describe the characteristics
of the MPEG video stream. The definitions are as follows:
The required values for the header fields are as follows:
- biSize = sizeof(BITMAPINFOHEADER)+sizeof(MPEGINFOHEADER);
- biPlanes = 1;
- biBitCount = 24;
- biCompression = mmioFOURCC('M','P','G','I');
- bPixApectRatio = [Ref 1]
bPixApectRatio | Height/Width | Example
| 0 | forbidden |
| 1 | 1.0000 | VGA etc.
| 2 | 0.6735 |
| 3 | 0.7031 | 16:9 - 625 lines
| 4 | 0.7615 |
| 5 | 0.8055 |
| 6 | 0.8437 | 16:9 - 525 lines
| 7 | 0.8935 |
| 8 | 0.9375 | CCIR 601 - 625 lines
| 9 | 0.9815 |
| 10 | 1.0255 |
| 11 | 1.0695 |
| 12 | 1.1250 | CCIR 601 - 525 lines
| 13 | 1.1575 |
| 14 | 1.2015 |
| 15 | reserved |
|
The MPEGINFOHEADER together with the AVI RIFF headers is meant
to provide the application, such as an authoring system, all the
information required to process and display the video stream.
It has been defined so as to ensure that two streams with identical
MPEGINFOHEADER can be merged into a single stream without having
to decompress/recompress the video data (assuming that the video
editing operation lends itself to it).
See section 6 for the correspondence between AVI-DIB and MPEG
parameters.
Following the AVI stream header is a LIST "movi" chunk
that contains the actual data of the stream. As in any RIFF chunk,
a four-character code is used to identify the chunk. The MPEG
AVI file uses "##dc" sub-chunks where ## is the stream
id in the AVI file and 'dc' for "DIB compressed". The
data chunk for the compressed DIB has the following form:
MPEG-I DIB '##dc'
BYTE abBits[];
abBits[] is a fully MPEG-1 compliant I-frame picture,
preceded by the sequence header, the GOP header and the Picture
header. Keeping extra information like the MPEG GOP header and
picture header with the compressed data for each I-frame picture
eases the work load of the video codec, since the whole abBits[]
can be sent to the decoder right away. Furthermore, sequence
headers allow cut/paste operations at any frame by providing the
quantization matrix information.
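The chunk layout described above can be sketched in C as follows. This is a minimal illustration, assuming the standard RIFF chunk framing (four-character id, 32-bit little-endian size, payload, one pad byte when the size is odd) and a two-digit decimal stream number. The function name write_dc_chunk is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write one "##dc" chunk (id, little-endian size, payload, optional pad byte)
 * into out. Returns the number of bytes written, or 0 if out is too small
 * or the stream number does not fit in two decimal digits. */
size_t write_dc_chunk(uint8_t *out, size_t out_cap, unsigned stream,
                      const uint8_t *frame, uint32_t frame_len)
{
    size_t padded = (size_t)frame_len + (frame_len & 1u); /* word alignment */
    size_t total = 8 + padded;

    if (stream > 99 || out_cap < total)
        return 0;

    out[0] = (uint8_t)('0' + stream / 10);
    out[1] = (uint8_t)('0' + stream % 10);
    out[2] = 'd';
    out[3] = 'c';
    out[4] = (uint8_t)(frame_len & 0xFF);          /* size excludes the pad byte */
    out[5] = (uint8_t)((frame_len >> 8) & 0xFF);
    out[6] = (uint8_t)((frame_len >> 16) & 0xFF);
    out[7] = (uint8_t)((frame_len >> 24) & 0xFF);
    memcpy(out + 8, frame, frame_len);             /* abBits[]: headers + I-frame */
    if (frame_len & 1u)
        out[8 + frame_len] = 0;
    return total;
}
```

Here frame is expected to hold the complete abBits[] content, that is, the sequence header, GOP header, picture header and picture data for one I-frame.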
The following shows the mandatory parameters in the Sequence header,
Group of Pictures header, and the Picture header.
For the detailed syntax of MPEG-1 streams see Reference 1.
Sequence header
MPEG Parameter | #bits
| sequence_header_code | 32
| horizontal_size_value | 12
| vertical_size_value | 12
| sample_aspect_ratio | 4
| frame_rate | 4
| bit_rate | 18
| marker_bit | 1
| vbv_buffer_size | 10
| constrained_parameter_flag | 1
| load_intra_quantizer_matrix | 1
| intra_quantizer_matrix | 8*64 (*)
| load_non_intra_quantizer_matrix | 1
|
(*) Note that the intra_quantizer_matrix is only
present if the load_intra_quantizer_matrix bit is set to
1. If load_intra_quantizer_matrix is set to 0, the default
quantization matrix specified by the MPEG standard is used.
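To make the bit layout of the Sequence header concrete, here is a small C sketch that reads the fields in the order and widths given in the table above. BitReader, SeqHeader and parse_sequence_header are hypothetical names used only for illustration, and the sketch does not keep the quantizer matrix itself; it only skips over it.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t size;    /* length of data in bytes */
    size_t bitpos;  /* next bit to read, counted from the start of data */
} BitReader;

/* Read n bits (n <= 32), most significant bit first.
 * Bits past the end of the buffer read as 0. */
uint32_t read_bits(BitReader *br, unsigned n)
{
    uint32_t v = 0;
    while (n--) {
        size_t byte = br->bitpos >> 3;
        uint32_t bit = 0;
        if (byte < br->size)
            bit = (uint32_t)(br->data[byte] >> (7 - (br->bitpos & 7))) & 1u;
        v = (v << 1) | bit;
        br->bitpos++;
    }
    return v;
}

typedef struct {
    uint32_t horizontal_size, vertical_size, aspect_ratio, frame_rate;
    uint32_t bit_rate, vbv_buffer_size;
    uint32_t constrained, load_intra_matrix, load_non_intra_matrix;
} SeqHeader;

/* Returns 1 on success, 0 if buf does not hold a plausible sequence header. */
int parse_sequence_header(const uint8_t *buf, size_t size, SeqHeader *h)
{
    BitReader br = { buf, size, 0 };

    if (size < 12 || read_bits(&br, 32) != 0x000001B3u)
        return 0;
    h->horizontal_size = read_bits(&br, 12);
    h->vertical_size = read_bits(&br, 12);
    h->aspect_ratio = read_bits(&br, 4);
    h->frame_rate = read_bits(&br, 4);
    h->bit_rate = read_bits(&br, 18);
    if (read_bits(&br, 1) != 1)            /* marker_bit must be 1 */
        return 0;
    h->vbv_buffer_size = read_bits(&br, 10);
    h->constrained = read_bits(&br, 1);
    h->load_intra_matrix = read_bits(&br, 1);
    if (h->load_intra_matrix)
        br.bitpos += 8 * 64;               /* skip intra_quantizer_matrix */
    h->load_non_intra_matrix = read_bits(&br, 1);
    return 1;
}
```

Without a quantizer matrix the header occupies exactly 96 bits, which is why the sketch requires at least 12 bytes.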
Group Of Pictures header
MPEG Parameter | #bits
| group_start_code | 32
| time_code | 25
| closed_gop | 1
| broken_link | 1
|
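The 25-bit time_code carries the SMPTE time code of the frame. As a sketch, assuming the MPEG-1 sub-field layout (drop_frame_flag 1 bit, hours 5 bits, minutes 6 bits, a marker bit set to 1, seconds 6 bits, pictures 6 bits), it can be packed and unpacked as follows. TimeCode and the two function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned drop_frame; /* 0 or 1 */
    unsigned hours;      /* 0..23 */
    unsigned minutes;    /* 0..59 */
    unsigned seconds;    /* 0..59 */
    unsigned pictures;   /* frame number within the second */
} TimeCode;

/* Pack the fields into the low 25 bits of the result. */
uint32_t pack_time_code(TimeCode tc)
{
    return ((uint32_t)(tc.drop_frame & 0x01u) << 24) |
           ((uint32_t)(tc.hours & 0x1Fu) << 19) |
           ((uint32_t)(tc.minutes & 0x3Fu) << 13) |
           (1u << 12) |                              /* marker bit */
           ((uint32_t)(tc.seconds & 0x3Fu) << 6) |
           (uint32_t)(tc.pictures & 0x3Fu);
}

/* Split a 25-bit time_code value back into its fields. */
TimeCode unpack_time_code(uint32_t v)
{
    TimeCode tc;
    tc.drop_frame = (v >> 24) & 0x01u;
    tc.hours = (v >> 19) & 0x1Fu;
    tc.minutes = (v >> 13) & 0x3Fu;
    tc.seconds = (v >> 6) & 0x3Fu;
    tc.pictures = v & 0x3Fu;
    return tc;
}
```

Because each Editable MPEG frame sits in its own GOP, an editor could use such a value to locate a frame by time code without decoding any picture data.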
Picture header
MPEG Parameter | #bits
| picture_start_code | 32
| temporal_reference | 10
| picture_coding_type | 3
| vbv_delay | 16
|
Parameter | Constrained Value | Comments
| sequence_header_code | 000001B3 |
| horizontal_size_value | |
| vertical_size_value | |
| sample_aspect_ratio | | If the code is not 0001, the image does not have square pixels. This may create artifacts in the display if not handled properly.
| frame_rate | |
| bit_rate | |
| marker_bit | 1 | Reserved
| vbv_buffer_size | |
| constrained_parameter_flag | 0 |
| load_intra_quantizer_matrix | |
| intra_quantizer_matrix | | Optional. Contains the 64 bytes of the quantizer table if load_intra_quantizer_matrix == 1
| load_non_intra_quantizer_matrix | 0 | All frames are Intra coded
| group_start_code | 000001B8 |
| time_code | |
| closed_gop | 1 | All GOPs are closed
| broken_link | 0 |
| picture_start_code | 00000100 |
| temporal_reference | 0 | 1 frame per GOP
| picture_coding_type | 001 | All frames are Intra coded
| vbv_delay | |
|
AVI Parameter | MPEG Parameter | Relationship
| dwMicroSecPerFrame | frame_rate | dwMicroSecPerFrame = 1e6 / frame_rate
| dwMaxBytesPerSec | bit_rate | dwMaxBytesPerSec = 50 * bit_rate
| dwSuggestedBufferSize | vbv_buffer_size | dwSuggestedBufferSize = vbv_buffer_size * 2048 (see Note 1)
| dwInitialFrames | | See Section 8.3
| dwWidth | horizontal_size_value | dwWidth = horizontal_size_value
| dwHeight | vertical_size_value | dwHeight = vertical_size_value
| dwScale | | See below
| dwRate | frame_rate | frame_rate = dwRate / dwScale
| dwQuality | | 5,000 * [1 + log10(bit_rate/3,000)]
0 <= dwQuality <= 10,000
See Note 2
| dwSampleSize | | 0
|
Note 1: The relationship between dwSuggestedBufferSize
& vbv_buffer_size is only meaningful in the case when the
application is performing the decompression itself. Otherwise,
these two numbers are independent, and the suggested buffer size
would then typically be, for interleaved files, the size of a
complete record.
Note 2: dwQuality is computed such that the baseline
bit rate of 1.2 Mbit/s corresponds to a quality level of 5,000,
and that a bit rate of 12 Mbit/s, where the quality of the compressed
video is typically indistinguishable from the original, is equal
to 10,000. The scaling factor '3,000' is due to the fact that
bit_rate is a number in units of 400 bits/second (bit_rate
with a value of 3,000 corresponds to an actual bit rate of 1.2
Mbit/s). Note that dwQuality must be constrained to be
between 0 and 10,000.
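The integer relationships in the table above can be sketched in C as follows. The frame_rate field is a 4-bit code, so the sketch maps it to a dwRate/dwScale pair using the MPEG-1 frame rate codes 1 to 8; the struct AviTiming and the function mpeg_to_avi_params are hypothetical names. dwQuality is left out because it needs a floating-point logarithm.

```c
#include <stdint.h>

/* dwRate / dwScale pairs for MPEG-1 frame_rate codes 1..8
 * (23.976, 24, 25, 29.97, 30, 50, 59.94 and 60 frames per second). */
static const uint32_t k_rate[9]  = { 0, 24000, 24, 25, 30000, 30, 50, 60000, 60 };
static const uint32_t k_scale[9] = { 0, 1001, 1, 1, 1001, 1, 1, 1001, 1 };

typedef struct {
    uint32_t dwMicroSecPerFrame;
    uint32_t dwMaxBytesPerSec;
    uint32_t dwSuggestedBufferSize;
    uint32_t dwRate;
    uint32_t dwScale;
} AviTiming;

/* Returns 1 on success, 0 if frame_rate_code is outside 1..8. */
int mpeg_to_avi_params(unsigned frame_rate_code, uint32_t bit_rate,
                       uint32_t vbv_buffer_size, AviTiming *out)
{
    if (frame_rate_code < 1 || frame_rate_code > 8)
        return 0;

    out->dwRate = k_rate[frame_rate_code];
    out->dwScale = k_scale[frame_rate_code];

    /* 1e6 / frame_rate, rounded to the nearest microsecond */
    out->dwMicroSecPerFrame =
        (uint32_t)((1000000ull * out->dwScale + out->dwRate / 2) / out->dwRate);

    /* bit_rate is in units of 400 bit/s, which is 50 bytes/s */
    out->dwMaxBytesPerSec = 50u * bit_rate;

    /* factor of 2048 bytes per vbv_buffer_size unit, as given in the table */
    out->dwSuggestedBufferSize = vbv_buffer_size * 2048u;
    return 1;
}
```

For example, a bit_rate value of 3,000 (1.2 Mbit/s) gives dwMaxBytesPerSec = 150,000, which matches the 150 KByte/s figure quoted for normal MPEG files.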
The MPEG-AVI file format was designed to make transformations
between MPEG-AVI files and MPEG compliant files as simple as possible.
In particular, the AVI to MPEG conversion only requires stripping
the AVI-specific header and framing information.
To convert from MPEG-AVI to MPEG, all that is required is:
- Strip the AVI headers
- Strip the DIB headers from each "movi" chunk
- Concatenate the remaining data. This will result in a fully
MPEG compliant data stream, containing I-frames only.
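The steps above can be sketched in C as a walk over the payload of the "movi" list. This is a minimal illustration, assuming that the headers to strip from each data chunk are the 8-byte RIFF chunk headers and that the sub-chunks are not nested inside "rec " lists; interleaved files would need an extra level of parsing. The function name movi_to_mpeg is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Append the payload of every "##dc" sub-chunk found in movi to out.
 * Returns the number of bytes written, or (size_t)-1 if a chunk is
 * truncated or out is too small. */
size_t movi_to_mpeg(const uint8_t *movi, size_t movi_len,
                    uint8_t *out, size_t out_cap)
{
    size_t pos = 0, written = 0;

    while (pos + 8 <= movi_len) {
        const uint8_t *id = movi + pos;
        uint32_t len = read_le32(movi + pos + 4);

        if (len > movi_len - pos - 8)
            return (size_t)-1;                 /* truncated chunk */
        if (id[2] == 'd' && id[3] == 'c') {    /* compressed video chunk */
            if (len > out_cap - written)
                return (size_t)-1;
            memcpy(out + written, movi + pos + 8, len);
            written += len;
        }
        pos += 8 + (size_t)len + (len & 1u);   /* chunks are word-aligned */
    }
    return written;
}
```

Chunks with other identifiers, such as audio chunks, are skipped, so the output contains only the concatenated I-frame data.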
To convert an MPEG file containing only I-frames to MPEG-AVI, the sequence
header is first extracted to generate the AVI and DIB headers.
Then each I-frame, including its sequence header, GOP and Picture
headers, is prefixed with a "movi" chunk header
and appended to the data stream. A Sequence header must be
included with the first frame; however, the application may choose
not to copy consecutive identical sequence headers. During
processing, the length of the file is computed, and the dwTotalFrames,
dwStart and dwLength parameters are written in
the main and stream AVI headers.
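Splitting an I-frame-only MPEG stream into per-frame units can be sketched in C by scanning for start codes. This is an illustration under stated assumptions: the stream holds exactly one picture per GOP, a frame begins at a sequence header (000001B3) or, when none precedes it, at a GOP header (000001B8), and sequence_end_code is 000001B7 as in MPEG-1. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the next start code prefix (00 00 01 xx)
 * at or after pos, or len if there is none. */
size_t next_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (; pos + 4 <= len; pos++)
        if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
    return len;
}

/* Given the offset where a frame starts, return the offset just past it.
 * The frame ends at the first sequence header or GOP header that follows
 * its own GOP header, at a sequence_end_code, or at the end of the data. */
size_t frame_end(const uint8_t *buf, size_t len, size_t start)
{
    int seen_gop = 0;
    size_t pos = next_start_code(buf, len, start);

    while (pos < len) {
        uint8_t code = buf[pos + 3];

        if (code == 0xB7)
            return pos;                       /* sequence_end_code */
        if (seen_gop && (code == 0xB3 || code == 0xB8))
            return pos;                       /* next frame begins here */
        if (code == 0xB8)
            seen_gop = 1;
        pos = next_start_code(buf, len, pos + 4);
    }
    return len;
}
```

Calling frame_end repeatedly, each time starting at the previous return value, yields the byte range of every frame; the number of ranges found gives the frame count to store in dwTotalFrames and dwLength.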
Note that MPEG decompression systems usually expect
video before audio, and not the opposite, as is recommended
in the AVI file format [Ref 2 - "AVIStreamHeader"].
Consequently, appropriate buffering must be allocated when converting
the file from AVI to MPEG and vice versa.
[1] ISO 11172 document. Coded representation of picture, audio
and multimedia/hypermedia information.
[2] Microsoft Video for Windows Development Kit - Programmer's
Guide.
[3] Microsoft Windows Multimedia Programmer's Guide and Microsoft
Windows Multimedia Programmer's Reference.
[4] Video Compression/Decompression drivers technical note from
Microsoft.
[5] MPEG-Audio Wave format - Microsoft Corporation.