| Copyright © 2009. National Academy of Sciences. All rights reserved. Terms of Use and Privacy Statement |
9
An SDTV Decoder with HDTV Capability: An All-Format ATV
Decoder
Jill Boyce, John Henderson, and Larry
Pearlstein
Hitachi America Ltd.
This paper describes techniques for implementing a video decoder
that can decode MPEG-2 high-definition (HD) bit streams at a
significantly lower cost than that for previously described
high-definition video decoders. The subjective quality of the
pictures produced by this "HD-capable" decoder is roughly
comparable to current DBS-delivered standard-definition (SD)
digital television pictures. The HD-capable decoder can decode SD
bit streams with precisely the same results as a conventional
standard-definition decoder. The MPEG term Main Profile at Main
Level (MP@ML) is also used to refer to standard-definition video in
the sequel.
The decoder makes use of a pre-parser circuit that examines the
incoming bit stream in a bit-serial fashion and selectively
discards coded symbols that are not important for reconstruction of
pictures at reduced resolution. This pre-parsing process is
performed so that the required channel buffer size and bandwidth
are both significantly reduced. The pre-parser also allows the
syntax parser (SP) and variable-length decoder (VLD) circuitry to
be designed for lower performance levels.
The HD-capable decoder "downsamples" decoded picture data before
storage in the frame memory, thereby permitting reduction of the
memory size. This downsampling can be performed adaptively on a
field or frame basis to maximize picture quality. Experiments have
been carried out using different methods for downsampling with
varying results. The combination of the pre-parser and picture
downsampling enables the use of the same amount of memory as used
in standard definition video decoders.
The decoder selects a subset of the 64 DCT coefficients of each
block for processing and treats the remaining coefficients as
having the value zero. This leads to simplified inverse
quantization (IQ) and inverse discrete cosine transform (IDCT)
circuits. A novel IDCT is described whereby the one-dimensional
8-point IDCT used for decoding standard definition pictures is used
as the basis for performing a reduced complexity IDCT when
processing high-definition bit streams.
A decoder employing the above techniques has been simulated
using "C" with HDTV bit streams, and the results are described.
Normal HDTV encoding practices were used in these experiments. The
bit streams were decoded according to the concepts described
herein, including pre-parsing, the effects of reduced memory sizes,
simplified IDCT processing, and the various associated filtering
steps. The pre-parser and resampling result in a certain amount of
prediction "drift" in the decoder that depends on a number of
factors, some of which are under the control of the encoder. Those
who have viewed the resulting images agree that the decoding
discussed in this paper produces images that meet performance
expectations of SDTV quality.
The HD-capable video decoder, as simulated, can be expected to
be implementable at a cost only marginally higher than that of a
standard definition video decoder. The techniques described here
could be applied to produce HD-capable decoders at many different
price/performance points. By producing a range of consumer products
that can all decode HDTV bit streams, a migration path to full HDTV
is preserved while allowing a flexible mix of video formats to be
transmitted at the initiation of digital television service.
There is an ongoing policy debate about SDTV and HDTV standards,
about a broadcast mix of both formats, and about how a full range
of digital television might evolve from a beginning that includes
either SDTV or HDTV or both. This paper offers technical input to
that debate, specifically regarding consumer receivers that could
decode both SDTV and HDTV digital signals at a cost only marginally
higher than that of SDTV alone.
There are at least two areas of policy debate in which these
issues are relevant:
1. What is the right mix of HDTV and SDTV as the digital
service evolves over time? There are a variety of introduction
scenarios for digital television, ranging from HDTV only, to SDTV
only, to various mixes of the two. To preserve the HDTV broadcast
option no matter how digital television service is introduced, SDTV
receivers must be able to decode the HDTV signal. It is assumed
here that SDTV receivers with such HDTV-decoding capability are
both practical and cost effective. It is thus entirely practical to
preclude SDTV-only receivers. Therefore, the introduction of SDTV
would not prevent later introduction of HDTV because fully capable
digital receivers would already be in use.
2. How quickly can National Television System Committee
(NTSC) broadcasting be discontinued? The receiver design
approach described herein can be applied to low-cost set-top boxes
that permit NTSC receivers to be used to view digital television
broadcasts. The existence of such decoders at low cost is implicit
in any scenario that terminates NTSC broadcast.
Cost and Complexity of Full-Resolution
HDTV Decoder Components
The single most expensive element of a video decoder is the
picture storage memory. A fully compliant video decoder for U.S.
HDTV will require a minimum of 9 MBytes of RAM for picture storage.
An HDTV decoder will also require at least 1 MByte of RAM for
channel buffer memory to provide temporary storage of the
compressed bit stream. It can be expected that practical HDTV video
decoders will employ 12 to 16 MBytes of specialty DRAM, which will
probably cost at least $300 to $400 for the next few years and may
be expected to cost more than $100 for the foreseeable future.
The IDCT section performs a large number of arithmetic
computations at a high rate and represents a significant portion of
the decoder chip area. The inverse quantizer (IQ) performs a
smaller number of computations at a high rate, but it may also
represent significant complexity.
The SP and VLD logic may also represent a significant portion of
the decoder chip area. At the speeds and data rates specified for
U.S. HDTV, multiple SP/VLD logic units operating in parallel may be
required in a full HDTV decoder.
Cost Reductions of HDTV Decoder
This section describes several techniques that can be applied to
reduce the cost of an HD-capable decoder. The following decoder
subunits are considered: picture storage memory, pre-parser and
channel buffer, SP and VLD, inverse quantizer and inverse discrete
cosine transform, and motion compensated prediction. The discussion
refers to Figure 1, which is a block diagram of a conventional SDTV
decoder; and Figure 2, which is a block diagram of an HD-capable
decoder. The blocks that appear in Figure 2 but not in Figure 1
have been shaded to highlight the differences between an HD-capable
decoder and a conventional SD decoder.
Picture-Storage Memory
As described in Ng (1993), the amount of picture-storage memory
needed in a decoder can be reduced by downsampling (i.e.,
subsampling horizontally and vertically) each picture within the
decoding loop. Note in Figure 2 that residual or intra data
downsampling takes place after the IDCT block and prediction
downsampling is done following half-pel interpolation blocks. The
upsample operation shown in Figure 2 serves to restore the sampling
lattice to its original scale, thus allowing the motion vectors to
be applied at their original resolution. Although this view is
functionally accurate, in actual hardware implementations the
residual/intra downsampling operation would be merged with the IDCT
operation, and the prediction downsample operation would be merged
with the upsample and half-pel interpolation. In an efficient
implementation the combined upsample, half-pel
interpolation, and downsample operation is implemented by
appropriately weighting each of the reference samples extracted
from the (reduced resolution) anchor frame buffers to form reduced
resolution prediction references.
Figure 1
Block diagram of a conventional SDTV video decoder.
The weights used in this operation depend on the full-precision
motion vectors extracted from the coded bit stream.
Experiments have shown that it is important that the prediction
downsampling process is near the inverse of the upsampling process,
since even small differences are made noticeable after many
generations of predictions (i.e., after an unusually long GOP that
also contained many P-frames). There are two simple methods:
• Downsample without filtering (subsample), and upsample using
  bilinear interpolation; and
• Downsample by averaging, and upsample without filtering (sample
  and hold).
For both of these methods the concatenated upsample-downsample
operation is the identity when the motion vectors are zero. Both
methods have been shown to provide reasonable image quality.
For the residual/intra downsampling process it is possible to
use frequency domain filtering in lieu of spatial filtering to
control aliasing. Frequency domain filtering is naturally
accomplished by "zeroing" the DCT coefficients that correspond to
high spatial frequencies. Note that the prediction filtering may
introduce a spatial shift; this can be accommodated by
introducing a matching shift in the residual/intra downsampling
process, or by appropriately biasing the motion vectors before
use.
When processing interlaced pictures, the question arises as to
whether upsampling and downsampling should be done on a field basis
or on a frame basis. Field-based processing preserves the greatest
degree of temporal resolution, whereas frame-based processing
potentially preserves the greatest degree of spatial resolution. A
brute-force approach would be to choose a single mode (either field
or frame) for all downsampling.
A more elaborate scheme involves deciding whether to upsample or
downsample each macroblock on a field basis or frame basis,
depending on the amount of local motion and the high-frequency
content. Field-based processing is most appropriate when there is
not much high-frequency content and/or a great deal of motion.
Frame-based processing is most appropriate when there is
significant high-frequency content and/or little motion.
One especially simple way of making this decision is to follow
the choice made by the encoder for each macroblock in the area of
field or frame DCT and/or field- or frame-motion compensation,
since the same criteria may apply to both types of decisions.
Although field conversion is not optimal in areas of great detail,
such as horizontal lines, simulations show that if a single mode is
used, field is probably the better choice.
Figure 2
Block diagram of a low-cost HDTV video decoder.
In MPEG parlance, SDTV corresponds to Main Level, which is
limited to 720 × 480 pixels at 60 Hz, for a total of 345,600
pixels. U.S. ATV allows pictures as large as 1920 × 1080
pixels. Sequences received in this format can be conveniently
downsampled by a factor of 3 horizontally and a factor of 2
vertically to yield a maximum resolution of 640 × 540, a
total of 345,600 pixels. Thus the memory provided for SDTV would be
adequate for the reduced-resolution HD decoder as well. It would be
possible to use the same techniques with a smaller amount of
downsampling for less memory savings.
In a cost-effective video decoder, the channel buffer and
picture-storage buffers are typically combined into a single memory
subsystem. The amount of storage available for the channel buffer
is the difference between the memory size and the amount of memory
needed for picture storage. Table 1 shows the amount of
picture-storage memory required to decode the two high-definition
formats with downsampling. The last column shows the amount of free
memory when a single 16-Mbit memory unit is used for all of the
decoder storage requirements. This is important since
cost-effective SDTV decoders use an integrated 16-Mbit memory
architecture. The memory not needed for picture storage can be used
for buffering the compressed video bit stream.
TABLE 1 Reduced Resolution Decoder Memory Usage

Active      Active    H Scale  V Scale  Downsampled  Downsampled  # Frames  Downsampled Memory  Free Memory with
Horizontal  Vertical  Factor   Factor   Horizontal   Vertical     Stored    Required (bits)     16 Mbits DRAM (bits)
1920        1080      3        2        640          540          3         12,441,600          4,335,616
1280        720       2        2        640          360          2         5,529,600           11,247,616
720         480       -        -        -            -            3         12,441,600          4,335,616

NOTE: The 720 × 480 row reflects the SDTV format for
reference.
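The entries in Table 1 can be reproduced with a few lines of arithmetic. The sketch below assumes 12 bits per pixel (8-bit samples with 4:2:0 chroma) and takes "16 Mbits" to mean 2^24 bits; neither figure is stated in the text, but together they match the tabulated values. The function name is illustrative.

```python
BITS_PER_PIXEL = 12            # assumption: 8 luma bits + 4 chroma bits per pixel (4:2:0)
DRAM_BITS = 16 * 1024 * 1024   # assumption: "16 Mbits" means 2**24 bits


def picture_memory(width, height, h_scale, v_scale, frames):
    """Return (downsampled width, downsampled height, bits needed, bits free)."""
    down_w, down_h = width // h_scale, height // v_scale
    needed = down_w * down_h * BITS_PER_PIXEL * frames
    return down_w, down_h, needed, DRAM_BITS - needed


# (width, height, H scale, V scale, frames stored), with frame counts from Table 1
formats = [(1920, 1080, 3, 2, 3), (1280, 720, 2, 2, 2), (720, 480, 1, 1, 3)]
for fmt in formats:
    print(fmt, picture_memory(*fmt))
```

Under these assumptions, the downsampled 1920 × 1080 format and the native 720 × 480 format need exactly the same picture storage.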
As indicated in Table 1, the 1920 × 1080 format is
downsampled by 3 horizontally and 2 vertically. This results in
efficient use of memory (exactly the same storage requirements as
MP@ML) and leaves a reasonable amount of free memory for use as a
channel buffer.
The natural approach for the 1280 × 720 format would be to
downsample by 2 vertically and horizontally. This leaves sufficient
free memory that the downconverter would never need to consider
channel buffer fullness when deciding which data to discard.
After decoding of a given macroblock, it might be immediately
downsampled for storage or retained in a small buffer that contains
several scan lines of full-resolution video to allow for filtering
before downsampling. The exact method of upsampling and
downsampling is discussed below; it can greatly affect image
quality, since even small differences are made noticeable after
many generations of predictions.1
The upsampling and downsampling functions are additional costs
beyond that for an SD decoder.
The general concept of reducing memory storage requirements for
a lower-cost HDTV decoder is known in the literature. This paper
adds pre-parsing and new techniques for performing downsampling and
upsampling.
Pre-parser and Channel Buffer
A fully compliant HDTV decoder requires at least 8 Mbits of
high-speed RAM, with peak output bandwidth of 140 MBytes/sec for
the channel buffer. With the use of a pre-parser to discard some of
the incoming data before buffering, the output bandwidth can be
reduced to a peak of 23 MBytes/sec and the size of the channel
buffer can be reduced to 1.8 to 4.3 Mbits. (The lower number is
required for MP@ML and the higher number is the amount left over in
the SDTV 16-Mbit memory after a 1920 × 1080 image is
downsampled by 3 horizontally and 2 vertically, including the
required 3 frames of storage.)
The pre-parser examines the incoming bit stream and discards
less important coding elements, specifically high-frequency DCT
coefficients. It may perform this data selection while the DCT
coefficients are still in the run-length/amplitude domain (i.e.,
while still variable-length encoded). The pre-parser thus serves
two functions:
• It discards data to allow a smaller channel buffer to be used
  without overflow and to allow reduction of the channel-buffer
  bandwidth requirements.
• It discards run-length/amplitude symbols, which allows for
  simpler real-time SP and VLD units.
The pre-parser only discards full MPEG code words, creating a
compliant but reduced data rate and reduced-quality bit stream. The
picture degradation caused by the pre-parsing operation is
generally minimal when downsampled for display at reduced
resolution. The goal of the pre-parser is to reduce peak
requirements in later functions rather than to significantly reduce
average data rates. The overall reduction of the data rate through
the pre-parser is generally small; for example, 18 Mbps may be
reduced to approximately 12 to 14 Mbps.
The channel buffer in a full HDTV decoder must have high output
bandwidth because it must output a full macroblock's data in the
time it takes to process a macroblock. The pre-parser limits the
maximum number of bits per macroblock to reduce the worst-case
channel buffer output requirement. The peak number of bits allowed
per macroblock in U.S. HDTV is 4608; this requires an output
bandwidth of 140 MBytes/sec even though the average number of bits
per macroblock is only 74. The pre-parser retains no more than 768
bits for each coded macroblock, thereby lowering the maximum output
bandwidth to 23 MBytes/sec, the same as for MP@ML.
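The bandwidth figures above follow from the macroblock rate. The sketch below assumes 16 × 16 macroblocks, 30 frames per second, and nominal picture sizes of 1920 × 1080 and 720 × 480; it also assumes the 74-bit average corresponds to a bit rate of about 18 Mbps. These parameters are not stated alongside the figures in the text, but they reproduce them.

```python
MACROBLOCK_PIXELS = 16 * 16


def macroblocks_per_second(width, height, frames_per_second):
    """Nominal macroblock rate for a picture format."""
    return width * height * frames_per_second / MACROBLOCK_PIXELS


def peak_bandwidth_mbytes(bits_per_macroblock, macroblock_rate):
    """Worst-case channel-buffer output bandwidth in MBytes/sec."""
    return bits_per_macroblock * macroblock_rate / 8 / 1e6


hd_rate = macroblocks_per_second(1920, 1080, 30)   # 243,000 macroblocks/sec
sd_rate = macroblocks_per_second(720, 480, 30)     # 40,500 macroblocks/sec

hd_full = peak_bandwidth_mbytes(4608, hd_rate)     # full HDTV decoder
hd_preparsed = peak_bandwidth_mbytes(768, hd_rate) # after the pre-parser's 768-bit cap
sd_full = peak_bandwidth_mbytes(4608, sd_rate)     # MP@ML with the 4608-bit peak
average_bits = 18e6 / hd_rate                      # average bits per HD macroblock

print(hd_full, hd_preparsed, sd_full, average_bits)
```

Because the HD macroblock rate is six times the SD rate, capping each macroblock at 768 bits (one sixth of 4608) brings the worst case down to the MP@ML value.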
The pre-parser also removes high-frequency information (i.e., it
does not retain any non-zero DCT coefficients outside of a
predetermined low-frequency region). Pre-parsing could remove
coefficients after a pre-specified coefficient position in the
coded scan pattern, or it could remove only those coefficients that
will not be retained for use in the IDCT. This reduces the total
number of bits to be stored in the channel buffer.
In addition to discarding data to limit bits per coded
macroblock and high-frequency coefficients, the pre-parser also
alters its behavior based on the channel buffer fullness. The
pre-parser keeps a model of buffer occupancy and removes
coefficients as needed to ensure that the decreased size channel
buffer will never overflow. As this buffer increases in occupancy, the
pre-parser becomes more aggressive about the amount of
high-frequency DCT coefficient information to be discarded.
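The pre-parser's three discard rules (a per-macroblock bit cap, a low-frequency region, and buffer-driven aggressiveness) might be sketched as follows. This is a minimal illustration, not the authors' circuit: the linear policy in `coefficient_limit`, its thresholds, and the `(run, level, code_bits)` tuple layout are hypothetical. Only the 768-bit cap and the rule that whole code words are kept or dropped come from the text. End-of-block handling is omitted.

```python
def coefficient_limit(buffer_fullness, max_coeffs=20, min_coeffs=4):
    """Hypothetical policy: retain fewer scan positions as the buffer fills.

    buffer_fullness is the modeled channel-buffer occupancy in [0, 1].
    """
    fullness = min(max(buffer_fullness, 0.0), 1.0)
    return max(min_coeffs, round(max_coeffs - fullness * (max_coeffs - min_coeffs)))


def preparse_block(symbols, buffer_fullness, bits_left):
    """Keep leading run/level code words of one block; drop the rest.

    symbols: list of (run, level, code_bits) in coded scan order.
    bits_left: bits remaining in this macroblock's budget (768 at the start).
    Returns (kept symbols, bits used). Values are never decoded or altered.
    """
    limit = coefficient_limit(buffer_fullness)
    kept, position, used = [], -1, 0
    for run, level, code_bits in symbols:
        position += run + 1   # scan position of this nonzero coefficient
        if position >= limit or used + code_bits > bits_left:
            break             # discard this code word and all later ones
        kept.append((run, level, code_bits))
        used += code_bits
    return kept, used


# Example block: nonzero coefficients at scan positions 0, 1, 4, 14, and 35.
symbols = [(0, 12, 6), (0, -3, 5), (2, 1, 7), (9, 1, 12), (20, -1, 14)]
print(preparse_block(symbols, 0.0, 768))   # empty buffer: position 35 is dropped
print(preparse_block(symbols, 1.0, 768))   # full buffer: only positions 0 and 1 survive
```

Since the sketch only reads run lengths and code-word lengths, it mirrors the point that the pre-parser parses the stream without fully decoding coefficient values.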
This decoder management of its own buffer is a key difference
between the enhanced SDTV decoder and a "normal" SDTV decoder. In a
"normal" encoder/decoder combination, the encoder limits the peak
data rate to match the specifications of the decoder buffer; it is
the responsibility of the encoder to assure that the decoder buffer
does not overflow. In the enhanced SDTV decoder outlined in this
paper, the decoder can accept bit streams intended for a much
larger buffer (i.e., an HDTV bit stream) and can perform its own
triage on the incoming bit stream to maintain correct buffer
occupancy.2
This pre-parser is an additional cost over a stand-alone SD
decoder, but the cost and complexity are low since it can run at
the relatively low average incoming bit rate. The pre-parser is
significantly less complex than a full-rate SP and VLD because of
its slower speed requirement and because it parses but does not
have to actually decode values from all of the variable length
codes.3
Syntax Parser and Variable-Length
Decoder
The computational requirements for the SP and VLD units of the
downconverter are substantially reduced by implementing a
simplified bit-serial pre-parser as described above. The pre-parser
limits the maximum number of bits per macroblock. It also operates
to limit the number of DCT coefficients in a block by discarding
coefficients after a certain number, thus reducing the speed
requirements of the SP and VLD units.
At the speeds and data rates specified for U.S. HDTV, multiple
SP/VLD logic units operating in parallel may be required. The
pre-parser limits the processing speed requirements for the HD
downconverter to SDTV levels. Thus the only additional requirement
on the SP/VLD block for decoding HDTV is the need for slightly
larger registers for storing the larger picture sizes and other
related information, as shown in Figure 2.
Inverse Quantization and Inverse
Discrete Cosine Transform
Reduced complexity inverse quantizer (IQ) and inverse discrete
cosine transform (IDCT) units could be designed by forcing some
predetermined set of high frequency coefficients to zero. MPEG
MP@ML allows for pixel rates of up to 10.4 million per second. U.S.
ATV allows pixel rates of up to 62.2 million per second. It is
therefore possible to use SDTV-level IQ circuitry for HDTV decoding
by ignoring all but the 10 or 11 most critical coefficients. Some
of the ignored coefficients (the 8 × 8 coefficients other
than the 10 or 11 critical coefficients) will probably have already
been discarded by the pre-parser. However, the pre-parser is not
required to discard all of the coefficients to be ignored. The
pre-parser may discard coefficients according to coded scan pattern
order, which will not, in general, result in deleting all of the
coefficients that should be ignored by later processing stages.
Processing only 11 of 64 coefficients reduces the IQ
computational requirement and significantly decreases the
complexity of the IDCT. The complexity of the IDCT can be further
reduced by combining the zeroing of coefficients with the picture
downsampling described above.
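The figure of 10 or 11 coefficients follows from the ratio of the pixel rates quoted above. The sketch below assumes 30 frames per second at 720 × 480 and 1920 × 1080, and assumes that 4:2:0 chroma adds half again as many samples as luma; the latter is an inference that happens to match the SP/VLD rates in Table 2.

```python
sd_pixel_rate = 720 * 480 * 30        # 10,368,000 pixels/sec (about 10.4 million)
hd_pixel_rate = 1920 * 1080 * 30      # 62,208,000 pixels/sec (about 62.2 million)

ratio = hd_pixel_rate / sd_pixel_rate # HD blocks arrive this many times faster
coeffs_per_block = 64 / ratio         # coefficients an SD-rate IQ can handle per HD block

# Coefficient rates if every coefficient is processed (luma plus 4:2:0 chroma).
hd_coeff_rate = hd_pixel_rate * 1.5   # about 93 million coefficients/sec
sd_coeff_rate = sd_pixel_rate * 1.5   # about 15.5 million coefficients/sec

print(ratio, coeffs_per_block, hd_coeff_rate, sd_coeff_rate)
```

The ratio is exactly 6 under these assumptions, so an IQ sized for MP@ML can process between 10 and 11 of the 64 coefficients in each HD block.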
IDCT circuitry for performing 8 × 8 IDCT is required for
decoding SD bit streams. A common architecture for computing the
two-dimensional IDCT is to use an engine that is capable of a fast,
one-dimensional, 8-point IDCT. If the SD IDCT engine were used when
decoding HD bit streams, it could perform about three 8-point IDCTs
in the time of an HDTV block. Thus the SD IDCT can be used to
compute the IDCT of the first three columns of coefficients. The
remaining columns of coefficients would be treated as zero and thus
require no IDCT resources.
A special-purpose IDCT engine would be implemented to do the row
IDCTs. It would be especially simple since five of the eight
coefficients would always be zero, and only two or three output
points would have to be computed for each transform. Note that only
four rows would have to be transformed if no additional filtering
were to be performed prior to downsampling.
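The column-then-row decomposition can be illustrated in software. This is a sketch under stated assumptions, not the hardware design: it uses a direct (slow) orthonormal 8-point IDCT, indexes the block as `block[v][u]` with `u` the horizontal frequency, and computes all eight outputs of each row transform rather than only the two or three needed after downsampling.

```python
import math


def idct_1d(coeffs):
    """8-point inverse DCT of the leading coefficients; missing ones count as zero."""
    out = []
    for x in range(8):
        total = 0.0
        for u, c in enumerate(coeffs):
            scale = math.sqrt(1 / 8) if u == 0 else math.sqrt(2 / 8)
            total += scale * c * math.cos((2 * x + 1) * u * math.pi / 16)
        out.append(total)
    return out


def reduced_idct_2d(block, kept_columns=3):
    """2-D IDCT that ignores all but the first kept_columns coefficient columns."""
    # Column pass: a full 8-point IDCT, run on the retained columns only.
    columns = [idct_1d([block[v][u] for v in range(8)]) for u in range(kept_columns)]
    # Row pass: each row transform sees only kept_columns nonzero inputs.
    return [idct_1d([columns[u][y] for u in range(kept_columns)]) for y in range(8)]


def full_idct_2d(block):
    """Reference 2-D IDCT using all eight columns."""
    return reduced_idct_2d(block, kept_columns=8)


block = [[0.0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[2][2] = 80.0, -12.0, 9.0, 4.0
block[3][5] = 7.0   # high horizontal frequency, ignored by the reduced path

pixels = reduced_idct_2d(block)
```

The reduced path performs three column transforms instead of eight, and its result equals a full IDCT of the same block with columns 3 through 7 set to zero.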
For blocks in progressive frames, or that use field IDCTs,
coefficients might be selected according to the following pattern
(retained coefficients are represented by "x"):
For blocks that use frame DCTs on an interlaced picture, we
might discard coefficients with the following pattern:
This pattern of retained coefficients maintains the temporal
resolution represented by differences between the two fields of the
frame in moving images.
Motion Compensated Prediction
(MCP)
Assume that the anchor pictures4
have been downsampled, as described above. The data bandwidth to
the motion compensation circuitry is thus reduced by the same
factor as the storage requirement. As described above, motion
compensation is accomplished by appropriately interpolating the
reduced resolution picture-reference data according to the values
of the motion vectors. The weights of this interpolation operation
are chosen to correspond to the concatenation of an anti-imaging
upsampling filter, bilinear half-pel interpolation operation
(depending on the motion vectors), and optional downsampling
filter.
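The composite interpolation weights can be derived mechanically for a simple case. The sketch below is one-dimensional and assumes a downsampling factor of 2, sample-and-hold upsampling, bilinear half-pel interpolation, and averaging on downsampling (the second of the two simple methods described earlier). The function name and the choice of exact fractions are illustrative.

```python
from collections import defaultdict
from fractions import Fraction


def prediction_weights(mv_half_pel, factor=2):
    """Weights on reduced-resolution reference samples for one prediction sample.

    mv_half_pel: full-resolution motion vector in half-pel units.
    Returns {offset of reference sample relative to the output sample: weight}.
    """
    weights = defaultdict(Fraction)
    base, is_half = divmod(mv_half_pel, 2)   # floor division handles negative vectors
    if is_half:
        taps = [(base, Fraction(1, 2)), (base + 1, Fraction(1, 2))]  # bilinear half-pel
    else:
        taps = [(base, Fraction(1))]                                  # full-pel shift
    for phase in range(factor):              # full-res samples averaged by the downsampler
        for shift, tap_weight in taps:
            ref_offset = (phase + shift) // factor   # sample-and-hold upsampling
            weights[ref_offset] += tap_weight / factor
    return dict(weights)


for mv in (0, 1, 2, -1):
    print(mv, prediction_weights(mv))
```

A zero vector yields a single unit weight, consistent with the identity property, while a half-pel vector yields weights of 3/4 and 1/4 on two neighbouring reference samples.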
Complexity Comparisons
A block diagram of the HD-capable video decoder is shown in
Figure 2. This can be compared with Figure 1, "SDTV Video Decoder
Block Diagram," to identify the additional processing required over
an SD decoder. Complexity comparisons between a full-resolution HD
decoder, SD decoder, prior art HD downconverter,5 and the HD-capable decoder described
in this paper are shown in Table 2. The total costs of the HD
downconverter/SD decoder are not significantly greater than the
cost of the SD decoder alone.
Prediction Drift
In MPEG video coding a significant portion of the coding gain is
achieved by having the decoder construct a prediction of the
current frame based on previously transmitted frames. In the most
common case, the prediction process is initialized by periodic
transmission of intra-coded frames (I-frames). Predicted frames
(P-frames) are coded with respect to the most recently transmitted
I- or P-frames. Bidirectionally predicted frames (B-frames) are
coded with respect to the two most recently transmitted frames of
types I or P.
TABLE 2 Complexities of New HD-capable Decoder and Prior Decoders

Function              ATSC Standard HDTV      Prior Art HD Downconverter  New HD-capable Decoder        MP@ML (SDTV)

Pre-parser            -                       -                           < 10,000 gates,               -
                                                                          19 Mbits/sec
                                                                          (partial decode)

Channel buffer size   8 Mbits,                8 Mbits,                    1.8 to 4.3 Mbits,             1.8 Mbits,
and bandwidth         140 MBytes/sec          140 MBytes/sec              23 MBytes/sec                 23 MBytes/sec

Total off-chip        96 Mbits specialty      16 Mbits DRAM + 8 Mbits     16 Mbits DRAM                 16 Mbits DRAM
memory requirements   DRAM                    specialty DRAM

SP/VLD                93 M coefficients/sec   93 M coefficients/sec       15.5 M coefficients/sec       15.5 M coefficients/sec

IDCT                  1.5 M blocks/sec        1.5 M simple blocks/sec     1.5 M half-complexity         240 K full 8 × 8
                                              (HD) + 240 K full 8 × 8     simple blocks/sec (HD) +      blocks/sec (SD)
                                              blocks/sec (SD)             240 K full 8 × 8
                                                                          blocks/sec (SD)

Upsample/Downsample   -                       1,000 to 2,000 gates?       1,000 to 2,000 gates?         -

Decodes HDTV          Yes                     Yes                         Yes                           No

Decodes SDTV          Yes                     No                          Yes                           Yes
Let the first P-frame following some particular I-frame be
labeled P1. Recall that the decoder described above downsamples the
decoded frames before storage. Thus, when P1 is to be decoded, the
stored I-frame used for constructing the prediction differs from
the corresponding full-resolution prediction maintained in the
encoder. The version of P1 produced by the HD-capable decoder will
thus be degraded by the use of an imperfect prediction reference,
as well as by the pre-parsing and downsampling directly applied to
P1. The next decoded P-frame suffers from two generations of this
distortion. In this way the decoder prediction "drifts" away from
the prediction maintained by the encoder, as P-frames are
successively predicted from one another. Note that the coding of
B-frames does not contribute to this drift, since B-frames are
never used as the basis for predictions.
Prediction drift can cause visible distortion that changes
cyclically at the rate of recurrence of I-frames. The effect of
prediction drift can be reduced by reducing the number of P-frames
between I-frames. This can be done by increasing the ratio of
B-frames to P-frames, decreasing the number of frames between
I-frames, or both. As a practical matter, however, special encoding
practices are neither needed nor recommended. Experiments have
shown that reasonable HD video-encoding practices lead to
acceptable quality from the HD-capable decoder described here.
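The number of drift generations depends on the GOP structure, as the sketch below shows. It uses the two encoder parameters commonly called N (frames from one I-frame to the next) and M (spacing between consecutive anchor frames); these names, and the assumption that N is a multiple of M, are introduced here for illustration.

```python
def p_frames_per_gop(gop_length, anchor_spacing):
    """Number of P-frames, and hence drift generations, between successive I-frames.

    gop_length: frames from one I-frame to the next (commonly called N).
    anchor_spacing: distance between consecutive I- or P-frames (commonly called M);
                    each gap holds anchor_spacing - 1 B-frames.
    """
    if gop_length % anchor_spacing:
        raise ValueError("this sketch assumes gop_length is a multiple of anchor_spacing")
    return gop_length // anchor_spacing - 1


print(p_frames_per_gop(15, 3))   # I B B P B B P ... : 4 P-frames
print(p_frames_per_gop(15, 1))   # no B-frames: 14 P-frames
print(p_frames_per_gop(12, 3))   # shorter GOP: 3 P-frames
```

Either raising the B-to-P ratio or shortening the GOP lowers the count, which matches the two remedies described above.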
Simulation Results
Test images were chosen from material judged to be challenging
for both HDTV and SDTV encoding and decoding. Progressive sequences
were in the 1280 × 720 format; interlaced sequences were 1920
× 1024 and 1920 × 1035 (we did not have access to 1920
× 1080 material). The images contained significant motion and
included large areas of complex detail.
Some of the bit streams used for testing were encoded by the
authors using their MPEG-2 MP@HL software; others were provided by
Thomson Consumer Electronics. Decoding at HDTV was done using the
authors' MPEG-2 MP@HL software. The HD-capable decoder algorithms
described above were simulated in "C" and tested with the HDTV bit
streams.
The simulation included accurate modeling of all the processes
described here. The bit stream was pre-parsed; channel buffer and
picture memory sizes, IDCT processing, and IDCT coefficient
selection were all in accord with these explanations; upsampling
and downsampling were applied to use the original motion
vectors.
The resulting HDTV and downconverted images were examined and
compared. Although the downconverted images were of discernibly
lower quality than the decoded full-HDTV images, observers agreed
that the downconversion process met performance expectations of
"SDTV quality."
Conclusions
This paper describes an enhanced SDTV receiver that can decode
an HDTV bit stream. The enhancements needed to add HD decoding
capability to an SD decoder are modest, even by consumer
electronics standards. If all receivers included the capabilities
described here, an introduction of SDTV would not preclude later
introduction of HDTV because fully capable digital receivers would
already be in use. The techniques described in this paper also
permit design of low-cost, set-top boxes that would permit
reception of the new digital signals for display on NTSC sets. The
existence of such boxes at low cost is essential to the eventual
termination of NTSC service.
References
Lee, D.H., et al. Goldstar, "HDTV Video
Decoder Which Can Be Implemented with Low Complexity," Proceedings
of the 1994 IEEE International Conference on Consumer Electronics,
TUAM 1.3, pp. 6–7.
Ng, S., Thomson Consumer Electronics,
"Lower Resolution HDTV Receivers," U.S. Patent 5,262,854, November
16, 1993.
Notes
1. The "predictions" mentioned here are
the P-frames within the GOP sequence. The downsampling and
pre-parsing processes alter the image data somewhat, so that small
errors may accumulate if unusually long GOP sequences contain many
P-frames. B-frames do not cause this kind of small error
accumulation, and so good practice would be to increase the
proportion of B-frames in long GOP sequences or to use GOPs of
modest length. Receiver processing techniques can also reduce any
visible effects, although they are probably unnecessary.
2. In this paragraph, the buffering
operation and its associated memory are treated as distinct from
the picture-storage memory. This distinction is useful for tutorial
purposes, even though the two functions may actually share the same
physical, 16-Mbit memory module.
3. Note that the macroblock type, coded
block pattern, and run-length information must be decoded.
4. Anchor pictures are I- and P-frames in
the MPEG GOP sequence. The downsampling that has been applied to
them by the decoding techniques described here means that the
motion vectors computed by the encoder can no longer be directly
applied.
5. The term "downconverter" as used here
applies to hardware that reduces the full HDTV image data to form
an SDTV-resolution picture. The appropriately enhanced SDTV decoder
described here inherently includes such an HDTV
"downconverter."