Sample-Rate Conversion: Algorithms and VLSI Implementation

Diss. ETH No. 10980
Sample-Rate Conversion:
Algorithms and VLSI
Implementation
A dissertation submitted to the
SWISS FEDERAL INSTITUTE OF TECHNOLOGY
ZURICH
for the degree of
Doctor of technical sciences
presented by
FRITZ MARKUS ROTHACHER
Dipl. El.-Ing.
born 24. 8. 1965
citizen of Blumenstein BE
accepted on the recommendation of
Prof. Dr. W. Fichtner, examiner
Prof. Dr. W. Guggenb¨uhl, co-examiner
1995
Acknowledgements
I would like to thank my adviser, Prof. W. Fichtner, for his confidence in me
and my work and for establishing a generous working environment. I enjoyed
working at the Integrated Systems Laboratory over the last four years.
I am grateful to my associate advisor, Prof. W. Guggenb¨uhl, for reading
and commenting on my thesis.
I want to thank Gunnar Lehtinen, Andreas Steiner, Dominique M¨uller, and
Christian Siegrist for their effort in implementing the sample-rate converter
during their semester and diploma theses.
I would also like to thank the staff of the Integrated Systems Lab: All secretaries for handling the administrative work, Hanspeter Mathys and Hansj¨org
Gisler for maintaining the logistics, the system administrators Christoph Wicki
and Adam Feigin who fixed all computer-related problems, and Hubert Kaeslin
and Andreas Wieland of the Microelectronics Design Center for their support
concerning VLSI design methodology and tools.
My special thanks go to Norbert Felber, who contributed many valuable
ideas and support in the hardware lab. He found always time and was interested
to discuss digital signal processing and related topics, and offered competent
advice. I owe most of my knowledge of real-world measurement techniques
to him.
Tom Heynemann, Robert Rogenmoser, Hubert Kaeslin, and Nobert Felber
improved the quality of this text by their constructive proofreading.
This work would not have been possible without the support and encouragement of many people outside the ETH. In particular I would like to thank
my parents for their tolerance and support.
i
Contents
Acknowledgements
i
Abstract
ix
Zusammenfassung
xi
1 Introduction
1
:::::::::
Concept : : : : : : : : : :
Related Work : : : : : : :
Structure of the Thesis : :
1.1 Motivation
1.2
1.3
1.4
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
2 Principle in Time and Frequency Domain
::::::::::
Interpolation : : : : : : : : : : : : : : : : : : :
Decimation : : : : : : : : : : : : : : : : : : :
Sample-rate Conversion for Fixed Ratios : :
Sample-rate Conversion for Rational M/L : :
2.1 Test Signals and Sampling
2.2
2.3
2.4
2.5
1
2
4
5
7
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
8
10
11
12
15
iii
:::::::::::::::::::
2.5.2 Downmode : : : : : : : : : : : : : : : : :
2.5.3 Continuous Mode : : : : : : : : : : : : :
Sample-rate Conversion for Any Arbitrary Ratio :
2.5.1 Upmode
2.6
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
3 High-Order Digital Filters
:
3.1.1 Transition Band : : : : : :
3.1.2 Stopband : : : : : : : : :
Realization Considerations : : :
3.2
3.3
3.4
3.5
16
18
19
23
:
:
:
:
FIR Filter and Hold Operation :
:
:
:
:
3.2.1
:
3.2.2 FIR Filter and Lagrange Interpolation
:
3.2.3 Alternatives : : : : : : : : : : : : : : : :
Required Resources : : : : : : : : : : : : : : :
Synthesis of High-Order FIR Filters : : : : : : :
3.4.1 Specifications : : : : : : : : : : : : : : :
3.4.2 Design Technique : : : : : : : : : : : :
3.4.3 Direct Synthesis : : : : : : : : : : : : :
3.4.4 Prototype Method : : : : : : : : : : : :
3.4.5 Relaxed Specifications : : : : : : : : :
Quantization of Filter Coefficients : : : : : : : :
3.5.1 Quantization of FIR Coefficients : : : :
3.5.2 Quantization of Interpolated Coefficients :
3.1 Required Filter Characteristics
15
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
23
24
24
26
26
27
34
36
38
38
39
40
40
43
45
45
47
4 Frequency Tracking
51
:::::::::::
Principle : : : : : : : : : : : : : :
Frequency Counter : : : : : : : :
Digital Phase Locked Loop : : :
4.4.1 Phase Detector : : : : : :
4.4.2 Loop Filter : : : : : : : :
Non-uniform Resampling : : : :
Performance Comparison : : : :
4.6.1 Frequency Counter : : :
4.6.2 Digital PLL : : : : : : : :
4.1 Requirements
4.2
4.3
4.4
4.5
4.6
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
5 Implementation
:::::
5.1.1 Frequency Counter :
5.1.2 Digital PLL : : : : : :
Datapath for FIR Filter : : : :
5.2
5.3
5.4
52
53
55
56
57
58
62
62
65
67
:
:
:
:
5.2.1 Review of Operations
:
5.2.2 Finite Word-Length in Datapath :
DSP Realization Considerations : : : :
ASIC Realization : : : : : : : : : : : : :
5.4.1 Implementation : : : : : : : : : :
5.4.2 Discussion : : : : : : : : : : : :
5.1 Frequency Tracking
51
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
68
68
69
71
71
73
75
77
77
79
6 Measurement Principle & Setup
83
:::::::::::::::::::
Analyzer : : : : : : : : : : : : : : : : : : : :
6.2.1 Frequency Matching for the FFT : :
6.2.2 Windowing for the FFT : : : : : : :
6.2.3 Complex Windows : : : : : : : : : :
Measurement Procedure : : : : : : : : : :
6.1 Generator
6.2
6.3
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
7 Results
:
7.1.2 Digital PLL : : : : : :
System Measurements : : :
Discussion : : : : : : : : : :
7.1.1 Frequency Counter
7.3
85
86
86
89
89
93
7.1 Frequency Tracking Measurements
7.2
83
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
94
95
95
98
103
8 Conclusions
107
A List of Symbols and Integrals
111
B Error Caused by Hold Effect
113
:::::::::::::::::::::::::
White Noise : : : : : : : : : : : : : : : : : : : : : : : : :
Pink Noise : : : : : : : : : : : : : : : : : : : : : : : : :
B.1 Sine Wave
113
B.2
114
B.3
C Error Caused by Linear Interpolation
C.1 Sine Wave
:::::::::::::::::::::::::
114
117
117
C.2 White Noise
C.3 Pink Noise
:::::::::::::::::::::::::
:::::::::::::::::::::::::
118
118
D Performance Measurements
121
List of Figures
146
List of Tables
147
Curriculum Vitae
149
Bibliography
154
Abstract
Many modern signal processing tasks are performed in the digital domain.
Various sample rates are used depending on the required signal quality and
the available bandwidth. Sample-rate conversion is therefore inevitable to
interface systems with different sample rates. In this thesis the concept of
digital sample-rate conversion by integer ratios is extended to arbitrary ratios
and the trade-offs of various design parameters for audio applications are
discussed.
Conceptually, the source samples are interpolated to a heavily oversampled
intermediate signal using a high-order digital filter. To allow a conversion
between arbitrary sample rates this intermediate signal is transformed into a
pseudo-analog representation by holding the value in-between the interpolated
samples. The resulting continuous-time signal is then resampled at the sink
sample rate.
For an infinite interpolation factor, ideal filters, and jitter-free sampling
clocks digital sample-rate conversion is lossless. In this thesis the error contributions of the various non-idealities are quantified. The synthesis of high-order
(> 20000) narrow-band lowpass filters and their quantization properties are
discussed. A digital PLL is presented, which allows to measure precisely the
sink sampling moment relative to the source samples, while attenuating jitter
of the sampling clocks. The effects of the non-uniform sampling due to clock
jitter are treated.
An ASIC for fully digital sample-rate conversion has been developed,
fabricated, and characterized using an all-digital measurement system. A new
(complex) window technique for the FFT is introduced, which minimizes the
window side effects when doing spectrum analysis of multirate discrete-time
signals. Based on the results of this thesis, the design parameters for a converter
with full 20-bit quality are proposed.
ix
Zusammenfassung
Viele moderne Signalverarbeitungsaufgaben werden im digitalen Bereich gel¨ost. Abh¨angig von der geforderten Signalqualit¨at und der verf¨ugbaren Bandbreite werden unterschiedliche Abtastraten verwendet. Abtastratenwandlung
ist daher unumg¨anglich, um Systeme mit unterschiedlichen Abtastraten zu
verbinden. In dieser Doktorarbeit werden die Konzepte f¨ur die digitale Abtastratenwandlung von ganzzahligen zu beliebigen Verh¨altnissen erweitert.
In Abh¨angigkeit von verschiedenen Parametern wird die Qualit¨at und der
Aufwand bez¨uglich Audioanwendungen untersucht.
Vom Prinzip her werden die Eingangswerte zu einem stark u¨ berabgetasteten Zwischensignal interpoliert unter Verwendung eines digitalen Filters hoher Ordnung. Um eine Wandlung zwischen beliebigen Abtastraten zu
erm¨oglichen, wird dieses Zwischensignal in eine pseudo-analoge Darstellung
gewandelt. Dazu wird der Wert des Zwischensignals zwischen zwei Abtastwerten gehalten. Das resultierende zeitkontinuierliche Signal wird danach mit
dem Ausgangstakt neu abgetastet.
F¨ur einen unendlich grossen Interpolationsfaktor, ideale Filter und jitterfreie Taktsignale ist die digitale Abtastratenwandlung verlustlos. In dieser
Doktorarbeit werden die Fehleranteile von verschiedenen Nichtidealit¨aten
abgesch¨atzt. Die Synthese von schmalbandigen Tiefpassfiltern hoher Ordnung (> 20000) und deren Eigenschaften f¨ur begrenzte Wortl¨angen werden
behandelt. Ein digitaler PLL wird vorgestellt, welcher die Abtastzeitpunkte
am Ausgang relativ zum Eingangstakt sehr genau misst und gleichzeitig Jitter
der Taktsignale d¨ampft. Die durch Taktjitter verursachten Auswirkungen von
nicht a¨ quidistanten Abtastzeitpunkten werden ebenso besprochen.
Ein ASIC f¨ur ausschliesslich digitale Abtastratenwandlung wurde entwickelt, hergestellt und mit einem vollst¨andig digitalen Messystem ausgemessen.
F¨ur die Spektralanalyse von zeitdiskreten Signalen mit mehreren Abtastraten
xi
wird eine neue (komplexe) Fenstertechnik f¨ur die FFT verwendet, welche die
Nebeneffekte des Fensters auf ein Minimum reduziert. Ausgehend von den
Resultaten dieser Doktorarbeit werden die Parameter f¨ur einen Wandler mit
voller 20-bit Qualit¨at vorgeschlagen.
1
Introduction
1.1
Motivation
Modern signal processing problems are often solved in the digital domain
due to the availability of powerful VLSI circuits which allow to perform
complex operations in real-time, without the well-known shortcomings of
analog implementations. The source signal is transformed into the digital
domain by an A/D converter. All data processing, e.g. filtering, shaping,
mixing etc., is done in the digital domain and only the final result is converted
back to analog. To overcome the degradation caused by successive A/D-D/A
conversion, all processing blocks must have digital interfaces.
Depending on the available bandwidth of the channel, the required quality,
and the data rate of the interfaces a wide variety of sample rates are used:
e.g. 48 kHz for professional digital audio systems, 44.1 kHz for consumer
digital audio (CD), 32 kHz for digital satellite radio (DSR), or 44.056 kHz for
video compatibility (VCR). The incorporation of all these systems is however
trouble-free, if a sample-rate converter is used at each interface.
Sample-rate conversion is also used in the growing field of digital audio
compression systems. Downsampling followed by data compression allows
transmission of a high-quality audio signal over a 64 kbit/s ISDN telephone
line [dGL94].
1
From source to sink, signals are repeatedly converted for editing, storage,
and transmission. To retain the signal quality of the original recording –
even after several cascaded converters – the conversion must be virtually
transparent.
1.2
Concept
Many digital signal processing applications necessitate a conversion of the
sample rate at the interface to match it to the following system. Conceptually, sample-rate conversion by non-rational ratios is accomplished in the
digital domain by interpolating the incoming samples to a high enough sampling frequency (usually in the GHz range) and subsequently decimating this
oversampled data to the target sample rate by choosing the appropriate samples. Practically the problem will be partitioned into a frequency tracking unit,
which determines precisely the sink phase relative to the source samples, and
into a high-order digital interpolation/decimation filter. This filter limits the
signal either to half the source or to half the sink sample rate to avoid aliasing.
Therefore two different modes of operation result depending on whether the
source or the sink sample rate is larger.
Research activities of our laboratory demonstrated a way to overcome the
need for two separate modes for up- and downsampling: The cutoff frequency
of the interpolation/decimation filter is dynamically adjusted – according to the
ratio of sink and source sample rate – by using the frequency scaling property
of the Fourier transform. We call this capability for a smooth transition
between up- and downsampling CONTINUOUS MODE [RW90]. Based on the
CONTINUOUS MODE we proposed and realized a VLSI architecture of a samplerate converter for arbitrary ratios [LS91]. We will refer to this implementation
as SARCO (SAmple-Rate COnverter).
For the interpolation/decimation filter, the precise ratio of the source and
the sink sample rates must be measured. If both sample rates are synchronous
i.e. derived from the same master clock and if this master clock is available,
both rates can be measured by a counter running with the master clock. If the
master clock is not available, it can be recovered by means of a phase-locked
loop (PLL), which additionally suppresses clock jitter. A buffer (FIFO) is then
used for the data exchange. However, non-synchronous interfaces require a
new solution.
In general three cases can be distinguished:
A. Synchronous Interfaces
B. Plesiochronous Interfaces
C. Asynchronous Interfaces
Synchronous interfaces are the easiest to handle (as shown above), but
real-world problems are often plesiochronous or asynchronous. In the plesiochronous case two systems may have the same nominal sample rates, but
due to tolerances or temperature drift of the two independent oscillators, the
two frequencies are not phase-locked, but differ by a small, varying amount e.g.
in the range of 100 ppm. Asynchronous interfaces apply for examples where
sources of different independent sample rates are to be mixed or equipment
with arbitrary sample rates has to be connected.
The frequency measurement can be realized accurately by several methods:
1. Measurement performed with a very high-frequency clock (GHz)
2. Averaging measurements performed with a medium-frequency clock
3. Determination of the precise sample phase by a (digital) phase-locked
loop
For dynamically changing sample rates an optimal trade-off between precise tracking and noise suppression must be found. High-frequency measurement allows to follow sample-rate changes immediately. Unfortunately jitter
of the sampling frequencies is not suppressed in this case, but modulated onto
the audio signal. The lowpass filters used for systems of type 2 and 3 remove
high frequency noise at the cost of slower tracking.
In this thesis the emphasis is put on a unified approach to the implementation of sample-rate conversion of audio signals with arbitrarily changing ratios.
The effects of non-ideal filtering and non-uniform sampling are analyzed and
their effect on the signal quality is discussed. Additionally, considerations
on VLSI implementation are presented. This field of research involves a
large number of digital signal processing topics, including high-order filter
synthesis, digital phase-locked loops (DPLL), real-time filtering, time-varying
and adaptive filters, polyphase structures, finite word-length effects, spectrum
analysis, and VLSI architecture.
1.3
Related Work
The simplest solution for sample-rate conversion in the digital domain is to
repeat or omit a sample from time to time to match the sample rates. This
method is especially applicable for plesiochronous sample rates, but yields
only low quality results. In professional applications the problem of samplerate conversion by arbitrary ratios is often reduced to the synchronous case
by equipping all external sources, such as CD players or DAT machines, with
a synchronization input. The speed of the external source is then adapted
according to the system master clock.
The basic operations for synchronous sample-rate conversion (type A)
from a signal processing point of view have been covered in [CR81]. [LK81,
LPW82] extended this work to non-rational ratios,but distinguished two modes
depending on whether the sink sample rate is larger or smaller than the source
rate. Plesiochronous interfaces (type B) cannot be realized with this distinction
without smooth switching between modes. A first hardware implementation
has been realized following the above principles, which required some 400
integrated circuits per audio channel [Lag82]. In 1983 a theoretical exposition
of many aspects of multirate digital signal processing appeared in [CR83]. It
treats the advantages and disadvantages of FIR and IIR filters for interpolation
and decimation by rational ratios in detail and gives many design examples.
These realizations all have the disadvantage that they are not well suited for
the CONTINUOUS MODE, but only for fixed sample-rate ratios. [CR83] also
covered the realization of high-order filters by cascading several low-order
stages, which are easier to realize. [Ram84] discussed how to reduce the
storage requirements for the filter coefficients by using various interpolation
methods.
Most publications on sample-rate conversion do not cover in depth the
need for proper frequency tracking although this is crucial for high quality
audio applications. [LPW82] suggested the use of a moving average filter
for frequency tracking. [Sti92] used a digital PLL running at a multiple of
either the source or the sink sample rate. Frequency measurement with a
GHz counter has not jet been realized due to the lack of jitter suppression and
technology limitations.
In addition to the prototype hardware implementation mentioned above,
several sample-rate converters have been realized using general purpose digital signal processors. [PHR91] implemented a sample-rate converter on a
DSP56001. The interpolation/decimation filter has been realized by multiplying a SINC function by the Blackman-Harris window function, as suggested
by [Ram82]. [CDPS91] realized a sample-rate converter between 44.1 kHz
and 48.0 kHz using B-splines. A two stage FIR interpolation each by a factor
2 is followed by a 6th order B-spline interpolation. However, these implementations are limited to fixed sample-rate ratios.
The first commercial VLSI implementation of a sample-rate converter using the CONTINUOUS MODE has appeared recently [AK93]. A second commercial realization of a sample-rate converter by [JTG+ 94] follows an alternative
approach, which will be discussed briefly.
1.4
Structure of the Thesis
This thesis describes theory, experience, and results that have been gained
during the development and test of an application specific integrated circuit
(ASIC) for sample-rate conversion of audio signals with arbitrarily changing ratios. In particular the error contribution of various non-idealities are
discussed and the parameters for a true 20-bit converter are derived.
In Chapter 2 an overview of the basic principle for sample-rate conversion
in the digital domain is given. The well-known concepts of interpolation and
decimation by integer factors are applied to sample-rate conversion by rational
ratios. The CONTINUOUS MODE already published by the author in [RW90],
which extends the sample-rate conversion to arbitrary sample-rate ratios, is
described in detail.
Chapter 3 gives insight into the strategies for the design and synthesis of
very high-order (> 20 000) narrow-band lowpass filters. Several one- and
multistage implementations are compared in performance, implementation
cost, and required computation power. An interpolation method which reduces
the storage requirements for the filter coefficients is presented. Additionally,
the effects of filter coefficient quantization on the transfer function is discussed.
Chapter 4 compares several alternatives for the realization of the frequency tracking unit. The performance and cost of high-frequency measurement,
averaged medium-frequency measurement, and a digital PLL are compared
for synchronous, plesiochronous and asynchronous sampling frequencies. In
addition the effects of non-uniform (jittered) sampling are discussed.
Chapter 5 evaluates the system complexity and compares ASIC and DSP
implementations according to hardware requirements and costs.
In Chapter 6 the measurement principles for multirate digital audio systems
and the hardware setup used for the measurements in this thesis are presented.
A new window technique for the FFT is introduced, which allows to minimize
the side effects of the window when doing spectrum analysis of multirate
discrete-time signals.
Chapter 7 compares the different concepts, architectures and implementations by measurements.
2
Principle in Time and Frequency
Domain
The obvious method to transfer an audio signal between unassociated sample
rates is digital-to-analog (D/A) conversion followed by analog-to-digital (A/D)
conversion (Fig. 2.1).
D/A
DA-ADconv.epsi
f source 103 21 mm
f source
Reconstruction
Filter
A/D
f sink
Anti-aliasing
Filter
f sink
Figure 2.1: D/A conversion followed by A/D conversion
This method works for any ratio of sink sample rate fsink to source sample
rate fsource , but is tainted with several limitations. First of all, to retain the
large signal-to-noise ratio (S/N) of a digital audio signal expensive high-quality
D/A and A/D converters have to be used. Secondly, the output signal of the
D/A converter must be reconstructed by an analog filter of high precision
with cutoff frequency fsource =2. Additionally, to fulfill the Nyquist theorem,
the input signal of the A/D converter must be band-limited to fsink =2 by an
anti-aliasing filter.
If the sink sample rate fsink is larger than the source sample rate fsource
7
– we will refer to this case as UPMODE– the second filter can be omitted. In
the opposite case (DOWNMODE) the first filter is not necessary. Therefore both
lowpass filters can be merged into one with cutoff frequency
fcutoff =
fsource if f
sink > fsource UPMODE
2
fsink
otherwise
DOWNMODE
(2.1)
2
This reconstruction/anti-aliasing filter, which must be realized in the analog
domain, needs a flat passband, a narrow transition band and a highly attenuated
stopband. The required specifications for 16 bit quality make the problem
hardly solvable.
A third disadvantage of the D/A-A/D approach is – besides the need for
high-quality D/A and A/D converters and the expensive analog filter – that
any jitter on the sampling clocks will translate into signal distortion and thus
needs to be suppressed by additional circuitry.
We can avoid many of the above problems by performing the sample-rate
conversion process in the digital domain. The A/D and D/A converters are
omitted and the analog filters are replaced by digital ones. This concept will
be explained in detail in the following sections.
2.1
Test Signals and Sampling
The spectrum of the example signal that is used for illustrative purposes
throughout this thesis is shown in Figure 2.2.
X(f)
Signal.epsi
49 31 mm
1
2
⋅f sample
f sample
f
Figure 2.2: Spectrum of example signal
It is composed of three parts: a low-frequency sine wave, a high-frequency
sine wave and a third component, which declines linearly with the frequency.
This third component shall represent the spectrum of a typical audio signal,
while the first two show the response for low- and high-frequency components.
For noise calculations three sine waves at 1; 10; 20 kHz, white noise, and
pink noise are used as test signals. White noise is a random signal, which has
an equal amount of energy per Hertz of bandwidth. Pink noise however is
random noise delivering an equal energy per octave [Ben88]. This distribution
matches more precisely the sensitivity of the ear. Figure 2.3 shows the spectral
power density of pink noise and white noise respectively.
S(f) [dB/Hz]
-20.8
Pink.epsi
101 33 mm
-30.8
-40.8
-46.0
-50.8
20
200
2 000
20 000 f [Hz]
Figure 2.3: Pink noise S (f ) = 61f , white noise S (f ) = ∆1f
The well-known sampling theorem states that a continuous-time signal x(t),
with a spectrum X(f) that is band-limited to fsample =2, can be uniquely and
error-free reconstructed from its samples x[n]. The spectrum of these samples
is composed of the original spectrum X(f), which repeats at all multiples of the
sampling frequency (Fig. 2.4). If the analog signal x(t) contains components
at frequencies higher than fsample =2 they will distort the sampled signal in the
form of spectral fold-over (aliasing) and can not be recovered anymore.
baseband
Sampling.epsi
99 18 mm
f sample
2⋅f sample
3⋅f sample
4⋅f sample
5⋅f sample
Figure 2.4: Spectrum of a sampled signal with sample rate fsample
A second parameter – besides the sampling frequency – which characterizes the sampled data is the word-length used to represent the data samples.
The effects of finite word-length will be neglected in the following sections,
but will be discussed in Sections 3.5 and 5.2.2.
2.2
Interpolation
In this section the increase of the sample rate by an integer factor L is described.
In the following we refer to this process as interpolation.
f sample
L⋅f sample
Interpolation.epsi
99 37 mm
HL
L⋅f sample
f sample
Figure 2.5: Interpolation by a factor L
If we increase the sample rate of a signal we can preserve the full signal
content according to the sampling theorem. After the interpolation the signal
spectrum repeats only at multiples of the new sample rate L fsample (Fig. 2.5).
The interpolation is accomplished by inserting L-1 zeros between successive
samples (zero-padding) and using an (ideal) lowpass filter with
(
HL (f ) =
L
0
f
for 0 < f < sample
2
f
for sample
<
f
< Lfsample
2
2
(2.2)
The multiplication of a signal with spectrum X (f ) with a transfer function
H (f ) in frequency domain Y (f ) = H (f ) X (f ) corresponds to a convolution
of the signal in the time domain y (t) = h(t) ? x(t). For the zero-padded
discrete-time signal x[.] we get
+1
X
k
y[m] =
hL [m , k] x L ;
k=,1
for k
=
L; 2L; 3L; : : :
(2.3)
If we substitute k by (m div L) L , n L we get
y[m]
=
=
+1
X
n=,1
+1
X
n=,1
hL m , (m div L) L + n L x [m div L , n]
(2.4)
hL [n L + m mod L] x [m div L , n]
(2.5)
where m div L is the integer portion of the division m=L , and m mod L
denotes the remainder.
For each output value y [m] we therefore multiply and accumulate n samples of the impulse response hL [:] by the corresponding input samples x[:].
Note that the samples of the impulse response hL[:] are equidistant with spacing
L.
2.3
Decimation
To decrease the sample rate by an integer factor M (decimation) we must
first band-limit the signal to fsample =(2 M ) (Fig. 2.6) by the lowpass filter
HM (Eq. 2.6) to comply with the sampling theorem and keep only every Mth
sample. As a result, we loose all signal content above half the target sampling
frequency fs =M .
HM
1/M⋅f sample
Decimation.epsi
104 38 mm
1/M⋅f sample
Figure 2.6: Decimation by a factor M
f sample
f sample
(
HM (f ) =
1
0
f
for 0 < f < sample
2M
f
f
sample
for 2M < f < sample
2
(2.6)
To get the decimated signal we start from an initial phase '0 , which can
be chosen arbitrarily, keep every Mth sample and skip all other samples
(Eq.2.7). There exist therefore M different sets of samples (depending on the
initial phase '0 ), which all represent the same signal.
y[m] =
2.4
+1
X
n=,1
hM [m M + '0 , n] x [n] ;
for '0
=
0; : : : ; M
,1
(2.7)
Sample-rate Conversion for Fixed Ratios
In the preceding section we have considered interpolation and decimation as
two separate operations. To perform sample-rate conversion by a fixed ratio
M=L these two operations have to be cascaded (Fig. 2.7). Note that the
interpolation must precede the decimation to keep the maximum bandwidth of
the source signal.
L
HM
HL
M
Conversion.epsi
103 38 mm
f source
L⋅f source
L⋅f source
f sink
L/M⋅f source = f sink
Figure 2.7: Sample-rate conversion by a fixed ratio M=L
The two lowpass filters can be merged into a single one with a cutoff
frequency, which is half of the minimum of source and sink sample rate
(Eq. 2.8).
HLM (f ) =
L
0
for 0 < f < min( fsource
; fsink
2
2 )
otherwise
(2.8)
The interpolation of the source signal by the factor L is succeeded by
the decimation of factor M . Therefore, instead of calculating the interpolated
signal at the rate L fsource and keeping only every Mth value, we can operate
the interpolation/decimation filter at the rate L=M fsource , as indicated by the
dotted line in Figure 2.7, but we must choose the correct set of filter coefficients
for each sink sample.
If we substitute m in Equation 2.5 by 'sink [m] = m M + '0 (Eq. 2.7)
we obtain the formula for sample-rate conversion by fixed integer ratios
y[m] =
+1
X
j =,1
hLM j L + 'sink [m] mod L x 'sink [m] div L , j
(2.9)
with
'sink [m] = m M + '0 ;
mod
Modulo division;
div
Integer division
This is the all-digital equivalent to the D/A-A/D solution (Fig. 2.1, Eq. 2.1),
if we use a infinitely large interpolation factor L.
The theoretical concept for sample-rate conversion by fixed integer ratios
has already been treated in [CR83]. But for an actual realization of Equations 2.8 and 2.9 the following problems are to be solved:
1. Equation 2.8 specifies a lowpass filter with a constant passband gain
and a transition band of width zero. The impulse response of such
a rectangular filter with cutoff frequency fcutoff and a sample rate
L fsample is
h[n] =
f
sin[2n Lcutoff
fsource ]
f
2n Lcutoff
fsource
=
SINC [n 2 fcutoff
L fsource ]
(2.10)
and extends from n = ,1 to n = +1. An actual realization can
only be a high-order lowpass filter of finite length due to the causality
principle. The synthesis problems for such high-order (lowpass) filters
are discussed in Chapter 3. In the following we assume that the filter is
of order Q L + 1 with a sufficiently large Q.
2. For a lossless sample-rate conversion we assumed an infinite stopband
attenuation of the lowpass. A filter of finite length can only reach a
finite stopband attenuation, however. Furthermore the quantization of
the filter coefficients h[:] additionally alters the filter transfer function.
In Chapter 3 we will evaluate the achievable stopband attenuation of
the lowpass h[:] and introduce the concept of quasi-floating-point (QFP)
notation for the filter coefficients. The effects of finite word-length
computation in the datapath are discussed in Section 5.2.
3. The cutoff frequency of the lowpass (Eq. 2.8) is not constant for variable
ratios M=L, but is defined by
fcutoff =
fsource if f
sink > fsource UPMODE
2
fsink
otherwise
DOWNMODE
(2.11)
2
[Pel82] distinguished two modes of operation depending on whether the
sink sample rate is larger or smaller than the source rate. The same
lowpass filter either runs with the source or the sink sample rate and
thereby only one set of filter coefficients is needed. The disadvantage
of this approach is that switching from one mode to the other requires
muting of the output signal. The CONTINUOUS MODE to be presented in
Section 2.5.3 overcomes this limitation.
4. For fixed sample rates M and L can be calculated in a straightforward
manner, e.g. for fsource = 48 kHz and fsink = 44:1 kHz we get M =
160 and L = 147. On the other hand, if the two clock sources are
not synchronous their ratio can be non-rational and may even vary over
time. The sample-rate converter must be able to accept any ratio of sink
and source sample rate. Furthermore it must suppress short-time jitter
of the sampling clocks. The different frequency tracking principles for
arbitrary ratios M=L are described in Chapter 4, whereas in Section 2.6
we will estimate the required interpolation factor L for professional
audio quality.
2.5
Sample-rate Conversion for Rational M/L
In Section 2.4 we treated the sample-rate conversion for fixed ratios and stated
that the cutoff frequency of the interpolation/decimation lowpass must be
either fsource =2 (if fsource < fsink ) or fsink =2 (Eq. 2.11). In this section we
describe how the frequency scaling property of the Fourier transform can be
used to adjust the cutoff frequency of the lowpass filter dynamically to allow
a smooth transition between both modes.
Adaptive Filter
Rational.epsi
103 26 mm
L
f source
M
f sink
L⋅f source
Figure 2.8: Sample-rate conversion by a rational ratio M=L
Note that since M=L is rational all operations can be performed on a
discrete time-grid. The time unit is defined by the source sample rate as
tunit = L f1
(2.12)
source
Using this time unit, the source sample period is L tunit and the sink sample
period is accordingly M tunit .
2.5.1
Upmode
If the sink sample rate is larger than the source sample rate we can preserve
the full bandwidth of the source signal. If we assume filter h to be of order
Q L + 1, using Equation 2.9, we get
y[m] =
Q
+ 2
X
j =, Q2
2
z
}|
sink [ ]
h j L+'
6
4
' m]
∆ [
m
3
2
{
7
5
x '
mod L
z
6
4
with 'sink [m] = m M
+
ptr[m]
}|
sink [ ]
m
'0
3
{
div L ,j 5 (2.13)
7
Figure 2.9 gives a graphical representation of Equation 2.13. Because all
mathematical operations in Equation 2.13 are integer operations, we will stay
on the time-grid defined in Equation 2.12.
ptr[m]
source
samples
L
∆ϕ[m]
ϕ0
sink
samples
ϕ sink
M
Upmode.epsi
104
52 mm
[m]
h [n]
∆ϕ[m] -L+∆ϕ[m]
L+∆ϕ[m] -2⋅L+∆ϕ[m]
Figure 2.9: UPMODE, fsink
> fsource, L > M
Starting from a measured initial phase '0 (on the grid) the phase of the
sink samples 'sink [m] is incremented by M (time units) for every value y [m].
Each value of 'sink [m] determines both the phase of the sink sample relative
to the last source sample ∆'[m] as well as its position ptr [m]. Depending on
the phase difference ∆'[m] one particular subset of the samples of the impulse
response h is taken. The samples of the impulse response have a spacing of L.
Since the lowpass filter h is of order Q L + 1, L different subsets of Q + 1
values exist.
2.5.2
Downmode
If the sink sample rate is smaller than the source sample rate we are not able
to preserve the full frequency range of the source signal, but must limit it to
fsink =2 by the lowpass filter h0 (Eq. 2.11).
Q
+ 2
X
j =, Q2
2
' m]
∆ [
3
2
z
ptr[m]
3
}|
{
}|
{
z
6
7
6
0
h 4j L + 'sink [m] mod L5 x 4'sink [m] div L ,j 75
(2.14)
In order to use the same lowpass filter coefficients as in the UPMODE
(fcutoff = fsource =2) we use the frequency scaling property of the Fourier
transform to alter the cutoff frequency from fsource =2 to fsink =2.
fsink i
h
fsource
fsource
h0[i] = fsink
L h L i
=
M
M
(2.15)
If Equations 2.14 and 2.15 are combined we get for y [m] in the DOWNMODE
2
Q
M
L2
X
3
' m]
∆ [
2
3
ptr[m]
z
}|
{
}|
{
z
L
6L
7
6
7
(j L + 'sink [m] mod L)5 x 4'sink [m] div L ,j 5
h
4
M j=, M Q M
+
L
2
with 'sink [m] = m M
+
'0
L
ptr[m]
source
samples
sink
samples
(2.16)
∆ϕ[m]
ϕ0
M
ϕ sink [m]
Downmode.epsi
108 69 mm
h’[n]
h [n]
∆ϕD[m] -L2/M+∆ϕD[m]
L2/M+∆ϕD[m] -2⋅L2/M+∆ϕD[m]
Figure 2.10: DOWNMODE, fsink
< fsource, L < M
Figure 2.10 illustrates Equation 2.16. The phase 'sink [m] for each sink
sample y [m] is calculated in the same way as in the UPMODE, but the filter
coefficients must be scaled by L=M . The effective phase difference ∆'D [m]
is therefore L=M ('sink [m] mod L) and the samples of the impulse response
have a spacing of L2 =M . Since L=M determines only the cutoff frequency
of the decimation filter we can alter it slightly so that L2 =M is an integer and
stays on the time-grid defined in Equation 2.12.
P
The number of multiplication and accumulations ( x[:] h[:]) increases
from Q + 1 in the UPMODE to Q M=L + 1 in the DOWNMODE.
2.5.3
Continuous Mode
The CONTINUOUS MODE combines both UPMODE and DOWNMODE. Since the
cutoff frequency of the lowpass filter is adapted continually, there is a smooth
transition between both modes.
y[m] = % QD
+ 2
X
j =, QD
h j L % + ∆'D [m] x ptr[m] , j
(2.17)
2
with
'sink [m]
ptr[m]
%
∆'D [m]
QD
UPMODE
DOWNMODE
if L > M
if L < M
i.e. fsink
> fsource i.e. fsink < fsource
m M + '0
'sink [m] div L
1
'sink [m] mod L
Q
L
M
L
M ('sink [m] mod L)
M Q
L
Recall that to stay on the time-grid, % must be altered slightly so that L2 =M
is an integer. Additionally, QD must be rounded up to the next even integer
and the initial phase ∆'D [0] = L=M ' must be rounded to the next integer.
2.6
Sample-rate Conversion for Any Arbitrary
Ratio
Up to now we have assumed that the ratio of source to sink sampling frequency
is a rational number and can consequently be written as M=L. If we choose L
large enough we can approximate any rational number to a sufficient precision.
However there are practical implementation limits for L. In this section we
will estimate the minimum required order of L for 18- and 20-bit quality.
x(t)
y(t)
HoldTime.epsi
107 35 mm
â
x[n]
T=1/(L⋅f source)
Figure 2.11: Continuous-time representation
between samples xˆ [n]
y(t)
by holding value in-
If the source and the sink sample rates have an arbitrary (or even varying)
ratio, we need to change from the discrete time-grid (Eq. 2.12) to a continuoustime representation. This can accomplished by interpolating the source signal
with a large, but fixed interpolation factor L and holding the value in-between
the interpolated samples (Fig. 2.11). This continuous-time signal is then
resampled at the sink sample rate.
Note that instead of holding the sample value between successive samples,
linear or higher order interpolation between samples could be used as well.
However, since the available time-resolution is limited, these operations must
be followed by a hold operation to get a continuous-time signal. We will
discuss this approach further in Chapter 3.
Figure 2.12 shows a block diagram of the required operations using the hold
operation. The source signal x[n] is interpolated to xˆ [n] and then converted
to the continuous-time signal y (t) by the hold operation. y (t) can then be
resampled at any required moment.
The hold operation of the value of the interpolated source signal during the
Adaptive
Filter
L
Continuous
Time
Discrete
Time
Arbitrary.epsi
105
30 mm Hold
â
x[n]
x[n]
f source
Resample
y(t)
y[m]
f sink
L⋅f source
Figure 2.12: Sample-rate conversion by an arbitrary ratio M=L
time T = 1=(L fsource) can be expressed in the time domain as a convolution
of the interpolated samples with a rectangle of width T (Eq. 2.18).
y(t) = xˆ [n] ?
RECT (
1
L fsource )
(2.18)
This convolution in the time domain corresponds to a multiplication with
a SINC function in the frequency domain (Eq. 2.19).
Y (f ) = Xˆ [f ] HHold (f ) = Xˆ [f ] SINC (
f )
L fsource
(2.19)
After interpolation by a factor L the spectrum of the source signal repeats
at all multiples of L fsource . Figure 2.13 shows both the spectrum of the interpolated source signal and the transform of the hold operation. The resulting
spectrum is the multiplication of both.
H Hold
Hold.epsi
103 30 mm
L⋅f source
=
1⋅f i
2⋅L⋅f source
=
2⋅f i
3⋅L⋅f source
=
3⋅f i
Figure 2.13: Interpolation by L and hold operation
The baseband signal up to fsource =2 is left (almost) unchanged by the
hold operation, while all remaining signal components at multiples of fi =
L fsource are attenuated by the SINC . They will be folded back into the
baseband after the decimation. If fsource and fsink are uncorrelated we can
assume that the signal amplitudes of different folding products that are folded
back to the same location in the baseband do neither add nor totally cancel each
other. Therefore their energy can be summed up. The total error power due to
the hold operation for a full-scale signal is listed in Table 2.1 for five different
signals. The derivation of these values can be found in Appendix B. Note that
these errors correspond to full-scale signals, and scale down proportionally for
smaller signal levels.
Sine @ 1 kHz
Sine @ 10 kHz
Sine @ 20 kHz
White Noise
Pink Noise
Error in [dB] for L=
2k
216
218
,28:5 , k 6:02 ,124:8 ,136:9
,8:5 , k 6:02 ,104:8 ,116:9
,2:4 , k 6:02 ,98:7 ,110:8
,8:6 , k 6:02 ,104:9 ,117:0
,13:2 , k 6:02 ,109:5 ,121:6
Table 2.1: Error caused by hold operation (fsource
=
220
,148:9
,128:9
,122:8
,129:0
,133:6
48 kHz)
The largest portion of the error is contributed by the components at the
first zero crossing of the SINC function (Fig. 2.13, f = fi ). If the audio
signal is a sine wave with frequency fsig we get two contributions around fi
at f = L fsource fsig with an amplitude of fsig =(L fsource ) (Eq. B.2).
These two components contain about 60 % of the total error energy due to the
hold effect. I.e. for a 1 kHz (20 kHz) signal sampled at fsource = 48 kHz and
an interpolation factor of L = 216 the two error peaks appear ,5:2 dB below
the value given in Table 2.1 at ,130 dB (,104 dB).
In this section we have calculated the error introduced due to an arbitrary,
non-rational ratio of source and sink sample rate depending on the interpolation
factor and various source signals. Thereby we presumed the sample-rate ratio
and thus the resampling moment to be known precisely. We will discuss
this assumption further in Chapter 4, after the presentation of the frequency
tracking unit.
For performance and cost estimations in the following chapters we will
use two different interpolation factors:
1. For L = 216 18-bit quality can be reached for a pink noise audio signal,
but for full-scale signals close to fsource =2 only 16-bit accuracy is
possible.
2. For L = 220 the sum of all error components is below ,122 dB for all
five signals shown in Table 2.1. 20-bit quality can therefore be achieved
over the full audio bandwidth.
3
High-Order Digital Filters
In Chapter 2 we assumed that the interpolation/decimation lowpass has a constant passband gain, a transition band of width zero, and an infinite stopband
suppression (Eq. 2.8). In this chapter we will estimate the minimally required
filter specifications for professional digital audio applications and discuss various realization methods. Furthermore the quantization properties of high-order
narrow-band lowpass filters are discussed.
As shown in Section 2.5.3, the CONTINUOUS MODE uses the same filter
coefficients for the UPMODE and the DOWNMODE. In the following sections we
will therefore design the filter and estimate the performance for the UPMODE.
We will discuss the implications on the DOWNMODE in Section 3.3.
3.1
Required Filter Characteristics
The interpolation/decimation filter is specified by a passband from 0 to fp with
a ripple smaller than p and a stopband that starts at fs with an attenuation
of at least s (Fig. 3.1).
23
δp
FiniteFilter.epsi
103 32 mm
δs
fp fs
f i = L⋅f source
f source
Figure 3.1: Filter with finite stopband attenuation
3.1.1
Transition Band
The transition band of the filter must be narrow, because signal components
just below fsource =2 must pass unaltered, whereas those just above must be
sufficiently suppressed. For high-quality audio fp = 15=32 fsource and
fs = 17=32 fsource is chosen [Pel82]. Note that the transition band is
symmetric in respect to fsource =2 and will therefore not alias into the baseband
(below fp = 15=32 fsource). Table 3.1 gives the borders of the transition band
for the three most common sampling frequencies. With 48 kHz for example,
the passband extends up to 22.5 kHz.
fsource
fp
fs
48.0 kHz
44.1 kHz
32.0 kHz
22.5 kHz
20.7 kHz
15.0 kHz
25.5 kHz
23.4 kHz
17.0 kHz
Table 3.1: Transition bandwidth for common sampling frequencies
3.1.2
Stopband
In Section 2.3 we stated that the interpolated signal is to be band-limited to
half the target sample rate before the decimation takes place. This requirement
is only approximated by a filter with a finite stopband attenuation s (Fig. 3.1).
The remaining stopband energy (from fsource =2 upwards) of the higher order
images is therefore attenuated by s and is then folded back into the baseband.
The alias error of the residual signals at multiples of fi = L fsource caused
by the hold operation has already been discussed in Section 2.6 (Fig. 2.13,
FilterHold.epsi
103 27 mm
δs
1⋅f i
2⋅f i
3⋅f i
Figure 3.2: Weighted stopband error
Table 2.1). This contribution is not affected by the finite stopband attenuation
and is therefore left blank in Figure 3.2.
As mentioned above, an additional error contribution results from the nonzero stopband which is weighted by the SINC -function of the hold operation
(Fig. 3.2). For a large interpolation factor L the signal energy is approximately
equally distributed over the stopband since the signal spectrum is repeated
many times. If we further assume that the filter has a constant stopband
attenuation s the total stopband energy is obtained by
Estop = 2 1Z
X
j =0
j
fi, fsource
( +1)
2
j fi+ fsource
2
s) f
(
2
source
!2
sin( ffi )
df
ffi
(3.1)
Assuming the same error energy density in the (narrow) passband as in the
stopband right through, Equation 3.1 can be simplified (using Eq. A.2) to
Estop =
Z
1 (s)2
!2
sin( ffi )
df
fsource ffi
2 (s )2 fi
2
= L (s )
fsource 2 2
0
(3.2)
If the error introduced by the finite stopband attenuation should be smaller
than ,110 dB and L = 216 (hold error at ,110 dB for pink noise; Table 2.1)
then s must be less than ,160 dB (Eq. 3.2). For a stopband attenuation of
,130 dB and L = 220 , s must even be below ,190 dB.
3.2
Realization Considerations
In this section we discuss several alternatives for the realization of the interpolation/decimation filter specified in the previous section. In Section 3.2.1
the order and computational cost of a single stage FIR filter is estimated. This
filter is followed by the hold operation and the decimation. In Section 3.2.2
the hold operation is replaced by linear interpolation, allowing a lower order
filter. In Section 3.2.3 alternative implementations are discussed briefly.
3.2.1
FIR Filter and Hold Operation
The most direct implementation is the use of a single stage FIR filter followed
by the hold operation as depicted in Figure 3.3. What is the order of a single
stage FIR filter, which fulfills the specifications given in Section 3.1?
Discrete
Time
Single
Stage SingleStage.epsi Hold
FIR
97 22 mm
L
Continuous
Time
Resample
f sink
f source
Figure 3.3: Single stage FIR filter followed by hold operation
[Rab73] derived an empirical formula to estimate the order of a Chebyshev
lowpass filter for specified p ; s ; fp ; fs . Table 3.2 gives the approximate order
N of the lowpass filter according to this formula for various values of Estop,
L, and p. The parameters s , fp , fs are calculated according to Equation 3.3.
q
1
17 1
;
f
=
s = Estop=L; fp = 15
s
32 L
32 L
(3.3)
Assuming a decent passband ripple of 0:01 dB, a filter of order 6 600 000
is required. If the passband ripple is reduced to 0:001 dB, the filter order
increases only slightly to 7:5 106 . A filter of order 8:2 106 results, if the
stopband error is reduced to ,130 dB. If the interpolation factor is increased
to lower the influence of the hold operation the filter order increases rapidly.
Estop L
[dB]
,110
,110
,130
,130
,130
216
216
216
218
220
p
s
[dB]
[dB]
0:01
0:001
0:001
0:001
0:001
,158
,158
,178
,184
,190
N
Q
6:6 106
7:5 106
8:2 106
33:9 106
138:9 106
102
115
126
129
132
Table 3.2: Estimated filter order
Recall that to compute one sink sample only a subset of Q coefficients
of the impulse response of the filter is taken (Q = N=L, Sec. 2.5.1). The
computational cost (which is proportional to Q) is consequently about the same
independent of the interpolation factor (last three rows of Table 3.2). However,
the number of filter coefficients that must be stored increases proportionally
to the interpolation factor. For L = 220 over 108 filter coefficients must be
stored. Additionally, direct synthesis of a lowpass filter of order 108 is clearly
beyond practical feasibility.
3.2.2
FIR Filter and Lagrange Interpolation
3.2.2.1
Principle
In Section 2.6 we used the hold operation to convert the highly interpolated
source signal to a continuous-time signal, which was then resampled at the sink
sample rate. Instead of holding the value between two consecutive samples
of the interpolated signal, we now use first order Lagrange interpolation (from
now on simply called linear interpolation) between the two samples to calculate
the sink value (Fig. 3.4).
The frequency response of a linear interpolator has a SINC 2 characteristic.
The noise suppression around the zero crossings of a SINC 2 is much higher
compared to the SINC function of the hold operation. Therefore a substantially smaller interpolation factor is sufficient. The error energy around the
zero crossings for a given interpolation factor L1 = 2k is shown in Table 3.3
and derived in Appendix C. If the error caused by the linear interpolation
(Table 3.3) is compared to the error caused by the hold operation (Table 2.1) it
can be seen that L1 = 27 corresponds to L = 216 and L1 = 29 corresponds to
Discrete
Time
Linear
L1
FIR
Lagrange1.epsiInterFilter
91 21 mm polation
Continuous
Time
Resample
f sink
f source
Figure 3.4: FIR filter followed by linear interpolation
L = 220 .
The interpolation factor can therefore be reduced drastically, if the
hold operation is replaced by linear interpolation.
Sine @ 1 kHz
Sine @ 10 kHz
Sine @ 20 kHz
White Noise
Pink Noise
Error in [dB] for L1 =
2k
27
28
,63:9 , k 12:04 ,148:2 ,160:2
,23:9 , k 12:04 ,108:2 ,120:2
,11:9 , k 12:04 ,96:2 ,108:2
,18:7 , k 12:04 ,103:0 ,115:0
,25:7 , k 12:04 ,110:0 ,122:0
Table 3.3: Error caused by linear interpolation (fsource
29
,172:3
,132:3
,120:3
,127:1
,134:1
=
48 kHz)
So far in this section we assumed that the linear interpolation is calculated
with an infinite resolution on the time axis. Therefore we got a continuous-time
signal after the interpolator (Fig. 3.4).
Linear
Interpolation
L2
Lagrange2.epsi
97 46 mm
L1
f source
FIR
Filter
Interpolation/
Decimation
Filter
Discrete
Time
Hold
Continuous
Time
Resample
f sink
Figure 3.5: FIR filter followed by discrete-time linear interpolation
In a practical realization we only have a finite time resolution of tunit .
tunit = L f1
source
with L = L1 L2 ,
where L2 is the number of resolved samples by the linear interpolator between
two FIR-interpolated samples. The block diagram of Figure 3.4 is therefore
completed with additional function blocks resulting in Figure 3.5.
We actually have a cascade of three filters (Fig. 3.6): the FIR filter HFIR ,
the linear interpolator HLin with the first zero crossing at L1 fsource , and
the hold operation HHold with the zero at L1 L2 fsource . If the following
section we discuss the error contributions of the three filters.
H FIR
H Hold
H Lin
FilterCascade.epsi
108 26 mm
L 1⋅ L 2⋅ f source
L 1⋅ f source
Figure 3.6: Transfer characteristic of FIR filter, linear interpolator, and hold
operation
Figure 3.7 shows the time domain representation of Figure 3.6. The source
samples are interpolated by the FIR filter resulting in the bold samples with
spacing 1=(L1 fsource ) in Figure 3.7. Linear interpolation leads to L2
intermediate samples with spacing 1=(L1 L2 fsource ) = 1=(L fsource ).
Finally, the hold operation converts the discrete-time signal to a continuoustime representation.
x(t)
y(t)
LinTime.epsi
89 29 mm
T=1/(L1⋅f source)
â
x[n]
T=1/(L⋅f source)
Figure 3.7: FIR filter and linear interpolation followed by hold operation
3.2.2.2
Performance Estimation
The error introduced through the three cascaded filters consists of three components. In the first place the stopband attenuation of the FIR filter is weighted
by the SINC 2 transfer function of the linear interpolator, secondly the remaining components of the higher order FIR passband attenuated by the SINC 2 as
well, and thirdly the residuals of the hold operation.
The sum of the weighted stopband attenuation is (using Eq. A.3)
Estop1 2f (s1) 2
source
Z
1
f
)
sin( L1 fsource
f
L fsource
0
!4
df = 23 L1 (s1)2
(3.4)
1
If we assume a white noise signal, we get for the contributions around
the zero crossings of the linear interpolator (Eq. C.7) and the zero-order hold
(Eq. B.7)
4
2
Estop2 7200 (L )4 + 72 (L L )2
1
1
2
(3.5)
The total error for a FIR filter with a finite stopband attenuation s1 is
therefore for the three different audio signals
4
4
2
f
fsig
sig
2
L1 (s1) + 45 L f
+
3 L1 L2 fsource
1
source
4
2
2
2
EWhite 3 L1 (s1) + 7200 (L )4 +
72 (L1 L2 )2
1
ESine 23
EPink 23
L1 (s1)2 + L f
1
source
10 960
4 +
10 472
L1 L2 fsource
2
(3.6)
(3.7)
2
(3.8)
Table 3.4 gives the contribution of the three summands for a white noise
signal (Eq. 3.7) at fsource = 48 kHz and sample values of L1 , L2 , and s1 .
Note that the second summand is reduced by 7 dB and the third by 4.6 dB, if
pink noise is used instead of white noise.
L1 L2
s1
L
[dB]
27
28
29
29
28
27
28
29
210
211
,137
,137
,137
,137
,152
216
216
216
218
220
Weighted
Linear
Zero , order
stopband interpolation
hold
Eq: 3:4
Table 3:3
Table 2:1
[dB]
[dB]
[dB]
,118
,103
,105
,115
,115
,105
,112
,127
,105
,115
,127
,115
,127
,117
,129
Table 3.4: Error contributions from interpolation and finite stopband attenuation (white noise signal)
For the first three rows of Table 3.4 the same interpolation factor L =
L 1 L2 = 216 is used, but the order of the interpolation filter increases and
likewise the number of coefficients that have to be stored.
For a sine wave test signal the second and the third error summand of
Eq. 3.6 depend on the audio signal frequency. Figure 3.8 shows plots using the
parameters in last four rows of Table 3.4. All three summands of Equation 3.6
are plotted with dashed lines and their sum with a solid line.
For the parameters of two plots in the upper half of Figure 3.8 18-bit
performance for full-scale signals is only reached up to about 5 kHz due to the
hold operation (L = 16). The lower two plots represent parameters for 18and 20-bit performance over the full range of input signals.
So far we only discussed the integral over the error contributions. If we
apply a sine wave test signal, the energy will not be equally distributed and
we will get distinct error components in the decimated sink signal. In the
following the size of these ‘peaks’ is calculated.
For a sine wave close to fsource =2 the largest contribution to the stopband
error is caused by the insufficient attenuation of the linear interpolator and the
zero-order hold at the first zero crossing (Fig. 3.6).
The minimal attenuation of the linear interpolator at the first zero crossing
L1 = 256, L2 = 256, δ s1 = −137 dB
L1 = 512, L2 = 128, δ s1 = −137 dB
−100
[dB]
[dB]
−100
−110
−120
−130
−120
100
1000
[Hz]
L1 = 256, L2 =1024, δ s1
−130
10000
1000
10000
100
1000
[Hz]
10000
−100
[dB]
[dB]
100
ErrorSine.eps
[Hz]
108
76
mm
= −137 dB
L1 = 512, L2 =2048, δ s1 = −152 dB
−100
−110
−120
−130
−110
−110
−120
100
1000
[Hz]
−130
10000
Figure 3.8: Error contributions vs signal frequency for sinusoidal audio signals
for f
=
L1 fsource fsource =2 is (Eq. C.2)
fsource
L f2
1
source
!4
=
1
16 (L1)4
(3.9)
and for the zero-order hold we get (Eq. B.2)
fsource
L L 2 f
1
2
source
!2
=
1
4 (L1 L2 )2
(3.10)
For L1 = L2 = 256 the two above equations result in ,108 dB and ,102 dB
respectively. This means that for a source signal close to fsource =2 the sink
signal contains two distinct error components at ,102 dB caused by the hold
operation and two error components at ,108 dB caused by the linear interpolation. The alias frequency of these error components depends on the exact
source and sink sample rates and the signal frequency. Therefore not only the
accumulated stopband error is a matter of concern, but also its distribution.
3.2.2.3
Implementation
Figure 3.9 shows a simplified block diagram of the principle outlined in the
previous section. The two interpolators (by L1 and L2 ) are merged into one
and the FIR filter and the linear interpolator are combined in one block.
Continuous
Time
Discrete
Time
Linear
FIR
& Inter- Lagrange3.epsi
Filter
polation 97 22 mm
L
Hold
Resample
f sink
f source
Figure 3.9: FIR filter with linear interpolator
In the time domain the multiplications of the source signal by the filter transfer functions can be expressed as successive convolutions. We will
discuss further the two implementations that are indicated with brackets in
Equation 3.11.
x ? hFIR ) ? hLin = x ? (hFIR ? hLin )
(
(3.11)
In the left hand case the source signal is interpolated by the FIR filter. Additional intermediate values are calculated by linear interpolation of the oversampled signal. In the right hand case the coefficients of the FIR filter are
linearly interpolated and the resulting filter is then applied to the source signal. Both cases result in the same transfer function, since the convolution is
associative.
The above paragraph describes only the conceptual model. As in the case
of the single stage FIR filter not all values of the interpolated source signal
must be calculated since the interpolation is followed by the decimation. In
the left hand case of Equation 3.11 two successive output samples of the FIR
filter are computed per sink sample. The sink sample is then calculated by
linear interpolation between the two. In the right hand case only one output
sample of the FIR filter is calculated, but each filter coefficient requires an
interpolation.
This second implementation resembles the single stage FIR case. But since
the linear interpolation is performed in real-time less filter coefficients have to
be stored. Instead of the transfer characteristic of a true single stage filter in
Figure 3.1, we get the one in Figure 3.10.
FilterLin.epsi
109 26 mm
δ s1
L 1⋅ L 2⋅ f source
L 1⋅ f source
Figure 3.10: FIR with interpolated impulse response
In Section 3.2.1 (Table 3.2) we estimated the order of a single stage FIR
interpolation/decimation filter and stated that such a filter can not be directly
synthesized. Table 3.5 gives the order N of the FIR filter and the number of
taps Q that must be calculated for the FIR filter if a linear interpolator is used.
Estop L1
[dB]
,118
,115
,112
,115
,127
27
28
29
28
29
p
s
[dB]
[dB]
0:001
0:001
0:001
0:001
0:001
,137
,137
,137
,137
,152
N
Q
13:3 103
26:6 103
53:1 103
104
104
104
26:6 103
57:3 103
104
112
Table 3.5: Estimated filter order
The values of L1 and Estop are taken from Table 3.4. Note that the number
of taps is the same for the first three cases, but the number of stored coefficients
doubles. The parameters for 20-bit performance increase the number of taps
only from 104 to 112.
3.2.3
Alternatives
3.2.3.1
High-Order Coefficient Interpolation
For a single stage FIR implementation with an interpolation factor L = 220
a filter of order 108 is required. If the coefficients are interpolated using first
order Lagrange interpolation the filter order can be reduced to 57 103 as
discussed in the previous sections. The filter order could be further reduced by
Interpolation
L L1 L2 Estop
s1
[dB]
[dB]
Zero order
1st order
220
220
220
29
20
211
2nd order
3rd order
220
220
26
24
214
216
,127
,127
,127
,127
,187
,152
,144
,138
N
Q
137 225 217 131
57 288 112
6 886 108
1671 104
Table 3.6: Coefficient interpolation and filter order
using higher order Lagrange interpolation. Table 3.6 shows the resulting filter
order. Note that the order Q of the actually computed filter stays about the
same, but the reduced storage requirements are traded for a more complicated
coefficient interpolation.
3.2.3.2
Separate Interpolation and Decimation Filter
[JTG+ 94] presented an implementation of a sample-rate converter which performs interpolation by only L = 64 followed by a proprietary hold operation
called ‘controlled validation’ (Fig. 3.11). Before the decimation the error
introduced by the hold operation is attenuated by a second lowpass filter.
’Controlled
Validation’
64
f source
Filter
Philips.epsi
Hold
105
28 mm
1×,2×,3×
64⋅f source
Filter
128⋅f sink
128
f sink
Figure 3.11: Separate interpolation and decimation filters
This special hold operation repeats each interpolated source sample 1-3
times in such a way that only high-frequency error components are produced,
which can be attenuated by the decimation filter. If source and sink sample
rate are the same, each interpolated sample is repeated twice on average. Since
the interpolation filter runs with fsource and the decimation filter runs with
fsink the sink samples are automatically band-limited to half the minimum of
source and sink sample rate by-passing the need to adapt the cutoff frequency
of the filter as in the CONTINUOUS MODE.
3.3
Required Resources
Before we discuss the synthesis problem for high-order digital filters we recapitulate the required resources for the approaches presented in the previous
section. We use the parameters L = 216 and L1 = 28, which results in 18-bit
performance for full-scale signals up to 5kHz and assume a (stereo) audio
signal.
Single Stage FIR Filter
If we assume a filter order of N = 7:5 106 , an interpolation factor of L = 216 ,
and use the hold operation, the number of multiplications for each stereo sink
sample is 2 Q = 2 N=L = 230. For a sink sample rate of 48 kHz we get a
cycle time of 90 ns for the multiplication (and accumulation). This approach
has the drawback that all N = 7:5 106 filter coefficients must be stored in a
way that they can be accessed in real-time.
Interpolation of FIR Impulse Response
If we use linear interpolation on the impulse response with L1 = 28, L2 = 28
(L = 216 ), and a filter order of N = 26 103 the number of multiplications for
the filter is 2 Q = 2 N=L1 = 208. We need one additional multiplication
for each interpolation of the filter coefficients. This results in 3 N=L1 = 312
multiplications and L1 = 28 coefficients to store.
FIR Filter followed by Interpolation
If we interpolate the samples after the FIR filter, two FIR calculations are
needed for each (stereo) channel. One additional multiplication per channel
is used for the linear interpolation. Therefore the number of multiplications is
2 2 Q + 2 = 4 N=L1 + 2 = 418. The number of stored coefficients is the
same as in the previous case.
Discussion
In Section 3.2 we considered different concepts for the realization of the
interpolation/decimation filter. We have seen that a trade-off between the
computational cost and the number of stored coefficients must be made.
The noise estimations in this chapter have been done for full-scale signals.
However, a full-scale sine wave at 20 kHz is not a realistic audio signal. Therefore we could reduce the requirements for higher frequencies. Additionally,
the noise level scales proportionally to the signal level. As a result we have
less distortion for lower level signals. However, these measures reduce the
required filter order only by a small fraction.
So far in this chapter we have only treated the UPMODE. In the DOWNMODE
(i.e. if the sink sample rate is smaller than the source sample rate) the width
of the FIR filter is reduced and therefore the error contributions from the
higher order passbands are smaller. The contribution from the finite stopband
attenuation of the FIR filter is about the same. On the other hand the signal
energy is also reduced due to the smaller passband.
In the following chapters we will treat further only the concept of interpolation of the FIR impulse response. The synthesis of the FIR filter of order
N = 26 103 is covered in Section 3.4.
3.4
Synthesis of High-Order FIR Filters
In this section we discuss methods which allow to synthesize the high-order
interpolation/decimation filter. For the examples, the following parameter
values are used:
L1 = 256; N 26 103; s1 = ,137 dB
A filter with these parameters is suited for a sample-rate converter with 18-bit
performance up to 5kHz (Fig. 3.8).
3.4.1
Specifications
Let us review the specifications for the interpolation/decimation filter for the
parameter values given above.
δp
FilterSpecs.epsi
108 32 mm
δ s1
fp fs
0.5
[f/f sample]
Figure 3.12: Filter specifications
The filter is used for an interpolation by a factor L1 = 256. Its (ideal)
cutoff frequency (with a normalized sample rate of 1.0) is therefore 0:5=L1 =
0:00195. The passband and the stopband borders are given in Table 3.7 for a
transition bandwidth of fsource =16. The maximum allowed passband ripple for
high-quality digital audio is a disputed subject. We demand a (conservative)
value below 0:001 dB. An average attenuation of at least 137 dB must be
reached by the filter. Since not only the average stopband attenuation is a matter
of concern, but also its distribution we request an minimal attenuation of 137
over the full stopband. A Chebyshev filter which meets these specifications is
of order 26 103 .
How could the order of the interpolation/decimation filter be reduced? The
required order of a Chebyshev lowpass filter meeting certain specifications for
Lower Edge
Upper Edge
Deviation [dB]
Passband
0.0000
0.0018
0.001
Stopband
0.0021
0.5000
,137
Table 3.7: Filter Requirements
p and s is about inversely proportional to its transition bandwidth.
If the
source signal does not extend over the full bandwidth, or if some level of
aliasing can be tolerated for high frequencies, the filter order and thus the
computational complexity can be reduced drastically by allowing a slightly
wider transition band.
An alternative approach is to split up the stopband into multiple parts with
different attenuations. The filter order is determined to a large extent by the
required stopband attenuation of the part immediately following the transition
band. The following stopband sections can reach much higher attenuations.
We will analyze the above possibility to reduce the filter order in Section 3.4.5.
3.4.2
Design Technique
A wide variety of design techniques for FIR digital lowpass filters have been
developed [RG75, Ant93]. We used the technique of equiripple design based
on Chebyshev approximation methods. The filters designed by this method
are optimal in the sense that they fulfill the minimax criterion, i.e. the peak
approximation error in the frequency domain is minimized. The resulting filter
has an equiripple transfer characteristic in the passband and in the stopband.
To solve the Chebyshev approximation problem the well-known computer
program presented in [MPR73] is used. It is based on the multiple-exchange
Remez algorithm. The program was originally written to synthesize (loworder) filters up to N = 128. We have extended the program for the synthesis
of high-order filters. However, the computational cost of the program is
approximately proportional to the square of the filter length.
Many elaborate methods have been proposed to reduce the computation
time of the above mentioned program [AW85, Ant93]. Since our main concern
is not runtime, but the convergence of the algorithm, we optimized the program
to allow synthesis of lowpass filters up to order N = 30 000 (Sec. 3.4.3). Addi-
tionally we used the prototype method proposed in [AW85] to estimate the performance of the high-order filter with a low-order approximation (Sec. 3.4.4).
3.4.3
Direct Synthesis
Numerical problems in the original Remez exchange program [Rab73] disallow computation of high-order filters. We have improved the program for
better numerical properties. The original program often computes expressions
of type
1
cos() , cos( + )
(3.12)
For values of 1 the evaluation of this expressions is inaccurate and may
even lead to an arithmetic exception (division by zero) due to finite precision
arithmetic. We have replaced the above expressions by
1
2 sin( + 2 ) sin( 2 )
(3.13)
which has much better numerical properties for small values of . Additionally, range checking and scaling of intermediate results have been added
to prevent overflow of accumulator variables during calculation of high-order
filters.
Using these modifications, direct synthesis of filter with more than 20 000
filter taps is feasible. Table 3.8 lists the runtimes on a SUN SPARCstation
10/41 for sample specifications.
3.4.4
Prototype Method
The prototype method for narrow-band lowpass filters starts out from the fact
that the computational cost is roughly proportional to the square of the filter
length. A low-order prototype filter is synthesized and is then scaled to the
high-order filter. This approach allows to evaluate rapidly the performance of
different filter specifications at low computational cost.
The frequency scaling property of the Fourier transform is used to derive
the specifications of the prototype. Consider the impulse response h(t) and its
Fourier transform H (f ). Scaling the impulse response by a factor yields a
new Fourier pair
t
h
( ) $ H ( f )
1
The edges of the pass- and the stopband are scaled to fp and fs . Due to
the wider transition band the filter order is reduced by a factor . Figure 3.14
shows an example for = 4. The above scaling transformation operates
precisely in the continuous-time domain and with good approximation in the
discrete-time domain for narrow-band high-order lowpass filters.
Filter
Order
419
1679
6719
13439
26879
64
16
4
2
1
p
[dB]
0:001
0:001
0:001
0:001
0:001
s
[dB]
,138:8
,139:0
,139:0
,139:0
,139:0
Runtime
[min]
0.37
7.73
143
564
3656
Table 3.8: Runtime and deviation depending on 4
Runtime [min]
10
2
10
Runtime.eps
106 40 mm
0
10
−2
10
−4
10
−3
10
−2
10
−1
10
0
10
[1/ λ ]
2
Figure 3.13: Graphical representation of Table 3.8
Table 3.8 lists the runtime on a SUN SPARCstation 10/41 for a target filter
of order 26 879 using the specifications shown in Table 3.7 for different values
of . The runtime is about inversely proportional to 2 – except for = 1,
where the program converges only slowly. The resulting stopband attenuation
decreases only for large values of .
Low−order filter: Time domain
Low−order filter: Freq. domain
0
0.2
−50
0.15
0.1
−100
0.05
−150
0
−0.05
100
200
300
[Samples]
400
ZeroPad.eps
103 86 mm
High−order filter: Time domain
100
0.06
0.04
−50
0.02
−100
0
−150
1000
[Samples]
1500
400
Zero−padded filter: Freq. domain
0
500
200
300
[Samples]
500
1000
[Samples]
1500
Figure 3.14: Scaling of prototype filter by = 4 using zero padding
To avoid the computation-intensive direct synthesis of the high-order filter,
the Fourier transform can be used to scale the low-order prototype filter to a
higher order filter. Figure 3.14 shows the required operations. In a first step
the impulse response of the prototype filter is transformed into the frequency
domain using the discrete Fourier transform (upper left to right). In a second
step zeros are inserted at the Nyquist sample rate in the frequency domain to
increase the filter order by the factor (right top to bottom). This zero-padded
transfer function is then converted back into the time domain using the inverse
Fourier transform (lower right to left).
Due to high stopband attenuation of the filter at the Nyquist sample rate,
the discontinuity introduced by the zero padding is very small. Figure 3.15
shows a comparison of the transfer function of the scaled prototype ( = 16)
with the directly synthesized filter.
Low−Order Filter and Zero−Padding
0
−50
−100
−150
2
10
3
10
4
10
ZeroPad2.eps
[Hz]
102
72 mm
Direct Synthesis
5
6
10
10
0
−50
−100
−150
2
10
3
10
4
10
[Hz]
5
Figure 3.15: Comparison of zero-padded prototype (
thesis
3.4.5
6
10
10
=
16) and direct syn-
Relaxed Specifications
The specifications for the interpolation/decimation filter need always to be
a trade-off between the computational cost for real-time operation and the
quality of the filter. I.e. if the transition band is widened, then the order of the
filter can be reduced substantially, but aliasing occurs for high frequency signal
components. The prototype method discussed in the previous section allows
to evaluate the cost and performance of filters with different specifications at
low computational cost.
To reduce the filter order the specifications given in Figure 3.12 are altered
to allow an intermediate stopband with a lower attenuation (Fig. 3.16).
For a sample rate of 48 kHz the transition band extends from 22.5 kHz
to 25.5 kHz for the original specifications (Table 3.1). Figure 3.17 shows an
example of a resulting filter for reduced aliasing requirements. The stopband
attenuation and the passband ripple are the same as for the original specifications, but the filter order N is reduced from 26 879 to 21 759. Thereby the
δp
FilterSpecs2.epsi
112 34 mm
δ s1
fp fs
0.5
[f/f sample]
Figure 3.16: Relaxed specifications with intermediate stopband
order Q of the actually computed filter is reduced by 19 % from 105 to 85.
The aliasing components in the baseband that result from the relaxed specifications are attenuated by 78 dB (Fig. 3.17) and are located above 21 kHz
(fsample = 48 kHz), 14 kHz (fsample = 32 kHz) respectively. The error contribution by the higher order transition bands can be neglected since they are
attenuated by at least 90 dB (L1 = L2 = 256) by the hold operation and the
linear interpolation (Eq. 3.9, 3.10). The degradation caused by the relaxed
specifications is therefore only marginal.
0
−20
−40
[dB]
−60
Aliasing.eps
79 63 mm
−80
−100
−120
−140
−160
0
0.5
1
1.5
2
2.5
[Hz]
3
3.5
4
Figure 3.17: Filter with relaxed specifications (fsample
4.5
4
x 10
=
48 kHz)
3.5
Quantization of Filter Coefficients
In Section 3.4 a narrow-band FIR lowpass filter of order 26 879 with a minimum stopband attenuation of 139dB has been synthesized (average attenuation
142 dB). As described in Section 3.2.2 the (virtual) interpolation/decimation
filter of order 106 is obtained by linear interpolation of the filter coefficients
(impulse response) of the 26 879 filter. So far the quantization of the filter
coefficients has not been treated. For the implementation the ‘basic coefficients’ are quantized twice: Once for reducing the storage requirements and
to match the word-length of the multiplier, and a second time after the linear
interpolation of those stored values. The following two sections cover the
effects of each quantization step.
3.5.1
Quantization of FIR Coefficients
Changing the coefficients of a filter alters its transfer function. In particular,
quantizing the coefficients of a lowpass filter reduces the stopband attenuation.
Figure 3.18 shows the transfer characteristic of the 26 879 filter with coefficients quantized to 20 bits: the average stopband attenuation is reduced from
142 dB to 129 dB. The equiripple characteristic in the stopband is replaced by
stopband noise.
0
−20
−40
−60
−80
Filter26879bit20.eps
74 58 mm
[dB]
−100
−120
−140
−160
−180
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
[f/f sample]
Figure 3.18: FIR filter, N
=
26 879, 20 bit coefficients
To increase the dynamic range and therefore to reduce the quantization
effects, a floating-point format can be used for the coefficients. The impulse
response h of the FIR filter has approximately a SINC ,shape i.e. a j1=xj
asymptotic behavior from the central to the outer coefficients (Fig
3.19). The
P
evaluation of one sink sample is a scalar product of the form
h x. If the
summation is started with the outermost coefficients a special block floating
point format can be used, which allows the exponent to increase only. We call
this easy to implement format quasi floating point (QFP). It can be realized
in hardware by shifting the accumulator to the right whenever the exponent
changes. No storage of the exponent is necessary, since a simple controller
can do the job.
0.8
0.6
0.4
Filter26879Impulse.eps
108 42 mm
0.2
0
−0.2
0.5
1
1.5
2
2.5
4
x 10
Figure 3.19: Impulse response of FIR filter using fixed-point coefficients
The stopband attenuation of the 26 879 order FIR filter for a given precision
of the coefficients is given in Table 3.9. Note that to reach a stopband attenuation of 137 dB (Table 3.4) 22-bit coefficients are required in the fixed-point
format, but only 19 bits if the QFP format is used (Tab. 3.9).
Coefficient
[bit]
18
19
20
22
24
Stopband attenuation
Fixed point
QFP
117 dB
134 dB
123 dB
139 dB
129 dB
141 dB
139 dB
142 dB
142 dB
142 dB
Table 3.9: Stopband attenuation depending on coefficient precision
The quantization of the FIR coefficients can therefore be modelled by
adjusting the stopband attenuation s1 of the FIR filter according to the spectrum of the quantized coefficients. Or in other words, the FIR coefficient
word-length must be chosen such that the stopband requirements given in
Section 3.2.2 are fulfilled for the quantized coefficients.
3.5.2
Quantization of Interpolated Coefficients
The (already quantized) FIR coefficients are linearly interpolated to yield the
interpolation/decimation filter of order 106 (Sec. 3.2.2). In Figure 3.10 the
transfer characteristic of the resulting filter has been sketched: The FIR filter
shape is repeated L2 times and attenuated by the SINC 2 transfer function
representing the linear interpolator in the frequency domain. Figure 3.20
shows the spectrum of the FIR filter for L2 = 64 using 22-bit (non-QFP) FIR
coefficients with ideal interpolation.
0
−20
−40
−60
[dB]
−80
Filter26879bit22-64.eps
108 85 mm
−100
−120
−140
−160
−180
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
[f/f sample]
Figure 3.20: Interpolated FIR filter, L2
=
64
The quantization error of the ‘basic FIR coefficients’ (previous section)
showed approximately random behavior, resulting in a white noise spectrum.
The quantization of the linear interpolated (quantized) FIR coefficients results
in a strongly correlated error function, which can assume only discrete values.
This results in a non-white spectrum of the quantization error.
Detail of quantization noise in time domain
0.5
0
−0.5
0
50
100
150
200
250
300
350
400
Filter26879bit22-64bit22-Noise.eps
104 83 mm
Spectrum of quantization noise, average level = −159 dB
450
500
0.45
0.5
−120
−130
−140
−150
−160
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
Figure 3.21: Quantization noise of interpolated FIR filter, L2
=
64, 22 bit
The upper graph in Figure 3.21 shows the quantization error of the first
500 coefficients. The corresponding spectrum of this function is shown in
the lower half. Note that the same 22-bit coefficients have been used as for
Figure 3.20, but the interpolated coefficients have been quantized to 22 bit as
well. The average error level is at ,159 dB with some peaks up to ,126 dB.
Since all error peaks caused by the quantization of the interpolated coefficients (Fig. 3.21) are dominated by the highest peaks of the (non-quantized)
interpolation operation (Fig. 3.20) the quantization error contributes only by
its average level. Table 3.10 shows the average and the peak quantization error
level for 19 and 22-bit quantization and sample values of L2 . Note that the
average quantization error level is lowered by 17 dB using the QFP principle
– the peak value even by 30 dB.
L2
4
16
64
256
Quantization noise level [dB]
19-bit
22-bit
fixed-point
QFP
fixed-point
QFP
peak avg. peak avg.
peak avg. peak avg.
,94 ,130 ,123 ,147 ,111 ,148 ,141 ,165
,102 ,136 ,129 ,153 ,117 ,153 ,148 ,171
,106 ,142 ,138 ,159 ,126 ,159 ,157 ,177
,107 ,148 ,148 ,165 ,128 ,166 ,166 ,183
Table 3.10: Quantization noise caused by second quantization
Figures 3.22 and 3.23 show the resulting filter transfer characteristic with
both the ‘basic FIR coefficients’ and the interpolated coefficients quantized to
22 bit using the fixed-point and the QFP format.
0
−20
−40
−60
[dB]
−80
Filter26879bit22-64bit22.eps
97 76 mm
−100
−120
−140
−160
−180
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
[f/f sample]
Figure 3.22: FIR filter, L2
=
64, quantized twice to 22 bit
0
−20
−40
−60
[dB]
−80
Filter26879bit22QFP-64bit22QFP.eps
97 76 mm
−100
−120
−140
−160
−180
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
0.45
0.5
[f/f sample]
Figure 3.23: FIR filter, L2
=
64, quantized twice to 22 bit QFP
4
Frequency Tracking
In Section 2.6 we have presented the concept for sample-rate conversion by
arbitrary ratios. The source signal is interpolated by a constant factor L,
transformed into a continuous-time representation, and is then resampled at
the new (sink) sample rate. In this chapter we discuss methods to determine
precisely the sink sample moment in the digital domain.
4.1
Requirements
In order to allow the computation of a sink value of the sample-rate converter,
its precise phase (i.e. the sink sampling time relative to the source samples)
must be known. Inaccuracy in measuring this phase results in a wrong sink
sampling moment and thus in a non-uniform resampling of the sink signal. In
Section 4.5 the effects of the non-uniform sampling on the signal spectrum will
be covered in detail. It will be shown that the sample phase error is modulated
onto the audio signal as shown in Figure 4.1. A precision of at least L bits is
required for the sink phase.
The (virtual) sample rate after the interpolation and thus the time resolution
for the sampling moment of the sink samples is L fsource (Eq. 2.12). For
example, a time resolution of approximately 320 ps is needed for L = 216 and
a sample rate of 48 kHz.
51
uniform
non-uniform
Samples
Modulation.epsi
85 39 mm
Spectrum of
phase error
Spectrum of
resampled sine
Figure 4.1: Effects of non-uniform sampling on the signal spectrum
4.2
Principle
A straight forward solution to determine the phase of the sink samples is to use a
high-speed counter which starts at each source sample moment and measures
the position of the sink samples relative to the source samples (Fig. 4.2).
Besides the need for a counting frequency in the GHz range and a very stable
clock source, this direct approach exhibits the shortcoming that jitter of either
of the sampling clocks translates directly into phase errors of the sink samples.
Since jitter-free sampling clocks are hardly feasible, methods which determine
the precise average phase are asked for.
source
samples
sink
samples
L
Phase.epsi
∆ϕ[m+1]
∆ϕ[m]
95 32 mm
M[m]
Figure 4.2: Phase relations
Figure 4.2 gives a graphical representation of the phase relations as defined
in Chapter 2 (Eq. 2.17). Obviously there is no need to measure each sink sample
moment directly, instead the subsequent sink sample phases are computed from
the previous ones by adding the precisely measured ratio of source and sink
sample rate (Eq. 4.1).
This indirect solution makes it possible to suppress jitter of the sampling
clocks by evaluating a long term average M=L of the ratio fsource =fsink ,
which in general varies only slowly in time. Note that in Equation 4.1 the
interpolation factor L is constant, but the decimation factor M [m] may vary
slowly. The average decimation factor M must be equal to L times the ratio of
source and sink sample rate. Since L is chosen to be a power of 2 the modulo
operation in Equation 4.1 can be implemented easily.
∆'[m + 1] = (∆'[m] + M [m]) mod L;
with M
L
=
fsource
fsink
(4.1)
If the sink phase ∆'[:] is computed according to Equation 4.1, the calculated
phase may drift away from the actual phase due to the finite word-length of
M [m] and the filtering used for the jitter reduction. Therefore measures must
be taken to avoid this drift.
In the following section a frequency counter solution is presented. An
alternative approach to measure the average frequency ratio is a digital PLL
(Sec. 4.4). Section 4.5 treats the effects of non-uniform sampling due to
variations of M [m]. A comparison of the performance of the frequency
counter and the digital PLL is given in Section 4.6.
4.3
Frequency Counter
In order to reach the desired overall precision, the ratio of source and sink
sample rate has to be determined with a precision of at least L bits. This
measurement can be accomplished by a frequency counter using two different
methods:
1. The source sample rate is multiplied by a phase locked loop (PLL) to
2L,k fsource and is then used to measure the sink sample rate (Fig. 4.3).
If a 2L,k -fold multiple of the source sampling clock is already available,
the PLL can be omitted. An additional precision of k bits is obtained
by averaging 2k consecutive measurements. This averaging operation
additionally serves as lowpass filter to suppress short-time jitter.
2. The additional PLL used in the above method can be avoided by using a
fast counter, which runs independently of the source and sink sampling
f source
PLL
2 L-k ⋅f sourceFreq2.epsi
32 mm
Counter
70
f sink
Averaging (2 k )
Error tracking
f source / f sink
Figure 4.3: Frequency counter with PLL
clock. The basic idea is to measure and average both periods 1=fsource
and 1=fsink and divide the resulting values by each other (Fig. 4.4).
f ref ≈ 2 L-k ⋅f sample
f sink
Averaging (2 k )
Error tracking
Freq.epsi
77 32 mm
Divider
High-Speed
Counter
f source
f source / f sink
Averaging (2 k )
Error tracking
Figure 4.4: Frequency counter with high-speed reference clock
Both methods demand a trade-off between the measurement frequency,
the memory requirements of the moving average filter and the jitter reduction.
Direct measurements require a GHz counter frequency with a low-jitter clock
source and must be followed by a lowpass filter to eliminate jitter caused by the
sampling clocks. Low-frequency measurements need a high-order averaging
filter to achieve the required accuracy and therefore allow to track sample-rate
changes only slowly.
A VLSI implementation with a 50 MHz reference counter and two moving
average filters of length 128 has been realized at our laboratory [MS93]. In
order to avoid drifting of the group delay, care has been taken that the sink phase
calculated from the averaged values of source and sink sample rates (Eq. 4.1)
does not drift away from the actual value. We implemented the additional
error-tracking circuitry suggested by [Pel82]. This VLSI implementation
fulfilled our expectations for rational ratios of source and sink sample rates.
If both rates differ only by a small amount (plesiochronous case) this solution
is not accurate enough. Consequently an alternative approach to resolve this
problem was studied (Sec 4.4). The cause of the inaccuracy of the above
frequency counter implementation and performance simulations are given in
Sections 4.5 and 4.6.
4.4
Digital Phase Locked Loop
An alternative approach to measure the average frequency ratio is a digital
phase locked loop (DPLL). A DPLL is a digital circuit that controls an output
signal in a way that its phase matches the phase of an input signal [dD89,
Bes93]. PLLs (especially analog PLLs with analog or digital phase detectors)
are widely used e.g. in telecommunication applications for modulation and
demodulation of signals, clock recovery, or as frequency synthesizers.
In the case of the frequency tracking unit, the purpose of the DPLL is
to control the ratio of source and sink sample rate (M [m]=L) in a way that
the calculated sink phase matches the phase of the (de-jittered) sink sampling
clock. The lowpass characteristic of the PLL is used to suppress short-time
jitter of the sampling clocks.
We use an approach similar to the servo controlled loop proposed in [Ada93],
but we use an adaptive loop filter and avoid the resynchronization of the sampling clocks.
f source
Counter
⋅L
Pll.epsi
∆ϕ
100 32 mm
ϕ source
Phase
Detector
f sink
Loop
Filter
VCO
Integrator
ϕ sink
Figure 4.5: Digital PLL (DPLL)
The all-digital PLL consists of the same three basic building blocks as a
conventional PLL (Fig. 4.5):
1. Phase detector: The phase detector generates a signal that is proportional
to the difference of source phase 'source and sink phase 'sink .
2. Loop filter: The loop filter is a lowpass filter that determines the dynamic
behavior of the PLL. It attenuates short-time jitter of the phase detector
output and minimizes the average deviation. Again, a trade-off between
fast tracking and noise suppression must be made.
3. VCO: A VCO in the strict sense of the term is a voltage-controlled
oscillator. The input value corresponds to the (filtered) phase difference.
The output frequency of the oscillator is altered proportionally to the
input. The VCO can therefore be looked at as an integrator in the digital
domain, which is realized as an accumulator.
4.4.1
Phase Detector
The purpose of the phase detector is to measure the phase difference between
the source and the sink samples. At the first glance this phase detection can
be realized by a simple subtraction of source and sink phase. However, this
simple solution is not applicable, because the two phases are not updated at
the same time. The source phase 'source is incremented by L at the rate
fsource . The sink phase 'sink on the other hand is incremented by a variable
amount M [m] at the rate fsink . For precise operation of the phase detector
both phase values must be fetched exactly at the sink sample moment. It
is therefore undesirable to introduce additional jitter by synchronizing both
sampling clocks to a common master clock.
f sink
f source
Gray
Counter
Gray.epsi
102 17 mm
Register
Gray
Decoder
⋅L
ϕ source
Figure 4.6: Synchronization with Gray counter
An elegant solution to avoid the resynchronization is shown in Figure 4.6.
The counter running at the source sample rate fsource is realized as Gray
counter. Since the Gray code is a unit-distance code, i.e. only one bit changes
for subsequent counter values, consistent readout of the counter is guaranteed
at any moment.
4.4.2
Loop Filter
The output of the phase detector is not constant, but fluctuates considerably.
This phase noise origins from several sources. Two sources of the phase noise
are the jitter of the source and sink sampling clocks. A third source of phase
noise is caused by the fact that the source and the sink phase are incremented
only by discrete amounts at different rates: 'source is incremented by L at the
rate of fsource , whereas 'sink is incremented by M [m] at sink sample rate
fsink . The loop filter following the phase detector must suppress the phase
noise caused by all these sources.
The design of the loop filter is a trade-off between the noise bandwidth
of the filter and the dynamic behavior (fast tracking). We use a first order
lowpass filter with lead-lag compensation to stabilize the full loop as proposed
by [Sti92].
Gain [dB]
0
−50
−100
−1
10
0
10
1
10
Frequency [Hz]
2
10
3
10
Openloop.eps
108 73 mm
Phase [Deg]
−120
−150
−180
−1
10
0
10
1
10
Frequency [Hz]
2
10
3
10
Figure 4.7: Open-loop transfer characteristic of the DPLL
The cutoff frequency of the lowpass filter can be adapted in four steps.
Figure 4.7 shows the four open-loop transfer characteristics of the loop filter.
The zero and pole of the lead-lag compensation are 1:5 decades apart. The
resulting phase margin is 69 .
The variable bandwidth of the loop filter allows to adapt the dynamic behavior of the DPLL. With a wide filter a fast pull-in and an immediate tracking
for varying sample rates is accomplished. If the sample rates are stable over
time the bandwidth of the loop is reduced resulting in an improved noise suppression. The integrator following the lead-lag section avoids discontinuities
of the output phase when switching the bandwidth of the loop filter.
In-lock
Detector
f sink
ϕsource
f source
−
K1, K2, K3
2K
∆ϕ
+
−
+ −+
±
:2 K3
Pll2.epsi ACCU
106 47 mm
2K
2 K1
M
:2 K2
+
Loop
Filter
+
f sink
ϕsink
Figure 4.8: Adaptive digital PLL
Figure 4.8 shows the loop filter and the surrounding blocks. Note that all
multiplications can be realized as shift operations. The output of the phase
detector is monitored to determine whether the PLL is locked. The bandwidth
of the loop filter can then be reduced (parameter K 1; K 2; K 3), which results
in a higher jitter rejection.
The amount of jitter attenuation of the sampling clocks can be derived from
the closed-loop transfer characteristic of the DPLL shown in Figure 4.12. E.g.
sinusoidal jitter on a sampling clock at 50 Hz (400 Hz) is attenuated by 70 dB
(105 dB) for the narrowest loop filter.
4.5
Non-uniform Resampling
In the previous chapters we have treated the error introduced through the
finite interpolation factor and the limited stopband attenuation of the interpolation/decimation filter. So far we have assumed ideal resampling of the
interpolated signal. In this section we go a step further and estimate the error
caused by the non-ideal frequency tracking. A comprehensive treatment of
non-uniform sampling can be found e.g. in [Mar93]. [Ada93] studied the
effects of clock jitter on the performance of various D/A converters.
P
M [m] mod L is restricted to discrete
The calculated phase 'sink [m] =
values due to the finite word-length of M . In the following the difference
between the actual phase 'ideal [m] and the phase computed by hardware
'sink [m] is called 'q [m]. Note that 'q [m] does not only include the quantization of M , but also all other by-products of the non-ideal frequency tracking.
Consider an ideal sine wave with amplitude A and frequency fsig , which is
resampled at the (computed) sink sample moment. The resulting sink samples
are
y[m] = A sin
'
[m] + 'q [m]
ideal
2fsig L fsource
(4.2)
[Har90] approximated the spectrum of y [m] for a sinusoidal shape of
'q [m]. He interpreted Equation 4.2 as an FM modulation with the (ideal)
sine wave as carrier and a modulation index of 2fsig =(L fsource ). The
resulting spectrum is a sum of Bessel functions. The Bessel functions can
be approximated for small modulation indices by simple functions [Fet90].
[Har90] supported his findings with simulations and measurements.
We use a generalized approach, which leads to the same result for a sinusoidal shape of 'q , but is valid for any shape of 'q . Equation 4.2 can be
modified for small modulation indices by using the trigonometric equations
sin( + )
=
sin cos + sin cos sin + cos with 1
(4.3)
(4.4)
Applying Equation 4.4 to Equation 4.2 yields
y[m]
=
A sin
2fsig 'ideal [m]
L fsource
2fsig
2fsig 'ideal [m]
A
'q [m]
+
L fsource cos
L fsource
(4.5)
2
The error introduced by the above approximation (Eq. 4.4) is of order
/ ('q =L)2 and can therefore be neglected for large interpolation factors.
The resulting signal consists of two orthogonal parts: the original sine wave
and the phase deviation 'q multiplied by an attenuated cosine function. This
multiplication in the time domain corresponds to a convolution in frequency
domain. The spectrum of 'q is shifted by the cosine function to the same
frequency as the original sine wave, but is attenuated by the factor 2fsig =(L fsource ) relative to the signal amplitude.
Note that not the deviation of the decimation factor M [m] is modulated
onto the signal, but its sum – the phase deviation 'q [m]. The attenuation is
proportional to both the signal frequency and signal amplitude. Therefore the
jitter attenuation is the same for a full-scale 1 kHz signal or a ,20 dB 10 kHz
signal.
Figure 4.9 shows sample traces for an uniform distribution of the quantization error of M [m] between ,1 and +1 LSB. The second trace shows the
phase deviation 'q [m] of the sink sample moment, which is the integral of the
first trace. The third trace shows the second summand – the error term – of
Equation 4.5 for a sampling frequency of 48 kHz. Note that the error spectrum
has a 1=f characteristic, since it is the integral over a signal with uniform
distribution. The phase deviation is modulated onto the signal according to
Equation 4.5. The lowest trace in Figure 4.9 shows the resulting spectrum for
a sine wave at 997 Hz.
Frequency Deviation
[LSB_16]
2
0
−2
50
100
150
200
250
300
[Samples]
350
400
450
500
350
400
450
500
Phase Deviation
[Deg]
0.05
0
−0.05
50
100
150
200
250
300
[Samples]
Jitter.eps
109 124 mm
[dB]
Attenuated Spectrum of Phase Deviation
−100
−120
−140
2
3
10
4
10
10
[Hz]
[dB]
Phase Deviation Modulated on 997 Hz Sine
−100
−120
−140
2
3
10
10
4
10
[Hz]
Figure 4.9: Simulation of distortion caused by uniform jitter of M
4.6
Performance Comparison
In the previous section we have discussed the influence of non-ideal sink
phase computation on the signal quality. In this section the performance of the
frequency counter method and the DPLL are discussed. Measurements will
be given in Section 7.1. To compare both solutions we distinguish three cases:
synchronous, plesiochronous, and asynchronous sampling clocks.
In the synchronous case source and sink sample rates are subharmonics
of a common master clock. On the other hand, if source and sink sample
rate are asynchronous no such common master clock exists. The two clocks
are plesiochronous if they have the same nominal frequency, but are derived
from different oscillators. In this case the two sampling clocks are then not
phase-locked, and their frequencies may differ slightly due to tolerances of the
oscillators. Plesiochronous conditions are often found at system interfaces.
4.6.1
Frequency Counter
In this section the performance of the frequency counter (running at 50 MHz)
for synchronous, plesiochronous, and asynchronous sample rates is discussed.
The moving average filter which follows the counter (Fig. 4.3) averages
the last 2k counter readings. The transfer characteristic of the filter is a SINC
with the first zero crossing at fsample =2k and a decay of 20 dB/dec. Figure 4.10
shows the transfer characteristic for k = 7 and fsample = 48 kHz. Note that
for these parameters only frequencies above 150 Hz are suppressed.
0
[dB]
−20
Movingaverage.eps
108 36 mm
−40
−60
1
10
2
3
10
10
4
10
[Hz]
Figure 4.10: Spectrum of moving average filter
Figure 4.11 shows traces of the distortion caused by the non-ideal frequency tracking of the frequency counter for synchronous, plesiochronous
and asynchronous operation. Three subplots are given (corresponding to Figure 4.9 in the previous section): The deviation of M [m] in LSBs, the deviation
of the phase 'q [m] and the error spectrum of the phase deviation according to
Eq. 4.5.
In the synchronous case the filter is not needed at all since the counter
output is constant and no phase distortion results from the frequency tracking.
In the asynchronous case the error energy is roughly equally distributed
over the frequency band and the moving average filter reduces it by a factor
2k . All distortion products are below ,125 dB for the example given in
Figure 4.11.
In the plesiochronous case the error energy is concentrated at low frequencies and can not be suppressed by the moving average filter. In the example in
Figure 4.11 the frequency difference between the quartz that is used to generate
the source sampling clock and the quartz of the reference counter is 100 Hz.
This difference frequency is not suppressed by the moving average filter, but
can clearly be seen in the plots of the phase deviation for the plesiochronous
case in Figure 4.11. Distortion products at multiples of 100 Hz are produced
for the given example.
The frequency counter approach is therefore well suited for the synchronous case. Both sample rates can exactly be measured by a counter running
with the common master clock. If the counter frequency is asynchronous to
the sampling clock, the counter readings alternate between two values which
differ by one LSB of the counter. These counter readings can be averaged by a
moving average filter to increase the precision of the frequency measurement.
In the plesiochronous case the counter reading alternate also only between two
values as in the asynchronous case, but these variations are at low frequencies
which are not suppressed by the filter. Therefore distortion products with the
difference frequency of the source and sink sample rate result.
The performance can be improved by using a higher order moving average
filter, which reduces the cutoff frequency, or by a different type of filter. For a
cutoff frequency of 0.4 Hz (as with the digital PLL) a moving average filter of
length 216 is required. A high-order moving average filter has the drawback
that it tracks slower and needs more storage resources. A different type of
filter with a low enough cutoff frequency must be followed by a controller to
Synchronous
Frequency Deviation
[LSB_16]
2
0
−2
200
400
600
800
1000
[Samples]
1200
1400
1600
1800
2000
1400
1600
1800
2000
Phase Deviation
[Deg]
1
sync.eps
92 47 mm
0
−1
200
400
600
800
1000
[Samples]
1200
[dB]
Attenuated Spectrum of Phase Deviation
−100
−120
−140
2
3
10
4
10
10
[Hz]
Asynchronous
Frequency Deviation
[LSB_16]
2
0
−2
50
100
150
200
250
[Samples]
x 10
[Deg]
300
350
400
450
500
350
400
450
500
Phase Deviation
−3
6
4
2
0
−2
−4
async.eps
92 47 mm
50
100
150
200
250
[Samples]
300
[dB]
Attenuated Spectrum of Phase Deviation
−100
−120
−140
2
3
10
4
10
10
[Hz]
Plesiochronous
Frequency Deviation
[LSB_16]
2
0
−2
200
400
600
800
1000
[Samples]
1200
1400
1600
1800
2000
1400
1600
1800
2000
Phase Deviation
[Deg]
0.1
0
92
−0.1
200
400
600
800
ples.eps
47 mm
1000
[Samples]
1200
[dB]
Attenuated Spectrum of Phase Deviation
−100
−120
−140
2
3
10
10
4
10
[Hz]
Figure 4.11: Simulation of frequency counter followed by moving average
filter (fsample = 48 kHz, fsig = 997 Hz, k = 7)
ensure that the phase calculated from the frequency ratio does not drift away
from the effective phase (see page 53). These limitations can be overcome by
the DPLL solution.
4.6.2
Digital PLL
An architecture of the digital PLL has already been presented in Figure 4.8.
The Gray counter used as input to the DPLL shows a similar output behavior
as the frequency counter described in the previous section. The counter is
incremented at the rate fsource , but the counter value is fetched with fsink .
If source and sink sample rates are plesiochronous the output of the phase
detector contains mostly low frequency components. For proper operation of
the PLL this low frequency distortion must be suppressed by the loop filter.
Gain [dB]
0
−50
−100
−1
10
0
10
90
0
Phase [Deg]
1
10
Frequency [Hz]
Closedloop.eps
2
10
3
10
61 mm
−90
−180
−1
10
0
10
1
10
Frequency [Hz]
2
10
3
10
Figure 4.12: Closed-loop transfer characteristic of the DPLL
Figure 4.12 shows the closed-loop transfer characteristic of the DPLL for
the four bandwidth of the loop filter (0.4, 1.9, 29, 480 Hz). If source and sink
sample rates are stable the cutoff frequency can be stepwise reduced. This
improves the performance in the plesiochronous case. For varying sample
rates a wider filter allows fast tracking. Performance simulations similar
to Figure 4.11 for M 216 showed that all distortion products are below
,125 dB for the DPLL solution.
5
Implementation
In this chapter we will discuss alternatives for the implementation of a samplerate converter following the concepts presented in the previous chapters. Two
different realization methods are analyzed:
1. Application specific integrated circuit (ASIC)
We have implemented a prototype ASIC for 18-bit audio sample-rate
conversion with arbitrary sampling frequency ratios [LS91, RF94]. We
will refer to this implementation as SARCO. Alternatives, improvements
and extensions to this architecture and realization are considered in this
chapter. Performance measurements are given in Chapter 7.
2. Digital signal processor (DSP)
Modern DSPs have powerful instruction sets to implement digital signal
processing tasks. The choice of a programmable solution allows to tailor
the implementation for each application, but is not cost effective when
large volumes are required. We will estimate the necessary resources
for integer DSPs with 24 24 bit multiplier as the members of the
DSP5600x family [Mot88]. The dynamic range of a 24 bit datapath with
56 bit accumulation is well suited for high-quality audio applications.
Figure 5.1 shows the basic building blocks of the sample-rate converter.
The frequency tracking unit (Chap. 4) determines the ratio of source to sink
67
Frequency
tracking
Source
clock
Sink
clock
Frequency
ratio
Source
samples
BlockDiagramm.epsi
High
order digital
58 54 mm
interpolation / decimation
filter
Sink
samples
Filter
coefficients
Figure 5.1: Block diagram
sample rate while the interpolation/decimation filter does the actual conversion
(Chap. 2). The third block represents the storage of the filter coefficients, which
are interpolated from a stored subset as described in Chapter 3.
5.1
Frequency Tracking
The implementation of the two methods for frequency tracking proposed in
Chapter 4 – frequency counter and digital PLL – will be discussed here.
5.1.1
Frequency Counter
A frequency counter which runs at the chip clock (50 MHz) according to Figure 4.4 with a moving average filter of length 128 has been implemented [MS93]
for the prototype ASIC. The frequency tracking unit is connected to the filter
datapath by a serial interface which is accessible from external pins. This
concept allowed to analyze the output of the frequency tracking unit and to
evaluate alternative solutions.
Measurements showed a precision of 14bit for the worst case, where source
and sink sample rate are plesiochronous. Inaccuracies in the measurement of
the sample-rate ratio produce phase noise of the sink sampling moment, which
is modulated onto the audio signal according to Equation 4.5 (Fig. 4.9). Since
the error is proportional to the signal amplitude A and the signal frequency
fsig , the distortion has the largest influence for signal frequencies close to
fsample =2 with an amplitude near full-scale.
The frequency counter of SARCO occupies an area 13:6 mm2 . 30% of this
area is used for the RAMs of the moving average filter.
5.1.2
Digital PLL
A digital PLL solution has been developed for frequency tracking in order to
reach an accuracy of 20 bit for the ratio of source to sink sample rate. The
basic architecture of the digital PLL has already been given in Figure 4.8.
Since the loop filter is a recursive (IIR)-filter with poles close to the unit circle,
large word-lengths are needed for the filter datapath to achieve the required
20-bit precision for M . Because of the feedback involved, the circuit must be
carefully designed to guarantee stable operation and to avoid limit cycles due
to quantization effects.
Extensive simulations have been performed using a finite word-length
model of the PLL. A minimal word-length of 44 bits is required for the
accumulator. After the initialization during pull-in of the PLL the internal
registers have not reached their stable values, which may lead to overflow in
the accumulator. Therefore a limiter has been added after the phase detector,
which saturates the output of the phase detector to avoid overflow.
In-lock
Detector
f sink
ϕsource
f source
28
−
Saturate
25
∆ϕ’
218
K1, K2, K3
+
−
+ −+
±
PLL.epsi
105 44 mm
218
2 K1
:2 K3
ACCU
+
M
21
44
+
:2 K2
ϕsink
Figure 5.2: Implementation of digital PLL
f sink
ptr ∆ϕ
28
The in-lock detector adapts the bandwidth of the loop filter using the
parameters K 1, K 2, and K 3 according to the output of the phase detector.
If the output of the phase detector is below a certain limit during some time
period the bandwidth is gradually reduced. If a large phase difference is
detected the bandwidth of the loop filter is increased at once to allow fast
tracking of sample-rate changes and to guarantee that the PLL stays locked.
The PLL in Figure 5.2 has been realized in a Field-Programmable Gate
Array (FPGA) having the same serial interface as SARCO. Unfortunately the
interface of SARCO limits the frequency ratio to a precision of only 16 bits.
Therefore an additional circuit has been added, which generates a 16 bit approximation of the 20 bit ratio M by controlling the ‘duty cycle’ of the least
significant bit. This FPGA realization of the PLL is however well suited for a
sample-rate converter with an (audio) precision of up to 20 bit.
The datapath of the PLL implemented in a Xilinx XC4010 FPGA runs at
2 MHz, since only 7 additions/subtractions must be calculated for each sink
sample. The frequency ratio M is transmitted over the serial interface running
at 20 MHz. The XC4010 contains 400 configurable logic blocks (CLBs),
which is claimed to be equivalent to 10 000 gate equivalents (GE). Each CLB
provides two independent function generators of four inputs and two flip-flops.
Considerable effort was required to fit the PLL into one FPGA, since all CLBs
are occupied: 91 % of the CLBs are used for logic functions, 9 % as additional
routing resources.
To compare the complexity of the DPLL to the frequency counter solution,
the die size for an ASIC realization of the PLL has been estimated for the same
target technology as the frequency counter. The DPLL requires only 6:3 mm2 ,
which is about half the area of the frequency counter. The accuracy has been
improved from 14 to 20 bit.
Table 5.1 summarizes the technical data of both approaches.
2
Freq. counter
DPLL
[mm ]
13:6
6:3
ASIC
[GE]
9 100 + RAM
5 900
FPGA
[CLB]
[GE]
400 10 000
Table 5.1: Implementation alternatives for frequency tracking
5.2
5.2.1
Datapath for FIR Filter
Review of Operations
In Chapter 2 we have explained the principle of sample-rate conversion for
arbitrary ratios: It can be realized by a FIR filter of order Q with time-varying
coefficients (Eq. 2.17). An equidistant subset of the Q L filter coefficients of
the high-order interpolation/decimation filter is chosen for each output sample.
The selection of the subset depends on the difference between source and sink
sample phase ∆'[m]. Equation 2.17 is rewritten below for time-varying M [m]
and %[m].
y[m] = %[m] +
QD m
[
2
X
]
j =, QD m
[
h j L %[m] + ∆'D [m] x ptr[m] , j
(5.1)
]
2
'sink [m]
ptr[m]
∆'[m]
%[m]
QD [m]
∆'D [m]
=
=
=
'sink [m , 1] + M [m , 1]; 'sink [0] = '0
'sink [m] div L
'sink [m] mod L
=
=
=
1
UPMODE
L
M [m] DOWNMODE
M [m] Q
L
%[m] ∆'[m]
(5.2)
(5.3)
(5.4)
(5.5)
(5.6)
(5.7)
To clarify the required operations, a step-by-step description of the algorithm is given below.
1. Determine the new sink sample phase 'sink [m] according to the current
sample-rate ratio M [m , 1] fetched from the frequency tracking unit
(Eq. 5.2). Split the sink phase 'sink [m] into a lower part ∆'[m] (Eq. 5.4)
which determines the sink phase relative to the source samples, and into
an upper part ptr [m] (Eq. 5.3) which points to the source sample. Note
that the values of 'sink [m], ptr [m], and ∆'[m] are directly available, if
the frequency tracking is realized with a DPLL according to Figure 5.2.
2. If the sink sample rate is smaller than the source sample rate (DOWNMODE) the number Q of coefficients and the relative phase ∆'[m] are
multiplied by M [m]=L and %[m] respectively (Eq. 5.6, 5.7) yielding
QD [m] and ∆'D [m]. Note that since M [m] > L in DOWNMODE, % is
smaller than 1.
P
3. The ‘inner loop’ of the algorithm is of the form
h[hadr] x[xadr].
The address of the source samples xadr [j ] = ptr [m] , j is incremented
by one for each calculation. The spacing of the filter coefficients h is
equidistant with spacing L (UPMODE) or %[m] L (DOWNMODE).
Since only a fraction of the filter coefficients h are stored, the actual
coefficients are to be interpolated in real-time. The coefficient address
hadr has to be split into two parts: hadrHigh = hadr div L2 is used
to access the stored FIR coefficients hFIR ; hadrLow = hadr mod L2
is used for the interpolation. The difference ∆h between successive
coefficients needed for the interpolation can either be stored or calculated
in real-time – both methods need two memory accesses.
Multiplication
Memory access
h = hadrLow ∆h + h1
al = h xl + al
ar = h xr + ar
Address update
xl = xLeft[xadr]
xr = xRight [xadr]
h1 = hFIR [hadrHigh ] hadr + = % L
∆h = hFIR [hadrHigh ] xadr ++
For each iteration of the inner loop a minimum of three multiplications,
four memory accesses and two address updates must be calculated.
4. In DOWNMODE the accumulators al, ar must be multiplied by % due to
the scaling of the filter cutoff frequency (Eq 2.15).
Note that the number of repetitions of the inner loop is proportional to %. In
UPMODE it is constant. For the extreme case fsink = fsource =2 (DOWNMODE)
the number of loop repetitions doubles compared to the UPMODE. Since y [m]
is calculated at the sink sample rate (which is fsource =2 in this case) the
overall speed requirements for the inner loop in DOWNMODE are the same as
for fsink = fsource.
If the computation effort outside the inner loop is neglected, the time for
one repetition of the inner loop must be below
UPMODE
DOWNMODE
1=(Q fsink )
1=(Q fsource )
(5.8)
5.2.2
Finite Word-Length in Datapath
In this section the implications of the finite word-lengths in the datapath on
the signal quality are discussed. Figure 5.3 shows the basic structure for the
filter calculation.
ρ[m] h1, ∆h
x[n]
bx
Circular
Buffer
xadr
bx
bh
Datapath.epsi
95 59 mm
bm
M[m]
hadr
xl, xr
Address
Generator
Scale
Multiply
Accumulate
Unit
ba
by
y[m]
Figure 5.3: Datapath: word-lengths for FIR filter
The source samples x[n] are stored in a circular buffer. The size of this
buffer is mainly determined by the filter order QD . The allowed short-time
variations of the sample rates requires (few) additional words. A larger buffer
size is more tolerant to sample-rate variations, but increases the group delay
of the filter.
The address generator calculates the coefficient address
sample address xadr according to Equation 5.1.
hadr
and the
The product of the source samples xl, xr of word-length bx and the filter
coefficients h1, ∆h of word-length bh is quantized to bm bits and accumulated.
The word-length of the accumulator ba must be large enough to avoid overflow.
If QFP coefficients are used (Sec. 3.5.1) the accumulator is renormalized by
shifting when the exponent changes: block ‘Scale’ in Figure 5.3. As a last step
the accumulator value is multiplied by %. For this last multiplication the same
multiplier can be used as before, but the accumulator output must be quantized
to bh bits. Note that this last multiplication also allows to alter the overall gain
bx
SARCO [RF94]
AD1890 [Dev93]
DSP5600X [Mot88]
[bit]
18
20
24
bh
[bit]
22
22
24
bm
[bit]
22
25
48
ba
[bit]
25
27
56
by
[bit]
18
24
24
Quantization
noise [dB]
,123
,127
,140
Table 5.2: Datapath width in bits ( = QFP)
if desired. Table 5.2 shows sample word-length for three implementations.
The influence of the finite word-length filter coefficients on the filter transfer
function has already been shown in Section 3.5. The finite word-lengths in the
datapath are the reason for several sources of quantization noise:
1. Quantization to bm bits after the multiplication in the inner loop
2. Quantization caused by the renormalization for QFP coefficients
3. Quantization to bh , 1 and bm , 1 bits before and after the multiplication
with %
4. Output quantization to by bits
The quantization noise – neglecting output quantization – is therefore
for fixed-point coefficients
qbh ,1 + qbm ,1 + Q qbm
for QFP coefficients
qbh ,1 + qbm ,1
+2qbm + 4qbm ,1 + 4qbm ,2 + 8qbm ,3 + : : :
+qbm + qbm ,1 + qbm ,2 + qbm ,3 + : : :
For the QFP coefficients we thereby assumed that two coefficients have
exponent 0, four have exponent 1, four have exponent 2, etc. The resulting
quantization noise for Q = 104 is given in the last column in Table 5.2. The
error contribution of the datapath can therefore be neglected for the given
examples for audio word-length up to 20 bit.
5.3
DSP Realization Considerations
The operations reviewed in the previous section seem to be well suited for
a DSP implementation, since a fast multiply-add unit and a large memory
bandwidth are required. DSP solutions have been reported i.e. for the following
implementations:
[CDPS91] realized a sample-rate converter for fixed frequency ratios on a
DSP56001 using B-splines. Two cascaded FIR interpolators each by a factor
2 are followed by a 6th order B-spline interpolation. 300 instruction cycles
are needed per sink sample allowing real-time on a DSP56001 @ 33 MHz.
[PHR91] implemented a 16-bit sample-rate converter for rational frequency ratios on a DSP56001. The interpolation/decimation filter has been
synthesized by multiplying a SINC function with the Blackman-Harris window, as suggested by [Ram82]. The interpolation/decimation filter is of order
N = 10239 using an interpolation factor L of 160. Therefore the number of
repetitions Q of the inner loop is 63. Each repetition of the inner loop requires
only two instruction cycles, since no coefficients are interpolated. Therefore
only 126 instruction cycles are to be executed in the inner loop for each sink
sample.
Due to the real-time coefficient interpolation and the frequency tracking an
implementation of our algorithm is more complicated than the examples shown
above. The frequency tracking is not well suited for a DSP implementation
due to the synchronization problems, but can be realized as FPGA (Sec. 5.1.2).
Some problems of the filter implementation on a DSP56001 are pointed out
below.
The DSP56001 has two 256 24 bit on-chip data memories, a 512 24 bit
program memory and many powerful addressing modes, but only one access
to external memory per instruction cycle is possible. Due to the 24-bit data bus
and 16-bit address bus the external memory space is limited to 64k 24 bit.
The address ALUs, which allow auto-increment and modulo addressing, are
limited to a 16-bit word-length. The data ALU has two 56-bit accumulators
and allows single cycle multiply-accumulate from the X and Y register.
The source samples x[:] and the program are stored in the on-chip (X-)RAM.
A maximum of 64k filter coefficients can be stored in an external RAM, which
must be fast enough for one cycle accesses. Since only two accumulators
are available in the DSP56001, the ‘inner loop’ is split into two loops, which
are executed one after the other: in the first loop the filter coefficients h are
precomputed and
stored in the internal (Y-)RAM; in the second loop the actual
P
filter function h x (al + = h xl, ar + = h xr) is calculated.
In the first loop the coefficient address is updated: hadr + = % L,
hadrHigh = hadr div L2 , hadrLow = hadr mod L2 , and the coefficients
h1, ∆h are accessed from external memory. The actual filter coefficient is
then interpolated h = hadrLow ∆h + h1 and stored in the internal memory.
Figure 5.4 shows two realization alternatives: on the left hand side the DSP is
used for all to above computations; on the right hand side the FPGA calculates
the coefficient address update and addresses the RAM. This second alternative
requires a larger FPGA, but needs less (DSP) instruction cycles.
The second loop – an FIR filter with constant coefficients – is easy to
implement in two MAC instruction cycles, since both operands are stored in
the internal RAM.
Frequency
tracking
XC4010
ρ[m], M[m]
x[n]
Filter
Frequency hadr High
tracking
coefficients
XC4013
RAM 64k×24
Filter
coefficients
RAM 64k×24
hadr High
High order digital
interpolation / decimation
filter
DSP56001
ρ[m], M[m]
h1, ∆h
h1, ∆h
BlockDiagramm-DSPnew.epsi
hadr Low
105 37 mm
y[m]
x[n]
High order digital
interpolation / decimation
filter
DSP56001
y[m]
Figure 5.4: Block diagram using DSP and FPGA
The author estimates that 5 - 7 instruction cycles are required for an actual
implementation of the inner loop. A DSP5600x running at 40 (66) MHz allows
to compute 400 (660) MAC/s at a rate of 50 kHz. If 6 instruction cycles are
used for the inner loop, and if we assume that 90% of the computation time is
spent in the inner loop, 60 (100) loop repetitions can be realized. A DSP/FPGA
implementation of a sample-rate converter for arbitrary ratios seems therefore
possible, but rather expensive if needed in large quantities.
5.4
5.4.1
ASIC Realization
Implementation
A VLSI implementation of a sample-rate converter (SARCO) has been realized
in several semester and diploma theses [RW90, LS91, MS93]. This ASIC
is realized in a standard cell and macrocell technique using a 1:2m double
metal CMOS technology. It contains 230 000 transistors, including 14 kbit
of static RAM and an 18 22 bit pipelined multiplier. The total die area is
6.4 mm 9.6 mm. Figures 5.5 and 5.6 show the floorplan and a photomicrograph of the chip.
The architecture of a 16-bit sample-rate converter for arbitrary ratios and
the high-order filter synthesis program have been developed in a first diploma
thesis [RW90]. A sophisticated scheduling schema for the pipelined multiplieraccumulator has been elaborated, which allows to keep the pipeline fully
loaded during the filter computation. Based on the results of this work, the
specifications for a 18-bit sample-rate converter have been established.
To verify the performance of the algorithm for the chosen parameters
a software model of the sample-rate converter has been written and simulated [LS91]. In a second step this model has been partitioned into several
functional units according to the planned chip architecture. An additional
model has then been written for each functional block, which matched exactly
the intended implementation e.g. concerning word-length of the datapath. This
bit-level model allowed to simulate the quantization properties of the whole
system. However, the integer model executed much faster than the bit-level
model.
The bit-level model has been used to generate stimuli and expected responses for the simulation of the VLSI circuit during development. This
methodology allowed to verify each functional block separately as soon as it
was implemented. Additional simulations have been performed using several
functional blocks.
Some key parameters of the actual implementation of SARCO are given
below. A more detailed description of the filter implementation can be found
in [LS91]. In [MS93] the design of the frequency counter is documented in
detail.
Frequency Tracking
Arithmetic Unit
FIR Address
Generator
Circular
Buffer
Left
Registers
Floorplan.epsi
108 70 mm
Accu
Frequency
Measurement
Unit
Circular
Buffer
Right
Pipelined
Multiplier
Moving
Average RAM
Circular Buffer
Figure 5.5: Floorplan of SARCO
Sarco-scan50dpi.eps
108 72 mm
Figure 5.6: Photomicrograph of SARCO
The (stereo) input samples are stored in two 256 18 bit circular buffers,
for the left and right channel each. Special address generation units compute
the required addresses for these sample buffers and the address of the filter
coefficients. The heart of the digital filter architecture is a microprogrammed
pipelined multiply-accumulate unit. A cycle time of 50 ns has been reached
using three stage pipelining of the multiplier. The multiply-accumulate unit is
used both for the coefficient interpolation and for the FIR filter calculation.
The FIR filter coefficients of this prototype design are stored off-chip in
the quasi floating point format (QFP): 22 bit mantissa, 4 bit exponent. The
25 bit wide accumulator is dynamically rescaled according to the exponent of
the filter coefficient. However only subsequent shifts by either 1, 2 or 3 bits
are implemented.
A (total) interpolation factor L of 65536 is used (L1 = 256, L2 = 256).
Therefore the required number of filter taps (Q) is 105 for the full and 85 for the
relaxed specifications (Chapter 3). Each filter tap needs three multiplications:
one for the coefficient interpolation and two for the FIR filter calculation of
the stereo channels. Besides the inner loop, 30 additional cycles per output
sample are required for initialization and postprocessing. To compute one sink
sample, a total of 3 85 + 30 = 285 cycles (at 20 MHz) is needed. Therefore
the maximum sample rate which can be processed is 70 kHz.
SARCO is targeted for 18-bit performance. Therefore the input word-length
is 18 bits and the output is rounded to 18 bits. Figure 3.8 showed that for the
chosen parameters (L1 = 256, L2 = 256, s1 = ,137 dB) 18-bit performance
for full-scale sine waves is only reached up to a signal frequency of 5 kHz.
Improvements are suggested in the following section.
5.4.2
Discussion
During the development and measurements of the sample-rate converter SARCO
a deeper understanding of all side effects involved has been gained. The effects
of the various parameters on the resulting signal quality have been discussed
in detail in the previous chapters. The following improvements to the above
implementation make full 20-bit performance possible.
The frequency tracking unit realized on the ASIC reaches only a worst case
accuracy of 14 bit. The digital phase-locked loop solution (realized as FPGA)
replacing this unit allows to improve the resolution of the frequency tracking
0
−20
−40
−60
[dB]
−80
Filter57855bit22QFP-256bit22QFP.eps
106 83 mm
−100
−120
−140
−160
−180
0
0.05
0.1
0.15
0.2
0.25
[f/f sample]
Figure 5.7: Interpolated FIR filter (N
=
0.3
0.35
57855), L2
0.4
=
0.45
0.5
256, 22 bit QFP
to 20 bit. As an additional benefit the required chip area used by this unit is
reduced from 13.6 mm2 to 6.3 mm2 .
The filter coefficients of SARCO are stored in an external memory. To allow
one-cycle access, SRAMs buffered by a Lithium battery are used. To reduce
the number of pins and the external components of the 20-bit version, the
coefficients can be stored on-chip. The parameter Q, which defines the order
of the interpolation/decimation filter of SARCO, is programmable for Q 85
(N 21 759). For full 20-bit performance a filter order of at least 57 855 is
required. Since the impulse response is symmetric, and if the QFP notation is
used a ROM size of 28 928 22 bit is sufficient. For the 1:2m technology of
SARCO an additional die area of 26 mm2 is necessary.
The number of filter coefficients can be substantially reduced by using
higher order coefficient interpolation as outlined in Section 3.2.3.1. A second
multiplier is then needed to allow coefficient interpolation in real-time.
Figure 5.7 shows the spectrum of the interpolation/decimation filter of order
57 855 using an interpolation factor L2 = 256 and 22 bit QFP quantization.
The average quantization noise level is at ,186 dB with a peak value of
,165 dB. The 22 bit QFP format is therefore well suited for a filter of order
57 855.
The circular buffer for the source samples and the multiplier must be
extended from 18 to 20 bit. This leads to an area increase of approximately
1 mm2 .
The above enhancements from the (nearly) 18-bit implementation of SARCO
to a full 20-bit implementation increases the chip area from 61.4 mm2 to
81 mm2 (+32%).
To estimate the die area of the above 20-bit architecture in a 0.8 m technology the main blocks have been retargeted to this technology. Based on
these results the author estimates that a die area of approximately 50 mm2 is
sufficient.
6
Measurement Principle & Setup
In this chapter the setup of an all-digital measurement system, which has been
built for the characterization of the sample-rate converter is described. In addition to the digital measurements informal hearing tests have been performed
using a CD-player as source.
At a first glance it seems to be an easy task to measure the performance
of a digital system with digital input and output using DSP-based equipment.
However many specific particularities must be taken into account for accurate,
consistent and relevant measurement of high-quality digital audio equipment.
The measurement system is composed of generator and analyzer hardware,
and a workstation (Fig. 6.1). The generator applies repeatedly in real-time
a sequence of 24-bit audio samples to the device under test (DUT). The
analyzer acquires the response samples, which are then processed, displayed
and analyzed on a workstation using MATLAB [Mat88] – a high-performance
numeric computation and visualization software.
6.1
Generator
The source sample rate fsource is generated by dividing the frequency fgen
of a quartz oscillator by U . This concept allows for a very stable low-jitter
83
f source
Generator
RAM
256k x 24 bit
Digitally Generated
(Sine) - Waves
f sink
DUT
Source
Samples
Ssetup.epsi///Chap6Sink
103 52 mm Samples
Workstation
Analyzer
RAM
256k x 24 bit
Spectrum Analysis
with MATLAB
Figure 6.1: Setup of all-digital measurement system
sampling clock source. With a 50 MHz quartz and sampling frequencies in
the range of 48 kHz discrete frequencies with a resolution of 46 Hz can be
generated.
A large memory (256 k 24 bit) holds the source sample values and is
read out at the source sample rate. Any sequence of source values of length
up to the size of the memory can be loaded into the RAM and is then applied
periodically to the DUT. This approach allows the use of dedicated test signals,
like pseudo-random noise, artificially distorted sine waves, multiple frequency
signals, as well as music signals of a length up to 6 s.
The ratio of the test signal frequency to sampling frequency has a significant
effect on the spectral content of the signal [HB86]. If the sampling frequency is
a multiple of the test signal frequency then the quantization error components
are located at the harmonics of the test signal. This is often undesirable since
the true harmonic distortion should be separated from the quantization noise.
If adequate dither is added prior to the quantization, the above effect is
reduced, since dither randomizes the quantization process and results in a
white-noise characteristic of the quantization noise [VL84, LWV92]. The
same effect can be reached if the test signal frequency and the sampling
frequency are chosen to be of non-integer (preferably close to irrational) ratio
and the analysis is performed over several periods of the test signal. Using this
method, the signal degradation caused by the added dither can be avoided.
If the generator RAM contains LOOP values and NP periods of the test
signal, the following equations hold:
fsource = U1 fgen
NP
NP f
fsig = LOOP
source = LOOP U fgen
(6.1)
(6.2)
NP should be chosen to be a (large) prime number to ensure that NP and
LOOP have no common factors for reasonable choices of fsig and fsource .
For fsource 48 kHz and fsig 997 Hz, i.e. the following parameters result
(fgen = 50 MHz):
U = 1 042; NP = 5 087; LOOP = 244 832
The resulting frequencies are: fsource = 47:985 kHz and fsig = 997:002 Hz.
Using this method, only discrete but closely spaced test signal frequencies
can be generated (the next larger signal frequency would be 997.006 Hz in the
above example). It can be checked easily that the signal frequency fsig is not
related to the other frequencies by a small integer ratio – neither to fsource nor
to fgen .
6.2
Analyzer
The analyzer is built in a fashion similar to the generator (Fig. 6.1). The analyzer quartz frequency fanl is divided by V to generate the sink sampling clock
fsink . The sink sample rate can alternatively be derived from the generator
quartz instead of the separate analyzer quartz to allow synchronous measurements. The samples are acquired at the sink sample rate with 24-bit precision
and written into the analyzer RAM. The spectrum of the acquired samples is
calculated using the FFT and displayed graphically on the workstation using
MATLAB.
The power spectrum of an N -point FFT consists of N discrete frequency
values – so-called bins. Therefore the frequency resolution is limited to
fsink =N . When calculating the FFT, it is assumed that the sequence of N
points is repeated periodically. The discontinuities at the beginning and at
the end of the sequence, which may result from this repetition, distort the
spectrum. The two well-known methods to cope with this problem are given
in the next two sections.
6.2.1
Frequency Matching for the FFT
If the test signal frequency is chosen to lay on a bin, i.e.
fsig = n Nfsink
(6.3)
the data sequence contains exactly n periods of the test signal and no discontinuity occurs at the boundary of the measurement interval. This technique is
often used in A/D- and D/A-converter testing [Der87], but is not well suited for
sample-rate conversion characterization, since it must be possible to choose
the test signal frequency independently of the sink sample rate to comply with
the requirements given in the previous section. In addition this method is
only feasible if Equation 6.3 can be met exactly, i.e. if the exact sink sampling
frequency is available. A small mismatch leads to spectral leakage discussed
below.
6.2.2
Windowing for the FFT
The second method to avoid the discontinuity at the border of the data sequence
is to multiply the data by a window function w[i]. [Har78] compared many
different windows and concluded that the Kaiser-Bessel and the BlackmanHarris window perform best. We use the 4-term Blackman-Harris window
(BH4) and the Kaiser-Bessel window with = 6 (KB6), due to their high
sidelobe attenuation.
In Figure 6.2 the spectrum of the BH4-window is plotted. The highest
sidelobes are at ,92 dB. The ,6-dB bandwidth of the window is 2.72 bins.
Therefore signal frequencies which are at least 3 bins apart can be distinguished [Har78]. Note that if the signal frequency falls exactly on a bin of the
FFT, the BH4 window is sampled at it’s (equidistant) zero-crossings and the
effect of the window disappears. On the other hand, if the signal frequency
falls in the middle of two bins, the BH4 window is sampled on the upper
envelope in Figure 6.2.
The parameter of the Kaiser-Bessel window allows a trade-off between
mainlobe width and sidelobe attenuation. The Kaiser-Bessel window for
= 6 is plotted in Figure 6.3. The ,6-dB bandwidth of the KB6 window is
3.31 bins, and the highest sidelobes are at ,145 dB. The window coefficients
are Bessel functions and therefore the zero-crossings are not equidistant.
0
−20
−40
[dB]
−60
BH4win.eps
89 69 mm
−80
−100
−120
−140
−160
−80
−60
−40
−20
0
[bin]
20
40
60
80
Figure 6.2: Spectrum of 4-term Blackman-Harris window (BH4)
0
−20
−40
−60
[dB]
−80
KB6win.eps
89 69 mm
−100
−120
−140
−160
−180
−80
−60
−40
−20
0
[bin]
20
40
60
80
Figure 6.3: Spectrum of Kaiser-Bessel window with = 6 (KB6)
What is the required accuracy of the measurement system for high-quality
digital audio characterization?
For a full-scale sine wave quantized with k bits the quantization noise
floor is ,6:02 k , 1:76 dB below the signal level. Therefore the window
sidelobes of the BH4 window at ,92 dB would be only a minor error for 16-bit
measurements. However, this estimate is misleading, since the quantization
noise is modified by the window function and distributed over N=2 bins of
the FFT resulting in a lower noise level. Several measurements are usually
averaged to obtain a uniform distribution of the noise over all bins.
cg
Two characteristic values of a specific window w[i] are its coherent gain
and its equivalent noise bandwidth nbw [Har78].
cg = N 1
N
X
N
X
w[i]; nbw = N cg2 w2 [i]
i=1
i=1
1
(6.4)
The coherent gain cg is the average value of the window. To preserve
the energy of the signal, the amplitude of the spectrum is divided by cg . The
window function amplifies the noise level by its noise-bandwidth nbw. In a
N -point FFT the average noise floor for k-bit quantization appears therefore
at
, 10 log
N
, 6:02 k , 1:76 dB
2 nbw
(6.5)
Table 6.1 gives examples values of Eq. 6.5 for the BH4 window (nbw = 2:004).
The listed values must be increased by 1.0 dB for the KB6 window, due to its
larger noise bandwidth (nbw = 2:492).
N
k =16 bit k =18 bit k =20 bit
1024 ,122:2 dB ,134:2 dB ,146:2 dB
16384 ,134:2 dB ,146:2 dB ,158:2 dB
65536 ,140:2 dB ,152:2 dB ,164:2 dB
Table 6.1: Average noise floor level of a sine wave with k -bit quantization
using a N -point FFT with BH4 window
6.2.3
Complex Windows
In Sections 6.2.1 and 6.2.2 we have presented two methods to allow accurate
spectrum analysis of high-quality audio signals. Both methods exhibit certain
limitations: The frequency matching method requires the precise sampling
frequencies to be available and the performance of the windowing method
depends strongly on the chosen window. The BH4 window has its sidelobes
as high as ,92 dB and the KB6 window needs coefficients which are expensive
to calculate and allows only a limited frequency resolution, due to its wider
mainlobe.
To overcome the inaccuracies caused by the sidelobes of the BH4 window,
the signal frequency must be chosen to fall exactly on a bin as mentioned
above. To avoid this restriction we make use of the frequency transformation
property of the Fourier transform. The signal frequency is determined by a
first FFT and is then shifted slightly onto a bin (Eq. 6.6) without affecting the
signal energy.
s0 = s e,jΩt
(6.6)
We have called this combination of the complex correction and the window
‘complex BH4 window’ (BH4C) [RF94]. It is now also used in the newest
version of the ‘Audio Precision One’ software. This method is only applicable
for windows which have equidistant zero-crossings that can be matched on a
bin (i.e. not to KB6 window).
If the frequency measurement is not accurate enough remnants of the
window will still be visible. Figure 6.4 shows the Hanning, the BH4, and the
KB6 window with three different offsets.
The BH4C window is a good compromise between the mainlobe width and
the inaccuracies introduced by imprecise frequency measurement. The KB6
window is well suited for multi-tone measurements, where it is not possible
(and not desirable) to match all frequencies on a bin.
6.3
Measurement Procedure
This section concludes this chapter with the step-by-step description of the
procedure used for the measurements in this thesis.
Offset = 0.004 bin
Offset = 0.500 bin
0
0
−20
−20
−20
−40
−40
−40
−60
−60
−60
−80
Wincompare.eps
89 72 mm
−80
[dB]
[dB]
[dB]
Offset = 0.000 bin
0
−80
−100
−100
−100
−120
−120
−120
−140
−140
−140
−160
−10
0
[bin]
10
−160
−10
0
[bin]
10
−160
−10
0
[bin]
10
Figure 6.4: Influence of (bin-)offset on window performance: BH4 (
Hanning (, ,), KB6 (, , ,)
),
1. The source and the sink sample rates are chosen. Since they are generated by dividing the frequency of the generator and the analyzer quartz
respectively, parameters U , V are set. If a particular precise sample rate
– such as 44.100 kHz – must be generated, an appropriate quartz with a
multiple of this rate must be used.
2. The source samples are loaded from the computer into the generator
RAM. A signal frequency according to Equation 6.2 should be chosen.
If this is not possible dither must be added to the source signal.
3.
N sink samples are acquired into the analyzer RAM and transferred to
the computer. A first FFT is used to determine the signal frequency. The
signal is then multiplied by the complex Blackman-Harris window and
a second FFT is calculated in double precision using MATLAB.
To average several (Avg ) measurements the signal acquisition is repeated Avg times. The acquired N sink samples are transformed into
the frequency domain by the FFT and are then averaged.
4. The current standard for measurement of digital audio equipment [Cab91]
states that the notch filter to suppress the signal frequency for ‘total harmonic distortion and noise’ (THD+N) measurements should have an
‘electrical Q’ between 1 and 5. We realized the notch filter in frequency
domain, by setting the appropriate bins to zero. Two THD+N values are
given in each plot: one for Q = 5 and one for a notch filter with a constant width of 16 bins (i.e. Q = 21 for fsig = 1 kHz, fsink = 48 kHz,
N = 16k).
According to the standard only noise components up to 20 kHz are taken
into account. Therefore it is possible to get THD+N values which are
slightly below the minimum for k-bit quantization.
For further reference Figure 6.5 shows the spectrum of an (ideal) sine
wave quantized with 18 bit. A 16k-FFT has been calculated using the BH4C
window. The signal level is 0 dB, and the total harmonic distortion and noise
is the same for the narrow and the standard (Q = 5) notch filter. 16 sequences
have been averaged (Avg = 16). The slow rise of the noise floor towards low
frequencies is a residual of the BH4C compensation.
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −110.5dB; −110.5dB, Avg = 16
0
−20
−40
[dB]
−60
Sine18BH4.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
[Hz]
Figure 6.5: Sine wave quantized with 18 bit
4
10
7
Results
In this chapter measurements of our VLSI implementation (SARCO) of the
sample-rate converter are presented to verify the theoretical concepts developed in the previous chapters. Measurements of a commercial implementation
are added for comparison purposes.
Three major sources of signal degradation due to sample-rate conversion
have been identified in this thesis:
Non-uniform sampling due to finite precision frequency tracking
Non-ideal reconstruction/anti-aliasing lowpass filter (including the hold
operation, the coefficient interpolation and quantization, and the finite
stopband attenuation)
Quantization noise in the datapath
The phase deviation of the computed sink sample moment caused by the
finite precision frequency tracking and jitter of the sampling clocks results in a
non-uniform resampling of the signal. The phase deviation is modulated onto
the audio signal and leads to a wider signal peak. The level of the modulated
error signal is proportional to both the signal frequency fsig and the signal
amplitude A. Note however that jitter on the sampling clocks is attenuated by
93
the digital PLL and therefore the signal quality may even be improved by the
sample-rate conversion.
The zero-order hold operation and the coefficient interpolation lead to
spurious signal peaks near the zero-crossings of the interpolation function
spectrum ( SINC for hold operation, SINC 2 for linear interpolation). These
peaks are folded back into the baseband by the resampling process. Note that
the resulting filter transfer function is still strictly linear phase, since the filter
coefficients are exactly symmetric even after interpolation and quantization.
These error components caused by the interpolation and hold operation can
easily be identified in the spectrum, since they appear always in pairs with
(nearly) identical amplitude, but with a frequency difference of twice the
signal frequency fsig . The amplitude of these peaks scales proportionally to
the signal amplitude A. The error peaks caused by the hold operation are
proportional to the signal frequency fsig , whereas the peaks caused by the
coefficient interpolation scale by (fsig )2 .
In Section 7.1 measurements of the distortion caused by the non-ideal
frequency tracking are shown and discussed. In Section 7.2 the performance
of the entire sample-rate converter is measured and error components caused
by the non-ideal filtering are pointed out. Measurements of the error caused
by the finite datapath width are not shown, since this quantization is masked
by the 18 bit output quantization of SARCO.
7.1
Frequency Tracking Measurements
The phase of the sink samples 'sink [m] is calculated starting from an initial
phase '0 . It is incremented for each sink sample by the (measured) ratio of
source and sink sample rate M [m] (Eq. 5.2, Fig. 4.2). Jitter of the source and
sink sampling clock and the finite arithmetic precision of the frequency tracking
unit lead to a (small) inaccuracy of the calculated sink sample moment. For the
characterization in this section the output of the frequency measurement unit
is acquired with the analyzer (Sec. 6.2). This measured data is used to sample
an ideal, non-quantized sine wave at the corresponding sink sample moments.
The spectrum of this non-uniformly sampled sine wave is then calculated and
displayed.
The following measurements base on source and sink sample rates derived
from different quartz oscillators as described in Chapter 6. The sample-rate
ratio M [m] is tracked by either the frequency counter integrated on SARCO
(Sec. 5.1.1) or the digital PLL realized as FPGA (Sec. 5.1.2). 16 384 values
of M [m] are acquired for different source and sink sample-rate pairs using
the digital analysis system. Plots of the deviation caused by the non-ideal
frequency tracking are shown on Pages 96 and 97. Each plot is subdivided
into three traces. The top trace shows the deviation of M [m] from the average
value in (16-bit) LSBs. The sink phase is computed as integral of M [m]. The
phase deviation 'q [m] is shown in the middle plot. The bottom trace shows the
(simulated) spectrum of an full-scale sine wave at 997 Hz, which is sampled
non-uniformly with the measured phase deviation shown in the middle trace.
The measurements on Page 96 correspond to the asynchronous sample rates
fsource = 44:1 kHz and fsink = 48:0 kHz, whereas on Page 97 measurements
for fsource = 48:0 kHz and fsink = 47:9 kHz are shown. This second case
corresponds to plesiochronous sample rates with a difference frequency of
100 Hz. Note that the error distribution of frequency tracking measurements
depends on the precise quartz frequencies of generator and analyzer, since
their difference frequency may also appear in the measurements.
7.1.1
Frequency Counter
Figures 7.1 and 7.4 show measurements of the performance of the frequency
counter. The measured frequency ratio M [m] varies by up to 2 LSBs (16 bit)
even though both sample rates are constant. The resulting phase noise contains
both low-frequency and high-frequency components up to ,80 dB for a 1 kHz
full-scale signal. The frequency and amplitude of the error components depend
on the precise frequency of the source and sink sampling clocks and on the
clock frequency of the counter. If the counting frequency is unrelated to the
sampling frequency the amplitude of the error peaks in Figures 7.1 and 7.4 is
reduced.
7.1.2
Digital PLL
In Figures 7.3 and 7.6 performance measurements of the (20 bit) digital PLL
are shown. Due to a limitation in the SARCO implementation the frequency
ratio M [m] can only be used with a precision of 16 bits. For comparison
Figures 7.2 and 7.5 show the performance of the PLL with M [m] reduced to
16 bit.
Frequency Deviation
[LSB_16]
2
0
−2
50
100
150
200
250
[Samples]
300
350
400
450
500
Phase Deviation
[Deg]
0.5
PROTO-40-44-48.eps
95 48 mm
0
−0.5
2000
4000
6000
8000
[Samples]
10000
12000
14000
16000
Phase Deviation Modulated on 997 Hz Sine
[dB]
−100
−150
2
3
10
4
10
10
Figure 7.1: Frequency counter 16 bit, 44.1 kHz ! 48 kHz
[Hz]
Frequency Deviation
[LSB_16]
2
0
−2
50
100
150
200
[Deg]
5
x 10
300
350
400
450
500
350
400
450
500
SARCO-44-48.eps
95 48 mm
0
−5
250
[Samples]
Phase Deviation
−3
50
100
150
200
250
[Samples]
300
Phase Deviation Modulated on 997 Hz Sine
[dB]
−100
−150
2
3
10
4
10
10
Figure 7.2: Digital PLL 16 bit, 44.1 kHz ! 48 kHz
[Hz]
Frequency Deviation
[LSB_16]
0.1
0
−0.1
2000
4000
6000
8000
[Samples]
10000
12000
14000
16000
12000
14000
16000
Phase Deviation
[Deg]
0
SARCO20-44-48.eps
95 48 mm
−1
−2
2000
4000
6000
8000
[Samples]
10000
Phase Deviation Modulated on 997 Hz Sine
[dB]
−100
−150
2
3
10
10
4
10
Figure 7.3: Digital PLL 20 bit, 44.1 kHz ! 48 kHz
[Hz]
Frequency Deviation
[LSB_16]
2
0
−2
50
100
150
200
250
[Samples]
300
350
400
450
500
Phase Deviation
[Deg]
0.5
PROTO-40-48-47.eps
95 48 mm
0
−0.5
2000
4000
6000
8000
[Samples]
10000
12000
14000
16000
Phase Deviation Modulated on 997 Hz Sine
[dB]
−100
−150
2
3
10
4
10
10
Figure 7.4: Frequency counter 16 bit, 48 kHz ! 47.9 kHz
[Hz]
Frequency Deviation
[LSB_16]
2
0
−2
50
100
150
200
250
[Samples]
300
350
400
450
500
Phase Deviation
[Deg]
0.5
SARCO-48-47.eps
95 48 mm
0
−0.5
2000
4000
6000
8000
[Samples]
10000
12000
14000
16000
Phase Deviation Modulated on 997 Hz Sine
[dB]
−100
−150
2
3
10
4
10
10
Figure 7.5: Digital PLL 16 bit, 48 kHz ! 47.9 kHz
[Hz]
Frequency Deviation
[LSB_16]
0.1
0
−0.1
2000
4000
6000
8000
[Samples]
10000
12000
14000
16000
12000
14000
16000
Phase Deviation
[Deg]
0.5
SARCO20-48-47.eps
95 48 mm
0
−0.5
2000
4000
6000
8000
[Samples]
10000
Phase Deviation Modulated on 997 Hz Sine
[dB]
−100
−150
2
3
10
10
4
10
Figure 7.6: Digital PLL 20 bit, 48 kHz ! 47.9 kHz
[Hz]
The output of the PLL frequency tracking unit varies only by one LSB.
Due to the lowpass characteristic of the PLL these variations are restricted to
low frequencies. Figure 7.3 shows a measurement for fsource = 44:1 kHz and
fsink = 48:0 kHz. During the measurement interval of 16k values the 20-bit
LSB ( = 0:06 16-bit LSB) switches only once. Because the frequency ratio is
in general not a power of 2 it cannot be exactly represented by a unique binary
number, but it is approximated by an appropriate duty cycle of the LSB. Since
the LSB of the frequency ratio changes only at low frequencies, its integral –
the phase deviation – may reach large values as shown in the middle trace of
Figure 7.3. This phase deviation leads to a widening of the signal spectrum
below ,120 dB as shown in the bottom trace.
Figure 7.2 shows the measurements corresponding to Figure 7.3, but with
M [m] reduced to 16 bit. The duty cycle of the (16-bit) LSB is 15 : 1 as shown
in the upper trace. The resulting phase deviation has a sawtooth shape with
a frequency of approximately 48 kHz=15 = 3:2 kHz, which leads to spectral
lines at i.e. (0:997 3:2) kHz for a sine wave at 997 Hz, as shown in the bottom
trace of Figure 7.2. Note that the spectral line at (0:997 , 3:2) kHz is aliased
to 2:2 kHz.
In the plesiochronous case shown in Figures 7.5 and 7.6 the difference
frequency of the sample rates (100 Hz) is not totally suppressed by the lowpass
of the PLL, but only attenuated. This leads to sidebands at multiples of 100 Hz.
These sidebands cause only a minor degradation of the audio quality, since
they are all below ,110 dB and will additionally be masked by the full-scale
997 Hz signal.
The performance of the digital PLL discussed in this section could even be
improved by carefully dithering the computed sink phase 'sink to distribute
the quantization error over the full spectrum. For many implementations the
inherent dither caused by the jitter of the sampling clocks may be sufficient.
7.2
System Measurements
In this section measurements of the performance of SARCO using the digital
PLL (FPGA) for frequency measurement are shown. Additionally, measurements of a commercial realization (AD1890 [Dev93]) and of SARCO using the
frequency counter are given in appendix D for comparison purposes.
All the following measurements have been carried out using an undithered
18 bit input signal and follow the concepts outlined in Chapter 6. The source
and the sink sample rate have been derived from separate quartz oscillators
with the a nominal frequency of 50 MHz.
16 384 sink samples are acquired, multiplied by the complex 4-term
Blackman-Harris (BH4C), and Fourier transformed. The spectra of 16 consecutive measurements are averaged (Avg = 16). The ‘notch filter’ for the
THD+N calculations is realized in frequency domain either as a 16 bin wide
brick-wall filter (THD+N = ,107:4 dB in Fig. 7.7) or by using a filter with an
‘electrical Q’ of 5 (THD+N = ,107:7 dB).
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -107.4dB; -107.7dB, Avg = 16
0
-20
-40
-60
Sarco-44k-48k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
Figure 7.7: SARCO + FPGA: 0 dB 997 Hz, 44.1
4
10
! 48.0 kHz
Figure 7.7 shows the resulting spectrum for sample-rate conversion from
44.1 kHz to 48.0 kHz. The THD+N almost reaches the ideal value of a requantized 18-bit signal! The THD+N value is 3 dB above the ideal value (Fig 6.5,
THD+N = ,110:5 dB) due to the 18-bit output quantization of SARCO.
The error caused by the coefficient interpolation is below the noise floor for
a signal frequency of 1 kHz (compare to Figure 3.8, L1 = L2 = 256, Eq. 3.6),
but the hold operation causes error components up to ,128 dB (Eq. 3.10). For
the measurement in Figure 7.7 they do not appear in the audio band, but if
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -106.7dB; -107.2dB, Avg = 16
0
-20
-40
-60
Sarco-48k-44k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, ser_44k_16k_async, SARCO
4
10
Figure 7.8: SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 15000Hz, 0dB, THD+N = -94.0dB; -101.1dB, Avg = 16
0
-20
-40
-60
Sarco-48k-44k-15k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_15000_0dB_18_, ser_44k_16k_async, SARCO
4
10
Figure 7.9: SARCO + FPGA: 0 dB 15 kHz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = -103.5dB; -107.8dB, Avg = 16
0
-20
-40
-60
Sarco-48k-47k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, ser_47k_16k_sync, SARCO
4
10
Figure 7.10: SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 47.9 kHz
16384 FFT, BH4C, 997Hz, -20dB, THD+N = -106.1dB; -106.5dB, Avg = 16
0
-20
-40
-60
Sarco-32k-31k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_32k_997_-20dB_18_, par_31k_16k_async, SARCO
4
10
Figure 7.11: SARCO + FPGA: ,20 dB 997 Hz, 32.0 ! 31.96 kHz
source and sink sample rates are exchanged they appear at frequencies around
20 kHz (Fig. 7.8). Since the error components caused by the hold operation are
aliased from their original frequency k L fsource fsig into the baseband,
their frequency depends sensitively on the exact values of fsource and fsink .
The amplitude of the error caused by the hold operation scales linearly with
the signal frequency. If the signal frequency is increased from 1 kHz to 15 kHz
the error components rise by 24 dB (= 20 log 15=1). Figure 7.9 shows the
resulting spectrum for a 15 kHz signal converted from 48 kHz to 44.1 kHz. For
a signal frequency of 15 kHz the error components caused by the coefficient
interpolation become also visible with an amplitude below ,113 dB (Eq. 3.9).
Note that both error sources could be drastically reduced by increasing the
interpolation factors L1 and L2 .
Figure 7.10 shows the artifacts caused by the frequency tracking in the
plesiochronous case. Sidebands with a modulation frequency of 100 Hz appear
113 dB below the signal peak. The phase noise caused by the non-ideal
frequency tracking scales proportionally to the signal amplitude. Figure 7.11
shows the spectrum of a ,20 dB signal converted from 32 kHz to 31.96 kHz
resulting in a reduced (absolute) amplitude of the modulated phase error.
For plesiochronous sample rates derived from crystal oscillators small
artifacts may result from the finite precision frequency tracking. However
short-time jitter of the sampling clocks is suppressed by the lowpass characteristic of the PLL. In Figure 7.12 the closed-loop transfer function and thus
the jitter attenuation is replotted from Figure 4.12. Clock jitter above 20 Hz is
attenuated by 50 dB; jitter above 300 Hz even by over 100 dB.
Jitter Gain [dB]
0
−50
Jitterreject.eps
108 36 mm
−100
0
10
1
2
10
10
3
10
Frequency [Hz]
Figure 7.12: Jitter rejection on sampling clocks
To measure the jitter rejection capabilities of the sample-rate converter the
source sampling clock fsource is generated by a signal generator which allows
to modulate the output frequency by an external signal (Fig. 7.13).
Source
Samples
Sample-rate converter
JitterMeasure.epsi
74 34 mm
D/A
f source
(jitter)
f sink
Signal
Generator
Figure 7.13: Jitter measurement set-up
The source samples and the (jittered) sampling clock are fed to a D/Aconverter and the spectrum is displayed using a HP35670A dynamic signal
analyzer. Figure 7.14 shows the spectrum of a 997 Hz signal with sinusoidal
50 Hz jitter modulated on the source sampling clock. In Figure 7.15 the
spectrum of the sink signal is shown. The 50 Hz jitter is attenuated by 70 dB
as expected from Figure 7.12. For the measurements in Figures 7.16 and 7.17
triangular 400 Hz jitter at ,40 dB was modulated on the source sampling clock.
Since 400 Hz jitter is attenuated by more than 100 dB it is suppressed below
the 18-bit noise floor in Figure 7.17. Additional jitter measurements are given
in Appendix D.
7.3
Discussion
In this chapter we have presented measurements of the frequency tracking unit
and the entire sample-rate converter and identified the source of the different
error components. In spite of all these non-idealities, the overall conversion
quality is very high. A total harmonic distortion and noise (THD+N) value below ,106 dB (,94 dB) has been reached for 1 kHz (15 kHz) full-scale signals.
For signals with jittered sampling clocks the quality is even improved!
If we take into account the decreasing sensitivity of the human ear towards
higher frequencies and the lower signal amplitude of realistic audio signals
the performance decrease for audio frequencies above 10 kHz is of minor
importance. Informal hearing test revealed no audible artifacts.
As pointed out in Chapter 3 (Fig. 3.8), the conversion quality can be
Source-jitter-50Hz-20dB.epsi
110 62 mm
Figure 7.14: Analog source signal with ,20 dB 50 Hz sinusoidal jitter of the
source sampling clock (expanded scale due to limited resolution
of HP35670A Dynamic Signal Analyzer)
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −87.4dB; −107.0dB, Avg = 16
0
−20
−40
[dB]
−60
SARCO-jitter-50Hz-20dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
4
10
Figure 7.15: SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with
50 Hz sinusoidal jitter of the source sampling clock
,20 dB
Source-jitter-400Hz-40dB.epsi
110 62 mm
Figure 7.16: Analog source signal with ,40 dB 400 Hz triangular jitter of the
source sampling clock (expanded scale due to limited resolution
of HP35670A Dynamic Signal Analyzer)
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −107.0dB; −107.5dB, Avg = 16
0
−20
−40
[dB]
−60
SARCO-jitter-400Hz-40dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
4
10
Figure 7.17: SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with
400 Hz triangular jitter of the source sampling clock
,40 dB
enhanced to 20 bit by increasing the interpolation factors L1 and L2 to 512 and
2048 respectively.
−90
−95
−100
[dB]
−105
Error2.eps
108 73 mm
−110
−115
−120
−125
−130
100
1000
10000
[Hz]
Figure 7.18: Total error contributions of (unquantized) interpolation/decimation filter for sinusoidal audio signals: SARCO ( ),
SARCO18 (, ,), SARCO20 (, , ,)
In Figure 7.18 the error contributions caused by the hold operation, the
coefficient interpolation and the finite stopband attenuation are (re)plotted for
three different sets of L1 and L2 : SARCO (L1 = L2 = 256); SARCO18 using
the same basic filter, but with higher coefficient interpolation (L1 = 256,
L2 = 1024); SARCO20 redesigned for full 20-bit performance. Note that the
plots in Figure 7.18 do not include the degradation caused by the quantization
of the filter coefficients and in the finite word-length datapath.
8
Conclusions
In this thesis the concepts for fully digital sample-rate conversion of stereo
audio signals between arbitrary sample rates have been covered. It has been
shown that a converter with 20-bit audio quality over the full audio band is
feasible.
Conceptually, the process of converting the digital source signal into the
analog domain by a D/A converter and resampling it at the sink sample rate
by a A/D converter is imitated in the digital domain by a high-order interpolation/decimation process. The nearly ideal reconstruction filter, which transmits
the baseband unaltered but suppresses totally the higher order images, can not
be realized efficiently directly, but is approximated by a three-stage digital
filter: high-order FIR filter, interpolator, and zero-order hold. A trade-off
between the target quality, the memory requirements and the computational
cost must be found for each implementation.
It has been shown that digital lowpass filters of order up to 30 000 can be
directly synthesized using a filter design program that has been optimized for
(very) high-order filters.
A 18-bit prototype ASIC for sample-rate conversion of audio signals by arbitrary ratios has been developed and fabricated in a standard cell and macrocell
technique using a 1:2m double metal CMOS technology. It contains 230 000
transistors, including 14 kbit of static RAM and an 18 22 bit pipelined mul107
tiplier. To improve the conversion quality for plesiochronous sample rates a
digital PLL has been implemented in an FPGA and interfaced to the ASIC.
Short-time jitter of the sampling clocks is suppressed by the loop filter of the
PLL. The adaptive bandwidth of the loop filter allows for vary-speed operation, where fast tracking is required. The ASIC is therefore well suited to
synchronize an input with jittered sampling clock to the master clock of a
digital audio system.
In this work we concentrated on the effects caused by digital signal processing and identified the sources of the different error components, but did not
evaluate the results concerning psycho-acoustic phenomena. However, informal hearing tests have been carried out. Non of the involved persons could
distinguish the sound of original CD signal from the audio signal converted
by the ASIC.
An all-digital measurement system has been built for the characterization
of the sample-rate converter in the digital domain. A generator applies in realtime a sequence of 24-bit audio samples to the circuit and the analyzer acquires
256k sink values, which are then Fourier transformed. Two windows have
been found suitable for spectrum analysis of multirate signals: the complex
4-term Blackman-Harris window (BH4C) and the Kaiser-Bessel window for
= 6 (KB6). Using these windows, accurate spectrum analysis of highquality signals is possible without constraining the allowed signal frequencies
or sample rates.
In this thesis the contribution of various non-idealities to the error in audio
sample-rate conversion has been evaluated and listed. This is not a hint on
the low quality of digital sample-rate conversion, but allows to customize
sample-rate converters for each application according to the required quality
and allowed cost.
The principles of digital sample-rate conversion can be extended towards
other applications as i.e. the conversion of rectangular pixels of a CCD-camera
to square shape for pattern recognition applications.
Appendix
A
List of Symbols and Integrals
:
:
Continuous-time signal
Discrete-time signal
()
[]
Sample rate
Source sample rate
Sink sample rate
Signal frequency
Cutoff frequency
Generator frequency
Analyzer frequency
fsample
fsource
fsink
fsig
fcutoff
fgen
fanl
Impulse response
Transfer characteristic
Interpolation filter
Decimation filter
Adaptive filter
h(:)
H (:)
HL
HM
HLM
Passband ripple
Stopband ripple
Passband edge
Stopband edge
p
s
fp
fs
Modulo division
Integer division
Convolution
?
Source samples
Sink samples
x[n]
y[m]
Interpolation factor
Decimation factor
Time grid
L
M
tunit
Initial phase
Sink phase
Phase difference
'0
'sink [m]
∆'[m]
SINC (x)
sin(x)=(x)
RECT (x)
mod
div
1 if jxj < 0:5
0 otherwise
111
The following definite integrals are used [GR94]:
Z
1 sin(a x)
a x dx
0
2
Z 1
sin(a x)
dx
ax
0
4
Z 1
sin(a x)
dx
a
x
0
6
Z 1
sin(a x)
dx
ax
0
8
Z 1
sin(a x)
dx
ax
0
a [a > 0]
=
1
2
=
1
2
=
1
3
=
11
40
=
151
630
a [a > 0]
a [a > 0]
a [a > 0]
a [a > 0]
(A.1)
(A.2)
(A.3)
(A.4)
(A.5)
The following approximations are used [BK85]:
x + O(x3) [x 1]
sin(n x) x + O (x3) [x 1]
cos(x) 1 + O (x2) [x 1]
sin x
(A.6)
(A.7)
(A.8)
The following summation formulas are used [BK85]:
1
X
1
2
x=1 x
1
X
1
4
x=1 x
1
X
1
6
x=1 x
1
X
1
8
x=1 x
=
=
=
=
2
6
4
90
6
945
8
9450
(A.9)
(A.10)
(A.11)
(A.12)
Appendix
B
Error Caused by Hold Effect
B.1
Sine Wave
Ehold =
for fsig
1
X
j fi fsig ) !2
fi
j fi fsig
fi
sin( j =1
(B.1)
fi we get
ffsigi
Ehold 2 j
j =1
1
X
!2
fsig
=2
fi
2
1
X
1
2
j =1 j
(B.2)
and finally, if we substitute fi by L fsource we get a total of
2
2
f
sig
Ehold 3 L f
source
(B.3)
113
B.2
White Noise
Ehold =
for fsig
Ehold ffi
(B.4)
fi we get
1Z
X
∆
f
+ 2
j =1 ,
f
∆
2
=
with ∆f
!2
sin( ffi )
df
∆f
1Z
1 X j fi+ 2
∆f
∆f
j =1 j fi, 2
=
ffi
1
j
∆f
1
(fi)2
∆f
!2
df = ∆f (f )2 1
i
2
∆f 3
6
34
2 =
1
X
j=
∆f
Z
∆
2
∆
2
(B.5)
2
fi
72
(B.6)
fsource and fi = L fsource we finally get
2
Ehold 72 L2
B.3
f
+ 2
2
f
df
f
j
,
1
1
(B.7)
Pink Noise
Ehold 2 for fsig
Ehold 2 1Z
X
j =1
j fi+20 000 1
6f
j fi+20
!2
sin( ffi )
df
ffi
(B.8)
fi we get
1 Z 20 000
X
j =1
20
ffi
1
6f
j
!2
df = 6 (f )2 2
i
1
X
2
j
j =1
1
Z 20 000
20
f df
(B.9)
6 (f )2
2
i
with fi
=
2
1 (20 000)2 =
6
2
20 000 6 fi
2
(B.10)
L fsource we finally get
Ehold L f
source
10 472
2
(B.11)
Appendix
C
Error Caused by Linear
Interpolation
C.1
Sine Wave
Ehold =
for fsig
1
X
j fi fsig ) !4
fi
j fi fsig
fi
sin( j =1
(C.1)
fi we get
ffsigi
Ehold 2 j
j =1
1
X
!4
fsig
=2
fi
4
1
X
1
4
j =1 j
(C.2)
and finally, if we substitute fi by L fsource we get a total of
4
4
f
sig
Ehold 45 L f
source
(C.3)
117
C.2
White Noise
1Z
X
Ehold =
for fsig
Ehold j =1
!4
sin( ffi )
df
ffi
(C.4)
fi we get
1
∆f
1Z
X
=
∆
f
+ 2
j =1 ,
=
with ∆f
j fi+ ∆2f 1
j fi, ∆2f ∆f
f
∆
2
∆f
ffi
j
!4
df = ∆f (f )4 1
i
1
X
Z
∆
f
+ 2
4
f
df
f
j
,
1
1
4
j=
∆
2
(C.5)
4
4
∆f
=
f
5 16
7200
i
4
∆f 5
1
(fi)4
90
(C.6)
fsource and fi = L fsource we finally get
4
Ehold 7200 L4
C.3
(C.7)
Pink Noise
Ehold 2 for fsig
Ehold 2 1Z
X
j =1
j fi+20 000 1
6f
j fi+20
!4
sin( ffi )
df
ffi
(C.8)
fi we get
1 Z 20 000
X
j =1
20
ffi
1
6f
j
!4
df = 6 (f )4 2
i
1
X
j
1
j=
1
4
Z 20 000
20
f 3 df
(C.9)
6 (f )4
2
i
with fi
=
4
1 (20 000)4 =
90 4
1
1080
20 000 fi
4
(C.10)
L fsource we finally get
Ehold L f
source
10 960
4
(C.11)
Appendix
D
Performance Measurements
121
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -105.8dB; -106.1dB, Avg = 16
0
-20
-40
-60
AD1890-44k-48k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, AD1890
4
10
Figure D.1: AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -95.3dB; -103.9dB, Avg = 16
0
-20
-40
-60
PROTO-44k-48k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_44k_997_0dB_18_, par_48k_16k_async, PROTO
4
10
Figure D.2: SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -107.4dB; -107.7dB, Avg = 16
0
-20
-40
-60
Sarco-44k-48k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
4
10
Figure D.3: SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -105.8dB; -106.0dB, Avg = 16
0
-20
-40
-60
AD1890-48k-44k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, ser_44k_16k_async, AD1890
4
10
Figure D.4: AD1890: 0 dB 997 Hz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -95.0dB; -104.3dB, Avg = 16
0
-20
-40
-60
PROTO-48k-44k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, par_44k_16k_async, PROTO
4
10
Figure D.5: SARCO: 0 dB 997 Hz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 996Hz, 0dB, THD+N = -106.7dB; -107.2dB, Avg = 16
0
-20
-40
-60
Sarco-48k-44k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, ser_44k_16k_async, SARCO
4
10
Figure D.6: SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 15000Hz, 0dB, THD+N = -95.1dB; -102.4dB, Avg = 16
0
-20
-40
-60
AD1890-48k-44k-15k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_15000_0dB_18_, ser_44k_16k_async, AD1890
4
10
Figure D.7: AD1890: 0 dB 15 kHz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 15000Hz, 0dB, THD+N = -71.2dB; -94.8dB, Avg = 16
0
-20
-40
-60
PROTO-48k-44k-15k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_15000_0dB_18_, par_44k_16k_async, PROTO
4
10
Figure D.8: SARCO: 0 dB 15 kHz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 15000Hz, 0dB, THD+N = -94.0dB; -101.1dB, Avg = 16
0
-20
-40
-60
Sarco-48k-44k-15k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_15000_0dB_18_, ser_44k_16k_async, SARCO
4
10
Figure D.9: SARCO + FPGA: 0 dB 15 kHz, 48.0 ! 44.1 kHz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = -104.7dB; -105.2dB, Avg = 16
0
-20
-40
-60
AD1890-48k-47k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, ser_47k_16k_async, AD1890
4
10
Figure D.10: AD1890: 0 dB 997 Hz, 48.0 ! 47.9 kHz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = -81.9dB; -103.6dB, Avg = 16
0
-20
-40
-60
PROTO-48k-47k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, par_47k_16k_async, PROTO
4
10
Figure D.11: SARCO: 0 dB 997 Hz, 48.0 ! 47.9 kHz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = -103.5dB; -107.8dB, Avg = 16
0
-20
-40
-60
Sarco-48k-47k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_48k_997_0dB_18_, ser_47k_16k_sync, SARCO
4
10
Figure D.12: SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 47.9 kHz
16384 FFT, BH4C, 997Hz, -20dB, THD+N = -109.3dB; -109.5dB, Avg = 16
0
-20
-40
-60
AD1890-32k-31k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_32k_997_-20dB_18_, ser_31k_16k_async, AD1890
Figure D.13: AD1890:
4
10
,20 dB 997 Hz, 32.0 ! 31.96 kHz
16384 FFT, BH4C, 997Hz, -20dB, THD+N = -89.2dB; -106.4dB, Avg = 16
0
-20
-40
-60
PROTO-32k-31k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_32k_997_-20dB_18_, par_31k_16k_async, PROTO
Figure D.14: SARCO:
4
10
,20 dB 997 Hz, 32.0 ! 31.96 kHz
16384 FFT, BH4C, 997Hz, -20dB, THD+N = -106.1dB; -106.5dB, Avg = 16
0
-20
-40
-60
Sarco-32k-31k-1k.eps
102 72 mm
-80
-100
-120
-140
-160
2
3
10
10
sin_32k_997_-20dB_18_, par_31k_16k_async, SARCO
4
10
Figure D.15: SARCO + FPGA: ,20 dB 997 Hz, 32.0 ! 31.96 kHz
16384 FFT, KB6 , 12000Hz, −6dB, THD+N = −6.0dB; −102.4dB, Avg = 16
0
−20
−40
[dB]
−60
AD1890-twin.eps
88 73 mm
−80
−100
−120
−140
−160
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
sin_44k_11000_0dB_18__twin, ser_48k_16k_async, AD1890
Figure D.16: AD1890:
2
4
x 10
,6 dB 11 + 12 kHz, 44.1 ! 48.0 kHz
16384 FFT, KB6 , 11000Hz, −6dB, THD+N = −6.0dB; −92.3dB, Avg = 16
0
−20
−40
[dB]
−60
PROTO-twin.eps
88 73 mm
−80
−100
−120
−140
−160
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
sin_44k_11000_0dB_18__twin, ser_48k_16k_async, PROTO
Figure D.17: SARCO:
2
4
x 10
,6 dB 11 + 12 kHz, 44.1 ! 48.0 kHz
16384 FFT, KB6 , 11000Hz, −6dB, THD+N = −6.0dB; −106.7dB, Avg = 16
0
−20
−40
[dB]
−60
SARCO-twin.eps
88 73 mm
−80
−100
−120
−140
−160
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
sin_44k_11000_0dB_18__twin, ser_48k_16k_async, SARCO
2
4
x 10
Figure D.18: SARCO + FPGA: ,6 dB 11 + 12 kHz, 44.1 ! 48.0 kHz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −56.4dB; −66.9dB, Avg = 16
0
−20
−40
[dB]
−60
AD1890-jitter.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, AD1890
4
10
Figure D.19: AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz with wideband jitter of
the source sampling clock
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −37.5dB; −39.7dB, Avg = 16
0
−20
−40
[dB]
−60
PROTO-jitter.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, PROTO
4
10
Figure D.20: SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz with wideband jitter of
the source sampling clock
Source-jitter.epsi
110 63 mm
Figure D.21: Analog source signal with wideband jitter of the source sampling
clock (expanded scale due to limited resolution of HP35670A
Dynamic Signal Analyzer)
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −103.3dB; −106.8dB, Avg = 16
0
−20
−40
[dB]
−60
SARCO-jitter.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
4
10
Figure D.22: SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with wideband
jitter of the source sampling clock
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −34.1dB; −81.8dB, Avg = 16
0
−20
−40
[dB]
−60
AD1890-jitter-50Hz-20dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, AD1890
4
10
Figure D.23: AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz with
sinusoidal jitter of the source sampling clock
,20 dB 50 Hz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −15.8dB; −25.7dB, Avg = 16
0
−20
−40
[dB]
−60
PROTO-jitter-50Hz-20dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, PROTO
4
10
Figure D.24: SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,20 dB 50 Hz sinusoidal jitter of the source sampling clock
Source-jitter-50Hz-20dB.epsi
110 62 mm
Figure D.25: Analog source signal with ,20 dB 50 Hz sinusoidal jitter of the
source sampling clock (expanded scale due to limited resolution
of HP35670A Dynamic Signal Analyzer)
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −87.4dB; −107.0dB, Avg = 16
0
−20
−40
[dB]
−60
SARCO-jitter-50Hz-20dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
4
10
Figure D.26: SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with
50 Hz sinusoidal jitter of the source sampling clock
,20 dB
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −75.9dB; −75.9dB, Avg = 16
0
−20
−40
[dB]
−60
AD1890-jitter-400Hz-40dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, AD1890
4
10
Figure D.27: AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz with
triangular jitter of the source sampling clock
,40 dB 400 Hz
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −49.9dB; −50.4dB, Avg = 16
0
−20
−40
[dB]
−60
PROTO-jitter-400Hz-40dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, PROTO
4
10
Figure D.28: SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,40 dB 400 Hz triangular jitter of the source sampling clock
Source-jitter-400Hz-40dB.epsi
110 62 mm
Figure D.29: Analog source signal with ,40 dB 400 Hz triangular jitter of the
source sampling clock (expanded scale due to limited resolution
of HP35670A Dynamic Signal Analyzer)
16384 FFT, BH4C, 997Hz, 0dB, THD+N = −107.0dB; −107.5dB, Avg = 16
0
−20
−40
[dB]
−60
SARCO-jitter-400Hz-40dB.eps
102 72 mm
−80
−100
−120
−140
−160
2
3
10
10
sin_44k_997_0dB_18_, ser_48k_16k_async, SARCO
4
10
Figure D.30: SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with
400 Hz triangular jitter of the source sampling clock
,40 dB
List of Figures
:::::::
2.2 Spectrum of example signal : : : : : : : : : : : : : : :
2.3 Pink noise S (f ) = 61f , white noise S (f ) = ∆1f : : : : : :
2.4 Spectrum of a sampled signal with sample rate fsample
2.5 Interpolation by a factor L : : : : : : : : : : : : : : : : :
2.6 Decimation by a factor M : : : : : : : : : : : : : : : : :
2.7 Sample-rate conversion by a fixed ratio M=L : : : : : :
2.8 Sample-rate conversion by a rational ratio M=L : : : :
2.9 UPMODE, fsink > fsource, L > M : : : : : : : : : : : : :
2.10 DOWNMODE, fsink < fsource, L < M : : : : : : : : : : :
2.11 Continuous-time representation y (t) by holding value inbetween samples xˆ [n] : : : : : : : : : : : : : : : : : : :
2.12 Sample-rate conversion by an arbitrary ratio M=L : : :
2.13 Interpolation by L and hold operation : : : : : : : : : :
2.1 D/A conversion followed by A/D conversion
::::::::::
Weighted stopband error : : : : : : : : : : : : : : : : :
Single stage FIR filter followed by hold operation : : : :
7
8
9
9
10
11
12
15
16
17
19
20
20
3.1 Filter with finite stopband attenuation
24
3.2
25
3.3
26
141
:::::::::
FIR filter followed by discrete-time linear interpolation :
3.4 FIR filter followed by linear interpolation
28
3.5
28
3.6 Transfer characteristic of FIR filter, linear interpolator,
and hold operation : : : : : : : : : : : : : : : : : : : : :
29
3.7 FIR filter and linear interpolation followed by hold operation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :
29
3.8 Error contributions vs signal frequency for sinusoidal audio signals : : : : : : : : : : : : : : : : : : : : : : : : :
32
:::::::::::::
3.10 FIR with interpolated impulse response : : : : : : : : :
3.11 Separate interpolation and decimation filters : : : : : :
3.12 Filter specifications : : : : : : : : : : : : : : : : : : : :
3.13 Graphical representation of Table 3.8 : : : : : : : : : :
3.14 Scaling of prototype filter by = 4 using zero padding :
3.15 Comparison of zero-padded prototype ( = 16) and direct synthesis : : : : : : : : : : : : : : : : : : : : : : :
3.16 Relaxed specifications with intermediate stopband : : :
3.17 Filter with relaxed specifications (fsample = 48 kHz) : : :
3.18 FIR filter, N = 26 879, 20 bit coefficients : : : : : : : : :
3.9 FIR filter with linear interpolator
33
34
35
38
41
42
43
44
44
45
3.19 Impulse response of FIR filter using fixed-point coefficients 46
3.20 Interpolated FIR filter, L2
::::::::::::::
3.21 Quantization noise of interpolated FIR filter, L2 = 64,
22 bit : : : : : : : : : : : : : : : : : : : : : : : : : : : : :
3.22 FIR filter, L2 = 64, quantized twice to 22 bit : : : : : : :
3.23 FIR filter, L2 = 64, quantized twice to 22 bit QFP : : : :
=
64 :
47
48
50
50
4.1 Effects of non-uniform sampling on the signal spectrum
::::::::::::::::::::::
4.3 Frequency counter with PLL : : : : : : : : : : : : : : :
4.4 Frequency counter with high-speed reference clock : :
4.5 Digital PLL (DPLL) : : : : : : : : : : : : : : : : : : : : :
4.6 Synchronization with Gray counter : : : : : : : : : : : :
4.7 Open-loop transfer characteristic of the DPLL : : : : :
4.8 Adaptive digital PLL : : : : : : : : : : : : : : : : : : : :
4.9 Simulation of distortion caused by uniform jitter of M :
4.10 Spectrum of moving average filter : : : : : : : : : : : :
4.2 Phase relations
52
52
54
54
55
56
57
58
61
62
4.11 Simulation of frequency counter followed by moving average filter (fsample = 48 kHz, fsig = 997 Hz, k = 7) : : :
64
:::::
65
4.12 Closed-loop transfer characteristic of the DPLL
:::::::::::::::::::::::
Implementation of digital PLL : : : : : : : : : : : : : : :
Datapath: word-lengths for FIR filter : : : : : : : : : : :
Block diagram using DSP and FPGA : : : : : : : : : : :
Floorplan of SARCO : : : : : : : : : : : : : : : : : : : : :
Photomicrograph of SARCO : : : : : : : : : : : : : : : :
Interpolated FIR filter (N = 57855), L2 = 256, 22 bit QFP
5.1 Block diagram
68
5.2
69
5.3
5.4
5.5
5.6
5.7
:::::::::
Spectrum of 4-term Blackman-Harris window (BH4) : :
Spectrum of Kaiser-Bessel window with = 6 (KB6) : :
73
76
78
78
80
6.1 Setup of all-digital measurement system
84
6.2
87
6.3
87
::::
:::::::::::::
6.4 Influence of (bin-)offset on window performance
90
6.5 Sine wave quantized with 18 bit
91
7.1 Frequency counter 16 bit, 44.1 kHz ! 48 kHz
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
7.2 Digital PLL 16 bit, 44.1 kHz ! 48 kHz : : : : :
:
7.3 Digital PLL 20 bit, 44.1 kHz ! 48 kHz : : : : :
:
7.4 Frequency counter 16 bit, 48 kHz ! 47.9 kHz
7.5 Digital PLL 16 bit, 48 kHz ! 47.9 kHz : : : : :
:
:
7.6 Digital PLL 20 bit, 48 kHz ! 47.9 kHz : : : : :
:
7.7 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz
7.8 SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 44.1 kHz
:
:
7.9 SARCO + FPGA: 0 dB 15 kHz, 48.0 ! 44.1 kHz
:
7.10 SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 47.9 kHz
7.11 SARCO + FPGA: ,20 dB 997 Hz, 32.0 ! 31.96 kHz :
7.12 Jitter rejection on sampling clocks : : : : : : : : : :
7.13 Jitter measurement set-up : : : : : : : : : : : : : : :
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
96
96
96
97
97
97
99
100
100
101
101
102
103
7.14 Analog source signal with ,20 dB 50 Hz sinusoidal jitter
of the source sampling clock : : : : : : : : : : : : : : : 104
7.15 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,20 dB
50 Hz sinusoidal jitter of the source sampling clock : : : 104
7.16 Analog source signal with ,40 dB 400 Hz triangular jitter
of the source sampling clock : : : : : : : : : : : : : : : 105
7.17 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,40 dB
400 Hz triangular jitter of the source sampling clock : : : 105
7.18 Total error contributions of (unquantized) interpolation/decimation filter for sinusoidal audio signals : : : : : : : : 106
D.1 AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz
:::::
D.2 SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz : : : : : :
D.3 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz :
D.4 AD1890: 0 dB 997 Hz, 48.0 ! 44.1 kHz : : : : :
D.5 SARCO: 0 dB 997 Hz, 48.0 ! 44.1 kHz : : : : : :
D.6 SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 44.1 kHz :
D.7 AD1890: 0 dB 15 kHz, 48.0 ! 44.1 kHz : : : : :
D.8 SARCO: 0 dB 15 kHz, 48.0 ! 44.1 kHz : : : : : :
D.9 SARCO + FPGA: 0 dB 15 kHz, 48.0 ! 44.1 kHz :
D.10 AD1890: 0 dB 997 Hz, 48.0 ! 47.9 kHz : : : : :
D.11 SARCO: 0 dB 997 Hz, 48.0 ! 47.9 kHz : : : : : :
D.12 SARCO + FPGA: 0 dB 997 Hz, 48.0 ! 47.9 kHz :
D.13 AD1890: ,20 dB 997 Hz, 32.0 ! 31.96 kHz : :
D.14 SARCO: ,20 dB 997 Hz, 32.0 ! 31.96 kHz : : :
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
D.15 SARCO + FPGA: ,20 dB 997 Hz, 32.0 ! 31.96 kHz :
D.16 AD1890: ,6 dB 11 + 12 kHz, 44.1 ! 48.0 kHz : : : :
D.17 SARCO: ,6 dB 11 + 12 kHz, 44.1 ! 48.0 kHz : : : : :
D.18 SARCO + FPGA: ,6 dB 11 + 12 kHz, 44.1 ! 48.0 kHz
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
122
122
123
124
124
125
126
126
127
128
128
129
130
130
131
132
132
133
D.19 AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz with wideband
jitter of the source sampling clock : : : : : : : : : : : : : 134
D.20 SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz with wideband jitter
of the source sampling clock : : : : : : : : : : : : : : : 134
D.21 Analog source signal with wideband jitter of the source
sampling clock : : : : : : : : : : : : : : : : : : : : : : : 135
D.22 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with wideband jitter of the source sampling clock : : : : : : : : : 135
D.23 AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,20 dB 50 Hz
sinusoidal jitter of the source sampling clock : : : : : : : 136
D.24 SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,20 dB 50 Hz
sinusoidal jitter of the source sampling clock : : : : : : : 136
D.25 Analog source signal with ,20 dB 50 Hz sinusoidal jitter
of the source sampling clock : : : : : : : : : : : : : : : 137
D.26 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,20 dB
50 Hz sinusoidal jitter of the source sampling clock : : : 137
D.27 AD1890: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,40 dB 400 Hz
triangular jitter of the source sampling clock : : : : : : : 138
D.28 SARCO: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,40 dB 400 Hz
triangular jitter of the source sampling clock : : : : : : : 138
D.29 Analog source signal with ,40 dB 400 Hz triangular jitter
of the source sampling clock : : : : : : : : : : : : : : : 139
D.30 SARCO + FPGA: 0 dB 997 Hz, 44.1 ! 48.0 kHz with ,40 dB
400 Hz triangular jitter of the source sampling clock : : : 139
List of Tables
2.1 Error caused by hold operation (fsource
:::
21
3.1 Transition bandwidth for common sampling frequencies
24
=
48 kHz)
3.2 Estimated filter order
27
3.3
:::::::::::::::::::
Error caused by linear interpolation (fsource = 48 kHz) :
28
3.4 Error contributions from interpolation and finite stopband
attenuation (white noise signal) : : : : : : : : : : : : : :
31
::::::::::
Coefficient interpolation and filter order :
Filter Requirements : : : : : : : : : : :
Runtime and deviation depending on :
3.5 Estimated filter order
3.6
3.7
3.8
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
34
35
39
41
3.9 Stopband attenuation depending on coefficient precision
46
:::
49
3.10 Quantization noise caused by second quantization
5.1 Implementation alternatives for frequency tracking
70
5.2 Datapath width in bits
:::
:::::::::::::::::::
74
6.1 Average noise floor level of a sine wave with k -bit quantization using a N -point FFT with BH4 window : : : : :
88
147
Curriculum Vitae
I was born in Thalwil, Switzerland, on August 24, 1965. After finishing
high school at the Kantonsschule R¨amib¨uhl Z¨urich (Matura Typ C) in 1984, I
enrolled in Electrical Engineering at the Swiss Federal Institute of Technology
ETH Z¨urich. I received a M.Sc. in Electrical Engineering (Dipl. El.-Ing.
ETH) in 1990 and joined then the Integrated Systems Laboratory of the ETH,
where I worked as a research and teaching assistant in the ASIC design and test
group. Besides the work on sample-rate conversion presented in this thesis, my
second main project was the evaluation of template matching algorithms and
the realization of an ASIC for real-time 2D correlation of gray-scale images
for an industrial process automation system.
149
Bibliography
[Ada93] Robert Adams. Jitter analysis of asynchronous sample-rate conversion. In Proceedings of the 95th Convention of the AES,
Preprint # 3712. Audio Engineering Society, 1993.
[AK93] Robert Adams and Tom Kwan. Theory and VLSI architectures for
asynchronous sample-rate converters. Journal of the Audio Eng.
Soc. (AES), 41(7/8):539 – 555, July/August 1993.
[Ant93] A. Antoniou. Digital Filters: Analysis, Design, and Applications.
McGraw-Hill Book Company, second edition, 1993.
[AW85] John W. Adams and Alan N. Willson. On the fast design of highorder FIR digital filters. IEEE Transactions on Circuits and Systems,
32(9):958 – 960, September 1985.
[Ben88] K. Blair Benson. Audio engineering handbook. McGraw-Hill Book
Company, first edition, 1988.
[Bes93] Roland E. Best. Phase-locked loops. McGraw-Hill Book Company,
second edition, 1993.
[BK85] I.N. Bronstein and K.A.Semendjajew. Taschenbuch der Mathematik.
Verlag Harri Deutsch, 22nd edition, 1985.
[Cab91] Richard Cabot. AES standard method for digital audio engineering –
measurement of digital audio equipment. Journal of the Audio Eng.
Soc. (AES), 39(12):962 – 975, December 1991.
[CDPS91] S. Cucchi, F. Desinan, G. Parladori, and G. Sicuranza. DSP implementation of arbitrary sampling frequency conversion for high
quality sound applications. In Proc. of the Int. Conf. on Acoustics,
Speech and Signal Processing, pages 3609 – 3612, Vol. 5, Toronto,
1991.
151
[CR81] Ronald E. Crochiere and Lawrence R. Rabiner. Interpolation and
decimation of digital signals - a tutorial review. In Proceedings of
the IEEE, pages 300–331, 1981.
[CR83] Ronald E. Crochiere and Lawrence R. Rabiner. Multirate Digital
Signal Processing. Prentice Hall, Englewood Cliffs, New Jersey
07632, 1983.
[dD89] Richard C. den Dulk. An approach to systematic phase-lock loop
design. PhD thesis, TU Delft, 1989.
[Der87] F. Deravi. Design issues in FFT test systems for high performance
converters. Colloquium on ’Advanced A/D Conversion Techniques’,
15. April 1987.
[Dev93] Analog Devices. Stereo asynchronous sample rate converters
AD1890/AD1891. Datasheet, 1993.
[dGL94] R. de Gaudenzi and M. Luise. Audio and video digital radio broadcasting systems and techniques. Elsevier science B.V., Amsterdam,
1994.
[Fet90] A. Fettweis. Elemente nachrichtentechnischer Systeme. B. G. Teubner, Stuttgart, 1990.
[GR94] I.S. Gradshteyn and I.M. Ryzhik. Table of integrals, series, and
products. Academic Press, 1994.
[Har78] Francis J. Harris. On the use of windows for harmonic analysis with
the discrete fourier transform. Proceedings of the IEEE, 66(1):51 –
83, January 1978.
[Har90] Steven Harris. The effects of sampling clock jitter on nyquist sampling analog-to-digital converters, and on oversampling delta-sigma
ADCs. Journal of the Audio Eng. Soc. (AES), 38(7/8):537 – 542,
July 1990.
[HB86] Joel M. Halbert and R. Allan Belcher. Selection of test signals for
DSP-based testing of digital audio systems. Journal of the Audio
Eng. Soc. (AES), 34(7/8):546 – 555, July 1986.
[JTG+ 94] J. Janssen, D. Therssen, J. Van Ginderdeuren, L. Van Paepegem,
P. Van Lierop, Z.L. Wu, and H. Verhoeven. A new principle/IC
for audio sampling rate conversion. In Proceedings of the 96th
Convention of the AES, Preprint # 3807. Audio Engineering Society,
1994.
[Lag82] Roger Lagadec. Digital sampling frequency conversion. In Digital
Audio, Collected Papers from the AES Premiere Conference, New
York, pages 90 – 96. Audio Engineering Society, 1982.
[LK81] Roger Lagadec and Henry O. Kunz. An universal, digital sampling
frequency converter for digital audio. In Proc. of the Int. Conf. on
Acoustics, Speech and Signal Processing, pages 595 – 598, Vol. 2,
Atlanta, 1981.
[LPW82] Roger Lagadec, Daniele Pelloni, and Daniel Weiss. A 2-channel,
16-bit digital sampling frequency converter for professional digital
audio. In Proc. of the Int. Conf. on Acoustics, Speech and Signal
Processing, pages 93 – 96, Vol. 1, Paris, 1982.
[LS91] Gunnar Lehtinen and Andreas Steiner. Digital audio sampling converter. Master’s thesis, Integrated Systems Lab, Swiss Federal Institute of Technology (ETH Z¨urich), 1991.
[LWV92] Stanley P. Lipshitz, Robert A. Wannamaker, and John Vanderkooy.
Quantization and dither: A theoretical survey. Journal of the Audio
Eng. Soc. (AES), 40(5):355 – 375, May 1992.
[Mar93] Robert J. Marks II. Advanced Topics in Shannon Sampling and
Interpolation Theory. Springer-Verlag New York, 1993.
[Mat88] Matlab. Matlab user’s manual. MathWorks, Inc., Natick, MA, 1988.
[Mot88] Motorola. DSP56000 digital signal processor user’s manual, 1988.
[MPR73] J.H. McClellan, T.W. Parks, and L.R. Rabiner. A computer program
for designing optimum FIR linear phase digital filters. IEEE Trans.
Audio Electroacoust., AU-21(6):506 – 526, December 1973.
[MS93] D. M¨uller and Ch. Siegrist. Frequency measurement for audio sampling rate conversion. Semesterarbeit, 1993. Integrated Systems
Lab, Swiss Federal Institute of Technology (ETH Z¨urich).
[Pel82] Daniele Pelloni. Der einstufige Abtastratenumsetzer. Unpublished,
1982.
[PHR91] Sangil Park, Garth Hillman, and Roman Robles. A novel structure
for real-time digital sample-rate converters with finite precision error
analysis. In Proc. of the Int. Conf. on Acoustics, Speech and Signal
Processing, pages 3613 – 3616, Vol. 5, Toronto, 1991.
[Rab73] L.R. Rabiner. Approximate design relationships for lowpass FIR
digital filters. IEEE Trans. Audio Electroacoust., AU-21:456 – 460,
October 1973.
[Ram82] Tor A. Ramstad. Sample-rate conversion by arbitrary ratios. In
Proc. of the Int. Conf. on Acoustics, Speech and Signal Processing,
pages 101 – 104, Vol. 1, Paris, 1982.
[Ram84] Tor A. Ramstad. Digital methods for conversion between arbitary
sampling frequencies. IEEE Transactions on Acoustics, Speech, and
Signal Processing, 32(3):577 – 591, June 1984.
[RF94] F. Rothacher and N. Felber. VLSI implementation of a fully digital
asynchronous audio sample-rate converter. In Proceedings of the
96th Convention of the AES, Preprint # 3832. Audio Engineering
Society, 1994.
[RG75] Lawrence R. Rabiner and Bernard Gold. Theory and Application
of Digital Signal Processing. Prentice Hall, Englewood Cliffs, New
Jersey 07632, 1975.
[RW90] Fritz Rothacher and Andreas Wieland. DIDI: Audio samplingfrequency converter for arbitrary changing ratios. Master’s thesis,
Integrated Systems Lab, Swiss Federal Institute of Technology (ETH
Z¨urich), 1990.
[Sti92] Eduard F. Stikvoort. Some Subjects in Digital Audio, Noise Shaping, Sampling-Rate Conversion, Dynamic Range Compression and
Testing. PhD thesis, TU Eindhoven, 1992.
[VL84] John Vanderkooy and Stanley P. Lipshitz. Resolution below the least
significant bit in digital systems with dither. Journal of the Audio
Eng. Soc. (AES), 32(3):106 – 113, March 1984. Corrections in JAES
32(11):889.