Karl Hiner

奔跑的瓦力 2018-08-29

展开全文

This set of Jupyter notebooks (raw Github link) covers the second book in Julius O. Smith III's excellent four-book series on audio DSP, Introduction to Digital Filters: with Audio Applications. This book takes the concepts from Mathematics of the DFT and runs with it, getting deep into digital filter theory, analysis, design and applications.

The difficulty level goes up quite a bit here - lots of complex analysis tricks are used to derive results and a substantial amount of algebraic manipulation and creativity in the complex domain is used throughout the book, and required to solve the exercises.

Analytic vs graphical approaches to understanding filters

Readers of this blog will know that I love visualizations, and in this domain I especially found that visually exploring the concepts really helped to develop my intuition. It's easy to get bogged down trying to derive exact analytic solutions for a filter's response. I spent many maddening/rewarding hours desperately trying random trig identities for half a page to try and get some closed solution, only to find at the end that it could have been done in a few steps by using some complex algebra trick.

However, when you step away from the symbolic and analytic approach and just start moving the poles and zeros of a filter around and seeing what happens, the intimidation wears off a bit and it starts to feel within the realm of possibility to maybe actually design a filter response using some pretty intuitive visual reasoning.

Designing a second-order filter to reject 60 Hz hum

For example, say we want to design a filter (as we are asked to do in the first exercise of Chapter 8) to attenuate 60 Hz hum, given a sampling rate of 40 KHz. I'll walk through my more analytic attempt at the problem, and then I'll give a visual explanation of a more canonical notch filter that cuts past a lot of analytic hair-pulling. If you'd like you can also just skip to the graphical approach.

An analytic approach

Note: For brevity I'm going to assume some notational and terminology knowledge in this section. In the next section, I will back up and explain things more fully :)

We want our filter to have a zero at 60 Hz. For now, let's consider only FIR filters (with no feedback component). We are given the constraint that the filter must be second-order. The factored form of a second-order FIR filter is

$H (z) = B (z) = (1 - q_{1} z^{- 1}) (1 - q_{2} z^{- 1}) = 1 - (q_{1} + q_{2}) z^{- 1} + q_{1} q_{2} z^{- 2}$ ,

where $q_{1}$ and $q_{2}$ are the zeros of the frequency response. If we want the amplitude response to equal zero at normalized frequencies $ω_{r 1}$ and $ω_{r 2}$ , we can simply choose complex filter coefficients such that $\frac{q_{i}}{z} = 1$ when $z = e^{j ω_{r i}} ⟹ q_{i} = e^{j ω_{r i}}$ .

Here is an example for clarity:

In [4]:

from scipy.signal import freqz

In [5]:

# Pick a couple of arbitrary frequencies to reject.
w1_reject = 0.8
w2_reject = 2.2

N = 512
q1 = np.exp(1j * w1_reject)
q2 = np.exp(1j * w2_reject)

B = [1, -(q1 + q2), q1 * q2]; A = [1]
w, h = freqz(B, A, worN = N)
plt.plot(w, np.abs(h))
for w_reject in [w1_reject, w2_reject]:
    plt.axvline(x=w_reject, linestyle='--', c='k')
plt.grid(True)

If we only want to reject a single frequency (as in this problem), we can set $q_{1} = q_{2} = q = e^{j ω_{r i}}$ , making our expanded general factored transfer function

$H (z) = 1 - 2 q z^{- 1} + q^{2} z^{- 2}$ .

In [6]:

# Only reject a single frequency with a second-order FIR filter with complex coefficients.
w_reject = 2.2

N = 512
q = np.exp(1j * w_reject)

B = [1, -2*q, q ** 2]; A = [1]
w, h = freqz(B, A, worN = N)
plt.plot(w, np.abs(h))
plt.axvline(x=w_reject, linestyle='--', c='k')
plt.grid(True)

We could technically be done here since the problem doesn't stipulate any more constraints, nor does it stipulate how wide the passband should be. However, I don't think this is what the question is after.

If we want a second-order FIR filter with a real coefficient $b_{1}$ (and $b_{2} = 1$ ), such that the amplitude response is equal to $0$ for normalized frequency $ω$ , we set the amplitude response to $0$ and solve for $b_{1}$

$\begin{aligned} G (ω) = | H (e^{j ω T}) | & = | 1 + b_{1} e^{- j ω T} + e^{- j 2 ω T} | \\ 0 & = | e^{j 2 ω T} + b_{1} e^{j ω T} + 1 | \end{aligned}$

The only way for the magnitude of a complex number to be $0$ is if the complex number itself is $0$ . I know it's obvious but I needed to think about this for a while and found out the hard way that this works cleanly after solving with trig by expanding using Euler's identity. Here's a much simpler way:

$\begin{aligned} 0 & = e^{j 2 ω T} + b_{1} e^{j ω T} + 1 \\ - e^{j 2 ω T} - 1 & = b_{1} e^{j ω T} \\ - \frac{e^{j 2 ω T} + 1}{e^{j ω T}} & = b_{1} \\ - (e^{j ω T} + e^{- j ω T}) & = b_{1} \\ - 2 \cos (ω T) & = b_{1} \end{aligned}$

We are given the sampling interval $T = \frac{1}{f_{s}} = \frac{1}{40 k} s$ .

When $f = 60 H z$ ,

$\begin{aligned} b_{1} & = - 2 \cos (ω T) \\ = - 2 \cos (2 π \frac{60}{40 k}) \\ = - 2 \cos (\frac{3 π}{1000}) \end{aligned}$

Finally, our naive second-order 60 Hz hum-reject-filter for a sampling rate of $f_{s} = 40$ Khz is

$y (n) = x (n) - 2 \cos (\frac{3 π}{1000}) x (n - 1) + x (n - 2)$ .

In [7]:

fs = 40_000
B = [1, -2*np.cos(3 * np.pi / 1000), 1]

H = np.fft.fft(B, n=fs)

f = np.arange(0, fs // 2)
plt.axvline(x=60, linestyle='--', c='k', label='Target reject frequency (60 Hz)')
plt.loglog(f, np.abs(H[:fs//2]))
plt.legend()
plt.grid(True)

How is the frequency response not ideal? What is a simple way to obtain a better frequency response at the price of higher filter order? (Hint: Consider $lim_{M \to \infty} {| \cos (θ_{c}) |}^{M}$ .)

We can see from the plot (and from the equation) that the response is nowhere near flat outside of the "stopband" (nor is it flat anywhere in the spectrum).

Let's get a grip on the hint we're given:

In [8]:

theta = np.linspace(0, 2 * np.pi, 1000)
_ = plt.plot(theta, 1 - np.abs(np.cos(theta)) ** 1000)

This tells us that $1 - {| \cos (θ_{c}) |}^{M}$ behaves as what looks like an ideal reject response when $θ_{c} = π$ as $M \to \infty$ . We can choose any frequency to "notch" here by subtracting it from $θ_{c}$ above. That is, we should be pretty happy with an amplitude response that looks like

$\begin{aligned} G (ω) & = 1 - {| \cos (ω - ω_{r}) |}^{M} \end{aligned}$

I scratched my head for awhile thinking of how to design a transfer function to match this kind of response. I ended up finding this SO answer that helped. (I will discuss this more in the next section). The constant a in the following implementation controls how close we want to make the frequency response to flat.

In [9]:

fs = 40_000

f_r = 3 * np.pi / 1000
a = 1.01
B = [1, -2*np.cos(f_r), 1]
A = [1, -2 * a * np.cos(f_r), a ** 2]
H = ((1 + a) / 2) * np.fft.fft(B, n=fs) / np.fft.fft(A, n=fs)

f = np.arange(0, fs // 2)
plt.axvline(x=60, linestyle='--', c='k', label='Target reject frequency (60 Hz)')
plt.loglog(f, np.abs(H[:fs//2]))
plt.legend()
plt.grid(True)

A graphical approach

In this case, as mentioned above, I managed to find a Stack Overflow answer that helped me solve this particular problem. But the help came in a fully-formed transfer function,

$\begin{aligned} H (z) & = \frac{1 + a}{2} \frac{(z - e^{j ω_{n}}) (z - e^{- j ω_{n}})}{(z - a e^{j ω_{n}}) (z - a e^{- j ω_{n}})} \\ = \frac{1 + a}{2} \frac{z^{2} - 2 z \cos ω_{n} + 1}{z^{2} - 2 a z \cos ω_{n} + a^{2}} . \end{aligned}$

How did they arrive at this somewhat perplexing transfer function? How would I be able to come up with a similarly clever formula to solve a problem like this? I will get to what I think is a satisfying answer using some geometric intuition after a little background!

Let's start with the definition of a transfer function:

The transfer function of a linear time-invariant discrete-time filter is defined as $Y (z) / X (z)$ , where $Y (z)$ denotes the $z$ transform of the filter output signal $y (n)$ , and $X (z)$ denotes the $z$ transform of the filter input signal $x (n)$ .

Here, $z$ is just any number in the complex plane. However, what we really care about is the frequency response, which is only evaluated on the unit circle in the $z$ plane. Thus we can set $z$ to $e^{j ω T}$ , where $ω ≜ 2 π f$ , $f$ is the frequency in Hz, and $T$ is the sampling interval in seconds.

Now that we only care about the frequency response, we can think of the transfer function in more familiar terms, as a ratio of the DFT of the output to the DFT of the input.

This is a frequency domain interpretation of digital filters. In the time domain, for linear time-invariant filters we are always simply multiplying the current input $x (n)$ , past inputs $x (n - i)$ , and past outputs $y (n - i)$ , by some coefficients $b_{i}$ and $a_{i}$ . That is, any linear time-invariant filter can be formulated as

$y (n) = b_{0} x (n) + b_{1} x (n - 1) + \dots + b_{M} x (n - M) - a_{1} y (n - 1) - \dots - a_{N} y (n - N)$ .

JOS provides a derivation that I won't reproduce here, which uses basic properties of the DFT to unite these equations into a factored form expressed in terms of poles $p_{i}$ and zeros $q_{i}$ (along with some scaling factor $g$ ):

$H (z) = g \frac{(1 - q_{1} z^{- 1}) (1 - q_{2} z^{- 1}) \dots (1 - q_{M} z^{- 1})}{(1 - p_{1} z^{- 1}) (1 - p_{2} z^{- 1}) \dots (1 - p_{N} z^{- 1})}$

Here we care about the amplitude response, $G (ω) ≜ | H (e^{j ω T}) |$ . That is, the amplitude response of a filter is defined as the absolute value of its transfer function evaluated on the unit circle. I will skip yet another derivation to get to the point, which is that we can finally express the amplitude reponse of any linear time-invariant filter as

$G (ω) = | g | \frac{| e^{j ω T} - q_{1} | \cdot | e^{j ω T} - q_{2} | \dots | e^{j ω T} - q_{M} |}{| e^{j ω T} - p_{1} | \cdot | e^{j ω T} - p_{2} | \dots | e^{j ω T} - p_{N} |}$ ,

where, again, $p_{i}$ is a pole of the filter, $q_{i}$ is a zero, and $g$ is a scaling constant.

Remember that transfer function in the Stack Overflow answer? Here it is again:

$H (z) = \frac{1 + a}{2} \frac{(z - e^{j ω_{n}}) (z - e^{- j ω_{n}})}{(z - a e^{j ω_{n}}) (z - a e^{- j ω_{n}})}$

Taking the amplitude response,

$\begin{aligned} G (ω) & ≜ | H (e^{j ω T}) | \\ = | \frac{1 + a}{2} | \frac{| e^{j ω T} - e^{j ω_{n}} | | e^{j ω T} - e^{- j ω_{n}} |}{| e^{j ω T} - a e^{j ω_{n}} | | e^{j ω T} - a e^{- j ω_{n}} |} \end{aligned}$

We finally have the tools to interpret this in a visual, geometric way! Clearly, that $\frac{1 + a}{2}$ term is our scaling factor $g$ . We can safely ignore it since it will scale all frequencies the same amount. That $ω$ term there is a frequency at which we are evaluating the amplitude response of the filter. The $ω_{n}$ term is what the Stack Overflow dude decided to use to express the angle of the placement of the poles and zeros of our notch filter. Thus, the zeros, $q_{1} = e^{j ω_{n}}, q_{2} = e^{- j ω_{n}}$ lie exactly on the unit circle, and the poles, $p_{1} = a e^{j ω_{n}}, p_{2} = a e^{- j ω_{n}}$ , are placed at the same angle as the poles, and then "pushed out" by radius $a$ .

Let's see what these poles and zeros look like in the complex plane, and their effect on the frequency response, as we change $a$ :

notch filter animation showing zeros and poles

As the poles are moved closer to the zeros, the frequency response magnitude increases everywhere except at the angle $ω_{n}$ (and $- ω_{n}$ ) at which we've placed our poles and zeros. Which is exactly what we want!

To understand why this is the case, look back at the amplitude response (removing the constant scaling factor):

$G (ω) = \frac{| e^{j ω T} - e^{j ω_{n}} | | e^{j ω T} - e^{- j ω_{n}} |}{| e^{j ω T} - a e^{j ω_{n}} | | e^{j ω T} - a e^{- j ω_{n}} |}$

Here we can see that the frequency response magnitude at frequency $ω$ is given by the product of vector lengths from the zeros to the point $e^{j ω T}$ on the unit circle, divided by the product of vectors lengths from the poles to that same point. (The phase response is similarly given by the sum of angles of the zero vectors, subtracted by the angles of the pole vectors. But I won't be discussing phase reponse in this post.)

Let's now see that same animation but with these vectors explicitly drawn. The red point in the amplitude response chart is the result of taking the ratio of the product of the orange (numerator) vectors over the product of the blue (denominator) vectors. The frequency response is obtained by finding this ratio for all frequencies (all such points along the unit circle).

notch filter animation showing zeros and poles with vectors lines for numerator and denominator

Since the zeros lie on the unit circle, the response will evaluate to exactly $0$ for $ω = p_{n}$ . As the poles get closer to the zeros, the difference between the product of the blue vectors and the product of the orange vectors gets smaller and smaller, and so their ratio approaches $1$ . However, until they are exactly equal, the orange (zero) vectors will still asymptotically approach $0$ as $ω \to p_{n}$ , while the poles will still have some positive magnitude, resulting in a ratio (response magnitude) close to $0$ (and equal to $0$ when $ω = p_{n}$ ).

With this interpretation we've transformed a somewhat abstract mathy goal (design a filter that rejects 60 Hz hum) into a problem that, at least for me, is much more intuitive: How can we arrange points in a 2D plane such that the ratio of vector-lengths from those points to the unit circle is near zero at the angle corresponding to 60 Hz, but near one everywhere else?

Visualizations for common audio filters

We can deepen our intuition further by moving the poles and zeros of some common filter families, and seeing what effect it has on the amplitude, phase and impulse responses. (The impulse response is the output of the filter when the input is a unit impulse - a single 1 followed by an infinite number of 0's.)