Signal Processing 92 (2012) 170–178 Contents lists available at ScienceDirect Signal Processing journal homepage: www.elsevier.com/locate/sigpro On optimal threshold and structure in threshold system based detector Gencheng Guo, Mrinal Mandal Department of Electrical and Computer Engineering, 2nd Floor, ECERF Building, University of Alberta, Edmonton, AB, Canada T6G 2V4 a r t i c l e in f o abstract Article history: Received 3 May 2010 Received in revised form 27 June 2011 Accepted 30 June 2011 Available online 6 July 2011 Threshold systems are widely used in nonlinear signal detection. The optimal threshold and the optimal structure are crucial to obtain good performance. In this paper, we propose an efficient method to estimate optimal threshold in a threshold system based detector (TD). In addition, we investigate the structural issue of TD and propose a technique to determine the optimal structure. An optimal threshold detector is then implemented based on the optimal structure with the optimal threshold(s). Experiments show that the proposed optimal TD can be fairly compared to the optimal detection with much easier implementation and much less computational complexity. & 2011 Elsevier B.V. All rights reserved. Keywords: Signal detection Optimal threshold detector Optimal threshold Optimal structure 1. Introduction The problems of detecting a known signal in additive noise with known distribution have well-known solutions [1]. Matched filters provide optimal detection in the Gaussian noise. For non-Gaussian noise, the optimal detection performance can be achieved only with a nonlinear system. Given a weak signal in non-Gaussian noise with known probability density function (pdf), a locally optimal (LO) nonlinear detector can be derived and is considered to be asymptotically optimum [1]. However, accurate estimation of the noise pdf is difficult to obtain in many applications because the noise may vary with time or from one application to another. Furthermore, even though the pdf is available, some optimal detectors have high implementation complexity. Therefore, more robust, suboptimal detectors provide a more practical choice [2]. Threshold system (TS) based detector, henceforth referred to as threshold detector (TD), is widely used in many applications as a Corresponding author. Tel.: þ1 780 492 0294; fax: þ1 780 492 1811. E-mail addresses: [email protected] (G. Guo), [email protected] (M. Mandal). URL: http://www.ece.ualberta.ca/~mandal/ (M. Mandal). 0165-1684/$ - see front matter & 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2011.06.016 suboptimal detector [3–6]. These detectors have several advantages: high speed, low resource requirement, and high detection efficiency. Chapeau-Blondeau [3] demonstrated that TD can improve the detection performance, in case of non-Gaussian noise, compared to linear detector. Consider the binary hypotheses detection problem described as follows: ( H0 : x½n ¼ w½n n ¼ 0,1, . . . ,N1 H1 : x½n ¼ s½n þ w½n n ¼ 0,1, . . . ,N1 , ð1Þ where w[n] is an independent and identically distributed (i.i.d.) noise and s[n] is a DC signal. Note that here, for simplicity, we derive the basic results (optimal threshold and optimal structure) for a DC signal in an i.i.d. noise with known pdf. We will explain how to deal with a general signal s[n] (non-DC signal, such as a sinusoidal signal) in noise with unknown pdf in Section 4. The schematic of a TD typically used [3,7–9] is shown in Fig. 1. The output of the threshold system (TS) can be calculated using either of the following two equations: ( y½n ¼ 1 x½n Z t 0 x½n o t ð2Þ G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 171 calculated. The decision H1 is made if LðzÞ 4 g, where LðzÞ ¼ fZ ðz; H1 Þ=fZ ðz; H0 Þ is the LRT (likelihood ratio test) statistic on Z, fZ ðz; Hi Þ, i ¼ 0,1 are the pdf’s of Z under the two hypotheses, g is calculated using the following equation for a given PFA ¼ a: Z PFA ¼ fZ ðz; H0 Þ dz ¼ a: ð4Þ z:LðzÞ 4 g With the above g, PD is calculated by Z PD ¼ fZ ðz; H1 Þ dz: Fig. 1. The schematic of the threshold detector. or ( y½n ¼ ð5Þ z:LðzÞ 4 g 0 x½n Z t 1 x½n o t , ð3Þ according to different situations, where t is a threshold P of the TS. A test statistic Z ¼ z ¼ ð1=NÞ N1 n ¼ 0 y½n is then calculated. We use Z (upper case) to represent the random variable (RV) and z (lower case) to represent a scalar value calculated from a given observation x½n, n ¼ 0,1, . . . ,N1. The decision H1 is made if z 4 g0 . Note that g0 and t are two different thresholds, where g0 is the threshold always used in binary detection while t is the threshold of TS used to construct TD. The performance of the TD is sensitive to the TS threshold t, and therefore t needs to be calculated properly. However, the optimality issues about TD have not been analyzed systematically. The calculation of the optimal threshold is computationally intensive. In addition, to the author’s knowledge, no analysis on the optimal structure has been available in the literature. The optimal structure refers to a specific type of TS that leads to optimal detectability for a given noise pdf. In this paper, we address these two limitations. First, we propose a faster method to calculate the optimal threshold. This is performed by using a novel detectability indicator, which monotonically increases with the probability of detection (PD). Secondly, based on the indicator, we determine an optimal TS structure of the detector. We show that the optimal TD is implemented based on the optimal structure with the optimal threshold(s). Here, the optimal TD is the one that has maximum PD for a given PFA (probability of false alarm) using the test shown in Fig. 1. Experimental results demonstrate that the performance obtained from the optimal TD can be very close to that obtained from the LO detector, and can be much better than the performance obtained from the linear matched filter, for non-Gaussian noise with heavy pdf tails. The paper is organized as follows. In Section 2, we propose a new indicator of detectability. We then calculate the optimal threshold and determine the optimal structure in Section 3. In Section 4, we address some practical issues when employing the proposed TD. In Section 5, several examples demonstrate the optimal design and the detection performance, which is followed by conclusions in Section 6. 2. An indicator of the detectability PD Typically, the pair ðPD ,PFA Þ is used for the performance measurement of a detector. For the given test in Fig. 1 and an observation x½n, n 2 ½0,N1, the test statistic z can be In this section, we first derive fZ ðz; Hi Þ, i ¼ 0,1 and hence PD can be calculated. We then show that the PD monotonically increases with a novel indicator, which can be used instead of PD. 2.1. Estimation of fZ ðz; H0 Þ and fZ ðz; H1 Þ Let fX ðx; H0 Þ and fX ðx; H1 Þ be the pdf’s of one sample in x½n, n 2 ½0,N1 under the conditions of H0 and H1, respectively. Given a threshold t for the TS described in Eq. (2) or (3), Y ¼ y½n is a random variable (RV) taking only two values, 0 or 1. Hence, Y can be considered as an output of a success (1)/failure (0) experiment known as the Bernoulli trial. The probability Pr ðY ¼ 0; H0 Þ refers to the probability of Y¼0 under the condition of H0. The probabilities Pr ðY ¼ 1; H0 Þ, Pr ðY ¼ 0; H1 Þ and Pr ðY ¼ 1; H1 Þ are defined in similar manner, and are calculated as follows: Z t Pr ðY ¼ 0; H0 Þ ¼ fX ðx; H0 Þ dx ¼ q0 , ð6Þ 1 Pr ðY ¼ 1; H0 Þ ¼ Pr ðY ¼ 0; H1 Þ ¼ Z 1 t Z t fX ðx; H0 Þ dx ¼ p0 , ð7Þ fX ðx; H1 Þ dx ¼ q1 , ð8Þ fX ðx; H1 Þ dx ¼ p1 , ð9Þ 1 Pr ðY ¼ 1; H1 Þ ¼ Z 1 t where p0 þ q0 ¼ 1 and p1 þ q1 ¼ 1. We observe that p0 and p1 are the probability of Y¼1 under the conditions of H0 and H1, respectively. It can be verified that if the DC signal A 4 0 (fX ðx; H1 Þ is a right shift of fX ðx; H0 ÞÞ and the TS based on Eq. (2) is employed, p1 4p0 holds true. For A o0, we can choose the TS based on Eq. (3) to make p1 4 p0 hold true. Henceforth, we only consider the case p1 4 p0 and the TS expressed by Eq. (2), which is the case for A Z 0. While for A o0, using the TS expressed by Eq. (3), we will have the same results. Consider an observation x½n, n ¼ 0,1, . . . ,N1 under H0. Let m 2 ½0,N be the number of successes when x[n] is applied to the TS. The discrete probability distribution of the number of successes is a binomial distribution, and the probability mass function (pmf) of the number m is N m Nm fM ðmÞ ¼ ðm Þp0 q0 . We then have ! 1 N 1 NX m NNz y½n ¼ ; H0 ¼ : ð10Þ fZ z ¼ pNz 0 q0 Nn¼0 N Nz Because fZ ðz; Hi Þ, i ¼ 0,1 are discrete (N and Nz are both non-negative integers, z ¼ m=N, m ¼ 0,1, . . . ,NÞ, the ROC 172 G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 (receiver operating characteristic) is a staircase form (see Fig. 4(b)) and the step size (height and width) of the staircase depends on N. If N is large, the size of the staircase is small and the demonstration could be more precise. To eliminate the discreteness effect of N, we use the continuous form of Eq. (10) for theoretical demonstration, which is shown as follows: fZ ðz; H0 Þ ¼ N GðN þ 1Þ pNz qNNz , GðNz þ 1ÞGðNNz þ1Þ 0 0 ð11Þ where N, Nz 2 R, 0 r Nz rN and GðÞ is the Gamma function defined as Z 1 GðnÞ ¼ t n1 et dt: 0 Note that the factor N in the numerator of Eq. (11) is a scale factor to make the integral be 1. Similarly, we obtain fZ ðz; H1 Þ ¼ N GðN þ 1Þ pNz qNNz : GðNz þ 1ÞGðNNz þ 1Þ 1 1 ð12Þ Using Eqs. (11) and (12), we calculate L(Z) as follows: Nz NNz m p1 q1 ¼ L z¼ : ð13Þ N p0 q0 Because we have assumed p1 =p0 4 1, we also have q1 =q0 o1. Thus, both ðp1 =p0 ÞNz and ðq1 =q0 ÞNNz monotonically increase with respect to z. Therefore, L(z) is a monotonically increasing function and the decision H1 is made if z 4 g0 , where g0 is calculated using the following equation (for a given PFA ¼ aÞ: Z fZ ðz; H0 Þ dz ¼ a: ð14Þ PFA ¼ z 4 g0 With the above g0 , PD can be calculated as follows: Z fZ ðz; H1 Þ dz: PD ¼ ð15Þ z 4 g0 Note that we use integral instead of summation in Eqs. (14) and (15) because we consider fZ ðz; Hi Þ, i ¼ 0,1 as continuous in theoretical derivation. Eqs. (14) and (15) conform to the schematic shown in Fig. 1, i.e., we decide H1 provided that z 4 g0 . Next, we will show that Dp ¼ p1 p0 can be an indicator of PD, where p0 ,p1 are shown in Eqs. (7) and (9), respectively. 2.2. A detectability indicator Dp Given fX ðx; H0 Þ, fX ðx; H1 Þ, the optimal t can be calculated by topt ¼ argmaxPD : t For a given t, PD is calculated from the derived fZ ðz; H1 Þ and fZ ðz; H0 Þ. Because the closed form of the integral of fZ ðz; H1 Þ and fZ ðz; H0 Þ is not available, numerical method needs to be employed. To reduce the computational complexity and for other convenience, we propose an alternative detectability indicator Dp ¼ p1 p0 , which is shown in the following Lemma. Lemma 1. If A 5 s and N,m,ðNmÞ b 1, for a given PFA , PD can be approximated as a monotonically increasing function of Dp, where Dp ¼ ðp1 p0 Þ, s is the standard deviation of the noise. Proof. In this proof, we first show that PD is a function of p1 and p0. By defining Dp ¼ p1 p0 , we then express PD as a function of Dp and p1. We then show that @PD =@p1 0 and @PD =@Dp 40, therefore we can approximate PD as a monotonically increasing function of Dp. This means that we can use Dp as an indicator of PD. The conditions A5 s and N,m,ðNmÞ b 1 are reasonable for a TD to be used in DC detection. Signal is often assumed weak compared to the noise strength [1]. The condition N b 1 is generally true because A 5 s and the number of samples in one observation should be larger for a good detectability. If the conditions m b1 or ðNmÞ b 1 are not true, the chosen t (the line x ¼ tÞ should be at very right tail (m is close to 0) or very left tail (ðNmÞ is close to 0) of fX ðx; Hi Þ, i ¼ 0,1 (see Eqs. (6)–(9)). Using this t, the TS will convert almost all the samples x[n] (for both H0 and H1) to 0’s or 1’s, i.e., the information in x[n] to distinguish H1 from H0 is lost, and hence the detectability will be poor. Therefore, we should chose a t that makes the output of the TS satisfy m b1 and ðNmÞ b1. If N,m,ðNmÞ b 1, using de Moivre–Laplace theorem, fZ ðz; H0 Þ and fZ ðz; H1 Þ in Eqs. (11) and (12) can be approximated by the normal distribution fZ ðz; H0 Þ N ðz; p0 ,p0 ð1p0 Þ=NÞ, ð16Þ fZ ðz; H1 Þ N ðz; p1 ,p1 ð1p1 Þ=NÞ, ð17Þ 2 where N ðz; m, s Þ is normal distribution with mean m and variance s2 , N is the number of the samples. Using Eq. (16), Eq. (14) can be expressed as ! g0 p0 Q pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ¼ a, ð18Þ p0 ð1p0 Þ=N where Q ðÞ is the complementary CDF (cumulative distribution function) of standard normal pdf. The g0 is solved from Eq. (18), which is shown as follows: pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi g0 ¼ p0 þ Q 1 ðaÞ p0 ð1p0 Þ=N, ð19Þ where Q 1 is the inverse function of Q ðÞ. Substituting the g0 into Eq. (15), we obtain ! pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi! p0 p1 þQ 1 ðaÞ p0 ð1p0 Þ=N g0 p1 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi : PD ¼ Q pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ¼ Q p1 ð1p1 Þ=N p1 ð1p1 Þ=N ð20Þ Substituting p0 ¼ p1 Dp into Eq. (20) (since Dp ¼ p1 p0 Þ, we obtain pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi! Dp þ Q 1 ðaÞ ðp1 DpÞð1p1 þ DpÞ=N pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi PD ¼ Q : p1 ð1p1 Þ=N Because Q ðÞ is a monotonically decreasing function, we can only consider the part inside Q ðÞ function, which is denoted as pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Dp þQ 1 ðaÞ ðp1 DpÞð1p1 þ DpÞ=N pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi : ð21Þ Y¼ p1 ð1p1 Þ=N G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 Claim (1) @Y=@p1 0. Because A 5 s, and fX ðx; H1 Þ ¼ fX R1 ðxA; H0 Þ, we have Dp ¼ p1 p0 ¼ tA fX ðx; H0 Þ R1 dx t fX ðx; H0 Þ dx Af X ðt; H0 Þ 0. Because Dp 0, it can be shown that t s, t=2ss=2t 0, and thus @Y=@p1 0 (see Eq. (23)). Y is function of Dp and p1 as shown in Eq. (21). Due to @Y=@p1 0, we can view Y as a function of Dp only. Claim (2) @Y=@Dp o 0. To show this, we recast Eq. (22) as ! @Y 1 Q 1 ðaÞðp0 1=2Þ pffiffiffiffi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p p ¼ N : @Dp p1 ð1p1 Þ p0 ð1p0 Þ ð24Þ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Because p1 ð1p1 Þ is positive, we only need to decide the sign of the following equation: ! Q 1 ðaÞðp0 1=2Þ pffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi N : Cðp0 , a,NÞ ¼ ð25Þ p0 ð1p0 Þ The value of Cðp0 , a,NÞ is related to the variables a, p0 and N. Generally, it can be either positive or negative because some factors in Eq. (25) can be infinity, such as lima-1 Q 1 ðaÞ ¼ 1. However, considering reasonable values for these variables, we can obtain a useful result. First, since N,m,ðNmÞÞ b 1, p0 and p1 should not be very close to 0 or 1. Therefore, we assume 0:2 o p0 o 0:8 that is generally satisfied when topt is chosen (see the examples in Section 5). Secondly, for any detector, limPFA -0 PD ¼ 0 and limPFA -1 PD ¼ 1. Therefore, we omit these extremes. Under these considerations, for 0:2 o p0 o 0:8, we have 0:75 o pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ððp0 1=2Þ= p0 ð1p0 ÞÞ o 0:75. We further assume 0:05 o a o0:95, then 1:6449 o Q 1 ðaÞ o 1:6449. The maximum value of the first factor in Eq. (25) is about 1.24. If N Z2, the sign of Cðp0 , a,NÞ is negative. As explained at the beginning of this proof, the assumption N,m,ðNmÞ b 1 is typically satisfied. Therefore Cðp0 , a,NÞ will be negative. From Claim (1) and Claim (2), we show that approximately Y is a monotonically decreasing function of Dp. A justification is demonstrated in Fig. 2 using a practical case. The subplot (a) shows the contour lines of the PD with respect to p1 and p0. For every ðp1 ,p0 Þ, the PD is calculated using Eqs. (14)–(17), with PFA ¼ 0:1, N¼100. When p1 ¼ p0 , the contour line PD ¼ PFA is the line p0 ¼ p1 . The other contour lines are like straight lines parallel to the line p0 ¼ p1 . Because the contour lines parallel to p0 ¼ p1 , the points in one of these contour lines all have same Dp. Also because the points are in same contour line, these points have same PD. It means that same Dp leads to same PD, no matter which p1 is used, which verify the claim (1). We also observe that the PD monotonically increases from top isoline PD ¼ 0:1 to the bottom isoline PD ¼ 0:9 with the increase of Dp, which verifies the claim (2). An enlarged plot for the region in the box (p0 ,p1 2 ½0:2,0:5Þ of (a) is shown in (b). A possible function p1 ðp0 Þ is added to (b) shown as the thick curve. If we move the threshold t of the TS from the right to the left of the pdf’s fX ðx,Hi Þ, i ¼ 0,1, we will have an increasing p1 and p0. Sometimes the increasing p1 is faster than the increasing PD=0.1 0.8 p0=p 1 0.6 p0 pffiffiffiffi t s t þ Dp t Dp N Q 1 ðaÞ ð12p1 Þ @Y 2s 2t s ¼ , ð23Þ @p1 t2 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi where s ¼ ðp1 DpÞð1p1 þ DpÞ and t ¼ ðp1 Þð1p1 Þ. In the following, we show that Claim (1) @Y=@p1 0, which means that the changing p1 has little influence to Y and therefore Y can be approximated as a function of only Dp; and Claim (2) @Y=@Dp o 0, which means that Y monotonically decreases with Dp. Hence PD is a monotonically increasing function of Dp because PD ¼ Q ðYÞ is a monotonically decreasing function. & 0.4 PD=1 in this area 0.2 0.2 0.4 p1 0.6 0.8 0.5 D 0.45 p0=p 0.4 1 PD=0.1 p0 The differential of Y with respect to Dp and p1 can then be obtained as follows: pffiffiffiffi @Y 1 ð1 þ 2p1 2DpÞ N þQ 1 ðaÞ , ð22Þ ¼ t 2s @Dp 173 0.35 0.3 B C 0.25 0.2 0.2 0.25 0.3 0.35 p1 0.4 0.45 0.5 Fig. 2. A justification of Lemma 1. (a) The contour lines of PD with respect to p1 and p0. (b) The enlarged region in the box as shown in (a) with a ðp1 ,p0 Þ curve. 174 G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 of p0, i.e., the increasing Dp 40, as shown in the curve from point B to the point C, and sometimes the increasing Dp o 0 as shown in the curve from the point C to the point D. It is observed that the PD monotonically increases with Dp. Lemma 1 shows that PD is a monotonically increasing function of Dp. Therefore, Dp can be used as an indicator of PD. We note that the proof of Lemma 1 is based on some approximations, and hence Lemma 1 is not rigorously correct for all cases. For example, in Fig. 2(a), the isoline PD ¼ 0:9 in the region p1 o0:2 is not a parallel line to p0 ¼ p1 and Lemma 1 may not hold true. However, for the general cases of signal detections (p0 and p1 are not close to 0 or 1, and Dp is small), it works very well. Dp calculation is much easier than PD. Next Dp indicator is employed to calculate topt and determine optimal structure. 3. Optimal threshold calculation and optimal structure determination In this section, we propose techniques to calculate the optimal threshold and determine the optimal structure using the Dp indicator. 3.1. Calculation of the optimal threshold based on Dp Dp indicator provides a much easier and faster way to find topt , which is shown in the following lemma. Lemma 2. The topt should be one of the x’s among the intersections between fX ðx; H0 Þ and fX ðx; H1 Þ. Proof. According to the Dp indicator, PD increases with the increase of Dp. Therefore, topt should locate at one of the local maximum points of DpðtÞ. Because DpðtÞ ¼ R1 p1 p0 ¼ t ðfX ðx; H1 ÞfX ðx; H0 ÞÞ dx, the local maximum points of Dp should satisfy dDp=dt ¼ fX ðt; H0 ÞfX ðt; H1 Þ ¼ 0 because fX ð1; Hi Þ ¼ 0, i ¼ 0,1. Therefore, topt has to be one of the x’s among intersections between fX ðx; H0 Þ and fX ðx; H1 Þ, i.e., the x’s satisfy fX ðx; H1 Þ ¼ fX ðx; H0 Þ. & Based on Lemma 2, we can derive the topt for some specific cases, which is illustrated as follows. Corollary 1. topt is the x such that fX ðx; H0 Þ¼ fX ðx; H1 Þ if there is only one intersection between fX ðx; H0 Þ and fX ðx; H1 Þ. In particular, if fX ðx; H0 Þ is unimodal and symmetric, we have topt ¼ ðx0 þ x1 Þ=2, where x0 and x1 are the mode of fX ðx; H0 Þ and the mode of fX ðx; H1 Þ, respectively. From Lemma 1, we know the topt locates at the only one intersection. The unimodal means that the probability distribution has a single mode and the mode is a value at which the pdf/pmf takes its maximum value. If the noise pdf is unimodal and symmetric (like Gaussian), we can easily know that the pdf fX ðx,H0 Þ is symmetric with respect to x ¼ x0 , where x0 is the mode of fX ðx,H0 Þ. Because fX ðx; H1 Þ ¼ fX ðxA; H0 Þ (right shifted), the only intersection has to be x ¼ x0 þ A=2 ¼ ðx0 þ x1 Þ=2, where x1 is the mode of fX ðx; H1 Þ. Corollary 2. For multi-intersection case, for example, K intersections, topt ¼ argmaxt ¼ xi Dp and xi , i ¼ 1, . . . ,K is the x of the K intersections, i.e., fX ðxi ; H0 Þ ¼ fX ðxi ; H1 Þ. From Lemma 1, PD increases with the increase of Dp. Therefore, the t that yields to maximum Dp is topt . It is worth noting on the computational complexity. Using Dp, we can show that the computational complexity is significantly reduced compared to using PD directly. For a given range and resolution of t, if using Dp, two integrals, one integral on fX ðx; H0 Þ and one integral on fX ðx; H1 Þ, are needed to calculate all the pi , i ¼ 0,1 for every t to find the intersections located at xi , i ¼ 1, . . . ,K. We still compare the Dpi , i ¼ 1, . . . ,K to find the maximum one. The total computing cost is about two integrals. If using PD, beside the two integrals for p0 and p1, it needs two integrals for every t to calculate the PD for a given PFA . Therefore, the computational complexity is 2 þ 2nðrange =resolutionÞ integrals. Obviously, the computational complexity of calculating topt is significantly reduced if Dp indicator is used. A running time to find the topt will be presented in Section 5.1. 3.2. Structure of threshold detector We calculated the topt in Section 3.1. However, in cases where multiple local maximum (MLM) points exist in DpðtÞ, optimal PD cannot be achieved using only one TS expressed by Eq. (2). We propose a TS structure for better performance. Lemma 3. The optimal TS structure for MLM cases is the combination of the multiple TS’s that are designed for the partitions of data space divided by the local minimum Dp’s. Proof. DpðtÞ has the significant values: ð0,B1max ,B1min , . . . , BKmax ,0Þ corresponding to the t’s ð1, t1max , t1min , . . . , tKmax , þ 1Þ, where Bimax represents the ith, i 2 ½1,K local maximum Dp and Bjmin represents the jth, j 2 ½1,K1 local minimum Dp. Partition the data space X 2 R into K subsets by the tjmin , denoted as Xk ,k 2 ½1,K. The K independent TS’s are designed on every subset. In the Xk, the pdf’s of x are given as fX ðx; H0 ,Xk Þ ¼ fX ðx; H0 Þ , Pr ðx; H0 ,Xk Þ fX ðx; H1 ,Xk Þ ¼ fX ðx; H1 Þ , Pr ðx; H1 ,Xk Þ where Pr ðx; H0 ,Xk Þ and Pr ðx; H1 ,Xk Þ are the probability of x locating in Xk under the conditions of H0 or H1, respectively. The local maximum Dpk in Xk can be calculated using Z Dpk ¼ ðPr ðx; H1 ,Xk Þ fX ðx; H1 ,Xk Þ x:x2Xk , tk o x Pr ðx; H0 ,Xk Þ fX ðx; H0 ,Xk ÞÞ dx, Z ðfX ðx; H1 ÞfX ðx; H0 ÞÞ dx, ¼ ð26Þ x:x2Xk , tk o x where tk is the optimal t for the TS designed in Xk. From Eq. (26), we know that tk is identical to the t corresponding to the local maximum point of Dp in Xk before x is divided into K subsets. Therefore, the partition of X can be performed based on the DpðtÞ and only one maximum P Dpk in Xk. Therefore, Dps ¼ Kk ¼ 1 Dpk is the maximum Dp G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 we can obtain for X 2 R. Also because the change in Dp dominates the change of PD, the proposed TS structure will gain the optimal performance because it obtains the maximum Dp. & Lemma 3 shows how to determine the optimal structure. For a simple case that DpðtÞ has the significant values ð0,B1max ,B1min ,B2max ,0Þ corresponding to the t’s: ð1, t1max , t1min , t2max , þ1Þ, the optimal structure is the combination of TS1 and TS2, where TS1 is a TS (expressed by Eq. (2)) with t1 ¼ t1max for X1 : fx o t1min g and TS2 is with t2 ¼ t2max for X2 : fx 4 t1min g. An example for the structure determination will be presented in Section 5.2. We proposed the optimal threshold and the optimal structure based on Lemma 1. We point out further comments for the condition N,m,ðNmÞ b 1. Though these conditions are common in signal detection and with them we can approximate the binomial distribution to a normal distribution (in showing the proof of Lemma 1), we note that it is not necessary for the implementation of the TD. Consider that we have only one sample. Statistically, the optimal design of the TD is still valid. This is because, if the detection from the one sample is done many times, the detection problem can be considered as a detection from many samples and the TDs used for many samples and any one sample are identical. Other practical issues about the implementation are addressed below. 4. For non-DC signals and noises with unknown pdf’s In Section 2, we proposed an indicator of the detectability of the TD and in Section 3, we derived the optimal threshold and optimal structure. However, these results are obtained assuming that the signal is DC and the noise pdf is known. In practice, nevertheless, we often need to deal with non-DC signals and the noises with unknown pdf’s. In this section, we will address these practical issues using the theory we have derived to implement the TD. 4.1. Non-DC signal Now we want to determine whether a known signal s[n] exists from an observation x[n], where s[n] is a known non-DC signal. Though the non-DC signal could be any deterministic signals, from now on we will use sinusoidal signals in our justification and experiments because the 175 sinusoidal signal is widely used in signal detection [1,10]. This strategy can be easily extended from sinusoidal signals to any other deterministic signals. Consider one specific sample in the observation x[n], n ¼ 0,1, . . . ,N1, denoted as x½t, t 2 ½0,N1. We have x½t ¼ s½t þw½t. If we want to make a detection decision only based on this one sample x[t], the detection problems is a known DC signal s[t] embedded in noise w[t]. The optimal design of the TD is valid for this sample, and we obtain the maximum Dp. Therefore, for a specific sample x[t], we use a TS (shown in Eq. (2) or Eq. (3)) denoted as TS[t] and its output is denoted as zt. The overall detector can be the sum of the zt ,t 2 ½0,N1 with weight js½tj, i.e., z¼ N 1 X zt js½tj: ð27Þ t¼0 Note that we alternatively choose the TS using Eq. (2) or Eq. (3) depending on s[n] is positive or negative and we always have p1 4p0 . Therefore, zt always contributes positively to H1 and hence the weight should be positive. That is why js½tj is employed as the weight. Based on the above discussion, the schematic of the TD for detecting a known signal s[n] is shown in Fig. 3. The structure of the TD is composed of a multiway switch, a TS array and a replica-correlator. The strategy can be used to detect any known deterministic signals. Detailed analysis (for example, performance analysis) will be presented in a future paper. The detection performance of the TD in detecting a sinusoidal signal in non-Gaussian noise is presented in Section 5.3. 4.2. Noise with unknown pdf So far, the results we derived are based on known noise pdf’s. The accurate pdf of the noise is difficult to obtain in many applications and that is why we use TD as a suboptimal choice. In this section, we will demonstrate that optimal design of the proposed TD for unknown noises is much easier than the detectors that need full knowledge of the noise, such as LO detector [1] and another TD [6]. We classify the noises into two categories for the demonstration. (1) The noises with unimodal and symmetric pdf, which is called Type I noise in this paper. This type of noise covers a wide range of applications and the assumption does not appear to be overly restrictive for practical Fig. 3. The schematic of the TD for known sinusoidal signals. G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 To conclude, for the Type I noises, the optimal design of the proposed TD does not depend on the noise pdf’s. For Type II noises, we only need the knowledge that decides the intersections of the pdf’s fX ðx,Hi Þ, i ¼ 0,1, which is much less than the full knowledge of the noise pdf, easier to estimate and less sensitive to the inaccurate estimation. We note that the proposed TD is not suitable for all the noise. Only for the noises with heavy pdf tails, the TD can have a performance very close to the optimal one. The validation (if TD is suitable or not), detail methodology and some other issues for the different types of the noises will be addressed in a future paper. 5. Experimental results Two examples are presented in this section to illustrate the optimal threshold, the structure of the TD for DC signals and another one is used to demonstrate the detection performance for a sinusoidal signal. 2 where c ¼ ½a þ ð1aÞb 1=2 , 0 o a o1, b 4 0. A DC signal A ¼0.1 embedded in GM noise with a ¼ 0:9, b ¼ 5 and s ¼ 1. According to the symmetry of the GM, there is only one intersection between fX ðx; H0 Þ and fX ðx; H1 Þ, located at 0.05, and therefore topt ¼ 0:05. We show that the theoretical and simulation PD with respect to t for different PFA in Fig. 4(a). Obviously, the optimal t ¼ 0:05 is justified by the theoretical results. Some comments about the simulation results are worth mentioning here. The PD’s obtained from simulations are not very close to the theoretical ones. That is because the pmf’s fZ ðz; Hi Þ, i ¼ 0,1 obtained from simulations are discrete. It can be verified that in this case the numbers of the non-zero values for the two pmf’s fZ ðz; Hi Þ, i ¼ 0,1 are about 11 (see the number of the staircases in the simulation ROC in (b)). The pmf’s lead to the inaccurate PFA’s and PD’s. The error is not slight because the size of staircase is large (only 11 steps from 0 to 1). However, the simulation PD’s go up and down 1 Theoretical results Probability of detection (PD) applications [11]. For example, the generalized Gaussian (GG) and unimodal Gaussian mixture (GM) [1] belong to this large group. Note that GG (including Gaussian, Laplacian, uniform, etc.) and unimodal GM are widely used noise types in a variety of applications [1,6,12,13]. For the Type I noises, the optimal design of the TD is simple. Because we have only one intersection between fX ðx; H0 Þ and fX ðx; H1 Þ, the structure design is not necessary. Based on Corollary 1, the topt should be x0 þ A=2 for a DC signal s½n ¼ A or x0 þ s½n=2 for a non-DC signal s[n], where x0 is the mode of fX ðx; H0 Þ. Therefore, assuming the noise pdf is symmetric and unimodal, the optimal design of the TD does not depend on the noise pdf. (2) Other noises. For the noises that do not belong to the Type I noise, referred to as Type II noise, the solution cannot be as straightforward as the one for Type I noises. For example, in Section 5.2, we show a bimodal Gaussian mixture (GM) noise (the pdf with two different modes). For these noises, we need to calculate the topt and to determine the optimal structure. According to the Lemmas 2 and 3, to design the TD, we only need to know the intersections between fX ðx; H0 Þ and fX ðx; H1 Þ. In other words, only the knowledge that is useful to determine the intersections is necessary. The required knowledge is much less compared to the full knowledge of the noise pdf and is easier to estimate in applications. For instance, for the GM (bimodal) noise shown in Fig. 5, we only need to estimate the locations (x) of the two modes of fX ðx; H0 Þ. This can be performed based on the estimation theory. While for the LO detector, the full knowledge of the pdf is needed. Simulation results PFA=0.3 0.8 0.6 0.4 PFA=0.1 PFA=0.2 0.2 −1 −0.5 0 τ 0.5 1 1 Probability of detection (PD) 176 0.8 0.6 0.4 LO detector Threshold detector (theory) Threshold detector (simulation) Matched filter 0.2 0 0 0.2 0.4 0.6 0.8 1 Probability of false alarm (PFA) 5.1. Optimal threshold determination We consider the GM (unimodal) noise [1,6], which is defined as " !# 2 2 c c x 1a c 2 x2 fX ðx; H0 Þ ¼ pffiffiffiffiffiffi a exp exp 2 þ , b 2s2 s 2p 2b s2 ð28Þ Fig. 4. Performance of the TD of a DC signal in the Gaussian mixture noise. The DC signal is A ¼ 0.1. The GM is with a ¼ 0:9, b ¼ 5 and s2 ¼ 1. The number of the samples in one observation is N ¼100. Theoretical results are calculated based on Eqs. (11) and (12). For a given t and PFA, simulation PD is calculated from fz ðz; H0 Þ and fz ðz; H1 Þ, which are the histograms generated from 1000 simulations for (a) and 10 000 simulations for (b). (a) The PD with respect to t and PFA. (b) The ROCs from LO detector, matched filter and TD. G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 (1) Using PD. Step I: we calculate all the ðp0 ,p1 Þ’s for all the different t’s and for every pair ðp0 ,p1 Þ we have fZ ðz; Hi Þ, i ¼ 0,1 as shown in Eqs. (11) and (12). Step II, for every fZ ðz; Hi Þ, i ¼ 0,1, two additional integrals are needed to find g0 and PD shown as Eqs. (14) and (15). In step II, we will totally calculate 2 1001 ¼ 2002 numerical integrals to obtain 1001 PD’s and then choose the maximum one. (2) Using Dp. We only need to calculate the intersections between fX ðx; H0 Þ and fX ðx; H1 Þ. Based on the same resolution x ¼ 5 : 0:01 : 5, we compare fX ðx; H0 Þ and fX ðx; H1 Þ to find the intersection(s), i.e., if ðfX ðxk ; H0 Þ 4 fX ðxk ; H1 ÞÞ and (fX ðxk þ 1 ; H0 Þ o fX ðxk þ 1 ; H1 ÞÞ or vice versa, we can consider that the intersection is either xk or xk þ 1 , where xk and xk þ 1 are the kth and (kþ1)th components in x ¼ 5 : 0:01 : 5. We calculate fX ðx; H0 Þ and fX ðx; H1 Þ for all x and therefore the computational complexity is only as same as the step I when using PD. The 2002 numerical integrals in step II are saved here. The GM (bimodal) is defined as fx ðx; H0 Þ ¼ 12 N ðx; m, s2 Þ þ 12N ðx; m, s2 Þ: fx ðx; H1 Þ ¼ 12 N ðx; m þ A, s2 Þ þ 12N ðx; m þ A, s2 Þ, ð30Þ where A is the DC signal. Fig. 5(a) shows fX ðx; H0 Þ and fX ðx; H1 Þ, and the intersections locate at t1 , t2 and t3 , corresponding to the extrema of Dp. The Dp ¼ p1 p0 is illustrated in Fig. 5(b). Two maximum points B1max and B2max , locate at t1 and t2 . One minimum point B1min locates at t3 . x is divided into two complementary subset X1 and X2, where X1 : x r t3 and X2 : x 4 t3 . The MTS is constructed using two TS’s (TS1 and TS2) represented by PN1 1 Eq. (2), i.e., the test statistics are TS1 : z1 ¼ ð1=N1 Þ n¼0 P ðx½n 4 t1 Þ for x½n r t3 , n 2 ½0,N1 , TS2 : z2 ¼ ð1=N2 Þ nN2¼10 ðx½n 4 t2 Þ for x½n 4 t3 , n 2 ½0,N2 and the MTS is z ¼ z1 þ z2 . We can have Dp1 ¼ B1max B1min , Dp2 ¼ B2max and Dps obtained from the TD with the MTS is as Dps ¼ Dp1 þ Dp2 . Dps is the maximum Dp that can be obtained for this specific mean-shift Gaussian mixture detection and therefore z is the optimal TD. 5.3. Sinusoidal signal detection Fig. 6 illustrates the detection performance when the TD shown in Fig. 3 is used to detect a sinusoidal signal in GM (unimodal) noise. The TD is composed of TD½t, t 2 ½0,N1 and TD[t] adapts topt ¼ s½t=2. We now show the 0.2 fx(x;H1) fx(x;H0) 0.1 0 −5 τ 0 3 x τ1 τ2 5 τ2 5 0.1 2 1 Δ p=p1−p0 Bmax Comparing these two cases, it is obvious that the computational complexity to obtain topt using Dp is much less compared to the classical method using PD. For this case, the average running time (Matlab) is about 0.189 s using PD and 0.034 s using Dp in a computer with AMD 8450 triple-core processor, 2.1 GHz and 4 GB memory. ð29Þ A mean-shift detection problem leads to fx(x) around the theoretical ones and we consider that the simulation results are consistent with the theoretical results. To alleviate the inaccuracy, we can use the linear interpolation between the staircases of the pmf’s. Nevertheless, because of the uncertainty of the estimated pmf’s, the interpolation is not easy. A better way is to use the continuous pdf instead of pmf for accurate demonstration, which is what we used in this paper (see Eqs. (11) and (12)) and topt is accurately verified from the theoretical ROCs. Given optimal t ¼ 0:05, we can have the theoretical ROC from Eq. (20) shown in Fig. 4(b) with the ROC from Monte Carlo simulations. The upper bound of this detection is the performance of the LO detector. It is shown that the TD with optimal threshold only has an average uniform loss of 0.34 dB (4%). Compared to the matched filter, the TD has an average uniform improvement of 1.19 dB (15%). We now compare the computational complexity for calculating topt from PD, and from Dp. Because the GM (unimodal) is Type I noise (zero mean), we can obtain the topt ¼ A=2 ¼ 0:05 without any calculation. To compare the calculation time fairly, it is assumed that we need to calculate the intersections given Eq. (28) and A¼0.1. Assume that we tune t and x from 5 to 5 with a step size 0.01 (t ¼ 5 : 0:01 : 5Þ, i.e., totally the number of t’s or x’s is 1001. 177 Bmax 0.05 X2 X1 1 5.2. Optimal structure determination In this section, we use a mean-shift bimodal Gaussian mixture (GM) detection to show the optimal MTS structure. 0 −5 τ1 0τ3 τ Bmin Fig. 5. (a) fX ðx; H0 Þ and fX ðx; H1 Þ and (b) Dp ¼ p1 p0 in the mean-shift Gaussian mixture detection with A ¼ 0.5, m ¼ 3 and s ¼ 1. 178 G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178 assuming a DC signal in a noise with known pdf. We present techniques to use the TD for any deterministic signal in a noise with unknown pdf. Hence, the proposed TD is good to be used in practice applications. Experimental results show the validation of the optimal threshold and the optimal structure. For non-Gaussian noise with heavy pdf tails, the performance of the TD can outperform the matched filter, and be very close to the performance of the LO detector. Probability of detection (PD) 1 0.8 0.6 0.4 0.2 0 Acknowledgment LO detector Proposed TD Matched filter 0 0.2 0.4 0.6 0.8 1 Probability of false alarm (PFA) Fig. 6. The comparison of the ROCs obtained from the Monte Carlo simulations in detecting sinusoidal signal between the proposed TD and other detectors, including the LO detector and the matched filter. The GM noise is with a ¼ 0:9, b ¼ 5 and s2 ¼ 1. The sinusoidal signal s½n ¼ 0:1 sinð0:02pnÞ,n 2 ½0,N1 and N ¼100 in one observation. ROCs are obtained from 10 000 simulations. performance comparison between the proposed TD and the LO detector, the matched filter. It is shown that the proposed TD only has an average uniform loss of 0.389 dB (4.6%) compared to the LO detector, with the benefit of not being sensitive to the parameters of the noise pdf. Compared to the matched filter, the TD obtains a significant improvement, an average uniform improvement of 1.283 dB (16%). 6. Conclusions In this paper, we considered the optimality design of a TD. From the pdf’s of the observation x[n] under two hypothesis, fX ðx; H0 Þ and fX ðx; H1 Þ, the corresponding pdf’s of the output of the test statistic is binomial distribution such that the optimal TD can be constructed in the sense of Neyman–Pearson criterion. We showed that the PD can be substituted by an indicator, Dp ¼ p1 p0 (Lemma 1) to find optimal performance. Using Dp, the optimal threshold topt can be easily calculated. At the same time, the TS structure is addressed. With the help of Dp, a MTS is proposed as the optimal structure. Using the optimal structure with optimal threshold(s), we can implement an optimal TD. The optimal design of the TD is conducted The authors would like to thank the reviewers for their comments that help improve the manuscript. Also the authors thank Dr. Yindi Jing and Rongfei Fan for some valuable discussions. References [1] S. Kay, Fundamentals of Statistical Signal Processing, Volume II: Detection Theory, Prentice Hall PTR, 2008. [2] J. Thomas, Nonparametric detection, Proceedings of the IEEE 58 (5) (1970) 623–631. [3] F. Chapeau-Blondeau, Nonlinear test statistic to improve signal detection in non-Gaussian noise, Signal Processing Letters, IEEE 7 (7) (2000) 205–207, doi:10.1109/97.847369. [4] M. Guerriero, S. Marano, V. Matta, P. Willett, Stochastic resonance in sequential detectors, IEEE Transactions on Signal Processing 57 (1) (2009) 2–15, doi:10.1109/TSP.2008.2005087. [5] H. Papadopoulos, G. Wornell, A. Oppenheim, Low-complexity digital encoding strategies for wireless sensor networks, in: Proceedings of ICASSP, 1998. [6] A. Saha, G. Anand, Design of detectors based on stochastic resonance, Signal Processing 83 (6) (2003) 1193–1212. [7] S. Kay, Can detectability be improved by adding noise? IEEE Signal Processing Letters 7 (1) (2000) 8–10, doi:101109/97.809511. [8] S. Kay, J. Michels, H. Chen, P. Varshney, Reducing probability of decision error using stochastic resonance, IEEE Signal Processing Letters 13 (11) (2006) 695–698, doi:10.1109/LSP.2006.879455. [9] H. Chen, P. Varshney, S. Kay, J. Michels, Theory of the stochastic resonance effect in signal detection: part I—fixed detectors, IEEE Transactions on Signal Processing 55 (7) (2007) 3171–3184. [10] B.C. Levy, Principles of Signal Detection and Parameter Estimation, first ed., Springer, 2008. [11] S. Kay, D. Sengupta, Advances in Spectrum Analysis and Array Processing, Recent Advances in Non-Gaussian Autoregressive Processes, Prentice-Hall Signal Processing Series, vol. 1, 1991. [12] W. Machell, C.S. Penrod, G.E. Ellis, Statistical characteristics of ocean acoustic noise processes, in: E.J. Wegman, S.C. Schwartz, J.B. Thomas (Eds.), Topics in Non-Gaussian Signal Processing, Springer, New York, 1989, pp. 29–57 (Chapter 3). [13] D. Powell, G. Wilson, Class A modeling of ocean acoustic noise processes, in: E.J. Wegman, S.C. Schwartz, J.B. Thomas (Eds.), Topics in Non-Gaussian Signal Processing, Springer, New York, 1989, pp. 17–28 (Chapter 2).
© Copyright 2024