The influence of exterior unknown values on interior image values

Loïc Simon, Mauricio Delbracio, Jean-Michel Morel
Abstract.

The goal of these notes is to provide a theoretical framework to evaluate the precision of image values inside the image domain, taking into account that, by Shannon theory, these values for a band limited image depend on the exterior, unknown values. Since the exterior values are actually unknown, evaluating this precision requires a model. Following other studies [] and in particular Shannon’s proposal, we take as worst case the situation where the image, inside and outside, is a white noise. This permits easy theoretical computations and corresponds to a worst case, but also to a realistic one. It is realistic because real digital images are noisy and contain microtexture whose characteristics are close to white noise. It is a worst case because digital images have a spectrum that decays significantly faster than the (flat) white noise spectrum; therefore, among band-limited digital images, white noise has the samples that are least predictable from their neighborhoods. The calculations deliver quantitative security rules: they give a guaranteed image accuracy depending on the distance to the image boundary, and show that for large images the inside values depend only marginally on the outside values, up to a narrow security band along the boundary.

1 The error caused by the ignorance of outside samples

1.1 The case of 1D signals

We shall start the calculations in 1D. We denote the cardinal sine function \hbox{sinc}\,x:=\frac{\sin\pi x}{\pi x}. Its Fourier transform is supported in [-\pi,\pi]; in such a case we shall say that the function is “band limited”. The system S_{k}(x):=\hbox{sinc}(x-k), k\in\mathbb{Z}, is an orthonormal basis of the subspace of band limited functions in L^{2}(\mathbb{R}). By the Shannon-Whittaker theorem, a function u(x)\in L^{2}(\mathbb{R}) is band limited if and only if the sequence u(k) belongs to l^{2}(\mathbb{Z}) and

u(x)=\sum _{{k\in\mathbb{Z}}}u(k)\hbox{sinc}(x-k). (1)

Our general model for an unknown function will be a stationary white noise; it will represent the unknown samples lying outside the signal domain. Consider a noise (n_{k})_{k\in\mathbb{Z}} whose samples are independent \mathcal{N}(0,\sigma^{2}) real variables. The sum \sum _{k}n_{k}^{2} is almost surely infinite. Nevertheless we can still use the Shannon-Whittaker formula (1). Indeed, set n(x)=\sum _{k\in\mathbb{Z}}n_{k}\hbox{sinc}(x-k). Then

\mathbb{E}n(x)^{2}=\sigma^{2}\sum _{{k\in\mathbb{Z}}}\left(\frac{\sin\pi(x-k)}{\pi(x-k)}\right)^{2}.

This series is uniformly bounded with respect to x; indeed \sum _{k\in\mathbb{Z}}\hbox{sinc}^{2}(x-k)=1, so that \mathbb{E}n(x)^{2}=\sigma^{2}. By Lebesgue’s theorem it follows that the function n(x) belongs to L^{2}_{loc}(\mathbb{R}) almost surely and that the above series converges in L^{2}_{loc}. Thus we can make sense of the Shannon-Whittaker L^{2}_{loc} interpolate of a white noise, defined as the limit in L^{2}_{loc}, as M\to\infty, of the Shannon-Whittaker interpolate of the truncated noise n_{k}\mathbb{1}_{[-M,M]}. Thus, even though in practice we only ever handle finitely many samples, it is convenient to consider this white noise interpolate.
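The following minimal Python sketch (our own illustration, with arbitrary truncation and sample sizes) simulates the truncated interpolate and checks that the empirical variance of n(x) at a non-integer point is close to \sigma^{2}:

```python
import numpy as np

# Monte Carlo sketch of the Shannon-Whittaker interpolate of a truncated
# white noise: since sum_k sinc^2(x-k) = 1, the variance of n(x) at any
# point should be close to sigma^2.
rng = np.random.default_rng(0)
sigma, M, trials = 1.0, 500, 5000
k = np.arange(-M, M + 1)
x = 0.37                      # arbitrary non-integer evaluation point
w = np.sinc(x - k)            # np.sinc(t) = sin(pi*t)/(pi*t)
samples = rng.normal(0.0, sigma, size=(trials, k.size))
n_x = samples @ w             # one realization of n(x) per trial
print(n_x.var())              # expected: close to sigma^2 = 1
```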

We assume that the u(k) are the samples of a band limited function with spectrum in [-\pi,\pi]. The samples u(k) are known for k\in[-K,K]; the rest are unknown and can have arbitrary values, which we model in a worst case strategy by a white noise n_{k}. Several “extrapolation” possibilities can be considered, which appear to be equivalent for the error analysis (we do not prove this equivalence here). There are actually two classical extrapolations for the missing samples (one could also think of the DCT, which assumes a mirror symmetry before periodizing). The first is to decide that the samples u(k) outside the observed domain [-K,K] are null. Then the Shannon-Whittaker interpolation will be faulty, and the error e(x) between the original function and the interpolated one will be e(x)=\sum _{|k|>K}u(k)\hbox{sinc}(x-k). The alternative consists in a periodization of the observed signal: using the discrete Fourier transform to interpolate the known samples u(k) amounts to assuming that the outside samples are mere repetitions of u(k). This discrete case differs a little and we shall consider it in Section 2. We now focus on the error caused by the ignorance of the samples u(k), |k|>K. Modeling these unknown samples by the white noise n_{k}, the error inside the domain, namely for x\in[-K,K], satisfies

e(x)=\sum _{{|k|>K}}n_{k}\frac{\sin\pi(x-k)}{\pi(x-k)}. (2)

Thus the error variance is

\mathbb{E}e(x)^{2}=\sigma^{2}\sum _{{|k|>K}}\frac{\sin^{2}\pi(x-k)}{\pi^{2}(x-k)^{2}}. (3)

For x in the signal domain [-K,K], define the distances to the left and right boundary d_{l}(x):=x+K and d_{r}(x):=K-x. We shall estimate how e(x) varies as a function of these distances. Starting from \sin\pi(x-k)=(-1)^{k}\sin\pi x, we obtain

\mathbb{E}e(x)^{2} = \frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\sum _{|k|>K}\frac{1}{(x-k)^{2}}
= \frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\sum _{k=K+1}^{\infty}\frac{1}{(k+x)^{2}}+\sum _{k=K+1}^{\infty}\frac{1}{(x-k)^{2}}\right)
\leq \frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\int _{K}^{\infty}\frac{dy}{(y+x)^{2}}+\int _{K}^{\infty}\frac{dy}{(y-x)^{2}}\right)
= \frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\frac{1}{K+x}+\frac{1}{K-x}\right)
= \frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\frac{1}{d_{l}(x)}+\frac{1}{d_{r}(x)}\right).

Note that the third line (i.e. the first inequality) is obtained by comparing each sum with the integral of a decreasing function. The same argument yields a lower bound on the error, whence the following feasible range:

\frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\frac{1}{d_{l}(x)+1}+\frac{1}{d_{r}(x)+1}\right)\leq\mathbb{E}e(x)^{2}\leq\frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\frac{1}{d_{l}(x)}+\frac{1}{d_{r}(x)}\right).

We have proved

Lemma 1.

The error e(x) on the value at x of a band limited function whose samples are known on an interval [-K,K] and whose unknown samples are i.i.d. with variance \sigma^{2} outside the domain satisfies

\mathbb{E}e(x)^{2}\leq\frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\frac{1}{d_{l}(x)}+\frac{1}{d_{r}(x)}\right)

where d_{l}(x)=x+K and d_{r}(x)=K-x are the distances of x to the left and right boundaries.

Besides, the tightness of this upper bound can be quantified: the gap between the upper and lower bounds equals

\frac{\sigma^{2}\sin^{2}\pi x}{\pi^{2}}\left(\frac{1}{d_{l}(x)(d_{l}(x)+1)}+\frac{1}{d_{r}(x)(d_{r}(x)+1)}\right).
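The bracketing of Lemma 1 can be checked numerically. The following sketch (ours; K, x and the truncation length are arbitrary illustrative values) evaluates the series (3) by brute force and verifies that it falls between the two bounds:

```python
import numpy as np

# Brute-force check that the error variance (3) falls inside the feasible
# range of Lemma 1, here with sigma = 1 and K = 50.
K, sigma = 50, 1.0
x = 30.25                                 # a point inside [-K, K]
k = np.arange(K + 1, K + 100000)          # far truncation of the two tails
var = sigma**2 * (np.sum(np.sinc(x - k)**2) + np.sum(np.sinc(x + k)**2))
dl, dr = x + K, K - x
c = (sigma * np.sin(np.pi * x) / np.pi) ** 2
lower = c * (1 / (dl + 1) + 1 / (dr + 1))
upper = c * (1 / dl + 1 / dr)
print(lower <= var <= upper)              # expected: True
```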

1.2 Extension to 2D images

We now proceed to the 2D extension of the above estimate. The 2D model is

u(x,y)=\sum _{{(k,l)\in\mathbb{Z}^{2}}}u(k,l)\hbox{sinc}(x-k)\hbox{sinc}(y-l). (4)

The error formula becomes

\mathbb{E}e(x,y)^{2}=\sigma^{2}\sum _{{(k,l)\in\mathbb{Z}^{2}\setminus[-K,K]^{2}}}\frac{\sin^{2}\pi(x-k)}{\pi^{2}(x-k)^{2}}\frac{\sin^{2}\pi(y-l)}{\pi^{2}(y-l)^{2}}. (5)

Denote by d_{l}(x),d_{r}(x) (resp. d_{l}(y),d_{u}(y)) the horizontal (resp. vertical) distances of (x,y) to the sides of the square [-K,K]^{2}. Assuming without loss of generality that x,y\geq 0, covering \mathbb{Z}^{2}\setminus[-K,K]^{2} by the two bands \{|k|>K\}\times\mathbb{Z} and \mathbb{Z}\times\{|l|>K\}, and using the identity \sum _{k\in\mathbb{Z}}\hbox{sinc}^{2}(x-k)=1 (a straightforward consequence of the Shannon-Whittaker interpolation formula applied to f_{x}(y)=\hbox{sinc}(x-y) at y=x), we obtain

\mathbb{E}e(x,y)^{2} \leq \left(\sum _{l\in\mathbb{Z}}\hbox{sinc}^{2}(y-l)\right)\sigma^{2}\sum _{|k|>K}\hbox{sinc}^{2}(x-k)+\left(\sum _{k\in\mathbb{Z}}\hbox{sinc}^{2}(x-k)\right)\sigma^{2}\sum _{|l|>K}\hbox{sinc}^{2}(y-l)
\leq \frac{\sigma^{2}}{\pi^{2}}\left(\frac{\sin^{2}\pi x}{d_{l}(x)}+\frac{\sin^{2}\pi x}{d_{r}(x)}+\frac{\sin^{2}\pi y}{d_{l}(y)}+\frac{\sin^{2}\pi y}{d_{u}(y)}\right),

where the second inequality applies the one-dimensional estimate of Lemma 1 to each tail sum. We have proved
Theorem 1.

The expected square error \mathbb{E}e(x,y)^{2} at a point (x,y) of a band limited image whose samples are known on a domain [-K,K]^{2} and whose unknown samples are i.i.d. with variance \sigma^{2} outside this domain satisfies

\mathbb{E}e(x,y)^{2}\leq\frac{\sigma^{2}}{\pi^{2}}\left(\frac{\sin^{2}\pi x}{d_{l}(x)}+\frac{\sin^{2}\pi x}{d_{r}(x)}+\frac{\sin^{2}\pi y}{d_{l}(y)}+\frac{\sin^{2}\pi y}{d_{u}(y)}\right),

where d_{l}(x) and d_{r}(x) (resp. d_{l}(y) and d_{u}(y)) are the distances of x (resp. y) to the left/right (resp. lower/upper) boundaries.

In short, denoting by \delta(x,y)=\inf(d_{l}(x),d_{r}(x),d_{l}(y),d_{u}(y)) the taxicab distance of (x,y) to the image boundary, and letting K\to\infty while \delta(x,y) remains bounded, the bound on the root mean square error behaves at worst like \frac{\sqrt{2}\sigma}{\pi\sqrt{\delta(x,y)}}.
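For concreteness, here is a small helper (our own naming and values, not part of any established library) that evaluates the bound of Theorem 1 at a given point of the domain [-K,K]^{2}:

```python
import numpy as np

# Evaluate the upper bound of Theorem 1; distances are taken to the four
# sides of the square domain [-K, K]^2, and sigma = 1 by default.
def bound_2d(x, y, K, sigma=1.0):
    s2x, s2y = np.sin(np.pi * x) ** 2, np.sin(np.pi * y) ** 2
    return (sigma / np.pi) ** 2 * (s2x / (x + K) + s2x / (K - x)
                                   + s2y / (y + K) + s2y / (K - y))

# Near the right edge the 1/d_r(x) term dominates; far inside, the bound is tiny.
print(bound_2d(995.5, 0.5, 1000), bound_2d(0.5, 0.5, 1000))
```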

1.3 Discussion

We can first give some orders of magnitude. A typical digital image has a standard deviation of about 20. Thus, by the preceding result, keeping the root mean square error \sqrt{\mathbb{E}e(x,y)^{2}} below 1 requires \delta\simeq 80. Since digital images nowadays are very large, this estimate is somewhat reassuring. Indeed, an error of order 1 on colors escapes our attention, and the security zone to remove is small if not negligible for images that attain respectable widths of 4000 pixels and more. On the other hand, this estimate clearly hinders using cameras as high precision numerical tools. Furthermore, the error caused by our ignorance of the image outside its domain remains serious even far inside the image domain, since it decreases only like the inverse square root of the distance to the boundary.
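The order of magnitude above follows from inverting the asymptotic bound; a one-line helper (ours) makes the dependence on \sigma and on the tolerated RMS error explicit:

```python
import numpy as np

# From the asymptotic RMS bound sqrt(2)*sigma/(pi*sqrt(delta)), the security
# margin needed to keep the RMS error below a tolerance tau is
# delta = 2*sigma^2 / (pi*tau)^2.
def security_margin(sigma, tau=1.0):
    return 2.0 * sigma**2 / (np.pi * tau) ** 2

print(security_margin(20.0))   # about 81 pixels, i.e. delta ~ 80
```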

Besides, the problem of the truncation error bound has been studied in the previous literature in the case of deterministic signals. It is interesting to note that the form of the error estimate presented in Lemma 1 bears a strong resemblance to the ones in [5, 3]. Note that [2] has also studied similar bounds, but for a slightly different interpolation, and we shall not pursue the comparison in that case. However, a more careful comparison with these two works reveals subtle distinctions. First of all, both bounds are obtained under a deterministic model.

The closest formula is the one proposed in [3], which in our notation becomes:

|e(x)|\leq\frac{|\sin\pi x|}{\pi}S_{K}\left(\frac{1}{\sqrt{d_{l}(x)}}+\frac{1}{\sqrt{d_{r}(x)}}\right),

where S_{K}^{2}:=\sum _{|k|>K}n_{k}^{2}. Note that for a white noise S_{K} is almost surely infinite.

In [5], the formula is slightly different:

|e(x)|\leq\frac{M|\sin\pi x|}{2\pi\cos{\pi f_{max}}}S_{K}\left(\frac{1}{d_{l}(x)}+\frac{1}{d_{r}(x)}\right),

where M=\sup _{x\in\mathbb{R}}n(x) and f_{max}\leq\frac{1}{2} is an upper bound on the frequencies present in n.

A comparison with the related bounds of [1, 4] remains to be carried out.

Note also that many articles (at least [5, 2]) impose a frequency guard band below the strict Nyquist limit.


2 The error caused by missing outside samples for DFT interpolation

Again we start by studying the phenomenon in 1D. We now consider the classic DFT interpolation, which starts from the samples n_{k}, k\in[-K,K]. Computing the discrete Fourier transform of this vector yields 2K+1 Fourier coefficients,

\tilde{n}_{l}=\sum _{k=-K}^{K}n_{k}e^{-\frac{2i\pi kl}{N}},\quad l=-K,\dots,K, (6)

where we set N=2K+1 for the sake of concision. The samples n_{k}, k=-K,\dots,K, can themselves be recovered by the inverse discrete Fourier transform

n_{k}=\frac{1}{N}\sum _{l=-K}^{K}\tilde{n}_{l}e^{\frac{2i\pi kl}{N}}. (7)

In this framework the function n(x) is implicitly assumed to be N-periodic, which is of course simply not true. Indeed, its DFT interpolate is the truncated Fourier series

m(x)=\frac{1}{N}\sum _{l=-K}^{K}\tilde{n}_{l}e^{\frac{2i\pi lx}{N}}, (8)

and m(x) is clearly distinct from n(x). Thus our goal is to evaluate the difference between m(x) and the true Shannon-Whittaker interpolate (assuming all samples known),

n(x)=\sum _{k\in\mathbb{Z}}n_{k}\frac{\sin\pi(x-k)}{\pi(x-k)}. (9)

To evaluate m(x)-n(x), our first goal is to put (8) in a form similar to (9). To do so we substitute (6) into (8),

m(x)=\frac{1}{N}\sum _{{l=-K}}^{K}\sum _{{k=-K}}^{K}n_{k}e^{{-\frac{2i\pi lk}{N}}}e^{{\frac{2i\pi lx}{N}}}=\frac{1}{N}\sum _{{k=-K}}^{K}n_{k}\sum _{{l=-K}}^{K}e^{{\frac{2i\pi l(x-k)}{N}}},

which after some reorganization yields the classic DFT interpolation formula

m(x)=\sum _{{k=-K}}^{K}n_{k}\frac{\sin\pi(x-k)}{N\sin\frac{\pi}{N}(x-k)}. (10)
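As a sanity check of this reorganization, the following sketch (ours, with an arbitrary small K) verifies numerically that the closed form (10) coincides with the trigonometric series (8):

```python
import numpy as np

# Check that the closed form (10) coincides with the trigonometric
# interpolation (8) built from the DFT (6), for a non-integer point x.
rng = np.random.default_rng(1)
K = 8
N = 2 * K + 1
k = np.arange(-K, K + 1)
l = np.arange(-K, K + 1)
n = rng.normal(size=N)
n_tilde = np.exp(-2j * np.pi * np.outer(l, k) / N) @ n          # DFT (6)
x = 3.3                                                          # non-integer point
m_dft = (n_tilde @ np.exp(2j * np.pi * l * x / N)).real / N      # series (8)
m_closed = np.sum(n * np.sin(np.pi * (x - k))
                  / (N * np.sin(np.pi * (x - k) / N)))           # closed form (10)
print(np.isclose(m_dft, m_closed))                               # expected: True
```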

By subtracting (10) from (9) we obtain a closed formula for the error, n(x)-m(x)=:e_{1}(x)+e_{2}(x), with

e_{1}(x):=\sum _{{k=-K}}^{K}n_{k}\sin\pi(x-k)\left(\frac{1}{\pi(x-k)}-\frac{1}{N\sin\frac{\pi}{N}(x-k)}\right)

and

e_{2}(x):=\sum _{|k|>K}n_{k}\,\hbox{sinc}(x-k).

We have already evaluated e_{2}(x) in the preceding section, so it remains to evaluate e_{1}(x). In the following we normalize the space variable by x=\tilde{x}N, so that as N\to\infty the signal domain becomes \tilde{x}\in[-\frac{1}{2},\frac{1}{2}]. Using the independence of the n_{k} and then a Riemann sum argument, we obtain

\mathbb{E}e_{1}(x)^{2} = \sigma^{2}\sum _{k=-K}^{K}\sin^{2}\pi(x-k)\left(\frac{1}{\pi(x-k)}-\frac{1}{N\sin\frac{\pi}{N}(x-k)}\right)^{2} (11)
\leq \sigma^{2}\sum _{k=-K}^{K}\left(\frac{1}{\pi(x-k)}-\frac{1}{N\sin\frac{\pi}{N}(x-k)}\right)^{2} (12)
= \frac{\sigma^{2}}{N}\,\frac{1}{N}\sum _{k=-K}^{K}\left(\frac{1}{\pi(\tilde{x}-\frac{k}{N})}-\frac{1}{\sin\pi(\tilde{x}-\frac{k}{N})}\right)^{2} (13)
\simeq \frac{\sigma^{2}}{N}\int _{-\frac{1}{2}}^{\frac{1}{2}}\left(\frac{1}{\pi(\tilde{x}-t)}-\frac{1}{\sin\pi(\tilde{x}-t)}\right)^{2}dt=\frac{\sigma^{2}}{N}\int _{-\frac{1}{2}-\tilde{x}}^{\frac{1}{2}-\tilde{x}}\left(\frac{1}{\pi u}-\frac{1}{\sin\pi u}\right)^{2}du. (14)

This means that no matter where \tilde{x} lies in the interior of the signal domain, the error between the Shannon and DFT interpolates is O(\frac{1}{\sqrt{N}}). We now estimate how the error grows near the domain boundary. Thus we assume that \tilde{x}\to\frac{1}{2}. Then the above integral is O(\frac{1}{\pi^{2}(\frac{1}{2}-\tilde{x})}). Indeed, the singularities of the integrand at the upper bound (which tends to 0) cancel out, and the singular term introduced when the lower bound tends to -1 is \frac{1}{\sin^{2}\pi u}=\frac{1}{\sin^{2}\pi(u+1)}\sim\frac{1}{\pi^{2}(u+1)^{2}}. In other terms, using \tilde{x}=\frac{x}{2K+1}, the term bounding \mathbb{E}e_{1}(x)^{2} is equivalent to \frac{\sigma^{2}}{\pi^{2}(K-x)}: the bound depends on the distance of x\in[-K,K] to the domain boundary, and the corresponding root mean square error is of order \frac{\sigma}{\pi\sqrt{K-x}}.
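The O(\frac{1}{\sqrt{N}}) interior decay can be verified numerically. The sketch below (ours; \tilde{x} and the sizes are arbitrary) computes the exact variance of e_{1}(x), which by independence of the n_{k} equals the right-hand side of (11), and checks that N\,\mathbb{E}e_{1}(x)^{2} stabilizes:

```python
import numpy as np

# Exact variance of e_1(x) at a fixed relative position x = x_tilde*N
# (sigma = 1): by independence of the n_k it equals
# sum_k sin^2(pi(x-k)) * diff_k^2, and should decay like 1/N.
x_tilde = 0.2
for K in (50, 200, 800):
    N = 2 * K + 1
    x = x_tilde * N
    k = np.arange(-K, K + 1)
    diff = 1 / (np.pi * (x - k)) - 1 / (N * np.sin(np.pi * (x - k) / N))
    var = np.sum(np.sin(np.pi * (x - k)) ** 2 * diff ** 2)
    print(N, N * var)   # N * var should stabilize to a constant
```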

Although this equivalent is correct, the above estimates need some refinement. First, the inequality in (12) should be replaced by a true equivalent; this can be done by keeping the factor \sin^{2}\pi(x-k), which equals \sin^{2}\pi x for every k, instead of bounding it by 1. Second, we have chained two successive equivalence arguments, which should be made rigorous by replacing them with asymptotic expansions.


References

  • [1] P.L. Butzer and W. Splettstosser (1977),
    A sampling theorem for duration-limited functions with error estimates,
    Information and Control 34 (1), pp. 55–65.
  • [2] L. Campbell (1968),
    Sampling theorem for the Fourier transform of a distribution with bounded support,
    SIAM Journal on Applied Mathematics 16 (3), pp. 626–636.
  • [3] D. Jagerman (1966),
    Bounds for truncation error of the sampling expansion,
    SIAM Journal on Applied Mathematics 14 (4), pp. 714–723.
  • [4] S. Ries and V. Splettstößer (1984),
    On the approximation by truncated sampling series expansions,
    Signal Processing 7 (2), pp. 191–197.
  • [5] K. Yao and J. B. Thomas (1966),
    On truncation error bounds for sampling representations of band-limited signals,
    IEEE Transactions on Aerospace and Electronic Systems (6), pp. 640–647.