Unmixing and Target Recognition in Airborne Hyper-Spectral Images

We present two new linear algorithms that perform unmixing in hyper-spectral images and then recognize targets whose spectral signatures are given. The first algorithm is based on the ordered topology of spectral signatures. The second algorithm is based on a linear decomposition of each pixel's neighborhood. The sought-after target can occupy a subpixel or one or more whole pixels. These algorithms combine ideas from algebra and probability theory as well as statistical data mining. Experimental results demonstrate their robustness. This paper is a complementary extension to Averbuch & Zheludev (2012).


Data Representation and Extraction of Spectral Information
We assume that a hyper-spectral signature of a sought-after material is given. In many applications, according to Winter (1999), a fundamental processing task is to automatically identify pixels whose spectra coincide with the given spectral shape (signature). This problem raises the following issues: how is the measured spectrum of a ground material related to a given "pure" spectrum, and how can the two be compared to determine whether they are the same? Spatial and spectral sampling produce a 3D data structure referred to as a data cube. A data cube can be visualized as a stack of images where each plane in the stack represents a single spectral channel (wavelength). The observed spectral radiance data, or the derived surface reflectance data, can be viewed as a scattering of points in the $K$-dimensional Euclidean space $\mathbb{R}^K$, where $K$ is the number of spectral bands (wavelengths). Each spectral band is assigned to one axis, and all the axes are mutually orthogonal. Therefore, the spectrum of each pixel can be viewed as a vector whose Cartesian coordinates $x_i$ are either radiance or reflectance values at each spectral band. Since $x_i \ge 0$, $i = 1, \dots, K$, the spectral vectors lie inside a positive cone in $\mathbb{R}^K$. Changes in the illumination level can change the length of a spectral vector but not its direction, which is related to the shape of the spectrum. When targets are too small to be resolved spatially, or when they are partially obscured or of an unknown shape, as shown in Winter (1999), the detection has to rely on the available spectral information. Unfortunately, a perfect fixed spectrum for any given material does not exist.
In agreement with Winter (1999), spectra of the same material are probably never identical, even in laboratory experiments. This is due to variations in the material surface. The amount of variability is even more profound in remote sensing applications because of variations in atmospheric conditions, sensor noise, material composition, location, surrounding materials and other contributing factors. As a result, the measured spectra that correspond to pixels with the same surface type exhibit an inherent spectral variability that prevents the characterization of homogeneous surface materials by unique spectral signatures.
Another significant complication arises from the interplay between the spatial resolution of the sensor and the spatial variability present in the observed ground scene. According to Winter (1999), a sensor integrates the radiance from all the materials within the ground surface that are "seen" by the sensor as a single image pixel. Therefore, depending on the spatial resolution of the sensor and the distribution of surface materials within each ground resolution cell, the result is a hyper-spectral data cube comprised of "pure" and "mixed" pixels, where a pure pixel contains a single surface material and a mixed pixel contains a superposition of multiple materials.
A linear mixing model is the most widely used spectral mixing model. It assumes that the observed reflectance spectrum of a given pixel is generated by a linear combination of a small number of unique constituent spectra known as endmembers. The model, with its constraints, is defined in the following way (Harsanyi & Chang, 1994):

$$x = \sum_{i=1}^{M} a_i s_i + w, \qquad a_i \ge 0, \quad \sum_{i=1}^{M} a_i = 1, \qquad (1)$$

where $s_1, \dots, s_M$ are the $M$ endmember spectra, which are assumed to be linearly independent, $a_1, \dots, a_M$ are the corresponding abundances (cover material fractions) and $w$ is an additive-noise vector.
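To make the model concrete, the following minimal NumPy sketch synthesizes one mixed pixel under Equation 1 with the nonnegativity and sum-to-one constraints. The function name and the toy five-band spectra are our own illustrative choices, not part of the cited model.

```python
import numpy as np

def mix_pixel(endmembers, abundances, noise_std=0.01, rng=None):
    """Synthesize one pixel under the linear mixing model
    x = sum_i a_i * s_i + w, with a_i >= 0 and sum(a_i) = 1.

    endmembers : (M, K) array, one endmember spectrum s_i per row.
    abundances : (M,) array of cover material fractions a_i.
    """
    endmembers = np.asarray(endmembers, dtype=float)
    a = np.asarray(abundances, dtype=float)
    assert np.all(a >= 0) and np.isclose(a.sum(), 1.0), "abundance constraints"
    rng = np.random.default_rng() if rng is None else rng
    w = rng.normal(0.0, noise_std, size=endmembers.shape[1])  # additive noise
    return a @ endmembers + w

# Example: a pixel that is 70% soil and 30% vegetation (toy 5-band spectra).
soil = np.array([0.30, 0.35, 0.40, 0.45, 0.50])
veg  = np.array([0.05, 0.08, 0.06, 0.40, 0.45])
pixel = mix_pixel(np.vstack([soil, veg]), [0.7, 0.3])
```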

Outline of the Algorithms to Identify Targets with Known Spectra
The new methods in this paper achieve identification of targets with known spectra. Target identification in hyper-spectral images has the following consecutive steps: 1) Finding suspicious points: these are points whose spectra differ, in some norm, from the spectra of the points in their neighborhood. This is also called anomaly detection; 2) Extracting from the suspicious points the spectra of the independent components (unmixing), where one of them is the target whose spectrum fits the given (sought-after) spectrum.
We assume that spectra of different materials are statistically dependent and that the difference between them arises from the behavior of the first and second derivatives in some sections of the spectrum. If the spectra were statistically independent, then the related methods such as Maximum Likelihood (ML) and the geometrical ones (MVT, PPI and N-FINDR) would work well.
The experiments in this paper were performed on three real hyper-spectral datasets, measured as reflectance and titled "desert", "city" and "field", which were acquired by a SPECIM camera (2006) mounted on a plane. Their properties, with a display of one waveband per dataset, are given in Figures 1-3. The paper has the following structure: Section 2 describes the related work. The two algorithms described in this paper are compared with the performance of the orthogonal subspace projection (OSP) algorithm. Section 3 presents an algorithm that identifies the target's spectrum where the target occupies at least a whole pixel. This method assumes that the target's spectrum is distorted by atmospheric conditions and noise. Section 4 presents an unmixing method that is based on neighborhood analysis of each pixel. This method can also be used for detecting a subpixel target. This algorithm contains two parts. In the first part, suspicious points are discovered.
The algorithm is based on the properties of neighborhood morphology and on the properties of the Diffusion Maps (DM) algorithm (Coifman & Lafon, 2006). The second part unmixes the suspicious point. It is based on the application of DM to the linear span of the neighboring background spectra. The appendix describes the Diffusion Maps algorithm for dimensionality reduction.

Related Work
An up-to-date overview of hyper-spectral unmixing is given in Bioucas-Dias & Plaza (2010, 2011). The challenges related to target detection, which is the main focus of this paper, are described in the survey papers Manolakis, Marden, & Shaw (2001) and Manolakis & Shaw (2002). They provide a tutorial review of state-of-the-art target detection algorithms for hyper-spectral imaging (HSI) applications. The main obstacle to effective detection algorithms is the inherent variability of target and background spectra. Adaptive algorithms are effective in solving some of these problems. The solution provided in this paper meets some of the challenges mentioned in Manolakis & Shaw (2002).
In the rest of this section, we divide the many existing algorithms into several groups. We wish to show some trends but do not attempt to cover the avalanche of related work on unmixing and target detection.
Linear approach: Under the linear mixing model, where the number of endmembers and their spectral signatures are known, hyper-spectral unmixing is a linear problem, which can be addressed, for example, by the ML setup (Settle, 1996) and by the constrained least squares approach (Chang, 2003). These methods do not supply sufficiently accurate estimates and do not reflect the physical behavior. The distinction between the spectra of different materials is generally conditioned by the distinction in the behavior of the first and second derivatives, and not by a trend.
Independent component analysis (ICA) is an unsupervised source separation process that finds a linear decomposition of the observed data into statistically independent components (Comon, 1994; Hyvarinen, Karhunen, & Oja, 2001). It has been applied successfully to blind source separation, to feature extraction and to unsupervised recognition, such as in Bayliss, Gualtieri, & Cromp (1997), where the endmember signatures are treated as sources and the mixing matrix is composed of the abundance fractions. Numerous works, including Nascimento & Bioucas-Dias (2005), show that ICA cannot be used to unmix hyper-spectral data.
Geometric approach: Assume a linear mixing scenario where each observed spectral vector is given by

$$r = M \gamma \alpha + n,$$

where $r$ is an $L$-vector ($L$ is the number of bands), $M = [m_1, \dots, m_p]$ is the mixing matrix ($m_i$ denotes the $i$th endmember signature and $p$ is the number of endmembers present in the sensed area), $\gamma$ is a scale factor that models illumination variability due to surface topography, $\alpha = [\alpha_1, \dots, \alpha_p]^T$ is the abundance vector that contains the fractions of each endmember ($T$ denotes a transposed vector) and $n$ is the system's additive noise. Owing to physical constraints, the abundance fractions are nonnegative and satisfy the so-called positivity constraint $\sum_{i=1}^{p} \alpha_i = 1$. Each pixel can be viewed as a vector in an $L$-dimensional Euclidean space, where each channel is assigned to one axis. Since the set $\{\alpha \in \mathbb{R}^p : \alpha_i \ge 0, \ \sum_{i=1}^{p} \alpha_i = 1\}$ is a simplex, the observed vectors also lie in a simplex whose vertices correspond to endmembers.
Several approaches (Ifarraguerri & Chang, 1999; Boardman, 1993; Craig, 1994) exploited this geometric feature of hyper-spectral mixtures. The minimum volume transform (MVT) algorithm (Craig, 1994) determines the simplex of minimal volume that contains the data. The method presented in Bateson, Asner, & Wessman (2000) is also of MVT type but, by introducing the notion of bundles, it takes into account the endmember variability that is usually present in hyper-spectral mixtures.
The MVT-type approaches are computationally complex. Usually, these algorithms first find the convex hull defined by the observed data and then fit a minimum volume simplex to it. Aiming at a lower computational complexity, some algorithms, such as the pixel purity index (PPI) (Boardman, 1993) and N-FINDR (Winter, 1999), instead find the maximum volume simplex that contains the data cloud. They assume the presence of at least one pure pixel of each endmember in the data. This is a strong assumption that may not be true in general. In any case, these algorithms find the set of most of the pure pixels in the data.
Extending subspace approach: A fast unmixing algorithm, termed vertex component analysis (VCA), is described in Nascimento & Bioucas-Dias (2005). The algorithm is unsupervised and utilizes two facts: 1) the endmembers are the vertices of a simplex; 2) the affine transformation of a simplex is also a simplex. It works with both projected and unprojected data. Like the PPI and N-FINDR algorithms, VCA also assumes the presence of pure pixels in the data. The algorithm iteratively projects the data onto a direction orthogonal to the subspace spanned by the endmembers already detected. The new endmember's signature corresponds to the extreme projection. The algorithm iterates until all the endmembers are exhausted. VCA performs much better than PPI and better than or comparably to N-FINDR. Moreover, its computational complexity is between one and two orders of magnitude lower than that of N-FINDR.
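The projection step that drives VCA can be sketched as follows. This is a toy rendering of the idea under stated simplifications (random initial directions, no SNR-dependent preprocessing); it is not the full algorithm of Nascimento & Bioucas-Dias (2005), and the function name is ours.

```python
import numpy as np

def vca_sketch(X, p, rng=None):
    """Simplified vertex component analysis. X is an (N, L) pixels-by-bands
    matrix and p the number of endmembers. Returns the indices of the
    pixels selected as endmember candidates."""
    rng = np.random.default_rng() if rng is None else rng
    N, L = X.shape
    E = np.zeros((L, 0))                     # endmember signatures found so far
    idx = []
    for _ in range(p):
        # draw a direction orthogonal to the span of detected endmembers
        f = rng.standard_normal(L)
        if E.shape[1] > 0:
            Q, _ = np.linalg.qr(E)           # orthonormal basis of span(E)
            f -= Q @ (Q.T @ f)               # project out that subspace
        f /= np.linalg.norm(f)
        proj = np.abs(X @ f)                 # extreme projection -> new vertex
        i = int(np.argmax(proj))
        idx.append(i)
        E = np.column_stack([E, X[i]])
    return idx
```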
If the image is of size approximately 300 × 2000 pixels, then this method, which builds a linear span in each step, is too computationally expensive. In addition, it relies on "pure" spectra, which are not always available.

Statistical methods:
In the statistical framework, spectral unmixing is formulated as a statistical inference problem by adopting a Bayesian methodology where the inference engine is the posterior density of the random objects to be estimated as described for example in Dobigeon, Moussaoui, Coulon, Tourneret, & Hero (2009), Moussaoui, Carteretb, Briea, & Mohammad-Djafaric (2006), Arngren, Schmidt, & Larsen (2009).

Orthogonal Subspace Projection (OSP)
The method of orthogonal subspace projection (OSP) for unmixing and target detection is described in Ahmad & Ul Haq (2011), Ahmad, Ul Haq, & Mushtaq (2011) and Ren & Chang (2003). We will compare our method with the method in Ahmad & Ul Haq (2011), which is currently considered to be very effective. According to the notation in Ahmad & Ul Haq (2011), we are given the dataset $X = SA + W$, where $S$ is the set of pure signatures, $A$ is the matrix of corresponding abundance fractions and $W$ is a white noise matrix. According to the OSP method in Ahmad & Ul Haq (2011), the mixing matrix is found from a decomposition in which $U$ and $\Lambda$ are the singular-vector matrix and the eigenvalue matrix, respectively, of the projection matrix onto the subspace $L$ of the pure signatures, and $(U^T U)^{-1} U^T$ is the pseudo-inverse of $U$. The creation of the subspace $L$ is described in Ren & Chang (2003, p. 1236).
We present the results of target detection by the application of the OSP method with a given target signature $s$ and compare them to our method. The targets in the scene are detected via the application of the OSP method to the multipixels that contain the dominant coefficient from the matrix $A$ corresponding to the target signature $s$.
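As a hedged illustration of the OSP idea, the following sketch builds the projector $P^{\perp} = I - U (U^T U)^{-1} U^T$ onto the orthocomplement of the subspace spanned by the columns of $U$ and scores each pixel against the target signature $s$. The function name, and the assumption that $U$'s columns span the background (undesired) subspace, are ours, not the exact formulation of Ahmad & Ul Haq (2011).

```python
import numpy as np

def osp_detector(X, U, s):
    """Orthogonal subspace projection score for each pixel.

    X : (N, K) pixels-by-bands data.
    U : (K, q) matrix whose columns span the undesired (background) subspace.
    s : (K,) target signature.
    The detector projects out span(U) and correlates the residual with s:
    score(x) = s^T P_perp x, with P_perp = I - U (U^T U)^{-1} U^T.
    """
    K = U.shape[0]
    P_perp = np.eye(K) - U @ np.linalg.pinv(U.T @ U) @ U.T  # annihilates span(U)
    return (X @ P_perp) @ s                  # one detection score per pixel
```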

Linear Classification for Threshold Optimization
According to Cristianini & Shawe-Taylor (2000), a binary classification is frequently performed by using a real-valued function $f : X \subseteq \mathbb{R}^n \to \mathbb{R}$: an input $x$ is assigned to the positive class if $f(x) \ge 0$ and, otherwise, to the negative class. We consider the case where $f(x)$ is a linear function of $x$ with the parameters $w$ and $b$, such that $f(x) = \langle w, x \rangle + b$. A training set is a collection of training examples

$$S = \{(x_1, y_1), \dots, (x_l, y_l)\}, \qquad (3)$$

where $l$ is the number of examples, $x_i \in X$ are the inputs and $Y = \{-1, 1\}$ is the output domain.
Rosenblatt's Perceptron algorithm (Cristianini & Shawe-Taylor, 2000, p. 12; Burges, 1998, p. 8) creates a hyperplane $\langle w, x \rangle + b = 0$ with respect to a training set $S$. It creates the best linear separation between positive and negative examples via minimization of a measurement function of the "margin" distribution.
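A minimal sketch of the classical mistake-driven perceptron update follows; the learning rate `eta` and the epoch cap are illustrative choices, not parameters fixed by the cited texts.

```python
import numpy as np

def perceptron(X, y, epochs=100, eta=1.0):
    """Rosenblatt's perceptron: learn (w, b) with y_i (<w, x_i> + b) > 0
    for all training examples, provided the data are linearly separable.

    X : (l, n) training inputs; y : (l,) labels in {-1, +1}.
    """
    l, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:       # misclassified -> update
                w += eta * yi * xi
                b += eta * yi
                mistakes += 1
        if mistakes == 0:                    # converged: separating hyperplane
            break
    return w, b
```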
The perceptron algorithm is guaranteed to converge only if the training data are linearly separable. A procedure that does not suffer from this limitation is Linear Discriminant Analysis (LDA) via Fisher's discriminant functional (Cristianini & Shawe-Taylor, 2000). The aim is to find the hyperplane $(w, b)$ on which the projection of the data is maximally separated. The cost function (Fisher's function) to be optimized is

$$F(w) = \frac{(m_{1} - m_{-1})^2}{\sigma_{1}^2 + \sigma_{-1}^2},$$

where $m_i$ and $\sigma_i$ are the mean and the standard deviation, respectively, of the function output values on the positive ($i = 1$) and negative ($i = -1$) examples.

Definition 2.2. (Cristianini & Shawe-Taylor, 2000) The dataset $S$ from Equation 3 is linearly separable if the hyperplane $\langle w, x \rangle + b = 0$, obtained via the LDA algorithm (Cristianini & Shawe-Taylor, 2000), correctly classifies the training data, that is, $y_i (\langle w, x_i \rangle + b) > 0$, $i = 1, \dots, l$. In this case, the absolute value of $b$ is the separation threshold.
Suppose that we have a set of $n$ samples. First, we want to partition the data into exactly two disjoint subsets $S_1$ and $S_{-1}$, where each subset represents a cluster. The solution is based on the K-means algorithm (Duda, Hart, & Stork, 2001). K-means maximizes a function $J(e)$, where $e$ is a partition; the value of $J(e)$ depends on how the samples are grouped into clusters and on the number of clusters (see Duda, Hart, & Stork, 2001). The criterion is built from the "within-cluster scatter matrix" (Duda, Hart, & Stork, 2001)

$$S_W = \sum_{i=1}^{l} \sum_{x \in S_i} (x - m_i)(x - m_i)^T,$$

where $l$ is the number of classes, $S_i$ are the classes and $m_i$ is the center of each class, and from the "between-cluster scatter matrix" (Duda, Hart, & Stork, 2001)

$$S_B = \sum_{i=1}^{l} |S_i| \, (m_i - m)(m_i - m)^T,$$

where $|S_i|$ is the cardinality of a class and $m$ is the center of the whole dataset.
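The two scatter matrices can be computed directly from a labeled partition, as in the following sketch; the function name is ours, while the formulas follow Duda, Hart, & Stork (2001).

```python
import numpy as np

def scatter_matrices(X, labels):
    """Within-cluster scatter S_W and between-cluster scatter S_B
    (Duda, Hart, & Stork, 2001) for a labeled partition of X (N, n)."""
    m = X.mean(axis=0)                       # center of the whole dataset
    n = X.shape[1]
    S_W = np.zeros((n, n))
    S_B = np.zeros((n, n))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)                 # center of class c
        D = Xc - mc
        S_W += D.T @ D                       # spread inside the class
        d = (mc - m)[:, None]
        S_B += len(Xc) * (d @ d.T)           # spread of the class centers
    return S_W, S_B
```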

When is a dataset separable? One criterion, using the notation of Equation 4, compares the diameters of the clusters, where the diameter of a set is defined as $\mathrm{diam}(S) = \max_{x, y \in S} \|x - y\|$. Another criterion is $J(e_2) > J(e_1)$, where $e_1$ is the partition into a single class and $e_2$ is the best partition into two classes. If neither criterion is satisfied, then the dataset is inseparable and Fisher's separation is incorrect.

Method I: Weak Dependency Recognition (WDR) of Targets That Occupy One or More Pixels
We assume that a target occupies one or more pixels. The process that determines whether a given target's spectrum and the spectrum of the current pixel are dependent is described next.
Let $T$ be a given target's spectrum and let $P$ be the pixel's spectrum. We assume that the spectra $T$ and $P$ are discrete vectors and, in general, that they are normalized and centralized. The following hypotheses are assumed:

$H_0$: $T$ and $P$ are weakly dependent.

$H_1$: $T$ and $P$ are not weakly dependent.

Hypotheses Check
We find an orthogonal transformation $\Pi$ that permutes the coordinates of $T$ into decreasing order. This permutation $\Pi$ is applied to both $P$ and $T$, yielding $T_1 = \Pi T$ and $P_1 = \Pi P$, where $T_1$ is monotonic. If $H_0$ holds, which means that $T$ and $P$ are weakly dependent, then the values of $P_1$ are either monotonically decreasing or increasing and the first and second derivatives of $P_1$ are close to zero; see Figure 4 (left). Otherwise, $H_1$ holds and $P_1$ has an oscillatory behavior; see Figure 4 (right). In that case, $P_1$ has a subset of coordinates whose first and second derivatives behave in an oscillatory way; see Figure 4 (right). If the permutation of the coordinates of $P$ makes their values decrease or increase monotonically, then the first and second derivatives of $P$ have a minimal norm. This is another criterion for deciding whether weak dependency holds.
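The criterion can be phrased as a small score computation: reorder $P$ by the permutation that sorts $T$ decreasingly and measure the norm of the discrete second derivative of the reordered vector. The sketch below is a minimal rendering of this test; the score function and its name are ours, and the decision threshold would come from Fisher's separation as described above.

```python
import numpy as np

def weak_dependency_score(T, P):
    """Order the coordinates of T decreasingly, apply the same permutation
    to P, and measure how oscillatory the reordered P is via the norm of
    its discrete second derivative. A small score supports H0 (weak
    dependency); a large score supports H1.

    Both spectra are assumed normalized and centralized, as in the text.
    """
    perm = np.argsort(T)[::-1]               # permutation sorting T decreasingly
    P1 = np.asarray(P, dtype=float)[perm]
    second_deriv = np.diff(P1, n=2)          # discrete second derivative of P1
    return np.linalg.norm(second_deriv)
```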
Definition 3.2. Let $\Pi$ be an orthogonal transformation that permutes the coordinates of $T$ into decreasing order. Denote the second derivative of a vector $X$ by $X''$ and define the mapping $\Phi(X) = \|(\Pi X)''\|$. Let $\mathcal{X}$ be the dataset of spectra from all the pixels in the scene and denote $\Sigma = \{\Phi(X) : X \in \mathcal{X}\}$. The set $\Sigma$ can be classified into one of two cases: 1) $\Sigma$ is separable according to definition 2.5; 2) $\Sigma$ is inseparable according to definition 2.5.
In the first case, $(w, b)$ is the best separation for the set $\Sigma$ according to definition 2.4 and $b$ is the Fisher's threshold for this separation; the subset of $\Sigma$ separated by the threshold is the set of targets. In the other case, there are no targets in the scene.
The flow of the WDR algorithm is given in Figure 5.

Experimental Results
Figures 6-8 display the results after the application of the algorithm in section 3.1 to the "desert" image (Figure 1). The yellow lines mark the neighborhoods of the detected targets. The point $P_1$ in Figure 8 is the pattern of the known target's material. Its spectrum is displayed in Figure 4 as the plot labeled "target". The other spectra plots, which were detected by the WDR algorithm in the scenes of Figures 6-8, are classified as "spectra of suspicious points".
The set $\hat{S}$ can be in one of two cases: 1) $\hat{S}$ is inseparable according to definition 2.5, which means that the pixels that are correlated with the target are inseparable from the other pixels; 2) $\hat{S}$ is separable according to definition 2.5, which means that the pixels that are correlated with the target are separated from the other pixels.
If we are in case 1, then $Y$ is not a suspicious point. If we are in case 2, assume that $\Omega$ is the cluster closest to 1. According to definition 2.4, $(w, b)$ provides the best separation; it separates the set $\Omega$ from the other points, where $b$ is the Fisher's threshold for this separation. If the set $\Omega$ consists of two or more connected components, then $Y$ is not a suspicious point. If $Y \notin \Omega$, then $Y$ is also not a suspicious point. Otherwise, $H_1$ holds; in other words, if $Y$ is a suspicious point, then $\Omega$ is a set of pixels that intersects with the target, and this set of correlated points is concentrated around the central point $Y$.
Here and below, we assume that a correlated point is a pixel whose d-spectrum is correlated with $d(Y)$ with a correlation coefficient that is greater than the Fisher's threshold $b$.
Let $N_1$ be the internal neighborhood square and let $N_2$ be the external square; they are visualized in Figure 21. The d-spectra of the pixels in the neighborhood form vectors $v_{i,j}$; the set of all these vectors is denoted by $V$. In order to derive the d-spectrum of some material in the central pixel, the background around the central pixel has to be removed. For that, we construct an orthogonal projection $Q$ that projects all the d-spectra onto the orthocomplement of the linear span in which the background d-spectra are located. If the d-spectrum of the central pixel, $d(Y)$, does not belong to this linear span, then this projection extracts an orthogonal component of $d(Y)$ that does not mix with the background d-spectra. For example, if $d(Y) = d_1 + d_2$, where $d_1$ belongs to the linear span generated by the background d-spectra and $d_2$ belongs to the orthocomplement of this span, then after the projection we obtain $Q\,d(Y) = d_2$, which does not correlate with the background d-spectra. Hence, the background influence is removed by this projection. Now we formalize the above. Assume that the matrix $E$ is associated with the vectors $v_1, \dots, v_s$ and that $T_e$ is the Fisher's threshold that separates the big from the small absolute values of the eigenvalues of the matrix $E$. In some cases, $T_e$ can separate the zero from the nonzero eigenvalues. The eigenvectors associated with the eigenvalues that are smaller than $T_e$ generate the eigensubspace that is the orthocomplement of the linear span of the principal directions of the set $V$. Denote this orthocomplement by $C$.
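A sketch of this construction follows. We assume, for illustration, that the matrix $E$ associated with $v_1, \dots, v_s$ is the Gram-type matrix $V^T V$; the paper does not spell this choice out, and the threshold $T_e$ is passed in directly rather than computed by Fisher's separation.

```python
import numpy as np

def background_projector(V, eig_threshold):
    """Build the projector Q onto the orthocomplement C of the principal
    directions of the background d-spectra.

    V : (s, K) matrix whose rows are the background d-spectra v_1..v_s.
    eig_threshold : separates "big" from "small" eigenvalue magnitudes
    (standing in for the Fisher's threshold T_e of the text).
    """
    E = V.T @ V                              # assumed matrix associated with V
    eigvals, eigvecs = np.linalg.eigh(E)
    small = np.abs(eigvals) < eig_threshold  # directions outside the span
    C = eigvecs[:, small]                    # basis of the orthocomplement
    return C @ C.T                           # orthogonal projector onto C

# Applying the projector to d(Y) suppresses the component lying in the
# background span, leaving the part d2 that may carry a target.
```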
Throughout this paper, we assume that in our model the spectrum of any pixel $X$ consists of three components: 1) the spectrum of a material $M$ that is different from its background; 2) the spectrum of the background, generated as a linear combination of spectra of pixels from the $X$-neighborhood; 3) random noise.
The same model holds for the d-spectra: the background component of $d(Y)$ is a linear combination of the vectors $v_1, \dots, v_s$. If the correlated points concentrate around $Y$, then these points consist of the same material as $Y$. If the uncorrelated points do not contain this material, then they belong to the background. Consider the orthogonal projection operator $Q$ that projects vectors onto the orthocomplement $C$. The projection of the background component is approximately the zero vector. Thus, this orthogonal projection removes the influence of the background from the d-spectrum $d(Y)$.

Detection of Outliers within a Single Testing Cube
In section 4.1, we presented how to detect suspicious points. An alternative detection method uses dimensionality reduction by the application of the Diffusion Maps (DM) algorithm (Coifman & Lafon, 2006) and a nearest-neighbor scheme. DM is a non-linear algorithm for dimensionality reduction.
Assume we are given a data cube D of size $X \times Y \times Z$, where $X$ and $Y$ are the spatial dimensions and $Z$ is the number of wavebands. We define a small testing cube d that is included in the hyper-spectral data cube D.

Dimensionality Reduction by DM Application
Assume that a sliding testing cube d, pointed to by the arrows in Figure 22, moves so as to cover a different fragment of the data cube D (Figure 2) at each step; section 4.3 describes in detail how the testing cube d moves. Each of the $N$ data points in the testing cube is a vector of length $Z$. Typically, $R$ is in the range 3-5 and is determined by the magnitudes of the corresponding eigenvalues: $R$ is the number of essential eigenvalues of the covariance matrix, determined as explained in Coifman & Lafon (2006). Figure 23 displays the embedding of the data from four positions of the sliding testing cube; these are the embeddings onto the three major eigenvectors of the covariance matrices. We observe that the overwhelming majority of the embedded data points form a dense cloud while a few outliers are present. An outlier can be a single point that lies far away from the rest of the points or, more frequently, a small group of points that are located close to each other but far away from the majority of the cloud. This reflects the situation where an optional target occupies an area of between one and several pixels (or even a subpixel). These single or grouped outliers are detected as explained in the next section.
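The embedding step can be sketched as follows: build a Gaussian affinity kernel on the rows of M, normalize it into a row-stochastic diffusion matrix and keep the leading non-trivial eigenvectors. This is a minimal rendering of Coifman & Lafon (2006); the median-based kernel scale is an assumed heuristic, not the paper's prescription.

```python
import numpy as np

def diffusion_embedding(M, r=3, eps=None):
    """Embed the N x Z data matrix M into its r leading non-trivial
    diffusion coordinates (a minimal sketch of Coifman & Lafon, 2006)."""
    D2 = ((M[:, None, :] - M[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    if eps is None:
        eps = np.median(D2)                  # common heuristic kernel scale
    W = np.exp(-D2 / eps)                    # Gaussian affinity kernel
    P = W / W.sum(axis=1, keepdims=True)     # row-stochastic diffusion matrix
    eigvals, eigvecs = np.linalg.eig(P)
    order = np.argsort(-eigvals.real)        # sort by decaying eigenvalue
    # skip the trivial constant eigenvector (eigenvalue 1)
    sel = order[1:r + 1]
    coords = eigvecs[:, sel].real * eigvals[sel].real
    return coords                            # (N, r) embedded data points
```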

Detection of Grouped Outliers
Assume we are looking for groups of outliers that consist of no more than $K$ members. This is done by the following steps: 1) For each row $P_i$, $i = 1, \dots, N$, compute its distances to all the other rows and sort them in ascending order; 2) Form the matrix of sorted distances. A point is a single outlier when its distance to its first two nearest neighbors is greater than the respective distances of all the other points. However, it may happen that some point lies close to another point while all the others are far apart; such a pair can be interpreted as a pairwise outlier, indicated by the maximum of the corresponding sorted distances, and in this case we add the pair of points as a pairwise outlier. Grouped outliers that may contain up to $K$ members are found in the same manner, as sketched after this paragraph. We emphasize that, once the upper limit $K$ is given, the number $L$ of group members is determined automatically, depending on the data within the sliding testing cube d. Figure 24 illustrates the detected grouped outliers in the 3-dimensional space of eigenvectors of the data from four positions of the sliding testing cube.
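A simplified version of this search is sketched below: it flags points whose distance to their $K$-th nearest neighbor in the embedded space is far above the typical value. The fixed cutoff `factor` is an assumed stand-in for the automatic, data-driven choice of group size described above.

```python
import numpy as np

def grouped_outliers(coords, K, factor=3.0):
    """Flag groups of at most K outliers in the embedded coordinates.

    For each point, the distance to its K-th nearest neighbor is compared
    with the median such distance over all points; points far above it are
    reported as (possibly grouped) outliers.
    """
    D = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
    D_sorted = np.sort(D, axis=1)            # column 0 is the zero self-distance
    kth = D_sorted[:, K]                     # distance to the K-th neighbor
    cutoff = factor * np.median(kth)         # assumed fixed cutoff
    return np.flatnonzero(kth > cutoff)      # indices of suspected outliers
```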

Detection of Singular Points within the Whole Data Cube
In the section on outliers detection, we described how to find a group of data points (multipixels), within one sliding testing cube, whose geometry differs from the geometry of the majority of the data points. The outliers list of each new cube position is appended to the combined list $\Lambda = \Lambda_1 \cup \Lambda_2 \cup \dots$, and the common points gain weights. We proceed with the right shifts until the right edge of the data cube D is reached. Then, the sliding testing cube slides down by $v/4$ and starts shifting back to the left, and so on. As a result, we get a combined list $\Lambda$. It is important that each point $P_i$ in the list $\Lambda$ is supplied with a weight $w_i$, which can range from 1 to more than 40. The weight $w_i$ can serve as a singularity measure for the point $P_i$: a large weight $w_i$ reflects the fact that the point $P_i$ is singular for a big number of overlapping sliding testing cubes. Thus, it can be regarded as a strong singular point in the data cube D, and vice versa. Let $Y$ be a suspicious point and let $T$ be the given target's spectrum. What portion of the target is contained in $Y$?
We consider a simplified version of Equation 1 via the definition of a simple mixing model that describes the relation between a target and its background. Assume $P$ is a pixel with a mixed spectrum (a spectrum that contains the background influence and the target) and $T$ is the given target's spectrum. Consider three spectra: an average background spectrum $B$, a mixed pixel spectrum (the spectrum of a suspicious point) $P$ and the target's spectrum $T$. They are related by the following model:

$$P = tT + (1 - t)B, \qquad 0 \le t \le 1. \qquad (6)$$

The background spectrum $B$ is taken from the neighborhood pixels; therefore, all of them are close to each other and have similar features.
We are given the target's spectrum $T$ and the mixed pixel spectrum $P$. Our goal is to estimate $t$; the estimate, denoted by $\hat{t}$, should satisfy Equation 6, provided that $B$ and $T$ have some independent features. Once $\hat{t}$ is found, the estimate of the unknown background spectrum $B$, denoted by $\hat{B}$, is calculated from Equation 6. Estimating the parameter $t$ in Equation 6 is called linear unmixing.
In Step 2 of Section 4.1, we calculated the following: $V$, the set of d-spectra of pixels from the $m$-neighborhood of $Y$ that are uncorrelated with $d(Y)$, and $Q$, the projection operator onto the orthocomplement of the linear span of $V$. Applying $Q$ to Equation 6, and using the fact that $Q$ approximately annihilates the background, gives $P' \approx t\,T'$, where $P' = Q(P)$ and $T' = Q(T)$.
The fact that two vectors $X_1$ and $X_2$ are independent is equivalent to $E[\phi(X_1)\psi(X_2)] = E[\phi(X_1)]\,E[\psi(X_2)]$ for any analytical functions $\phi, \psi$ (Hyvarinen, Karhunen, & Oja, 2001). An analytical function can be represented by a Taylor expansion in the powers of its argument. This yields the condition $\mathrm{corr}(X_1^n, X_2^n) = 0$ for any positive integer $n$, where $n$ denotes a power. In our algorithm, we limit ourselves to $n = 1, 2, 3, 4$. From this independency criterion between the two vectors $X_1$ and $X_2$, we obtain an expression that equals zero when $X_1$ and $X_2$ are independent. Once $\hat{t}$ is estimated, the background estimate $\hat{B}$ follows from Equation 6, where $P$ is the spectrum of the suspicious point and $B$ is a mix of the background spectra from the neighborhood, affected by noise.
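A possible rendering of this estimation is a grid search over $t$ that drives the moment correlations of the residual toward zero. The grid-search solver and the helper names below are ours, as the paper does not specify the optimization method.

```python
import numpy as np

def _corr(a, b):
    """Pearson correlation coefficient of two vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def estimate_t(P, T, grid=np.linspace(0.0, 1.0, 1001)):
    """Estimate the target fraction t in P = tT + (1 - t)B (Equation 6)
    by scanning t and minimizing the moment correlations
    corr((P - tT)^n, T^n), n = 1..4. Since P - tT = (1 - t)B, these
    correlations vanish when the background B is independent of T."""
    def cost(t):
        B = P - t * T                        # residual, proportional to B
        return sum(_corr(B ** n, T ** n) ** 2 for n in (1, 2, 3, 4))
    costs = [cost(t) for t in grid]
    return grid[int(np.argmin(costs))]
```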
The flow of the UNSP algorithm is given in Figure 30.

Experimental Results
In this section, we consider two scenes, "field" (Figure 3) and "city" (Figure 2), that contain subpixel targets. As a first step, we find all the suspicious points via the application of the anomaly detection process (section 4.2). The next step checks each anomaly with the "morphological filter" described in section 4.1. If the pixel passes the "morphological filter", then the target is present in it.
Figures 31 and 32 present the outputs from the application of the "morphological-filter" algorithm to two different hyper-spectral scenarios. The green line corresponds to the WDR method.

Conclusions
We presented two algorithms for linear unmixing. The WDR algorithm detects well targets that occupy at least one pixel but fails to detect sub-pixel targets. The UNSP algorithm detects sub-pixel targets well, but it is computationally expensive due to the need to search for the spectral decomposition in each pixel's neighborhood by sliding the "morphology filter". In the future, we plan to augment these algorithms with a classification method based on machine learning methodologies.

Figure 1.
Figure 1. The dataset "desert" is a hyper-spectral image of a desert place taken from an airplane flying 10,000 feet above sea level. The resolution is 1.3 meter/pixel, 286 × 2640 pixels per waveband, with 168 wavebands.

Here $w$ and $b$ are the parameters that control the function, and the decision rule is given by $h(x) = \mathrm{sgn}(f(x))$, where $w$ is the weight vector and $b$ is the threshold.

Definition 2.1. (Cristianini & Shawe-Taylor, 2000) A training set is a collection of training examples (data) $S = \{(x_1, y_1), \dots, (x_l, y_l)\}$.

Definition 2.4. Let $(w, b)$ be the best separation for the set, obtained via the K-means and Fisher's discriminant analyses (Cristianini & Shawe-Taylor, 2000; Burges, 1998). $(w, b)$ is called the Fisher's separation and $b$ the Fisher's threshold for the data.

Definition 3.1.
Two discrete functions $Y_1$ and $Y_2$ are weakly dependent if there exists a permutation $\Pi$ of the coordinates that provides a monotonic order for the values of both $\Pi Y_1$ and $\Pi Y_2$.

Figure 4.
Figure 4. The x- and y-axes are the wavebands and their reflectance values, respectively. The spectra are shown after the application of the permutation of the coordinates that permutes $T$ into a monotonically decreasing order. Left: weak dependency between $T$ and $P$. Right: no weak dependency between $T$ and $P$

Figure 5.
Figure 5. The flow of the WDR algorithm

Figure 6.
Figure 6. Left: one wavelength part of the original "desert" image (Figure 1). Right: the white points mark the detected targets. The intensity of each pixel on the right side corresponds to the value $\Phi(X)$, where $X$ is the spectrum of the current pixel.

Figure 20. The indices of a pixel

Figure 21.
Figure 21. $N_1$ is the internal square and $N_2$ is the external square

Let $T' = Q(T)$ be the projected d-spectrum of the given target. If the correlation coefficient of $P'$ and $T'$ is greater than the Fisher's threshold, then $Y$ is a suspicious point and $M$ is the target.

Figure 22.
Figure 22. An urban scene of size 294 × 501 (from the "city" image in Figure 2) with different locations of the sliding testing cube d; the arrows point to these locations.

Each data point in the testing cube is a vector of length $Z$. We arrange these data points into a matrix M of size $N \times Z$. The next step applies the DM algorithm (see the appendix for its description) to the matrix M. It reduces the dimensionality of the data vectors by embedding them into the main eigenvectors of the covariance matrix of the data M. This projection reveals the geometrical structure of the data and facilitates the search for singular (abnormal) data points. The data matrix M of size $N \times Z$ is mapped onto the eigenvectors of the matrix P of size $N \times R$, $R \ll Z$.

Figure 23.
Figure 23. Embedding of the data from different positions of the sliding testing cube on the image in Figure 2 onto the three major eigenvectors of the diffusion matrix

Figure 24.
Figure 24. Detection of grouped outliers in data from different positions of the sliding testing cube embedded in the diffusion space

Let $\Lambda_1$ be the list of such data points in the sliding testing cube $d_1$ of size $v \times h \times Z$ located in the upper left corner of the data cube D, as illustrated by the arrow in Figure 22. The next testing cube $d_2$ is obtained by a right shift by $\delta$; let $\Lambda_2$ be the list of outliers in the cube $d_2$. Append the list $\Lambda_2$ to $\Lambda_1$. Because of the vast overlap between the cubes $d_2$ and $d_1$, some outlier data points can be common to the lists $\Lambda_2$ and $\Lambda_1$; in the united list, these points gain the weight 2. The next right shift produces the sliding testing cube $d_3$ with its outliers list $\Lambda_3$, and so on. The result is a combined list $\Lambda = \Lambda_1 \cup \dots \cup \Lambda_R$ of outliers, where $R$ is the number of jumps of the testing cube d within the data cube D. Figure 22 illustrates a route of the cube d on the data cube D.
Figure 25 illustrates the distribution of the weighted singular points around the data cube $D_U$ of size 500 × 294 × 64 from the urban scene displayed in Figure 22, whose source is Figure 2.

Figure 25.
Figure 25. Distribution of the weighted singular points around the data cube $D_U$. Left: all the singular points. Right: singular points whose weights exceed 12

Figure 29.
Figure 29. A singular point $P(242, 202)$. Left: the vicinity of the point P. Right: multipixel spectra at the point $P(242, 202)$ and the surrounding points. The weight of the data point P is 32

Figure 30.
Figure 30. The flow of the UNSP algorithm

Figure 31.
Figure 31. Left: the source image (Figure 2). Right: the white points are the suspicious points in the neighborhood of diameter $m = 10$