A Sparsity-Based Approach for Spectral Image Target Detection from Compressive Measurements Acquired by the CASSI Architecture

Hyperspectral imaging requires handling a large amount of multidimensional spectral information. Hyperspectral image acquisition, processing, and storage are computationally and economically expensive and, in most cases, slow processes. In recent years, optical architectures have been developed for acquisition of spectral information in compressed form by using a small set of measurements coded by a spatial modulator. This article formulates a processing scheme that allows the measurements acquired by such compressive sampling systems to be used to perform spectral detection of targets, by adapting traditional detection algorithms for use in the compressive sampling model, and shows that the performance is comparable with that obtained by detection processes without compression.


Introduction
Over the last several decades, the development of optical sensors has facilitated remote sensing analysis with rich spatial, spectral, and temporal information. The increase in the spectral resolution of hyperspectral images (HSI) and infrared sounders has led to new application domains and poses new methodological challenges in data analysis. HSI allows characterization of the objects of interest (e.g., land-cover classes) with unprecedented accuracy and aids in keeping inventories up to date. Furthermore, improvements in spectral resolution have necessitated advances in signal processing and exploitation algorithms [1].
Hyperspectral image classification and target detection are among the most important problems in various scientific disciplines, such as machine learning [2], image processing, and computer vision. Several critical issues should be considered in the classification of hyperspectral data. For instance, the high number of spectral channels and the low number of labeled training samples lead to the curse-of-dimensionality problem (i.e., the Hughes phenomenon [3]) and to the risk of overfitting the training data [4]. A variety of approaches have been proposed to alleviate the problems caused by the high dimensionality of the data, the spatial variability of spectral information, and the high cost of true sample labeling, and to enhance numerical stability [5].
In general, these approaches take advantage of the inherent sparsity of natural signals in a certain basis, whereby the signals can be approximately represented by a few coefficients that carry the most relevant information [6]. Applications of sparse representations in computer vision and pattern recognition can be found in various fields, including motion segmentation [7], image super-resolution [8], image restoration [9], and discriminative tasks such as face recognition [10], iris recognition [11], tumor classification [12], and HSI classification [13]. In these applications, the use of sparsity as a prior condition often leads to state-of-the-art performance. Furthermore, the sparse nature of spectral imagery can be exploited when classifying images acquired using compressive spectral imaging systems, which require fewer measurements than systems with traditional hyperspectral imaging sensors [14], [15].
The coded-aperture snapshot spectral imaging (CASSI) system depicted in Figure 1 is a compressive hyperspectral imager that is used to acquire compressive spectral measurements. The CASSI system simultaneously encodes the spatial and spectral information of a scene in a small set of coded focal plane array (FPA) measurements [16]. The main elements of the CASSI system are the coded apertures, a dispersive element, and the sensor responsible for capturing the energy of the encoded scene. The coded apertures are matrix arrays composed of translucent optical elements that block or unblock the path of light through the system. The dispersive elements (usually prisms or gratings) are responsible for splitting the light into its wavelengths. The quality of the images acquired by the system depends on three main factors: the percentage of translucent elements that allow light through the apertures (commonly known as transmittance), the size of the data cube, and the compression rate [17].
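The acquisition process described above can be illustrated with a minimal sketch. It assumes an idealized single-shot model that is not taken from the paper: a binary coded aperture, a dispersive element that shifts each band horizontally by exactly one detector pixel per band index, and a noiseless detector. The function name `cassi_snapshot` and the toy sizes are hypothetical.

```python
import numpy as np

def cassi_snapshot(cube, coded_aperture):
    """Simulate one idealized CASSI shot for an N x N x L data cube.

    Assumed model: every band is masked by the same binary coded aperture,
    shifted horizontally by its band index, and summed on the detector.
    """
    n_rows, n_cols, n_bands = cube.shape
    detector = np.zeros((n_rows, n_cols + n_bands - 1))
    for band in range(n_bands):
        coded = cube[:, :, band] * coded_aperture   # block or unblock the light
        detector[:, band:band + n_cols] += coded    # disperse, then integrate
    return detector

rng = np.random.default_rng(0)
N, L = 8, 4                                          # toy spatial size and band count
cube = rng.random((N, N, L))
aperture = (rng.random((N, N)) < 0.5).astype(float)  # roughly 50% transmittance
y = cassi_snapshot(cube, aperture)
print(y.shape)  # (8, 11), i.e. N x (N + L - 1)
```

Each shot compresses N·N·L voxels into N·(N + L − 1) detector values, which is why several shots with different coded apertures are typically combined.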
Mathematically, the CASSI projections measured in the i-th shot can be treated as shown in Figure 2 and are described by $\mathbf{y} = \mathbf{H}\mathbf{f} + \boldsymbol{\omega}$, where $\mathbf{f}$ is the vectorized data cube, $\boldsymbol{\omega}$ is the noise of the system, and $\mathbf{H}$ is an $N(N + L - 1) \times N^2 L$ matrix whose structure is determined by the coded aperture entries and the dispersive element effect. For spectrally rich or spatially detailed images, a single-shot FPA measurement is not sufficient to achieve reconstructions of adequate quality, and additional shots are required. The CASSI architecture admits multiple snapshots, each with a different coded aperture pattern, thus yielding a less ill-posed inverse problem and improved signal reconstructions [18]. The set of $k \ll L$ FPA measurements is given by the stacked vector $\mathbf{y} = [\mathbf{y}_1^T, \ldots, \mathbf{y}_k^T]^T$.
Among the main limitations of the CASSI system are the mixture of spectral information with spatial information due to spectral shifting and the way in which the energy is integrated within the detector. In other architectures, only the spectral information is mixed [20]. In addition, the number of spectral bands available is limited by the size of the detector, $N \times (N + L - 1)$. However, CASSI is one of the most broadly studied compressive spectral architectures and has been used in several applications [19], [21], [22], which is why it was selected for this work.
This paper focuses on designing a target detection model that uses compressive measurements to find a sparse representation of image pixels from spectral information-based dictionaries. In addition, an algorithm is implemented that determines whether the evaluated pixel is a target pixel. The proposed algorithm is based on a joint sparsity model, where every pixel $\mathbf{f}_i$ is approximately represented by a few training signatures from the entire training dictionary. This dictionary is composed of sub-dictionaries of the target and background signatures. The sparse vector identifies the atoms in the training dictionary and their associated weights for each pixel, and it can be recovered from the CASSI compressive measurements by solving a sparsity-constrained optimization problem. This process is used to determine whether the observation pixel is a target pixel.

Spectral image target detection using a sparsity model
Traditional spectral target detection based on sparsity takes advantage of the fact that any pixel in an image can be sparsely represented using a trained dictionary $\mathbf{M}$ composed of selected target and background pixels. Considering the spatial correlation between each pixel and its spatial neighborhood, we can model a linear problem as follows:
$$\mathbf{F} = [\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_T] = \mathbf{M}\mathbf{A} \tag{1}$$
where $\mathbf{F}$ is the neighborhood consisting of $T$ pixels and $\mathbf{A}$ is the sparse coefficient matrix of the pixels $\{\mathbf{f}_i\}_{i=1,2,\ldots,T}$ represented in the subspace spanned by $\mathbf{M}$.
With a proper pixel-wise sparse representation, the goal of the detection task is to apply a detection function to each pixel in the image as follows:
$$D(\mathbf{x}) = \|\mathbf{F} - \mathbf{M}_b \mathbf{A}_b\|_F - \|\mathbf{F} - \mathbf{M}_t \mathbf{A}_t\|_F \tag{2}$$
where $\mathbf{F}$ is the neighborhood of the test pixel $\mathbf{x}$, $\mathbf{A}_b$ consists of the first $N_b$ rows of the matrix $\mathbf{A}$, corresponding to the background sub-dictionary $\mathbf{M}_b$, and $\mathbf{A}_t$ consists of the remaining $N_t$ rows of $\mathbf{A}$, which correspond to the target sub-dictionary $\mathbf{M}_t$. If the output $D(\mathbf{x})$ is greater than a fixed threshold, then the test sample is labeled a target; otherwise, it is labeled background. Further details on the estimation of the matrix $\mathbf{A}$ and on sparse representation are presented in previous works [23] and [24].
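The decision rule can be sketched as a difference of reconstruction residuals. The snippet below is an illustrative sketch only: the function name `residual_detector`, the toy dictionaries, and the coefficient values are hypothetical, and the score is assumed to be the background residual minus the target residual, so that larger values favor the target class.

```python
import numpy as np

def residual_detector(F, M_b, M_t, A_b, A_t):
    """Background residual minus target residual.

    F may be a single pixel (length-L vector) or an L x T neighborhood;
    np.linalg.norm gives the l2 norm for vectors and the Frobenius norm
    for matrices.
    """
    r_b = np.linalg.norm(F - M_b @ A_b)   # error using background atoms only
    r_t = np.linalg.norm(F - M_t @ A_t)   # error using target atoms only
    return r_b - r_t

# Toy example with L = 3 bands, N_b = 2 background atoms, N_t = 1 target atom.
M_b = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
M_t = np.array([[0.0], [0.0], [1.0]])
x = np.array([0.0, 0.1, 0.9])             # pixel dominated by the target atom
a_b = np.array([0.0, 0.1])                # background part of the sparse vector
a_t = np.array([0.9])                     # target part of the sparse vector

score = residual_detector(x, M_b, M_t, a_b, a_t)
is_target = score > 0.0                   # the threshold value is an assumption
```

In this toy case the background atoms leave a residual of 0.9 and the target atom leaves 0.1, so the score is positive and the pixel is labeled a target.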

Proposed model
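A principal component transformation can be performed on the structured dictionary $\mathbf{M} \in \mathbb{R}^{L \times N_T}$ such that an orthonormal basis $\boldsymbol{\Psi}$ is obtained from the set of $N_T$ training vectors. From [19], we know that the orthonormal basis $\boldsymbol{\Psi}$ can be formed by the set of eigenvectors of the correlation matrix $\mathbf{C} \in \mathbb{R}^{L \times L}$ given by the following:
$$\mathbf{C} = \frac{1}{N_T}\mathbf{M}\mathbf{M}^T \tag{3}$$
such that $\boldsymbol{\Psi} = \mathbf{V}^T$, where the rows of $\mathbf{V}$ are the eigenvectors of $\mathbf{C}$. Principal component analysis (PCA) is one of the most frequently used approaches for dimensionality reduction and compression in HSI because it preserves the meaningful information of the image in a few of its components. Such a basis transformation was successfully performed for classification in [19]. Thus, the original sparse vector $\boldsymbol{\alpha}_i$ representing the test pixel $\mathbf{f}_i$ can be transformed into a new sparse vector $\boldsymbol{\theta}_i \in \mathbb{R}^{L}$, which represents the pixel in the orthogonal basis $\boldsymbol{\Psi}$. The observation pixel $\mathbf{f}_i$ can now be expressed as:
$$\mathbf{f}_i = \boldsymbol{\Psi}\boldsymbol{\theta}_i \tag{4}$$
Using the sparse representation of the pixels in the training basis $\boldsymbol{\Psi}$, the compressive CASSI measurements can be rewritten as
$$\mathbf{y} = \mathbf{H}\bar{\boldsymbol{\Psi}}\boldsymbol{\theta} + \boldsymbol{\omega} \tag{5}$$
where $\bar{\boldsymbol{\Psi}} = \boldsymbol{\Psi} \otimes \mathbf{I}$, $\mathbf{I}$ is an $N^2 \times N^2$ identity matrix, and $\otimes$ is the Kronecker product operator. Additionally, $\boldsymbol{\theta} = \{\boldsymbol{\theta}_{1,1}, \ldots, \boldsymbol{\theta}_{N,N}\}$ collects the sparse vectors of all pixels. In Eq. (5), $\boldsymbol{\omega}$ is the noise of the system, and $\mathbf{H}$ is the CASSI sensing matrix.
The proposed algorithm first finds an estimate of the sparse vector $\boldsymbol{\theta}$ directly from the FPA measurements $\mathbf{y}$ by solving the sparsity-constrained optimization problem given by
$$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \|\mathbf{y} - \mathbf{H}\bar{\boldsymbol{\Psi}}\boldsymbol{\theta}\|_2^2 + \tau\|\boldsymbol{\theta}\|_1 \tag{6}$$
where $\tau$ is a regularization parameter, the $\ell_1$ norm accounts for the sparsity constraint, and the $\ell_2$ error norm keeps the sparse vector consistent with the CASSI compressive measurements. A variety of algorithms have been used in the literature to solve problems similar to the one stated in Eq. (6), such as the $\ell_1$-regularized least squares solution via the interior point method [25] or, in this case, the gradient projection for sparse reconstruction (GPSR) [26]. Spatial inter-pixel correlation can be included in the optimization problem given in Eq. (6) by replacing the sparsifying basis $\bar{\boldsymbol{\Psi}} = \boldsymbol{\Psi} \otimes \mathbf{I}$ with $\bar{\boldsymbol{\Psi}} = \boldsymbol{\Psi} \otimes \boldsymbol{\Psi}_{2D}^T$, where $\boldsymbol{\Psi}_{2D}^T$ is the 2D wavelet basis dictionary used in [21].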
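The two steps described here, building an orthonormal basis from the training dictionary and recovering a sparse vector from compressive measurements, can be sketched on a toy single-pixel problem. Several parts are assumptions made for illustration and are not the authors' implementation: iterative soft-thresholding (ISTA) is swapped in for GPSR as a simpler solver of the same kind of ℓ1-regularized least squares problem, a random Gaussian matrix stands in for the CASSI sensing matrix, and the names `pca_basis`, `ista`, and `objective` are hypothetical.

```python
import numpy as np

def pca_basis(M):
    """Orthonormal basis whose columns are eigenvectors of the correlation matrix of M."""
    C = M @ M.T / M.shape[1]
    _, eigvecs = np.linalg.eigh(C)      # eigenvalues come back in ascending order
    return eigvecs[:, ::-1]             # reorder columns by decreasing eigenvalue

def ista(A, y, tau, n_iter=500):
    """Minimize 0.5*||y - A theta||_2^2 + tau*||theta||_1 by soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = theta - step * (A.T @ (A @ theta - y))  # gradient step on the l2 term
        theta = np.sign(z) * np.maximum(np.abs(z) - step * tau, 0.0)
    return theta

def objective(A, y, theta, tau):
    return 0.5 * np.sum((y - A @ theta) ** 2) + tau * np.sum(np.abs(theta))

rng = np.random.default_rng(0)
L, n_train, n_meas = 16, 40, 10
M = rng.random((L, 3)) @ rng.random((3, n_train))   # toy dictionary with 3 directions
Psi = pca_basis(M)
f = M[:, 0]                                         # toy test pixel
H = rng.standard_normal((n_meas, L))                # stand-in sensing matrix (assumption)
y = H @ f                                           # noiseless compressive measurements
A = H @ Psi
tau = 1e-3
theta_hat = ista(A, y, tau)
f_hat = Psi @ theta_hat                             # pixel estimate in the spectral domain
```

With a step size of one over the Lipschitz constant, each ISTA iteration does not increase the objective, which makes the sketch easy to sanity-check even though it converges more slowly than GPSR would.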
The sparse target detection model proposed in this paper requires the detection algorithm introduced in Eq. (2) to operate over the subspace described in Eq.

[Algorithm 1. Source: Authors' own elaboration.]
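The target detector is based on a joint sparse model to extract the contextual information in HSI. In particular, it is assumed that pixels of the same material in a region share a common sparsity pattern. Thus, similar neighboring pixels can be sparsely represented by a linear combination of a few shared atoms:
$$\hat{\mathbf{F}} = [\hat{\mathbf{f}}_1, \hat{\mathbf{f}}_2, \ldots, \hat{\mathbf{f}}_T] = \mathbf{M}\bar{\mathbf{A}} \tag{7}$$
where $\hat{\mathbf{F}} \in \mathbb{R}^{L \times T}$ is the representation of the pixels in a spatial neighborhood formed by $T$ neighboring pixels of the i-th test pixel. From the estimated matrix $\hat{\mathbf{F}}$, the joint sparsity problem can be formulated as
$$\min_{\bar{\mathbf{A}}} \|\hat{\mathbf{F}} - \mathbf{M}\bar{\mathbf{A}}\|_F \quad \text{subject to} \quad \|\bar{\mathbf{A}}\|_{\mathrm{row},0} \le K_0 \tag{8}$$
where $\bar{\mathbf{A}}$ is a joint sparse matrix with at most $K_0$ nonzero rows, $K_0$ denotes the sparsity level of $\bar{\mathbf{A}}$, $\|\cdot\|_{\mathrm{row},0}$ counts the nonzero rows, and $\|\cdot\|_F$ denotes the Frobenius norm. Once the sparse matrix $\bar{\mathbf{A}}$ is obtained, the label of the test pixel $\mathbf{f}_i$ is determined using the minimal total residual:
$$D(\mathbf{f}_i) = \|\hat{\mathbf{F}} - \mathbf{M}_b\bar{\mathbf{A}}_b\|_F - \|\hat{\mathbf{F}} - \mathbf{M}_t\bar{\mathbf{A}}_t\|_F \tag{9}$$
where $\bar{\mathbf{A}}_b$ and $\bar{\mathbf{A}}_t$ consist of the $N_b$ and $N_t$ rows of $\bar{\mathbf{A}}$ that are associated with the background and target sub-dictionaries $\mathbf{M}_b$ and $\mathbf{M}_t$, respectively.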
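A joint sparse recovery of this kind is commonly solved greedily. The following minimal sketch uses simultaneous orthogonal matching pursuit (SOMP), the solver named in the simulations section, followed by the residual comparison. It is a simplified textbook-style version, not the authors' code; the function names `somp` and `joint_score`, the identity dictionary, and the toy neighborhood are assumptions.

```python
import numpy as np

def somp(M, F, k0):
    """Greedily select k0 atoms of M that are shared by all columns of F."""
    residual = F.copy()
    support = []
    coef = None
    for _ in range(k0):
        corr = np.linalg.norm(M.T @ residual, axis=1)  # joint correlation per atom
        if support:
            corr[support] = -np.inf                    # never reselect an atom
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(M[:, support], F, rcond=None)
        residual = F - M[:, support] @ coef
    A = np.zeros((M.shape[1], F.shape[1]))
    A[support, :] = coef                               # only k0 rows are nonzero
    return A

def joint_score(M, F, A, n_b):
    """Background residual minus target residual, both in the Frobenius norm."""
    r_b = np.linalg.norm(F - M[:, :n_b] @ A[:n_b, :])
    r_t = np.linalg.norm(F - M[:, n_b:] @ A[n_b:, :])
    return r_b - r_t

# Toy dictionary: 4 background atoms followed by 2 target atoms (L = 6 bands).
M = np.eye(6)
n_b = 4
# Neighborhood of T = 3 pixels built only from the two target atoms.
F = np.zeros((6, 3))
F[4, :] = [1.0, 0.8, 0.9]
F[5, :] = [0.2, 0.3, 0.1]

A_bar = somp(M, F, k0=2)
score = joint_score(M, F, A_bar, n_b)
nonzero_rows = [int(r) for r in np.flatnonzero(np.abs(A_bar).sum(axis=1))]
print(nonzero_rows)  # [4, 5]
```

Because every pixel in the neighborhood is forced to use the same rows of the coefficient matrix, isolated noisy pixels have less influence on the selected atoms than they would in a pixel-by-pixel recovery.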

Computer simulations and results
The performance of the proposed sparsity model in target detection is evaluated from compressive CASSI measurements obtained from two spectral images. The first image is the self-test dataset from the RIT-CIS-DIRT project [27]. The data for this image, shown in Figure 2, were collected as part of a field experiment conducted in July 2006 near the small town of Cooke City, MT, USA. The hyperspectral imagery was collected using the HyMap sensor operated by HyVista, with approximately 3-meter ground resolution. The sensor generates 128 bands across the reflective solar wavelength region of 0.45-2.5 µm, with contiguous spectral coverage (except in the atmospheric water vapor bands) and bandwidths between 15 and 20 nm. A small fabric panel was used as the target, and its reflectance spectra were measured by a Cary 500 spectrophotometer in the laboratory.
The second image is the EO1H0070552014301110PF-SG1-01 spectral image, as shown in Figure 3, collected by the EO-1 Hyperion sensor on October 28, 2014, in the region of Mogotes, Santander, Colombia. The spectral imagery has approximately 30-meter resolution, and only a small patch of the whole image is used in the experiments. The image is composed mostly of cultivated and ready-to-cultivate fields. The sensor is capable of resolving 220 spectral bands (from 0.4 to 2.5 µm) with a 30-meter resolution. In both images, the model of the multishot CASSI system [16] is used to obtain a set of FPA compressive measurements using different numbers of shots corresponding to an approximate percentage of the sensed information of the image. The target detector proposed in Algorithm 1 is used over the F representation of the image. The sparse vector Θ that solves the problem formulated in Eq. (6) is obtained using the GPSR algorithm proposed in [26].
The joint sparsity problem in Eq. (8) is solved using the simultaneous orthogonal matching pursuit (SOMP) algorithm, with a fixed sparsity level $K_0 = 4$ and a joint sparsity neighborhood with $T = 9$. The estimated matrix $\bar{\mathbf{A}}$ is later used to calculate the score matrix D, whose entries indicate how likely each pixel is to be a target, as shown in Eq. (9).
The proposed algorithm is compared with three target detection algorithms for hyperspectral images: the adaptive matched subspace detector (AMSD), orthogonal subspace projection (OSP), and constrained energy minimization (CEM), which are available in the Matlab signal-processing toolbox. These algorithms were applied without compression, using 100% of the spectral data.
The results are analyzed both visually and quantitatively using the receiver operating characteristic (ROC) curves, as shown in Figures 4 and 5. The ROC curve describes the probability of detection (PD) as a function of the probability of false alarms (PFA). To calculate the ROC curve, we pick thousands of thresholds between the minimum and maximum of the detector output. The target or background labels for all pixels in the test region are determined at each threshold. The PFA is the number of false alarms (background pixels labeled as target) divided by the total number of pixels in the test region, and the PD is the number of hits (target pixels labeled as target) divided by the total number of true target pixels.
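The ROC procedure can be written in a few lines. This is a minimal sketch that follows the definitions given in the text, in particular the PFA denominator being the total number of pixels in the test region; the function names `roc_curve` and `auc`, the trapezoidal integration, and the toy scores are assumptions.

```python
def roc_curve(scores, labels, n_thresholds=1000):
    """Return (PFA, PD) lists for thresholds swept from the max to the min score."""
    lo, hi = min(scores), max(scores)
    n_targets = sum(labels)
    n_pixels = len(scores)
    pfa, pd = [], []
    for j in range(n_thresholds, -1, -1):
        thr = lo + (hi - lo) * j / n_thresholds    # last threshold equals lo exactly
        hits = sum(1 for s, t in zip(scores, labels) if s >= thr and t == 1)
        false_alarms = sum(1 for s, t in zip(scores, labels) if s >= thr and t == 0)
        pd.append(hits / n_targets)                # hits over true target pixels
        pfa.append(false_alarms / n_pixels)        # false alarms over all pixels
    return pfa, pd

def auc(pfa, pd):
    """Trapezoidal area under the PD-versus-PFA curve."""
    return sum((pfa[i + 1] - pfa[i]) * (pd[i + 1] + pd[i]) / 2
               for i in range(len(pfa) - 1))

# Toy detector output: two target pixels score higher than three background pixels.
scores = [0.9, 0.8, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 0, 0]                           # 1 = target, 0 = background
pfa, pd = roc_curve(scores, labels)
area = auc(pfa, pd)
```

With this PFA definition the curve ends at the fraction of background pixels rather than at 1, so a perfectly separating detector on the toy data reaches an area of 0.6 rather than 1; normalizing by the number of background pixels instead would rescale the horizontal axis to the usual range.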

RIT-CIS-DIRT dataset: Cooke City
The first spectral image used to test the performance of the classifier is the Cooke City self-test [27]. In the following simulations, the number of bands is reduced to 90 (3rd-46th, 49th, 51st-62nd, 66th, 69th-72nd, and 86th-122nd) by eliminating 38 absorption and low-SNR bands. This image has a spatial resolution of 3 m per pixel and a spatial dimension of 800×200 pixels. The proposed algorithm was tested by varying two specific parameters: the compression level that we expect to achieve and the transmittance level of the coded apertures. The numerical results in Table 1 show the area under the curve (AUC) of the detector under the selected transmittance level and compression rate. The detection results for the proposed algorithm and the comparative results from other detection algorithms are shown in Figure 6. For additional clarity, the ROC curves of the algorithms are displayed in Figure 4. Based on these results, it is reasonable to conclude that the proposed method achieves a performance similar to that of the target detection algorithms used in traditional spectral imaging.

Hyperion image
The spatial dimension used for this spectral image is 32×32. A set of 10 target pixels was taken from the image to be used as training samples. A set of 206 pixels, also taken from the image, was used as test pixels to be labeled as target or background. As in the previous experiments, the GPSR algorithm was used to solve Eq. (6).
The results of the simulations performed on this image using the proposed algorithm are shown in Table 2. As in the first experiment, the numerical results show the AUC under different levels of compression and transmittance. The overall scores above 85% may be explained by the spatial resolution of the image and the spectral correlation.

Conclusion
This work proposes a spectral image target detector that directly labels each spectral pixel as either target or background from a set of compressive CASSI measurements. This detector uses the sparsity of spectral pixels in a given training basis. The sparse vector representing each pixel in the training basis is recovered from the CASSI measurements and is then used to determine whether or not the test pixel is a target pixel. The inter-pixel correlation in HSI is incorporated using a joint sparsity model, where the pixels in a small neighborhood in the test image are represented by a linear combination of a few common training samples weighted with a different set of coefficients for each pixel. The resulting sparse representations are used directly for target detection. The proposed detection method achieves a probability of detection of 98.92% if only 40% of the spectral information is used. A transmittance level between 10% and 30% produces the most accurate results.