Because they yield fewer feature dimensions, filter banks are often used in lightweight full-band speech enhancement models. To further enhance the coarse speech in the sub-band domain, post-filtering for harmonic retrieval is necessary. The signal-processing-based comb filters used in RNNoise and PercepNet have limited performance and may degrade speech quality when the fundamental frequency is estimated inaccurately. To tackle this problem, we propose a learnable comb filter to enhance harmonics. Based on the sub-band model, we design a DNN-based fundamental frequency estimator that estimates discrete fundamental frequencies and a comb filter for harmonic enhancement, both of which are trained end-to-end. Experiments show the advantages of our proposed method over PercepNet and DeepFilterNet.
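For orientation, the kind of fixed, signal-processing comb filter that the abstract contrasts with its learnable version can be sketched as follows. This is a minimal illustration only, not the proposed method: the function name `comb_filter`, the single-tap feed-forward form, the gain of 0.5, and the assumption that the pitch period is known exactly are all choices made here for the example.

```python
import numpy as np

def comb_filter(x, period, gain=0.5):
    """Feed-forward comb filter: y[n] = (x[n] + gain * x[n - period]) / (1 + gain).

    Components that repeat every `period` samples pass unchanged,
    while uncorrelated noise is attenuated by the averaging.
    """
    x = np.asarray(x, dtype=float)
    delayed = np.zeros_like(x)
    delayed[period:] = x[:-period]  # x delayed by one pitch period
    return (x + gain * delayed) / (1.0 + gain)

# Hypothetical test signal: a 200 Hz harmonic tone in white noise at 16 kHz.
fs = 16000
f0 = 200.0
period = int(round(fs / f0))  # pitch period in samples, assumed known here
n = np.arange(fs // 10)
harmonic = np.sin(2 * np.pi * f0 * n / fs) + 0.5 * np.sin(2 * np.pi * 2 * f0 * n / fs)
rng = np.random.default_rng(0)
noise = 0.5 * rng.standard_normal(n.size)

enhanced = comb_filter(harmonic + noise, period)
filtered_noise = comb_filter(noise, period)
```

If the assumed period is wrong, the delayed copy no longer lines up with the harmonics and the filter attenuates them instead, which is the failure mode the abstract attributes to inaccurate fundamental frequency estimation.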
Target speaker information can be used in speech enhancement (SE) models to extract the desired speech more effectively. Previous works introduce the speaker embedding into SE models by means of concatenation or affine transformation. In this paper, we propose a speaker attentive module that calculates attention scores between the speaker embedding and the intermediate features; these scores are then used to rescale the features. By integrating this module into a state-of-the-art SE model, we construct a personalized SE model for the ICASSP Signal Processing Grand Challenge: DNS Challenge 5 (2023). Our system achieves a final score of 0.529 on the blind test set of track 1 and 0.549 on track 2.
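The idea of scoring intermediate features against a speaker embedding and using the scores to rescale them might look roughly like the sketch below. The abstract does not specify the module's design, so everything here is an assumption: the function name `speaker_attentive_rescale`, the projection matrices `w_q` and `w_k`, the scaled dot-product score, the sigmoid, and the per-frame granularity of the scores.

```python
import numpy as np

def speaker_attentive_rescale(features, spk_emb, w_q, w_k):
    """Rescale each frame of `features` by its attention score against `spk_emb`.

    features: (T, C) intermediate features, spk_emb: (D,) speaker embedding,
    w_q: (D, d) and w_k: (C, d) projection matrices (learned in a real model).
    """
    query = spk_emb @ w_q                               # (d,) speaker query
    keys = features @ w_k                               # (T, d) one key per frame
    logits = keys @ query / np.sqrt(query.shape[0])     # (T,) scaled dot products
    scores = 1.0 / (1.0 + np.exp(-logits))              # sigmoid keeps scores in (0, 1)
    return features * scores[:, None], scores

# Hypothetical shapes and random weights, for illustration only.
rng = np.random.default_rng(0)
frames, channels, emb_dim, att_dim = 10, 16, 32, 8
features = rng.standard_normal((frames, channels))
spk_emb = rng.standard_normal(emb_dim)
w_q = rng.standard_normal((emb_dim, att_dim)) / np.sqrt(emb_dim)
w_k = rng.standard_normal((channels, att_dim)) / np.sqrt(channels)

rescaled, scores = speaker_attentive_rescale(features, spk_emb, w_q, w_k)
```

Unlike concatenation or an affine transformation, the rescaling here depends on how each frame relates to the speaker embedding, not on the embedding alone.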
The diversity of terrestrial vascular plants plays a key role in maintaining the stability and productivity of ecosystems. Monitoring species compositional diversity across large spatial scales is challenging and time-consuming. The advanced spectral and spatial specification of the recently launched DESIS (the DLR Earth Sensing Imaging Spectrometer) instrument provides a unique opportunity to test the potential for monitoring plant species diversity with spaceborne hyperspectral data. This study provides a quantitative assessment of the ability of DESIS hyperspectral data to predict plant species richness in two different habitat types in southeast Australia. Spectral features were first extracted from the DESIS spectra, then regressed against on-ground estimates of plant species richness, with a two-fold cross validation scheme to assess the predictive performance. We tested and compared the effectiveness of Principal Component Analysis (PCA), Canonical Correlation Analysis (CCA), and Partial Least Squares analysis (PLS) for feature extraction, and of Kernel Ridge Regression (KRR), Gaussian Process Regression (GPR), and Random Forest Regression (RFR) for species richness prediction. The best prediction results were r=0.76 and RMSE=5.89 for the Southern Tablelands region, and r=0.68 and RMSE=5.95 for the Snowy Mountains region. Relative importance analysis for the DESIS spectral bands showed that the red-edge, red, and blue spectral regions were more important for predicting plant species richness than the green bands and the near-infrared bands beyond red-edge. We also found that the DESIS hyperspectral data performed better than Sentinel-2 multispectral data in the prediction of plant species richness. Our results provide a quantitative reference for future studies exploring the potential of spaceborne hyperspectral data for plant biodiversity mapping.
* To appear in ISPRS Journal of Photogrammetry and Remote Sensing
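Of the regressors named in the DESIS abstract above, kernel ridge regression has a simple closed form, sketched here in NumPy. This is an illustrative sketch under stated assumptions, not the study's implementation: the RBF kernel, the values of `gamma` and `lam`, the helper names, and the synthetic data standing in for extracted spectral features are all chosen for the example.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit(x, y, gamma=1.0, lam=1e-3):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    k = rbf_kernel(x, x, gamma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, gamma=1.0):
    """Predict as a kernel-weighted sum over the training samples."""
    return rbf_kernel(x_new, x_train, gamma) @ alpha

# Synthetic stand-in for (extracted features, species richness) pairs.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(40, 3))
y = np.sin(x[:, 0]) + x[:, 1] ** 2

lam = 1e-3
alpha = krr_fit(x, y, gamma=1.0, lam=lam)
fitted = krr_predict(x, alpha, x, gamma=1.0)
r = np.corrcoef(y, fitted)[0, 1]             # coefficient of correlation
rmse = np.sqrt(np.mean((y - fitted) ** 2))   # root-mean-square error
```

The values of `r` and `rmse` computed here are training-set figures on synthetic data; the abstract's reported scores come from a two-fold cross validation scheme on real plots.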
Diversity of terrestrial plants plays a key role in maintaining a stable, healthy, and productive ecosystem. Though remote sensing has been seen as a promising and cost-effective proxy for estimating plant diversity, there is a lack of quantitative studies on how confidently plant diversity can be inferred from spaceborne hyperspectral data. In this study, we assessed the ability of hyperspectral data captured by the DLR Earth Sensing Imaging Spectrometer (DESIS) to estimate plant species richness in the Southern Tablelands and Snowy Mountains regions in southeast Australia. Spectral features were first extracted from DESIS spectra with principal component analysis, canonical correlation analysis, and partial least squares analysis. Regression was then conducted between the extracted features and plant species richness with ordinary least squares regression, kernel ridge regression, and Gaussian process regression. Results were assessed with the coefficient of correlation ($r$) and Root-Mean-Square Error (RMSE), based on a two-fold cross validation scheme. With the best performing model, $r$ is 0.71 and RMSE is 5.99 for the Southern Tablelands region, while $r$ is 0.62 and RMSE is 6.20 for the Snowy Mountains region. The assessment results reported in this study support future studies on understanding the relationship between spaceborne hyperspectral measurements and terrestrial plant biodiversity.
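The evaluation protocol described above (feature extraction, regression, then $r$ and RMSE under two-fold cross validation) can be sketched for one combination, PCA followed by ordinary least squares. This is a minimal sketch under stated assumptions: the helper names, the number of components, the random fold split, and the synthetic low-rank "spectra" are hypothetical and do not reproduce the study's data or settings.

```python
import numpy as np

def pca_fit(x, n_components):
    """Return the training mean and the top principal directions (as columns)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components].T

def ols_fit(z, y):
    """Ordinary least squares with an intercept term."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def ols_predict(z, coef):
    return coef[0] + z @ coef[1:]

def two_fold_cv(x, y, n_components=5, seed=0):
    """Two-fold cross validation: each half is predicted by a model fit on the other half."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), 2)
    pred = np.empty(len(y), dtype=float)
    for k in range(2):
        test, train = folds[k], folds[1 - k]
        mean, comps = pca_fit(x[train], n_components)      # features fit on training half only
        coef = ols_fit((x[train] - mean) @ comps, y[train])
        pred[test] = ols_predict((x[test] - mean) @ comps, coef)
    r = np.corrcoef(y, pred)[0, 1]
    rmse = np.sqrt(np.mean((y - pred) ** 2))
    return r, rmse

# Synthetic stand-in: 60 plots, 50 bands driven by 3 latent factors.
rng = np.random.default_rng(1)
latent = rng.standard_normal((60, 3))
spectra = latent @ rng.standard_normal((3, 50)) + 0.01 * rng.standard_normal((60, 50))
richness = 20.0 + latent @ np.array([3.0, -2.0, 1.0]) + 0.1 * rng.standard_normal(60)

r, rmse = two_fold_cv(spectra, richness)
```

Fitting the PCA on the training half only, as done inside the loop, keeps information from the held-out plots out of the feature extraction step.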
The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, an SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the number of training samples required to train the classifier for an incoming image. For each incoming image, a rough classifier is first predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned to a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies than two state-of-the-art model transfer algorithms. When training data are insufficient, the proposed SCT-SVM improved the overall classification accuracy of the incoming image from 76.18% to 94.02%, compared with the accuracy obtained without assistance from previous images. These results demonstrate that leveraging a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.
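The two steps described above, predicting a rough classifier from the temporal trend and then fine-tuning it with a few current samples, can be illustrated with a linear classifier. The abstract does not give the actual formulation, so this sketch makes its own assumptions: linear extrapolation of the last two weight vectors as the trend model, and subgradient descent on a hinge loss with a penalty that keeps the weights near the prediction. All names and values are hypothetical.

```python
import numpy as np

def predict_classifier(prev_weights):
    """Extrapolate the next weight vector linearly from the last two classifiers."""
    w_prev2, w_prev1 = prev_weights[-2], prev_weights[-1]
    return w_prev1 + (w_prev1 - w_prev2)

def fine_tune(w_init, x, y, lam=1.0, lr=0.05, steps=200):
    """Subgradient descent on mean hinge loss + (lam / 2) * ||w - w_init||^2.

    x: (n, d) samples with a trailing bias column, y: labels in {-1, +1}.
    """
    w = w_init.copy()
    for _ in range(steps):
        active = y * (x @ w) < 1.0          # samples violating the margin
        grad = lam * (w - w_init)           # pull toward the predicted classifier
        if active.any():
            grad = grad - (y[active, None] * x[active]).sum(axis=0) / len(y)
        w = w - lr * grad
    return w

def accuracy(w, x, y):
    return np.mean(np.sign(x @ w) == y)

# Classifiers from two previous images (weights for two features plus a bias).
previous = [np.array([1.0, 0.0, 0.0]), np.array([1.0, 0.2, 0.0])]
predicted = predict_classifier(previous)

# A small labelled sample from the incoming image (synthetic, for illustration).
rng = np.random.default_rng(0)
x_now = np.column_stack([rng.standard_normal((12, 2)), np.ones(12)])
w_true = np.array([1.0, 0.5, 0.0])
y_now = np.where(x_now @ w_true >= 0.0, 1.0, -1.0)

tuned = fine_tune(predicted, x_now, y_now)
```

Starting from the predicted weights rather than from scratch is what lets a handful of current samples suffice in this sketch; the penalty weight `lam` controls how far fine-tuning may move from the trend-based prediction.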