Classification and Mapping of Plant Communities Using Multi-Temporal and Multi-Spectral Satellite Images

Classification and mapping of plant communities is an essential step for the conservation and management of ecosystems and biodiversity. We adopt the Genus-Physiognomy-Ecosystem (GPE) system developed in a previous study for satellite-based classification of plant communities at a broad scale. This paper assesses the potential of multi-spectral and multi-temporal images collected by the Sentinel-2 satellites for the classification and mapping of GPE types. The research was conducted at seven representative study sites in different climatic regions, comprising one warm-temperate site (Aya) and six cool-temperate sites (Hakkoda, Zao, Oze, Shirakami, Kitakami, and Shiranuka). The GPE types were enumerated at all study sites, and ground truth data were collected with reference to extant vegetation surveys, visual interpretation of high-resolution images, and onsite field observations. We acquired all Sentinel-2 Level-1C product images available for the study sites between 2017 and 2019 and generated monthly median composite images consisting of ten spectral bands and twelve spectral indices. The Gradient Boosting Decision Trees (GBDT) classifier was employed for supervised classification of the satellite data with the support of the ground truth data. The cross-validation accuracy in terms of the kappa coefficient varied from 87% at the Oze site with 41 GPE types to 95% at the Hakkoda site with 19 GPE types, with an average of 91% across all sites. The GPE maps produced in this research showed a clear distribution of plant communities at all seven sites, highlighting the potential of Sentinel-2 multi-spectral and multi-temporal images, combined with the GPE classification system, for operational and broad-scale mapping of plant communities.


Introduction
Classification and mapping of plant communities is an essential step for the conservation and management of ecosystems and biodiversity. In recent years, the availability of free and open access data, high performance computing, and automated data processing and analysis capabilities has brought new opportunities for the classification and mapping of plant communities from remotely sensed images (Murakami and Mochizuki, 2014; Wulder, 2018). In contrast to potential natural vegetation mapping, which is based on climatic parameters available at coarse spatial resolution (Hengl et al., 2018), actual vegetation mapping (Bredenkamp et al., 1998; Su et al., 2020) with recently available satellite images can provide much more detailed information at higher spatial resolution and thereby improve knowledge of plant communities.
Japan has a wide variety of land cover and vegetation types, ranging from Southern Subtropical Forests to Northern Arctic Meadows (Numata et al., 1972; Miyawaki, 1984; Himiyama, 1998). Nationwide vegetation surveys have been conducted since 1973, and plant communities have been enumerated. The first vegetation survey of the entire country was completed in 1999 with the production of vegetation survey maps at 1:50,000 scale (MoE and AAS, 1999). Since 1999, extensive field surveys have been repeated and a 1:25,000 scale vegetation survey map is being produced nationwide (Hioki, 2007). The vegetation survey organizes plant communities into phyto-sociological units (Miyawaki, 1968; Ohno, 2006). The plant communities are recognized through field observations and delineated geographically via a manual procedure facilitated by visual interpretation of aerial and satellite images. This manual delineation depends on human judgment and is laborious and costly. A more automated approach is therefore needed to address these issues.
The major objective of this paper is to assess the potential of multi-spectral and multi-temporal images available from the Sentinel-2 mission satellites (Drusch et al., 2012) for operational and broad-scale mapping of land cover and plant community types by adopting the Genus-Physiognomy-Ecosystem (GPE) system developed in a previous study (Sharma, 2021).

Study Sites
This research was conducted at seven representative study sites in different climatic regions across the country, comprising one warm-temperate site (Aya) and six cool-temperate sites (Hakkoda, Zao, Oze, Shirakami, Kitakami, and Shiranuka). The study sites were selected to represent the variety of plant communities existing in the country. The locations of the seven study sites are shown in Figure 1.

Preparation of Ground Truth Data
The land cover and plant community types present at the seven study sites were enumerated by adopting the Genus-Physiognomy-Ecosystem (GPE) system developed by Sharma (2021) for satellite-based classification and mapping of plant communities at a large scale. Extant vegetation survey reports available from the Nature Conservation Bureau, Ministry of the Environment, and Asia Air Survey Co., Ltd. were used as reference materials for enumerating the GPE types at each study site. The land cover and plant community types were further verified by onsite field observations between 2017 and 2020 at all study sites. The final confirmed list of GPE types present at the seven study sites is given in Table 1. The ground truth data, polygons of around 1 ha each representing homogeneous GPE types, were collected by local experts in plant ecology and vegetation science with reference to extant vegetation survey maps (1:25,000 scale) produced from extensive field surveys between 2012 and 2020, and by visual interpretation of time-lapse images available in Google Earth.

Processing of Satellite Data
We acquired all Level-1C product images collected by the Sentinel-2 mission satellites (Sentinel-2A and 2B) for the study sites between 2017 and 2019. The Sentinel-2 satellites collect optical imagery at high spatial resolution (10-60 m) in visible, near infrared, and short-wave infrared wavelengths with a revisit interval of five days (Drusch et al., 2012). The images were cloud masked, and ten spectral bands (blue, green, red, red edge 1-3, near infrared, mid infrared, and shortwave infrared 1-2) were extracted. For each scene, twelve vegetation indices (listed in Table 2) were also calculated. The spectral band and spectral index images were composited by computing monthly median values. In this manner, we generated 264 features in total (22 spectral bands and indices × 12 months) for machine learning, classification, and mapping.
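The compositing step can be illustrated with a minimal per-pixel sketch. It assumes that cloud-masked observations are marked as None and that observations from all years are pooled by calendar month, which the 12-month feature count suggests but the text does not state explicitly. The function names and the use of NDVI as the example index are hypothetical choices for illustration; the twelve indices actually used are those listed in Table 2.

```python
from statistics import median

N_BANDS = 10     # spectral bands named in the text
N_INDICES = 12   # spectral indices (Table 2)
N_MONTHS = 12    # calendar months used for compositing


def ndvi(nir, red):
    """Normalized difference vegetation index (illustrative index only)."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def monthly_median_composite(observations):
    """Return the per-month median of one feature at one pixel.

    observations: iterable of (month, value) pairs with month in 1..12;
    value is None where the pixel was cloud masked (an assumption here).
    Months without any clear observation are returned as None.
    """
    by_month = {m: [] for m in range(1, N_MONTHS + 1)}
    for month, value in observations:
        if value is not None:
            by_month[month].append(value)
    return [
        median(by_month[m]) if by_month[m] else None
        for m in range(1, N_MONTHS + 1)
    ]


# Feature count per pixel: (10 bands + 12 indices) x 12 months
n_features = (N_BANDS + N_INDICES) * N_MONTHS
```

Applied to every band and index at every pixel, this yields the 22 × 12 = 264 features per pixel described above. A median is less sensitive than a mean to residual cloud or shadow values that escape masking.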

Machine Learning and Classification
We employed the Gradient Boosting Decision Trees (GBDT) classifier implemented in XGBoost, an efficient and optimized distributed gradient boosting library (https://github.com/dmlc/xgboost), for the supervised classification of Sentinel-2 images, as it can handle large data volumes using Compute Unified Device Architecture (CUDA) computations. We used a train-test split method for fine tuning the input features and model parameters. For this method, the ground truth data were shuffled and randomly split into training (75%) and test (25%) sets. The GBDT model was trained on the training data, whereas the test data were used for fine tuning the parameters of the model. Classification accuracy metrics (kappa coefficient and F1-score) were used for quantitative evaluation. The GBDT model established in this way was used for prediction and mapping of land cover and plant community types separately for each site.
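The shuffled 75/25 split can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the study. The function name, the fixed seed, and the choice of what counts as one sample (a pixel or a ground truth polygon) are assumptions, since the text does not specify them.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle samples and split them into (train, test) lists.

    The seed is fixed only to make the illustration reproducible.
    """
    shuffled = list(samples)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # random shuffle with a local RNG
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical usage with 100 labeled samples identified by index
train_ids, test_ids = train_test_split(range(100))
```

The training portion would then be passed to the GBDT classifier and the test portion used for evaluating the fitted model.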

Model Test Results
The model test results obtained from the machine learning (GBDT classifier) of multi-temporal Sentinel-2 images are shown as confusion matrices (Figures 2-4) for three sites (Hakkoda, Zao, and Shirakami). Because of the large number of classes involved, class-wise accuracy tables (Tables 3-6) are shown instead for the other four sites (Oze, Kitakami, Shiranuka, and Aya). The classification accuracy metrics obtained for all study sites are summarized in Table 7. The classification accuracy in terms of the kappa coefficient varied from 87% at the Oze site with 41 classes to 95% at the Hakkoda site with 19 classes.
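For reference, both accuracy metrics can be derived directly from a confusion matrix. The sketch below assumes that rows hold reference classes and columns hold predicted classes, and it computes a macro-averaged F1-score; the text does not state which F1 averaging was used, so that choice is an assumption. The function names and the two-class example matrix are hypothetical.

```python
def cohen_kappa(cm):
    """Cohen's kappa from a square confusion matrix (list of rows)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


def macro_f1(cm):
    """Unweighted mean of per-class F1 scores (macro averaging assumed)."""
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # reference c, predicted other
        fp = sum(cm[r][c] for r in range(k)) - tp    # predicted c, reference other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / k


# Hypothetical two-class matrix, for illustration only
example = [[45, 5],
           [10, 40]]
kappa = cohen_kappa(example)
f1 = macro_f1(example)
```

Because kappa discounts the agreement expected by chance, it is lower than overall accuracy for the same matrix (0.85 overall accuracy in this hypothetical example).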

GPE Maps
The land cover and GPE maps produced in this research are shown in Figures 5-11. These maps clearly show the extent and distribution of land cover and plant community types at the study sites. Preparation of ground truth data becomes very difficult, time-consuming, and expensive as the heterogeneity and complexity of plant community types increase. Even with large amounts of high-quality ground truth data, classification of satellite images becomes increasingly challenging as the number of classes increases. Moreover, the phyto-sociological classes based on characteristic species (Poore, 1955; Whittaker, 1980; Miyawaki and Fujiwara, 1988), as delineated by the nationwide vegetation survey, are not well suited to automated digital mapping, because remote sensing signals are mostly governed by the physical interactions of dominant species rather than characteristic species. An appropriate and effective organization of plant communities is therefore essential for operational and broad-scale mapping. In line with this, the Genus-Physiognomy-Ecosystem (GPE) system, developed by Sharma (2021) for the classification of plant communities from the perspective of satellite remote sensing, was extended in this research for operational mapping of land cover and plant community types.
Figure 11. 25-class land cover and plant community map of the Aya site produced in this research.

Conclusions
In this research, we presented operational mapping of land cover and plant community types at seven study sites in the warm and cool temperate regions of Japan using multi-spectral and multi-temporal Sentinel-2 images. The machine learning based accuracy analysis showed the potential of Sentinel-2 images for mapping land cover and plant community types under the Genus-Physiognomy-Ecosystem (GPE) system, with the kappa coefficient varying from 87% (41 classes at the Oze site) to 95% (19 classes at the Hakkoda site). Still, misclassifications were detected in some classes, such as Betula DBF, Alnus DBF, Fagus DBF, Quercus DBF, Picea ECF, Hydrangea Shrub, and Zoysia Herb, particularly at sites with many classes. A further increase in the temporal resolution of the satellite data with the future launch of additional Sentinel-2 mission satellites is expected to improve the classification accuracy of plant communities. We plan to extend this methodology to seamless mapping of plant communities by increasing the amount of ground truth data.