Reliability of Summer Crop Masks Derived from Second Order Polynomial Equations

Remote Sensing techniques are useful to determine planted areas, especially of commodities like maize and soybeans (summer cultures). The vegetation indexes as NDVI (Normalized Difference Vegetation Index) has been used to map agricultural areas due to its low cost and accessibility. Therefore the objective of this study was to develop a new methodology to generate masks of summer crops based on fitting second order polynomial equations to temporal NDVI profiles. The results showed that selecting polynomial equations with r fitting above 0.75, a Kappa index of 0.86 and a global accuracy of 93% is obtained. This is slightly higher than the results obtained when using the maximum-minimum NDVI technique, with a Kappa index of 0.82 and a global accuracy of 91%. However, while quantifying the areas under study, one verified that the mask used by the proposed methodology, was closer to the official IBGE data, with a difference of -10.25%, followed by the vector technique with 23% and by the maximum-minimum NDVI, with a difference of 42.8%.


Introduction
Information on the dynamics of crops in Brazil are extremely important because the space occupied by them changes constantly due to market, environment, migration and public policy, to mention only a few factors.The governmental bodies responsible for the estimation of cultivated areas in Brazil are IBGE (Brazilian Institute for Geography and Statistics) and CONAB (National Company for Provision/Ministry for Agriculture, Livestock and Provision) and at regional level, there are several organizations connected to State governments such as DERAL/SEAB (Department of Rural Economy from the Paraná State, Secretary for Agriculture).These institutions perform field surveys and data analysis for the estimation of areas, which are expensive and time consuming.
In the literature there are other techniques, such as remote sensing, which help to map agricultural fields (Doraiswamy et al., 2004, Chang et al., 2007) and consequently to estimate planted areas.Among the several remote sensing characteristics, one of the most important is its temporality, filling out the need of up-to-date information, in the dynamic agriculture context, subsidizing reliably decision making.In accordance with this, CONAB took the initiative to aggregate to its analysis tools, information derived from remote sensing and geo-processing, to subsidize decisions for the estimation of areas and harvest from cultures (CONAB 2012).
Remote sensing has been used to survey planted areas from both temporary and permanent crops (Mercante et al. 2009).It is a consensus in the literature that mapping of crops must take into account the temporal aspect of the vegetative cycle (Sakamoto et al., 2005;Rudorff et al., 2007;Lamparelli et al., 2008).
Several authors are using the vegetation indices, which allow a more efficiently exploration of those wavelengths, which present significant responses to certain biophysical parameters (foliar area, photo-synthesizing pigments) of vegetation (Ponzoni & Shimabukuro, 2007).Among the several vegetation indices, the NDVI (Normalized Difference Vegetation Index), proposed by Rouse et al. (1973) can be mentioned, which operates in the red and near-infrared wavelengths.This index allows improve monitoring of the crop development, and for this purpose, the basic aspect to be considered, is the study of the spectral profile, obtained from variations of NDVI along time (Esquerdo et al., 2011;Ramme et al., 2010).
Some studies, at local or regional level (Martinez & Gilabert, 2009) tested the performance for the use of changes from vegetation indices along time, to identify and map annual crops (Wardlow et al., 2008).These works opened several new perspectives for the speed at which the areal information is generated.In spite of the advancement of mapping, some questions still remain unanswered.A crucial open issue for the generation of areal information is the reliable separation of targets.Studies by Nuarsa et al. (2011), correlated values of rice productivity with NDVI data, adjusting the temporal evolution with second-order polynomial equations.They found r 2 values greater than 0.9.
When considering the reliability of mapping, which here is defined as a "masks", one reports to traditional test methods such as the Kappa index and the global accuracy (Congalton, 1991;Congalton, 2001), which are not sufficient to validate mapping made with remote sensing data.A new approach is necessary to correlate more directly classification with statistics.
Due to that, we proposed to develop a methodology which is able to generate masks of summer crops, based on second order polynomial equations, fitted to temporal NDVI profiles in the municipality of Cascavel, Paraná State, where the determination coefficients from these equations are the reliability levels of the mask.The slopes are smooth on the northern section and smooth to undulating at the southern section according to Figure 2. The largest part of the municipality presents slopes between 0 and 10%, followed by slopes of 10 to 20%, predominantly at its southern portion (ITCG, 2008).

Study Area, Data and Method
This study was done at the Cascavel municipality, within the agricultural belt of Paraná State, located between latitude S 24°44′33.72″and S 25°22′16.59″,longitude W 53°04′21.12″ and W 53°43′23.88″, occupying an area of 2,103 Km 2 (Figure 1).The climate, according to Köppen (1931) is of type CFB, Sub-tropical humid.It presents average yearly temperatures of 19°C, with average of hottest month above 22°C, and of the coldest month below 18°C.The predominant soil types of this region are Latosols, Clay soils, Neo-soils and Nitosoils (ITCG, 2008).In this study we used spectral data from MODIS (Moderate Resolution Imaging Spectrometer), onboard TERRA satellite, product MOD13Q1 (Huete et al., 1999), referring to NDVI.This product was downloaded, free of costs, from NASA homepage (NASA, 2012).For the image extraction followed by re-projection, we used software MRTools (MODIS Re-projection Tools), developed by NASA.The raw HDF (Hierarchical Data Format) images were processed to separate the NDVI product from all others, as well as re-projected of the lat/long coordinates, datum WGS-84.At the end of the procedure, the images were saved in GeoTiff format and manipulated on ENVI 4.5 program.

Transformation of LANDSAT-TM 5 Images in Vectors
To transform those areas of summer crops (soybeans and maize) in vectors, initially a color composition with RGB-453 was used.In this composition, just in maximum vegetative time (peak vegetative) the summer crops presented a spectral behavior which allowed its visual discrimination from other targets (Figure 4) followed by its manual transformation in vectors using program ArcGIS 9.3 (Sanches et al., 2005).From this procedure a map of summer crops mask was generated.This map was used as a reference to find the cutoff point at the construction of the mask, originated from the methodology of the polynomial equations.

Visualization of the Temporal NDVI Profile for Summer Crops (Soybeans and Maize) in the MODIS Images
The temporal NDVI profile for the agricultural year under analysis was processed with ENVI 4.5 software, function Z profile (Spectrum).A set of 12 images, every 16 days, totaling 192 days was necessary, which corresponds in average to the growing cycle of summer crops (soybeans and maize).

Identification of the Temporal NDVI Profile: the Beginning and the End of the Vegetative Cycle
On the temporal profile, the beginning and the end of the cycle were identified, observing the behavior of the curves generated with function Z (Spectrum) from ENVI 4.5, where changes referring to vegetal physiology can be diagnosed.According to Zhang et al. (2003) the NDVI profiles can show the phenological phase of the culture.Therefore sections were made on the temporal profiles to get only the cultural cycle for the harvest of 2007/08.For this task the tool Rule Classifier, also from ENVI 4.5, was used.This tool identifies those dates (bands) with the lowest and highest values found in the temporal profile of the vegetation index.In this case, these values were considered as the beginning (lowest value) and the peak (highest value) of the vegetative period, according to a study by Zhang et al. (2003) who identified the different phonological phases of vegetative development in the temporal NDVI profiles.

Adequacy of Second Order Polynomial Equation for Each Pixel of a Temporal Series
After identification of the temporal profiles, they were put in matrix format (spreadsheets), where each spreadsheet corresponds to a date, and each cell of each spreadsheet to a pixel.With a dataset, the cell corresponding to a certain pixel at any X, Y position (geographical coordinates) a second order polynomial function was defined, using program R. The determination coefficient (r 2 ) for each equation indicates the degree of fitting of equations to the observed data.Values of r 2 were calculated following equation 1 below: Where, SQT is the sum of squares from the dependent variable, and SQP is the sum of squares, because it is a polynomial surface.
The R program at the end of the fitting process, using the method of least squares, returned the values of coefficients "a", "b" and "c" for the second order polynomial equations.

Construction of the Mask for Summer Crops (Soybean and Maize), Based on the Determination Coefficients of Second Order Polynomial Equations
These coefficients were adjusted to the temporal NDVI profiles for each pixel and an image was generated (the so-called "reliability image"), where each pixel corresponds to the value of the determination coefficient.Afterwards 2 classes were discriminated: 1 st determination coefficients larger or equal to 0.75.
(r 2 ≥ 0.75) and 2 nd determination coefficients smaller than 0.75 (r 2 < 0.75), for the average NDVI profile.In order to determine this point, the known areas occupied with summer crops were considered, using for this purpose the vectors obtained from the LANDSAT-TM 5 image, described earlier.

Construction of the Mask Summer Crops (Soybeans and Maize) Using the Maximum-Minimum NDVI Methodology
This methodology consists on the use of several NDVI images to obtain the so-called "Minimum NDVI image" and "Maximum NDVI image".For the generation of minimum NDVI images, which includes the sowing phase, scenes from September to November were processed and for the maximum NDVI image, corresponding to the vegetative peak, images from December to February were used, so that the entire vegetative cycle was included, independently of the sowing date, which varies a lot in the region under study.This procedure was implemented in IDL language, according to a methodology described by Fernandes et al. (2011).

Application of the Kappa Index, Global Accuracy and Error Matrix
For the statistical verification of the accuracy from the second order polynomial functions over each pixel at the polynomial mask, three methods were applied, namely: the error matrix or contingency, global accuracy (GA) and Kappa index (Landis & Koch, 1977).To do that, 100 points were drafted over the polynomial masks and minimum-maximum NDVI, on a stratified and disproportionate form.
Afterwards these points were exported to an image obtained by LANDSAT-TM 5 because it has a relative higher spatial resolution (30 m at the visible bands) to verify the number of agreement hits with class summer culture (polynomial methodology), as well as by the maximum-minimum NDVI, and so to obtain the error matrices.

Comparison among Data from Cultivated Area (Summer Crops)
Comparisons were made among summer crop areas, obtained by the maximum-minimum NDVI method described at item 2.8: vector data over LANDSAT-TM 5 data, and data obtained with the methodology proposed using, for comparison purposes, official data delivered by governmental agencies (IBGE, 2012).

Result and Discussion
The methodology proposed for the generation of a summer crop mask (soybean and maize) by second order polynomial equations adjusted to temporal NDVI profiles, was based on the behavior of both crops in the field, because they have unique patterns of temporal profiles, according to a graphic representation (Figure 5).One verified that the temporal NDVI profiles follow a tendency, and that it is possible to adjust second order polynomial equations for each pixel of the temporal series.The vegetative cycle of summer crops is characterized by two distinct periods, which command its phenology (INPI, 2009).These two periods were identified in the Cascavel municipality, based on the average temporal NDVI profile, namely: 1 st at the beginning of the photosynthesis activity until flowering (VE to V5) with an average cycle of 50 days; nd period extending from the formation of pods till the full maturation (R1 to R8), encompassing approximately seventy days.These phases depend on the varieties cultivated and on the climatic effects (Garcia et al., 2007).The cutoff chosen, considering the average value of the determination coefficients found within the crop mask, was 0.75.Since this is a relatively high value, it was assumed that the adjusted polynomial equation, would represent its vegetative development cycle.Zhang et al. (2003) reported that it is possible to adjust functions to temporal NDVI series for the phenology of some characteristic ecosystems.Nuarsa et al. (2011) found r 2 values between 0.916 and 0.973 by adjusting NDVI temporal evolutions with polynomial equations of second order in rice crops.These r 2 values are agreed with the results obtained in the present work.
As for the temporal NDVI profiles, which don't represent the summer crop, the fittings to the second order polynomial equations gave a low determination coefficient, in average below 0.57. Figure 6 shows the fitting of the second order polynomial equations to targets which are not summer crops.Table 2 shows the error matrix between the mask generated from the LANDSAT-TM 5 image and the summer crop mask, generated by the methodology proposed, using r 2 > 0.75.This limit was chosen because the area generated is very close to that one originated from the LANDSAT-TM 5 image.One observes coherence among them, expressed by the Kappa index (KI) 0.86 and the global accuracy (GA) of 93%, indicating that the mask generated from r 2 values above 0.75 are tolerable to select summer crops in temporal NDVI series.Studies performed by Zhou et al. (2008) found an EG between 88.9% and 95.2%, used to validate high spatial resolution images.This result indicates that the mask has an excellent reliability to map summer crops with the methodology presented.However, the 8% inclusion values for summer crops and 6% for non-summer crops can be explained by the fact that the reference image is dated from February 2008.At this date, according to the Secretary for Agriculture and Provision of Paraná / Dept. of Rural Economy -SEAB/DERAL (2011), many regions, above 24%, had already been harvested, presenting high error values.Some of the points randomly drafted, could have been these regions.This fact was proven by temporal NDVI profiles, presenting a typical behavior of summer crops.
When comparing visually the LANDSAT-TM 5 composition R4G5B3 (Figure 7a) with the polynomial mask (Figure 7b) one verifies that at the southern center of the municipality there was a lower occurrence of summer crops, in spite of a more rolling terrain if compared to the northern section of the municipality.This may be the reason why farmers prefer to plant maize in small properties, instead of soybeans, because maize can be harvested manually, at the contrary of soybeans, whose harvest can be performed only mechanically with higher costs.One observes also that these are bare soil areas, planted with summer crops, because the NDVI temporal profile reveals this condition and these areas were easily observed in LANDSAT-TM 5 RGB-453 composition images.It is also probable to be maize, because this corn is normally planted before soybean.This observation can be corroborated by the analysis of the LANDSAT-TM 5 reference image, indicating it (Figure 7a).This analysis is strengthened by the importance of continuous temporal monitoring, delivered by the MODIS sensor, with a higher temporal resolution than TM.When correlating the mask obtained with the maximum-minimum NDVI methodology with the LANDSAT-TM 5 reference (Figure 8a), one verifies that the central-southern region of the State didn't have many selected areas, related to the mask obtained by the polynomial methodology (Figure 8b).Probably this is due to the fact the minimum-maximum NDVI methodology is based on the difference between both maximum (vegetative peak) and minimum (sowing) NDVI, considering as cutoff point those values where this difference is not significant, which can cause omission errors.However, the mask generated by the polynomial methodology with r 2 fitting above 0.75 considers more information, because fitting is done over the entire temporal series of NDVI, and therefore it presents a higher chance to identify the crops, whereas the maximum-minimum NDVI methodology considers only part of the cycle.Table 3 shows the error matrix between the LANDSAT-TM 5 image and the summer crops mask, generated by the maximum-minimum NDVI methodology.One verifies here that the Kappa values of 0.82 and global accuracy of 91% are slightly below those obtained by the polynomial methodology, but the omission error of 0.04% for the summer crops demonstrates that the maximum-minimum NDVI mask was excellent for the identification of summer crops.However when observing the omission error (12.7%) for the non-summer crops, indicates that the methodology was not able to select all the areas of summer crops, testified by the omission areas in the center south, differently from the polynomial methodology which presented an omission error of 7.8%, probably because much more areas of bare soil were identified as summer crops, according to the temporal NDVI profile of these areas.Comparing the area values of summer crops (soybean and maize) found by the 3 methodologies: as for fitting polynomial equations to the temporal NDVI series, maximum-minimum NDVI and vectorization of LANDSAT-TM 5 image, one verifies that the methodology proposed of polynomial equation fitting approached most to the official IBGE data for the harvest under study.A differential error of -10.25% was found, followed by vectorization with 23.5% and by the maximum-minimum NDVI mask, with a difference of 42.8%, according to Table 4.

Conclusions
The NDVI product, a 16 days composition, generated with MODIS sensor images, allowed to draw temporal profiles of the summer crops and to correlate them with the corresponding phenologic phase.
The mask generation methodology for the summer crops by polynomial equations fitted to the temporal NDVI profiles with r 2 above 0.75, obtained a global accuracy of 93%, slightly above the global accuracy found with the maximum-minimum NDVI methodology of 91%.
The results indicate that the methodology of mask generation for the summer crops by fitting of polynomial equations to temporal NDVI series obtained good results and it is close to the official IBGE data.They presented a lower omission error when compared to the maximum-minimum NDVI technique, indicating a higher precision at the classification, because it included more areas of summer crops.

Figure 1 .
Figure 1.Area under study

Figure 5 .
Figure 5. Representation of the average NDVI temporal profile for summer crop with vegetative (VE to V5) and reproductive (R1 to R8) phases, Municipality of Cascavel -Paraná State, 2007-08 harvest

Figure 6 .
Figure 6.Representation of an average NDVI temporal profile for targets which are not summer crops in the Cascavel municipality, 2007-08 harvest

Figure 7 .
Figure 7.Comparison between LANDSAT-TM 5 image and the summer crop mask, obtained by polynomial methodology

Figure 8 .
Figure 8.Comparison between LANDSAT-TM 5 image and summer crops mask, obtained with maximum-minimum NDVI methodology

Table 1 .
The methodology was considered due to take in consideration the natural profile presented by NDVI representing the growing cycle in annual crops, that is, minimum values at the beginning of the cycle increasing during the growing season and decreasing after growing peak.Hence the methodology has considered the variation in maximum-minimum amplitude in NDVI values.Due to the variability of sowing dates, which can be checked in the maize and soybean agricultural calendars for Paraná State, at Table 1 (SEAB/DERAL, 2011) the maximum and minimum were considered taking in account a time of period instead of just one date.Monthly percentage of sowing and harvesting maize and soybean, Paraná State, harvest 2007/08

Table 2 .
Error matrix between the reference (LANDSAT-TM 5, "false color" R4G5B3) and the polynomial mask

Table 3 .
Error matrix between the reference (LANDSAT-TM 5) and the maximum-minimum NDVI mask

Table 4 .
Comparison between official IBGE data and masks obtained by the following methodologies: fitting of polynomial equations to temporal NDVI series, maximum-minimum NDVI and vectorization of LANDSAT-TM 5 image, harvest 2007/08