Geostatistics and Sample Density of Chemical Attributes for Soil under Sugarcane and Agroforestry in Humaitá – AM, Brazil

Studies on the spatial variability of soil chemical attributes in Amazonas, Brazil, are scarce. The objective of this study was to evaluate the geostatistics and sample density of chemical attributes in soils under sugarcane and agroforestry in Humaitá, Amazonas state, Brazil. In each area, a 70 m x 70 m mesh with regular 10 m spacing was established, giving 64 points per area, and soil samples were collected in the 0.0-0.2 m and 0.4-0.6 m layers. Chemical analyses of the soil properties were performed, followed by descriptive statistics, geostatistics and sample density techniques based on the coefficient of variation and on the semivariogram range. Spatial dependence was observed for most of the chemical properties in the two areas. Based on the Cline formula, sugarcane presented a sample density of 237 (0.0-0.2 m) and 225 (0.4-0.6 m) points per hectare, and agroforestry 356 (0.0-0.2 m) and 465 (0.4-0.6 m) points per hectare. Based on the range, sugarcane showed 30 (0.0-0.2 m) and 5 (0.4-0.6 m) points per hectare, whereas agroforestry had 23 (0.0-0.2 m) and 9 (0.4-0.6 m) points per hectare.


Introduction
The spatial variation of soil attributes is influenced by pedogenetic processes (Fonseca et al. 2021) as well as by human activities. Understanding the variation in soil characteristics makes it possible to identify the connections between soils and environmental conditions (Goovaerts, 1998), which supports soil use practices (Plant, 2001), favors proper management of the area and contributes to the sustainability of production systems.
Geostatistical tools favor the interpretation of data based on the structure of the natural variation of the studied properties, considering the degree of spatial dependence within the sampled space (Lima et al. 2022a). Even so, a robust and accurate sampling plan is essential to minimize errors in the interpretation of results (Lima et al. 2022b).
high correlation or spatial dependence of the variability of soil chemical attributes in areas with different uses (Souza et al., 2023).
There are still few studies on the spatial variability of the chemical properties of Amazonian soils. We aimed to study the geostatistics and sample density of chemical attributes in soil under sugarcane and agroforestry management in Humaitá, Amazonas state, northern Brazil. The results of these surveys can help characterize soil properties and their relationships with variation in plant data.

Method
The studied areas are located in the region of Humaitá, Amazonas, Brazil: i) an area under sugarcane cultivation (7º 54' 38" S and 63º 14' 27" W, average altitude of 70 m above sea level); and ii) an area under an agroforestry system (7º 28' 29" S and 63º 02' 07" W, altitude of 63 m) (Figure 1). The local climate is of the rainy tropical type, with temperatures between 25 and 27 °C, average annual precipitation of 2,500 mm, a short dry period and a rainy period from October to June, and relative air humidity around 90% (Alvares et al., 2013). The soil was classified as Plinthic Cambisol Alitic (Santos et al., 2018) and as Dystric Leptic Cambisol according to the World Reference Base for Soil Resources (IUSS Working Group WRB, 2022).
Sampling meshes of 70 m x 70 m were established and soil samples were collected at the mesh intersections, regularly spaced by 10 m, for a total of 64 points per mesh (Figure 1). Soil samples were collected in the 0.0-0.2 m and 0.4-0.6 m layers to determine chemical properties, and the locations were georeferenced with a Garmin Etrex GPS (South American Datum 1969). The textural characterization of the collected samples is presented in Table 1. The soil was dried in the shade and sieved through a 2 mm mesh to obtain air-dried fine earth, and chemical analyses were then carried out according to the methodology proposed by Teixeira et al. (2017). The pH in water was determined potentiometrically with a pH meter in a soil:water suspension.
Potential acidity (H+Al) was extracted with calcium acetate buffered at pH 7.0 and determined by titration with 0.025 mol L⁻¹ NaOH, using phenolphthalein as an indicator.
Potassium and available phosphorus were extracted with Mehlich-1. P contents were determined with a UV-Vis spectrophotometer and K⁺ contents by flame photometry.
Based on the determinations of exchangeable cations and potential acidity, potential cation exchange capacity (CEC) and base saturation (V%) were calculated.
The total organic carbon (TOC) was determined by the Walkley-Black method, modified by Yeomans & Bremner (1988), and the organic matter content was in turn estimated from the organic carbon.
After data collection, we performed an exploratory data analysis, calculating the mean, median, variance, coefficient of variation, skewness and kurtosis, and applying a normality test. The coefficient of variation (CV) was classified as proposed by Warrick and Nielsen (1980): low (CV < 12%), medium (12% to 60%) and high (CV > 60%). The normality hypothesis was tested with the Kolmogorov-Smirnov test in the Minitab 14 statistical software (Minitab, 2000).
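As a minimal sketch of the CV classification described above, the following Python snippet computes the CV of a set of measurements and assigns a class using the thresholds quoted in the text. The function names and the sample values are illustrative assumptions, not taken from the study, and the choice of the sample standard deviation is likewise an assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent, using the sample standard deviation (assumed)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def classify_cv(cv_percent):
    """Classify a CV (%) with the thresholds quoted in the text:
    low (< 12%), medium (12% to 60%), high (> 60%)."""
    if cv_percent < 12.0:
        return "low"
    if cv_percent <= 60.0:
        return "medium"
    return "high"


# Hypothetical readings, for illustration only (not data from the study).
readings = [4.0, 5.0, 6.0]
cv = coefficient_of_variation(readings)
print(round(cv, 2), classify_cv(cv))
```

Because the CV is dimensionless, the same two functions can be applied to any attribute regardless of its unit.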
Spatial variability was characterized through geostatistics (Isaaks & Srivastava, 1989). Based on the intrinsic hypothesis, the experimental semivariogram was estimated by Equation (1):
$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \quad (1)$$
wherein: \hat{\gamma}(h) = estimated semivariance at separation distance h; N(h) = number of pairs of measured values separated by h; Z(x_i) and Z(x_i + h) = values of the attribute at positions x_i and x_i + h.
The semivariograms were fitted based on the best coefficient of determination (R²) and on cross-validation, estimated with the GS+ 7.0 software (Robertson, 2004). Based on these fits, the coefficients of the theoretical semivariogram model were defined: nugget effect (C0) = semivariance value at distance zero, which represents the random component of the variation; structural variance (C1); sill (C0 + C1) = semivariance value at which the curve stabilizes at a constant value; and range (a) = distance from the origin to the point where the sill is reached, expressing the distance beyond which the samples are not correlated (Trangmar et al., 1985).
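The estimation step can be sketched in plain Python, assuming the classical estimator (half the mean squared difference between pairs of values separated by approximately the same distance) and a spherical theoretical model. This is not the GS+ implementation: the function names, the lag tolerance of half a lag and the four-point transect are illustrative assumptions.

```python
import math
from itertools import combinations


def experimental_semivariogram(coords, values, lag, n_lags):
    """Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)).

    Pairs are grouped into lag classes centred on lag, 2*lag, ..., n_lags*lag,
    each with a half-width of lag / 2 (an assumed tolerance).
    Returns a list of (h, gamma, n_pairs) for the non-empty classes.
    """
    sq_sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i, j in combinations(range(len(values)), 2):
        distance = math.dist(coords[i], coords[j])
        k = int(distance / lag + 0.5) - 1  # index of the nearest lag class
        if 0 <= k < n_lags:
            sq_sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return [
        ((k + 1) * lag, sq_sums[k] / (2 * counts[k]), counts[k])
        for k in range(n_lags)
        if counts[k] > 0
    ]


def spherical_model(h, c0, c1, a):
    """Spherical model for h > 0: nugget c0, structural variance c1, range a."""
    if h >= a:
        return c0 + c1  # beyond the range the model stays at the sill
    ratio = h / a
    return c0 + c1 * (1.5 * ratio - 0.5 * ratio ** 3)


# Hypothetical transect with 10 m spacing, for illustration only.
coords = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
for h, gamma, n_pairs in experimental_semivariogram(coords, values, 10.0, 3):
    print(h, gamma, n_pairs)
```

In practice the model parameters (C0, C1 and a) would be fitted to the experimental points, for example by least squares; that step is omitted here.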
In the spatial dependence analysis of the variables under study (Table 2), we used the classification proposed by Cambardella et al. (1994), in which [C0/(C0+C1)] values lower than 25% indicate strong, values between 25 and 75% moderate, and values higher than 75% weak spatial dependence.
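A short sketch of this classification, using the thresholds quoted above, is given below. The function name and the example nugget and structural variance values are hypothetical.

```python
def spatial_dependence_class(c0, c1):
    """Classify spatial dependence from the nugget-to-sill ratio C0 / (C0 + C1).

    Thresholds as quoted from Cambardella et al. (1994):
    < 25% strong, 25 to 75% moderate, > 75% weak.
    """
    ratio_percent = 100.0 * c0 / (c0 + c1)
    if ratio_percent < 25.0:
        return "strong"
    if ratio_percent <= 75.0:
        return "moderate"
    return "weak"


# Hypothetical semivariogram parameters, for illustration only.
print(spatial_dependence_class(c0=1.0, c1=9.0))  # ratio of 10%
print(spatial_dependence_class(c0=5.0, c1=5.0))  # ratio of 50%
print(spatial_dependence_class(c0=9.0, c1=1.0))  # ratio of 90%
```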
Based on the coefficient of variation, the number of subsamples required to form a composite sample and estimate the mean value of the variables was determined with the formula described by Cline (1944), Equation (2):
$$n = \left( \frac{t_{\alpha} \cdot CV}{D} \right)^2 \quad (2)$$
wherein: n = minimum number of samples required to form an optimal sampling mesh for the soil attributes; tα = Student's t value (at 95% probability); CV = coefficient of variation (%); D = admitted percentage variation around the mean value (5%).
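As a worked illustration, the calculation can be sketched as follows, assuming the commonly cited form of the Cline formula, n = (t · CV / D)², with CV and D both in percent. The function name is hypothetical, and the t value of 2.0 is a rounded figure chosen for the example, not the value used in the study.

```python
import math


def cline_sample_size(cv_percent, t_value, d_percent=5.0):
    """Minimum number of samples, n = (t * CV / D)^2, rounded up.

    cv_percent: coefficient of variation of the attribute (%)
    t_value:    Student's t for the chosen probability level (supplied by the caller)
    d_percent:  admitted percentage variation around the mean (5% in the text)
    """
    return math.ceil((t_value * cv_percent / d_percent) ** 2)


# Illustrative inputs only: CV = 20%, t = 2.0 (rounded), D = 5%.
print(cline_sample_size(20.0, 2.0))  # (2.0 * 20 / 5)^2 = 64
```

Since n grows with the square of the CV, doubling the CV of an attribute quadruples the number of samples required for the same D.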

Results and Discussion
When the dispersion of the soil chemical attributes was evaluated through descriptive statistics, most of the studied variables presented close mean and median values, which may indicate a data distribution close to normality; this is also supported by skewness and kurtosis coefficients near zero (Tables 2 and 3). Ca²⁺, Mg²⁺, SB (0.0-0.2 and 0.4-0.6 m) and V% (0.4-0.6 m) in the agroforestry area (Table 2), and H+Al and CEC (0.0-0.2 m) and H+Al, OM, Ca²⁺, Mg²⁺, SB, CEC and V% (0.4-0.6 m) in the sugarcane area, showed distant mean and median values, indicating asymmetrical distribution (Table 3). This is confirmed by skewness and kurtosis coefficients farther from zero, indicating that these attributes do not follow a symmetric distribution curve. The skewness coefficient (Cs) characterizes how much and in which direction the frequency distribution deviates from symmetry: i) if Cs > 0, the distribution is skewed to the right; ii) if Cs < 0, the distribution is skewed to the left; and iii) if Cs = 0, the distribution is symmetric (Silva, 2010). Other authors, such as Cortez et al. (2011), state that the closer the skewness and kurtosis values are to zero, the greater the normality of the data; information on these parameters thus supports the "by feeling" fitting of the semivariogram.
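The sign rule for the skewness coefficient can be illustrated with a small sketch. It assumes the moment coefficient of skewness (third central moment divided by the 1.5 power of the second), which may differ from the estimator used by the statistical software in the study; the function names and data are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5, from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def describe_skew(cs, tol=1e-9):
    """Interpret the sign of the skewness coefficient Cs."""
    if cs > tol:
        return "right-skewed"
    if cs < -tol:
        return "left-skewed"
    return "symmetric"


# Hypothetical samples, for illustration only.
for sample in ([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 1.0, 10.0], [1.0, 10.0, 10.0, 10.0]):
    print(sample, describe_skew(skewness(sample)))
```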
The mean values of the chemical attributes differed between the studied areas (agroforestry and sugarcane) in both layers (Tables 2 and 3), with the agroforestry area showing slightly higher values than the area under sugarcane cultivation. The cation exchange capacity (CEC) was 19.43 and 21.47 cmolc kg⁻¹ in the 0.0-0.2 and 0.4-0.6 m layers of the agroforestry area, and 11.27 and 13.28 cmolc kg⁻¹ in the same layers of the sugarcane area. This behavior is due to the contribution of the H+Al contents, which were 18.95 and 21.18 cmolc kg⁻¹ in the 0.0-0.2 m and 0.4-0.6 m layers of the agroforestry area, whereas in the sugarcane area the potential acidity was 8.96 and 12.66 cmolc kg⁻¹ in the respective layers, so this component had less influence.
Regarding the classification of the coefficient of variation established by Warrick and Nielsen (1980), pH in water, H+Al and CEC in the 0.0-0.2 and 0.4-0.6 m layers of the agroforestry area, and pH in water in both layers of the sugarcane area, exhibited low coefficients of variation (CV < 12%), indicating low variability of the data. Mg and V% in the 0.0-0.2 m layer and P and Ca in the 0.4-0.6 m layer of the agroforestry area, as well as P, Ca²⁺, Mg²⁺, SB and V% in the 0.0-0.2 m layer and OM, Ca²⁺, Mg²⁺, SB and V% in the 0.4-0.6 m layer of the sugarcane area, showed high CV (CV > 24.1%). All other chemical attributes showed moderate CV (12.1% < CV < 24%) in both areas. According to Lima et al. (2022b), this marked presence of moderate variability of chemical properties in agroforestry systems and sugarcane areas is due to the moderate to high heterogeneity of these attributes.
The coefficient of variation makes it possible to compare the variation between the studied attributes regardless of their units; on the other hand, it does not contribute to the evaluation of the spatial distribution of the data (Lima et al. 2022a). Brito et al. (2022) state that, even if the soil attributes exhibit high coefficients of variation, what matters most is that the data show spatial variability and follow a normal distribution.
Regarding the Kolmogorov-Smirnov test, SB, CEC and V% in both layers of the agroforestry area, and H+Al, CEC and V% in the 0.0-0.2 m layer and OM, Ca²⁺, SB and CEC in the 0.4-0.6 m layer of the sugarcane area, showed normally distributed data, whereas the other variables did not show normality in either area (Tables 3 and 4).
The evaluated chemical attributes were submitted to semivariogram analysis to assess their spatial dependence and variability (Tables 4 and 5). In the 0.0-0.2 m layer of the agroforestry area, H+Al fitted the exponential semivariogram model and P and CEC fitted the spherical model, while all other attributes showed a pure nugget effect. In the 0.4-0.6 m layer, pH in water and OM fitted the exponential model, K⁺ showed a pure nugget effect and all other variables fitted the spherical model (Table 4). These results are similar to those found by Souza et al. (2023) in studies under different crops in the Amazon.
In the area under sugarcane cultivation, OM, H+Al and CEC fitted the exponential model in the 0.0-0.2 m layer and all other variables fitted the spherical model; in the 0.4-0.6 m layer, P, K⁺ and V% fitted the spherical model and all other attributes fitted the exponential model (Figure 5). In general, the spherical model predominated among the chemical attributes studied in both areas, corroborating studies by Souza et al. (2020) in areas cultivated with Amazonian species. Moreover, according to Brito et al. (2021), spherical and exponential models are commonly fitted to soil and plant properties.
The coefficient of determination ranged from 0.69 to 1.00 in the agroforestry area and from 0.57 to 0.99 in the area under sugarcane cultivation in the studied layers. According to Lima et al. (2022b), the closer R² is to 1.00, the better the estimation of values by ordinary kriging.
On the other hand, Oliveira et al. (2015b) show that the pure nugget effect points to random spatial variability, that is, unexplained variation, and may even indicate sampling errors or undetected microvariation, meaning that the spacing between the collection points was not sufficient to capture the spatial dependence.
According to Aquino et al. (2015), the nugget effect (C0) is an important measure, since it represents the unexplained variation, normally associated with measurement errors or with variation that the sampling grid cannot detect. The smaller the ratio between the nugget effect and the sill (C0+C1), the stronger the spatial dependence of soil attributes, indicating more accurate mathematical models for estimating a natural phenomenon (Cunha et al. 2017). Accordingly, Cambardella et al. (1994) suggest that the chemical properties of the soil can be classified according to the magnitude of the spatial dependence, taking the nugget effect into account.
In assessing the degree of spatial dependence (SDD), CEC in the 0.0-0.2 m layer and pH in water, Ca and SB in the 0.4-0.6 m layer of the agroforestry area showed a moderate degree of spatial dependence, while all other variables in both layers showed a weak degree. In the sugarcane area, Mg²⁺ and V% in the 0.0-0.2 m layer and Mg²⁺, K⁺, CEC and V% in the 0.4-0.6 m layer were classified as moderate SDD, while the rest were weak. Some studies, such as Soares et al. (2016), Aquino et al. (2015) and Souza et al. (2023), on soils under pasture, agricultural and forestry species in the Amazon, found a similar degree of dependence.
According to Araujo et al. (2023) the strong degree of spatial dependence is associated with the intrinsic characteristics of the soil (formation factors) while the weak degree of spatial dependence is related to extrinsic factors. For Soares et al. (2021) the moderate or weak degree of spatial dependence may indicate intense homogeneity of soils due to aspects related to uses or due to management systems.
The range indicates the maximum distance at which sample points are correlated, so points separated by distances greater than the range are considered independent of each other. The range is therefore an important parameter for sample planning and for the definition of sampling procedures.
The ranges in the agroforestry area varied from 45.60 m to 84.64 m in the 0.0-0.2 m layer and from 19.80 m to 68.17 m in the 0.4-0.6 m layer. In the sugarcane area, the ranges varied from 13.91 m to 33.01 m in the 0.0-0.2 m layer and from 21.31 m to 57.71 m in the 0.4-0.6 m layer. The ranges are greater in the agroforestry area than in the sugarcane area, indicating greater spatial continuity in this environment. Soares et al. (2016) highlight the existence of horizontal as well as vertical variability, which was also observed in this study (agroforestry and sugarcane areas), given that the range differed between the studied layers for the variables evaluated.
Likewise, geostatistics based on the range parameter, together with the coefficient of variation of the soil attributes, assists in estimating the minimum number of subsamples needed to estimate an attribute value in a given area (Tables 6 and 7). Using the formula proposed by Cline (1944), which takes the attribute CVs into consideration, the minimum number of samples needed to measure the studied chemical attributes can be estimated for both areas. According to Souza et al. (2006), the minimum number of soil subsamples is directly proportional to the CV, and agroforestry showed less variability of chemical attributes, since it had lower sample density and greater spacing than sugarcane. Based on this parameter, the two studied areas required a higher sampling density than that provided by the sampling meshes, with 237 and 225 points per hectare (0.0-0.2 and 0.4-0.6 m) and 353 and 465 points per hectare (0.0-0.2 and 0.4-0.6 m) for agroforestry and sugarcane, respectively. Montanari et al. (2012) found similar values, with a higher sample density than that established in the sampling grid for some chemical attributes in a sugarcane field in Jaboticabal, SP.
By contrast, using the semivariogram range, the number of samples required for the evaluation of chemical attributes decreased compared with the Cline formula (Tables 6 and 7). With these data, we can estimate an average of 30 and 23 points per hectare in the first layer (0.0-0.2 m), and 5 and 9 points per hectare in the second layer (0.4-0.6 m), in agroforestry and sugarcane, respectively. Based on geostatistics, greater variability was therefore observed in the agroforestry area.
This information can facilitate fieldwork, since several sampling studies have been conducted to reduce soil variability (Montanari et al., 2005). Nevertheless, current research has in general not considered single-sample variability within composite samples, or how the land plots are defined.

Final Considerations
Most of the chemical attributes studied show moderate or weak spatial dependence in the agroforestry and sugarcane areas. Based on the Cline formula, sugarcane presented a sample density of 237 (0.0-0.2 m) and 225 (0.4-0.6 m) points per hectare, with spacing of 16 m (0.0-0.2 m) and 9 m (0.4-0.6 m), whereas agroforestry had a sample density of 356 (0.0-0.2 m) and 465 (0.4-0.6 m) points per hectare, with spacing of 9 m (0.0-0.2 m) and 13 m (0.4-0.6 m). Sample planning based on the range gave, for sugarcane, a sample density of 30 (0.0-0.2 m) and 5 (0.4-0.6 m) points per hectare, with spacing of 38 m (0.0-0.2 m) and 47 m (0.4-0.6 m); for agroforestry, these numbers were 23 (0.0-0.2 m) and 9 (0.4-0.6 m) points per hectare, with spacing of 21 m (0.0-0.2 m) and 33 m (0.4-0.6 m). The study presents important results that could serve as a basis for other studies in cultivated and natural environments, and as an instrument for agricultural and environmental planning.