Statistical Data Processing in Rocket-Space Technology

Most measured data in rocket-and-space technology are broadband non-stationary random processes. Processing such data with traditional cyclic-sampling methods leads to heavy computational cost and large memory requirements in on-board computing devices. As a rule, when such processes are handled, recovery of the original realization on the receiving side is not required, and processing reduces to calculating probabilistic characteristics. Data processing in rocket-and-space technology then has a number of features. First, the random processes are always represented by a single realization. Second, a priori knowledge of the probabilistic properties of the measured random process is not available. Third, on-line processing is needed, which dictates the use of fast processing methods. This article considers how nonparametric decision-theory methods can be applied to estimate the probabilistic characteristics of non-stationary broadband random processes in on-board systems of rocket-and-space technology.


Introduce the Problem
In rocket-and-space technology, transferring telemetry data to Earth in real time has always been a pressing problem (Mamchev, 2007). It becomes especially urgent when transmitting the parameters of propulsion systems, such as temperature, combustion-chamber pressure, and fuel flow rate. Knowing these parameters in real time allows operational decisions to be made when emergencies arise in propulsion systems.

Explore Importance of the Problem
Physical parameters in propulsion systems (for example, temperature and combustion-chamber pressure) always have the form of a broadband non-stationary random process with a spectrum width of 2÷32 kHz, i.e. a non-stationary trend with a broadband random component superimposed on it. However, modern rocket-engineering systems often measure these parameters with low-frequency transducers whose output is not the random process but the deterministic signal (trend) (Voronov, 2011). The random component of the physical process is thus filtered out. This way of measuring and processing parameters is used to reduce computational cost and relax the requirements on radio communication channels, since the initial cyclic sampling step Δt depends on the signal spectrum width ΔF (Novikov, 2003). In practice this relation is Δt = 1/((5÷10)·ΔF), where (5÷10) means that the coefficient is chosen in the range from 5 to 10, depending on the required restoration accuracy. This ratio shows that measuring and processing broadband signals requires increasing the speed and memory of on-board computing devices by tens of times, while transmitting the data to Earth requires roughly a tenfold increase in channel capacity. Meanwhile, there is a practical need to reduce the characteristics of on-board computing devices, driven by requirements on their mass, dimensions, and energy consumption.
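As a rough illustration of the relation Δt = 1/((5÷10)·ΔF), the following sketch computes the sampling frequency and raw data rate for the band edges quoted above. The 8-bit word size and the oversampling factor of 10 are illustrative assumptions, not values from the article.

```python
# Sampling-rate arithmetic for the relation dt = 1 / ((5..10) * dF),
# i.e. the sampling frequency is 5-10 times the signal bandwidth.

def sample_rate(bandwidth_hz: float, oversampling: float = 5.0) -> float:
    """Sampling frequency f_s = oversampling * bandwidth."""
    return oversampling * bandwidth_hz

for df in (2e3, 32e3):                      # the 2..32 kHz band quoted above
    fs = sample_rate(df, oversampling=10.0)
    print(f"dF = {df/1e3:g} kHz -> f_s = {fs/1e3:g} kHz, "
          f"dt = {1.0/fs*1e6:.2f} us, "
          f"raw rate = {fs*8/1e3:g} kbit/s at 8 bits/sample")
```

At the upper band edge the raw rate grows by a factor of 16, which illustrates the tens-fold growth in on-board throughput the text describes.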
Representing physical parameters as a non-stationary low-frequency trend is justified if the space vehicle flies in a nominal condition. However, the history of space exploration knows many spacecraft accidents. For example, the accidents of the American reusable Space Shuttle orbiters and of the Russian «Proton» launchers at the end of the last century were caused by off-nominal situations in the propulsion systems. Moreover, while in the second case the losses were limited to satellites, in the first case numerous human lives were lost. In our opinion, measuring the physical parameters in the combustion chamber of propulsion systems in full, i.e. measuring the non-stationary broadband signal and not just the trend, is of paramount importance. The reason is that the random component begins reacting to an approaching emergency much earlier than the trend; the random component is, in effect, a forerunner of an off-nominal situation. Measuring it and transmitting it to Earth in real time allows operational decisions to be made on the ground, in particular generating a command to cut off the engines and prevent an accident.
Thus, the problem of developing accelerated processing methods for broadband random processes in on-board radio telemetry systems is extremely topical.
Its importance increases sharply when huge data arrays must be processed within hard deadlines.
It becomes especially urgent given the sharp growth in the volume of processed data, caused by the expanding scale of research problems and the drive toward more detailed study of the measured processes. Such demand naturally leads to substantial growth of both telemetry systems and radio communication channels.

Solutions to the Problem
These problems can be addressed with so-called data-compression methods (Ivanov, 2008; Salamon, 2004).
As a rule, broadband processes occupy the leading position in the spectrum of measured parameters (Mala, 2007). Quasi-reversible data-compression methods are effective only for slowly changing processes and therefore do not solve the overall problem of reducing the volume of transmitted data and unloading the communication channel. Hence the compression of broadband signals becomes especially urgent: such signals make up only (10÷30) % of the overall nomenclature of measured parameters, but load the transmission channel by (60÷80) % (Ivanov, 2007; Molodchik, 2002). Broadband signals can be compressed with irreversible compression methods, which consist in estimating, on the transmitting side of the information-measuring system, the probabilistic characteristics of the measured random processes and transmitting those estimates over the communication channels.
Irreversible compression methods are significantly harder to implement, even though their efficiency, i.e. the reduction of data redundancy, is an order of magnitude higher. The difficulty stems from the fact that the data processed on board very often represent the only realization of a non-stationary random process, with no a priori data on the form of its distribution function. Modern mathematical statistics offers no ready methods for estimating the probabilistic characteristics of such processes.

Method
The probabilistic characteristics of a non-stationary random process can be estimated on the basis of adaptive division of the observed time series into stationarity intervals, using the nonparametric theory of decision-making (Gamiz, 2011; Gibbons, 2010; Kvam, 2002; Richardson, 2005). The structure of the measured process can be described by an additive-multiplicative model (Figure 1), where y(t) is the measured non-stationary random process, X(t) a stationary component of the random process, and F(t) a non-stationary trend.

Division of an Interval of Observation into Intervals of Stationarity
The algorithm works as follows. A nonparametric statistic, known in the literature as the Kendall statistic (Tarasenko, 1976; Sheskin, 2003; Sprent, 2001), is used to test each growing section of the series for stationarity (Yesmagambetov, 2007; Yesmagambetov, 2006). The dynamic observational series is thereby divided into sections of different lengths on which the process is considered stationary with a specified probability. The result of this procedure is a sequence of stationarity intervals of different durations. On each stationarity interval the mean value of the random process is considered constant with probability P = 1 − α.
Figure 3. Separation of the observational data into stationarity intervals. In Figure 3 the horizontal axis shows the number of samples on each stationarity interval.
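The article's exact form of the Kendall statistic did not survive extraction, so the sketch below uses the standard Mann-Kendall trend statistic with its normal approximation as a stand-in. The window-growing policy and the parameters `z_crit` (about the 5 % level, matching α ≈ 0.05) and `min_len` are likewise assumptions.

```python
import math

def mann_kendall_z(x):
    """Normal-approximation Z of the Mann-Kendall trend statistic S."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        s -= 1                      # continuity correction
    elif s < 0:
        s += 1
    return s / math.sqrt(var) if var else 0.0

def split_stationary(x, z_crit=1.96, min_len=5):
    """Grow a window over the series; when the trend test rejects
    stationarity, close the interval and start a new one."""
    intervals, start = [], 0
    i = start + min_len
    while i <= len(x):
        if abs(mann_kendall_z(x[start:i])) > z_crit:
            intervals.append((start, i - 1))   # close before offending point
            start = i - 1
            i = start + min_len
        else:
            i += 1
    if start < len(x):
        intervals.append((start, len(x)))
    return intervals
```

A stationary (constant-mean) series yields a single interval, while a ramping series is cut into several short intervals, mirroring Figure 3.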

Nonparametric Estimation of Mean Value Random Process
It is known that there are no universal estimates of probabilistic characteristics for a wide class of random processes (Golovnyh, 2005). For example, maximum-likelihood estimates are optimal for Gaussian-distributed random numbers, but they are not effective for uniformly distributed numbers, nor when there is correlation between samples of a Gaussian random process. Thus, a priori knowledge of the form of the cumulative distribution function of the measured random process is a necessary condition for correctly choosing estimates of statistical characteristics. This is especially important because processing usually has to deal with the only realization of a non-stationary random process, while in practice a priori information about the measured process is often absent. That practically rules out the usual parametric methods of statistical processing.
In this case, various algorithms based on adaptation to the actual properties of the realization are used: iterative searches for quasi-optimal estimation conditions over the sample size, the realization length, or the sampling step; methods of successive approximation; and trial-and-error methods (Karpenko, 2011; 2013).
However, such algorithms are rather labour-intensive and place excessive demands on microcomputer performance. For example, processing 32 signals with a spectrum width of 4 kHz would require a microcomputer throughput of about 62 million simple operations per second, and a memory budget for programs, input data, and processing results of about 2 thousand 8-bit words.
Thus, the procedure for deriving optimal estimates of probabilistic characteristics is tightly coupled with minimizing computational cost. Estimation can be simplified by using the order statistics (OS) of the ranked series (RS) of random numbers obtained on a stationarity interval, i.e. the samples arranged in sequence x(1) ≤ x(2) ≤ … ≤ x(N), where x(R) is the R-th order statistic of the ranked series on the stationarity interval. Ranking can be done in either ascending or descending order.
A number of studies have analysed the errors of estimating probabilistic characteristics from order statistics. However, those works were restricted to stationary random processes, whereas the case of particular interest is deriving estimates of probabilistic characteristics from samples of a non-stationary random process.
Order statistics allow rather simple procedures for calculating the mean value of a random process, based on the central order statistics (COS) of the ranked series (Tarasenko, 1975; Daivid, 1979).
There are also estimates based on truncating the ranked series, and further estimates can be synthesised from various combinations of those listed, using a specially chosen multiplier φ whose selection is not considered in this article.
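The formulas for the individual estimates did not survive extraction, so the sketch below shows three standard order-statistic estimates of the mean of the kinds the text names: a central-order-statistic (median) estimate, a midrange estimate built from the extreme order statistics, and a truncated (trimmed) estimate. The dictionary keys and the truncation depth are illustrative, not the article's notation.

```python
def os_mean_estimates(sample):
    """Order-statistic (ranked-series) estimates of the mean value:
    median (central order statistics), midrange (extreme order
    statistics), and a trimmed mean (truncated ranked series)."""
    x = sorted(sample)                           # the ranked series
    n = len(x)
    median = x[n // 2] if n % 2 else 0.5 * (x[n // 2 - 1] + x[n // 2])
    midrange = 0.5 * (x[0] + x[-1])              # uses x_(1) and x_(N)
    j = n // 10                                  # illustrative truncation depth
    trimmed = sum(x[j:n - j]) / (n - 2 * j)
    return {"median": median, "midrange": midrange, "trimmed": trimmed}
```

All three require only a sort and a handful of arithmetic operations, which is the computational advantage the text relies on.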

Nonparametric Estimation of a Random-Process Variance
When measuring the dispersion, it is advisable to use the same ranked order statistics as in the estimation of the mean value. At the same time, it is best to estimate not the dispersion of the process itself but the mean-square deviation (MSD). The simplest nonparametric MSD estimate uses the range of the sample, σ̂ = ν·(x(N) − x(1)). It is also possible to use other estimates (in the original notation, σ31, σ32, and others) built, as in the case of estimating the mean, from various combinations of central order statistics and extreme order statistics (EOS). The coefficient ν can be assigned from a wide range; however, the most effective values are (Mamchev, 2007): 1; 1/2; 1/3; 1/4.
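A minimal sketch of the range-based MSD estimate, assuming the reading σ̂ = ν·(x(N) − x(1)) given above; the default ν is one of the effective values the text lists.

```python
def msd_range_estimate(sample, nu=0.25):
    """Range-based estimate of the mean-square deviation:
    sigma_hat = nu * (x_(N) - x_(1)), with nu in {1, 1/2, 1/3, 1/4}."""
    x = sorted(sample)
    return nu * (x[-1] - x[0])
```

Only the two extreme order statistics are needed, so the cost is a single subtraction and multiplication once the ranked series is available.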

Nonparametric Estimation of Function of Distribution of Casual Process
The ranked series of order statistics can be used to estimate the distribution function F(x) of the random process and its density f(x). It suffices to estimate one of them and obtain the other indirectly, by differentiating F(x) or integrating f(x). For the transmission of telemetry data it is better to estimate the distribution function F(x): methods for estimating f(x) are more complex to implement, and transmission of F(x) has better noise immunity than transmission of f(x), owing to the continuous increase of the ordinate of F(x). Therefore, methods of estimating the distribution function receive more attention here.
The classic definition of the distribution function as the probability of the event (x(t) < x) allows us to write F(x) ≈ Nx/N, where Prob(…) denotes probability, N is the sample size, and Nx is the number of samples of the process x(t) not exceeding the value x. The statistical relationship between a sample value and its rank allows the approximate estimate F(x(R)) ≈ R/(N + 1), where R is the rank of the sample x(R) in the ranked series.
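A sketch of the rank-based distribution-function estimate F(x(R)) ≈ R/(N + 1) described above; the function name is illustrative.

```python
def ecdf_by_rank(sample):
    """Rank-based CDF estimate: F(x_(R)) ~ R / (N + 1),
    returned as (value, F-estimate) pairs over the ranked series."""
    x = sorted(sample)
    n = len(x)
    return [(xi, r / (n + 1)) for r, xi in enumerate(x, start=1)]
```

Dividing by N + 1 rather than N keeps the estimate strictly inside (0, 1) at both extreme order statistics.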
A modification of this method fixes as the quantile not the order statistic x(R) of rank R but a linear combination Q of order statistics; in these estimates the quantile value is taken as the average of two or three order statistics.
Another way to estimate the distribution function is based on a nonparametric tolerance interval (L2 − L1), where L1 and L2 are called 100β-percent distribution-free tolerance limits of F(x) at level γ, i.e. Prob{F(L2) − F(L1) ≥ β} = γ.
If we take L1 = x(R) and L2 = x(S), with R < S, the tolerance interval [x(R), x(S)] equals the sum of the elementary coverages from the R-th to the S-th order statistic. Thus γ is a function of the arguments N, S − R, and β. There is some minimum value Nmin to which, in each specific case, a quite definite combination of R and S corresponds. It is possible to construct N(N − 1)/2 tolerance intervals with various levels γ, of which N/2 or (N − 1)/2 (depending on whether N is even or odd) are symmetric. To ensure symmetry, the ranks should satisfy the condition R + S = N + 1. Then, for the estimate F5(x) at the points x(R) and x(S) with confidence coefficient γ, one can accept the values F5(x(R)) = (1 − β)/2 and F5(x(S)) = (1 + β)/2. Thus, varying R from 1 to N/2 and computing the matching values of S, one can obtain the estimate F5(x) at N points.
Another nonparametric estimate, F6(x), can be derived from the definition of a nonparametric confidence interval [x(R), x(R+K)] for a quantile xp of level p. The confidence level γ is determined from the relation γ = Σ_{i=R}^{R+K−1} C(N, i) p^i (1 − p)^{N−i}.


Moreover, the probability γ that the quantile xp falls between the order statistics x(R) and x(R+K) does not depend on the form of the initial distribution F(x).
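The binomial sum for γ reconstructed above can be evaluated directly; the function name and argument order are illustrative.

```python
from math import comb

def quantile_ci_confidence(n, r, k, p):
    """Distribution-free confidence that the p-quantile lies in
    [x_(r), x_(r+k)]: gamma = sum_{i=r}^{r+k-1} C(n,i) p^i (1-p)^(n-i).
    Independent of the underlying F(x), as the text notes."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(r, r + k))
```

For example, with n = 10 samples the interval [x(1), x(10)] covers the median with confidence 1 − 2·(1/2)^10 ≈ 0.998.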

Results
A comparative analysis of the above methods of estimating the distribution function showed that, in terms of computational cost, the most satisfactory are the estimates of the forms F1, F2, F3, and F4. Moreover, F6 is much more complicated to implement than the estimate F5, and its use is therefore undesirable.
The errors of the mean and mean-square-deviation estimates of the random process were analysed by statistical modeling in the Mathcad environment.

A random function X(t) generated by rnorm(N, μ, σ), i.e. having a Gaussian (normal) distribution with mean μ and mean-square deviation σ (for example, μ = 0, σ = 1), was used for the modeling.
The signal (trend) F(t) was specified as a function whose parameters A and a1 were varied within different limits for the purposes of the modeling.
In a number of cases the trend was taken as F(t) = t.
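The modeling setup just described can be sketched as follows. Since the exact trend formula and parameter values are not given in a recoverable form, the harmonic trend and the values of A, a1, μ, and σ below are purely illustrative assumptions.

```python
import math
import random

random.seed(0)

# Illustrative parameters (not the article's values):
N, mu, sigma = 300, 0.0, 1.0        # Gaussian component X(t)
A, a1 = 5.0, 0.02                   # assumed harmonic trend parameters

trend = [A * math.sin(a1 * t) for t in range(N)]      # F(t)
noise = [random.gauss(mu, sigma) for _ in range(N)]   # X(t)
y = [f + x for f, x in zip(trend, noise)]             # y(t): trend + noise
```

Replacing `trend` with `[float(t) for t in range(N)]` reproduces the F(t) = t case mentioned above.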
As a result, the non-stationary random process was generated (Figure 4). The modeling results showed the following.
The error of the estimated mean depends essentially on the trend form and on the proportion of the random and non-random components (the noise/signal ratio) (Karpenko, 2014; Mala, 2005). The plotted dependences show a clear-cut minimum of the error of the non-stationary mean estimate, lying in the significance-value range α = 0.05÷0.06. The error growth outside this range is explained by two effects. On the one hand, at α < 0.05 the acceptance region of the stationarity hypothesis widens, which reduces the accuracy of separating the non-stationary component. On the other hand, at α > 0.06 the acceptance region shrinks, reducing the number of samples per stationarity interval and thereby increasing the error of the estimated mean. The position of the minimum at α = 0.05÷0.06 does not depend on the trend character or on the noise/signal ratio. The central order statistics are obviously the easiest to implement. Figure 8 gives a comparative analysis of the computational cost of different methods of estimating the mean; the minimum cost evidently belongs to the simplest order-statistic estimates. A comparative analysis also shows how the mean error depends on the volume N of uncorrelated samples (Figures 9 and 10); this dependence is similar to the corresponding dependence for the estimate m̂0, from which one can conclude that the estimates based on order statistics of the ranked series are consistent. The following designations are used in Figures 9 and 10: ϗN is the specified realization length, where ϗ = 1/K_N; K_N is the number of samples on a correlation interval; ε, δ are estimation errors.
Among the considered COS-based estimates, the best accuracy is shown by the simple estimate m̂12, which also has the minimum computational cost. The errors of estimates using extreme order statistics are no worse than the COS estimates for small N, but for large N they are less effective than the simple estimates m̂11 and m̂12. In the class of estimates using truncation of the ranked series, the quality of the estimates increases with decreasing truncation depth j; this is especially noticeable for small N, while for large N (N > 50) decreasing j has virtually no effect on the quality of the estimates. Estimates of the forms m̂6, m̂7, m̂8 have no preferential areas of use, in view of the more complex computations required to obtain them. For all estimation methods these dependences reach a minimum at a sample size of 20÷30 samples, and the dependences are common to all trend types. The deterioration of accuracy for N > 30 is due to the non-stationarity of the measured process, which for large sample sizes affects accuracy more strongly than the increase in sample size helps.
With a sample size of 20÷30 samples, the errors of all estimates do not exceed 30%.
From these dependences one can conclude that the most effective estimate, both in terms of accuracy and in the sense of minimizing computational cost, is the simple estimate of the form m̂12.
The analysis of the variance estimation showed that, for a fixed value of the coefficient ν, there are no satisfactory estimates over a wide range of N. For N = const, the stronger the correlation between samples, the larger the required value of the coefficient. Figure 11 shows the errors of different estimates for various coefficients ν as functions of the sample size for uncorrelated samples. These dependences show that the value of ν should be assigned adaptively, proceeding from the sample size.
The use of estimates of types σ2j and σ4j is unpromising, as none of them gives acceptable accuracy (δσ < 30%) over the whole range of N for any value of ν. Estimates of the forms σ3j and σ5j are less preferable than the estimates using extreme order statistics, in view of their large computational cost.
Thus, the most acceptable are estimates of the forms σ11 and σ12, which yield satisfactory results over a wide range of N. The optimal estimation conditions can be formulated accordingly.

Discussion
Choice of an optimal sampling step. The literature offers various approaches to choosing the sampling step Δt. Most recommendations propose sampling the random process on the basis of the Kotelnikov (sampling) theorem. Some works propose determining the sampling step from the accuracy of approximating estimates of the correlation function of the random process. However, the excessively high sampling-rate demands of the first approach, and the need for a priori knowledge of the correlation function in the second, prevent using these techniques when processing broadband random processes.
Some works recommend basing the choice of sampling step on the absolute error of restoring the continuous realization. However, the most acceptable criterion is to assign the sampling step on the basis of the statistical error of the obtained estimates. Since analytical study of the error expressions is difficult because of their bulk, the error analysis was carried out by computer modeling.
Since the variances of the estimates have a minimum, a quasi-optimal sampling step Δt = K·τk can be chosen, ensuring estimation of the probabilistic characteristics with an error close to the minimum for the smallest possible number of samples in the realization. Here τk is the correlation interval, for which, when estimating the mean and the variance, one takes the time of the first crossing of the normalized correlation function through the level λ = 0.3. When estimating the distribution function, the sampling step can be assigned by the same technique, since the qualitative behaviour of the estimation error is the same as for the mean and variance estimates; the level λ in this case is taken equal to 0.7. The estimation accuracy then worsens by 3÷5 %, while the sample size is reduced by almost an order of magnitude compared with the optimal estimate. However, since the correlation function is not known a priori, it is better to take for τk the integral correlation interval, i.e. the integral of the absolute value of the normalized correlation function over positive lags; in that case, assigning the sampling step requires a priori knowledge only of the spectrum width.
The research outcomes showed that, when estimating the mean, an effective way of assigning the sampling step is to choose it from the condition that the sample values be uncorrelated, since the presence of correlation worsens the estimation accuracy (Figure 13). The need to derive all the estimates of probabilistic characteristics from the same sampled data makes it impossible to achieve optimality for all estimates simultaneously, owing to their different demands on the sampling step. In this case, since the information on the mean value is the most important, it is expedient to choose the sampling step Δt equal to the correlation interval τk and obtain optimal estimates of the non-stationary mean. The variance and distribution-function estimates will then not be optimal, but will still be quite acceptable.

Conclusion
Random processes in rocket-and-space technology are, as a rule, represented by a single realization under uncertainty about the form of the distribution function. On-line processing of such processes is possible using nonparametric decision-theory methods. The present work describes a technique for dividing the dynamic observational series into stationarity intervals, followed by estimation of the mean values and the dispersion.
In the future we plan to study estimation of the distribution function and the correlation function of a non-stationary broadband random process.

Figure 1. Additive-multiplicative model of the signal

Figure 2. Algorithm of division of an interval of observation into stationarity intervals

Figure 4. Example of generating the non-stationary random process

Figure 5. Separation of the non-stationary component (red: non-stationary random process y(t); blue: trend F(t); black: the estimate m̂11)

Figure 6. Dependence of the error of calculating the mean value on the significance value

Figure 7. Dependence of the error of calculating the mean value on the significance value

Figure 8. Comparative analysis of the computational costs of the estimates

Figure 9. Dependence of the error of estimating the mean value on the specified realization length

Figure 11. Dependence of the error of estimating the mean-square deviation on the number of samples on a stationarity interval
Figure 12. Normalized correlation function of the random process. A peculiarity of using the integral correlation interval is that this quantity is connected with the effective spectrum width

Figure 13. Dependence of the error of calculating the mean value on the sampling interval
Figure 14.
