Probabilistic Approach to the Synthesis of Algorithm for Solving Problems

Based on the axioms for the randomized algorithm, this paper considers the possibility of using a probabilistic hidden Markov model in the synthesis of correct algorithms for solving a problem. Applying this model makes it possible to form an algorithm whose flexibility is adjusted to the situation at hand, ensuring the structural and functional stability of the program that implements the algorithm. We found that randomization of the algorithm, while increasing its flexibility and efficiency, does not improve its risk compared with the corresponding deterministic algorithm. Synthesis of the algorithm based on a hidden Markov model means that the available observed data are used to determine the hidden parameters of the most likely sequence of states, which determines the synthesized algorithm. At the first stage of the strategy, the forward-backward ("back and forth") algorithm is used to evaluate how well the model matches the input data of the synthesized algorithm. At the second stage, for the given hidden Markov model with its space of hidden states, initial probabilities of being in state i and probabilities of transition from state i to state j, the Viterbi path is found from the observed states using the Viterbi algorithm. At the third stage, the hidden Markov model is corrected by optimizing its parameters using the Baum-Welch algorithm.


Introduction
Any purposeful activity rests on decision-making procedures: sequences of actions (algorithms) that convert raw data so as to achieve the goal (the desired result of solving the problem) at the lowest cost. A problem can be solved by different algorithms, each with its own advantages and disadvantages. The synthesis of a solution includes constructing a qualitative model of the problem, writing it in mathematical form, constructing the objective function of the variables, and studying the effect of the variables on the objective function.
When decisions are made under conditions of certainty, the criteria approach is used, in which each alternative is evaluated against criteria. However, the multifactorial impact of environmental conditions and controlling actions on the variables of the algorithm's objective function restricts the use of this approach: complete certainty about the consequences of a choice is rare, and in practice several different criteria are often used, and rarely is one alternative the best for each of them.

Literature Review
To overcome this limitation, stochastic algorithms are currently widely used (Fedosova, A. V. & Zavriev, S. K., 1988; Wardi, Y. 1990; Lukshin, A. V., & Smirnov, S. N., 1988; Simonov, N. A. 1995), along with their varieties, probabilistic and heuristic algorithms, which are characterized by the ability to control their flexibility and efficiency; this determines the use of the algorithm in the formation of a probabilistic model. A randomized algorithm sets the strategy for solving the problem in several ways or methods, so that the result is achieved with some probability (Kazharov, A. A., & Kureichik, V. M., 2010; Levanova, T. V., 2004), whereas for heuristic algorithms the achievement of the end result is not strictly predetermined, the entire sequence of actions is not defined, and not all actions of the performer are revealed. Genetic algorithms (Kureichik et al., 2006) are an example of heuristic search algorithms used for solving optimization problems and for modeling by random selection, combination, and variation of the desired parameters using mechanisms resembling biological evolution.
The purpose of this paper is to investigate the possibility of using a probabilistic model of the algorithm in the synthesis of correct algorithms. Such a model makes it possible to control the flexibility of the algorithm so as to provide structural and functional stability of the program implementing it, for example during a hacker attack on it.

General Analysis of the Decision Making Process
In accordance with (Rudakov K. V. 1 1987; Rudakov K. V. 2 1987; Rudakov K. V. 3 1987; Rudakov K. V. 1988), let us define the problem of synthesizing a correct algorithm as the process of converting primary (original) information $J_{in}=\{X(S_i)\mid S_i\in U\}$ on some set $U=\{S_i\}$, whose elements are represented by objects $X(S_i)$ and are described through their observations $I_{in}$, so that $J_{in}=\{I_{in}\}$. Depending on the application, $I_{in}$ can be a number or a non-numerical mathematical object (a symbol of an abstract alphabet, a vector, a sequence of characters, a function of a single variable (a process), a function of two variables (an image), or a function on a more complex domain).
The problem is solved using its model $M$, which defines the set of solution algorithms $A_M\subseteq\{a\mid a: J_{in}\xrightarrow{M}J_{out}\}$; i.e., within the set $A_M$, an algorithm $a\in A_M$ implements a mapping from the space of initial information $J_{in}$ to the space of final information $J_{out}$. As a result, the "black box" turns "white": it is characterised by the structural information of the problem $I_s$, which reflects completely and correctly the essence of the problem $J_{in}$. The structural information $I_s$ singles out a subset of admissible mappings, the model $M[I_s]$, and defines the correct solution of the problem, $J_{out}=\{I_s\}$. An algorithm $a$ that implements an admissible mapping determined by the structural information $I_s$ is a correct solution to the problem. The set of stochastic algorithms contains not only rules in the class of strict solutions (the deductive approach) but also the abductive approach (inverse deduction), used to determine the most probable initial premises of the conclusions by inverse transformation, and the inductive approach, used to identify the most probable regularities arising from comparison of the initial data with the known results. This choice can be made under various options:
• The set of algorithms can be discrete or continuous.
• The selection mode can be one-time or iterative.
• Alternatives may be evaluated according to criteria of different types.
• The consequences of each alternative may be known (conditions of certainty); not known for sure, although the probability of the effects can be estimated (conditions of statistical uncertainty); or unknown, with probabilities that cannot be assessed (conditions of uncertainty).

The General Analysis of the Bayesian Approach to Problem Solving
The Choice of Criteria for Evaluating the Quality of the Solution for the Problem
The Bayesian approach is currently widely used for solving a variety of problems. It is a parametric approach (Harin, A. Yu., 2013; Orlov, A. I., 2004) based on knowledge of the distribution density of the input variables of the problem, for which the correct algorithm can be written in explicit analytical form using the formalised structural information of the algorithm $I_s$.
Bayesian problems and the basic properties of Bayesian algorithms are defined by the mathematical properties of the sets of input information $J_{in}$, states $X$ and solutions $D_{out}\in J_{out}$, which determine the efficiency of the algorithms. The algorithm is defined on a finite set of observed parameters $y\in Y$ and a finite set of hidden states $x\in X$; in addition, it shall be consistent with the input data $D_{in}\in J_{in}$. On the set $Y\times X$ of all possible pairs of observations $y\in Y$ and states $x\in X$, the joint probability distribution $P_{YX}(y,x): Y\times X\to\mathbb{R}$ is given.
To introduce a quality criterion for the solution, let us define, for states $x\in X$ and possible solutions $d_{out}\in D_{out}$, the loss function $W(x,d_{out}): X\times D_{out}\to\mathbb{R}$. For an algorithm $a: Y\to D_{out}$, which assigns to each observation $y\in Y$ a solution $a(y)\in D_{out}$, we define the risk $R(a)$ as the mathematical expectation of the loss function for the algorithm $a$. The Bayesian problem of statistical decisions is then as follows: for the given sets $Y$, $X$, $D_{out}$, the distribution $P_{YX}$ and the loss function $W$, find the algorithm $a: Y\to D_{out}$ that minimises the Bayesian risk
$$R(a)=\sum_{y\in Y}\sum_{x\in X}P_{YX}(y,x)\,W(x,a(y)). \quad (1)$$
The result produced by the Bayesian algorithm $a^{*}$ is the solution of the Bayesian problem with the minimal risk $R(a)$. The sets of states $X$ and solutions $D_{out}$ may take different forms. Depending on the restrictions on the mathematical form of the elements of the sets of observations $Y$, states $X$ and solutions $D_{out}$, the formulation of the Bayesian problem is made specific.
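As an illustration of the risk defined above, the following minimal Python sketch evaluates the Bayesian risk and finds a minimising algorithm for a toy problem. The sets, the joint distribution `P_YX` and the 0-1 loss are assumptions chosen for the example and are not taken from the paper.

```python
# Toy Bayesian decision problem; all numbers are illustrative assumptions.
Y = [0, 1]  # observations y
X = [0, 1]  # hidden states x
D = [0, 1]  # possible solutions d_out

# Joint distribution P_YX(y, x); the entries sum to 1.
P_YX = {(0, 0): 0.4, (0, 1): 0.1,
        (1, 0): 0.2, (1, 1): 0.3}


def W(x, d):
    """0-1 loss (an assumption): a solution is penalised if it differs from the state."""
    return 0.0 if x == d else 1.0


def risk(a):
    """R(a) = sum_y sum_x P_YX(y, x) * W(x, a(y)) for an algorithm a given as a dict."""
    return sum(P_YX[(y, x)] * W(x, a[y]) for y in Y for x in X)


def bayes_algorithm():
    """For each y choose the solution that minimises sum_x P_YX(y, x) * W(x, d)."""
    return {y: min(D, key=lambda d: sum(P_YX[(y, x)] * W(x, d) for x in X))
            for y in Y}


a_star = bayes_algorithm()
print(a_star, risk(a_star))
```

Because the risk is a sum of independent terms, one per observation, the minimisation can be done separately for each y, which is what `bayes_algorithm` does.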

Generalized Bayesian Formulation of Problems Solving
The generalized Bayesian formulation of problem solving with model $M$ expands the set of algorithms $A_M$ so that it includes not only algorithms of the form $a: Y\to D_{out}$ but all possible probability distributions $P_s(d_{out}\mid y)$; i.e., in a stochastic algorithm each value $y$ is randomly assigned a solution $d_{out}$ in accordance with the probabilities $P_s(d_{out}\mid y)$. The search for the best element is thus carried out over the set of stochastic algorithms, because always taking the same deterministic solution $d_{out}=a(y)$ for a fixed observation $y$ appears to contradict the random nature of the state in which the algorithm is present.
For the Bayesian formulation on finite sets $Y$, $X$, $D_{out}$ with the probability distribution $P_{YX}: Y\times X\to\mathbb{R}$ and the loss function $W: X\times D_{out}\to\mathbb{R}$, let us define the stochastic algorithm $a_s: D_{out}\times Y\to\mathbb{R}$, whose risk is
$$R_s=\sum_{y\in Y}\sum_{x\in X}P_{YX}(y,x)\sum_{d_{out}\in D_{out}}a_s(d_{out}\mid y)\,W(x,d_{out}). \quad (2)$$
Theorem: For any stochastic algorithm there is a deterministic algorithm $a: Y\to D_{out}$ whose risk is not greater than $R_s$, i.e. the Bayesian problem can be reduced to finding a deterministic algorithm $a: Y\to D_{out}$.
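Proof: Let us rewrite equation (2) as
$$R_s=\sum_{y\in Y}\sum_{d_{out}\in D_{out}}a_s(d_{out}\mid y)\sum_{x\in X}P_{YX}(y,x)\,W(x,d_{out}).$$
Since $a_s(d_{out}\mid y)\ge 0$ and $\sum_{d_{out}\in D_{out}}a_s(d_{out}\mid y)=1$ for every $y\in Y$, it follows that
$$R_s\ge\sum_{y\in Y}\min_{d_{out}\in D_{out}}\sum_{x\in X}P_{YX}(y,x)\,W(x,d_{out}).$$
Let $a(y)$ denote any value of $d_{out}$ for which
$$\sum_{x\in X}P_{YX}(y,x)\,W(x,a(y))=\min_{d_{out}\in D_{out}}\sum_{x\in X}P_{YX}(y,x)\,W(x,d_{out}).$$
The algorithm $a: Y\to D_{out}$ defined in this way is deterministic and is not worse than the stochastic algorithm $a_s$:
$$R_s\ge\sum_{y\in Y}\sum_{x\in X}P_{YX}(y,x)\,W(x,a(y))=R_{det},$$
i.e. for the deterministic algorithm $a$ the risk $R_{det}\le R_s$.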
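The statement of the theorem can also be checked numerically. The sketch below, written under the same illustrative assumptions (a hypothetical two-observation, two-state problem with 0-1 loss), draws random stochastic algorithms and compares their risk with the risk of the best deterministic algorithm. It is a sanity check rather than a substitute for the proof.

```python
import random

# Toy problem; the sets, probabilities and loss are illustrative assumptions.
Y, X, D = [0, 1], [0, 1], [0, 1]
P_YX = {(0, 0): 0.4, (0, 1): 0.1,
        (1, 0): 0.2, (1, 1): 0.3}


def W(x, d):
    """0-1 loss (an assumption made for the example)."""
    return 0.0 if x == d else 1.0


def partial_risk(y, d):
    """Inner sum over states: sum_x P_YX(y, x) * W(x, d)."""
    return sum(P_YX[(y, x)] * W(x, d) for x in X)


def stochastic_risk(a_s):
    """R_s for a stochastic algorithm, where a_s[y][d] is the probability of d given y."""
    return sum(a_s[y][d] * partial_risk(y, d) for y in Y for d in D)


def best_deterministic():
    """Deterministic algorithm a(y) that minimises the partial risk for each y."""
    a = {y: min(D, key=lambda d: partial_risk(y, d)) for y in Y}
    return a, sum(partial_risk(y, a[y]) for y in Y)


def random_stochastic_algorithm(rng):
    """Draw a random conditional distribution over the two solutions for each y."""
    a_s = {}
    for y in Y:
        p = rng.random()
        a_s[y] = {0: p, 1: 1.0 - p}
    return a_s


rng = random.Random(0)
a_det, r_det = best_deterministic()
gaps = [stochastic_risk(random_stochastic_algorithm(rng)) - r_det
        for _ in range(1000)]
print(min(gaps) >= -1e-12)  # True if no sampled stochastic algorithm beats a_det
```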

Bayesian Algorithm Problem Solving Model
The conceptual model of a deterministic algorithm is a sequence of machine instructions (operators) $v_i$ defined on the set of machine instructions $\{v_1,\dots,v_{n_{mk}}\}$, where $n_{mk}$ is the number of machine instructions contained in the sample class of algorithms, understood as the instruction set of the class of processors in use. For the stochastic algorithm model, the variables are defined by the following meaningful axioms:
• the observable (hidden) events are represented as a sequence ordered in time (the Markov property, which ensures convergence of the strategy); thus the $t$-th hidden variable $s_t$, given the $(t-1)$-th variable $s_{t-1}$, is independent of all variables preceding $s_{t-1}$, which gives the transition function
$$P(s_t\mid s_{t-1},s_{t-2},\dots,s_1)=P(s_t\mid s_{t-1}),$$
which is not time dependent;
• the two sequences $S=s_1s_2\dots s_T$ and $O=o_1o_2\dots o_T$ shall be aligned, i.e. each observed event $o_t$ shall correspond to one hidden event $s_t$, so that the value of the observed variable $y(t)$ depends only on the value of the hidden variable $x(t)$ (both at time $t$);
• the calculation of the most likely hidden sequence up to time $t$ depends only on the observed event at time $t$ and the most likely sequence up to time $t-1$.
These axioms (8, 9) justify the use of a Hidden Markov Model (HMM) as the model of the algorithm (Figure 1). The elements of the model include:
3) The probability distribution of state transitions (the transition probability matrix) $P=\|P_{ij}\|$, where $P_{ij}=P[s_{t+1}=x_j\mid s_t=x_i]$, $1\le i,j\le n$, is the probability of the transition from the state $s_t=x_i$ at time $t$ to the state $s_{t+1}=x_j$ at the next time $t+1$.
4) The probability distribution of the observation symbols in state $j$, $P_s=\|P_j(k)\|$, where $P_j(k)=P[v_k\mid s_t=x_j]$, $1\le j\le n$, $1\le k\le m$.
In short form, the HMM is written $\Theta=(P,P_s,\Pi)$ and represents a doubly stochastic process consisting of a pair of random sequences $\{o_1,\dots,o_t,\ s_1,\dots,s_t\}$, where $o_t$ are the known discrete observations describing the appearance of the observation symbols (machine instructions), and $s_t$ are the "hidden" discrete values determining the changes in the state of the model; it is not known how many states there are or what connections exist between them (the unknown parameters of the model).
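To make the triple Θ=(P, P_s, П) concrete, the following Python sketch stores a small model in three arrays and samples a pair of aligned sequences from it. The two-state, three-symbol model and its numbers are hypothetical, and the names `P`, `Ps`, `Pi`, `is_stochastic` and `sample` are introduced only for this illustration.

```python
import random

# Hypothetical model with n = 2 hidden states and m = 3 observation symbols.
P = [[0.7, 0.3],           # P[i][j]  = P[s_{t+1} = x_j | s_t = x_i]
     [0.4, 0.6]]
Ps = [[0.5, 0.4, 0.1],     # Ps[j][k] = P[v_k | s_t = x_j]
      [0.1, 0.3, 0.6]]
Pi = [0.6, 0.4]            # initial state distribution


def is_stochastic(rows, tol=1e-9):
    """Check that every row is a probability distribution."""
    return all(min(r) >= 0.0 and abs(sum(r) - 1.0) < tol for r in rows)


def sample(T, rng):
    """Generate T hidden states and the T observations aligned with them."""
    states, observations = [], []
    s = rng.choices(range(len(Pi)), weights=Pi)[0]
    for _ in range(T):
        states.append(s)
        # The observation at time t depends only on the hidden state at time t.
        observations.append(rng.choices(range(len(Ps[s])), weights=Ps[s])[0])
        # The next state depends only on the current state (Markov property).
        s = rng.choices(range(len(P[s])), weights=P[s])[0]
    return states, observations


rng = random.Random(1)
S, O = sample(5, rng)
print(S, O)
```

Only the observation list would be visible in practice; the state list is returned here so that the doubly stochastic structure can be inspected.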

The Strategy of Solving Algorithm Synthesis
The accepted model of the algorithm determines an iterative strategy of algorithm synthesis, which reduces to transforming the original model $\Theta=(P,P_s,\Pi)$, by iterative recalculation of the model parameters until convergence, into an optimal model of the correct algorithm.
Statement of the problem. The observed value $y(t)$ is known in the form of a sequence $Y_T=y_0,y_1,\dots,y_{T-1}$ of length $T$, produced by the sequence $O=o_1o_2\dots o_T$. The probability of observing $Y_T$ is
$$P(Y_T)=\sum_{X_T}P(Y_T\mid X_T)\,P(X_T),$$
where the sum runs over all possible sequences of hidden nodes $X_T=x_0,x_1,\dots,x_{T-1}$ of the hidden variable $x(t)$. From the available data $y(t)$, it is required to determine the hidden parameters of the most likely sequence of states of the Markov chain $S=s_{i1}\dots s_{iT}$.
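For short sequences, the sum over all hidden sequences in the problem statement can be evaluated directly. The sketch below does this by enumeration for a hypothetical two-state model (the numbers and the names `P`, `Ps`, `Pi`, `sequence_probability` are assumptions for illustration). The cost grows as n^T, which is why the staged algorithms described next are used instead.

```python
from itertools import product

# Hypothetical model parameters (illustrative values only).
P = [[0.7, 0.3], [0.4, 0.6]]             # transition probabilities
Ps = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # observation probabilities per state
Pi = [0.6, 0.4]                          # initial state probabilities


def sequence_probability(obs):
    """P(Y_T) = sum over hidden sequences X_T of P(Y_T | X_T) * P(X_T).

    Assumes a non-empty observation sequence of symbol indices.
    """
    n = len(Pi)
    total = 0.0
    for states in product(range(n), repeat=len(obs)):
        # P(X_T): initial probability times the chain of transitions.
        p_x = Pi[states[0]]
        for a, b in zip(states, states[1:]):
            p_x *= P[a][b]
        # P(Y_T | X_T): each observation depends only on its own hidden state.
        p_y_given_x = 1.0
        for s, o in zip(states, obs):
            p_y_given_x *= Ps[s][o]
        total += p_y_given_x * p_x
    return total


p_obs = sequence_probability([0, 1, 2])
print(p_obs)
```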
The formulated problem can be solved in three stages, each of which is implemented by a known algorithm.
First stage: $T$ steps in the model $\Theta=(P,P_s,\Pi)$ give the sequence of observations $O_{1,T}=o_1,\dots,o_T$. At time $t$, for a state $s$, $P(O_{1,t-1}\mid X_t=s)$ is the probability that the sequence of observations $O_{1,t-1}$ was produced during the transitions leading to that state, and $P(O_{t,T}\mid X_t=s)$ is the probability that the sequence of observations $O_{t,T}$ is observed after that state.
The quantity sought is the probability $P(X_t=s\mid O)=P(X_t=s\mid O_{1,t-1}\cap O_{t,T})$ that at time $t$ the chain is in the state $s$.
1. For an arbitrary state $s$ at an arbitrary step $t$, the probability $P(O_{1,t}\mid X_t=s)$ that the sequence $O_{1,t}$ was produced on the way to that state can be computed recursively: the probability of reaching the state $s$ at the $t$-th step, given that the event $o_t$ occurs after the transition, equals the probability of being in the state $j$ at the $(t-1)$-th step multiplied by the probability of the transition from $j$ to $s$ with the event $o_t$ produced, summed over all $j\in S$.
2. The probability that the sequence $O_{t+1,T}$ is produced after an arbitrary state $s$ is likewise defined recursively.
3. To find the probability $P(O)$ that the whole chain of events is produced, the products of these two probabilities are summed over all states at an arbitrary step $t$. Since the future of the Markov chain does not depend on the past, and the probability of observing the event $o_t$ does not depend on the past observations $O_{1,t-1}$, the probability that at time $t$ the chain is in the state $s$ is the product of the two probabilities for the state $s$, normalised by $P(O)$.
This problem is solved by the forward-backward ("back and forth") algorithm (Binder et al., 1997; Lawrence, R., & Rabiner, A., 1989; Lawrence et al., 1986), which finds, in a hidden Markov model, the probability of being in the state $s$ at the $t$-th step given the sequence of observations $O$ and the (hidden) sequence of states $X$.
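The three steps of the first stage can be sketched in Python as follows. This is a minimal version of the forward-backward algorithm under stated assumptions: observations are emitted by the current state (the matrix `Ps`), the forward quantity is the joint probability of the first t observations and the state at step t, and the model numbers are hypothetical. The names `forward`, `backward` and `posteriors` are introduced only for this example.

```python
# Hypothetical model parameters (illustrative values only).
P = [[0.7, 0.3], [0.4, 0.6]]             # P[j][s]: transition from state j to state s
Ps = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # Ps[s][o]: probability of symbol o in state s
Pi = [0.6, 0.4]                          # initial state probabilities


def forward(obs):
    """Step 1: alpha[t][s] = probability of the first t+1 observations, ending in state s."""
    n = len(Pi)
    alpha = [[Pi[s] * Ps[s][obs[0]] for s in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([Ps[s][o] * sum(prev[j] * P[j][s] for j in range(n))
                      for s in range(n)])
    return alpha


def backward(obs):
    """Step 2: beta[t][s] = probability of the observations after step t, given state s."""
    n = len(Pi)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(P[s][j] * Ps[j][o] * nxt[j] for j in range(n))
                        for s in range(n)])
    return beta


def posteriors(obs):
    """Step 3: P(O) and the probability of each state at each step given O."""
    alpha, beta = forward(obs), backward(obs)
    p_obs = sum(alpha[-1])
    gamma = [[a * b / p_obs for a, b in zip(a_row, b_row)]
             for a_row, b_row in zip(alpha, beta)]
    return gamma, p_obs


gamma, p_obs = posteriors([0, 1, 2])
print(p_obs)
for row in gamma:
    print(row)
```

For longer sequences the forward and backward values become very small, so practical implementations usually rescale them at each step or work with logarithms; this is omitted here for brevity.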

Second stage:
For the given HMM with the state space $S=\{s_1,s_2,\dots,s_K\}$, the initial probabilities $\pi_i$ of being in the state $i$ and the probabilities $P_{i,j}$ of transition from the state $i$ to the state $j$, the observed $y_1,\dots,y_T$ and the set of initial information $J_{in}$ are used to find the most plausible sequence of states of the hidden nodes $S=s_1\dots s_T$ (the Viterbi path), i.e. the one that best describes the given model. The most likely sequence of states $x_1,\dots,x_T$ is then defined by the recursive relations
$$V_{1,k}=P(y_1\mid k)\,\pi_k,$$
$$V_{t,k}=\max_{x\in S}\bigl(P(y_t\mid k)\,P_{x,k}\,V_{t-1,x}\bigr),$$
where $V_{t,k}$ is the probability of the most likely sequence of states responsible for the first $t$ observed symbols and ending in the state $k$. The Viterbi path is recovered from the states $x$ satisfying equation (15), i.e.
$$x_T=\arg\max_{x\in S}\,(V_{T,x}).$$
This problem is solved with the Viterbi algorithm (Andrew Viterbi), which yields the most likely sequence of hidden states (the Viterbi path) of the hidden Markov model from a sequence of observations (Forney, G. D. Jr., 2005; Forney, G. D. Jr., 1973; Viterbi A. D., & Omura, J. K., 1982; Zolotariov, V. V., & Ovechkin, G. V., 2004; Morelos-Zaragoza, R., 2006).
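The recursion of the second stage can be sketched in Python as below. The model numbers are hypothetical, observations and states are represented by integer indices, and back pointers (the list `ptr`, a name introduced here) record which predecessor state attained each maximum so that the path can be traced back from the final state.

```python
# Hypothetical model parameters (illustrative values only).
P = [[0.7, 0.3], [0.4, 0.6]]             # P[x][k]: transition from state x to state k
Ps = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # Ps[k][y]: probability of symbol y in state k
Pi = [0.6, 0.4]                          # initial state probabilities


def viterbi(obs):
    """Return the most likely hidden state sequence and its probability."""
    n = len(Pi)
    # V[t][k]: probability of the best state sequence for obs[0..t] ending in state k.
    V = [[Pi[k] * Ps[k][obs[0]] for k in range(n)]]
    ptr = []  # ptr[t-1][k]: predecessor of state k on the best path at step t
    for y in obs[1:]:
        row, back = [], []
        for k in range(n):
            best = max(range(n), key=lambda x: V[-1][x] * P[x][k])
            row.append(Ps[k][y] * P[best][k] * V[-1][best])
            back.append(best)
        V.append(row)
        ptr.append(back)
    # Final state of the path, then trace the back pointers to the start.
    last = max(range(n), key=lambda k: V[-1][k])
    path = [last]
    for back in reversed(ptr):
        path.insert(0, back[path[0]])
    return path, V[-1][last]


path, prob = viterbi([0, 1, 2])
print(path, prob)
```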

Third stage:
The given output sequence of observations $O$ is used to determine the unknown parameters of the HMM $\Theta$ that maximise the probability of the observations $O$. The number $r$ of time points at which the observations are made is set beforehand, and the procedure comprises the following steps:
1) determination of all sequences of model states $S_i=\{s_{i1},\dots,s_{ir}\}$, $i=1,\dots,r$, of the problem-solving system at the specified time points;
2) evaluation of the probability $P(S_i)$ of each sequence $S_i$, $i=1,\dots,r$, identified at the previous step, by calculating the probabilities of the transitions between the model states between the established control time points, namely
$$P(S_i)=\prod_{t=1}^{r-1}P_{s,i,t,t+1},$$
where $P_{s,i,t,t+1}$ is the probability of the transition from the state $s_{it}$, in which the system was at time $t$, into the state $s_{i,t+1}$ occupied at time $t+1$;
3) evaluation of the probability of the observed sequence $X=\{x_1,\dots,x_r\}$ for each state sequence $S_i$, $i=1,\dots,r$,
$$P(X\mid S_i)=\prod_{j=1}^{r}P_{x,i,j},$$
where $P_{x,i,j}$ is the probability of obtaining the observed characteristic $x_j$ in the state $s_{ij}$;
4) selection of the most likely state sequence $S_{max}\in\{S_i\}_{i=1,\dots,r}$, the one corresponding to the largest probability.
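The paper names the Baum-Welch algorithm for this stage. A minimal sketch of its re-estimation step for a single observation sequence is given below, under stated assumptions: the starting model and the observation sequence are hypothetical, the initial distribution is re-estimated together with the two matrices (as in the standard formulation), and no rescaling is applied, so the code is suitable only for short sequences. The helper names are introduced for this example.

```python
def forward(obs, P, Ps, Pi):
    """alpha[t][s]: probability of obs[0..t] with the chain in state s at step t."""
    n = len(Pi)
    alpha = [[Pi[s] * Ps[s][obs[0]] for s in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([Ps[s][o] * sum(prev[j] * P[j][s] for j in range(n))
                      for s in range(n)])
    return alpha


def backward(obs, P, Ps):
    """beta[t][s]: probability of the observations after step t, given state s."""
    n = len(P)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(P[s][j] * Ps[j][o] * nxt[j] for j in range(n))
                        for s in range(n)])
    return beta


def baum_welch_step(obs, P, Ps, Pi):
    """One re-estimation step; also returns P(O) under the input parameters."""
    n, m, T = len(Pi), len(Ps[0]), len(obs)
    alpha, beta = forward(obs, P, Ps, Pi), backward(obs, P, Ps)
    p_obs = sum(alpha[-1])
    # gamma[t][i]: probability of state i at step t given the observations.
    gamma = [[alpha[t][i] * beta[t][i] / p_obs for i in range(n)]
             for t in range(T)]
    # xi[t][i][j]: probability of the transition i -> j between steps t and t+1.
    xi = [[[alpha[t][i] * P[i][j] * Ps[j][obs[t + 1]] * beta[t + 1][j] / p_obs
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_Pi = list(gamma[0])
    new_P = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_Ps = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
               sum(gamma[t][j] for t in range(T))
               for k in range(m)] for j in range(n)]
    return new_P, new_Ps, new_Pi, p_obs


# Hypothetical starting model and observation sequence.
P = [[0.7, 0.3], [0.4, 0.6]]
Ps = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
Pi = [0.6, 0.4]
obs = [0, 1, 2, 2, 1, 0, 0, 2]

likelihoods = []
for _ in range(5):
    P, Ps, Pi, p = baum_welch_step(obs, P, Ps, Pi)
    likelihoods.append(p)
print(likelihoods)
```

Each entry of `likelihoods` is the probability of the observations under the parameters before the corresponding update, so the list shows how the fit changes from one recalculation to the next.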

Discussion and Conclusions
The analysis of the probabilistic approach to the synthesis of an algorithm for solving a problem leads to the following conclusions:
1. The meaningful axioms for the model of randomized algorithms make it possible to use a Hidden Markov Model (HMM) as the model of the algorithm; it is applicable not only to the original types of algorithms but also to correction algorithms and to synthesized compositions.
2. Using a probabilistic model of the algorithm, it is possible to control the relationship between the algorithm variables by setting the flexibility of the algorithm so as to achieve the desired structural and functional stability of the software implementing the algorithm.
3. Randomization of the algorithm, while increasing its flexibility and efficiency, does not improve its risk compared with the corresponding deterministic algorithm.
4. The problem of synthesizing an algorithm based on an HMM is that of using the available observed data to determine the hidden parameters of the most likely sequence of states.
5. At the first stage of the strategy, for the given model parameters $\Theta=(P,P_s,\Pi)$ and the sequences of hidden states $S$ and corresponding observable states $O$, the probability of occurrence of these sequences is determined. Thus, at the first stage, for the model $\Theta$ and the set of initial data $D_{in}$, $\pi_i=P(D_{in}\mid\Theta)$ is determined, which evaluates how well the model $\Theta$ agrees with the initial data. In this formulation the problem is solved by the forward-backward ("back and forth") algorithm, which finds, in an HMM, the probability of being in the state $s_i$ at the $t$-th step given the sequence of observations $O$ and the (hidden) sequence of states $S$.
6. At the second stage, for the given hidden Markov model $\Theta$ with the space of hidden states $S=\{s_1,s_2,\dots,s_K\}$, the initial probabilities $\pi_i$ of being in the state $i$ and the probabilities $P_{i,j}$ of transition from the state $i$ to the state $j$, the Viterbi path $S_V=s_1\dots s_T$, which describes this model most fully, can be found from the observed states $o_1,\dots,o_T$. It is natural to use the Viterbi algorithm for this purpose.
7. The third stage of the strategy consists in correcting the hidden Markov model by optimizing the model parameters $\Theta=(P,P_s,\Pi)$ so as to maximise $P(O\mid\Theta)$ for the observed data $O$. This stage is fully implemented by the Baum-Welch algorithm.
The possibility, substantiated in this article, of using the synthesis algorithm for solving the HMM problem defines further research into controlling the flexibility of the synthesized algorithm so as to provide structural and functional stability of the software implementing it.

Figure 1. Transition diagram of the HMM (x: hidden states; y: observed results; $P_{ij}$: transition probabilities; $P_i$: probability of the result).
