Large deviations, Basic information theorem for fitness preferential attachment random networks

For fitness preferential attachment random networks, we define the empirical degree and pair measure, which counts the number of vertices of a given degree and the number of edges with given fits, and the sample path empirical degree distribution. For the empirical degree and pair distribution of these networks, we find a large deviation upper bound. From this result we obtain a weak law of large numbers for the empirical degree and pair distribution, and the basic information theorem, or asymptotic equipartition property, for fitness preferential attachment random networks.


Introduction
This paper establishes an asymptotic equipartition property (AEP) for fitness preferential attachment (P.A) random networks. The AEP is an important property, often used in information theory to partition the output samples of a stochastic data source. See, for example, (Doku-Amponsah, 2010) and the references therein for similar results for networked datasets modelled as coloured random graphs or random fields.
In the past three decades, technological advances in the Social Sciences, Web Science and related fields have yielded large amounts of diverse networked datasets which are best described in terms of preferential attachment graphs. For example, the WWW, consisting of over 800 million documents (vertices) and a large number of links (edges) pointing from one document to another, is best modelled by preferential attachment graphs. See (Lawrence and Giles, 1998, 1999). In order to transmit or compress datasets from this random network source, one requires efficient coding schemes and approximate pattern matching algorithms, and the AEP for P.A networks plays a key role in this regard, for example by providing bounds on the possible performance of these schemes or algorithms. P.A models can be easily defined and modified, and can therefore be calibrated to serve as models for social networks and the web graph. These graphs model fairly well the dynamics behind the occurrence of power law degree distributions in large networks. See (Barabasi and Albert, 1999).
The main idea behind the P.A models is that growing networks are constructed by adding nodes successively. When a new node appears, it gets a fit (or colour, symbol, or spin) according to some law µ on a finite alphabet, and it is linked by an edge to one or more existing nodes with a probability proportional to a function of their degrees and fits. The dynamics of the graph are completely determined by the function f, known as the attachment rule.
There are three regimes of P.A graphs, namely the linear regime, f(k) ≈ k, the sublinear regime, f(k) ≤ k, and the superlinear regime, f(k) ≥ k. Several results about the asymptotic behaviour of these graphs have been established recently.
Few large deviation results for P.A models have so far been found. In (Choi et al., 2011), P.A schemes where the selection mechanism is possibly time-dependent are considered, and an infinite-dimensional large deviation principle for the sample path evolution of the empirical degree distribution is found by Dupuis-Ellis type methods. (Dereich and Morters, 2009) studied a dynamic model of random networks, where new vertices are connected to old ones with a probability proportional to a sublinear function of their degree. For this model of random networks, they obtained a strong limit law for the empirical degree distribution. Results on the temporal evolution of the degrees of individual vertices via large and moderate deviation principles were also found. (Bryc et al., 2009) found the large deviation principle and related results for a class of Markov chains associated to the 'leaves' in P.A models of random graphs, using both analytic and Dupuis-Ellis type path arguments.
In this article, we prove a large deviation upper bound for the empirical degree and pair distribution, and use it to find an AEP for P.A models of random graphs in the linear regime, i.e. f(k) ≈ k. Our proofs use the technique of exponential change-of-measure for random graphs; see, for example, (Dembo et al., 2003), ([...], Morters, 2010), or (Doku-Amponsah, 2010).
To be specific, we prove a large deviation upper bound, see Theorem 0.1, for the empirical degree and pair distribution of the fitness P.A model of random graphs. For a given empirical degree and pair distribution, we prove from the large deviation upper bound a weak law of large numbers, see Theorem 0.3. From this weak law of large numbers we find the AEP, see Theorem 0.7, for networked datasets modelled as a fitness P.A random graph.

Large deviation upper bound for P.A random graphs
We write N for the set of non-negative integers, N = {0, 1, 2, ...}. Given a weight function f : N × X → [0, ∞] and a probability law µ on a finite alphabet X, we define the coloured (fitness) P.A random network as follows:
• Assign vertex n = 1 (the root of the network) colour X(n) according to µ : X → [0, 1].
• If a new vertex n is introduced, it gets colour X(n) independently according to µ,
• it connects to vertices v_n ∈ {1, ..., n − 1} independently with probability proportional to f(N(v_n), A(n)), where A(n) = (X(v_n), X(n)) and N(m) is the in-degree of vertex m.
We consider ((N(v_n), A(n)) : n = 1, 2, 3, ...) under the joint law of colour and tree. Denote by X a typed tree and by X(i) the colour of vertex i.
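The construction above can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions, not the construction used in the proofs: it assumes that each new vertex attaches to exactly one existing vertex (so the graph is a tree), that the attachment weight is f(N(v), (X(v), X(n))), and it uses a hypothetical linear weight with made-up constants GAMMA and BETA. The names simulate_fitness_pa_tree and linear_weight are illustrative, and vertices are indexed from 0 rather than 1.

```python
import random

# Hypothetical constants for an illustrative linear weight; GAMMA + BETA = 1.
GAMMA, BETA = 0.5, 0.5

def linear_weight(k, pair):
    """Assumed weight f(k, a) = GAMMA * k + BETA, the same for every colour pair."""
    return GAMMA * k + BETA

def simulate_fitness_pa_tree(n, mu, f, seed=0):
    """Grow a coloured P.A tree with n vertices (vertex 0 is the root).

    mu: dict mapping colour -> probability (the colour law).
    f:  weight f(k, (a1, a2)) > 0, where k is the in-degree of the candidate
        vertex, a1 its colour and a2 the colour of the new vertex.
    Returns (colours, in_degree, observations); each observation is the pair
    (N(v), A) recorded at the moment a new vertex attaches to v.
    """
    rng = random.Random(seed)
    alphabet = list(mu)
    probs = [mu[a] for a in alphabet]
    colours = [rng.choices(alphabet, weights=probs)[0]]  # colour of the root
    in_degree = [0]
    observations = []
    for _ in range(1, n):
        new_colour = rng.choices(alphabet, weights=probs)[0]
        # Weight of every existing vertex, given the colour of the newcomer.
        weights = [f(in_degree[v], (colours[v], new_colour))
                   for v in range(len(colours))]
        v = rng.choices(range(len(colours)), weights=weights)[0]
        observations.append((in_degree[v], (colours[v], new_colour)))
        in_degree[v] += 1
        colours.append(new_colour)
        in_degree.append(0)
    return colours, in_degree, observations

colours, in_degree, observations = simulate_fitness_pa_tree(
    200, {"red": 0.5, "blue": 0.5}, linear_weight, seed=1)
```

Passing the weight as a function keeps the sketch independent of the particular form of f; any strictly positive weight can be substituted for linear_weight.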
We write X* = X × X. In this paper, we shall restrict ourselves to functions of the form where γ : X* → (0, ∞], β : X* → [0, ∞]. We assume
γ(a) + β(a) := c, for all a ∈ X, (1)
and that the function f satisfies the following weak preference condition:
Let N^(m)(i) be the degree of vertex i at time m and observe that at time n, the law of the fitness P.A graph is given by
For every X, we define the empirical degree and pair measure M_X on N × X* by
We write ℓ_m(a) = {j_m ∈ {1, 2, 3, ..., m − 1} : x(j_m) = a_1, x(m) = a_2} and for every m = 2, 3, 4, ..., n − 1 we define a probability measure on N × X* by and notice that L^X_1(k, a) = M_X(k, a). We denote by M(X) the space of probability measures on X equipped with the weak topology, and by M(N × X*) the space of probability measures on N × X*, equipped with the topology generated by the total variation metric.
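As a rough numerical companion to the empirical degree and pair measure, the sketch below computes normalised counts over a list of observations (k, a), where k is a degree and a = (a_1, a_2) a colour pair. The defining formula is not reproduced in this text, so the normalisation by the number of observations is an assumption, and the function names are hypothetical.

```python
from collections import Counter

def empirical_degree_pair_measure(observations):
    """Fraction of observations equal to each (degree, colour pair).

    Assumption: the measure is normalised by the number of observations.
    """
    counts = Counter(observations)
    total = len(observations)
    return {key: count / total for key, count in counts.items()}

def pair_marginal(measure):
    """Marginal of the measure on colour pairs (sum over degrees)."""
    marginal = Counter()
    for (_, pair), mass in measure.items():
        marginal[pair] += mass
    return dict(marginal)

# Small hand-made example: four attachment events.
example = [(0, ("r", "b")), (1, ("r", "b")), (0, ("r", "b")), (0, ("b", "b"))]
measure = empirical_degree_pair_measure(example)
pairs = pair_marginal(measure)
```

In this example the pair ("r", "b") carries mass 0.75 and ("b", "b") carries mass 0.25, and the total mass of the measure is 1.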
We are now in a position to state our large deviation upper bound for the empirical degree and pair measure of the fitness P.A model of random graphs. We write ω̄(k | a) := 1 − Σ_{j=0}^{k} ω(j | a).
Theorem 0.1. Suppose X is a coloured P.A random graph with colour law µ : X → (0, 1] and linear weight function where ω_{2,1} is the X-marginal of the probability measure ω_2 and , and hence, solving recursively for ω(· | a), we get .
Here we remark that conditions (1) and (2) are necessary for π_f(· | a) to be a probability measure on N; see (Dereich and Morters, 2009, p. 13). Note that if f(k, a) = w(k), then (3) coincides with the asymptotic degree distribution of random trees and general branching processes found in (Rudas et al., 2008).

Basic Information Theorem for fitness P.A random networks
Our main theorem is the AEP for networked datasets modelled as fitness P.A graphs, which we state in this section. By P we denote the (probability) law of a fitness P.A graph with n vertices. Thus we write
Theorem 0.2. Suppose X is a coloured P.A random graph with colour law µ : X → (0, 1] and linear weight function
In other words, in order to transmit a coloured P.A graph in the given regime one needs, with high probability, about n log 2 a 1 ∈X µ(a 1 ) log µ(a 1 )
For any ν ∈ D_M we write ν_t(k | a) : , a), for all t ∈ [0, 1] and (k, a) ∈ N × X. Write ν̇_t := dν_t/dt for the time derivative of the measure ν_t, and associate with each path ν ∈ D_M the relaxed measure on [0, 1] × (N × X)
We call ν ∈ D_M absolutely continuous if for each k ∈ N there exists ν̇(k | a) such that
For each absolutely continuous path ν, we define ν ν (· | a), ν̇(·, · | a) almost everywhere by
By ν ν ≪ ν we mean ν is absolutely continuous. We write
Note that the measure L^X_{[nt]/n}, for t ∈ [0, 1), is deterministic and its distribution is degenerate at some ν_{[nt]/n}, for t ∈ [0, 1), converging to ν_t, t ∈ [0, 1).
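To illustrate what an equipartition statement of this kind says, the following sketch treats only the colours, drawn independently from µ, and ignores the graph structure entirely. It therefore shows the classical AEP for an i.i.d. source, not Theorem 0.2 itself. The colour law used here is hypothetical, and entropy is measured in bits.

```python
import math
import random

def entropy_bits(mu):
    """Shannon entropy of the colour law mu, in bits."""
    return -sum(p * math.log2(p) for p in mu.values() if p > 0)

def empirical_rate_bits(sample, mu):
    """-(1/n) log2 of the probability of an i.i.d. colour sample under mu."""
    return -sum(math.log2(mu[a]) for a in sample) / len(sample)

# Hypothetical colour law on a three-letter alphabet.
mu = {"r": 0.5, "g": 0.25, "b": 0.25}
rng = random.Random(0)
sample = rng.choices(list(mu), weights=list(mu.values()), k=20000)

h = entropy_bits(mu)                      # entropy of the colour law
rate = empirical_rate_bits(sample, mu)    # close to h for large samples
```

For large samples the empirical rate concentrates around the entropy, so roughly n times the entropy bits suffice to encode the n colours with high probability.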

Exponential Change-of-Measure
Throughout the remaining part of this paper we shall assume ν_t(k | a) ≤ ν ν_t(k | a), for all t ∈ [0, 1]. Let g̃ : N × X → R, and write lim_{n→∞} L_{[nt]/n} := ν_t ∈ D_M. We define the function U_g̃ : [0, 1] × X → R by and note that
We use g̃ to define a new fitness P.A random graph as follows:
• At time m = 1, assign the root m of the network fit X(m) according to the law µ̃ given by µ̃(a_1) = e^{h̃(a_1) − U(h̃)} µ(a_1).
• At any other time m = 2, 3, 4, ..., n, the new node m which appears gets fit X(m) according to the fit law µ̃. It connects to node v_m, independently with probability proportional to
We denote by P̃_{f,n} the law of the new fitness P.A graph and observe that it is absolutely continuous with respect to P_{f,n}, as for a fitness graph X we have that where id is the identity function from [0, 1] to [0, 1]. The following Lemma will be used to establish the upper bound in a variational formulation.
Proof. Let 1 ≥ δ > 0 and l ∈ N. We choose k(l, δ) ∈ N large enough such that, for large n, we have for all a ∈ X and for all t. Now, given θ, we choose M > θ + δ + log 2 and define the set Γ_{δ,θ} := {ν : ν(N > k(l, δ)) < l^{-1}, l ≥ M}. As {N ≤ k(l, δ)} is pre-compact, Γ_{δ,θ} is compact in the weak topology by Prokhorov's criterion. Moreover Now, letting K_θ be the closure of ∩_{1≥δ>0} Γ_{δ,θ} and taking the limit as n approaches ∞, we have (6), which ends the proof of the Lemma.
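The tilting of the colour law in the change of measure of this section, µ̃(a_1) = e^{h̃(a_1) − U(h̃)} µ(a_1), can be checked numerically. The sketch below assumes that U(h̃) is the normalising constant log Σ_a e^{h̃(a)} µ(a), which makes µ̃ a probability law; the definition of U is not reproduced in this text, so this is an assumption, and the function names are hypothetical.

```python
import math

def log_normaliser(h, mu):
    """Assumed U(h) = log sum_a exp(h(a)) * mu(a)."""
    return math.log(sum(math.exp(h[a]) * mu[a] for a in mu))

def tilt_colour_law(h, mu):
    """Exponentially tilted law: mu_tilde(a) = exp(h(a) - U(h)) * mu(a)."""
    u = log_normaliser(h, mu)
    return {a: math.exp(h[a] - u) * mu[a] for a in mu}

# Hypothetical colour law and tilting function on a two-letter alphabet.
mu = {"r": 0.5, "b": 0.5}
h = {"r": 1.0, "b": 0.0}
mu_tilde = tilt_colour_law(h, mu)  # puts more mass on "r" than mu does
```

Because the density of µ̃ with respect to µ is e^{h̃ − U(h̃)}, which is strictly positive, the two laws are mutually absolutely continuous on the finite alphabet whenever µ charges every colour.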

Proof of Theorem 0.1
We derive the upper bound in a variational formulation. To do this, we denote by C_1 the space of all functions on X and by C_2 the space of all bounded continuous functions on N × X*.
We define on the space of probability measures M(N × X) the function K̂ given by
Proof. We let h̃ ∈ C_1, g̃ ∈ C_2 and use Jensen's inequality to obtain This yields the inequality Now, because the function g̃ is bounded, we can find an open neighbourhood B_ω of ω such that Take δ = ε, apply Chebyshev's inequality to (10) and use (9) to get
lim sup_{n→∞} (1/n) log P_{f,n}{M_X ∈ B_ω | L_{[nt]/n} = ν_{[nt]/n}, ∀t ∈ (0, 1]}
Using Lemma 0.3 with θ = ε^{-1}, we may choose the compact set G_ε such that
lim sup_{n→∞} (1/n) log P_{f,n}{M_X ∉ G_ε | L_{[nt]/n} = ν_{[nt]/n}, ∀t ∈ (0, 1]} ≤ −ε^{-1}.
We show that the function K̂_ν(ω) in Lemma 0.4 may be replaced by the good rate function
Lemma 0.5. For every ν ∈ D_M we have that K̂_ν(ω) ≥ K_ν(ω). Moreover, the function K_ν is a good rate function and is lower semi-continuous on M(N × X).
Proof. Suppose ν_1 = ω. Then, using Jensen's inequality, our assumption (1) and the variational characterization of entropy, we have Recall the definition of K_ν above and notice that the mapping ω → K_ν(ω) is a continuous function. Moreover, for all α < ∞, the level sets {K_ν ≤ α} are contained in the bounded set and are therefore compact. Consequently, K_ν is a good rate function.

Proof of Theorem 0.1 By Mixing
To use the technique of mixing LDP results developed in (Biggins, 2004), we check the main criteria needed for the validity of (Biggins, 2004, Theorem 5(a)) in the following Lemma. We write Θ_n := D_{M_n(N×X)}, Θ := D_{M(N×X)}, and define
P_{f,n}(ν_1) := P{M_X = ν_1 | L^X_{[nt]/n}(·, a) = ν_{[nt]/n}(·, a), t ∈ [0, 1) and a ∈ X}.
Then the joint distribution of M_X and L^X is obtained by the mixture of P_{f,n} and P_n as follows: dP_{f,n}(ν, ν_1) := dP_n(ν) dP_{f,n}(ν_1).
Lemma 0.6. The families of distributions (i) (P_{f,n}, n ∈ N) and (ii) (P_{f,n}, n ∈ N) are exponentially tight.
Proof. (i) As this family of distributions obeys a large deviation upper bound with a good rate function K_ν(ω), the family (P_{f,n}, n ∈ N) is exponentially tight. See, e.g., (Dembo and Zeitouni, 1998, Exercise 4.1.10(c)).
(ii) By (i), for every θ_2 we can find K_{θ_2}, a compact subset of D_{M(N×X)}, such that lim sup Also, by Lemma 0.3, for every θ_1 we can find K_{θ_1}, a compact subset of M(N × X), such that lim sup_{n→∞} (1/n) log P_{f,n}(K^c_{θ_1}) ≤ −θ_1.
Take θ = min(θ_1, θ_2) and define the relatively compact set Γ_θ by Now, let δ > 0 and notice that, for sufficiently large n, we have that Taking the limit n → ∞ followed by δ ↓ 0 in the above inequality yields lim sup_{n→∞} (1/n) log P_{f,n}(Γ^c_θ) ≤ −θ, which proves the second part of the Lemma.
Now, as J(ν_1) is lower semi-continuous by the continuity of the relative entropies, and by Lemma 0.6 the families of distributions (i) (P_{f,n}, n ∈ N) and (ii) (P_{f,n}, n ∈ N) are exponentially tight, we have that the latter obeys a large deviation upper bound with good rate function given by J(ν_1). See (Biggins, 2004, Theorem 5(a) and proof).
We obtain the form of the rate function in Theorem 0.1 by noting that
Notice that, by Theorem 0.1, we have lim sup We end the proof of the Lemma by showing that the left hand side of (12) is negative. For this purpose we suppose that there exists a sequence ω_n such that J(ω_n) ↓ 0. Then, because J is a good rate function and all its level sets are compact, and by lower semi-continuity of the mapping ω → J(ω), there is a limit ω ∈ F with J(ω) = 0. Then we have H(ω_{2,1} ‖ µ) = 0 and Σ_{a∈X} ω_2(a) H(ω(· | a) ‖ c f(·, a) ⊗ ω̄(· | a)) = 0.
This implies ω_{2,1}(a_1) = µ(a_1) and ω(k | a) = π_f(k | a), which contradicts ω ∈ F.
We begin by recalling the distribution of the typed graph X as follows and note that, by Lemma 0.7, as n approaches infinity, which completes the proof of the AEP.