International Journal of Statistics and Probability
http://ccsenet.org/journal/index.php/ijsp
<em>International Journal of Statistics and Probability (IJSP)</em> is an open-access, international, double-blind peer-reviewed journal published by the <a href="/web/">Canadian Center of Science and Education</a>.<br /><br />This journal, published quarterly in both print and <a href="/journal/index.php/ijsp/issue/archive">online versions</a>, keeps readers up-to-date with the latest developments in all areas of statistics and probability.<br /><br />It is journal policy to publish work deemed by peer reviewers to be a coherent and sound addition to scientific knowledge and to put less emphasis on interest levels, provided that the research constitutes a useful contribution to the field.<br /><br />
Canadian Center of Science and Education
en-US
International Journal of Statistics and Probability
1927-7032
<p>Submission of an article implies that the work described has not been published previously (except in the form of an abstract or as part of a published lecture or academic thesis), that it is not under consideration for publication elsewhere, that its publication is approved by all authors and tacitly or explicitly by the responsible authorities where the work was carried out, and that, if accepted, it will not be published elsewhere in the same form, in English or in any other language, without the written consent of the Publisher. The Editors reserve the right to edit or otherwise alter all contributions, but authors will receive proofs for approval before publication.</p><p><br />Copyrights for articles published in CCSE journals are retained by the authors, with first publication rights granted to the journal. The journal/publisher is not responsible for subsequent uses of the work. It is the author's responsibility to bring an infringement action if so desired by the author.</p>
Estimation of a Spearman-Type Multivariate Measure of Local Dependence
http://ccsenet.org/journal/index.php/ijsp/article/view/30940
A multivariate measure of local dependence written in terms of copulas is proposed which, when integrated, coincides with a population version of a multivariate global measure of Spearman's rho. We propose nonparametric estimators of this measure for independent sample data and also for time series data. Some properties of the estimators are derived. Simulations with different copulas and sample sizes were performed to assess the theoretical findings. Empirical applications are given for selected economic indexes of 166 countries and for the returns of the DAX, CAC40 and FTSE indexes.
Sumaia A. Latif, Pedro A. Morettin
2014-03-20 2014-03-20 3
The Distribution of Maximal Prime Gaps in Cramer's Probabilistic Model of Primes
http://ccsenet.org/journal/index.php/ijsp/article/view/35285
In the framework of Cramer's probabilistic model of primes, we explore the exact and asymptotic distributions of maximal prime gaps. We show that the Gumbel extreme value distribution exp(-exp(-x)) is the limit law for maximal gaps between Cramer's random "primes". The result can be derived from a general theorem about intervals between discrete random events occurring with slowly varying probability monotonically decreasing to zero. A straightforward generalization extends the Gumbel limit law to maximal gaps between prime constellations in Cramer's model.
Alexei Kourbatov
2014-03-20 2014-03-20 3
Multivariate Relationships Between Physiologic and Anthropometric Variables: A Data Based Analysis
http://ccsenet.org/journal/index.php/ijsp/article/view/31340
To establish the relationship between two sets of variables measured on the same subject, canonical correlation analysis (CCA) is the most appropriate and popular method. In this study we consider two sets of variables which consist of different types of measurements. One set has three physiologic variables, whereas the other set has eighteen anthropometric variables (mentioned in section 3.1 with abbreviations). The aim of this study is to evaluate the relationship between the two sets and to identify the factors which influence that relationship. The study reveals that the first two canonical correlations are significant and that WT, APC, TVC, CCN, MUAC and WC (anthropometric variables) are the risk factors for SBP and DBP (physiologic variables). Furthermore, considering these risk factors, the General Linear Model (GLM) indicates that CCN and WC are highly significant factors which influence the physiologic set. Thus the combined model (CCA+GLM) provides the most important factors which influence the physiologic variables.
Baidyanath Pal, Babulal Seal
2014-03-20 2014-03-20 3
Random Variables Fundamental in Probability and Sigma-Complete Convergence
http://ccsenet.org/journal/index.php/ijsp/article/view/35417
The aim of this paper is to study some necessary and sufficient conditions for sequences of random variables to be fundamental (Cauchy) in probability. In this way, we are able to deduce some relationships between certain types of convergence and these sequences of random variables, which are characterized by the fact that the limit random variable does not appear in their definition. Finally, we introduce the concept of a Sigma-completely convergent sequence and give a sufficient condition for it.
Salvador Cruz Rambaud, Antonio Luis Rodriguez Lopez-Canizares
2014-03-25 2014-03-25 3
Inferring Transcriptional Regulatory Relationships Among Genes in Breast Cancer: An Application of Bayes' Theorem
http://ccsenet.org/journal/index.php/ijsp/article/view/33357
The introduction of deoxyribonucleic acid (DNA) microarray technologies provides a means of measuring the expression of thousands of genes simultaneously. These technologies have sought to revolutionize biological research by significantly elucidating biological processes. Gene networks may be inferred from such microarray data. In this work, Bayes' theorem is applied to the problem of inferring new transcriptional regulatory relationships among gene products in breast cancer. A compendium of human breast epithelial cell probe-level microarray data from the Gene Expression Omnibus (GEO) repository was subjected to the Robust Multiarray Average (RMA) procedure for normalization and background correction. A subset of the resulting expression matrix, consisting of the expression values of only the relevant probe-set identifiers (IDs) representing the genes of interest in the data, was extracted with LISP code. This subset was supplied to a Bayesian network learning algorithm to unearth new regulatory relationships from the data. Variations in the parameters of the learning algorithms resulted in the prediction of at least 10 new relationships among genes in breast cancer. Among these were the direct regulatory signaling relationship between S-phase kinase associated protein 2 (SKP2) and the Cell division cycle 25A (CDC25A) and that between the cyclin-dependent kinases regulatory subunit 1 (CKS1B / CDC28) and E2F transcription factor 3 (E2F3). The identified causal networks are potentially useful for understanding complex drug actions and dysfunctional signaling in breast cancer.
Emmanuel S. Adabor, George K. Acquaah-Mensah, Francis T. Oduro
2014-03-25 2014-03-25 3
A New Family of Distributions: Libby-Novick Beta
http://ccsenet.org/journal/index.php/ijsp/article/view/28413
We define a family of distributions, named the Libby-Novick beta family of distributions, which includes the classical beta generalized and exponentiated generators. The new family offers much more flexibility for modeling real data than these two generators and the Kumaraswamy family of distributions (Cordeiro & de Castro, 2011). The extended family provides reasonable parametric fits to real data in several areas because the additional shape parameters can control the skewness and kurtosis simultaneously, vary tail weights and provide more entropy for the generated distribution. For any given distribution, we can construct a wider distribution with three additional shape parameters which has much more flexibility than the original one. The family density function is a linear combination of exponentiated densities defined from the same baseline distribution. The proposed family also has tractable mathematical properties including moments, generating function, mean deviations and order statistics. The parameters are estimated by maximum likelihood and the observed information matrix is determined. The importance of the family is illustrated by means of two real data sets, for which we demonstrate that this family can give better fits than those obtained using the McDonald, beta generalized and Kumaraswamy generalized classes of distributions.
Gauss M. Cordeiro, Luis H. de Santana, Edwin M. M. Ortega, Rodrigo R. Pescim
2014-03-25 2014-03-25 3
Estimating Statistical Measures of Pleiotropic and Epistatic Effects in the Genomic Era
http://ccsenet.org/journal/index.php/ijsp/article/view/35702
Recent developments in the technology for sequencing the genomes of various species have had a profound effect on the working paradigms of various fields of genetics. Included among these fields is the classical field of quantitative genetics, a subfield of statistical genetics that is devoted to traits that can be quantified on some continuous scale and are often influenced by alleles at many loci. In recent years, many investigators have conducted genome-wide sweeps and have used a variety of statistical criteria to judge whether identified regions of the human genome have a significant influence on the expression of some quantitative trait, such as measurements on patients with Alzheimer's disease. From the point of view of quantitative genetics, the regions of a genome that have some influence on a quantitative trait may be viewed as loci, and variations among these loci at the DNA level, such as nucleotide substitutions or other markers, may be used as working definitions of alleles and, therefore, can be used to determine whether an individual carries a particular allele at some locus. Given such data, an investigator can identify the genotype of each individual in a study with respect to the loci under consideration, as well as the two alleles present at each locus in a diploid species such as man. This ability to use these working definitions to identify the genotype of each individual in a sample results in a significant change in the working paradigm of the subfield of quantitative genetics called variance and covariance analysis, because effects and components of variance and covariance may be estimated directly, in a sense that will be described in detail in the paper.
Charles J. Mode
2014-04-01 2014-04-01 3
Large Deviations, Basic Information Theorem for Fitness Preferential Attachment Random Networks
http://ccsenet.org/journal/index.php/ijsp/article/view/34289
For fitness preferential attachment random networks, we define the <em>empirical degree and pair measure</em>, which counts the number of vertices of a given degree and the number of edges with given fits, and the sample path <em>empirical degree distribution</em>. For the empirical degree and pair distribution for the fitness preferential attachment random networks, we find a large deviation upper bound. From this result we obtain a weak law of large numbers for the empirical degree and pair distribution, and the basic information theorem or an asymptotic equipartition property for fitness preferential attachment random networks.
K. Doku-Amponsah, F. O. Mettle, T. Ansah-Narh
2014-04-08 2014-04-08 3
Exponential Approximation, Method of Types for Empirical Neighbourhood Distributions of Random Graphs by Random Allocations
http://ccsenet.org/journal/index.php/ijsp/article/view/33755
In this article we find an exponentially good approximation of the empirical neighbourhood distribution of symbolled random graphs conditioned to a given empirical symbol distribution and empirical pair distribution. Using this approximation we shorten or simplify the proof of (Doku-Amponsah & Mörters, 2010, Theorem 2.5), the large deviation principle (LDP) for the empirical neighbourhood distribution of symbolled random graphs. We also show that the LDP for the empirical degree measure of the classical Erdős-Rényi graph is a special case of (Doku-Amponsah & Mörters, 2010, Theorem 2.5). From the LDP for the empirical degree measure, we derive an LDP for the proportion of isolated vertices in the classical Erdős-Rényi graph.
K. Doku-Amponsah
2014-04-21 2014-04-21 3
Testing Inference in Accelerated Failure Time Models
http://ccsenet.org/journal/index.php/ijsp/article/view/35111
We address the issue of performing hypothesis testing in accelerated failure time models for non-censored and censored samples. The performances of the likelihood ratio test and a recently proposed test, the gradient test, are compared through simulation. The gradient test features the same asymptotic properties as the classical large sample tests, namely, the likelihood ratio, Wald and score tests. Additionally, it is as simple to compute as the likelihood ratio test. Unlike the score and Wald tests, the gradient test does not require the computation of the information matrix, neither observed nor expected. Our study suggests that the gradient test is more reliable than the other classical tests when the sample is of small or moderate size.
Francisco M. C. Medeiros, Antônio H. M. da Silva-Júnior, Dione M. Valença, Silvia L. P. Ferrari
2014-04-21 2014-04-21 3
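
As a brief illustration of the comparison described in the abstract above, the four statistics can be written side by side for a simple null hypothesis $H_0: \theta = \theta_0$. The notation here is assumed for illustration and is not taken from the paper: $\ell$ is the log-likelihood, $U = \partial \ell / \partial \theta$ the score vector, $K$ the (observed or expected) information matrix, and $\hat{\theta}$ the unrestricted maximum likelihood estimate.

$$
\begin{aligned}
\text{Likelihood ratio:}\quad & S_{LR} = 2\{\ell(\hat{\theta}) - \ell(\theta_0)\},\\
\text{Wald:}\quad & S_{W} = (\hat{\theta} - \theta_0)^{\top} K(\hat{\theta})\,(\hat{\theta} - \theta_0),\\
\text{Score:}\quad & S_{R} = U(\theta_0)^{\top} K(\theta_0)^{-1}\, U(\theta_0),\\
\text{Gradient:}\quad & S_{T} = U(\theta_0)^{\top} (\hat{\theta} - \theta_0).
\end{aligned}
$$

Under $H_0$ and the usual regularity conditions, each statistic is asymptotically chi-squared with degrees of freedom equal to the dimension of $\theta$. Only $S_{W}$ and $S_{R}$ involve $K$, which is why the gradient statistic, like the likelihood ratio, can be computed without any information matrix.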