Abstract:
Given n observations, we study the consistency of a batch of k new observations in terms of their distribution function. We propose a non-parametric, non-likelihood test based on an Edgeworth expansion of the distribution function. The key point is to approximate the distribution of the n+k observations by the distribution of n-k among the n observations. The Edgeworth expansion gives the correcting term and the rate of convergence. We also study the discrete distribution case, for which Cramér's smoothness condition is not satisfied. The rates of convergence for the various cases are compared.
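To illustrate the kind of correction an Edgeworth expansion provides, the sketch below compares the one-term Edgeworth approximation of the CDF of a standardized mean against the plain normal approximation, for exponential data (a generic textbook illustration under hypothetical parameters, not the test proposed in the abstract):

```python
import numpy as np
from math import erf, exp, sqrt, pi

rng = np.random.default_rng(5)

n, reps, x = 10, 200_000, 0.5
# Standardized means of Exp(1) samples (mean 1, sd 1, skewness 2)
s = (rng.exponential(1.0, (reps, n)).mean(axis=1) - 1.0) * sqrt(n)
emp = (s <= x).mean()                    # Monte-Carlo estimate of the CDF at x

Phi = 0.5 * (1 + erf(x / sqrt(2)))       # normal approximation
phi = exp(-x * x / 2) / sqrt(2 * pi)
skew = 2.0
# One-term Edgeworth expansion: Phi(x) - phi(x) * gamma * (x^2 - 1) / (6 sqrt(n))
edgeworth = Phi - phi * skew * (x * x - 1) / (6 * sqrt(n))
```

With n = 10 the skewness correction is of order 1/sqrt(n) and visibly improves on the normal approximation.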

Abstract:
In the framework of estimating procedures without any distributional assumption under the alternative hypothesis, a new and efficient procedure for estimating the lFDR is described. The results of a simulation study indicated good performance for the proposed estimator in comparison with four published ones. The five different procedures were applied to real datasets. A novel and efficient procedure for estimating the lFDR was thus developed and evaluated. The use of current high-density microarrays for genomic association studies leads to the simultaneous evaluation of a huge number of statistical hypotheses. Thus, one of the main problems faced by the investigator is the selection of genes (or gene products) worthy of further analysis, taking multiple testing into account. Although the oldest extension of the classical type I error rate is the family-wise error rate (FWER), defined as the probability of falsely rejecting at least one null hypothesis (e.g., the lack of relationship between gene-expression changes and a phenotype), FWER-based procedures are often too conservative, particularly when numerous hypotheses are tested [1]. As an alternative and less stringent error criterion, Benjamini and Hochberg introduced, in their seminal paper [2], the False Discovery Rate (FDR), defined as the expected proportion of false discoveries among all discoveries. Here, a discovery refers to a rejected null hypothesis. Assuming that the test statistics are independent and identically distributed under the null hypothesis, Storey [3] demonstrated that, for a fixed rejection region Γ, considered to be the same for every test, the FDR is asymptotically equal to the posterior probability Pr(H = 0 | T ∈ Γ), where H is the random variable such that H = 0 if the null hypothesis, noted H0, is true and H = 1 if the alternative hypothesis, noted H1, is true, and T is the test statistic considered for all tested hypotheses. However, one drawback is that the FDR criterion associated w
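Storey's identity FDR(Γ) ≈ Pr(H = 0 | T ∈ Γ) can be checked numerically with a minimal mixture-model simulation. This is an illustrative sketch only, not the estimation procedure of the abstract; the Beta alternative and the value of π0 are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-group mixture: H = 0 with probability pi0 (p-values Uniform(0,1)),
# H = 1 otherwise (p-values Beta(0.1, 1), stochastically small).
pi0, m = 0.8, 100_000
h = rng.random(m) < (1 - pi0)                  # True for true alternatives
p = np.where(h, rng.beta(0.1, 1.0, m), rng.random(m))

t = 0.05                                       # fixed rejection region Gamma = [0, t]
rejected = p <= t

# Storey: FDR(Gamma) ~ Pr(H=0 | p <= t) = pi0 * Pr(p <= t | H0) / Pr(p <= t)
fdr_est = pi0 * t / rejected.mean()

# Empirical false discovery proportion, for comparison
fdp = (rejected & ~h).sum() / rejected.sum()
```

With 100,000 simulated tests the posterior-probability estimate and the realized false discovery proportion agree closely.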

Abstract:
The problem of target localization with noise is addressed. The target is a sample from a continuous random variable with known distribution, and the goal is to locate it with minimum mean squared error distortion. The localization scheme, or policy, proceeds by queries, or questions, asking whether or not the target belongs to some subset, as in the 20-questions framework. These subsets are not constrained to be intervals, and the answers to the queries are noisy. While this situation is well studied for adaptive querying, this paper focuses on non-adaptive querying policies based on dyadic questions. The asymptotic minimum achievable distortion under such policies is derived. Furthermore, a policy, named the Aurelian, is exhibited which asymptotically achieves this distortion.
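The flavor of non-adaptive dyadic querying can be sketched as follows: the queries ask for the successive bits of the target's binary expansion, every answer is flipped with some probability, and each question is repeated with a majority vote. The repetition scheme and parameters here are illustrative assumptions, not the Aurelian policy of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def dyadic_localize(x, n_bits, eps, repeats, rng):
    """Estimate x in [0,1) from noisy answers to the non-adaptive dyadic
    questions 'is the i-th bit of the binary expansion of x equal to 1?'.
    Each answer is flipped with probability eps; each question is asked
    `repeats` times and decided by majority vote (a naive noise-handling
    scheme, for illustration only)."""
    est = 0.0
    for i in range(1, n_bits + 1):
        bit = int(x * 2**i) % 2                   # true i-th binary digit
        truthful = rng.random(repeats) >= eps     # True -> answer not flipped
        votes = np.where(truthful, bit, 1 - bit)
        est += (votes.mean() > 0.5) * 2.0**-i     # majority-voted digit
    return est

x = rng.random()
coarse = dyadic_localize(x, 4, 0.1, 25, rng)      # 4 questions -> error < 2**-4
fine = dyadic_localize(x, 12, 0.1, 25, rng)       # 12 questions -> error < 2**-12
```

More dyadic questions shrink the quantization error geometrically, while the noise is absorbed by repetition; the paper's interest is in the optimal trade-off for a fixed total question budget.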

Abstract:
Modeling relations between individuals is a classical question in the social sciences, and clustering individuals according to the observed patterns of interactions makes it possible to uncover a latent structure in the data. The stochastic block model (SBM) is a popular approach for grouping individuals with respect to their social behavior. When several relationships of various types can occur jointly between individuals, the data are represented by multiplex networks, where more than one edge can exist between the nodes. In this paper, we extend the SBM to multiplex networks in order to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters --such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups-- through a variational Expectation-Maximization procedure. Consistency of the estimates, as well as statistical properties of the model, are obtained. The number of groups is chosen using the Integrated Completed Likelihood (ICL), a penalized likelihood criterion. The multiplex stochastic block model arises in many situations, but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their labs. Our results show strong interactions between these two kinds of connections, and the groups obtained are discussed to emphasize the common features of researchers grouped together.
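A compact mean-field variational EM for a multiplex SBM (all layers sharing the same block memberships, with independent Bernoulli connection probabilities per layer) can be sketched as below. This is a generic illustration of the approach named in the abstract, not the authors' implementation; the simulated two-layer network is hypothetical:

```python
import numpy as np

def multiplex_sbm_vem(A, K, n_iter=100, seed=0):
    """Variational EM for a multiplex SBM. A has shape (L, n, n): one
    symmetric 0/1 adjacency matrix per relationship type, zero diagonal,
    all layers sharing the same latent blocks. Sketch only."""
    L, n, _ = A.shape
    rng = np.random.default_rng(seed)
    tau = rng.dirichlet(np.ones(K), size=n)          # variational assignments
    for _ in range(n_iter):
        # M-step: group proportions and per-layer connection probabilities
        alpha = np.clip(tau.mean(axis=0), 1e-12, 1.0)
        denom = np.outer(tau.sum(0), tau.sum(0)) - tau.T @ tau  # exclude i=j
        pi = np.stack([(tau.T @ A[l] @ tau) / denom for l in range(L)])
        pi = np.clip(pi, 1e-6, 1 - 1e-6)
        # E-step: mean-field update of the assignment probabilities
        logt = np.log(alpha) + sum(
            A[l] @ tau @ np.log(pi[l]).T
            + (1 - np.eye(n) - A[l]) @ tau @ np.log(1 - pi[l]).T
            for l in range(L))
        logt -= logt.max(axis=1, keepdims=True)
        tau = np.exp(logt)
        tau /= tau.sum(axis=1, keepdims=True)
    return tau, alpha, pi

def sym_adj(P, z, rng):
    """Symmetric Bernoulli adjacency matrix with block structure z."""
    n = len(z)
    U = np.triu(rng.random((n, n)) < P[np.ix_(z, z)], 1).astype(float)
    return U + U.T

# Two blocks, two layers: assortative in layer 1, disassortative in layer 2
rng = np.random.default_rng(7)
z = np.repeat([0, 1], 30)
A = np.stack([sym_adj(np.array([[0.8, 0.1], [0.1, 0.7]]), z, rng),
              sym_adj(np.array([[0.1, 0.6], [0.6, 0.1]]), z, rng)])
tau, alpha, pi = multiplex_sbm_vem(A, K=2, seed=0)
labels = tau.argmax(axis=1)
acc = max((labels == z).mean(), (labels != z).mean())  # up to label switching
```

Both layers contribute evidence to the same block structure, which is the point of the multiplex extension over fitting one SBM per layer.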

Abstract:
The "Mister VCM" interface was designed by dividing the screen into two parts: a graphical interactive part including VCM icons and synthesizing drug properties, and a textual part presenting drug monograph excerpts on demand. The interface was evaluated with 11 volunteer general practitioners trained in the use of "Mister VCM". They were asked to answer clinical questions related to fictitious, randomly generated drug monographs, using either a textual interface or "Mister VCM". The correctness of the responses and the response time were recorded. "Mister VCM" is an interactive interface that displays VCM icons organized around an anatomical diagram of the human body with additional mental, etiological and physiological areas. Textual excerpts of the drug monograph can be displayed by clicking on the VCM icons. The interface can explicitly represent information that is implicit in the drug monograph, such as the absence of a given contraindication. Physicians made fewer errors with "Mister VCM" than with text (by a factor of 1.7; p = 0.034) and answered questions 2.2 times faster (p < 0.001). The time gain with "Mister VCM" was greater for long monographs and for questions with implicit answers. "Mister VCM" seems to be a promising interface for accessing drug monographs. Similar interfaces could be developed for other medical domains, such as electronic patient records. When prescribing a drug, the physician needs to ensure that the prescription is safe. However, medication errors are frequent and constitute a public health problem [1]. Serious events reported to the FDA (Food and Drug Administration) increased 4 times faster than the total number of outpatient prescriptions. In the prescription process, the physician must first decide whether he is sufficiently familiar with the various contraindications, drug interactions and cautions for use. If not, he must consult the drug monograph. It may take too long to read this monograph in full if the text is long, and this r

Abstract:
We used a database of logbooks of type 1 diabetic patients who participated in a summer camp. Patients used a guideline to calculate their doses of insulin lispro and glargine four times a day, and registered their injected doses in the database. We implemented the guideline in a computer system to calculate recommended doses. We then compared injected and recommended doses using five indicators that we designed for this purpose: absolute agreement (AA), the two doses are the same; relative agreement (RA), there is a slight difference between them; extreme disagreement (ED), the administered and recommended doses are opposite; and under-treatment (UT) and over-treatment (OT), the injected dose is too low or too high, respectively. We used a weighted linear regression model to study the evolution of these indicators over time. We analyzed 1656 insulin doses injected by 28 patients during a three-week camp. Overall indicator rates were AA = 45%, RA = 30%, ED = 2%, UT = 26% and OT = 30%. The highest rate of absolute agreement was obtained for insulin glargine (AA = 70%). One patient with alarming behavior (AA = 29%, RA = 24% and ED = 8%) was detected. The monitoring of these indicators over time revealed an increasing adherence rate that fitted a weighted linear model well (slope = 0.85, significance = 0.002). This shows an improvement in the quality of patients' therapeutic decision-making during the camp. Our method allowed the measurement of patients' adherence to their insulin adjustment guidelines. The indicators that we introduced were capable of providing quantitative data on the quality of patients' decision-making for the studied population as a whole, for each individual patient, for all injections, and for each time of injection separately. They can be implemented in monitoring systems to detect non-adherent patients. In some diseases such as diabetes, patients manage their own therapy. Insulin dependent diabetic patients are often advised to
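The five indicators and the weighted trend analysis can be sketched as follows. The tolerance defining "relative agreement" and the signed-adjustment reading of "extreme disagreement" are hypothetical choices, since the abstract does not give the exact definitions; the daily rates below are made-up data:

```python
import numpy as np

def indicator_rates(injected, recommended, tol=1.0):
    """Rates of the five agreement indicators for paired doses.
    Indicators are not mutually exclusive (as in the abstract,
    where the reported rates sum to more than 100%)."""
    inj = np.asarray(injected, float)
    rec = np.asarray(recommended, float)
    d = inj - rec
    return {
        "AA": np.mean(d == 0),                         # doses identical
        "RA": np.mean((d != 0) & (np.abs(d) <= tol)),  # slight difference
        "ED": np.mean(inj * rec < 0),   # signed adjustments in opposite directions
        "UT": np.mean(d < 0),                          # injected dose too low
        "OT": np.mean(d > 0),                          # injected dose too high
    }

rates = indicator_rates([10, 11, 8, 14], [10, 10, 10, 10])

# Weighted linear trend of a daily agreement rate (weights = dose counts),
# mirroring the weighted linear regression used in the abstract.
days = np.arange(1, 6)
daily_aa = np.array([0.35, 0.42, 0.44, 0.50, 0.58])
counts = np.array([80, 75, 90, 85, 70])
slope = np.polyfit(days, daily_aa, 1, w=counts)[0]
```

A positive fitted slope corresponds to the improving adherence reported over the three weeks of the camp.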

Abstract:
The VCM graphical language was designed using a small number of graphical primitives and combinatory rules. VCM was evaluated with 11 volunteer general practitioners to assess whether the language is easy to learn, to understand and to use. Evaluators were asked to record their VCM training time, to indicate the meaning of VCM icons and sentences, and to answer clinical questions related to randomly generated drug monograph-like documents, supplied in text or VCM format. VCM can represent the various signs, diseases, physiological states, life habits, drugs and tests described in drug monographs. Grammatical rules make it possible to generate many icons by combining a small number of primitives and reusing simple icons to build more complex ones. Icons can be organized into simple sentences to express drug recommendations. Evaluation showed that VCM was learnt in 2 to 7 hours, that physicians understood 89% of the tested VCM icons, and that they answered 94% of questions correctly using VCM (versus 88% using text, p = 0.003), and did so 1.8 times faster (p < 0.001). VCM can be learnt in a few hours and appears to be easy to read. It can now be used in a second step: the design of graphical interfaces facilitating access to drug monographs. It could also be used for broader applications, including the design of interfaces for consulting other types of medical documents or medical data, or, very simply, to enrich medical texts. The drug prescription process is complex and error-prone. Many medication errors result from incorrect prescriptions, leading to frequent injury, a public health problem [1-3]. The installation of computerized physician order entry (CPOE) systems within hospitals may help to decrease the frequency of such errors [4-6], although an increase in errors was recently reported after the installation of such a system [7]. For maximal efficiency, CPOEs must be linked to decision-making systems using patient data encoded in the electronic patient record [8]. This con

Abstract:
The influence of climate on biodiversity is an important ecological question. Various theories attempt to link climate change to allelic richness and thereby to predict the impact of global warming on genetic diversity. We model the relationship between genetic diversity in the European beech forests and curves of temperature and precipitation reconstructed from pollen databases. Our model links the genetic measure to the climate curves through a linear functional regression. The interaction between the climate variables is assumed to be bilinear. Since the data are georeferenced, our methodology accounts for the spatial dependence among the observations. The practical issues of these extensions are discussed.
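A linear functional regression with a bilinear interaction can be sketched by projecting each curve onto a small basis and regressing on the basis scores plus all pairwise products of scores. Everything here is a hypothetical stand-in (polynomial basis, simulated curves, no spatial dependence), showing only the design-matrix construction:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, k = 200, 50, 4                     # samples, grid points, basis size
t = np.linspace(0, 1, m)
B = np.stack([t**j for j in range(k)], axis=1)  # polynomial basis (sketch)

# Hypothetical temperature/precipitation curves observed on the grid
T = rng.normal(size=(n, m)).cumsum(axis=1) / np.sqrt(m)
P = rng.normal(size=(n, m)).cumsum(axis=1) / np.sqrt(m)
u, v = T @ B / m, P @ B / m              # basis scores approximating integrals

# Simulated response: linear functional terms plus one bilinear interaction
y = u[:, 1] - 2 * v[:, 2] + 3 * u[:, 0] * v[:, 0] + rng.normal(0, 0.01, n)

# Design: linear terms in each curve plus all bilinear interaction scores
X = np.hstack([u, v, (u[:, :, None] * v[:, None, :]).reshape(n, -1)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
r2 = 1 - np.sum((y - X @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
```

The interaction block corresponds to the coefficients of a bilinear surface gamma(s, t) expanded on the tensor product of the basis with itself; the paper's spatial-dependence correction would replace the ordinary least squares step.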

Abstract:
We propose a two-step normalization procedure for triple-target experiments. First, the dye bleeding is evaluated and corrected if necessary. Then the signal in each channel is normalized using a generalized lowess procedure to correct a global dye bias. The normalization procedure is validated using triple-self experiments and by comparing the results of triple-target and two-color experiments. Although the focus is on triple-target microarrays, the proposed method can be used to normalize p differently labelled targets co-hybridized on the same array, for any value of p greater than 2. The proposed normalization procedure is effective: the technical biases are reduced, the number of false positives is under control in the analysis of differentially expressed genes, and the triple-target experiments are more powerful than the corresponding two-color experiments. There is room for improving microarray experiments by simultaneously hybridizing more than two samples. DNA microarray technology is a high-throughput technique by which the expression of the whole genome is studied in a single experiment. In dual-label experiments the fluorescent dyes Cy3 and Cy5 are used to label the two RNA samples co-hybridized on the same array. Recently, two more dyes have been proposed (Alexa 488 and Alexa 594), allowing the simultaneous hybridization of three or four samples. Forster et al. [2] have evaluated triple-target microarrays by comparing the results of single-target, dual-target and triple-target microarrays. They concluded that the use of triple-target microarrays is valid from an experimental point of view. One year later, Staal et al. [7] investigated four-target microarray experiments. Their approach differs from that of [2], but their conclusions are in fair agreement. Their study showed that Alexa 594 is best suited as a third dye and that Alexa 488 can be applied as a fourth dye on some microarray types. These extensions of the microarray technology are promis
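The idea of intensity-dependent normalization of p co-hybridized channels can be sketched as follows: each channel's deviation from the per-spot mean log-intensity is smoothed against intensity and subtracted. A binned running median stands in for the generalized lowess smoother of the abstract; the three-channel simulation and its bias curves are hypothetical:

```python
import numpy as np

def normalize_channels(log_intensity, n_bins=20):
    """Intensity-dependent normalization of p co-hybridized channels
    (rows = spots, columns = channels). A simplified sketch, not the
    authors' two-step procedure (dye bleeding is not handled here)."""
    X = np.asarray(log_intensity, float)
    A = X.mean(axis=1)                       # average log-intensity per spot
    M = X - A[:, None]                       # per-channel deviation (dye bias)
    edges = np.quantile(A, np.linspace(0, 1, n_bins + 1))
    b = np.clip(np.searchsorted(edges, A, side="right") - 1, 0, n_bins - 1)
    out = np.empty_like(M)
    for c in range(X.shape[1]):
        med = np.array([np.median(M[b == k, c]) if np.any(b == k) else 0.0
                        for k in range(n_bins)])
        out[:, c] = M[:, c] - med[b]         # remove intensity-dependent bias
    return out + A[:, None]                  # back to the log-intensity scale

# Hypothetical triple-target array: three channels with different
# intensity-dependent dye biases around a common true signal.
rng = np.random.default_rng(6)
a = rng.normal(8, 1, 2000)                   # true spot log-intensities
bias = np.stack([0.3 * a, -0.2 * a, 0.1 * (a - 8)], axis=1)
X = a[:, None] + bias + rng.normal(0, 0.05, (2000, 3))
Y = normalize_channels(X)
before = np.abs(X - X.mean(axis=1, keepdims=True)).mean()
after = np.abs(Y - Y.mean(axis=1, keepdims=True)).mean()
```

Nothing in the function depends on the number of channels, which mirrors the abstract's claim that the method extends to any p greater than 2.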

Abstract:
Despite the considerable success of genome-wide association (GWA) studies in identifying causal variants for many human diseases, their success in unraveling the genetic basis of complex diseases has been more limited. Pathogen population structure may impact upon the infectious phenotype, especially with the intense short-term selective pressure that drug treatment exerts on pathogens. Rigorous analysis that accounts for repeated measures and disentangles the influence of genetic and environmental factors must be performed. Attempts should be made to consider whether pathogen diversity will impact upon host genetic responses to infection. We analyzed the heritability of two Plasmodium falciparum phenotypes, the number of clinical malaria episodes (PFA) and the proportion of these episodes positive for gametocytes (Pfgam), in a family-based cohort followed for 19 years, during which time there were four successive drug treatment regimes, with documented appearance of drug resistance. Repeated-measures and variance-components analyses were performed with fixed environmental, additive genetic, intra-individual and maternal effects for each drug period. Whilst there was a significant additive genetic effect underlying PFA during the first drug period of the study, this was lost in subsequent periods. There was no additive genetic effect for Pfgam. The intra-individual effect increased significantly in the chloroquine period. The loss of an additive genetic effect following novel drug treatment may result in a significant loss of power to detect genes in a GWA study. Prior genetic analysis must be a prerequisite for more detailed GWA studies. The temporal changes in the individual genetic and intra-individual estimates are consistent with those expected if there were specific host-parasite interactions.
The complex basis of the human response to malaria parasite infection likely includes dominance/epistatic genetic effects encompassed within the intra-individual variance component. Evaluating their role in influencing the outcome of infection through host genotype by parasite genotype interactions warrants research effort.