Search Results: 1 - 10 of 1571 matches for " Johanna Hardin "
All listed articles are free for downloading (OA Articles)
Network Analysis with the Enron Email Corpus
Johanna Hardin,Ghassan Sarkis,P. C. Urc
Computer Science, 2014
Abstract: We use the Enron email corpus to study relationships in a network by applying six different measures of centrality. Our results came out of an in-semester undergraduate research seminar. The Enron corpus is well suited to statistical analyses at all levels of undergraduate education. Through this note's focus on centrality, students can explore the dependence of statistical models on initial assumptions and the interplay between centrality measures and hierarchical ranking, and they can use completed studies as springboards for future research. The Enron corpus also presents opportunities for research into many other areas of analysis, including social networks, clustering, and natural language processing.
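To illustrate how different centrality measures can rank the same people differently, here is a minimal sketch on a made-up five-person email network. The abstract does not name its six measures; degree and closeness centrality are used here only as two common examples, and the names `EMAILS`, `build_graph`, `degree_centrality`, and `closeness_centrality` are hypothetical.

```python
from collections import deque

# Hypothetical toy network (not Enron data): an undirected edge means
# that two people exchanged at least one email.
EMAILS = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"),
          ("bob", "cat"), ("dan", "eve")]


def build_graph(edges):
    """Return an adjacency map {person: set of correspondents}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other people each person is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable count divided by total shortest-path distance.

    Assumes a connected graph; distances come from breadth-first search.
    """
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


graph = build_graph(EMAILS)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
for person in sorted(graph):
    print(person, round(degree[person], 3), round(closeness[person], 3))
```

In this toy network "bob" and "dan" tie on degree, yet "dan" scores higher on closeness because he is the only link to "eve". That is one small instance of rankings depending on the chosen measure.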
A method for generating realistic correlation matrices
Johanna Hardin,Stephan Ramon Garcia,David Golan
Statistics, 2011, DOI: 10.1214/13-AOAS638
Abstract: Simulating sample correlation matrices is important in many areas of statistics. Approaches such as generating Gaussian data and finding their sample correlation matrix or generating random uniform $[-1,1]$ deviates as pairwise correlations both have drawbacks. We develop an algorithm for adding noise, in a highly controlled manner, to general correlation matrices. In many instances, our method yields results which are superior to those obtained by simply simulating Gaussian data. Moreover, we demonstrate how our general algorithm can be tailored to a number of different correlation models. Using our results with a few different applications, we show that simulating correlation matrices can help assess statistical methodology.
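The abstract does not say which drawbacks it has in mind. One well-known problem with drawing pairwise correlations independently from uniform $[-1,1]$ is that the resulting matrix need not be positive semidefinite, so it may not be a correlation matrix at all. The sketch below checks this for the 3x3 case, where a symmetric unit-diagonal matrix with entries in $[-1,1]$ is valid exactly when its determinant is nonnegative. It does not implement the authors' noise-adding algorithm, and all function names are hypothetical.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def corr_matrix_3(r12, r13, r23):
    """Symmetric 3x3 matrix with unit diagonal and the given off-diagonals."""
    return [[1.0, r12, r13],
            [r12, 1.0, r23],
            [r13, r23, 1.0]]


def is_valid_corr_3(m, tol=1e-12):
    """True if the 3x3 candidate is positive semidefinite.

    With a unit diagonal and |r| <= 1, every 1x1 and 2x2 principal minor
    is already nonnegative, so only the determinant needs checking.
    """
    return det3(m) >= -tol


def fraction_valid(n_draws, seed=0):
    """Share of uniform[-1, 1] draws that give a valid correlation matrix."""
    rng = random.Random(seed)
    valid = 0
    for _ in range(n_draws):
        candidate = corr_matrix_3(*(rng.uniform(-1, 1) for _ in range(3)))
        valid += is_valid_corr_3(candidate)
    return valid / n_draws


# Two variables both strongly correlated with a third cannot be
# strongly anti-correlated with each other.
print(is_valid_corr_3(corr_matrix_3(0.9, 0.9, -0.9)))
print(fraction_valid(10_000))
```

The determinant test is specific to three variables; a general check would need the eigenvalues or a Cholesky factorization.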
A robust measure of correlation between two genes on a microarray
Johanna Hardin, Aya Mitani, Leanne Hicks, Brian VanKoten
BMC Bioinformatics, 2007, DOI: 10.1186/1471-2105-8-220
Abstract: We propose a resistant similarity metric based on Tukey's biweight estimate of multivariate scale and location. The resistant metric is simply the correlation obtained from a resistant covariance matrix of scale. We give results which demonstrate that our correlation metric is much more resistant than the Pearson correlation while being more efficient than other nonparametric measures of correlation (e.g., Spearman correlation). Additionally, our method gives a systematic gene flagging procedure which is useful when dealing with large amounts of noisy data.

When dealing with microarray data, which are known to be quite noisy, robust methods should be used. Specifically, robust distances, including the biweight correlation, should be used in clustering and gene network analysis.

One of the primary goals of experiments involving DNA microarrays is to find genes which are somehow similar across various experimental conditions. "Similar" is usually taken to mean co-expressed, but it can be measured in several different ways. The distance (usually one minus similarity) measure most commonly used is Pearson correlation, though Euclidean distance, cosine-angle metric, Spearman rank correlation, and jackknife correlation are also used frequently. (Note that correlation and cosine-angle metrics do not fulfill the triangle inequality, so they are not true distance metrics. However, they are used to measure distance in many applications.) For example, [1-4] use Pearson correlation in their gene network analysis; [5-13] use Pearson correlation (or a modification) to cluster gene expression data.

Once the similarity or distance measure is chosen, the relationship between the genes is given by some sort of clustering algorithm (e.g., k-means, hierarchical clustering, k nearest neighbors) or gene network analysis. Clustering results can be highly dependent on the choice of similarity measure (particularly when comparing genes whose similarities are based on tens of samples instead of
Data Science in Statistics Curricula: Preparing Students to "Think with Data"
Johanna Hardin,Roger Hoerl,Nicholas J. Horton,Deborah Nolan
Statistics, 2014
Abstract: A growing number of students are completing undergraduate degrees in statistics and entering the workforce as data analysts. In these positions, they are expected to understand how to utilize databases and other data warehouses, scrape data from Internet sources, program solutions to complex problems in multiple languages, and think algorithmically as well as statistically. These data science topics have not traditionally been a major component of undergraduate programs in statistics. Consequently, a curricular shift is needed to address additional learning outcomes. The goal of this paper is to motivate the importance of data science proficiency and to provide examples and resources for instructors to implement data science in their own statistics curricula. We provide case studies from seven institutions. These varied approaches to teaching data science demonstrate curricular innovations to address new needs. Also included here are examples of assignments designed for courses that foster engagement of undergraduates with data and data science.
Resistant Sparse Multiple Canonical Correlation
Jacob Coleman,Joseph Replogle,Gabriel Chandler,Johanna Hardin
Statistics, 2014
Abstract: Canonical Correlation Analysis (CCA) is a multivariate technique that takes two datasets and forms the most highly correlated possible pairs of linear combinations between them. Each subsequent pair of linear combinations is orthogonal to the preceding pair, meaning that new information is gleaned from each pair. By looking at the magnitude of coefficient values, we can find out which variables can be grouped together, thus better understanding multiple interactions that are otherwise difficult to compute or grasp intuitively. CCA appears to have quite powerful applications to high-throughput data, as we can use it to discover, for example, relationships between gene expression and gene copy number variation. One of the biggest problems of CCA is that the number of variables (often upwards of 10,000) makes biological interpretation of linear combinations nearly impossible. To limit variable output, we have employed a method known as Sparse Canonical Correlation Analysis (SCCA), while adding estimation which is resistant to extreme observations or other types of deviant data. In this paper, we have demonstrated the success of resistant estimation in variable selection using SCCA. Additionally, we have used SCCA to find multiple canonical pairs for extended knowledge about the datasets at hand. Again, using resistant estimators provided more accurate estimates than standard estimators in the multiple canonical correlation setting.
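For readers unfamiliar with CCA, the textbook (non-sparse, non-resistant) formulation behind the first two sentences of the abstract can be written as follows. The notation is an assumption of this note, not taken from the paper: $X$ and $Y$ are the two datasets, $\Sigma_{XX}$, $\Sigma_{YY}$, $\Sigma_{XY}$ are their covariance and cross-covariance matrices, and $a$, $b$ are coefficient vectors.

```latex
% First canonical pair: the most highly correlated linear combinations.
\[
(a_1, b_1) \;=\; \arg\max_{a,\,b}\;
\frac{a^\top \Sigma_{XY}\, b}
     {\sqrt{a^\top \Sigma_{XX}\, a}\;\sqrt{b^\top \Sigma_{YY}\, b}}
\]
% k-th pair: same objective, restricted to combinations that are
% uncorrelated with all earlier canonical variates.
\[
(a_k, b_k) \;=\; \arg\max_{a,\,b}\;
\frac{a^\top \Sigma_{XY}\, b}
     {\sqrt{a^\top \Sigma_{XX}\, a}\;\sqrt{b^\top \Sigma_{YY}\, b}}
\quad \text{subject to} \quad
a^\top \Sigma_{XX}\, a_j = 0,\;\;
b^\top \Sigma_{YY}\, b_j = 0
\quad (j < k)
\]
```

Sparse variants typically add a penalty or constraint on $a$ and $b$ that drives many coefficients to zero. The specific penalty and the resistant estimators used by the authors are not given in the abstract and are not shown here.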
Differential expression analysis for multiple conditions
Ciaran Evans,Johanna Hardin,Mark Huber,Daniel Stoebel,Garrett Wong
Statistics, 2014
Abstract: As high-throughput sequencing has become common practice, the cost of sequencing large amounts of genetic data has been drastically reduced, leading to much larger data sets for analysis. One important task is to identify biological conditions that lead to unusually high or low expression of a particular gene. Packages such as DESeq implement a simple method for testing differential signal when exactly two biological conditions are possible. For more than two conditions, pairwise testing is typically used. Here the DESeq method is extended so that three or more biological conditions can be assessed simultaneously. Because the computation time grows exponentially in the number of conditions, a Monte Carlo approach provides a fast way to approximate the $p$-values for the new test. The approach is studied on both simulated data and a data set of C. jejuni, the bacterium responsible for most food poisoning in the United States.
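The general idea of approximating a $p$-value by Monte Carlo, rather than enumerating every outcome, can be sketched as below. This is a generic illustration and not the authors' extension of DESeq: the test statistic (the gap between the largest and smallest group mean across three conditions), the Gaussian null model, and all function names are assumptions made for the example.

```python
import random


def monte_carlo_p_value(observed, simulate_null, n_sims, rng):
    """Estimate P(statistic >= observed) under the null by simulation.

    The +1 in numerator and denominator counts the observed value as one
    of the draws, so the estimate is never exactly zero.
    """
    exceed = sum(simulate_null(rng) >= observed for _ in range(n_sims))
    return (exceed + 1) / (n_sims + 1)


def max_mean_gap(groups):
    """Hypothetical statistic: largest minus smallest group mean."""
    means = [sum(g) / len(g) for g in groups]
    return max(means) - min(means)


def simulate_null(rng, n_groups=3, n_per_group=5):
    """Draw the statistic when all conditions share one distribution."""
    groups = [[rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
              for _ in range(n_groups)]
    return max_mean_gap(groups)


rng = random.Random(42)
observed_groups = [[0.1, -0.3, 0.2, 0.0, 0.4],
                   [1.9, 2.2, 1.7, 2.4, 2.0],
                   [0.3, -0.1, 0.2, 0.5, 0.1]]
observed = max_mean_gap(observed_groups)
print(monte_carlo_p_value(observed, simulate_null, 2000, rng))
```

The cost is linear in the number of simulations, whatever the number of conditions, which is the kind of trade-off the abstract alludes to when exact computation grows exponentially.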
Back to basics
Susan Hardin
Genome Biology, 2002, DOI: 10.1186/gb-2002-3-8-reports4026
Abstract: This year's Association of Biomolecular Resource Facilities (ABRF) meeting, entitled "Biomolecular Technologies: Tools for Discovery in Proteomics and Genomics", emphasized the protein and DNA technologies that inspired the formation of the ABRF. Meeting abstracts and some presentation slides or posters are available through the ABRF website http://www.abrf.org. Some presentations are also submitted for publication in the ABRF journal, Journal of Biomolecular Techniques.

The plenary sessions emphasized the importance of technology development on scientific discovery, which is especially true for genomics and proteomics. Richard Wilson (Washington University School of Medicine, St. Louis, USA) summarized the development of techniques for physical mapping of the genome and discussed the importance of automating procedures for generating genome sequence information. He commented that the human genome sequence will be finished to coincide with the 50th anniversary of the discovery of the structure of DNA by Watson and Crick, in April 2003. He described his lab's collaboration with the lab of Eric Green (National Human Genome Research Institute, National Institutes of Health, Bethesda, USA) to analyze human chromosome 7, focusing on the Pendrin gene and the effect of its mutation on ear development. The gene is associated with 5-10% of cases of human hereditary deafness and also with enlargement of the thyroid (goiter) and encodes an anion transporter that, when mutated, is believed to damage (rupture) delicate ear structures. Pendrin knockout mice are deaf and a large portion of the progeny have an unusual phenotype of running in circles.

Wilson also described his work on some large, highly repetitive (and therefore challenging) sequences on the human Y chromosome that may have biological significance for male fertility and sperm production.

Raymond Deshaies (Howard Hughes Medical Institute and California Institute of Technology, Pasadena, USA) described the use o
Macromolecular technologies: applications and improvements
Susan Hardin
Genome Biology, 2001, DOI: 10.1186/gb-2001-2-5-reports4012
Abstract: Approximately 1,000 scientists arrived in San Diego for the annual ABRF meeting, which was entitled "The new biology: technologies for resolving macromolecular communications". An online version of the meeting abstracts will be available through the ABRF journal, the Journal of Biomolecular Techniques [http://www.abrf.org/JBT/JBTindex.html], and more details can be found at the ABRF website [http://www.abrf.org]. This meeting was, as in previous years, a great place to learn about both new technologies and recently developed modifications that improve existing research methods. A regular meeting highlight is the recognition of an outstanding contributor to technology development. This year, Csaba Horvath (Yale University, New Haven, USA) was recognized for his contributions to the evolution of modern chromatography.

The plenary talks provided an appropriate backdrop to illustrate how basic science drives the discovery and development of the many research methods and technologies that were discussed in detail during the smaller concurrent sessions. Ronald Evans (Salk Institute, La Jolla, USA) presented the intricacies of nuclear hormone receptor action and illustrated potential effects that can result from drug-drug interactions. He described an interesting adaptation mechanism that enables the body to increase resistance to an introduced chemical, and outlined how this 'xenobiotic response' facilitates detoxification and clearance of the chemical from the body. This response can be triggered by substances present in non-prescription compounds (such as St John's Wort) and, once activated, removes a variety of substances from the body.

Examples of drugs that can be eliminated from the body by the xenobiotic response include the active ingredient in birth control pills (thus providing a scientific explanation for many 'miracle' babies), and protease inhibitors, which are used to treat HIV.

Roger Brent (Molecular Sciences Institute, Berkeley, USA) discusse
La tragedia de los comunes [The Tragedy of the Commons]
Garrett Hardin
Polis: Revista de la Universidad Bolivariana, 2005
Abstract:
Teach them to Fly: Strategies for Encouraging Active Online Learning
Karen HARDIN
The Turkish Online Journal of Distance Education, 2004
Abstract: Karen HARDIN, Cameron University, Lawton, OK, USA.

PROBLEM

One of the hot topics in education in the past 10 years has been the shift of the role of the educator. Whereas he has traditionally been the owner and deliverer of the knowledge ("sage on the stage"), his role is now shifting to that of a guide and facilitator ("guide by the side"). The purpose is to give the students ownership in their own learning process. As technology becomes more sophisticated, automation is replacing students' problem-solving skills, critical thinking and sometimes patience. On one of my evaluations in a 1999 online course, a student criticized that "she's not doing the teaching, I'm doing the learning." Of course, in my desire to encourage active learning, I took the response as a compliment, but the student meant it as a criticism. I began pondering the reluctance of students to take control of the learning process.

I've noticed this lack of problem solving, critical thinking and patience with young adults in the workplace. For example, I often visit Sam's, a warehouse store owned by Wal-Mart. When I check out, I pay with a check. The computerized register will print the check for me, so I allow the cashier to do that. I often ask him or her to add $15 to the total to give me cash back. It's amazing how long it takes these young adults to add $15 to the total because of their reliance on computers.

In another situation, when I was in an outlet shoe store in Texas, I purchased a pair of sandals. After I checked out, I noticed a sign that promoted "buy one, get a second for one cent." Of course, I wanted to take advantage of this opportunity, so I told the cashier that I wanted to find another pair of shoes. She replied, "It's too late, your transaction is complete. I wouldn't know what to do." I said, "It's simple, I owe you one cent." She said, "I don't know how to make the computer fix it."

After attempting many explanations of how she could return the shoes I bought, and re-process, I walked out of the store with one pair of shoes. The student lacked problem-solving skills because of her reliance on the computer. Online courses can tend to make students more dependent on the computer for problem solving.

STUDENTS' COMMENTS

After several semesters of working with students in the online environment, I posted this discussion: All of you have taken courses and done well (for me it was History) and then 2-3 years later, you can't remember what you learned. That was probably because the learning process for that course w
Copyright © 2008-2017 Open Access Library. All rights reserved.