Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments
Brian J Haas, Steven L Salzberg, Wei Zhu, Mihaela Pertea, Jonathan E Allen, Joshua Orvis, Owen White, C Robin Buell, Jennifer R Wortman
Genome Biology, 2008, DOI: 10.1186/gb-2008-9-1-r7
Abstract: Accurate and comprehensive gene discovery in eukaryotic genome sequences requires multiple independent and complementary analysis methods including, at the very least, the application of ab initio gene prediction software and sequence alignment tools. The problem is technically challenging, and despite many years of research no single method has yet been able to solve it, although numerous tools have been developed to target specialized and diverse variations on the gene finding problem (for review, see [1,2]). Conventional gene finding software employs probabilistic techniques such as hidden Markov models (HMMs). These models are employed to find the most likely partitioning of a nucleotide sequence into introns, exons, and intergenic states according to a prior set of probabilities for the states in the model. Such gene finding programs, including GENSCAN [3], GlimmerHMM [4], Fgenesh [5], and GeneMark.hmm [6], are effective at identifying individual exons and regions that correspond to protein-coding genes, but they are nevertheless far from perfect at predicting complete gene structures, often differing from the correct gene structures in exon content or position [7-10]. The correct gene structures, or individual components including introns and exons, are often apparent from spliced alignments of homologous transcript or protein sequences. Many software tools are available that perform these alignment tasks. Tools used to align expressed sequence tags (ESTs) and full-length cDNAs (FL-cDNAs) to genomic sequence include EST_GENOME [11], AAT [12], sim4 [13], geneseqer [14], BLAT [15], and GMAP [16], among numerous others. Far fewer programs perform spliced alignments of protein sequences to DNA; these include the multifunctional AAT, exonerate [17], and PMAP (derived from GMAP).
An extension of spliced protein alignment that includes a probabilistic model of eukaryotic gene structure is implemented in GeneWise [18], a popular homology-based gene predict
Optimizing Read Mapping to Reference Genomes to Determine Composition and Species Prevalence in Microbial Communities
John Martin, Sean Sykes, Sarah Young, Karthik Kota, Ravi Sanka, Nihar Sheth, Joshua Orvis, Erica Sodergren, Zhengyuan Wang, George M. Weinstock, Makedonka Mitreva
PLOS ONE, 2012, DOI: 10.1371/journal.pone.0036427
Abstract: The Human Microbiome Project (HMP) aims to characterize the microbial communities of 18 body sites from healthy individuals. To accomplish this, the HMP generated two types of shotgun data: reference shotgun sequences isolated from different anatomical sites on the human body and shotgun metagenomic sequences from the microbial communities of each site. The alignment strategy for characterizing these metagenomic communities using available reference sequences is important to the success of HMP data analysis. Six next-generation aligners were used to align a community of known composition against a database comprising reference organisms known to be present in that community. All aligners report nearly complete genome coverage (>97%) for strains with over 6X depth of coverage; however, they differ in speed, memory requirement, and ease-of-use issues such as database size limitations and supported mapping strategies. The selected aligner was tested across a range of parameters to maximize sensitivity while maintaining a low false positive rate. We found that constraining alignment length had more impact on sensitivity than did constraining similarity in all cases tested. However, when reference species were replaced with phylogenetic neighbors, similarity began to play a larger role in detection. We also show that choosing the top hit randomly when multiple, equally strong mappings are available increases overall sensitivity at the expense of taxonomic resolution. The results of this study identified a strategy that was used to map over 3 terabases of microbial sequence against a database of more than 5,000 reference genomes in just over a month.
Modeling the Geography of Migratory Pathways and Stopover Habitats for Neotropical Migratory Birds
Roger Tankersley, Jr., Kenneth Orvis
Ecology and Society, 2003
Abstract: Intact migratory routes are critical for the stability of forest-dwelling, neotropical, migratory bird populations, and mortality along migratory pathways may be significant. Yet we know almost nothing about the geography of available stopovers or the possible migratory pathways that connect optimal stopovers. We undertake a spatial analysis of stopover habitat availability and then model potential migratory pathways between optimal stopovers in the eastern United States. Using models of fixed orientation and fixed nightly flight distance between stopovers during spring migration, we explore whether a simple endogenous migratory program is sufficient to ensure successful migration across the modern landscape. Our model runs suggest that the modern distribution of optimum stopovers in the eastern United States can be adequately exploited by birds following migratory pathways defined by fixed-orientation and fixed-distance nightly flights. Longer flight distances may increase the chances of success by enabling migrants to bypass locales offering little habitat. Our results also suggest that most southwest–northeast migratory pathways through the Appalachian mountains are intact. Lack of optimal habitat at key locations in the Southeast causes many modeled pathways to fail. We present a speculative view of regional migration patterns implied by predominant ideas found in stopover ecology literature, and demonstrate the need for broad-scale migration research, in the hope that our approach will foster other continental- and regional-scale projects.
Horizontal gene transfer in Histophilus somni and its role in the evolution of pathogenic strain 2336, as determined by comparative genomic analyses
Shivakumara Siddaramappa, Jean F Challacombe, Alison J Duncan, Allison F Gillaspy, Matthew Carson, Jenny Gipson, Joshua Orvis, Jeremy Zaitshik, Gentry Barnes, David Bruce, Olga Chertkov, J Chris Detter, Cliff S Han, Roxanne Tapia, Linda S Thompson, David W Dyer, Thomas J Inzana
BMC Genomics, 2011, DOI: 10.1186/1471-2164-12-570
Abstract: The chromosome of strain 2336 (2,263,857 bp) contained 1,980 protein coding genes, whereas the chromosome of strain 129Pt (2,007,700 bp) contained only 1,792 protein coding genes. Although the chromosomes of the two strains differ in size, their average GC content, gene density (total number of genes predicted on the chromosome), and percentage of sequence (number of genes) that encodes proteins were similar. The chromosomes of these strains also contained a number of discrete prophage regions and genomic islands. One of the genomic islands in strain 2336 contained genes putatively involved in copper, zinc, and tetracycline resistance. Using the genome sequence data and comparative analyses with other members of the Pasteurellaceae, several H. somni genes that may encode proteins involved in virulence (e.g., filamentous haemagglutinins, adhesins, and polysaccharide biosynthesis/modification enzymes) were identified. The two strains contained a total of 17 ORFs that encode putative glycosyltransferases, and some of these ORFs had characteristic simple sequence repeats within them. Most of the genes/loci common to both strains were located in different regions of the two chromosomes and occurred in opposite orientations, indicating genome rearrangement since their divergence from a common ancestor. Since the genome of strain 129Pt was ~256,000 bp smaller than that of strain 2336, these genomes provide yet another paradigm for studying evolutionary gene loss and/or gain in regard to virulence repertoire and pathogenic ability. Analyses of the complete genome sequences revealed that bacteriophage- and transposon-mediated horizontal gene transfer had occurred at several loci in the chromosomes of strains 2336 and 129Pt. It appears that these mobile genetic elements have played a major role in creating genomic diversity and phenotypic variability between the two H. somni strains. Histophilus somni is a commensal or opportunistic pathogen of the reproductive and respiratory
Stromal-to-Epithelial Transition during Postpartum Endometrial Regeneration
Cheng-Chiu Huang, Grant D. Orvis, Ying Wang, Richard R. Behringer
PLOS ONE, 2012, DOI: 10.1371/journal.pone.0044285
Abstract: The endometrium is the inner lining of the uterus, composed of epithelial and stromal tissue compartments enclosed by the two smooth muscle layers of the myometrium. In women, much of the endometrium is shed and regenerated each month during the menstrual cycle. Endometrial regeneration also occurs after parturition. The cellular mechanisms that regulate endometrial regeneration are still poorly understood. Using genetic fate-mapping in the mouse, we found that the epithelial compartment of the endometrium maintains its epithelial identity during the estrous cycle and postpartum regeneration. However, whereas the stromal compartment maintains its identity during homeostatic cycling, after parturition a subset of stromal cells differentiates into epithelium that is subsequently maintained. These findings identify potential progenitor cells within the endometrial stromal compartment that produce long-term epithelial tissue during postpartum endometrial regeneration.
Genomic Islands in the Pathogenic Filamentous Fungus Aspergillus fumigatus
Natalie D. Fedorova, Nora Khaldi, Vinita S. Joardar, Rama Maiti, Paolo Amedeo, Michael J. Anderson, Jonathan Crabtree, Joana C. Silva, Jonathan H. Badger, Ahmed Albarraq, Sam Angiuoli, Howard Bussey, Paul Bowyer, Peter J. Cotty, Paul S. Dyer, Amy Egan, Kevin Galens, Claire M. Fraser-Liggett, Brian J. Haas, Jason M. Inman, Richard Kent, Sebastien Lemieux, Iran Malavazi, Joshua Orvis, Terry Roemer, Catherine M. Ronning, Jaideep P. Sundaram, Granger Sutton, Geoff Turner, J. Craig Venter, Owen R. White, Brett R. Whitty, Phil Youngman, Kenneth H. Wolfe, Gustavo H. Goldman, Jennifer R. Wortman, Bo Jiang, David W. Denning, William C. Nierman
PLOS Genetics, 2008, DOI: 10.1371/journal.pgen.1000046
Abstract: We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome. While the core genes are 99.8% identical at the nucleotide level, identity for variable genes can be as low as 40%. The most divergent loci appear to contain heterokaryon incompatibility (het) genes associated with fungal programmed cell death such as developmental regulator rosA. Cross-species comparison has revealed that 8.5%, 13.5% and 12.6%, respectively, of A. fumigatus, N. fischeri and A. clavatus genes are species-specific. These genes are significantly smaller in size than core genes, contain fewer exons and exhibit a subtelomeric bias. Most of them cluster together in 13 chromosomal islands, which are enriched for pseudogenes, transposons and other repetitive elements. At least 20% of A. fumigatus-specific genes appear to be functional and involved in carbohydrate and chitin catabolism, transport, detoxification, secondary metabolism and other functions that may facilitate the adaptation to heterogeneous environments such as soil or a mammalian host. Contrary to what was suggested previously, their origin cannot be attributed to horizontal gene transfer (HGT), but instead is likely to involve duplication, diversification and differential gene loss (DDL). The role of duplication in the origin of lineage-specific genes is further underlined by the discovery of genomic islands that seem to function as designated “gene dumps” and, perhaps, simultaneously, as “gene factories”.
Do Ghanaians Prefer Imported Textiles to Locally Manufactured Ones?
Peter Quartey, Joshua Abor
Modern Economy (ME), 2011, DOI: 10.4236/me.2011.21009
Abstract: This paper ascertains whether consumers prefer locally made textiles to imported ones or vice versa, and what accounts for the choice. The study uses survey data from industry, traders and consumers to explain the issue. The results show that most consumers prefer locally made textiles to imported ones. More than half of those who prefer locally made textiles claimed local textile products are of a better quality. Others claimed they are more affordable and attractive, while a few claimed local textiles are cheaper. This appears to contradict the country-of-origin effect and the results of previous studies in Africa and other developing countries. Implications for traders, governments and local manufacturers are also discussed. The study provides insights with respect to Ghanaians’ preference for locally produced textiles over foreign-made ones.
Providing Sustainable Supports for Street Children in Nigeria: Stakeholders Challenges and the Policy Options Available
Joshua Oyeniyi Aransiola
Advances in Applied Sociology (AASoci), 2013, DOI: 10.4236/aasoci.2013.33023

This article examines the limitations of all stakeholders in providing support for street children in Nigeria, whose number continues to increase, with a view to identifying possible policy options in light of the stakeholders' inability to adequately support the children. Qualitative research techniques were employed to collect primary data from NGOs, community members and government agencies saddled with the responsibility of caring for the children. It was found that the stakeholders are incapable of addressing the problems of street children due to inadequate skills, a lack of necessary facilities and stakeholders working in parallel, among other factors. The article emphasizes the need for collaboration among stakeholders to benefit from synergy, and for capacity development for all stakeholders, in order to make meaningful progress and improve the situation of street children in the country.

A Reconfigurable Network-on-Chip Datapath for Application Specific Computing
Joshua Weber, Erdal Oruklu
Circuits and Systems (CS), 2013, DOI: 10.4236/cs.2013.42025

This paper introduces a new datapath architecture for reconfigurable processors. The proposed datapath is based on a Network-on-Chip (NoC) approach and facilitates tight coupling of all functional units. Reconfigurable functional elements can be dynamically allocated for application-specific optimizations, enabling polymorphic computing. Using a modified network simulator, the performance of several NoC topologies and parameters is investigated with standard benchmark programs, including fine-grain and coarse-grain computations. Simulation results highlight the flexibility and scalability of the proposed polymorphic NoC processor for a wide range of application domains.

Refining Use/Misuse/Mitigation Use Cases for Security Requirements
Joshua J. Pauli
Journal of Software Engineering and Applications (JSEA), 2014, DOI: 10.4236/jsea.2014.78058

We investigate security at the same time as the functional requirements by refining and integrating use, misuse, and mitigation use cases. Security requirements rely on the interactions among normal system execution (use cases), attacks (misuse cases), and necessary security strategies (mitigation use cases), but previous approaches work only at a high level of abstraction. We use refinement to uncover details of each case and the relationships among them before integrating them. We identify and model “includes” and “extends” relationships within each refined case type, and use a condition-driven process that maintains these relationships as refinement continues. We then systematically identify and model “threatens” and “mitigates” relationships to integrate the cases at a detailed level.

