Predicting DNA methylation status using word composition
Lingyi Lu, Kao Lin, Ziliang Qian, Haipeng Li, Yudong Cai, Yixue Li
Journal of Biomedical Science and Engineering (JBiSE) , 2010, DOI: 10.4236/jbise.2010.37091
Abstract: Background: DNA methylation influences gene expression patterns and alters genetic function. Computational analysis of the methylation status of nucleotides can help to explore the underlying reasons for methylation. Results: We present a DNA sequence-based method, named the word composition encoding method, to analyze the methylation status of CpG dinucleotides using 5 bp (5-mer) DNA fragments. The prediction accuracy is 75.16% when all 5 bp word compositions are used (4^5 = 1024 in total). Furthermore, the 5 bp DNA fragments (words) with the most impact on methylation status are identified by the mRMR (Maximum-Relevance-Minimum-Redundancy) feature selection method. As a result, 58 words are selected and used to build a compact predictor, which achieves 77.45% prediction accuracy. When the word composition encoding method and the feature selection strategy are coupled, the meaning of these words can be analyzed through their contribution to the prediction. Biological evidence in the literature supports the view that the DNA sequence surrounding a CpG dinucleotide affects its methylation. Conclusions: The main contribution of this paper is to identify and analyze the key DNA words, taken from the neighborhood of the CpG dinucleotides, that induce DNA methylation.
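A minimal sketch of a word composition encoding may help make the 4^5 = 1024 feature count concrete. It assumes the encoding is the normalized frequency of every overlapping 5-mer over the alphabet A, C, G, T; the abstract does not give the exact windowing or normalization, so the function name `word_composition` and these choices are illustrative only.

```python
from itertools import product


def word_composition(seq, k=5):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Assumption: overlapping windows, alphabet ACGT; windows containing
    any other symbol (e.g. N) are skipped.
    """
    words = ["".join(p) for p in product("ACGT", repeat=k)]  # 4**k words
    counts = dict.fromkeys(words, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
            total += 1
    # Guard against sequences shorter than k or with no valid window.
    return [counts[w] / total if total else 0.0 for w in words]


features = word_composition("ACGTACGTAC")
print(len(features))  # one feature per possible 5-mer
```

A vector like this could then be passed to any classifier or feature selection routine; which ones the authors used beyond mRMR is not stated in the abstract.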
An efficient method for statistical significance calculation of transcription factor binding sites
Ziliang Qian, Lingyi Lu, Liu Qi, Yixue Li
Bioinformation , 2007,
Abstract: Various statistical models have been developed to describe the DNA binding preference of transcription factors, by which putative transcription factor binding sites (TFBS) can be identified according to the scores assigned. The statistical significance of these scores, usually expressed as a p-value, plays a critical role in identification. We developed an efficient algorithm that calculates the statistical significance precisely, reducing the time complexity from exponential to linear, and extended its application to a wide range of models, from the commonly used position weight matrix models to the more complicated Bayesian network models. Further, we calculated p-values for all transcription factor DNA binding sites recorded in the JASPAR database and, based on these, investigated some previously unexamined properties of p-values as a whole, such as the p-value distribution of different models and the variation in p-values under different scoring schemes. We hope that our algorithm and the results of the computational experiments will offer an improved solution for assessing the statistical significance of transcription factor binding sites. The software implementing our method can be downloaded from http://pcal.biosino.org/pCal.html.
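To illustrate how an exact p-value can be obtained without enumerating all 4^L sequences, here is a sketch of the standard dynamic-programming approach for a position weight matrix with integer scores. This is not the pCal implementation; the function names, the integer-score assumption, and the independent-background assumption are all illustrative.

```python
from collections import defaultdict


def score_distribution(pwm, background):
    """Exact distribution of total PWM scores under an i.i.d. background.

    pwm: list of dicts mapping base -> integer score, one dict per position.
    background: dict mapping base -> probability.
    The work grows linearly with motif length (times the number of
    distinct partial scores), rather than with the 4**L sequences.
    """
    dist = {0: 1.0}
    for column in pwm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base, weight in column.items():
                nxt[score + weight] += prob * background[base]
        dist = dict(nxt)
    return dist


def p_value(pwm, background, threshold):
    """Probability that a random site scores at least `threshold`."""
    dist = score_distribution(pwm, background)
    return sum(p for s, p in dist.items() if s >= threshold)


# Toy two-position matrix (hypothetical values, for illustration only).
toy_pwm = [
    {"A": 2, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 0},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(p_value(toy_pwm, uniform, 3))  # only "AC" reaches the maximum score
```

Real-valued log-odds scores would need to be discretized first for this sketch to apply; how the published method handles that, or Bayesian network models, is not described in the abstract.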
A Comparative Study to Understanding about Poetics Based on Natural Language Processing
Lingyi Zhang, Junhui Gao
Open Journal of Modern Linguistics (OJML) , 2017, DOI: 10.4236/ojml.2017.75017
Abstract: This paper examines the differences and similarities among five poets (Thomas Hardy, Wilde, Browning, Yeats, and Tagore) by analyzing their nineteenth-century works with natural language understanding technology and a word vector model. First, we collect a sufficient number of poems from the five poets, build a corpus for each, and compute their high-frequency words using natural language processing methods. Then, based on the word vector model, we calculate the word vectors of each poet's high-frequency words and combine the word vectors of each poet into one vector. Finally, we analyze the similarity between the combined word vectors using hierarchical clustering. The results show that the poems of Hardy, Browning, and Wilde are similar; the poems of Tagore and Yeats are closer to each other than to the rest, but the gap between the two is still relatively large. In addition, we evaluate the stability of our approach by varying the word vector dimension, and we analyze the clustering results from a literary (poetic) perspective. Yeats and Tagore shared a kind of mystical poetics, while Hardy, Browning, and Wilde combine elements of realism with tragedy and comedy. The literary analysis agrees with the results obtained from the word vector model.
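The step of combining word vectors and comparing poets can be sketched as follows. The abstract does not say how the vectors are combined or which distance is used, so averaging and cosine similarity are assumptions here, and the function names and the tiny vectors in the usage example are hypothetical placeholders rather than data from the paper.

```python
import math


def mean_vector(vectors):
    """Combine several word vectors into one by element-wise averaging."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional vectors for two authors' frequent words.
author_a = mean_vector([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
author_b = mean_vector([[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
print(cosine(author_a, author_b))
```

A pairwise similarity matrix built this way could then be fed to a hierarchical clustering routine, which is the step the abstract names.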
Research on the Interaction between Producer Services and Manufacturing Industry in Shaanxi Province
Lingyi Kong, Xiao Liang
American Journal of Industrial and Business Management (AJIBM) , 2018, DOI: 10.4236/ajibm.2018.85087
Abstract: With the deepening of the social division of labor, producer services are gradually separating from the manufacturing industry and playing an increasingly important role in the national economy. In particular, with the rapid development of scientific research, business, law, finance and other industries, the industrial linkage between producer services and manufacturing is becoming closer. The state of interaction between producer services and manufacturing has a direct impact on the industrial upgrading and structural adjustment of all sectors of the national economy, and it has become an important path for the future economic development of our country. Taking Shaanxi Province as an example, this paper first analyzes the development of producer services and manufacturing in Shaanxi. On this basis, a VAR model is built to analyze the added value of the two industries in Shaanxi Province. Finally, based on the empirical results, we put forward relevant countermeasures and suggestions.
Genomic characterization of ribitol teichoic acid synthesis in Staphylococcus aureus: genes, genomic organization and gene duplication
Ziliang Qian, Yanbin Yin, Yong Zhang, Lingyi Lu, Yixue Li, Ying Jiang
BMC Genomics , 2006, DOI: 10.1186/1471-2164-7-74
Abstract: We identified all S. aureus tar and tag gene orthologs in the selected S. aureus strains that would contribute to teichoic acid synthesis. Based on our identification of genes orthologous to tarI, tarJ, and tarL, which are specific to the tar pathway in B. subtilis W23, we also concluded that tar is the major teichoic acid biogenesis pathway in S. aureus. Further analyses indicated that the S. aureus tar genes, unlike the divergon organization in B. subtilis, are organized into several clusters in cis. Most interestingly, compared with genes in the B. subtilis tar pathway, the S. aureus tar-specific genes (tarI, tarJ, tarL) are duplicated in all six S. aureus genomes. In the S. aureus strains we analyzed, tar (teichoic acid ribitol) is the main teichoic acid biogenesis pathway. The tar genes are organized into several genomic groups in cis, and the genes specific to tar (relative to tag), tarI, tarJ, and tarL, are duplicated. The genomic organization of the S. aureus tar pathway suggests that its regulation differs from that of the B. subtilis tar or tag pathways, which are grouped in two operons in a divergon structure. Staphylococcus aureus (S. aureus) is a Gram-positive bacterium that causes a variety of suppurative infections and toxinoses in humans. The death rate associated with S. aureus infection is still high even with antimicrobial drug treatment, owing to the development of antibiotic resistance in methicillin-resistant Staphylococcus aureus (MRSA) strains. Current developments in antimicrobial therapeutics show little efficacy in treating S. aureus, and this bacterium remains a major human health threat. S. aureus, and in particular its cell wall, remains a major target of glycopeptide antibiotics and a focus of bacteriology research. Teichoic acids, polymers of alternating phosphate and alditol groups, are, in addition to peptidoglycan, an essential component of bacterial cell walls. Teichoic acid biosynthesis in S. aureus has not been well characterized. B. subtilis and S. aur
Current progress and prospects of induced pluripotent stem cells
LingYi Chen, Lin Liu
Science China Life Sciences , 2009, DOI: 10.1007/s11427-009-0092-6
Abstract: Induced pluripotent stem (iPS) cells are derived from somatic cells by ectopic expression of a few transcription factors. Like embryonic stem (ES) cells, iPS cells are able to self-renew indefinitely and to differentiate into all types of cells in the body. iPS cells hold great promise for regenerative medicine because they circumvent not only immunological rejection but also ethical issues. Since the first report on the derivation of iPS cells in 2006, many laboratories around the world have started research on iPS cells and have made significant progress. This paper reviews recent progress in iPS cell research, including the methods used to generate iPS cells, the molecular mechanism of reprogramming in the formation of iPS cells, and the potential applications of iPS cells in cell replacement therapy. Current problems that need to be addressed and the prospects for iPS research are also discussed.
Economic Determinants of Happiness
Teng Guo, Lingyi Hu
Statistics , 2011,
Abstract: Many scholars have recently begun to dispute the assumed link between individual wellbeing and economic conditions and the extent to which the latter matters (Easterlin, 1995; Stevenson and Wolfers 2008; Tella and MacCulloch 2008). This dilemma is empirically demonstrated in the Latin America Public Opinion Project (LAPOP, 2011), which surveyed North and Latin America in terms of perceived life satisfaction. The higher measures found in the less developed countries of Brazil, Costa Rica, and Panama than in North America pose an intriguing quandary for traditional economic theory. In light of this, the paper aims to construct a sensible measure of the national happiness level for the United States on a year-by-year basis, and to regress this measure against indicators of the national economy to provide insight into the puzzling relationship between national happiness and economic forces.
Current progress and prospects of induced pluripotent stem cells
Chen LingYi, Liu Lin
Science in China Series C (English Edition) , 2009,
Chromosome 7p linkage and association study for diabetes related traits and type 2 diabetes in an African-American population enriched for nephropathy
Tennille S Leak, Carl D Langefeld, Keith L Keene, Carla J Gallagher, Lingyi Lu, Josyf C Mychaleckyj, Stephen S Rich, Barry I Freedman, Donald W Bowden, Michèle M Sale
BMC Medical Genetics , 2010, DOI: 10.1186/1471-2350-11-22
Abstract: We fine mapped this region by genotyping 11 additional polymorphic markers in the same ASP and investigated a total of 68 single nucleotide polymorphisms (SNPs) in functional candidate genes (GCK1, IL6, IGFBP1 and IGFBP3) for association with age of T2D diagnosis, age of ESRD diagnosis, duration of T2D to onset of ESRD, and body mass index (BMI) in African American cases, and with T2D-ESRD in an African American case-control cohort. OSA of fine mapping markers supported linkage at 28 cM on 7p (near D7S3051) in early-onset T2D families (max. LOD = 3.61, P = 0.002). SNPs in candidate genes and 70 ancestry-informative markers (AIMs) were evaluated in 577 African American T2D-ESRD cases and 596 African American controls. The most significant association was observed between ESRD age of diagnosis and SNP rs730497, located in intron 1 of the GCK1 gene (recessive T2D age-adjusted P = 0.0006). Nominal associations were observed between GCK1 SNPs and T2D age of diagnosis (BMI-adjusted P = 0.014 to 0.032). Also, one IGFBP1 and four IGFBP3 SNPs showed nominal genotypic association with T2D-ESRD (P = 0.002-0.049). After correcting for multiple tests, only rs730497 remained significant. Variant rs730497 in the GCK1 gene appears to play a role in early ESRD onset in African Americans. A genome wide linkage scan was performed on 638 African American affected sibling pairs (ASPs) with type 2 diabetes (T2D) from 247 families; 166 families contained at least one ASP concordant for diabetic end-stage renal disease (T2D-ESRD) [1]. Ordered subset analysis (OSA) revealed a linkage peak on chromosome 7p in the subset of T2D families with an early age of diagnosis (29% of pedigrees, max. LOD = 3.85, P = 0.003 for the change in LOD score) [1]. T2D-ESRD subsets with lower body mass index (BMI) (64% of pedigrees, max. LOD = 3.93, P = 0.010) and longer duration from T2D diagnosis to ESRD onset (37% of pedigrees, max. LOD = 3.59, P = 0.010) also showed evidence for linkage at this locus [2]. Fine mapping of th
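A 10 Gb/s receiver with half rate period calibration CDR and CTLE/DFE combiner
Gao Zhuo, Yang Zongren, Zhao Ying, Yang Yi, Zhang Lu, Huang Lingyi, Hu Weiwu
Journal of Semiconductors , 2009,
Abstract: This paper presents a low-power 10 Gb/s receiver for wireline electrical interconnects in a 65 nm CMOS process. The receiver occupies an area of 300 μm × 500 μm. By integrating a new dual-edge-triggered CDR circuit with periodic phase calibration, the receiver consumes 52 mW. Using a low-power, wideband, programmable combined CTLE and DFE equalizer, the receiver can compensate for channel loss over a wide range.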
