Identifying related cancer types based on their incidence among people with multiple cancers
Chris D Bajdik, Zenaida U Abanto, John J Spinelli, Angela Brooks-Wilson, Richard P Gallagher
Emerging Themes in Epidemiology, 2006, DOI: 10.1186/1742-7622-3-17
Abstract: In people with two or more cancer types, the probability that a specific type is diagnosed was determined as the number of diagnoses for that cancer type divided by the total number of cancer diagnoses. If two types of cancer occur independently of one another, then the probability that someone will develop both cancers by chance is the product of the individual probabilities for each type. The expected number of people with both cancers is the number of people at risk multiplied by the separate probabilities for each cancer. We performed the analysis on records of cancer diagnoses in British Columbia, Canada between 1970 and 2004. There were 28,159 people with records of multiple primary cancers between 1970 and 2004, including 1,492 people with between three and seven diagnoses. Among both men and women, the combinations of esophageal cancer with melanoma, and kidney cancer with oral cancer, are observed more than twice as often as expected. Our analysis suggests there are several pairs of primary cancers that might be related by a shared etiological factor. We think that our method is more appropriate than others when multiple diagnoses of primary cancer are unlikely to be the result of therapeutic or diagnostic procedures. There are several reasons that someone might be diagnosed with cancer at more than one anatomic site. First, a new cancer might be caused by the therapy for a previous cancer. The risk of breast cancer is significantly increased among women who were treated for Hodgkin Disease with radiation [1]. Second, cancer might occur at multiple sites because a factor is associated with cancer at each site. Germline mutations in mismatch repair genes can produce susceptibility to cancers of the colorectum, ovary, stomach, small bowel, upper uroepithelial tract, hepatobiliary tract and brain [2]. Likewise, cigarette smoking affects the risk of several cancer types.
Third, a different cancer type might be diagnosed because of diagnostic or surveillance proced
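As a rough illustration of the calculation described in this abstract, the Python sketch below computes observed and expected counts for each pair of cancer types. The function name, the toy records and the decision to count each person once per pair are assumptions made for illustration; they are not taken from the paper, and the authors' actual implementation may differ.

```python
from collections import Counter
from itertools import combinations


def expected_pairs(diagnoses_by_person):
    """Compare observed and expected counts for each pair of cancer types.

    diagnoses_by_person: list of sets, one set of cancer types per person
    with two or more primary cancers (hypothetical input format).
    Returns {(type_a, type_b): (observed, expected, observed / expected)}.
    """
    n_people = len(diagnoses_by_person)
    total_diagnoses = sum(len(types) for types in diagnoses_by_person)

    # Probability of each type: its diagnoses divided by all diagnoses.
    type_counts = Counter(t for types in diagnoses_by_person for t in types)
    prob = {t: c / total_diagnoses for t, c in type_counts.items()}

    # Observed number of people carrying each pair of types.
    observed = Counter()
    for types in diagnoses_by_person:
        for pair in combinations(sorted(types), 2):
            observed[pair] += 1

    # Expected count under independence: people at risk times both probabilities.
    results = {}
    for (a, b), obs in observed.items():
        exp = n_people * prob[a] * prob[b]
        results[(a, b)] = (obs, exp, obs / exp)
    return results


# Made-up records, for demonstration only (not registry data).
toy_records = [
    {"esophagus", "melanoma"},
    {"esophagus", "melanoma"},
    {"esophagus", "kidney"},
    {"kidney", "melanoma"},
]
results = expected_pairs(toy_records)
for pair, (obs, exp, ratio) in sorted(results.items()):
    print(pair, obs, round(exp, 4), round(ratio, 2))
```

A ratio well above 1 flags a pair that occurs together more often than independence would predict, which is the kind of signal the abstract reports for esophageal cancer with melanoma.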
CGMIM: Automated text-mining of Online Mendelian Inheritance in Man (OMIM) to identify genetically-associated cancers and candidate genes
Chris D Bajdik, Byron Kuo, Shawn Rusaw, Steven Jones, Angela Brooks-Wilson
BMC Bioinformatics, 2005, DOI: 10.1186/1471-2105-6-78
Abstract: In the OMIM database on September 30, 2004, CGMIM identified 1943 genes related to cancer. BRCA2 (OMIM *164757), BRAF (OMIM *164757) and CDKN2A (OMIM *600160) were each related to 14 types of cancer. There were 45 genes related to cancer of the esophagus, 121 genes related to cancer of the stomach, and 21 genes related to both. Analysis of CGMIM results indicates that fewer than three gene entries in OMIM should mention both, and the more than seven-fold discrepancy suggests cancers of the esophagus and stomach are more genetically related than current literature suggests. CGMIM identifies genetically-related cancers and cancer-related genes. In several ways, cancers with shared genetic etiology are anticipated to lead to further etiologic hypotheses and advances regarding environmental agents. CGMIM results are posted monthly and the source code can be obtained free of charge from the BC Cancer Research Centre website http://www.bccrc.ca/ccr/CGMIM. Cancers are complex diseases with multiple genetic and environmental factors contributing to their development. The most prominent success stories in cancer genetics to date have involved genes that produce a recognizable pattern of disease within certain rare families. Most cancers, however, are sporadic and appear in people who do not have a clear family history of the disease. These cancers are currently being studied in epidemiological investigations that examine genetics, environmental exposures or both. The studies often compare "cases" or affected individuals to "controls" or unaffected individuals, to determine which group has a higher frequency of a particular gene variant or a greater level of exposure to an environmental agent. The studies require logical hypotheses regarding the genes to be tested and clear criteria for case definition. Cases may be defined as people who have any of several types of cancer, if those types are related.
For example, epidemiologic studies of BRCA1 mutation carriers might be
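The "fewer than three" and "more than seven-fold" figures in this abstract can be reproduced from the counts it reports, if one assumes that mentions of the two cancers are independent among the cancer-related genes. The short Python sketch below shows that arithmetic; the independence assumption and the variable names are illustrative and may not match how CGMIM computes its expected values.

```python
# Counts reported in the abstract (OMIM snapshot of September 30, 2004).
total_cancer_genes = 1943
esophagus_genes = 45
stomach_genes = 121
observed_both = 21

# Expected overlap if mentions of the two cancers were independent
# (an assumption made here for illustration).
expected_both = (
    total_cancer_genes
    * (esophagus_genes / total_cancer_genes)
    * (stomach_genes / total_cancer_genes)
)
fold_excess = observed_both / expected_both

print(f"expected genes mentioning both: {expected_both:.2f}")  # 2.80
print(f"observed / expected: {fold_excess:.2f}")               # 7.49
```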
Estimates of array and pool-construction variance for planning efficient DNA-pooling genome wide association studies
Madalene A Earp, Maziar Rahmani, Kevin Chew, Angela Brooks-Wilson
BMC Medical Genomics, 2011, DOI: 10.1186/1755-8794-4-81
Abstract: By examining the variation in allele frequency estimation on SNP arrays between and within DNA pools we determine how array variance [var(e_array)] and pool-construction variance [var(e_construction)] contribute to the total variance of allele frequency estimation. This information is useful in deciding whether replicate arrays or replicate pools are most useful in reducing variance. Our analysis is based on 27 DNA pools ranging in size from 74 to 446 individual samples, genotyped on a collective total of 128 Illumina beadarrays: 24 1M-Single, 32 1M-Duo, and 72 660-Quad. For all three Illumina SNP array types our estimates of var(e_array) were similar, between 3-4 × 10^-4 for normalized data. Var(e_construction) accounted for between 20-40% of pooling variance across 27 pools in normalized data. We conclude that relative to var(e_array), var(e_construction) is of less importance in reducing the variance in allele frequency estimation from DNA pools; however, our data suggest that on average it may be more important than previously thought. We have prepared a simple online tool, PoolingPlanner (available at http://www.kchew.ca/PoolingPlanner/), which calculates the effective sample size (ESS) of a DNA pool given a range of replicate array values. ESS can be used in a power calculator to perform pool-adjusted calculations. This allows one to quickly calculate the loss of power associated with a pooling experiment to make an informed decision on whether a pool-based GWAS is worth pursuing. Genome-wide association studies (GWAS) have been used to examine over 200 diseases and traits, and identified over 4000 single nucleotide polymorphisms (SNPs) associated with these traits, as listed in the Catalog of Published Genome-Wide Association Studies [1]. In many cases, GWAS have revealed previously unsuspected molecular mechanisms of disease, highlighting the value of this hypothesis-free approach [reviewed in [2,3]].
Unfortunately, GWAS are very costly due to the price of ge
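To make the trade-off between replicate arrays and replicate pools concrete, the Python sketch below uses a common additive error model in which averaging over k replicate arrays divides only the array component of the variance by k. This model, the function name and the specific input values are assumptions for illustration; the abstract does not state the formula used by the authors or by PoolingPlanner.

```python
def pooling_variance(var_construction, var_array, n_arrays):
    """Pooling error variance of an allele-frequency estimate from one pool.

    Assumed model (not taken from the paper): averaging over n_arrays
    replicate arrays shrinks the array error, while the pool-construction
    error is shared by all arrays run on the same pool.
    """
    if n_arrays < 1:
        raise ValueError("n_arrays must be at least 1")
    return var_construction + var_array / n_arrays


# Illustrative values: var_array within the 3-4 x 10^-4 range quoted above;
# var_construction chosen so that it is 30% of the single-array total.
var_array = 3.5e-4
var_construction = 1.5e-4

for k in (1, 2, 4, 8):
    total = pooling_variance(var_construction, var_array, k)
    share = var_construction / total
    print(f"{k} array(s): variance = {total:.3e}, construction share = {share:.0%}")
```

Under this model, adding arrays gives diminishing returns once the construction term dominates, which is one reason an estimate of both components is useful when planning a pooled study.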
A Survey of Genomic Properties for the Detection of Regulatory Polymorphisms
Stephen B Montgomery, Obi L Griffith, Johanna M Schuetz, Angela Brooks-Wilson, Steven J. M Jones
PLOS Computational Biology, 2007, DOI: 10.1371/journal.pcbi.0030106
Abstract: Advances in the computational identification of functional noncoding polymorphisms will aid in cataloging novel determinants of health and identifying genetic variants that explain human evolution. To date, however, the development and evaluation of such techniques has been limited by the availability of known regulatory polymorphisms. We have attempted to address this by assembling, from the literature, a computationally tractable set of regulatory polymorphisms within the ORegAnno database (http://www.oreganno.org). We have further used 104 regulatory single-nucleotide polymorphisms from this set and 951 polymorphisms of unknown function, from 2-kb and 152-bp noncoding upstream regions of genes, to investigate the discriminatory potential of 23 properties related to gene regulation and population genetics. Among the most important properties detected in this region are distance to transcription start site, local repetitive content, sequence conservation, minor and derived allele frequencies, and presence of a CpG island. We further used the entire set of properties to evaluate their collective performance in detecting regulatory polymorphisms. Using a 10-fold cross-validation approach, we were able to achieve a sensitivity and specificity of 0.82 and 0.71, respectively, and we show that this performance is strongly influenced by the distance to the transcription start site.
The prognostic effect of ethnicity for gastric and esophageal cancer: the population-based experience in British Columbia, Canada
Morteza Bashash, T Greg Hislop, Amil M Shah, Nhu Le, Angela Brooks-Wilson, Chris D Bajdik
BMC Cancer, 2011, DOI: 10.1186/1471-2407-11-164
Abstract: Data were obtained from the population-based BC Cancer Registry for patients diagnosed with invasive esophageal and gastric cancer between 1984 and 2006. The ethnicity of patients was estimated according to their names and categorized as Chinese, South Asian, Iranian or Other. Cox proportional hazards regression analysis was used to estimate the effect of ethnicity adjusted for patient sex and age, disease histology, tumor location, disease stage and treatment. The survival of gastric cancer patients was significantly different among ethnic groups. Chinese patients showed better survival compared to others in univariate and multivariate analysis. The survival of esophageal cancer patients was significantly different among ethnic groups when the data was analyzed by a univariate test (p = 0.029), but not in the Cox multivariate model adjusted for other patient and prognostic factors. Ethnicity may represent underlying genetic factors. Such factors could influence host-tumor interactions by altering the tumor's etiology and therefore its chance of spreading. Alternatively, genetic factors may determine response to treatments. Finally, ethnicity may represent non-genetic factors that affect survival. Differences in survival by ethnicity support the importance of ethnicity as a prognostic factor, and may provide clues for the future identification of genetic or lifestyle factors that underlie these observations. Gastric and esophageal cancers are among the most lethal human malignancies. Worldwide, gastric cancer is the fourth most common cancer, but the second most common cause of death from cancer [1]. Esophageal cancer is the eighth most common cancer, but the sixth most common cause of cancer death [1]. The epidemiology of these cancers is geographically diverse. Incidence rates for gastric cancer vary from 3.4 per 100,000 among women in North America to 26.9 per 100,000 among men in Asia.
The 5-year survival is usually about 20% [2]; however, countries with higher inc
Cost–Effective Prediction of Gender-Labeling Errors and Estimation of Gender-Labeling Error Rates in Candidate-Gene Association Studies
Conghui Qu, Johanna M. Schuetz, Jeong Eun Min, Denise Daley, John J. Spinelli, Angela Brooks-Wilson, Jinko Graham
Frontiers in Genetics, 2011, DOI: 10.3389/fgene.2011.00031
Abstract: We describe a statistical approach to predict gender-labeling errors in candidate-gene association studies, when Y-chromosome markers have not been included in the genotyping set. The approach adds value to methods that consider only the heterozygosity of X-chromosome SNPs, by incorporating available information about the intensity of X-chromosome SNPs in candidate genes relative to autosomal SNPs from the same individual. To our knowledge, no published methods formalize a framework in which heterozygosity and relative intensity are simultaneously taken into account. Our method offers the advantage that, in the genotyping set, no additional space is required beyond that already assigned to X-chromosome SNPs in the candidate genes. We also show how the predictions can be used in a two-phase sampling design to estimate the gender-labeling error rates for an entire study, at a fraction of the cost of a conventional design.
Genetic Variation in Cell Death Genes and Risk of Non-Hodgkin Lymphoma
Johanna M. Schuetz, Denise Daley, Jinko Graham, Brian R. Berry, Richard P. Gallagher, Joseph M. Connors, Randy D. Gascoyne, John J. Spinelli, Angela R. Brooks-Wilson
PLOS ONE, 2012, DOI: 10.1371/journal.pone.0031560
Abstract: Background: Non-Hodgkin lymphomas are a heterogeneous group of solid tumours that constitute the 5th highest cause of cancer mortality in the United States and Canada. Poor control of cell death in lymphocytes can lead to autoimmune disease or cancer, making genes involved in programmed cell death of lymphocytes logical candidate genes for lymphoma susceptibility. Materials and Methods: We tested for genetic association with NHL and NHL subtypes, of SNPs in lymphocyte cell death genes using an established population-based study. 17 candidate genes were chosen based on biological function, with 123 SNPs tested. These included tagSNPs from HapMap and novel SNPs discovered by re-sequencing 47 cases in genes for which SNP representation was judged to be low. The main analysis, which estimated odds ratios by fitting data to an additive logistic regression model, used European ancestry samples that passed quality control measures (569 cases and 547 controls). A two-tiered approach for multiple testing correction was used: correction for number of tests within each gene by permutation-based methodology, followed by correction for the number of genes tested using the false discovery rate. Results: Variant rs928883, near miR-155, showed an association (OR per A-allele: 2.80 [95% CI: 1.63–4.82]; pF = 0.027) with marginal zone lymphoma that is significant after correction for multiple testing. Conclusions: This is the first reported association between a germline polymorphism at a miRNA locus and lymphoma.
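The odds ratio and confidence interval reported for rs928883 can be read on the log scale, where a Wald-type interval from logistic regression is symmetric around the estimate. The Python sketch below back-calculates the implied standard error from the published interval. It assumes a standard Wald interval with a 1.96 multiplier, which the abstract does not state, and the resulting z value is unadjusted, so it is not comparable to the corrected pF value.

```python
import math

# Values reported in the abstract for rs928883 and marginal zone lymphoma.
odds_ratio = 2.80
ci_low, ci_high = 1.63, 4.82

# Assumption: a Wald 95% interval, i.e. log(OR) +/- 1.96 * SE.
z_95 = 1.96
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_95)

# On the log scale the interval midpoint should recover the point estimate.
midpoint_or = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)

# Unadjusted Wald z statistic implied by the estimate and its standard error.
wald_z = math.log(odds_ratio) / se_log_or

print(f"SE of log(OR): {se_log_or:.3f}")
print(f"OR at interval midpoint: {midpoint_or:.2f}")
print(f"unadjusted Wald z: {wald_z:.2f}")
```

The midpoint agrees with the reported odds ratio to two decimals, which is consistent with the interval having been computed on the log-odds scale.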
Genetic variation in the NBS1, MRE11, RAD50 and BLM genes and susceptibility to non-Hodgkin lymphoma
Johanna M Schuetz, Amy C MacArthur, Stephen Leach, Agnes S Lai, Richard P Gallagher, Joseph M Connors, Randy D Gascoyne, John J Spinelli, Angela R Brooks-Wilson
BMC Medical Genetics, 2009, DOI: 10.1186/1471-2350-10-117
Abstract: We surveyed the genetic variation in these genes in constitutional DNA of NHL patients by means of gene re-sequencing, then conducted genetic association tests for susceptibility to NHL in a population-based collection of 797 NHL cases and 793 controls. 114 SNPs were discovered in our sequenced samples, 61% of which were novel and not previously reported in dbSNP. Although four variants, two in RAD50 and two in NBS1, showed association results suggestive of an effect on NHL, they were not significant after correction for multiple tests. These results suggest an influence of RAD50 and NBS1 on susceptibility to diffuse large B-cell lymphoma and marginal zone lymphoma. Larger association and functional studies could confirm such a role. Non-Hodgkin lymphoma (NHL) is a heterogeneous group of hematological malignancies that in aggregate constitutes the 5th highest cause of cancer mortality in the United States [1] and Canada [2]. NHL subtypes vary in presentation, survival expectation, morbidity and responses to treatment. Chromosomal translocations are so characteristic of NHL that many genes now known to be important in the development of cancer, such as BCL2 [3], were originally discovered due to their position at recurrent translocation breakpoints in NHL tumours. During development and differentiation, the DNA of B- and T-cells is subject to double stranded breaks necessary for the rearrangement of immunoglobulin genes. Genes functioning in double-stranded break repair are involved in successfully controlling and repairing these breaks, thus protecting the genome from molecular events that could lead to cancer. This study examined four genes with key roles in maintaining genome stability: the MRN complex, MRE11, RAD50 and NBS1, and the Bloom syndrome gene (BLM). We have previously shown association with NHL of a genetic variant in H2AX, which encodes a histone involved in signalling the presence of double stranded breaks [4].
The MRN complex forms foci at sites of doubl
Genetic Polymorphisms at TIMP3 Are Associated with Survival of Adenocarcinoma of the Gastroesophageal Junction
Morteza Bashash, Amil Shah, Greg Hislop, Martin Treml, Karla Bretherick, Rozmin Janoo-Gilani, Stephen Leach, Nhu Le, Chris Bajdik, Angela Brooks-Wilson
PLOS ONE, 2013, DOI: 10.1371/journal.pone.0059157
Abstract: The poor survival of adenocarcinomas of the gastroesophageal junction (GEJ) makes them clinically important. Discovery of host genetic factors that affect outcome may guide more individualized treatment. This study tests whether constitutional genetic variants in matrix metalloproteinases (MMP) and tissue inhibitors of metalloproteinases (TIMP) genes are associated with outcome of GEJ adenocarcinoma. Single nucleotide polymorphisms (SNPs) at four TIMP (TIMP1-4) and three MMP genes (MMP2, MMP7 and MMP9) were genotyped in DNA samples from a prospective cohort of patients with primary adenocarcinoma of the GEJ admitted to the British Columbia Cancer Agency. Cox proportional hazards regression, with adjustment for patient, disease and treatment variables, was used to estimate the association of SNPs with survival. Genotypes for 85 samples and 48 SNPs were analyzed. Four SNPs across TIMP3 (rs130274, rs715572, rs1962223 and rs5754312) were associated with survival. Interaction analyses revealed that the survival associations with rs715572 and rs5754312 are specific and significant for 5FU+cisplatin treated patients. Sanger sequencing of the TIMP3 coding and promoter regions revealed an additional SNP, rs9862, also associated with survival. TIMP3 genetic variants are associated with survival and may be useful in optimizing treatment strategies for individual patients.
The Relationship between Telomere Length and Mortality in Chronic Obstructive Pulmonary Disease (COPD)
Jee Lee, Andrew J. Sandford, John E. Connett, Jin Yan, Tammy Mui, Yuexin Li, Denise Daley, Nicholas R. Anthonisen, Angela Brooks-Wilson, S. F. Paul Man, Don D. Sin
PLOS ONE, 2012, DOI: 10.1371/journal.pone.0035567
Abstract: Some have suggested that chronic obstructive pulmonary disease (COPD) is a disease of accelerated aging. Aging is characterized by shortening of telomeres. The relationship of telomere length to important clinical outcomes such as mortality, disease progression and cancer in COPD is unknown. Using quantitative polymerase chain reaction (qPCR), we measured telomere length of peripheral leukocytes in 4,271 subjects with mild to moderate COPD who participated in the Lung Health Study (LHS). The subjects were followed for approximately 7.5 years during which time their vital status, FEV1 and smoking status were ascertained. Using multiple regression methods, we determined the relationship of telomere length to cancer and total mortality in these subjects. We also measured telomere length in healthy “mid-life” volunteers and patients with more severe COPD. The LHS subjects had significantly shorter telomeres than those of healthy “mid-life” volunteers (p<.001). Compared to individuals in the 4th quartile of relative telomere length (i.e. longest telomere group), the remaining participants had significantly higher risk of cancer mortality (Hazard ratio, HR, 1.48; p = 0.0324) and total mortality (HR, 1.29; p = 0.0425). Smoking status did not make a significant difference in peripheral blood cell telomere length. In conclusion, COPD patients have short leukocyte telomeres, which are in turn associated with an increased risk of total and cancer mortality. Accelerated aging is of particular relevance to cancer mortality in COPD.
