|
Genome Biology 2011
Closure of the NCBI SRA and implications for the long-term future of genomics data storage

Abstract:

DL: NCBI was facing budgetary constraints and presented a range of options to the National Institutes of Health (NIH) leadership, who chose to phase out the SRA along with other resources. One factor in the decision was the understanding that raw sequence data within the SRA are processed into derived forms in order to answer the underlying biological questions, so as methods mature, the SRA was seen as a transitional resource. The SRA has primarily been used by a relatively small community of project analysts and researchers working on methods development in genome-scale research projects.

PF: The SRA isn't closing. It started as a joint venture between the NCBI and the EBI, so the NCBI ceasing to accept submissions doesn't mean that the SRA is closing; it is merely changing, and the European Nucleotide Archive (ENA) at EMBL-EBI will remain. The NCBI's decision was based on budgetary constraints. Most people don't realize that storage space is only a minor fraction of the database's budget; the bulk of the cost is associated with the staff who maintain the database, process the submissions, develop the software and so on.

SS: From the outside, it appears that the SRA is closing because of NIH budgetary considerations. One problem is that the amount of sequence being generated is growing at an extraordinary rate, probably faster than the budget is increasing. My group uses the SRA a lot. Due to the nature of our work, we rely on it maybe more than others. We download data reasonably frequently, but because of the size of the datasets we try not to do it too often.

RK: The SRA was widely disliked by a lot of users, in particular because it was hard to get data. Partly that was because of poor standards for the metadata associated with the data entries, which made it hard to find the samples you were looking for.
It wasn't set up for projects that were generating many samples at a time, and multiplexing with barcoded samples was als
|