Genomics and Biological Big Data: Facing Current and Future Challenges around Data and Software Sharing and Reproducibility

Novel technologies in genomics allow data to be created at exascale with relatively little human and laboratory effort, and thus at lower monetary cost, compared to what was possible only a decade ago. While the availability of this data helps to answer research questions that could not have been answered before, and perhaps could not even have been asked, the sheer amount of data creates new challenges that call for new software and data management systems. Such new solutions have to take integrative approaches that not only consider the effectiveness and efficiency of data processing but also improve reusability, reproducibility and usability, tailored especially to the target user communities of genomic big data. In our opinion, current solutions each have their strengths and tackle part of the challenges, but fail to provide a complete solution. In this paper, we present the key challenges and the characteristics that cutting-edge developments should possess to fulfil the needs of the user communities and to allow for seamless sharing and data analysis on a large scale.


