Experimental Design and Data Management
Full course description
When publishing in a scientific journal, you often have the option to make your findings open access, freely available to the entire world. In addition, a standard requirement is that all source data underlying your findings be shared openly in accordance with the FAIR ('Findable, Accessible, Interoperable, and Reusable') data principles. In Systems Biology research, these datasets are generally very large. Hence, alongside proper experimental design, data management has become of utmost importance in Systems Biology research. This course covers all aspects of study design and data management, including statistical data analysis and the FAIR data principles. You will also learn how to perform basic as well as advanced data analyses in the scientific programming language R.
Course objectives
1. Learn to explain important aspects of sample collection and sample storage in biobanks;
2. Learn to distinguish the relative merits and use cases of the various study designs used in Systems Biology research;
3. Learn to perform analyses on biomedical research data in the statistical programming language R;
4. Learn to explain important aspects of scientific data management, including statistical data analysis, data visualization and data sharing;
5. Learn to explain the principles of FAIR data sharing and open-access publishing.
Recommended reading
Mandatory Literature:
- Rundle et al. 2012: Better cancer biomarker discovery through better study design. Eur J Clin Invest; 42(12):1350–1359.
- Vineis et al. 2005: Environmental tobacco smoke and risk of respiratory cancer and chronic obstructive pulmonary disease in former smokers and never smokers in the EPIC prospective study. BMJ; 330(7486):277.
- Rundle et al. 2005: Design Options for Molecular Epidemiology Research within Cohort Studies. Cancer Epidemiol Biomarkers Prev; 14(8):1899-907.
- Gallo et al. 2012: STrengthening the Reporting of OBservational studies in Epidemiology - Molecular Epidemiology (STROBE-ME): an extension of the STROBE statement. Eur J Clin Invest; 42(1):1-16.
- Lall et al. 2017: Personalized risk prediction for type 2 diabetes: the potential of genetic risk scores. Genetics in Medicine; 19(3):322-329.
- Lemesle et al. 2015: Multimarker proteomic profiling for the prediction of cardiovascular mortality in patients with chronic heart failure. PLoS One; 10(4):e0119265.
- Wang et al. 2011: Assessing the role of circulating, genetic, and imaging biomarkers in cardiovascular risk prediction. Circulation; 123(5):551-65.
- Vaux et al. 2012: Research methods: Know when your numbers are significant. Nature; 492(7428):180-1.
- Button et al. 2013: Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience; 14:365–376.
- Sham et al. 2014: Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet; 15(5):335-46.
- Uitterlinden et al. 2016: An Introduction to Genome-Wide Association Studies: GWAS for Dummies. Semin Reprod Med; 34:196–204.
- Langfelder et al. 2008: WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics; 9:559.
- Yuan et al. 2017: Co-expression network analysis identified FCER1G in association with progression and prognosis in human clear cell renal cell carcinoma. Int J Biol Sci; 13(11):1361-1372.
- Zerbino et al. 2017: Ensembl 2018. Nucleic Acids Res; 46(D1):D754-D761.
- Rodriguez et al. 2016: Publishing FAIR Data: An Exemplar Methodology Utilizing PHI-Base. Front Plant Sci; 7:641.
- Wilkinson et al. 2016: The FAIR Guiding Principles for scientific data management and stewardship. Sci Data; 3:160018.
Additional Literature:
- Boothby et al. 2015: Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. PNAS; 112(52):15976–15981.
- Koutsovoulos et al. 2016: No evidence for extensive horizontal gene transfer in the genome of the tardigrade Hypsibius dujardini. PNAS; 113(18):5053–5058.
Additional literature may be provided by the lecturers during their lectures and tutorial sessions. All literature will be made available via Canvas.