Scientific Programming
Full course description
Scientific programming can be seen as similar to any other type of laboratory experiment. An experiment requires a protocol in which each aspect is properly described, so that it can be reproduced with small experimental error every time it is performed, independently of the user. The same applies to scientific writing. The main objective is to produce results that can be obtained independently of the user.
In this course, you will be introduced to several aspects relevant to programming and to working in the data science/machine learning field. The course will introduce the relevance of automated statistical analysis, parallel computing, data visualization and interactive notebooks.
Format
The course follows the problem-based learning (PBL) approach. Characteristic of this approach is that learning is the result of an engaged interaction between academic staff and students, fuelled by their experience and knowledge, with the objective of developing understanding and insights.
The course consists of lectures, journal clubs and project meetings. Lectures and journal clubs are meant to introduce a specific topic and give a broad understanding of its application and relevance. The project meetings are meant to give the opportunity to go into more depth on each presented subject. During those project meetings, the students will have an opportunity to apply their knowledge and extend it in a project under the guidance of a supervisor. Here, the students will solve problems in new situations by applying acquired knowledge, facts, techniques and rules in a different way.
Course objectives
The main aim of the course is to give a clear overview of the important parts of scientific programming. This will include but won’t be limited to reproducibility aspects of the created computational tools, interactive notebooks, parallel computing and visualization of the outcomes.
The intended learning outcomes (ILOs) of the course are:
- The student compiles information to develop software that answers a biological/biomedical question.
- The student is able to create an interactive lab journal (Git) and to evaluate someone else's.
- The student is able to design a multivariate statistical analysis in a reproducible way.
- The student is able to explain, apply and implement software in an appropriate way.
- The student learns about parallel computing and applies it in software where possible.
- The student is able to evaluate different visualization options for the examined data and decide on the most informative option.
Recommended reading
The literature listed in this section provides information relevant to the course. These are only examples; additional literature will be provided.
Mandatory Literature:
- List M, Ebert P, Albrecht F. Ten Simple Rules for Developing Usable Software in Computational Biology. Markel S, editor. PLOS Computational Biology. 2017 Jan 5;13(1):e1005265.
- Perez-Riverol Y, Gatto L, Wang R, Sachsenberg T, Uszkoreit J, Leprevost F da V, et al. Ten Simple Rules for Taking Advantage of Git and GitHub. Markel S, editor. PLOS Computational Biology. 2016 Jul 14;12(7):e1004947.
- Website: https://towardsdatascience.com/getting-started-with-git-and-github-6fcd0f2d4ac6
- Website: https://jwiegley.github.io/git-from-the-bottom-up/
- Website: https://www.youtube.com/watch?v=iv8rSLsi1xo
- Website: https://towardsdatascience.com/the-ultimate-guide-to-data-cleaning-3969843991d4
- Website: Reproducibility and Replicability in Science: https://www.ncbi.nlm.nih.gov/books/NBK547546/
- Website: https://www.kdnuggets.com/2019/11/reproducibility-replicability-data-science.html
- Website: https://medium.com/science-uncovered/reproducibility-crisis-84ae0a5af2f
Additional Literature:
Additional literature will be provided during the course.