Statistical computing with R

Statistical computing with R for applied biology. 1. Basic and intermediate methods.

Course objective:

To provide an introduction to the use of the R environment for graphical and statistical analysis in biology, biotechnology, medicine, and food science and nutrition.

Download the brochure for AY 2018-2019 here (.pdf, 573 kB).

Learning goals:

  • knowledge and understanding: an introductory knowledge of the principles of statistical computing for applied biology; working knowledge of basic methods for data wrangling, exploratory data analysis, and statistical and graphical analysis
  • applying knowledge and understanding: ability to develop code in R and use it for graphical and statistical analysis
  • making judgements: ability to choose the graphical and statistical methods that are most appropriate in a given situation
  • communication skills: ability to produce reports for the statistical and graphical analysis of experimental data in a variety of formats
  • learning skills: ability to access and peruse literature and technical information on statistical computing

Prerequisites:

A BSc in Agriculture, Food Science, Chemistry, Biology or Biotechnology. At least 5 ECTS credits in Mathematics (some statistics, 3 ECTS credits in Statistics, may help). Ability to use spreadsheet software packages under Windows, MacOS or Unix/Linux operating systems. A knowledge of technical English (for speakers of English as a second language, a B1 or B2 level is suggested).

Attendance. Only 10 highly motivated students can attend the course and have their exercises graded. Additional students may be accepted, but their exercises and reports will not be graded.

Grading. To obtain full credits (4 or 5 ECTS, depending on the number of lectures taken), students must turn in a report (in Word or pdf format) within 1 month from the end of the course (note: as of 2018 the report MUST be in the form of an R notebook or of a .html document generated with R Markdown). The report shall describe in full (including code) the descriptive and inferential statistical analysis and the graphical analysis of one of their own experiments. A suitable dataset from an R package can be used.

Course content:

Lectures (32 h).

  1. An introduction to statistical analysis (2 h).
  2. The R environment (1 h).
  3. Importing data, data structures in R (3 h).
  4. Data wrangling, tidying and reshaping (2 h).
  5. Data visualisation with base functions and ggplot2 (3 h).
  6. Numerical and graphical summaries of data. Generating reports with R Markdown and knitr (3 h).
  7. Group comparisons with t-tests and non-parametric tests; one-way ANOVA and multiple mean comparisons; tests of independence and association for contingency tables; power analysis (3 h).
  8. Experimental design; ANOVA and ANCOVA (4 h).
  9. Covariance, correlation and linear regression (3 h).

Bonus lectures.

  10. Factorial designs and empirical model building (4 h).
  11. Non-linear regression (4 h).

Practicals (16 h). Writing and running code, and generating reports using datasets from R or case studies.

Venue and timetable: Room B6, library building, Campus di Macchia Romana, Università degli Studi della Basilicata. The course will begin on March 27, 2019, with two lectures a week: Tuesday 15.00-17.00, Wednesday 15.00-17.00.

Course material: handouts and code will be shared via Dropbox. Students need to bring their own laptop and are advised to download R and RStudio before they start attending the lectures.

Suggested readings.

  • Gacula, M., Singh, J., Bi, J., Altan, S. 2008. Statistical methods in food and consumer research. Academic Press.
  • Kabacoff, R.I. 2015. R in action. 2nd edition. Manning.
  • http://www.biostathandbook.com
  • Grolemund, G., Wickham, H. 2017. R for Data Science. O'Reilly.