**Statistical computing with R for applied biology. 1. Basic and intermediate methods.**

**Course objective**:

To provide an introduction to the use of the R environment for graphical and statistical analysis in biology, biotechnology, medicine, and food science and nutrition.

Download the brochure for AY 2018-2019 here (.pdf, 573 kB).

**Learning goals**:

- **knowledge and understanding**: an introductory knowledge of the principles of statistical computing for applied biology; working knowledge of basic methods for data wrangling, exploratory data analysis, and statistical and graphical analysis
- **applying knowledge and understanding**: ability to develop code in R and use it for graphical and statistical analysis
- **making judgements**: ability to choose the graphical and statistical methods that are most appropriate in a given situation
- **communication skills**: ability to produce reports on the statistical and graphical analysis of experimental data in a variety of formats
- **learning skills**: ability to access and peruse literature and technical information on statistical computing

**Prerequisites**:

A BSc in Agriculture, Food Science, Chemistry, Biology or Biotechnology. At least 5 ECTS credits in Mathematics (some background in statistics, e.g. 3 ECTS credits in Statistics, may help). Ability to use spreadsheet software under Windows, macOS or Unix/Linux operating systems. A knowledge of technical English (for speakers of English as a second language, a B1 or B2 level is suggested).

**Attendance**. Only 10 highly motivated students can attend the course (and get their exercises graded). Further students can be accepted but their exercises and reports will not be graded.

**Grading**. To obtain full credits (4 or 5 ECTS, depending on the number of lectures taken), students must turn in a report (in Word or pdf format) within 1 month of the end of the course (note: as of 2018 the report MUST be in the form of an R notebook or of an .html document generated with R Markdown). The report shall describe in full (including code) the descriptive and inferential statistical analysis and the graphical analysis of one of their own experiments. A suitable dataset from an R package can also be used.

**Course content**:

**Lectures (32 h).**

1. An introduction to statistical analysis (**2 h**).
2. The R environment (**1 h**).
3. Importing data, data structures in R (**3 h**).
4. Data wrangling, tidying and reshaping (**2 h**).
5. Data visualisation with base functions and ggplot2 (**3 h**).
6. Numerical and graphical summaries of data. Generating reports with R Markdown and knitr (**3 h**).
7. Group comparisons with t-tests and non-parametric tests; one-way ANOVA and multiple mean comparisons; tests of independence and association for contingency tables; power analysis (**3 h**).
8. Experimental design; ANOVA and ANCOVA (**4 h**).
9. Covariance, correlation and linear regression (**3 h**).

**Bonus lectures**

10. Factorial designs and empirical model building (**4 h**).
11. Non-linear regression (**4 h**).

**Practicals: 16 h**. Writing and running code, generating reports using datasets from R or case studies

**Venue and timetable**: Room B6, library building, Campus di Macchia Romana, Università degli Studi della Basilicata. The course will begin on March 27, 2019, with two lectures a week: Tuesday 15.00-17.00 and Wednesday 15.00-17.00.

**Course material**: handouts and code will be shared via Dropbox. Students need to bring their own laptop and are advised to download R and RStudio before they start attending the lectures.

**Suggested readings**.

- Gacula, M., Singh, J., Bi, J., Altan, S. 2008. Statistical methods in food and consumer research. Academic Press.
- Kabacoff, R.I. 2015. R in action. 2nd edition. Manning.
- http://www.biostathandbook.com
- Wickham, H., Grolemund, G. 2017. R for data science. O'Reilly.