This course introduces the basic ideas of data science and implements them in the R programming language. We will use the Tidyverse, a collection of R packages that facilitate data import, manipulation, encoding, exploration and visualisation.
Day 1: Introduction to the Tidyverse
Introduction to R and RStudio. Workflow. Tidy data. The Tidyverse ecosystem. Data import.
Tibbles. dplyr basics. Pipes.
Day 2: Data Manipulation
dplyr verbs. Numerical summaries. SQL and dplyr.
Day 3: Categorical Variables
Factors. The forcats package. Modifying factor order. Modifying factor levels.
Day 4: Relational Data
Mutating joins. Filtering joins. Set operations.
Day 5: Data Visualisation I
Introduction to ggplot2. Creating a ggplot. Aesthetic mappings. Geometric objects.
Day 6: Data Visualisation II
More geometric objects. Themes.
Day 7: Exploratory Data Analysis I
Visualising distributions. Typical vs unusual values. Missing values.
Day 8: Exploratory Data Analysis II
Covariation. A categorical and a continuous variable. Two categorical variables. Two continuous variables.
Your training will be led by:
Pre-requisites
A basic knowledge of R is helpful but not necessary.
Audience
This course is free and available to all those working in Midlands health and care organisations.
Duration
Eight half days (3.5 hours each day)
Location
Online – delivered via Zoom with a combination of delivery styles.
For more information about this course, please contact:
Training & Development Operational Lead, Rachel Caswell