Chapter 1 ECOSCOPE

This is an open-source textbook aimed at introducing data science and R programming to undergraduate and graduate students. It was originally written as a learning tool for the ECOSCOPE workshops hosted at the University of British Columbia.

The book is structured so that each chapter is a different workshop.

Note: This book is in its preliminary stages. Content will be continuously updated and added.

1.1 Schedule

2021 Term 1 (Sept - Dec)

Date                          Time     Workshop                     Register
Sept 21 (Tues)                2-4 PM   Intro to R                   Click here to register
Sept 28 & 30 (Tues & Thurs)   1-4 PM   Intro to the tidyverse       Click here to register
Oct 19 (Tues)                 2-4 PM   Intro to R                   Click here to register
Oct 26 & 28 (Tues & Thurs)    1-4 PM   Statistical models in R      Click here to register
Nov 16 (Tues)                 2-4 PM   Intro to R                   Click here to register
Nov 23 & 25 (Tues & Thurs)    1-4 PM   Intermediate R programming   Click here to register


2021 Term 2 (Jan - Apr)

Date                          Time     Workshop                     Register
Sept 21 (Tues)                2-4 PM   Intro to R                   Click here to register
Sept 28 & 30 (Tues & Thurs)   1-4 PM   Intro to the tidyverse       Click here to register
Oct 19 (Tues)                 2-4 PM   Intro to R                   Click here to register
Oct 26 & 28 (Tues & Thurs)    1-4 PM   Beyond ggplot2               Click here to register
Nov 16 (Tues)                 2-4 PM   Intro to R                   Click here to register
Nov 23 & 25 (Tues & Thurs)    1-4 PM   Intermediate R programming   Click here to register

1.2 Workshops

1.2.1 Introduction to R and RStudio

Author(s): Gil B. Henriques, Florent Mazel, Yue Liu & Kim Dill-McFarland

This introductory workshop is for beginners with no experience in R. We introduce you to R and RStudio at the beginner level. The condensed 2-hour workshop is meant to get you started in R and acts as a prerequisite for our more advanced workshops.

In it, we cover:

  • R and RStudio
  • RStudio projects
  • R scripts
  • Installing packages
  • Reading in data as a data frame
  • Vectors, single values, and data types
  • Basic data visualization
  • The help function
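
As a rough illustration of the topics listed above, here is a minimal sketch in base R. The data, object names, and file path are hypothetical and are not taken from the workshop materials, which use their own data set.

```r
# Installing a package is done once per machine (commented out here):
# install.packages("ggplot2")

# Vectors, single values, and data types (hypothetical example data)
temps <- c(12.5, 15.1, 9.8)   # numeric vector
sites <- c("A", "B", "C")     # character vector
n_sites <- length(sites)      # a single value
class(temps)                  # "numeric"
class(sites)                  # "character"

# Build a small data frame, write it to a temporary CSV file,
# then read it back in the way you would read your own data
dat <- data.frame(site = sites, temp_c = temps)
csv_path <- tempfile(fileext = ".csv")
write.csv(dat, csv_path, row.names = FALSE)
dat2 <- read.csv(csv_path)
str(dat2)

# A single summary value computed from one column
mean_temp <- mean(dat2$temp_c)

# Basic data visualization with base graphics
plot(dat2$temp_c, type = "b",
     xlab = "Site index", ylab = "Temperature (C)")

# The help function
help("mean")
```

In an RStudio project you would normally read a file stored in the project folder, for example with read.csv("data/my_file.csv"), rather than a temporary file.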

Click here for setup instructions.

1.2.2 Introduction to the tidyverse

Author(s): Kim Dill-McFarland, Andrew Li & Kris Hong

In this workshop, we provide a brief introduction to RStudio, a powerful but user-friendly R environment, and teach you how to use it effectively. We then delve into data manipulation and graphics in the tidyverse, including the packages dplyr, tidyr, and ggplot2. We teach different ways to manipulate data in tabular and text forms, as well as the critical concepts underlying the grammar of graphics and how they are implemented in ggplot2.

You will learn how to:

  • create an R project and import data from a file into R,
  • create subsets of rows or columns from data frames using dplyr,
  • select pieces of an object by indexing using element names or position,
  • change your data frames between wide and narrow formats,
  • create various types of graphics,
  • modify the various features of a graphic, and
  • save your graphic in various formats.
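
The skills listed above might look roughly like this in practice. This is a sketch only: it assumes the dplyr, tidyr, and ggplot2 packages are installed, and the data frame and its column names are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data in wide format: one column per taxon
counts <- data.frame(
  sample  = c("s1", "s2", "s3"),
  depth_m = c(10, 100, 200),
  taxon_a = c(5, 0, 12),
  taxon_b = c(3, 8, 1)
)

# Subset rows and columns with dplyr
deep <- counts %>%
  filter(depth_m >= 100) %>%
  select(sample, taxon_a, taxon_b)

# Index by element name or by position
counts[["taxon_a"]]   # column by name
counts[2, 3]          # row 2, column 3

# Wide to narrow (long) format and back again with tidyr
long <- counts %>%
  pivot_longer(cols = c(taxon_a, taxon_b),
               names_to = "taxon", values_to = "count")
wide <- long %>%
  pivot_wider(names_from = taxon, values_from = count)

# Create a graphic, modify its labels, and save it to a file
p <- ggplot(long, aes(x = depth_m, y = count, colour = taxon)) +
  geom_point() +
  labs(x = "Depth (m)", y = "Count")
out_file <- file.path(tempdir(), "counts.png")
ggsave(out_file, plot = p, width = 5, height = 4)
```

Changing the file extension passed to ggsave(), for example to ".pdf", saves the same graphic in a different format.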

Click here for setup instructions.

1.2.3 Reproducible research

Author(s): Kim Dill-McFarland & Kris Hong

In this workshop, we introduce computational reproducibility and its importance to modern research. We will teach the general principles for reproducible computer-based analyses, along with specific methods and tools for reproducibility and version control with RStudio and GitHub.

You will learn how to:

  • Construct reproducible, automatable workflows in R with scripts and Make
  • Create reproducible documents using R Markdown to include underlying code / computations with relevant graphical and statistical results in several different formats (reports, presentation slides, handouts, notes)
  • Use Git version control
  • Integrate version control with GitHub for both personal and group projects

Click here for setup instructions.

1.2.4 Statistical models in R

Author(s): Yue Liu, Kim Dill-McFarland & Andrew Li

In this workshop, we introduce various types of regression models and how they are implemented in R. We cover linear regression, ANOVA, ANCOVA, and mixed effects models for continuous response data, logistic regression for binary response data, and Poisson and negative binomial regression for count response data.

You will learn:

  • the assumptions behind the different models
  • how to interpret the main effects and interaction terms in a model
  • various experimental design concepts that help maximize statistical power

In R, you will learn how to:

  • build a statistical model
  • define and manipulate model terms
  • use the lsmeans package to answer specific research questions
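
To give a flavour of building a model and manipulating its terms, here is a minimal sketch using only base R and the built-in mtcars data set. The choice of data and variables is an assumption made for illustration and is not the workshop's own example; the lsmeans package is not used here.

```r
# Built-in example data; treat transmission type as a factor
cars <- mtcars
cars$am <- factor(cars$am, labels = c("automatic", "manual"))

# Linear model with main effects and an interaction (ANCOVA-style):
# mpg ~ wt * am expands to wt + am + wt:am
fit_lm <- lm(mpg ~ wt * am, data = cars)
summary(fit_lm)   # coefficient estimates and tests
anova(fit_lm)     # sequential ANOVA table

# Manipulate model terms: drop the interaction and compare the two fits
fit_add <- update(fit_lm, . ~ . - wt:am)
anova(fit_add, fit_lm)

# Logistic regression for a binary response (vs is coded 0/1)
fit_logit <- glm(vs ~ wt, data = cars, family = binomial)

# Poisson regression for a count response (number of carburetors)
fit_pois <- glm(carb ~ wt, data = cars, family = poisson)
```

The same formula interface carries over to the other model types covered in the workshop, so most of the work lies in choosing the model family and the terms.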

Click here for the setup instructions.

1.2.5 Intermediate R programming

Author(s): Kim Dill-McFarland & Andrew Li

In this workshop, we teach you to use R as a programming environment, allowing you to write more complex, yet clearer data analysis code. We will teach you three fundamental concepts of R programming: functions, classes, and packages.

You will learn how to:

  • define objects, classes, and attributes in data and built-in functions
  • write functions and for loops
  • output large result tables to your hard drive
  • write and publish an R package
  • write formal automated tests (aka unit testing)
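
The sketch below touches several of these ideas in base R: a function containing a for loop, a simple S3 class with its own print method, writing results to disk, and an informal test. All function and class names are hypothetical. A real package would typically put such checks in a testing framework such as testthat rather than in a bare stopifnot() call.

```r
# A function that uses a for loop to compute the mean of each numeric column
column_means <- function(df) {
  stopifnot(is.data.frame(df))
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  means <- numeric(length(numeric_cols))
  names(means) <- numeric_cols
  for (col in numeric_cols) {
    means[col] <- mean(df[[col]], na.rm = TRUE)
  }
  means
}

# A minimal S3 class: a list with a class attribute
new_col_summary <- function(means) {
  structure(list(means = means), class = "col_summary")
}

# A print method, dispatched automatically for objects of this class
print.col_summary <- function(x, ...) {
  cat("Column means for", length(x$means), "columns\n")
  invisible(x)
}

res <- new_col_summary(column_means(iris))
class(res)        # the class attribute
attributes(res)   # all attributes, including names
print(res)

# Output a result table to disk (a temporary directory here)
out_file <- file.path(tempdir(), "column_means.csv")
write.csv(data.frame(column = names(res$means),
                     mean = unname(res$means)),
          out_file, row.names = FALSE)

# An informal test of the function on a tiny known input
stopifnot(isTRUE(all.equal(unname(column_means(data.frame(a = c(1, 3)))), 2)))
```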

Click here for the setup instructions.

1.2.6 Beyond ggplot2

Author(s): Andrew Li

Coming soon!


License: GPL v3