Introduction to R
Preface & Course Overview
This notebook was created to host learning materials for Math 130: Introduction to R at California State University, Chico.
Math 130 is designed as a short-course “primer” to be taken before, or concurrently with, an upper division or graduate level Statistics course such as Math 315 or Math 615.
The goal of Math 130 is to get the complete novice up and running with the basic knowledge of how to use the statistical programming language R in an environment that emphasizes reproducible research and literate programming for data analysis. We recognize there are many topics and many approaches to teach an introduction to the R programming language. We have picked specific topics and methods that we believe learners will need to succeed in subsequent classes.
The target audience is anyone who wants to do their own data analysis. The course will culminate with an exploratory data analysis on either a pre-specified data set or your data set of choice.
Logistics
This course may be offered as in person or (a)synchronous online. Regardless of the mode of instruction, all learning materials are found on this website. Class time (when applicable) is spent expanding on ideas and concepts, working through assignments as a class or in pairs.
The logistics of the course varies slightly by instructor and mode of instruction. See below for your specific section.
Syllabus for current term
move these to sub folder
where to put discord & common help?
Callout boxes and icons
mean this is a note
orange boxes
means danger. so this is red?
means caution
is a tip or an example
not sure what color
- :cinema: associated video
- :pencil2: You try it
- :x: Don’t do this thing
Tool choices
R + R Studio = Success
The term R refers to both the programming language and the software that interprets the scripts written using it.
RStudio is currently a very popular way to not only write your R scripts but also to interact with the R software.
We will be programming in the R language, using the R Studio platform. You will have to install both onto your computer. Setup instructions are discussed later in section
FIXME
Why use R?
- Open source, cross-platform, and free
- Great for reproducibility
- Flexible and extensible. Doesn’t do something you want? Create a custom function for yourself.
- Tons of learning resources
- Currently R is used in all of Chico’s upper division Statistics & Data Science courses, as well as in some Science, Public Health, Economics, Finance, and some graduate courses.
- Does not involve lots of pointing and clicking (that’s a good thing!)
- Works on data of all shapes and sizes
- Produces high-quality graphics
- Large and welcoming community
Why use R Studio?
- Customizable workspace that docks all windows together.
- Notebook formats that allow for easy sharing of code and output, and integration with other languages (Python, C++, SQL, Stan)
- Syntax highlighting, warning errors when missing a closing parentheses.
- Cross-platform interface. Also works on Windows/iOS/Linux.
- Tab completion for functions. Forget the syntax or a variable name? Popup helpers are available.
- Free training videos available from the developers directly.
- One button publishing of reproducible documents such as reports, interactive visualizations, presentations (like this one!), websites.
make this side by side and add appropriate alt text
Left photo credit; Right photo credit. Entire figure credit: Data Carpentry R for Social Scientists.
check this link
Programming is scary!
check links
Learning to program has other benefits
- Improves your logical skills and critical problem solving
- Increases your attention to detail
- Increases your self reliance and empowers you to control your own research.
- Your PI will love your awesome graphics and reports.
- Some people think what you do is magic.
- Thinking graduate school? [expect to learn this on your own]
- [A few] [other lists] [of reasons]
Why no point and click?
Because it’s not reproducible.
- Which boxes did you click last time?
- New data? Gotta do it all over.
- Need to expand your model? Gotta do it all over.
- Made a mistake in the data coding? Gotta do it all over…
References
Some of this material presented is a derivation from work that is Copyright © Software Carpentry (http://software-carpentry.org/) which is under a CC BY 4.0 license which allows for adaptations and reuse of the work.