R Independent Study



Syllabus


This is a 2-credit online independent study. You are expected to learn the material on your own. There are no class meetings or face-to-face discussion groups. Anticipate spending 6–8 hours a week. All the required course work will be posted on the lesson page every week. Your grade will be determined by the weekly Lon Capa assignments due every Friday at 11:59 pm.

Credit Hours: This is a 2-credit-hour course. Make sure that's what you chose on UI Enterprise, otherwise you'll get short-changed. If there is some reason (like going over hours) you prefer to get only 1 credit hour, email us at firemanstat390@gmail.com.

Prerequisite: You should have taken a basic one-semester statistics course like Stat 100, and you should have taken, or be concurrently taking, Stat 200. All students can access the material of this course from the course website except the Lon Capa homework assignments. Students currently taking Stat 200 but not this course will be able to access the Lon Capa homework assignments after the assignment due dates, but those assignments will not be graded.


Course Staff

Instructor: Ellen Fireman (email: firemanstat390@gmail.com)
Course Assistant: Yuk Tung Liu

Website: http://courses.atlas.illinois.edu/fall2016/STAT/STAT200/RProgramming/
(short URL: go.illinois.edu/stat390)

Office Hours: Monday–Friday at 4–6pm in 23 Illini Hall (the computer lab in the basement)


Course Goal and Philosophy

By the end of this course, you are expected to be skilled enough in R to use it as a tool for real statistical projects.

R is more than a set of separate little calculator-like commands. It's a full programming language with an internal logic. As with any language, acquiring fluency requires real practice. Our exercises are mostly not cut-and-paste phrases. Instead we build toward real-world use.


Materials

Fortunately there are already many high-quality free resources available for R learners and users. We structure the course around those materials. We will use mainly three resources in this course.

Textbook: R Programming for Data Science by Roger D. Peng.
This is an ebook. The suggested price is $20, but you can get it free if you want. More information on the book and how to purchase it can be found in Week 1's note.

swirl: A software package written in R. It provides interactive lessons for beginners to learn R. Instructions for installing swirl are given in Week 1's note.

Weekly Note: There is an HTML note every week. The links to the notes will be posted on our lesson page every week. In these additional notes, we demonstrate how R can be used to tackle problems encountered in Stat 100 and 200.

Lon Capa Assignments: We will integrate these materials into a set of lesson plans with weekly Lon Capa homework assignments, mostly using individually randomized data sets and graded automatically. Most of these problems are not very hard, but since each of you has a slightly different data set, copying answers will not work! Starting from Week 6, you will also be given higher-level problems that require more complex programming and data analysis skills. These problems will be hand-graded by the TAs.

Late homework is NOT accepted on Lon Capa. However, Lon Capa grades each problem in the assignment separately, so if you do 70% of the homework correctly before the due date, you'll get credit for that 70%. Your lowest homework score will be dropped at the end of the semester.


Grading

The grades will be based on the weekly Lon Capa homework due every Friday at 11:59pm starting on Sep 2; there are no exams or projects. That means everyone can do well just by hard work. Your grades will appear on Compass.

Overall grade is translated into a letter grade as follows:

A+  97-100      A  93-96.99    A-  90-92.99
B+  87-89.99    B  83-86.99    B-  80-82.99
C+  77-79.99    C  73-76.99    C-  70-72.99
D+  67-69.99    D  63-66.99    D-  60-62.99
F   < 60

Bonus Points: There will be two bonus assignments, each worth 10 points. The first is the syllabus quiz, given in the first week and due by the end of the second week (Sep 3). The second is a survey on this course, which will be given at the end of the semester. Bonus points can only help you. You can still get 100% without doing any bonus work. Bonus points are figured into your grade as follows:

Course Total (%) = 100 × [(Percentage on Required Work) + 0.2×(Percentage on Bonus Points)] / [100 + 0.2×(Percentage on Bonus Points)]

Suppose at the end of the semester you have an 80% average on the required work and you earn 90% on the bonus work. Your course total will be (80 + 0.2×90)/(100 + 0.2×90) = 98/118 = 83.05%. So your grade would be raised from a B- to a B.
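
Since this is an R course, the same arithmetic can be checked in R. The sketch below is only an illustration and not part of any assignment: the function names course_total and letter_grade are made up here, and it assumes the letter-grade ranges are continuous, so that, for example, anything from 83 up to but not including 87 counts as a B.

```r
# Course total (in percent) with bonus points, following the formula above.
# required_pct and bonus_pct are percentages on a 0-100 scale.
# (course_total is a hypothetical helper name, not an official course function.)
course_total <- function(required_pct, bonus_pct = 0) {
  100 * (required_pct + 0.2 * bonus_pct) / (100 + 0.2 * bonus_pct)
}

# Letter grade from the cutoffs in the grading table.
# Assumption: ranges are treated as continuous (e.g. 83 <= total < 87 is a B).
letter_grade <- function(total) {
  cutoffs <- c(60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93, 97)
  grades  <- c("F", "D-", "D", "D+", "C-", "C", "C+",
               "B-", "B", "B+", "A-", "A", "A+")
  # findInterval() counts how many cutoffs are <= total
  grades[findInterval(total, cutoffs) + 1]
}

total <- course_total(80, 90)
round(total, 2)       # 83.05
letter_grade(80)      # "B-" (no bonus work)
letter_grade(total)   # "B"  (with 90% on the bonus work)
```

With no bonus work, course_total(x, 0) simply returns x, which is why the bonus assignments can only help.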



Course Schedule

Lessons will be posted here every week.

Week  Dates           Topics
1     8/22 – 8/26     Introduction, installation of R
2     8/29 – 9/2      Data types, missing values, vectorized operations
3     9/5 – 9/9       Loading data files, subsetting, statistics functions
4     9/12 – 9/16     Control functions, logical operations, simple data manipulations
5     9/19 – 9/23     Writing functions, plotting
6     9/26 – 9/30     R markdown, simple linear regressions
7     10/3 – 10/7     Loop functions, regression with factor variables
8     10/10 – 10/14   Introduction to Monte Carlo simulations
9     10/17 – 10/21   Date and time in R, multivariable regressions
10    10/24 – 10/28   Statistical tests, optional: regular expressions
11    10/31 – 11/4    Transformation of variables
12    11/7 – 11/11    Maximum likelihood and logistic regression
13    11/14 – 11/18   Logistic regression
14    11/28 – 12/2    Nonparametric statistics