# COS 435 / ECE 433: Introduction to Reinforcement Learning

* **Q**: How do I join the class/waitlist?
    * If you are unable to enroll on the registrar website, then we have likely hit the course cap. Please add yourself to the [waitlist](https://docs.google.com/forms/d/e/1FAIpQLSdkSgQRW2X5hp-aqphmzIP09d2VDiM80XJHtGP8LJo3dN2rAw/viewform).
* **Q**: Can I audit the course? No, but you are welcome to attend the lectures if you are not enrolled in the course.
* **Lecture**: Tuesday and Thursday, 1:30pm – 2:50pm. Location: Computer Science 104
* **Precepts**: Fridays 11:00 – 11:50 and 12:30 – 1:20. Cathy Ji and Jiayi Geng will lead concurrent precepts during both times, in Friend 004.
* **Office Hours**:
    * Amanda Wang: Monday at 4:30pm -- 6:30pm in CS Building 105
        * Amanda won't hold office hours on March 24.
    * Cathy Ji: Monday at 6:30pm -- 7:30pm in CS Building 105
    * Ben Eysenbach: Tuesday at 3:00pm -- 5:00pm in CS Building 416
        * Ben's Feb 25 office hours will be held on Feb 28 (same place and time).
    * Yulai Zhao: Wednesday at 7:00pm -- 9:00pm in Friend Center 010; Monday at 8:00pm -- 9:00pm via [Zoom](https://princeton.zoom.us/j/6061324339)
    * Kaixuan Huang: Thursday at 4:00pm -- 6:00pm in Friend Center 010
    * Zihan Ding: Friday at 3:00pm -- 5:00pm in Friend Center 308; Sunday at 10:00am -- 12:00pm via [Zoom](https://princeton.zoom.us/j/7311017948?omn=92927588850)
* **Prerequisites**:
    * Intro to ML: COS 324, ECE 435, or equivalent
    * Probability: ORF 309 or equivalent
    * Linear Algebra
* **Textbook**: None are required, but see the [Syllabus](https://docs.google.com/document/d/1zI8TsqTqfQRoEDaaPzeVCcdVwV0UHn6D/edit?usp=sharing&ouid=114713110862674988879&rtpof=true&sd=true) for some books that might be useful if you're ever confused about any of the material in the course.
* **Questions?** Ask on [Ed](https://edstem.org/us/courses/54890/discussion/). We will not be using Canvas. Assignments will be submitted on Gradescope.

![...](ideogram.jpeg width=200px)

_**Reinforcement learning (RL)** is a machine learning technique that teaches agents how to make decisions that lead to good outcomes. This course will introduce fundamental concepts, important RL algorithms, and key challenges (e.g., exploration and generalization). The course will also highlight applications of RL to real-world problems, including health care and molecular science. Assignments will entail implementation of RL algorithms and mathematical analysis of these algorithms. Students will complete an open-ended final group project._

### Assignments

We have provided both the assignment and the TeX file for use as a template. Unless explicitly specified, please type up your solutions using TeX and submit both your compiled PDF and finished .ipynb on Gradescope.

* **Homework 0**: [latex](./hw/s25/hw0.tex); [ipynb](./hw/s25/hw0.ipynb); [sample pdf](./hw/s25/hw0.pdf) Due Date: **2/3/2025**
* **Homework 1**: [latex](./hw/s25/hw1.tex); [ipynb](./hw/s25/hw1.ipynb); [sample pdf](./hw/s25/hw1.pdf); (please read the notice [here](https://edstem.org/us/courses/69451/discussion/6133716)) Due Date: **2/10/2025**
* **Homework 2**: [latex](./hw/s25/hw2.tex); [sample pdf](./hw/s25/hw2.pdf); [dataset](https://drive.google.com/file/d/1wkmRlYrBszwzgRNNG0t-YZul1RDsFuGS/view?usp=drive_link) Due Date: **2/19/2025** (two-day extension)
* **Homework 3**: [latex](./hw/s25/hw3.tex); [sample pdf](./hw/s25/hw3.pdf); [ipynb](./hw/s25/hw3.ipynb) Due Date: **2/24/2025**
* **Homework 4**:
* **Homework 5**:
* **Homework 6**:
* **Homework 7**:
* **Homework 8**:

### Solutions

Solutions will be posted after each assignment is due.

* **Homework 0**: [solution](./hw/s25/hw0-solution.pdf)
* **Homework 1**: [solution](./hw/s25/hw1_solution.pdf); [coding_solution](./hw/s25/hw1_solution.ipynb)

### Lecture Notes

* **Lecture 1**: [lecture note](./s25_material/lecture-1-what-is-rl.pdf)
* **Lecture 2**: [lecture note](./s25_material/lecture-2-mdp.pdf); [code](./s25_material/Lecture-2-CartPole.ipynb)
* **Lecture 3**: [lecture note](./s25_material/lecture-3-bandits.pdf)
* **Lecture 4**: [lecture note](./s25_material/lecture-4-cem-mpc.pdf)
* **Lecture 5**: [lecture note](./s25_material/lecture-5-imitation.pdf)
* **Lecture 6**: [lecture note](./s25_material/lecture-6-policy-gradient.pdf)
* **Lecture 7**: [lecture note](./s25_material/lecture-7-value-functions.pdf)
* **Lecture 8**: [lecture note](./s25_material/lecture-8-policy-value-iteration.pdf)
* **Lecture 9**:
* **Lecture 10**:
* **Lecture 11**:
* **Lecture 13**:
* **Lecture 14**:
* **Lecture 15**:
* **Lecture 16**:
* **Lecture 17**:
* **Lecture 18**:
* **Lecture 19**:
* **Lecture 20**:
* **Lecture 21**:
* **Lecture 22**:

### Precept Notes

* **Week 1**: [precept note](./s25_material/precept-1-notes.pdf)
* **Week 2**: [precept note](./s25_material/precept-2-notes.pdf)
* **Week 3**: [precept note](./s25_material/precept-3-notes.pdf)
* **Week 4**:
* **Week 5**:
* **Week 6**:
* **Week 7**:
* **Week 8**:
* **Week 9**:

### Course Staff

![[Ben Eysenbach](https://ben-eysenbach.github.io/)](assets/s25/ben.jpg height=150)
![[Yulai Zhao](https://yulaizhao.com/)](assets/s25/yulai.png height=150)
![[Zihan Ding](https://quantumiracle.github.io/webpage/)](assets/s25/zihan.jpg height=150)
![[Jiayi Geng](https://jiayigeng.github.io/)](assets/s25/jiayi.jpeg height=150)
![[Cathy Ji](mailto:cj7280@princeton.edu)](assets/s25/cathy.png height=150)
![[Kaixuan Huang](https://hackyhuang.github.io/)](assets/s25/kaixuan.jpg height=150)
![[Alice Hou](mailto:ah5087@princeton.edu)](assets/s25/alice.jpg height=150)
![[Amanda Wang](mailto:aw4309@princeton.edu)](assets/s25/amanda.jpg height=150)
![[Leo Yu](mailto:ly4431@princeton.edu)](assets/s25/leo.jpg height=150)

------------