I am co-organizing an RL reading group at the Vector Institute, held fully online. Participants meet at this Zoom link: https://carleton-ca.zoom.us/j/94458342852?pwd=BALQfADPJGSfdMSNA32wp5cNnN8YBs.1 (passcode: 552958) and take turns presenting a recent RL research paper or a recent RL library/environment of interest to others. The schedule is maintained here: https://docs.google.com/spreadsheets/d/1SX-l9vGe9jy35ibGnAohQgvIsg_nmvGjx9LfqrMzSek/edit?gid=0#gid=0. Anyone interested in RL is welcome to join future meetings or sign up to present on that Google Sheet. The meetings take place on Mondays from 3 pm to 4 pm Eastern Time.
Here is the presentation schedule for Winter 2026.
| Date | Presenter | Paper Topic | Link | Email |
|---|---|---|---|---|
| Jan 19 2026 | Sriram Ganapathi Subramanian | The Big World Hypothesis and its Ramifications for Artificial Intelligence | https://openreview.net/pdf?id=Sv7DazuCn8 | [email protected] |
| Jan 26 2026 | No Meeting (ICML deadline) | | | |
| Feb 2 2026 | Michal Lisicki | KL-Regularized Reinforcement Learning is Designed to Mode Collapse | https://openreview.net/forum?id=flBRtdIihA | [email protected] |
| Feb 9 2026 | Wenhao Li | A Comedy of Estimators: On KL Regularization in RL Training of LLMs | https://openreview.net/forum?id=MkLHbwSMP3 | [email protected] |
| Feb 16 2026 | No Meeting (Family Day) | | | |
| Feb 23 2026 | Fae Moradi | Understanding R1-Zero-Like Training: A Critical Perspective | https://arxiv.org/abs/2503.20783 | [email protected] |
| Mar 2 2026 | Emiliano Penaloza | Privileged Information Distillation for Language Models | https://arxiv.org/abs/2602.04942 | [email protected] |
