CSCI 6962/4140 Project Details, Fall 2025
Trustworthy Machine Learning (TML)
The second major component of the course (in addition to paper presentations) is the course project. This is your chance to step beyond reading about trustworthy machine learning and contribute to the field. The project is designed to give you first-hand research experience: asking questions, testing ideas, facing challenges, and creating new insights into how we can make machine learning systems more trustworthy. You learn best by doing, and this project is your opportunity to do real work at the frontier of TML.
Along the way, you will also practice a core skill of researchers: communicating your findings clearly and thoroughly enough to facilitate reproduction and confidence in your conclusions. Being able to share your ideas persuasively is as important as having the ideas themselves, and this project will give you practice in both.
Your project must connect directly to trustworthy machine learning, including but not limited to the following desiderata: robustness, fairness, interpretability, privacy, and alignment. Projects will be completed in groups of two. Groups will be initially assigned at random, but you may rearrange membership by mutual agreement before the project selection deadline. Working together will not only make the project more manageable, it will also give you experience in collaborative research, another essential part of how progress is made in our field.
Research Projects
You will conduct original research related to trustworthy ML, theoretical or applied. Projects that only apply existing methods to a dataset in a straightforward way are not acceptable.
Each project culminates in a prerecorded 20-minute presentation and a written report. The report does not have to be a certain number of pages; it must simply be the length required to be thorough.
Examples of potential research directions
- Extend recent (or old!) papers on ML or algorithms for ML to incorporate trustworthiness desiderata (e.g., fairness constraints, robustness guarantees, privacy budgets, interpretability requirements).
- Novel methods for certifiable adversarial robustness (training or verification).
- Defenses against data poisoning or backdoor attacks; or improved attack models and evaluations.
- Fairness under distribution shift, selection bias, or adversarial/correlated missingness.
- Interpretability methods with validated faithfulness or usability criteria.
- Privacy–utility tradeoffs in federated, distributed, or on-device learning.
- Designing better benchmarks for trustworthiness desiderata and using them to evaluate existing models.
Grading Rubric and Deadlines
| Task | Due date (11:59 pm ET) | Percentage | Details |
|---|---|---|---|
| Project selection | October 10 | 20 | Submit via Piazza with title (GX project selection). Provide: the problem you will tackle and why it is novel/interesting, the techniques you plan to use, and how it relates to trustworthy ML. These project ideas must be pre-approved. |
| Project progress report | November 21 | 25 | Email me a PDF with title (GX project progress report). Approximately two-thirds of the final report in the format of an ICLR workshop paper: include introduction, background, and related work; experimental design; and initial results and preliminary conclusions (aim for at least ~50% of experiments completed so we can give actionable feedback). |
| Presentations and report | December 8 | 35 | Presentations are prerecorded to avoid timing issues. Submit your final report and a prerecorded 20-minute presentation, both uploaded to RPI Box. The Box link should be in the deliverable repo that you submit. We will have 5 minutes of discussion after each talk; the group must attend to field questions. |
| Deliverables | December 8 | 20 | Email me a link to a public GitHub repository with well-documented, reproducible code, with title (GX project deliverables). Include datasets (if small) or scripts to download and preprocess to your expected format. Instructions should be given for using the repo to reproduce your results. The repo should also contain the final report, your slide deck, and the Box link to your presentation. |
Project Selection
Submit one (pre-approved!) proposal per team. Include a clear statement of the research problem and why it is novel/interesting, planned techniques, and its connection to trustworthy ML. This can be a short paragraph.
Project Progress Report
The progress report is intended to be a near-complete draft of your final paper. It will be graded with the understanding that it occurs about two-thirds of the way through the project timeline.
Your progress report should include:
- A clear introduction and statement of the research problem.
- Background and related work sections that demonstrate your understanding of prior research and its relation to your work.
- A description of your methodology and experimental design (datasets, models, evaluation metrics, baselines).
- Initial results, with at least half of your experiments completed by this stage.
- Discussion of any challenges or open questions you are facing.
The goal of the progress report is to demonstrate that you are on track and to provide enough substance for me to give useful feedback. This feedback should guide the completion of your remaining experiments and the writing of your final report. Remember that the report does not have a required page length — it must simply be long enough to explain your work thoroughly.
Presentations and Report
The two components serve complementary purposes. The presentation demonstrates the ability to communicate ideas clearly and persuasively, while the report emphasizes rigor, completeness, and reproducibility in an archival form. During class, we will play the recorded 20-minute talk and then hold a 5-minute discussion period in which the group is expected to participate and answer questions.
To receive full credit on the presentation, address:
- The problem you investigated and why it matters for trustworthy ML.
- Your main contributions and how they compare to prior work.
- An intuitive and technically accurate explanation of your method or theoretical results.
- Evaluation against relevant baselines (or formal guarantees if theoretical).
- Limitations and avenues for future work.
Deliver the talk clearly, with professional slides, appropriate pacing, and participation from all group members, and engage actively in the live question and answer session following the talk.
To receive full credit, the report must:
- Motivate the problem and explain its importance in the context of TML.
- Situate the work within the existing literature, identifying related approaches and clarifying the contribution.
- Describe methods and experimental design, including datasets, models, evaluation metrics, and baselines, with enough detail for reproduction.
- Present results comprehensively and analyze their meaning and limitations.
- Follow professional style and organization, using ICLR workshop formatting, with clear figures, legible equations, and polished writing.
Deliverables
The deliverables ensure that your research is reproducible and transparent. By sharing code and data, you make it possible for others to replicate your experiments. By preparing slides and a written report, you practice explaining your work clearly and professionally. These components together mirror real research practice, where reproducibility, communication, and clarity are as important as technical results.
You will submit the following:
- GitHub/GitLab repository: Well-documented, cross-platform code with clear instructions to reproduce your experimental results. Python, Julia, R, and C/C++ are acceptable. Include datasets if they are small enough, or provide scripts that download and preprocess them into the format your code expects. The repo should also contain the following two deliverables.
- Slide deck: A 20-minute presentation with legible figures and properly typeset math. This deck supports your prerecorded talk and should explain your contributions in a clear, concise manner.
- Final report: A workshop-style paper (ICLR format). There is no page limit — it should simply be as long as needed to thoroughly present your problem, methods, results, and conclusions.
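As one illustration of what "clear instructions to reproduce your experimental results" can look like in practice, the sketch below shows a small helper module a repository might include. It is a minimal example under stated assumptions, not a required structure: the function names (`sha256_of`, `verify_dataset`, `write_manifest`, `seeded_rng`) and the manifest layout are hypothetical and only use the Python standard library. The idea is to check that a downloaded dataset matches the one used in the report, fix the random seed, and record the settings of each run.

```python
import hashlib
import json
import platform
import random
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the local file differs from the expected dataset."""
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise ValueError(f"{path}: expected {expected_sha256}, got {actual}")


def seeded_rng(seed: int) -> random.Random:
    """Return a private generator so results do not depend on global state."""
    return random.Random(seed)


def write_manifest(out_path: Path, seed: int, config: dict,
                   data_files: list[Path]) -> dict:
    """Record seed, config, data hashes, and environment for one run."""
    manifest = {
        "seed": seed,
        "config": config,  # hypothetical: whatever hyperparameters you use
        "data_sha256": {p.name: sha256_of(p) for p in data_files},
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    out_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

A training script could call `verify_dataset` right after its download step and `write_manifest` before training starts, so that every reported number can be traced to a seed, a configuration, and an exact copy of the data. If your project uses a numerical library with its own random state (for example NumPy or PyTorch), that library must be seeded separately; this sketch covers only Python's `random` module.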
Guidance
See the CS Grad Skills Seminar slide decks for expectations and best practices:
- Giving Presentations: 2025 seminar page
- Writing Papers: 2024 seminar page