<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link rel="icon" href="https://www.ri.cmu.edu/wp-content/uploads/2017/04/ri-favicon.ico">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>16824 Fall 2025</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.2.0/css/bootstrap.min.css">
<link href="../css/style.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div class="container">
<table border="0" align="center">
<tr>
<td width="900" align="center" valign="middle"><h3>Carnegie Mellon University</h3>
<span class="title"> <h1>16-824: Visual Learning and Recognition</h1></span></td>
</tr>
<tr>
<td colspan="3" align="center"><h3>VIS LRN &amp; RECOG</h3> </td>
</tr>
<tr>
<td colspan="3" align="center"><h3>Fall 2025</h3> </td>
</tr>
<tr>
<td colspan="3" align="center"><span class="menubar">[ <a href="index.html">Home</a> | <a href="schedule.html">Schedule</a> | <a href="resources.html">Assignments and Resources</a> | <a href="https://piazza.com/cmu/fall2025/16824/">Piazza</a> | <a href="previous.html">Previous Offerings</a>]</span></td>
</tr>
<tr>
<td colspan="3" align="center"><h5>Mondays and Wednesdays, 2:00-3:20pm, TEP 1403</h5> </td>
</tr>
</table>
<br>
<p align="center"><img class="teaser" src="images/teaser.jpg" height="400" align="middle" /></p>
<!--One Ring to rule them all, One Ring to find them,
One Ring to bring them all, and in the darkness bind them, test-->
<h2>Course Overview</h2>
<p style="text-align:justify"> <b>Key Topics:</b> Visual Recognition, Deep Learning, Image Classification, Object Detection, Video Understanding, 3D Scene Understanding, and Generative Models for Images and Videos. </p>
<p style="text-align:justify"> <b>Description:</b> This graduate-level computer vision course focuses on representation and reasoning for large amounts of data (images, videos, and associated tags, text, GPS locations, etc.) toward the ultimate goal of understanding the visual world surrounding us. We will be reading an eclectic mix of classic and recent papers on topics including Theories of Perception, Mid-level Vision (Grouping, Segmentation, Poses), Object and Scene Recognition, 3D Scene Understanding, Action Recognition, Contextual Reasoning, Joint Language and Vision Models, Deep Generative Models, etc. We will be covering a wide range of supervised, semi-supervised and unsupervised approaches for each of the topics above.</p>
<p style="text-align:justify"> <b>Course Relevance:</b> The course is relevant to students who want to understand and implement state-of-the-art deep learning and computer vision algorithms.</p>
<p style="text-align:justify"> <b>Course Goals:</b> There are three primary course goals. First, the course aims to familiarize students with the fundamental concepts of deep learning models. Second, the course helps students understand state-of-the-art methods in visual recognition and understanding. Third, through the programming assignments and final project, students have an opportunity to learn how to build practical computer vision systems.</p>
<!--<h2>Announcements</h2>-->
<h2>Course Staff</h2>
<p>Please use the course <a href="https://piazza.com/cmu/fall2025/16824">Piazza page</a> for all communication with course staff. </p>
<div class="col-md-12">
<h3>Instructor</h3>
<div class="instructor">
<a href="https://www.cs.cmu.edu/~junyanz/">
<div class="instructorphoto"><img src="images/jun-yan.jpg"></div>
<div>Jun-Yan Zhu</div>
</a>
</div>
</div>
<div class="col-md-12">
<h3>TAs</h3>
<div class="instructor">
<a href="https://ananyabal.github.io/">
<div class="instructorphoto"><img src="images/ananya.jpg"></div>
<div>Ananya Bal</div>
</a>
</div>
<div class="instructor">
<a href="https://eungyeupkim.github.io/">
<div class="instructorphoto"><img src="images/eungyeup.jpeg"></div>
<div>Eungyeup Kim</div>
</a>
</div>
<div class="instructor">
<a href="https://jaykarhade.github.io/">
<div class="instructorphoto"><img src="images/jkarhade.jpg"></div>
<div>Jay Karhade</div>
</a>
</div>
<div class="instructor">
<a href="https://ariannaliu.github.io/">
<div class="instructorphoto"><img src="images/Zhixuan.jpg"></div>
<div>Zhixuan Liu</div>
</a>
</div>
</div>
<div class="col-md-12">
<h3>Assignment Structure</h3>
<ul>
<li>Class Participation (10%): Participate in class and in the online Piazza discussion.
Students must read one paper per class and post a few lines (question, answer, thoughts, insight, etc.)
within one week of the corresponding class day. </li>
<li>Homework Assignments (45%): Submit all homework assignments on time. Each assignment is worth 15% of the overall grade.</li>
<li>Final (group) Project (45%):
<ul>
<li>Complete an independent research project in groups of 3-4.</li>
<li>Submit a project proposal.</li>
<li>Present the project in class.</li>
<li>Write up your findings in a short paper (4-8 pages, standard CVPR template).</li>
</ul>
</li>
</ul>
</div>
<div class="col-md-12">
<h3> Learning Resources </h3>
<p> Resources will include lecture slides, textbooks, webpages, Colab notebooks, videos, and a paper reading list. Additional coding tutorials on PyTorch will be provided. </p>
<p>There is no official textbook for this course, but you may find the following textbooks useful.</p>
<ul>
<li><a href="https://szeliski.org/Book/">“Computer Vision: Algorithms and Applications”</a>, Richard Szeliski, 2010</li>
<li><a href="https://www.deeplearningbook.org/">“Deep Learning”</a>, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016</li>
<li><a href="https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/">“Foundations of Computer Vision”</a>, Antonio Torralba, Phillip Isola and William T. Freeman, 2024</li>
</ul>
</div>
<div class="col-md-12">
<h3> Extra Time Commitments </h3>
<p> This course requires about 9 hours per week for assignments, projects, and study. </p>
</div>
<div class="col-md-12">
<h3> Collaboration Policy </h3>
<p> Collaboration is encouraged, but the work you submit for assignments is expected to be entirely your own.
That is, the writing and code must be yours, and you must fully understand everything that you submit.
Discussing a paper or the details of how to solve a problem is fine, but you must write your submission yourself.
Please list the collaborators with whom you discussed the assignment in your write-up.
If we find nearly identical work without proper acknowledgment of collaborators, we will take action according to university policies.
For more, see the <a href="https://www.cmu.edu/policies/student-and-student-life/academic-integrity.html">CMU academic integrity guidelines</a>.</p>
</div>
<div class="col-md-12">
<h3> Use of Large Language Models</h3>
<p> Using a large language model (e.g., ChatGPT, Copilot, or Cursor) to generate any part of your programming assignments or Piazza posts is strictly prohibited and is a violation of academic integrity.
For the final project, you are permitted to use LLMs. If you do, you must acknowledge this in your final report and submit a log (or chat history) of all prompts used to generate project content.
Failure to properly document your use of AI on the project is also considered an academic integrity violation.
</p>
</div>
<div class="col-md-12">
<h3> Late Policy </h3>
<p> For the programming assignments, students will be allowed a total of <b>five</b> late days per semester; each additional late day will incur a 10% continuously prorated penalty. Your code should be easy for the TAs to run.
Make sure to start early and complete your assignments on time!
Please note that late days do not apply to any part of the final project, including the project proposal and the final project report.
This policy will be enforced strictly. </p>
</div>
<div class="col-md-12">
<h3> Prerequisites </h3>
<p> While there are no formal prerequisites, this course assumes familiarity with computer vision (16-720 or similar) and machine learning (10-601 or similar). If you have not taken courses covering this material, consult with the instructor.
Additionally, you must have prior experience using PyTorch.
</p>
</div>
<p> </p>
<hr />
<p> <small>Website template modified from <a href="https://phillipi.github.io/6.882/2020/">here</a></small> </p>
</div>
</body>
</html>