Some Thoughts on Teaching Machine Learning post-ChatGPT
“I knew nothing, persisting in unshaken faith that the time of cruel miracles was not passed.” -- Stanisław Lem, Solaris
In late 2022 OpenAI released ChatGPT, a chat interface to their (at the time) most advanced generative artificial intelligence model, GPT-3.5. This was the first time most of us encountered general AI, capable of tasks which not so long ago seemed far out of reach, perhaps even for hundreds of years. We are now at a point where the most advanced AI models can usefully assist with mathematical and scientific research and can match top human performers in international Olympiad-level mathematics, physics and programming competitions. The transition from the impossible to the routine (still in progress) has taken three years.
What does the success of AI mean for teaching machine learning? There is an ongoing discussion of how AI affects teaching in general. Nearly the entire undergraduate curriculum, and a large portion of graduate classes across the disciplines, are within the domain of competence of modern AI systems. This makes these systems invaluable companions for motivated students -- like experienced private tutors: patient, always available and nearly free. But they are also tools for avoiding learning altogether for students who are less interested. Cheating on tests and assignments is certainly nothing new, going at least as far back as the ancient Chinese imperial examinations. Nevertheless, the power, immediacy and low friction of using AI for that purpose are qualitatively different. It is now reasonable to assume that nearly any undergraduate assignment, and most graduate assignments, can be solved effortlessly within minutes. Furthermore, while AI models are not fully reliable for certain tasks, their capabilities and scope are growing rapidly. Any “AI-resistant” assignments (popular a year or so ago, perhaps less so now) are at best a temporary solution, good only until the next model is released a few months down the line.
There has been an ongoing discussion of whether there should be more in-class tests. It is useful to remember that the goal of a test is two-fold: (1) to produce a grade, which serves as a certificate of competence for employers, graduate schools and other external parties; and (2) to provide a development checkpoint for students to evaluate their progress internally. An in-class exam is still useful for validating competence. However, the value of such certificates is going down, at least in industry, as AI tools begin to replace junior programmers and data scientists. There seems to be a shift from the skills measured by grades to other indicators of ability, such as projects and artifacts. This process will likely continue and possibly accelerate.
As far as (2) is concerned, interested students are likely to study whether exams are given in class or not.
However, while the promise and the threat of AI permeate our educational system, the challenges modern AI brings to teaching machine learning are deeper, going to the core of the subject. Here are some thoughts and personal experiences as someone actively involved in ML research and in teaching ML classes at graduate and undergraduate levels for a number of years. The technologies underlying modern AI are based on classical machine learning and optimization tools such as neural networks and Stochastic Gradient Descent (SGD). These methods are certainly not new and have been taught in classes for many years. Until relatively recently, a little more than ten years ago, we thought that we had a good fundamental understanding of the underlying principles. This certainty was significantly diminished in the early 2010s with the advent of deep learning, a set of neural network-based techniques for problems such as image analysis and natural language processing. At that time, the models were still specialized to narrowly defined tasks, predating the generalist nature of modern post-ChatGPT AI. Nevertheless, deep learning had become highly successful, surpassing preceding techniques by a wide margin on a range of benchmarks. While in many ways the practice of deep learning circa 2015 was not substantially different from that of 10 or even 20 years prior, certain “modern” aspects of best practices ran contrary to traditional intuitions. These included the persistent use of very large (tiny by the standards of 2025!) “over-parameterized” models, with the number of parameters often exceeding the number of training examples; the fact that fitting the data exactly, classically considered overfitting, was mostly fine; and “the unreasonable effectiveness” of SGD for non-convex optimization. It seemed that much of the received statistical ML and optimization wisdom, such as over-fitting and bias-variance tradeoffs, no longer applied.
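The puzzle of benign over-parameterization can be seen in a few lines of code. The sketch below is purely illustrative (random ReLU features standing in for a wide network's hidden layer; the data, dimensions and feature map are all invented for the example): with 100 parameters and only 20 noisy training points, the minimum-norm least-squares solution fits the training data essentially exactly -- the "overfitting" that classical intuition warns against.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100  # 20 training points, 100 random features: over-parameterized

# Toy regression data with label noise
X = rng.normal(size=(n, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Random ReLU features: a crude stand-in for a wide network's hidden layer
W = rng.normal(size=(5, d))
Phi = np.maximum(X @ W, 0.0)

# pinv returns the minimum-norm solution among all interpolating ones
w = np.linalg.pinv(Phi) @ y
train_mse = np.mean((Phi @ w - y) ** 2)
print(f"train MSE: {train_mse:.2e}")  # essentially zero: the noisy data is fit exactly
```

The model interpolates noisy labels, yet (as the double-descent literature shows in settings like this) such minimum-norm interpolators can still generalize well; the implicit bias toward small norm does the regularizing that an explicit penalty would otherwise provide.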
It became challenging to teach those concepts in machine learning classes knowing how far theory deviated from practice. An instructor should be honest with the students. It is far better to admit ignorance than to mislead. Yet ignorance (or, to put it politely, uncertainty) is not what students taking machine learning classes are looking for. That tension inevitably led to muddled and confusing messaging. However, by the early 2020s things were looking up. There was significant progress in theoretical understanding of modern ML and statistics, based on a broad effort by researchers, including some of my work on topics such as “double descent” (an extension of the classical U-shaped generalization curve, a term we introduced), over-parameterization and non-convex optimization. The theoretical understanding was still largely limited to greatly simplified models and settings, and the results were subtle, not easily conveyable in introductory ML classes, and possibly contradictory to what the students had learned before. Still, I felt that interested students could gain a basic understanding of the modern state of the subject. Most importantly, I was not misleading them with claims which were incorrect or incomplete.
The situation has changed qualitatively yet again in the last three years. While the insights gained from the deep learning of the 2010s still apply, we have little understanding of how autoregressive models, such as GPT, which are trained to simply predict the next word (token), learn to exhibit broad intelligence on a variety of seemingly unrelated tasks. While AI is changing the world, and is itself evolving, our understanding lags far behind the progress of technology. How do our classes stay relevant in this fluid world? Unquestionably, many students are fine with the uncertainty -- they would like to understand the fundamentals and to help make progress toward understanding modern AI. Still, for the majority of students, the material in ML classes is remote from the reality of the world as it stands now, even allowing for the enduring tension between theory and practice. On the other hand, for those of us teaching these courses it makes little sense to teach current technologies as the core material -- they change daily and do not form a lasting foundation of knowledge. Is there a fundamental basis for prompt engineering or agentic workflows? It seems unlikely. Furthermore, many excellent resources on building and using modern systems are easily available online.
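The autoregressive recipe itself is disarmingly simple, which is part of the mystery. A toy version (character-level bigram counts standing in for a transformer; the text and all names here are invented for illustration, and nothing about this resembles how GPT is actually trained at scale) shows the whole loop: estimate the distribution of the next token given the context, then sample from it repeatedly.

```python
from collections import Counter, defaultdict
import random

text = "to be or not to be that is the question "

# Estimate P(next char | current char) by counting adjacent pairs --
# a one-character context instead of a transformer's long context window
counts = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    counts[a][b] += 1

def sample_next(ch, rng):
    """Sample the next character in proportion to observed counts."""
    chars, weights = zip(*counts[ch].items())
    return rng.choices(chars, weights=weights)[0]

# Autoregressive generation: feed each sampled token back in as context
rng = random.Random(0)
out = "t"
for _ in range(30):
    out += sample_next(out[-1], rng)
print(out)
```

Scaling this same predict-and-feed-back loop from bigram counts to trillion-parameter transformers trained on internet-scale text is, mechanically, the entire story; why broad competence emerges from it is precisely what we do not yet understand.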
Where does this leave teaching ML? We should emphasize the durable and continuously evaluate the transient. Clearly, the mathematical backbone -- linear algebra, optimization, modern statistical learning theory and generalization -- is a key ingredient of an ML class. At the same time, we need to discuss some elements of modern systems, such as transformer architectures, autoregressive prediction, “thinking models”, and other modern successes (perhaps diffusion-based generative models and something about GPU design). The aim is to help students ask precise questions and detect failure modes. They should be fluent with today’s tools but not captive to them, and able to act under uncertainty while retaining clarity of thought. Is that too much to ask or too little? We will find out soon. The world will change in the next few years. Adapting to these wonderful and terrible transformations is the only path forward, for individuals and institutions alike.

