How good is generative AI at passing exams? What does this tell us about how we could design better assessments? Join us on Monday 4th December at 2pm GMT (UTC) to discuss a paper on this by Joyce Mahon, Brian Mac Namee and Brett Becker at University College Dublin published at UKICER earlier this year.  From the abstract:
We investigate the capabilities of ChatGPT (GPT-4) on second level (high-school) computer science examinations: the UK A-Level and Irish Leaving Certificate. Both are national, government-set / approved, and centrally assessed examinations. We also evaluate performance differences in exams made publicly available before and after the ChatGPT knowledge cutoff date, and investigate what types of question ChatGPT struggles with.
We find that ChatGPT is capable of achieving very high marks on both exams and that the performance difference before and after the knowledge cutoff date are minimal. We also observe that ChatGPT struggles with questions involving symbols or images, which can be mitigated when in-text information ‘fills in the gaps’. Additionally, GPT-4 performance can be negatively impacted when an initial inaccurate answer leads to further inaccuracies in subsequent parts of the same question. Finally, the element of choice on the Leaving Certificate is a significant advantage in achieving a high grade. Notably, there are minimal occurrences of hallucinations in answers and few errors in solutions not involving images.
These results reveal several strengths and weaknesses of these exams in terms of how generative AI performs on them and have implications for exam design, the construction of marking schemes, and could also shift the focus of what is examined and how.
We’ll be joined by the papers lead author Joyce, who will give us a lightning talk summary of her paper to start our discussion. All welcome, as usual we’ll be meeting on zoom details at sigcse.cs.manchester.ac.uk/join-us
Joyce Mahon, Brian MacNamee and Brett A. Becker (2023) No More Pencils No More Books: Capabilities of Generative AI on Irish and UK Computer Science School Leaving Examinations. In The United Kingdom and Ireland Computing Education Research conference (UKICER 2023), September 07–08, 2023, Swansea, Wales UK. ACM, New York, NY, USA, 7 pages. DOI: 10.1145/3610969.3610982
Programming is hard, or at least it used to be. AI code generators like Amazon’s CodeWhisperer, DeepMind’s AlphaCode, GitHub’s CoPilot, Replit’s Ghostwriter and many others now make programming easier, at least for some people, some of the time. What opportunities and challenges do these new tools present for educators? Join us on Zoom to discuss an award winning paper by Brett Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather and Eddie Antonio Santos at University College Dublin, the University of Auckland and Abilene Christian University on this very topic.  We’ll be joined by two of the co-authors who will present a lightning talk to kick-off our discussion, for our monthly ACM journal club meetup. Here’s the abstract of his paper:
The introductory programming sequence has been the focus of much research in computing education. The recent advent of several viable and freely-available AI-driven code generation tools present several immediate opportunities and challenges in this domain. In this position paper we argue that the community needs to act quickly in deciding what possible opportunities can and should be leveraged and how, while also working on overcoming otherwise mitigating the possible challenges. Assuming that the effectiveness and proliferation of these tools will continue to progress rapidly, without quick, deliberate, and concerted efforts, educators will lose advantage in helping shape what opportunities come to be, and what challenges will endure. With this paper we aim to seed this discussion within the computing education community.
Brett A. Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather, Eddie Antonio Santos (2023) Programming Is Hard – Or at Least It Used to Be: Educational Opportunities and Challenges of AI Code Generation in Proceedings of the 54th ACM Technical Symposium on Computer Science Education: SIGCSE 2023, pages 500–506, DOI: 10.1145/3545945.3569759
Maybe you wrote that code and maybe you didn’t. If AI helped you, such as the OpenAI Codex in GitHub Copilot, how did it solve your problem? How much did Artificial Intelligence help or hinder your solution? Join us to discuss a paper by Michel Wermelinger from the Open University published in the SIGCSE technical symposium earlier this month on this very topic.  We’ll be joined by Michel who will present a lightning talk to kick-off our discussion. Here’s the abstract of his paper:
The teaching and assessment of introductory programming involves writing code that solves a problem described by text. Previous research found that OpenAI’s Codex, a natural language machine learning model trained on billions of lines of code, performs well on many programming problems, often generating correct and readable Python code. GitHub’s version of Codex, Copilot, is freely available to students. This raises pedagogic and academic integrity concerns. Educators need to know what Copilot is capable of, in order to adapt their teaching to AI-powered programming assistants. Previous research evaluated the most performant Codex model quantitatively, e.g. how many problems have at least one correct suggestion that passes all tests. Here I evaluate Copilot instead, to see if and how it differs from Codex, and look qualitatively at the generated suggestions, to understand the limitations of Copilot. I also report on the experience of using Copilot for other activities asked of students in programming courses: explaining code, generating tests and fixing bugs. The paper concludes with a discussion of the implications of the observed capabilities for the teaching of programming.
Michel Wermelinger (2023) Using GitHub Copilot to Solve Simple Programming Problems in Proceedings of the 54th ACM Technical Symposium on Computer Science Education Pages SIGCSE 2023 page 172–178 DOI: 10.1145/3545945.3569830
It’s all very well getting an AI to write your code for you but neither writing code or reading code are the same as understanding code. So what is going on in novices brains when they learn to actually understand the code they are reading and writing? Join us on Monday 6th March at 2pm GMT to discuss a paper by Quintin Cutts and Maria Kallia from the University of Glasgow on this very topic , from the abstract:
An approach to code comprehension in an introductory programming class is presented, drawing on the Text Surface, Functional and Machine aspects of Schulte’s Block Model, and emphasising programming as a modelling activity involving problem and machine domains. To visually connect the domains and a program, a key diagram conceptualising the three aspects lies at the approach’s heart, alongside instructional exposition and exercises, which are all presented. Students find the approach challenging initially, but most recognise its value later, and identify, unexpectedly, the value of the approach for problem decomposition, planning and coding.
We’ll be joined by one of the co-authors (Quintin Cutts), who’ll give us a lightning talk summary of the paper to kick-off our journal club discussion.  Quintin has added: “You can’t write if you can’t read. In just four pages the paper outlines a classroom approach to developing in novices good code comprehension right from the start of an introductory course. There’s also some feedback on what students thought, a year later – spoiler – they seemed to get a lot from it. Anyone teaching introductory programming might find such a short paper thought provoking, even if they don’t pick up the technique in their teaching. Worth a quick read, and coming along to listen/add to the discussion…”
Quintin Cutts and Maria Kallia (2023) Introducing Modelling and Code Comprehension from the First Days of an Introductory Programming Class in CEP ’23: Proceedings of 7th Conference on Computing Education Practice Pages 21–24 DOI:10.1145/3573260.3573266
Automatic code generators have been with us a while, but how do modern AI powered bots perform on introductory programming assignments? Join us to discuss the implications of the OpenAI Codex on introductory programming courses on Monday 4th July at 2pm BST. We’ll be discussing a paper by James Finnie-Ansley, Paul Denny, Brett A. Becker, Andrew Luxton-Reilly and James Prather  for our monthly SIGCSE journal club meetup on zoom. Here is the abstract:
Recent advances in artificial intelligence have been driven by an exponential growth in digitised data. Natural language processing, in particular, has been transformed by machine learning models such as OpenAI’s GPT-3 which generates human-like text so realistic that its developers have warned of the dangers of its misuse. In recent months OpenAI released Codex, a new deep learning model trained on Python code from more than 50 million GitHub repositories. Provided with a natural language description of a programming problem as input, Codex generates solution code as output. It can also explain (in English) input code, translate code between programming languages, and more. In this work, we explore how Codex performs on typical introductory programming problems. We report its performance on real questions taken from introductory programming exams and compare it to results from students who took these same exams under normal conditions, demonstrating that Codex outscores most students. We then explore how Codex handles subtle variations in problem wording using several published variants of the well-known “Rainfall Problem” along with one unpublished variant we have used in our teaching. We find the model passes many test cases for all variants. We also explore how much variation there is in the Codex generated solutions, observing that an identical input prompt frequently leads to very different solutions in terms of algorithmic approach and code length. Finally, we discuss the implications that such technology will have for computing education as it continues to evolve, including both challenges and opportunities. (see accompanying slides and sigarch.org/coping-with-copilot/)
James Finnie-Ansley, Paul Denny, Brett A. Becker, Andrew Luxton-Reilly, James Prather (2022) The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming ACE ’22: Australasian Computing Education Conference Pages 10–19 DOI:10.1145/3511861.3511863