Join us to discuss why LLM-enhanced Programming Error Messages are Ineffective in Practice on Monday 2nd December at 2pm GMT (UTC)

Icon by LAFS on flaticon.com

Large Language Models (LLMs) can help explain programming error messages, and these explanations tend to improve as the underlying models are trained on more source code. However, it is unknown to what extent novice programmers can effectively use these automatically generated explanations, from tools like GitHub Copilot and ChatGPT, to debug their programs. Join us to discuss a paper on this by Eddie Antonio Santos and Brett Becker [1], which won a best paper award at UKICER.com earlier this year. We’ll be joined by the paper’s lead author, Eddie Antonio Santos, who’ll give a lightning talk to kick off our discussion. From the abstract:

The sudden emergence of large language models (LLMs) such as ChatGPT has had a disruptive impact throughout the computing education community. LLMs have been shown to excel at producing correct code to CS1 and CS2 problems, and can even act as friendly assistants to students learning how to code. Recent work shows that LLMs demonstrate unequivocally superior results in being able to explain and resolve compiler error messages—for decades, one of the most frustrating parts of learning how to code. However, LLM-generated error message explanations have only been assessed by expert programmers in artificial conditions. This work sought to understand how novice programmers resolve programming error messages (PEMs) in a more realistic scenario. We ran a within-subjects study with 𝑛 = 106 participants in which students were tasked to fix six buggy C programs. For each program, participants were randomly assigned to fix the problem using either a stock compiler error message, an expert-handwritten error message, or an error message explanation generated by GPT-4. Despite promising evidence on synthetic benchmarks, we found that GPT-4 generated error messages outperformed conventional compiler error messages in only 1 of the 6 tasks, measured by students’ time-to-fix each problem. Handwritten explanations still outperform LLM and conventional error messages, both on objective and subjective measures.
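
To make the study setup concrete, here is a minimal sketch of the kind of LLM-enhanced error message explanation the paper evaluates: a buggy C program and its stock compiler error are handed to GPT-4, which is asked to explain the error for a novice. The buggy program, the prompt wording and the use of the OpenAI Python client below are illustrative assumptions, not the authors’ materials.

# Illustrative sketch only: the prompt, model settings and buggy C program are
# assumptions for illustration, not the materials used in the paper.
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

buggy_c = '''#include <stdio.h>

int main(void) {
    int n = 5          /* missing semicolon */
    printf("%d\\n", n);
    return 0;
}'''

# A typical stock compiler message for the bug above (e.g. from gcc):
stock_error = "error: expected ',' or ';' before 'printf'"

response = client.chat.completions.create(
    model="gpt-4",  # the model named in the paper
    messages=[
        {"role": "system",
         "content": "Explain this C compiler error to a novice programmer."},
        {"role": "user",
         "content": f"Program:\n{buggy_c}\n\nCompiler error:\n{stock_error}"},
    ],
)
print(response.choices[0].message.content)  # the LLM-generated explanation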

As usual, we’ll be meeting on zoom, all welcome, details at sigcse.cs.manchester.ac.uk/join-us.

References

  1. Eddie Antonio Santos and Brett A. Becker (2024) Not the Silver Bullet: LLM-enhanced Programming Error Messages are Ineffective in Practice. In UKICER ’24: Proceedings of the 2024 Conference on United Kingdom & Ireland Computing Education Research. DOI: 10.1145/3689535.3689554

In Memory of Brett Becker

We are deeply saddened to hear of Brett Becker’s tragic passing. Brett was a regular speaker, supporter and attendee at SIGCSE journal club since we started in 2020, and we have often discussed his papers here. We’d planned to discuss another of Brett’s papers at our October meetup but, in light of his passing, we’ve postponed that to a later date.

Brett was an accomplished researcher and an active member of the international computing education research community. Among his many professional activities and accomplishments, Brett was influential in the sigcse.org community in Europe, America and beyond, where he served as vice chair. He was program co-chair of the inaugural ukicer.com conference in 2019, and again in 2022, and served on the UKICER steering committee. He helped to ensure Ireland was a cornerstone of UKICER, and co-founded sigcseire.acm.org, the SIGCSE Ireland chapter. He was an energetic and astute proponent of computing education in Ireland and globally, always a pleasure to work with, and he will be greatly missed.

Plans to honor and remember Brett will be distributed to the SIGCSE-MEMBERS@LISTSERV.ACM.ORG mailing list in due course; this is an open list that anyone can subscribe to.

Join us at Durham University on 5th January 2024 to discuss Computing Education Practice (CEP)

Rather than meeting online in January, we’ll be meeting in person. So join us at Durham University for the annual Computing Education Practice (CEP) conference, which takes place on Friday 5th January, with a pre-conference dinner on the evening of Thursday 4th January.

Thanks to our program chair Jane Waite, general chair Ryan Crosby and the program committee for organising this event.

The full conference program and registration details are available at cepconference.webspace.durham.ac.uk/programme

Join us to discuss the ability of generative AI to pass exams on 4th December at 2pm GMT

CC-licensed exam image from flaticon.com

How good is generative AI at passing exams? What does this tell us about how we could design better assessments? Join us on Monday 4th December at 2pm GMT (UTC) to discuss a paper on this by Joyce Mahon, Brian Mac Namee and Brett Becker at University College Dublin, published at UKICER earlier this year [1]. From the abstract:

We investigate the capabilities of ChatGPT (GPT-4) on second level (high-school) computer science examinations: the UK A-Level and Irish Leaving Certificate. Both are national, government-set / approved, and centrally assessed examinations. We also evaluate performance differences in exams made publicly available before and after the ChatGPT knowledge cutoff date, and investigate what types of question ChatGPT struggles with.

We find that ChatGPT is capable of achieving very high marks on both exams and that the performance differences before and after the knowledge cutoff date are minimal. We also observe that ChatGPT struggles with questions involving symbols or images, which can be mitigated when in-text information ‘fills in the gaps’. Additionally, GPT-4 performance can be negatively impacted when an initial inaccurate answer leads to further inaccuracies in subsequent parts of the same question. Finally, the element of choice on the Leaving Certificate is a significant advantage in achieving a high grade. Notably, there are minimal occurrences of hallucinations in answers and few errors in solutions not involving images.

These results reveal several strengths and weaknesses of these exams in terms of how generative AI performs on them and have implications for exam design, the construction of marking schemes, and could also shift the focus of what is examined and how.

We’ll be joined by the paper’s lead author, Joyce, who will give us a lightning talk summary of her paper to start our discussion. All welcome, as usual we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us.

References

  1. Joyce Mahon, Brian Mac Namee and Brett A. Becker (2023) No More Pencils No More Books: Capabilities of Generative AI on Irish and UK Computer Science School Leaving Examinations. In The United Kingdom and Ireland Computing Education Research Conference (UKICER 2023), September 07–08, 2023, Swansea, Wales, UK. ACM, New York, NY, USA, 7 pages. DOI: 10.1145/3610969.3610982

Join us on zoom to discuss the implications of programming getting easier, Monday 15th May at 2pm BST

Programming is hard, or at least it used to be. AI code generators like Amazon’s CodeWhisperer, DeepMind’s AlphaCode, GitHub’s Copilot, Replit’s Ghostwriter and many others now make programming easier, at least for some people, some of the time. What opportunities and challenges do these new tools present for educators? Join us on Zoom to discuss an award-winning paper on this very topic by Brett Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather and Eddie Antonio Santos at University College Dublin, the University of Auckland and Abilene Christian University [1]. We’ll be joined by two of the co-authors, who will present a lightning talk to kick off our discussion at our monthly ACM journal club meetup. Here’s the abstract of their paper:

The introductory programming sequence has been the focus of much research in computing education. The recent advent of several viable and freely-available AI-driven code generation tools presents several immediate opportunities and challenges in this domain. In this position paper we argue that the community needs to act quickly in deciding what possible opportunities can and should be leveraged and how, while also working on overcoming or otherwise mitigating the possible challenges. Assuming that the effectiveness and proliferation of these tools will continue to progress rapidly, without quick, deliberate, and concerted efforts, educators will lose advantage in helping shape what opportunities come to be, and what challenges will endure. With this paper we aim to seed this discussion within the computing education community.

All welcome, as usual we’ll be meeting on zoom at 2pm BST (UTC+1), details at sigcse.cs.manchester.ac.uk/join-us. Thanks to Sue Sentance at the University of Cambridge for nominating this paper for discussion.

See also linkedin.com/posts/duncanhull_ai-codewhisperer-alphacode-activity-7051921278923915264-7i_5

References

  1. Brett A. Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather and Eddie Antonio Santos (2023) Programming Is Hard – Or at Least It Used to Be: Educational Opportunities and Challenges of AI Code Generation. In Proceedings of the 54th ACM Technical Symposium on Computer Science Education (SIGCSE 2023), pages 500–506. DOI: 10.1145/3545945.3569759

Join us to discuss the implications of the OpenAI Codex on introductory programming on Monday 4th July at 2pm BST


Automatic code generators have been with us for a while, but how do modern AI-powered bots perform on introductory programming assignments? Join us to discuss the implications of the OpenAI Codex on introductory programming courses on Monday 4th July at 2pm BST. We’ll be discussing a paper by James Finnie-Ansley, Paul Denny, Brett A. Becker, Andrew Luxton-Reilly and James Prather [1] for our monthly SIGCSE journal club meetup on zoom. Here is the abstract:

Recent advances in artificial intelligence have been driven by an exponential growth in digitised data. Natural language processing, in particular, has been transformed by machine learning models such as OpenAI’s GPT-3 which generates human-like text so realistic that its developers have warned of the dangers of its misuse. In recent months OpenAI released Codex, a new deep learning model trained on Python code from more than 50 million GitHub repositories. Provided with a natural language description of a programming problem as input, Codex generates solution code as output. It can also explain (in English) input code, translate code between programming languages, and more. In this work, we explore how Codex performs on typical introductory programming problems. We report its performance on real questions taken from introductory programming exams and compare it to results from students who took these same exams under normal conditions, demonstrating that Codex outscores most students. We then explore how Codex handles subtle variations in problem wording using several published variants of the well-known “Rainfall Problem” along with one unpublished variant we have used in our teaching. We find the model passes many test cases for all variants. We also explore how much variation there is in the Codex generated solutions, observing that an identical input prompt frequently leads to very different solutions in terms of algorithmic approach and code length. Finally, we discuss the implications that such technology will have for computing education as it continues to evolve, including both challenges and opportunities. (see accompanying slides and sigarch.org/coping-with-copilot/)
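
The abstract above describes Codex’s basic operation: a natural language problem description goes in, candidate solution code comes out. The original Codex models have since been deprecated, so the minimal sketch below illustrates the same prompt-to-code pattern using the current OpenAI chat completions API; the model name, prompt wording and the Rainfall-style problem statement are assumptions for illustration, not taken from the paper.

# Minimal sketch of the "natural language in, solution code out" pattern the
# paper explores; Codex itself is deprecated, so the model and prompt are assumptions.
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

problem = (
    "Write a Python function average_rainfall(readings) that returns the mean "
    "of the non-negative values in readings, stopping at the first 99999."
)  # an invented, Rainfall-Problem-style statement, not an exam question from the study

response = client.chat.completions.create(
    model="gpt-4",  # stand-in for the deprecated Codex models
    messages=[{"role": "user", "content": problem + "\nReturn only the code."}],
)
candidate = response.choices[0].message.content
print(candidate)  # candidate solution, which the paper would run against test cases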

All welcome, details at sigcse.cs.manchester.ac.uk/join-us. Thanks to Jim Paterson at Glasgow Caledonian University for nominating this month’s paper.

References

  1. James Finnie-Ansley, Paul Denny, Brett A. Becker, Andrew Luxton-Reilly and James Prather (2022) The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming. In ACE ’22: Australasian Computing Education Conference, pages 10–19. DOI: 10.1145/3511861.3511863

Join us to discuss sense of belonging in Computer Science on Mon 6th Dec at 2pm GMT

Image by surang on flaticon.com


Students’ sense of belonging has been shown to be associated with many attributes such as motivation and persistence. But what makes people feel like they belong in Computer Science? Join us on Monday 6th December at 2pm GMT to discuss belonging via a paper by Catherine Mooney and Brett Becker [1], which won a best paper award at SIGCSE 2021.

[There will be no SIGCSE journal club in November; however, we’ll be back in December as usual.]

From the abstract:

Sense of belonging, or belongingness, describes how accepted one feels in their academic community and is an important factor in creating inclusive learning environments. Belongingness is influenced by many factors including: students’ backgrounds and experiences; other people; environments (physical and virtual); academic discipline; external factors such as local, regional, and global issues; and time. 2020 has been dominated by several major events including the COVID-19 pandemic which dramatically impacted education. The Black Lives Matter movement has further raised global awareness of equality, diversity and inclusion not just in society, but in educational contexts. Climate change concerns, and politically charged news are also increasingly affecting our students.

We have been monitoring our undergraduate computing students’ sense of belonging for over three years, providing us with a unique opportunity to gauge recent changes during the pandemic. Our results surprised us. We found statistically significant reductions in the belongingness of students identifying as men as well as those not identifying as being part of a minority. However, investigating intersectionality of self-identified gender and minority status revealed more complicated and nuanced trends, illustrating important shifts in the belongingness of our students that we are only beginning to understand.

There’s a video summary of the paper here:

Video summary of the paper by Catherine Mooney and Brett Becker

As usual, we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us.

References

  1. Catherine Mooney and Brett A. Becker (2021) Investigating the Impact of the COVID-19 Pandemic on Computing Students’ Sense of Belonging. In SIGCSE ’21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, March 2021, pages 612–618. DOI: 10.1145/3408877.3432407

Join us to discuss failure rates in introductory programming courses on Monday 1st February at 2pm GMT

Icons made by freepik from flaticon.com

Following on from our discussion of ungrading, this month we’ll be discussing pass/fail rates in introductory programming courses [1]. Here is the abstract:

Vast numbers of publications in computing education begin with the premise that programming is hard to learn and hard to teach. Many papers note that failure rates in computing courses, and particularly in introductory programming courses, are higher than their institutions would like. Two distinct research projects in 2007 and 2014 concluded that average success rates in introductory programming courses world-wide were in the region of 67%, and a recent replication of the first project found an average pass rate of about 72%. The authors of those studies concluded that there was little evidence that failure rates in introductory programming were concerningly high.

However, there is no absolute scale by which pass or failure rates are measured, so whether a failure rate is concerningly high will depend on what that rate is compared against. As computing is typically considered to be a STEM subject, this paper considers how pass rates for introductory programming courses compare with those for other introductory STEM courses. A comparison of this sort could prove useful in demonstrating whether the pass rates are comparatively low, and if so, how widespread such findings are.

This paper is the report of an ITiCSE working group that gathered information on pass rates from several institutions to determine whether prior results can be confirmed, and conducted a detailed comparison of pass rates in introductory programming courses with pass rates in introductory courses in other STEM disciplines.

The group found that pass rates in introductory programming courses appear to average about 75%; that there is some evidence that they sit at the low end of the range of pass rates in introductory STEM courses; and that pass rates both in introductory programming and in other introductory STEM courses appear to have remained fairly stable over the past five years. All of these findings must be regarded with some caution, for reasons that are explained in the paper. Despite the lack of evidence that pass rates are substantially lower than in other STEM courses, there is still scope to improve the pass rates of introductory programming courses, and future research should continue to investigate ways of improving student learning in introductory programming courses.

Anyone is welcome to join us. As usual, we’ll be meeting on zoom, see sigcse.cs.manchester.ac.uk/join-us for details.

Thanks to Brett Becker and Joseph Allen for this month’s #paper-suggestions via our slack channel at uk-acm-sigsce.slack.com.

References

  1. Simon, Andrew Luxton-Reilly, Vangel V. Ajanovski, Eric Fouh, Christabel Gonsalvez, Juho Leinonen, Jack Parkinson, Matthew Poole and Neena Thota (2019) Pass Rates in Introductory Programming and in other STEM Disciplines. In ITiCSE-WGR ’19: Proceedings of the Working Group Reports on Innovation and Technology in Computer Science Education, pages 53–71. DOI: 10.1145/3344429.3372502