Join us to discuss why LLM-enhanced Programming Error Messages are Ineffective in Practice on Monday 2nd December at 2pm GMT (UTC)

Icon by LAFS on flaticon.com

Large Language Models (LLMs) can help explain programming error messages, and these explanations tend to improve as the underlying models are trained on more source code. However, it is unknown to what extent novice programmers using tools like GitHub Copilot and ChatGPT can make effective use of these automatically generated explanations to debug their programs. Join us to discuss a paper on this by Eddie Antonio Santos and Brett Becker, which won a best paper award at UKICER.com earlier this year. [1] We’ll be joined by the paper’s lead author, Eddie Antonio Santos, who’ll give a lightning talk to kick off our discussion. From the abstract:

The sudden emergence of large language models (LLMs) such as ChatGPT has had a disruptive impact throughout the computing education community. LLMs have been shown to excel at producing correct code to CS1 and CS2 problems, and can even act as friendly assistants to students learning how to code. Recent work shows that LLMs demonstrate unequivocally superior results in being able to explain and resolve compiler error messages—for decades, one of the most frustrating parts of learning how to code. However, LLM-generated error message explanations have only been assessed by expert programmers in artificial conditions. This work sought to understand how novice programmers resolve programming error messages (PEMs) in a more realistic scenario. We ran a within-subjects study with 𝑛 = 106 participants in which students were tasked to fix six buggy C programs. For each program, participants were randomly assigned to fix the problem using either a stock compiler error message, an expert-handwritten error message, or an error message explanation generated by GPT-4. Despite promising evidence on synthetic benchmarks, we found that GPT-4 generated error messages outperformed conventional compiler error messages in only 1 of the 6 tasks, measured by students’ time-to-fix each problem. Handwritten explanations still outperform LLM and conventional error messages, both on objective and subjective measures.
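The study’s buggy programs were written in C, but the contrast it measures between a stock error message and a handwritten explanation is easy to illustrate. Here is a minimal Python sketch of that contrast; the example is mine, not one of the six tasks from the study.

    # Illustrative only: the study used buggy C programs and compiler messages,
    # but the gap between a stock error message and a handwritten-style
    # explanation looks much the same in Python.
    def average(values):
        total = 0
        for v in values:
            total += v
        return total / len(values)   # fails when the list is empty

    try:
        print(average([]))
    except ZeroDivisionError as exc:
        # Stock message a novice sees:
        print(f"ZeroDivisionError: {exc}")
        # The kind of handwritten explanation the study compares it against:
        print("average() was called with an empty list, so len(values) is 0 "
              "and the final division fails. Return a default value or check "
              "for an empty list before dividing.")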

As usual, we’ll be meeting on zoom, all welcome, details at sigcse.cs.manchester.ac.uk/join-us.

References

  1. Eddie Antonio Santos and Brett A. Becker (2024) Not the Silver Bullet: LLM-enhanced Programming Error Messages are Ineffective in Practice, UKICER ’24: Proceedings of the 2024 Conference on United Kingdom & Ireland Computing Education Research DOI:10.1145/3689535.3689554

Join us to discuss the use of AI in undergraduate programming courses on Monday 4th November at 2pm GMT (UTC)

Co-pilots still need pilots, but what’s the relationship between them? CC BY icon from flaticon.com

Students of programming are often encouraged to use AI assistants with little consideration of their perceptions and preferences. How do students’ perceptions influence their use of AI and Large Language Models (LLMs) in undergraduate programming courses? How does the use of tools like ChatGPT and GitHub Copilot relate to students’ self-belief in their own programming abilities? Join us to discuss a paper about this by Aadarsh Padiyath et al., published at ICER 2024. [1] From the abstract:

The capability of large language models (LLMs) to generate, debug, and explain code has sparked the interest of researchers and educators in undergraduate programming, with many anticipating their transformative potential in programming education. However, decisions about why and how to use LLMs in programming education may involve more than just the assessment of an LLM’s technical capabilities. Using the social shaping of technology theory as a guiding framework, our study explores how students’ social perceptions influence their own LLM usage. We then examine the correlation of self-reported LLM usage with students’ self-efficacy and midterm performances in an undergraduate programming course. Triangulating data from an anonymous end-of-course student survey (n = 158), a mid-course self-efficacy survey (n=158), student interviews (n = 10), self-reported LLM usage on homework, and midterm performances, we discovered that students’ use of LLMs was associated with their expectations for their future careers and their perceptions of peer usage. Additionally, early self-reported LLM usage in our context correlated with lower self-efficacy and lower midterm scores, while students’ perceived over-reliance on LLMs, rather than their usage itself, correlated with decreased self-efficacy later in the course.

There’s also an accompanying article and blog post to go with this paper. [2,3]

All welcome, as usual, we’ll be meeting online; joining details at sigcse.cs.manchester.ac.uk/join-us

References

  1. Aadarsh Padiyath, Xinying Hou, Amy Pang, Diego Viramontes Vargas, Xingjian Gu, Tamara Nelson-Fromm, Zihan Wu, Mark Guzdial, Barbara Ericson (2024) Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course ICER ’24: Proceedings of the 2024 ACM Conference on International Computing Education Research – Volume 1, Pages 114–130 DOI:10.1145/3632620.3671098 (non-paywalled version at arxiv.org/abs/2406.06451)
  2. Aadarsh Padiyath (2024) Do I have a say in this or has ChatGPT already decided for me? blog post at computinged.wordpress.com
  3. Aadarsh Padiyath (2024) Do I Have a Say in This, or Has ChatGPT Already Decided for Me? XRDS: Crossroads, The ACM Magazine for Students, Volume 31, Issue 1, Pages 52–55, DOI:10.1145/3688090 (paywalled version only)

Join us to discuss the most dangerous course to teach in Computing on Monday 7th August at 2pm BST

Skeleton image from flaticon.com

What is the most dangerous course to teach in Computing? Join us on Monday 7th August at 2pm BST (UTC+1) to discuss an opinion piece by Tony Clear from Auckland University of Technology on this very subject. Tony argues that introductory programming (aka CS1) is the most dangerous course for educators to teach. Do you agree with him? From the intro to his paper:

This column reflects on some of my own experiences, observations, and research insights into CS1 teaching over more than 25 years in my own institution and others. The challenges facing first year programming educators and the inability of universities and their managers to learn from the copious literature relating to the teaching of introductory programming seem to be perennial. This places first year programming educators in some peril!

All welcome, as usual, we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us. Thanks to James Davenport at the University of Bath for nominating this month’s paper. 🙏

References

  1. Clear, Tony (2022) CS1: The Most Dangerous Course for CS Educators to Teach? ACM Inroads, Volume 13, Issue 4, DOI:10.1145/3571089

Join us on zoom to discuss the implications of programming getting easier, Monday 15th May at 2pm BST

Programming is hard, or at least it used to be. AI code generators like Amazon’s CodeWhisperer, DeepMind’s AlphaCode, GitHub’s Copilot, Replit’s Ghostwriter and many others now make programming easier, at least for some people, some of the time. What opportunities and challenges do these new tools present for educators? Join us on Zoom to discuss an award-winning paper by Brett Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather and Eddie Antonio Santos at University College Dublin, the University of Auckland and Abilene Christian University on this very topic. [1] We’ll be joined by two of the co-authors, who will present a lightning talk to kick off our discussion at our monthly ACM journal club meetup. Here’s the abstract of their paper:

The introductory programming sequence has been the focus of much research in computing education. The recent advent of several viable and freely-available AI-driven code generation tools present several immediate opportunities and challenges in this domain. In this position paper we argue that the community needs to act quickly in deciding what possible opportunities can and should be leveraged and how, while also working on overcoming or otherwise mitigating the possible challenges. Assuming that the effectiveness and proliferation of these tools will continue to progress rapidly, without quick, deliberate, and concerted efforts, educators will lose advantage in helping shape what opportunities come to be, and what challenges will endure. With this paper we aim to seed this discussion within the computing education community.

All welcome, as usual we’ll be meeting on zoom at 2pm BST (UTC+1), details at sigcse.cs.manchester.ac.uk/join-us. Thanks to Sue Sentance at the University of Cambridge for nominating this paper for discussion.

See also linkedin.com/posts/duncanhull_ai-codewhisperer-alphacode-activity-7051921278923915264-7i_5

References

  1. Brett A. Becker, Paul Denny, James Finnie-Ansley, Andrew Luxton-Reilly, James Prather, Eddie Antonio Santos (2023) Programming Is Hard – Or at Least It Used to Be: Educational Opportunities and Challenges of AI Code Generation in Proceedings of the 54th ACM Technical Symposium on Computer Science Education: SIGCSE 2023, pages 500–506, DOI: 10.1145/3545945.3569759

Join us to discuss using AI to solve simple programming problems on Monday 3rd April at 2pm BST

CC licensed pilot icon from flaticon.com

Maybe you wrote that code and maybe you didn’t. If AI helped you, such as the OpenAI Codex in GitHub Copilot, how did it solve your problem? How much did Artificial Intelligence help or hinder your solution? Join us to discuss a paper by Michel Wermelinger from the Open University, published at the SIGCSE Technical Symposium earlier this month, on this very topic. [1] We’ll be joined by Michel, who will present a lightning talk to kick off our discussion. Here’s the abstract of his paper:

The teaching and assessment of introductory programming involves writing code that solves a problem described by text. Previous research found that OpenAI’s Codex, a natural language machine learning model trained on billions of lines of code, performs well on many programming problems, often generating correct and readable Python code. GitHub’s version of Codex, Copilot, is freely available to students. This raises pedagogic and academic integrity concerns. Educators need to know what Copilot is capable of, in order to adapt their teaching to AI-powered programming assistants. Previous research evaluated the most performant Codex model quantitatively, e.g. how many problems have at least one correct suggestion that passes all tests. Here I evaluate Copilot instead, to see if and how it differs from Codex, and look qualitatively at the generated suggestions, to understand the limitations of Copilot. I also report on the experience of using Copilot for other activities asked of students in programming courses: explaining code, generating tests and fixing bugs. The paper concludes with a discussion of the implications of the observed capabilities for the teaching of programming.
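As a rough illustration of the workflow described above, here is a hedged Python sketch of a comment-as-prompt, a Copilot-style suggestion and the tests it might also be asked to generate; the function and tests are my own example rather than anything taken from the paper.

    # Sketch of the Copilot-style workflow: the student states the task in a
    # docstring, the assistant proposes an implementation, and tests (which it
    # can also be asked to generate) check the suggestion before it is accepted.
    def count_vowels(text: str) -> int:
        """Return the number of vowels (a, e, i, o, u; case-insensitive) in text."""
        return sum(1 for ch in text.lower() if ch in "aeiou")

    # Assistant-suggested tests, reviewed by the student:
    assert count_vowels("") == 0
    assert count_vowels("Hello World") == 3
    print("all suggested tests pass")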

All welcome, as usual we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us. 

References

  1. Michel Wermelinger (2023) Using GitHub Copilot to Solve Simple Programming Problems in Proceedings of the 54th ACM Technical Symposium on Computer Science Education: SIGCSE 2023, pages 172–178, DOI: 10.1145/3545945.3569830

Join us to discuss code comprehension on Monday 6th March at 2pm GMT

CC licensed puzzle icon by flaticon.com


It’s all very well getting an AI to write your code for you, but neither writing code nor reading code is the same as understanding code. So what is going on in novices’ brains when they learn to actually understand the code they are reading and writing? Join us on Monday 6th March at 2pm GMT to discuss a paper by Quintin Cutts and Maria Kallia from the University of Glasgow on this very topic [1], from the abstract:

An approach to code comprehension in an introductory programming class is presented, drawing on the Text Surface, Functional and Machine aspects of Schulte’s Block Model, and emphasising programming as a modelling activity involving problem and machine domains. To visually connect the domains and a program, a key diagram conceptualising the three aspects lies at the approach’s heart, alongside instructional exposition and exercises, which are all presented. Students find the approach challenging initially, but most recognise its value later, and identify, unexpectedly, the value of the approach for problem decomposition, planning and coding.

We’ll be joined by one of the co-authors (Quintin Cutts), who’ll give us a lightning talk summary of the paper to kick off our journal club discussion. [1] Quintin has added: “You can’t write if you can’t read. In just four pages the paper outlines a classroom approach to developing in novices good code comprehension right from the start of an introductory course. There’s also some feedback on what students thought, a year later – spoiler – they seemed to get a lot from it. Anyone teaching introductory programming might find such a short paper thought provoking, even if they don’t pick up the technique in their teaching. Worth a quick read, and coming along to listen/add to the discussion…”
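If the Block Model is new to you, here is a rough sketch of how a single line of Python can be read through the three aspects the abstract mentions; the snippet is my own illustration, not the key diagram from the paper.

    # Illustrative only: one line of code read through the three Block Model
    # aspects (Text Surface, Function, Machine) named in the abstract.
    prices = [3, 5, 2]
    total = sum(prices)   # the line being comprehended

    # Text Surface: an assignment; the name 'total' is bound to the result of
    #               calling the built-in sum() on the list 'prices'.
    # Function:     in the problem domain, work out the overall cost of the
    #               items in the basket.
    # Machine:      the interpreter walks the list, accumulating 3 + 5 + 2 into
    #               a new int object (10) that 'total' then refers to.
    print(total)   # 10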

All welcome, as usual we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us. 

References

  1. Quintin Cutts and Maria Kallia (2023) Introducing Modelling and Code Comprehension from the First Days of an Introductory Programming Class in CEP ’23: Proceedings of 7th Conference on Computing Education Practice Pages 21–24 DOI:10.1145/3573260.3573266

Join us to discuss Collaborative Coding in the Cloud on Monday 6th February at 2pm GMT

Creative Commons cloud image by flaticon.com

More and more software development tools are available in the cloud, with Replit, CodingRooms, GitHub Codespaces, Amazon Web Services Cloud9, JetBrains and Eclipse all offering online environments for developers to code collaboratively. Integrated Development Environments (IDEs), which have traditionally been installed as “fatter” desktop clients, are increasingly available as “thinner” web-based clients running in a browser. These tools can lower some of the barriers to installation and maintenance for their users. What are the strengths and weaknesses of these new tools for teaching introductory programming courses? Join us on Monday 6th February at 2pm GMT to discuss a paper by Phil Hackett and his colleagues at the Open University on this very topic [1], from the abstract:

This paper discusses a pilot research project, which investigated the use of online collaborative IDEs (Integrated development environments) during a first-year computing degree course. The IDEs used can be described as virtual computing labs because they replicate some of the actions possible in physical computing labs. Students were supported by a tutor with real-time help and feedback provided, whilst they were programming, without being collocated. The use of two different platforms is considered with the benefits and drawbacks discussed. Students and tutors indicated that they would like to use a virtual computing lab approach in the future.

We’ll be joined by the paper’s lead author, Phil Hackett, who’ll give us a lightning talk summary of the paper to kick off our journal club discussion. The paper was presented at Computing Education Practice (CEP) in Durham earlier this month. [1]

All welcome, as usual we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us. 

References

  1. Phil Hackett, Michel Wermelinger, Karen Kear and Chris Douce (2023) Using a Virtual Computing Lab to Teach Programming at a Distance in CEP ’23: Proceedings of 7th Conference on Computing Education Practice Pages 5–8 DOI:10.1145/3573260.3573262

Join us to discuss novice use of Java on Monday 7th November at 2pm GMT

Java is widely used as a teaching language in universities around the world, but what wider problems does it present for novice programmers? Join us to discuss a paper published in TOCE by Neil Brown, Pierre Weill-Tessier, Maksymilian Sekula, Alexandra-Lucia Costache and Michael Kölling. [1] From the abstract:

Objectives: Java is a popular programming language for use in computing education, but it is difficult to get a wide picture of the issues that it presents for novices, and most studies look only at the types or frequency of errors. In this observational study we aim to learn how novices use different features of the Java language.

Participants: Users of the BlueJ development environment have been invited to opt-in to anonymously record their activity data for the past eight years. This dataset is called Blackbox, which was used as the basis for this study. BlueJ users are mostly novice programmers, predominantly male, with a median age of 16. Our data subset featured approximately 225,000 participants from around the world.

Study Methods: We performed a secondary data analysis that used data from the Blackbox dataset. We examined over 320,000 Java projects collected over the course of eight years, and used source code analysis to investigate the prevalence of various specifically-selected Java programming usage patterns. As this was an observational study without specific hypotheses, we did not use significance tests; instead we present the results themselves with commentary, having applied seasonal trend decomposition to the data.

Findings: We found many long-term trends in the data over the course of the eight years, most of which were monotonic. There was a notable reduction in the use of the main method (common in Java but unnecessary in BlueJ), and a general reduction in the complexity of the projects. We find that there are only a small number of frequently used types: int, String, double and boolean, but also a wide range of other infrequently used types.

Conclusions: We find that programming usage patterns gradually change over a long period of time (a period where the Java language was not seeing major changes), once seasonal patterns are accounted for. Any changes are likely driven by instructors and the changing demographics of programming novices. The novices use a relatively restricted subset of Java, which implies that designers of languages specifically targeted at novices can satisfy their needs with a smaller set of language constructs and features. We provide detailed recommendations for the designers of educational programming languages and supporting development tools.
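For anyone curious about the seasonal trend decomposition step mentioned in the study methods, here is a minimal Python sketch of what that looks like on made-up monthly counts; the paper’s actual analysis uses the Blackbox dataset, and the numbers below are invented purely for illustration.

    import pandas as pd
    from statsmodels.tsa.seasonal import seasonal_decompose

    # Hypothetical monthly counts of projects using a main method over four years,
    # with a gentle downward trend plus a bump each autumn term (invented data).
    months = pd.date_range("2016-01-01", periods=48, freq="MS")
    counts = pd.Series(
        [100 - 0.5 * i + (10 if ts.month in (9, 10) else 0)
         for i, ts in enumerate(months)],
        index=months,
    )

    result = seasonal_decompose(counts, model="additive", period=12)
    print(result.trend.dropna().head())   # long-term trend, seasonal bump removed
    print(result.seasonal.head(12))       # the repeating academic-year pattern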

All welcome, as usual we’ll be meeting on zoom, details at sigcse.cs.manchester.ac.uk/join-us

References

  1. Neil C. C. Brown, Pierre Weill-Tessier, Maksymilian Sekula, Alexandra-Lucia Costache and Michael Kölling (2022) Novice use of the Java programming language ACM Transactions on Computing Education DOI:10.1145/3551393

Join us to discuss the implications of the OpenAI Codex on introductory programming on Monday 4th July at 2pm BST


Automatic code generators have been with us a while, but how do modern AI-powered bots perform on introductory programming assignments? Join us to discuss the implications of the OpenAI Codex on introductory programming courses on Monday 4th July at 2pm BST. We’ll be discussing a paper by James Finnie-Ansley, Paul Denny, Brett A. Becker, Andrew Luxton-Reilly and James Prather [1] for our monthly SIGCSE journal club meetup on zoom. Here is the abstract:

Recent advances in artificial intelligence have been driven by an exponential growth in digitised data. Natural language processing, in particular, has been transformed by machine learning models such as OpenAI’s GPT-3 which generates human-like text so realistic that its developers have warned of the dangers of its misuse. In recent months OpenAI released Codex, a new deep learning model trained on Python code from more than 50 million GitHub repositories. Provided with a natural language description of a programming problem as input, Codex generates solution code as output. It can also explain (in English) input code, translate code between programming languages, and more. In this work, we explore how Codex performs on typical introductory programming problems. We report its performance on real questions taken from introductory programming exams and compare it to results from students who took these same exams under normal conditions, demonstrating that Codex outscores most students. We then explore how Codex handles subtle variations in problem wording using several published variants of the well-known “Rainfall Problem” along with one unpublished variant we have used in our teaching. We find the model passes many test cases for all variants. We also explore how much variation there is in the Codex generated solutions, observing that an identical input prompt frequently leads to very different solutions in terms of algorithmic approach and code length. Finally, we discuss the implications that such technology will have for computing education as it continues to evolve, including both challenges and opportunities. (see accompanying slides and sigarch.org/coping-with-copilot/)
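If you haven’t met the Rainfall Problem before, here is a minimal Python sketch of one common formulation; the paper evaluates Codex on several published variants, which differ in details such as the sentinel value, so treat this as an illustration rather than the exact exam question.

    # One common formulation of the Rainfall Problem (variants differ in the
    # sentinel and in how invalid readings are handled).
    def rainfall(readings):
        """Average the non-negative readings seen before the sentinel 99999."""
        total, count = 0, 0
        for value in readings:
            if value == 99999:          # stop at the sentinel
                break
            if value >= 0:              # ignore negative (invalid) readings
                total += value
                count += 1
        return total / count if count else 0

    print(rainfall([12, -3, 4, 99999, 7]))   # 8.0 (the 7 after the sentinel is ignored)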

All welcome, details at sigcse.cs.manchester.ac.uk/join-us. Thanks to Jim Paterson at Glasgow Caledonian University for nominating this month’s paper.

References

  1. James Finnie-Ansley, Paul Denny, Brett A. Becker, Andrew Luxton-Reilly, James Prather (2022) The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming ACE ’22: Australasian Computing Education Conference Pages 10–19 DOI:10.1145/3511861.3511863

Join us to discuss teaching programming to Physics students on Monday 13th June at 2pm BST

CC BY-SA image of Bohr model of the atom by Jabberwock on Wikimedia Commons w.wiki/59id 

print('Hello World!') is all very well, but it doesn’t help physics students solve the Schrödinger equation. Join us for our next journal club meeting on Monday 13th June at 2pm BST, where we’ll be discussing a paper by Lloyd Cawthorne from the Department of Physics and Astronomy at the University of Manchester on teaching programming to undergraduate physics students. [1] From the abstract:

Computer programming is a key component of any physical science or engineering degree and is a skill sought by employers. Coding can be very appealing to these students as it is logical and another setting where they can solve problems. However, many students can often be reluctant to engage with the material as it might not interest them or they might not see how it applies to their wider study. Here, I present lessons I have learned and recommendations to increase participation in programming courses for students majoring in the physical sciences or engineering. The discussion and examples are taken from my second-year core undergraduate physics module, Introduction to Programming for Physicists, taught at The University of Manchester, UK. Teaching this course, I have developed successful solutions that can be applied to undergraduate STEM courses.
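To give a flavour of the kind of exercise that connects programming to the physics curriculum, a few lines of Python can already compute the energy levels of a particle in a one-dimensional infinite square well, E_n = n²h²/(8mL²); this particular task is my illustration rather than one taken from the Manchester module.

    # Illustrative exercise (not from the module): energy levels of a particle
    # in a 1D infinite square well, E_n = n^2 h^2 / (8 m L^2).
    h = 6.626e-34     # Planck constant (J s)
    m_e = 9.109e-31   # electron mass (kg)
    L = 1e-9          # well width: 1 nanometre (m)

    for n in range(1, 4):
        E = n**2 * h**2 / (8 * m_e * L**2)        # energy in joules
        print(f"n={n}: {E / 1.602e-19:.2f} eV")   # converted to electronvolts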

Lloyd’s slides from the presentation are here

All welcome, physicists, non-physicists, programmers and non-programmers alike. As usual we’ll be meeting on zoom, details are in the slack channel sigcse.cs.manchester.ac.uk/join-us.

References

  1. Lloyd Cawthorne (2021) Invited viewpoint: teaching programming to students in physical sciences and engineering, Journal of Materials Science 56, pages 16183–16194 DOI:10.1007/s10853-021-06368-1