Innovations in Medical Education

A Journey Towards Programmatic Assessment

Summary

The transcript opens with a welcome to the annual Cudmore lecture, which honors Dr. Paul Cudmore's contributions to medical education, and acknowledges the family members present. The lecture, given by Dr. Cees van der Vleuten, traces how assessment in medical education has changed over the past 30 years. He outlines the journey from traditional testing toward a more holistic, programmatic assessment model that relies on feedback and learning rather than testing alone. The talk concludes with a call to adopt these approaches to improve student outcomes.

Highlights

A heartfelt tribute to Dr. Cudmore's contributions in advancing medical education. 🌠
Dr. Cees van der Vleuten enlightening the audience on modern assessment trends. 🤓
The shift from traditional testing to programmatic assessments gets everyone buzzing. 🔄
The importance of feedback-driven learning over standard exams is emphasized. 💬
The event ends with hopes of further advancing medical education through workshops and collaborative efforts. 🛠️

Key Takeaways

Programmatic assessment is shaping the new educational landscape! 🌟
Move over old-school tests, it's all about comprehensive feedback now! 🚀
Learning is a journey, not just an exam—with the right tools, the journey is rewarding. 📚
Dr. Cudmore's legacy lives on, pushing educational boundaries forward! 🎓
Collaborative learning and continuous assessment are becoming the norm. 🤝

Overview

The conversation kicks off with the acknowledgment of Dr. Paul Cudmore and his innovative spirit in advancing medical education. His influence is commemorated with a special gathering, where his family and the attendees reflect on the impact he had on the field. The event is marked as a reunion and celebration of his visionary work.

Dr. Cees van der Vleuten continues by examining how medical education has evolved, particularly in its assessment strategies. He emphasizes programmatic assessment: a transition from mere testing to a comprehensive approach to learning. This model integrates low-stakes and high-stakes assessments and personalized feedback, and it encourages lifelong learning and professional development.

As the lecture wraps up, attendees are encouraged to continue discussing these changes in the landscape of medical education and the importance of sustaining innovative practice. The session is followed by a reception and a series of workshops that explore these methods in more depth with Dr. van der Vleuten.

Chapters

00:00 - 01:30: Introduction The chapter titled 'Introduction' describes the commencement of the annual Cudmore lecture. The speaker thanks the audience for coming and explains that this lecture is special for several reasons. The first is the presence of the Cudmore family, including the two sons, David and Donald, and Brent, one of the grandchildren.
01:30 - 03:30: Welcome and Opening Remarks The chapter titled 'Welcome and Opening Remarks' continues the greeting of the family: the three grandchildren, Brent, Annie, and Paul, and Mrs. Jean Cudmore, Dr. Paul Cudmore's wife. The speaker describes how the group had tea together before the lecture and was then taken on a tour of the zebrafish facility by the dean, Tom.
03:30 - 06:00: Legacy of Dr. Paul Cudmore The chapter titled 'Legacy of Dr. Paul Cudmore' opens with a description of a special event held in honor of Dr. Paul Cudmore, highlighting his significant contributions to medical education. Dr. Cudmore was portrayed as an innovator who was ahead of his time, with a strong belief in the advancement of medical education even when it was not a common focus. The attendees are encouraged to read the brochure provided, which details Dr. Cudmore's achievements and his impact on the field. This commemoration illustrates the lasting influence Dr. Cudmore had on medical education and the importance of remembering his legacy.
06:00 - 09:00: Introduction of Dr. Cees van der Vleuten The chapter introduces Dr. Cees van der Vleuten, highlighting his previous visit 19 years ago as a lecturer on assessment. It suggests that much has evolved in the field since his last talk, and there is anticipation for the insights he will share. Additionally, it is mentioned that a working group has been focusing on assessment since the previous fall.
09:00 - 15:00: Dr. van der Vleuten's Personal Journey in Assessment Dr. van der Vleuten's journey in the field of assessment has been both personal and professional. The chapter highlights the collaborative effort of a working group dedicated to improving assessment strategies. Regular meetings reflect the commitment of the group, with a focus on potential changes in assessment practices. Dr. van der Vleuten's role as a PhD supervisor is also acknowledged, showcasing the impact of his mentorship and guidance in the academic journey of his students.
15:00 - 18:00: Assessment Toolbox and Pyramid This chapter describes the qualities the host sees in Dr. van der Vleuten as an assessor: he gives specific, timely, and thoughtful feedback and asks questions that stimulate critical thinking. The host then remarks that she has said little about the speaker himself and turns to his formal introduction.
18:00 - 21:00: Work-Based and Postgraduate Assessment The chapter covers the formal introduction of Dr. van der Vleuten. He directs a research and development program within a school of medicine, health sciences, and life sciences, and he leads master's and PhD programs in health education. The host highlights his international recognition and leadership in the field.
21:00 - 25:00: Validity and Measuring What Matters The speaker begins by expressing gratitude and recalling his visit to Canada 19 years ago, when his youngest son had just been born. He admits that the trip did not make him popular at the time, and notes that his son is now 2 meters tall and finishing university.
25:00 - 32:00: Curriculum Development and Outcome-Based Education The chapter begins with the speaker remarking that his son completes his university degree this year and that things have changed for him as well. He welcomes the opportunity to talk about assessment again, noting that his earlier lecture was also on assessment and that the field has changed since then.
32:00 - 35:00: Competency Frameworks and Generic Skills The chapter discusses a professional journey in medical education, emphasizing over 30 years of experience. It focuses on the development and research of assessments within educational practice. The speaker intends to summarize existing research and results related to assessment, drawing from extensive experience and knowledge in the field.
35:00 - 40:30: Behavioral Skills and Longitudinal Development The chapter titled "Behavioral Skills and Longitudinal Development" covers several key topics. It begins by discussing insights derived from research, as viewed through the author's perspective. Following this, the chapter moves on to explore relevant theories, particularly focusing on a model of assessment that holds potential benefits for the future of the field. The theoretical discussion is illustrated with a practical example, bridging the gap between theory and practice. The chapter concludes with a few key takeaways or conclusions drawn from the topics discussed.
40:30 - 43:30: Implications for Assessment Practices The chapter discusses the extensive development of tools for assessing professional competence over the last 30 years. The 'pyramid' the speaker refers to is George Miller's model of competence, which he traces back to the first Cudmore lecture. It provides a framework for a structured progression in assessing competencies.
43:30 - 48:00: Reliability and Consistency in Assessment This chapter discusses the model of competence that George Miller published as a pyramid. Factual knowledge forms the base, followed by the ability to apply that knowledge and reason with it. Next comes performance in simulated settings, the 'showing how' level. At the top is the 'does' level, where knowledge and skills are applied in actual practice. Different assessment instruments have been developed for each layer.
48:00 - 53:30: Subjectivity and Reliability of Different Methods This chapter discusses the production of various assessment instruments, particularly in the context of work-based assessment in postgraduate education. It suggests rethinking traditional assessment methods by emphasizing practical assessment in real-world settings, indicating a need for reliability and relevance in evaluating competencies within actual practice environments.
53:30 - 61:00: Learning Impact and Feedback in Assessment The chapter titled 'Learning Impact and Feedback in Assessment' explores the quality characteristics of assessments. Many characteristics have been proposed, but the focus will be on summarizing research on three key elements. The chapter aims to infer consistencies from existing research, starting with the element of validity.
61:00 - 65:00: Programmatic Assessment In the chapter titled 'Programmatic Assessment', the focus is on understanding the purpose and effectiveness of educational assessment measures. It questions the objectives of what is being measured and whether these align with contemporary educational goals. It highlights a shift in education from traditional input-based curricula, which allocated specific hours to subjects like anatomy, physiology, and internal medicine, to potentially new methods.
65:00 - 73:00: Examples and Implementation of Programmatic Assessment In this chapter, the shift in educational approach within postgraduate training is discussed. Historically, the focus was on the duration of training ('time definitions'), whereas contemporary methods emphasize defining outcomes and competencies that learners should achieve, making time a variable factor depending on individual learning progress. This signifies a critical change from process to outcomes-oriented training.
73:00 - 77:00: Concluding Remarks The chapter 'Concluding Remarks' discusses the evolution from individual teachers defining curriculum content based on personal judgment to a more structured and governed approach where curriculum planning is systematic. The content taught is strategically aligned so that it revisits and builds on previous teachings, allowing for evaluation and adaptability. The focus is on a comprehensive, organized approach to education, ensuring consistency and coherence in the curricular objectives and teaching methods.
77:00 - 82:00: Q&A Session In this chapter, the focus is on the evolution of professional competencies beyond mere knowledge acquisition. Emphasizing the shift from being solely knowledge-oriented, especially in fields like medicine, the discussion highlights the importance of developing generic professional skills. These include effective communication and professional behavior, which are now considered integral to any profession.
82:00 - 86:30: Closing Remarks and Reception Invitation Discusses the shift in educational programs from being teacher-oriented to learner-centered. Emphasizes the importance and benefits of focusing on the learner's learning needs instead of being directed solely by the teacher.
86:30 - 91:00: Acknowledgements and Final Thoughts This chapter brings attention to the importance of self-directed learning, emphasizing it as the foundation for lifelong learning. It highlights how such learning approaches are increasingly becoming the norm globally, with competencies being defined and assessed across various regions. Examples provided include commonly adopted frameworks from countries like Saudi Arabia and Australia, among others, signifying a global trend in educational evolution. The focus on competencies suggests a shift towards equipping learners with the skills necessary to adapt and thrive in an ever-changing world.
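
One quantitative claim in the lecture's reliability discussion is that a reliability of about 0.80 is reached only by sampling widely, and that a sample of around eight observations often does well. The sketch below illustrates that arithmetic. It assumes the Spearman-Brown prophecy formula, which the speaker does not name, and a hypothetical single-observation reliability of 0.35 chosen only for illustration; the function names are also invented here.

```python
def spearman_brown(single_reliability: float, n_observations: int) -> float:
    """Projected reliability of the average of n parallel observations.

    Assumption: observations are parallel (equally reliable, independent
    errors), which is the premise of the Spearman-Brown formula.
    """
    r = single_reliability
    return n_observations * r / (1 + (n_observations - 1) * r)


def observations_needed(single_reliability: float, target: float = 0.80,
                        max_observations: int = 1000) -> int:
    """Smallest number of observations whose projected reliability meets target."""
    if not 0 < single_reliability < 1:
        raise ValueError("single_reliability must be strictly between 0 and 1")
    for n in range(1, max_observations + 1):
        if spearman_brown(single_reliability, n) >= target:
            return n
    raise ValueError("target not reached within max_observations")


if __name__ == "__main__":
    r_single = 0.35  # hypothetical value, not taken from the lecture
    for n in (1, 2, 4, 8, 12):
        print(n, round(spearman_brown(r_single, n), 2))
    print("observations needed for 0.80:", observations_needed(r_single))
```

Under these assumptions the projection stays below 0.80 at seven observations and crosses it at eight. A different single-observation reliability would shift that number, so the sketch shows the shape of the argument rather than a fixed rule.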

A Journey Towards Programmatic Assessment Transcription

00:00 - 00:30 hello everyone I think we'll get started I think it's 4:30 so thank you for coming to the annual Cudmore lecture as you know it's always a very special event and it's special for a number of reasons and I can think of three reasons this time as always we have the Cudmore family here with us and let me see we have David and Donald the two sons and we have uh Brent
00:30 - 01:00 and Annie and Mark no Paul excuse me I bet it Paul um the uh three grandchildren and Jean Mrs Jean Cudmore uh Paul Dr Paul Cudmore's wife so it's just it's lovely it's just a special event we just came from having tea and and because we drank our tea so quickly we just Tom took us our dean took us on a tour of the zebrafish facility so in
01:00 - 01:30 the last little while so it's been really very nice and as you know this this is such a special event in honor of Dr Paul Cudmore and you can read about him in the little brochure that you have and just read that he was really a an innovator and ahead of his time that he really believed in advancing medical education when people weren't were barely thinking about it and uh so it's just such an honor to remember him so do take a little bit of time to read about him okay the second reason for it being
01:30 - 02:00 so special is to welcome Dr Cees van der Vleuten who when you read the list of names he was here 19 years ago as our Cudmore lecturer and at that time talking about assessment and I think we'll hear that that much has changed since then and so we're just you know really looking forward to what he has to share and we're looking forward to it as well as we've had a group some of you may know that we've had a working group uh since the fall looking at assessment
02:00 - 02:30 and there's a number of people in the audience who are part of that working group working really hard 7:30 meetings every second Monday morning looking at assessment and looking at how we can improve so we're really excited that Cees is here and that will be the stimulus to help change a number of things that we think we're getting poised to change I think my final comment is a personal one because Cees was my PhD supervisor at the uh University of Maastricht and
02:30 - 03:00 uh and so I think that he um would say that he embodies from a from a personal perspective every everything that we would like to have embodied in a good assessor he gives specific feedback timely feedback and in a very thoughtful way one never feels dumb which is really a nice thing he asks really good questions that stimulate critical thinking and so welcome Cees I didn't say very much about you did I he he's I talked about everybody else not so it's written up in in the brochure he's yeah
03:00 - 03:30 the the director for the program in uh in research and development in the medicine school of medicine and health sciences and life sciences at Maastricht and the uh the lead for SHE the school of health uh health education leads a master's and PhD program and as you can see from the little CV just recognized internationally and is really a leader internationally so we're thrilled to have you with with us and welcome thank
03:30 - 04:00 you very [Applause] much it is a great honor to be back again I mean indeed 19 years ago and I remember so well because my youngest son was just born really and I wasn't very popular that back then for going to Canada um you know now he's um he's 2 meters tall uh finishing his university
04:00 - 04:30 degree this year so things have changed to me too as well um great opportunity to talk about assessment again and that's what I'm going to do with you and uh basically um I think things have changed and my talk way back when was also on assessment but I think things have changed and what I'm going to do is to take you on my personal journey
04:30 - 05:00 um my journey in my professional life which is now more than 30 years in medical education and in developing and researching assessment and many other areas as well but today's the journey is on assessment and what I'll do um there's a lot has been written and researched and developed in uh assessment and I'm going to summarize that research from educational practice I'm going to summarize the results
05:00 - 05:30 uh the messages that come from the research and at least as I see that through my lens and then from those messages from the research we'll go to a bit of theory uh talking about a model of assessment which I think might be beneficial for the future and then from there we might give you an illustration an example so we move from theory to practice and then we'll finish with a couple of conclusions
05:30 - 06:00 okay our toolbox is extremely well filled in in assessment of professional competence we have a lot and this has all been developed in the last uh 30 years or so um and we've been climbing this pyramid and anyone not familiar with this particular pyramid actually it stems from the first Cudmore lecture George Miller
06:00 - 06:30 and he published this pyramid as a very simple model of competence uh so you have factual knowledge at the bottom you were able to apply that knowledge you were able to uh know how to use your knowledge to reason with your knowledge if you know then know how to perform in a simulated session uh in a simulated setting it's showing how and then finally in actual practice it is the does level of assessment that for each of these different layers is uh
06:30 - 07:00 different instruments have been produced I'm not going into that but mind you we have a lot we have a lot if we talk talk about work-based assessment so assessment in the workplace or in postgraduate education you might tilt the pyramid upside down and we need to have as much assessment in the actual practice as possible um for each of the measures that have been developed you could
07:00 - 07:30 look at that quality you could look at that quality characteristics and many of these characteristics have been proposed and these are but five many of them many more have been proposed um for the sake of ease I'm going to summarize the research on only three of these elements um and um infer some of the the the consistencies as I see them from the research and I'll start with validity
07:30 - 08:00 what are we measuring what are our measures measuring what do we intend to measure and if you then look at what we do in education nowadays that has changed quite dramatically over the last number of years and we used to have curricula which was basically defined by input so many hours of anatomy so many hours of physiology so many hours of internal medicine and that was a
08:00 - 08:30 curriculum um in postgraduate training so much time in this particular workplace um these were all time definitions nowadays it's the opposite we define the outcomes we define what we want our learners to be able to do at the end of their training which is a very different approach and then all depending on how well they learn time is a variable not a
08:30 - 09:00 thing and we went from individual teachers defining what they thought was good for the curriculum to a more governed curriculum in which we plan things right we have we teach something here because we want to go back to it over there curricula are planned are governed are evaluated are changed and that's a whole that's a big
09:00 - 09:30 difference with with the past and we also emphasize different entities in the past we were very knowledge oriented now knowledge is very important particularly medicine but there's much more and these are more generic competencies that hold for any sort of professional for example you have to be a good communicator or you have to be professionally behave yourself and these competencies have become part
09:30 - 10:00 of our training programs because they're pretty important and I'll come back to that later on and we changed from teacher-oriented programs to learner-centered programs in which we put the learning of the learner central to education um sounds a bit paradoxical uh but it it opposes to teacher-centered learning where the teacher defines what you're learning um nowadays we put a lot more
10:00 - 10:30 emphasis on self-directed learning as a basis for lifelong learning because that's what we have to do ultimately okay um what are we assessing if you look at um around in the world you see these kind of outcomes being defined competencies being defined across the world and here are three very popular ones but there are many more there's a Saudi one there's an Australian one so there are many more of these competency
10:30 - 11:00 frameworks and what strikes me is that they're very similar particularly if you start looking underneath and these frameworks have been developed with a lot of stakeholder input so apparently and independently with a lot of stakeholder input we have I think quite some consensus of what we need to train for
11:00 - 11:30 secondly what I find really interesting is that we move beyond the knowledge domain so we emphasize a lot of skills that move beyond the knowledge domain there's good reason for doing that because um later on in actual practice these kind of skills are essential if things go wrong in clinical practice these kind of skills are often
11:30 - 12:00 involved we've done a study in which we looked at the hospital complaints in our hospital 80% of the complaints are related to communication exactly communication right um actually there are studies indicating that if if you do really well on the labor market it's these kind of skills that are involved right we we have publications
12:00 - 12:30 of cases that come to um a professional courts which are published in a medical journal Dutch medical journal every week all kinds of all these skills are involved right so these are pretty important there are studies showing that problems later on in clinical practice in relation to these kind of skills were already seen in the training
12:30 - 13:00 program so actually we have a responsibility to train and learn these kind of skills what is also I think interesting about these competencies these more generic competencies they're not generic because they're always contextually bound but you know a lawyer has to communicate as well and behave professionally in that sense they're generically
13:00 - 13:30 um but these skills are very behavioral they only can be demonstrated through behavior naturally you can think of knowledge and communication and knowledge and professionalism but showing that you can communicate and be a professional that's what it's about now these are very complex skills they're not easy to define
13:30 - 14:00 yet we all have a relationship with them's a painter if I were to hang up one of my paintings and one of Van Gogh's most of you would be able to pick out quickly which one is nicest without you having a definition of what is good painting um these kind of skills you don't learn overnight you can't have a course on
14:00 - 14:30 communication for a couple of weeks have sort of an exam at the end of the course and then you're a good communicator for the rest of your life it won't work that way these kind of skills need to be developed longitudinally over time being nurtured being given feedback on being evaluated regularly and be developed in a longitudinal fashion
14:30 - 15:00 right um so that has a lot of implications for assessing and we'll we'll look at that so the messages really if we look if this is behavioral if this is behavioral then we have to rely on the top end of the pyramid right that has massive implications because the lower three layers of the pyramid
15:00 - 15:30 represent more or less standardized forms of assessment like multiple choice exams like um OSCEs the top end of the pyramid represents non-standardized assessment in the dirty world out there right and that's a challenge um the standard assessment is fairly established um you know an
15:30 - 16:00 enormous amount there's a whole technology for example in terms of how to uh assess clinical skills with OSCEs um but the unstandardized assessment is really emerging is interesting is is coming up and we're starting to understand it I think and the messages that come from that validity research is that there's no single bullet that can do it all right there's not a single assessment method that can cover the entire competency
16:00 - 16:30 pyramid in order to do a good job you need a multitude of different methods to cover the entire competency pyramid and I remember that every time maybe Dale you you also know every time that we thought of something new the key feature approach or the OSCE or um you know we thought of now we have the holy grail in assessment we never had we never had um you know there's no magical
16:30 - 17:00 bullet what we also found is that um in order to do a good job you need both you need both the standardized and the unstandardized assessment um and one complements the other then what we found uh very systematically really if if you if we talk about standardized assessment that you can do a good job in terms of its quality by developing materials
17:00 - 17:30 carefully if you take an objective structured clinical exam an OSCE you can train your assessors you can train simulated patients you can select a standard setting method you can write your scenarios review your scenarios you can do a lot you can you can sharpen the instrument before you start assessing right and if you do a good job on that in terms of quality control your instrument will be okay
17:30 - 18:00 okay in a lot of practices we don't do that and we're not okay and our assessments are pretty poor nevertheless you can make them really sharp before test administration in non-standardized assessment the instrument is not so important less important but that people are important using the instrument the way they they take the interaction the feedback
18:00 - 18:30 seriously is really determining the utility of non-standardized assessment simple tick boxes hardly have any information value won't really help so it's really in the people where your concern is and that is also a headache because you now not have to sharpen the instrument but you have to sharpen the people which is a
18:30 - 19:00 challenge a big challenge right so these are the messages from from validity research some of the major messages let's look at reliability reliability is the reproducibility of the findings that I have with my instrument um if I were to measure your length and I would do it twice I would expect a high relationship between the first and the second measure that is
19:00 - 19:30 consistency that is reliability that's called reproducibility we can estimate that for our assessment instruments and we've done so many studies have been published in the literature and let me give you an overview um basically from the left um we're low in Miller's pyramid and we move up to the right multiple choice short essays patient management problem which
19:30 - 20:00 was meant for assessing clinical reasoning you gave simulated encounters on paper uh simulated problems and um the responses were being scored oral exam very subjective the long case in the British tradition you have a patient and then afterwards you have an oral on that particular patient uh the OSCE the objective structured clinical examination the mini-CEX where you look
20:00 - 20:30 in where you observe in actual practice and you sit down with the learner and give that learner feedback uh the practice video assessment where we have a video uh recording encounters in actual practice and then we select encounters and score them and finally incognito simulated patients standardized patients these are fake patients that we sent out to doctors' practices um incognito uh patients the doctors were not aware they were visited
20:30 - 21:00 by a fake patient really we had ethical consent and we sent letters to the physicians saying they're going to be visited by a simulated patient in the next coming half year and we gave them forms so that when they thought they had detected a simulated patient they could fax us and we got the forms but not with the simulated
21:00 - 21:30 patients um but actually this is very unobtrusive assessment there's no stage effect the person being assessed is not aware that he or she is being assessed right then you can calculate reliability and I won't go into the technique of that um and they say that you need a reliability of at least 0.80 in order to do a good job particularly
21:30 - 22:00 in a summative context so reliability of 0.80 is minimum then the reliabilities have been standardized for time so that you can compare across different methods of assessment so what do you see what's your conclusion you need pretty long testing
22:00 - 22:30 time you need quite some testing time in order to come up with a reliable score what's behind this is that any performance measured by any method doesn't really matter is highly variable all depending on the context so if you change contexts you get different performance and this has been called the content specificity of clinical
22:30 - 23:00 expertise uh and it's been not only been found in medicine it's been found anywhere so the variability of the performance across different items essays OSCE stations um really determines a lot of noise in the measurement therefore you have to sample widely across items orals stations in order to do a good job so this is bad news most of our tests in actual practice are simply unreliable or yet in other words we take
23:00 - 23:30 a lot of false positive and false negative decisions simply because of measurement error um so basically one measure is no measure um so that is really bad news but there's also good news in this slide the good news is there's not much difference between
23:30 - 24:00 methods you know and that's interesting because more objectified methods do just as well as more subjective methods right so objectivity is not the same as reliability you can have very unreliable assessment with objective testing and you can have very reliable testing with subjective
24:00 - 24:30 assessments provided that you sample widely so many subjective judgments make up a pretty reliable inference and that is good news remember these hard-to-define competencies in order to assess them we need some form of professional judgment we need some form of holistic
24:30 - 25:00 judgment yes that's more subjective and the point is so what you can have many subjective judgments and you'll still come up with a pretty reliable inference okay so that's really good news because all these competencies these generic competencies can only be assessed with professional judgment and over the many years I think that uh um we've been removing
25:00 - 25:30 professional judgment from the assessment scene right let me give you an example orals were considered to be very subjective long cases very subjective and these were the standard assessment tools for assessing um clinical performance in the old days and because of their subjectivity the OSCE was proposed and got very popular I mean there's not a single medical school around the world not using OSCEs
25:30 - 26:00 nowadays and Harden who proposed the OSCE look at the term objective structured clinical examination now many years later we know it's not in the objectification nor in the standardization that brings you measurement information but in the sampling and luckily I got my hands on very subjective exams like the long case and the oral exam and they show
26:00 - 26:30 reasonable or just as poor all the way how you look at it glasses are half full or half empty um well you know there's not a single instrument that is inherently better because it's more objectifying which I think is great news we uh developed an electronic portfolio we're sitting on a lot of data um from postgraduate uh
26:30 - 27:00 training and um here's um reliability expressed as a number of samples um in this case from the mini-CEX and this is assessment of technical skills this is multisource feedback and basically what it says there's a magical number of eight sort of if you have a sample of eight you'll do pretty well okay and that's a pretty feasible sample right and if he were to judge me and he's very
27:00 - 27:30 hawkish and and and she judges me she's lenient and he's hawkish hawkish lenient lenient and that will nicely average out across the sample bringing a good inference right and actually if you combine those methods you even need smaller samples so if you combine information your sampling even becomes
27:30 - 28:00 less right so some of the messages yes acceptable reliability is only achieved if you sample widely across contexts at least but also across cases and assessors or any other uh um factor that influences your measurement no method is inherently better anything may go whether old whether new doesn't really
28:00 - 28:30 matter objectivity is not the same as reliability which is a phenomenal Insight I think and many subjective judgments form a pretty robust picture clear okay then let's look at learning impact um they always say that assessment drives learning Learners will do anything to do well on the
28:30 - 29:00 assessment for learners the assessment the exams really represent the curriculum and you know the learners will do whatever you want them to do if you want them to memorize things they'll memorize things if you want them to learn checklists by heart because you're assessing them on checklists in OSCE stations they will learn checklists by heart they will do
29:00 - 29:30 anything and like anyone now we are a homo economicus we wish to have maximum effect with the least of effort but the relationship is complex there's a lot of negative effect of the assessment on learning you know learners hunting for grades um creating a grade culture a competitive culture instead of a cooperative
29:30 - 30:00 culture I see a lot of reductionism in assessment let me give you an example of my own training program I was trained in a semester system at the end of the semester we had a bunch of exams um in the beginning of the semester I did nothing I partied I met my wife which was which was beneficial
30:00 - 30:30 because the exams required little more than reproduction regurgitation of factual knowledge I'm describing your training program as well um I naturally am a boy so I procrastinated too long and then I worked my butt off to prepare for the exam many hours per day I went to the exam I couldn't do all of
30:30 - 31:00 them properly so I scouted the number of them I mostly passed never I only failed one exam I had a good memory in that time um you know I only failed once when I finished the exam I wiped out my hard disk entirely to move on to the next exam and I see all you nodding and and and and and recognizing this this learning style
31:00 - 31:30 well I'm not sure if you're familiar with the forgetting curves in psychology but after 2 weeks 50% is gone and it will go down even to 25 30% so there's a lot of reductionism uh I think in and the the standard model of assessment is that you have a course and at the end of the course you have an exam and you know kind of the art of throwing away information because you
31:30 - 32:00 you have a standard um you pass or you fail you don't know what the performance is behind it you pass or fail um if you if if you pass you're sort of immune for life because you get your credit points that's water under the bridge move on to the next one if you fail we take a silly measure we don't look at what the problem is we simply say retake the test and if you then retake the test and
32:00 - 32:30 you would fail again we don't look at the problem we simply say repeat the course uh so we we kind of blindly use our um information system or you get a grade well let me tell you the poorest form of feedback is a grade if you were to grade me on a 10-point scale you would give me a seven fine what should I do next yeah get an
32:30 - 33:00 eight but how so you know um particularly if we talk about complex skills grades don't have any information value um so I see a lot of reductionism with this whole oology all over the world I see a lot of monkeys doing
33:00 - 33:30 tricks not really knowing what they're doing simply passing the test um and I see few longitudinal elements and we were talking about the longitudinal development of skills I see very few longitudinal evaluation systems it's all block and the block finish up block so on so forth and I think this doesn't belong that kind of a view on
33:30 - 34:00 assessment aligns with an outdated model of learning which is Mastery learning right which is very behavioristic once you've learned it you sort of staple everything on top of each other and it's um the responsibility of the learner to integrate that and to be able to use that information when confronted with professional tasks it doesn't work that way it doesn't work
34:00 - 34:30 that way in order to learn properly you know we construct our own world um and we we do that much more with with with self-directed learning um and that's a very different model that's a more constructivistic model on learning you're using PBL PBL is a constructivistic approach to to learning yet probably you still have a
34:30 - 35:00 behavioristic system of assessment you know I never understood that you can tell students go and self-direct yourself and define your learning objectives but here at the end of the course you have to jump through this hoop that's conceptually not clear to me right so we need to have a different model of assessment that better aligns with our modern view on learning so I would say that the
35:00 - 35:30 messages that come from the learning um research on assessment that we should have feedback in the system information in the system so no assessment without feedback in my view but we've learned a lot particularly for complex skills quantitative information is not so useful and we have a fantastic tool namely
35:30 - 36:00 language let's use it narrative information carries a lot of weight and has a lot of impact and then we've understood partly also through the research of Joan that uh feedback giving feedback alone is not sufficient a lot of the feedback is simply you know ignored by learners particularly in summative
36:00 - 36:30 settings so if in a summative setting a learner passes a test doesn't care about feedback you know I I passed so what's the point uh so feedback really should be a dialogue a two-way process a process which is supported which is scaffolded which is um which is followed up upon so the dialogue is really important and then we need longitudinal
36:30 - 37:00 assessment much more than we than we use than we currently do so these to me are the messages from the three um characteristics that we could look at instruments and with that in mind we may shift back and think more theoretically with these messages behind us I mean I think what I've just demonstrated to you that from any
37:00 - 37:30 angle any individual assessment has severe limitations and that every individual assessment severely compromises on any of these quality criteria you will never have a perfectly reliable test you will never have a perfectly valid test you know um it's only through different tests and different assessments that you can cover the whole and to me the implications from
37:30 - 38:00 the reliability research is you need to have many different assessments the validity research says you need a multitude of different methods the learning impact research says we need to have information meaningful information for the learner and to me that's the basis for programmatic assessment um and the curriculum I think is a good metaphor I was I was in the beginning I told you modern curricula are
38:00 - 38:30 governed you know are planned are evaluated um and I would suggest the same for assessment where you do things deliberately you want the learners to present here to analyze there to orally defend there to write over there and so on so forth um and you use any sort of method I don't care I'm agnostic in relation to method
38:30 - 39:00 of assessment it all depends how you use it and what your educational justification is for that method at that particular moment in time but the question is how do you do that how do you then do that the literature is only about individual methods of assessment and it's particularly reliability and validity aspects so we started uh with um a uh a guy Dijkstra formulating a set of um
39:00 - 39:30 guidelines they're out there a bit too much I think 73 is a lot um you may also know AMEE they have um an initiative called ASPIRE in which schools can have um an extra qualification in certain areas assessment is one and they've published nice criteria on the assessment program as a whole but I take a more theoretical
39:30 - 40:00 perspective um to me every assessment is but one data point right the little triangle middest triangle but one data point any individual assessment is looking at an individual through a keyhole through the door right and in my view every individual assessment every data point should be optimized for learning should
40:00 - 40:30 be meaningful for learning should therefore have feedback to the learner right and it should be variable it should be it should vary in in format all depending on the educational justification for using that method at that particular moment in time and I would replace the summative formative debate by a continuum of stakes what is at stake here low stake decision or no
40:30 - 41:00 decision or a high stake decision and I think the number of data points should be proportional related to the stake of the decision right uh if there's a low stake decision doesn't really matter huh uh you can use a single data point if you have a very heavy high stake decision you need a lot of data points to make it get more concrete um low stake one single data
41:00 - 41:30 point is geared optimized for learning meaningful to learning giving feedback right it's not decision oriented you do not pass or fail you simply give information tough because if you tell your teachers um you cannot fail your students anymore Your Role is not to pass or to
41:30 - 42:00 fail your role is to provide information the decision will be taken elsewhere when we have a lot more information if we have more information we may have intermediate decisions intermediate decisions where you diagnose the learner what's going on what are your strengths what are areas for remediation and Remediation is very different from the classical retake
42:00 - 42:30 exam remediation is personalized uh remediation is is done with the learner um himself and very individualized um if there's a a high stake decision for example at the end of the year uh for promotion to the next year or maybe even certification you need a lot of data
42:30 - 43:00 points but if you have a lot of data points with Rich information you can have a good picture of an individual so you can also look at this each individual assessment is but a pixel right and if you have a single Pixel you won't see a lot right if you have more pixels you see more any
43:00 - 43:30 idea ah the Mona Lisa he got it he got it yeah yeah good very good very good so you building a picture of a learner you use information you diagnose you remediate you use the information to
43:30 - 44:00 learn um and at the same time to take decisions one final point I'm going to make uh that is in the classical approach we tend to aggregate information within a method within methods right and then we have multiple methods and we then aggregate that information again um let's take an OSCE where you have a station on history taking and communication and then the next station
44:00 - 44:30 is on resuscitation and then you have a bunch of other stations and you aggregate the information across all stations into a final outcome now I don't know what resuscitation and history taking have in common probably nothing but we simply add it up because it's part of the OSCE and I think we can much more meaningfully aggregate across skills so I would be in favor
44:30 - 45:00 to aggregate the information in a station on communication with information on a multisource feedback on communication which is a much more meaningful way of aggregating information so these competency Frameworks are pretty handy for that because you can structure your assessments in relation to these competencies and more meaningfully aggregate information now I could go on for a long
45:00 - 45:30 time um put this into the literature with a model which explains things a lot deeper um and more detailed but I won't go into much more details and recently I published the 12 tips paper in Medical Teacher um but you know this is still this is still theory and I think I better give you an example of how this may work in actual
45:30 - 46:00 practice and there are a bunch of examples now out there using programmatic assessment and actually I must add one to the list as of today because I heard from Tom that their program in Family Practice in general practice is completely utterly extremely um programmatic and I think with great
46:00 - 46:30 success um they don't do anything quantitatively anymore they do everything qualitatively and that's interesting because assessment is often associated with quantification with scores they only deal with information with narrative information so we move from a world in assessment from scores to a world of Words which is a very interesting very
46:30 - 47:00 paradigmatic change in assessment I also came across um a paper just recently from McMaster um from the emergency medicine doctors there and they also use programmatic assessment and I was impressed um forgive me I'm going to stay home as much as possible because I know that program best when I talk about our graduate entry program in medicine in Maastricht in Maastricht we have two medical programs
47:00 - 47:30 one is a six-year undergraduate medical program three years bachelor's three years of master um fully problem based 340 students per year um and we have a graduate entry program in medicine where people have a relevant Bachelor or master degree before they enter medicine and they do they have a shortened version of it four years and on top of that they uh do a
47:30 - 48:00 lot of research and they get a master of science in research that's a very full program um it's focused on excellence you know we want only excellence we have very high expectations of the students also a PBL curriculum uh the PBL of the first year is based on paper patients simulated patients in the skills lab in the second year the PBL
48:00 - 48:30 starts with a patient a real patient which they see in an outpatient clinic so um the actual stimulus for learning is a real patient in the second year then clerkship rotations and then half a year of research and half a year of uh um a clerkship in one of their um the whatever they like and that may be connected so the research and the clerkship rotation may be connected to each
48:30 - 49:00 other uh the assessment program is that we do a a module assessment like anyone would do um but there's a lot of variation we might also have very many assessments across one problem giving a feedback on this particular problem um we might have presentations we have um um assignments which are being evaluated so they're very variable we also have a lot of
49:00 - 49:30 longitudinal assessment um for example we do a lot of peer assessment on on uh the competencies um peer assessment and tutor assessments um which is very qualitative uh I mean very narrative we use a lot of words instead of scores um we also have progress tests which is a a multiple choice test but that's a multiple choice test that is
49:30 - 50:00 sort of a final examination through all the disciplines of the curriculum and that is being held every three months and given to all the students in the curriculum so when a student comes in first year first final examination is given in the first week of his training program that learner won't be able to answer many questions so second year a little bit more and so on so forth and you can't prepare for Progress tests
50:00 - 50:30 because what would you prepare right anything could be asked so you simply do your work in the tutorial groups and you'll find yourself growing automatically kind of you don't need to cram you know don't need to be anxious you'll grow automatically towards a higher level and also it's focused on functional knowledge I mean if you learn your anatomy in the first two years you still have to answer anatomy questions at the end of your
50:30 - 51:00 training we do a lot of peer assessment again on professional behavior and other competencies and all the assessment is informative and low stake you cannot fail a single assessment okay the portfolio is the central instrument um and the portfolio to me is the learner chart very comparable to a patient
51:00 - 51:30 chart where all the information is being gathered and in which um the learner reflects on his development and discusses that with someone else this is what it looks like in a graph um also very important to say is that each of the Learners will have a mentor that will follow that learner throughout all four years we call them counselors I
51:30 - 52:00 hate that term um but they're coaches they're mentors um and they have a very intimate relationship with the Learners they are the dialogue that is created around the assessment uh Learners reflect remediate discuss that everything with their mentor the mentor follows uh up them at the end of the year there's a
52:00 - 52:30 meeting where final decisions are being made and a summative decision is being made a high stake decision is being made I should formulate it in a different way we put in a lot of effort in terms of giving feedback here's a um an example of um um progress test information you can go online and look at your total scores longitudinally across time this student is not too good in the
52:30 - 53:00 beginning and has seen the light probably in the third year and started to perform better and then actually the last part the open part you'll see the computer projecting future results based on your past results and you can do this for your total scores you can do this for your Anatomy scores you can do that for your total basic Sciences scores um for any sort of query that you might have and you compare that to the
53:00 - 53:30 performance to the whole cohort and in the Netherlands we collaboratively do this kind of testing with five out of the eight medical schools collaboratively so you can also compare you every three months to a national average we also provide um good information on um the competencies we have adopted in the Netherlands the CanMEDS competencies your competencies and they are actually part
53:30 - 54:00 of law so every curriculum has to map its curriculum to the CanMEDS competencies we assemble information from different sources and different instruments and we aggregate that into the competencies and here's an overall chart the spider chart in which you can see individual performance related to cohort performance so they can compare they see where they are if you look at work-based assessments they're into the system and
54:00 - 54:30 you can um see this these Assessments in time you can click on each of the dots and you will get all the quantitative but also all the qualitative information in relation to that particular moment of assessment so you have a very rich database on the learner quantitative qualitative then I cannot stress
54:30 - 55:00 sufficiently how important the mentors are extremely important um and these mentors are just regular teachers it's a regular teaching role and it's actually very much valued uh because you have a very direct relationship with Learners so this is a gratifying teaching role and then and the decision making is done by a committee so ultimately a committee will take a
55:00 - 55:30 decision in that committee the the the mentor will not have a role because we wish to protect the relationship between mentor and mentee right and you can't be a judge and a helper at the same time the work of the committee it sounds like an awful amount of work but it isn't because you know 95 98% of the cases are straightforward you know that
55:30 - 56:00 do well and it's only on a few percent of the cases that they really have to deliberate right and think long and maybe even gather additional information before they come to a decision you think that's subjective yes it is another professional judgment which is required because you have quantitative information qualitative information you can't average
56:00 - 56:30 qualitative information it requires another professional judgment then how do you know it's not biased well we take all kinds of measures so that that judgment isn't biased procedural measures and let me let me be more clear for example the size of the committee will matter a larger size will uh will matter in terms of the credibility of
56:30 - 57:00 the decision that will come from that the amount of deliberation will matter will build to the trustworthiness of the decision the fact that the outcome and prior feedback Cycles the outcome of the committee's decision is no surprise to the learner builds to The credibility of the decision the fact that you can justify your judgment builds to The credibility of
57:00 - 57:30 the decision the fact that you can Appeal on the committee justifies and builds to The credibility of the decision and this is basically based on inspired on methodologies that we use to make qualitative research more robust so we use qualitative strategies to remove as much bias from the decision as possible but we don't replace the
57:30 - 58:00 professional judgment imagine you're a GP imagine that we would take away his professional judgment from his practice that will be that would be impossible you know he couldn't wouldn't be able to do his work is his judgment biased yes it is that's why we have guidelines and support systems and
58:00 - 58:30 second opinions but we don't take away your professional judgment and I think we should do the same in assessment and I think we have been removing professional judgment all for the wrong reasons in education often the whole assessment business is um has a very strong psychometric discourse which is all about standardizing and removing
58:30 - 59:00 bias and I think that we have thrown away the baby with the bath water and we need that professional judgment back into the system right just to brag if we compare the performance of our graduate entry programs with our undergraduate students and then look at the last four years of the six-year program and compare uh across the two training programs then the graduate entry program students start lower than the six-year
59:00 - 59:30 undergraduate students but they end up very high at the end and actually this doesn't look like a big difference but it's an effect size on the scoring scale which is substantial and I've never seen that before naturally causality is a problem I can't say this is because of programmatic assessment many variables probably
59:30 - 60:00 behind this but at least as you see programmatic assessment may work may work well right naturally the next stage Tom is to go back to research right and see how things function in programs that have used programmatic assessment and understanding why things
60:00 - 60:30 function or not function and then readjust the model again it's sort of a iteration between practice and research which is called design-based research uh which I think is important and first studies are coming out um time doesn't permit me to go into that let me come to some conclusions I think we have to stop thinking in terms of individual assessment methods right we need a systematic approach a more
60:30 - 61:00 um a more educationally oriented approach that is longitudinally oriented um in our education programs every method is functional Let me Give an example our students like oral exams because you know they have Agency on the exam they can show what they can do do they like oral exams professional judgment is
61:00 - 61:30 absolutely essential otherwise we can never make it really relevant and subjectivity is dealt with through sampling as I showed you earlier or through all kinds of professional or procedural bias reduction measures that deals with that bias but we don't remove the professional judgment and I think that programmatic assessment therefore optimizes both the
61:30 - 62:00 learning function of assessment as well as the decision-making function of assessment and I think that is an integrative all that fits that is fit for purpose and that is the end of my journey thank you very much [Applause]
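[Editor's illustration, not part of the spoken lecture] The sampling argument made in the talk (many subjective judgments add up to a reliable inference, with roughly eight samples doing "pretty well") can be sketched with the standard Spearman-Brown prophecy formula from classical test theory. This is a minimal sketch under stated assumptions: the speaker's own figures presumably come from an analysis of the portfolio data, which is not reproduced here, and the single-judgment reliability used below (0.30) is a hypothetical value chosen only for illustration. The function names are likewise illustrative.

```python
import math


def spearman_brown(single_reliability: float, n_samples: int) -> float:
    """Predicted reliability of the average of n parallel judgments
    (Spearman-Brown prophecy formula)."""
    r = single_reliability
    return n_samples * r / (1 + (n_samples - 1) * r)


def samples_needed(single_reliability: float, target: float) -> int:
    """Smallest number of judgments whose average reaches the target
    reliability, obtained by solving Spearman-Brown for n."""
    r = single_reliability
    n = target * (1 - r) / (r * (1 - target))
    # Small tolerance so floating-point noise does not push an exact
    # integer result up to the next whole number.
    return math.ceil(n - 1e-9)


# Hypothetical reliability of one subjective judgment; NOT a figure from the talk.
R_SINGLE = 0.30

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 12):
        print(f"{n:>2} judgments -> predicted reliability "
              f"{spearman_brown(R_SINGLE, n):.2f}")
    print("judgments needed for 0.80:", samples_needed(R_SINGLE, 0.80))
```

Under this assumption a single judgment is a poor basis for a decision, while predicted reliability climbs steadily as judgments accumulate, which is the pattern the speaker describes. The formula treats judgments as parallel and independent, so it is a simplification of sampling across real assessors, cases, and contexts.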
62:00 - 62:30 thank you very much I'm not sure that we have time for for formal questions but does anyone have a a burning a burning question I invite you to a reception afterwards where certainly you can uh talk with case individually any question that is burning on the tip of anyone's tongue no they don't dare yes Diane can you hit your hit your
62:30 - 63:00 button it's it's kind of a I mean it's it's more talking about back to the people that you're talking about it depends on who's working and who's doing these things so um do you have a sense of it's maybe it's a generational kind of question so the people who are now putting the programmatic assessment in may have been trained in a certain methodology and now you're enacting a completely different one and so in any of this how do you account for well we learned it this way but I we think it's a diff it's a better way and and has
63:00 - 63:30 there been application of well they learned it differently but now the outcomes are are the same or different I it was just kind of curious in terms of well you know this the moving towards programmatic assessment reminds me of moving towards pbl and I I been in that business for many years and um the same arguments hold there um you know when we we don't know what pbl is and what it may do and whether it's beneficial and and our Learners are not used to that and when
63:30 - 64:00 they come in you know they don't understand that and it has been a struggle to convince people because this is about convincing people and not about the methodologies is there but you know it's it's it's about convincing it's a shift in thinking it's a mindset change and that's difficult to achieve um but that doesn't discharge us of trying to do it and I actually never worry about learners you know uh
64:00 - 64:30 in medicine we get the crème de la crème anyway they'll do fine despite our education and um so I think and I never worry about learners they're very flexible they're adaptive I never ever worry about learners I worry about teachers and uh it's it's it's it's getting a new behavioral repertoire and that's not easy and I think teachers learn in
64:30 - 65:00 exactly the same way as students do by experience and um so simply telling them won't do it they have to experience it we did a course on programmatic assessment a couple of weeks ago and uh in the first day we had our um people coming to the course interview the undergraduate students and interview The Graduate entry students they were convinced you know they were completely convinced The Graduate entry students
65:00 - 65:30 are feedback Seekers they they want to know and they look for it and they're on top of it they're on top of their learning and that convinced them so I I should have we should have given the cut to a group of students who I should have brought that would have convinced you a lot more than my fancy thank you thank you thank you very much so going to invite Dale DNE just to
65:30 - 66:00 say h a couple of words you can see that Dale uh was one of the first lectures in the Cudmore series would you like to yes do you want to just yeah do it there if you'd like thank you Joan it was many years ago uh I have different color hair and I had hair case I've known you for many many years and had the pleasure of working with you some interesting collaboration in many places we all come to respect that Maastricht
66:00 - 66:30 is a special place it's a relative it's an old school but a relatively new university and they are amongst the top 40 in the world now amongst the new universities for many reasons uh you've shown us the way in so many things you've had so many wonderful colleagues some of your own tutors who've helped us redirected us and you personally have meant a lot to many people in this room but again I've never ever heard you come or you didn't make me think very carefully about
66:30 - 67:00 everything I do and think about and in this era of social accountability and responsibilities this has been a terrific way to stimulate all of us as teachers to move things ahead thank you so much again thank thank you so please join us if you can for the reception outside and case is offering a workshop tomorrow and as part of the Education Institute on Thursday so hope
67:00 - 67:30 to see some of you there