CoP, Community of Practice; HCP, Healthcare professionals; HPE, Health professions education; SCT, Social cognitive theory; SLT, Social learning theory; SRL, Self-regulated learning; NCSES, Nursing Competence Self-Efficacy Scale; NE, Nutrition Educator; NPT, Near peer teaching; PPR, Pharmacy practice research; RN, Registered nurses.
Five studies in this scoping review focused on utilizing Bandura's SLTs in the teaching, learning, and assessment of health professions students ( 34 , 36 – 38 , 40 ). The use of Bandura's SLTs in the included studies suggested advantages in improving students' self-efficacy and confidence, collaborative learning, learning experiences, future teaching experience, and career research intentions.
In 1977, Bandura proposed SLT based on a series of human behavioral studies ( 24 ). According to Bandura, learning takes place in social settings and occurs not only through an individual's own experiences but also by observing the actions of others and their consequences ( 24 , 43 ). Social learning is also referred to as observational learning because learning takes place as a result of observing others (i.e., models), which Bandura's previous studies demonstrated to be a valuable strategy for acquiring new behaviors ( 44 ). Bandura and his colleagues continued to demonstrate that modeling/observational learning is a highly efficient method of learning ( 44 ). Bandura's theorizing of the social development process later incorporated motivational and cognitive processes into SLT ( 44 ). In 1986, Bandura renamed his original SLT as SCT to emphasize the critical role that cognition plays in encoding and performing activities ( 44 , 45 ). SCT suggests that learning occurs in a social context through a dynamic and reciprocal interaction of the person, environment, and behavior ( 25 ). The core constructs of SCT include modeling/observational learning, outcome expectancies, self-efficacy, and self-regulation ( 25 , 44 ). Bandura's observational learning consists of four stages: (1) attention: learners observe the behavior they want to reproduce; (2) retention: learners retain the behavior they have seen, entailing a cognitive process in which they mentally rehearse the behavior they wish to replicate; (3) reproduction: learners put the processes acquired in attention and retention into action; and (4) motivation: learners imitate the observed behavior through reinforcement (direct, vicarious, or self-reinforcement).
Based on Bandura's argument that human behavior is learnt via interactions with, and modeling of, others in social contexts, Carroll et al. ( 36 ) applied the four stages of observational learning to investigate the effectiveness of GoSoapBox, a student response system (SRS). The study demonstrated the effectiveness of this online tool in stimulating discussion of controversial topics and improving learning experiences and in-class engagement among paramedic, psychology, nutrition and dietetics, nursing, and public health students.
Carter et al. ( 37 ) focused on the self-efficacy, outcome expectancy and social influence components of SCT to develop and test a model that evaluates undergraduate pharmacy students' intentions to pursue a higher pharmacy practice research (PPR) degree. The authors suggest that educators must provide links between practice and research and increase student self-confidence to undertake PPR, thereby increasing interest in this as a future career path. This is because exposure alone has minimal influence on a student's interest in PPR as a career.
Irvine et al. ( 38 ) explored the strategies of self-regulated learning (SRL), a learning model situated in SCT, utilized by final-year nursing students in both their approaches to learning and their practical teaching sessions (peer-teaching). The study findings support the use of SRL in nursing education, as highlighted by the high level of motivational behaviors and learning strategies reported among undergraduate nursing students in their approach to learning and their roles as peer-teachers.
Kennedy et al. ( 40 ) used the construct of self-efficacy to develop and psychometrically assess a scale that examines undergraduate nursing students' self-efficacy in practice competence, and to assist educators in determining the level of education that students receive, as well as their level of confidence and advocacy for positive changes.
Furthermore, Koo et al. ( 34 ) indicated that using the self-efficacy construct to develop a formative standardized patient experience allowed nursing students to develop inter-professional collaborative communication, and enhanced their problem-solving and communication skills, as well as their clinical competency.
The CoP theory consists of three key components: the domain (the common interest among all members), the practice (the implicit and explicit knowledge shared), and the community (made up of mutually beneficial interactions between experts and learners leading to learning, engagement, and identity development) ( 10 , 46 – 48 ). All the articles retrieved in this review described a CoP as a group of people who share similar characteristics and collaborate toward a common goal, therefore enhancing mutual learning through sharing relevant knowledge and fostering the development of a shared identity. Three of the studies implemented CoP theory with a focus on teaching and learning among health professions students, and one with a focus on HPEP curricula design. All studies indicated that implementing the CoP learning theory enhanced student learning, collaboration, and identity.
Alsio et al. ( 39 ) found that when CoP theory was used to create teams of practicing nurses, physicians, and undergraduate medical students with a mandate to develop learning activities during clinical placements, learning was stimulated through self-reflection and consideration of one another's perspectives during patient interactions. Further, inter-professional reflection was vital for the successful introduction of new students into a CoP and was effective for structural and cultural development. Moreover, staff's and students' awareness of their roles and responsibilities facilitated their motivation to participate in the CoP's implementation.
Similarly, Molesworth et al. ( 41 ) and Protoghese et al. ( 42 ) explored the experiences of undergraduate nursing students regarding their application of the CoP theory during clinical placements. Both studies argued that CoP helped students to integrate their theoretical learning of bioscience into practice ( 41 ), and to advance their existing clinical knowledge ( 42 ). Moreover, application of bioscience knowledge within a CoP facilitated effective inter-professional relationships ( 41 ). Additionally, students perceived that they received more respect, support, and feedback while learning within a CoP ( 42 ). This further emphasizes the significance of mutual engagement and the collaborative relationship component of the CoP theory in enhancing student learning ( 42 ).
Furthermore, Chen et al. ( 35 ) used CoP theory in a curricular design for the HPEP aimed at helping undergraduate medical students, residents, fellows, and learners from other HPE schools to develop their identities as future health professions educators. The program has demonstrated its effectiveness in providing learners with the knowledge and skills to realize their career aspirations. It also enhanced learners' enthusiasm for teaching and increased their interest in educational leadership, innovation, and research.
This scoping review attempted to provide an overview of how SToLs have been used in the teaching and learning of HPEPs over the last decade. This review highlighted some interesting findings that, collectively, may provide insights into how educational practices in HPEPs are shaped and influenced by learning theories.
Bandura's SLTs were applied predominantly in teaching and instruction strategies within the HPEPs. This review demonstrated the application of Bandura's observational learning model in the form of in-class integrated collaborative learning activities delivered through an online tool to improve learning experiences and engagement ( 36 ). It is argued that observational learning provides a faster and safer approach to learning complicated patterns of behavior than trial and error, making it consistent with and suitable for HPE ( 7 , 49 ). Self-efficacy, defined as an individual's assessment of their capacity to perform given tasks or activities and achieve specified goals ( 50 ), was the most highlighted construct in the included articles. This can be explained by Bandura's argument that self-efficacy is central to social learning because it significantly impacts a wide range of human endeavors, including developmental and health psychology, education, and the workplace ( 19 ). The findings suggest that the self-efficacy construct is beneficial to learning outcomes, particularly in simulation contexts, as demonstrated in the review conducted by Lavoie et al. ( 51 ). This aligns with previous literature on the self-efficacy construct indicating that individuals with stronger self-efficacy for certain tasks are more motivated to execute them ( 50 , 52 ). Furthermore, the self-efficacy construct was used to develop an assessment tool that evaluates students' competence, confidence level, and advocacy for positive changes as they become professional nursing practitioners ( 40 ). In this context, it is worth mentioning that assessment tools based on self-efficacy found in previous health-related literature are task-specific ( 53 , 54 ). Previous literature has also argued that feelings of confidence among medical students are associated with competence and proficiency ( 55 , 56 ), and that lack of confidence leads to nurses leaving the profession ( 57 ).
Moreover, clinical educators' self-efficacy and confidence are critical to their ability to carry out their teaching and training responsibilities as they affect student achievement and patient outcomes ( 58 ).
In this review, CoP theory was mainly employed in the teaching and learning of health professions students, educators, and providers to improve learning, collaboration, and identity. However, as highlighted by Hörberg et al. ( 59 ), it would be better used to identify team challenges and provide more meaningful interventions. It is noteworthy that none of the included studies highlighted any long-term benefits of CoP, aligning with Allen et al.'s ( 60 ) argument that there is a paucity of health professions studies exploring the long-term effect of CoP on individuals and the relevance to educational outcomes. Additionally, several studies in healthcare education and practice indicated the scarcity of studies that focus on the development and assessment of CoPs ( 10 , 61 , 62 ).
This review highlights a scarcity of research focusing on the application of SToLs in the development, validation, and conduct of assessment activities within HPEPs. Only one study used the self-efficacy construct to develop a tool for assessing student competence ( 40 ). This is consistent with a recent literature review suggesting that SToLs are applied to assessment activities less often than other learning theories, such as humanistic theories or motivational models ( 13 ). This is despite evidence of the utility of CoP learning theory in planning and implementing effective assessment measures in the PharmD program ( 20 ).
The current review suggests that the application of SToLs in designing HPEPs' curricular content, learning objectives, or syllabi, or in influencing educational competencies, is also not common. In this regard, Mukhalalati and Taylor proposed a novel CoP theory-informed framework that can be used in designing a new HPEP to reduce the disconnect between educational practice and learning theories ( 10 ). The authors suggest key components to consider when developing a CoP-based curriculum, including, but not limited to, complementing formal with informal learning, transferring tacit knowledge into explicit knowledge through socialization and externalization, re-contextualizing knowledge, and aligning students' learning needs with learning activities ( 10 ). These components are compatible with several SToLs and are claimed to be applicable in various HPEPs ( 10 ).
An important observation in this review was the exclusion of a large number of retrieved articles because they failed to describe how SToLs are implemented in educational practices and in delivering educational goals ( 63 ), or because they used SToLs as a lens to explore HPEPs' teaching and learning practices, or as a theoretical framework to conceptualize or analyze HPE research data ( 64 – 66 ). This aligns with previous research that highlighted the significance of using theories to enhance research rigor and its relevant outcomes ( 67 ). However, it has also been suggested that learning theories be used to critique HPE and guide its advancement initiatives ( 68 , 69 ). Furthermore, several excluded studies utilized SToLs for healthcare professionals' continuing professional development ( 70 – 75 ), which appears to be a common application of SToLs. Although examining SToLs' utilization in continuing professional development activities was not the aim of this review, this aspect is extremely important as it indirectly influences students who will ultimately become healthcare professionals. Collectively, the small number of eligible studies included in this review that applied SToLs in HPEPs suggests a disconnect between SToLs and HPEPs' educational practices. It is argued that it is challenging for HPEP educators to apply educational theories because they received minimal or no training about their significance and implementation ( 5 ). Therefore, as recommended by previous research, a collaborative reform initiative should be enacted to enhance the optimal use of SToLs in educational practice and to examine the applicability and usefulness of other theories of learning in HPEPs ( 20 ). Moreover, this review did not include studies from Africa, the Eastern Mediterranean, or South-East Asia, suggesting that exploratory and experimental educational research utilizing various learning theories is highly warranted in these regions.
This review explored SToLs use in HPEPs and provided a valuable overview for educators in a broad range of health education fields. The included studies were conducted in various countries, which further enhances the applicability of the results to other contexts. However, a number of limitations should be acknowledged when interpreting the findings of this review. For example, the review was limited to four databases and to the last decade, potentially missing relevant articles in other major databases such as Scopus and Web of Science and those published before 2011. Moreover, as is inherent to scoping reviews, a quality assessment of the included articles was not conducted, necessitating caution in interpreting conclusions. Additionally, since SToLs can be categorized and named differently, relevant articles may have been inadvertently omitted.
This review provides an overview of the application of SToLs in HPEPs from 2011 to 2020. Only two SToLs were identified: Bandura's SLT and SCT, and Lave and Wenger's CoP theory. Bandura's four-stage model of observational learning, as well as the self-efficacy construct, were applied in the included studies. CoP theory was mainly employed to improve learning, collaboration, and identity, whilst SToLs use overall was predominantly focused on teaching and learning, with less focus on assessment and curriculum design. This review demonstrated a limited number of HPEPs applying and reporting an application of SToLs, despite the significance of the social aspect of learning concepts in those theories and within HPEPs. This suggests a potential disconnect between SToLs and HPEP educational practices. Nonetheless, this review illustrated the successful and effective implementation of SToLs in various HPEPs, which is applicable to other HPEPs. Finally, this review supports the call for collaborative reform initiatives to optimize the use of SToLs in HPEPs' educational practices. Future research should focus on the applicability and usefulness of other theories of learning in HPEPs and investigate the long-term outcomes of theory implementation.
Author contributions.
All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.
Open Access funding is provided by the Qatar National Library.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2022.912751/full#supplementary-material
Copyright © 2015-2024 ACADEMY PUBLICATION — All Rights Reserved
BMC Medical Education, volume 24, Article number: 927 (2024)
The disruption of health and medical education by the COVID-19 pandemic made educators question the effect of the online setting on students' learning, motivation, self-efficacy, and preference. In light of the healthcare staff shortage, scalable online education seemed relevant. Reviews on the effect of online medical education have called for high-quality RCTs, which are increasingly relevant with rapid technological development and the widespread adoption of online learning in universities. The objective of this trial is to compare standardized and feasible outcomes of an online and an onsite setting of a research course regarding the efficacy for PhD students within the health and medical sciences: primarily on learning of research methodology, and secondly on preference, motivation, and self-efficacy in the short term and academic achievements in the long term. Based on the authors' experience of conducting courses during the pandemic, the hypothesis is that outcomes in the student-preferred onsite setting differ from those in the online setting.
Cluster randomized trial with two parallel groups. Two PhD research training courses at the University of Copenhagen are randomized to an online (Zoom) or onsite (The Parker Institute, Denmark) setting. Enrolled students are invited to participate in the study. The primary outcome is short-term learning. Secondary outcomes are short-term preference, motivation, and self-efficacy, and long-term academic achievements. Standardized, reproducible, and feasible outcomes will be measured by tailor-made multiple-choice questionnaires, an evaluation survey, the frequently used Intrinsic Motivation Inventory, the Single Item Self-Efficacy Question, and Google Scholar publication data. The sample size is calculated to 20 clusters, and courses are randomized by a computer random number generator. Statistical analyses will be performed blinded by an external statistical expert.
The primary outcome and significant secondary outcomes will be compared and contrasted with relevant literature. Limitations include the geographical setting; biases include lack of blinding; and strengths are robust assessment methods within a well-established conceptual framework. Generalizability to PhD education in other disciplines is high. The results of this study will have implications both for students and educators involved in research training courses in health and medical education and for the patients who ultimately benefit from this training.
Retrospectively registered at ClinicalTrials.gov: NCT05736627. SPIRIT guidelines are followed.
Medical education was utterly disrupted for two years by the COVID-19 pandemic. In the midst of rearranging courses and adapting to online platforms we, together with lecturers and course managers around the globe, wondered what the conversion to an online setting did to students' learning, motivation, and self-efficacy [ 1 , 2 , 3 ]. What the long-term consequences would be [ 4 ] and whether scalable online medical education should play a greater role in the future [ 5 ] seemed relevant and appealing questions at a time when healthcare professionals are in demand. Our experience of delivering research training during the pandemic was that although PhD students were grateful for courses being available, they found it difficult to concentrate owing to the long screen hours. We sensed that most students preferred an onsite setting and perceived online courses as a temporary and inferior necessity. The question is whether this affected their learning.
Since the internet came into common use in medical education, systematic reviews have sought to answer whether there is a difference in learning effect when students are taught online compared with onsite. Although authors conclude that online learning may be equivalent to onsite learning in effect, they agree that studies are heterogeneous and small [ 6 , 7 ], with low quality of evidence [ 8 , 9 ]. They therefore call for more robust and adequately powered high-quality RCTs to confirm their findings and suggest that students' preferences in online learning should be investigated [ 7 , 8 , 9 ].
This uncovers two knowledge gaps: I) High-quality RCTs on online versus onsite learning in health and medical education and II) Studies on students’ preferences in online learning.
Recently, solid RCTs have been performed on the topic of web-based theoretical learning of research methods among health professionals [ 10 , 11 ]. However, these studies concern asynchronous courses among medical or master's students with short-term outcomes.
This uncovers three additional knowledge gaps: III) Studies on synchronous online learning, IV) among PhD students in health and medical education, V) with long-term measurement of outcomes.
The rapid technological development, including artificial intelligence (AI), and the widespread adoption and application of online learning forced by the pandemic have made online learning well established. It now offers high-resolution, live, synchronous settings on a variety of platforms with integrated AI, options for interaction with and among students, chat and breakout rooms, and external digital tools for teachers [ 12 , 13 , 14 ]. Thus, investigating online learning today may be quite different from doing so before the pandemic. On the one hand, it could seem plausible that this technological development would make a difference in favour of online learning that could not be found in previous reviews of the evidence. On the other hand, the personal face-to-face interaction during onsite learning may still be more beneficial for the learning process; combined with our experience of students finding it difficult to concentrate online during the pandemic, we hypothesize that outcomes of the onsite setting are different from those of the online setting.
To support a robust study, we design it as a cluster randomized trial. Moreover, we use the well-established and widely used Kirkpatrick’s conceptual framework for evaluating learning as a lens to assess our outcomes [ 15 ]. Thus, to fill the above-mentioned knowledge gaps, the objective of this trial is to compare a synchronous online and an in-person onsite setting of a research course regarding the efficacy for PhD students within the health and medical sciences:
Primarily on theoretical learning of research methodology and
Secondly on
◦ Preference, motivation, self-efficacy on short term
◦ Academic achievements on long term
This study protocol covers a synchronous online and an in-person onsite setting of research courses, testing the efficacy for PhD students. It is a cluster randomized trial with two parallel arms (Fig. 1 ).
Consort flow diagram
The study measures outcomes at baseline and post-intervention. Baseline variables and knowledge scores are obtained on the first day of the course; post-intervention measurements are obtained on the last day of the course (short term) and monthly for 24 months (long term).
Randomization is stratified, giving a 1:1 allocation ratio of the courses. As the number of participants within each course might differ, the allocation ratio of participants in the study will not be fully equal and 1:1 balanced.
The study site is The Parker Institute at Bispebjerg and Frederiksberg Hospital, University of Copenhagen, Denmark, from which the courses are organized and run online and onsite. The course programmes and time schedules, the learning objectives, the course management, the lecturers, and the delivery are identical in the two settings. The teachers use the same introductory presentations followed by training in breakout groups, feedback, and discussions. For the online group, the setting is organized as meetings in the online collaboration tool Zoom® [ 16 ], using the basic available technicalities such as screen sharing, a chat function for comments, breakout rooms, and other basic digital tools if preferred. The online version of the course is synchronous, with live education and interaction. For the onsite group, the setting is the physical classroom at the learning facilities of the Parker Institute. Coffee and tea as well as simple sandwiches and bottles of water, which facilitate sociality, are available in the onsite setting. Participants in the online setting must provide their own food and drink, but online sociality is made possible by not closing down the online room during the breaks. The research methodology courses included in the study are "Practical Course in Systematic Review Technique in Clinical Research" (see course programme in Appendix 1) and "Getting started: Writing your first manuscript for publication" [ 17 ] (see course programme in Appendix 2). The two courses each have 12 seats and last three or three and a half days, yielding 2.2 and 2.6 ECTS credits, respectively. They are offered by the PhD School of the Faculty of Health and Medical Sciences, University of Copenhagen. Both courses are available to, and covered by the annual tuition fee for, all PhD students enrolled at a Danish university.
Inclusion criteria for participants: all PhD students enrolled in the courses "Practical Course in Systematic Review Technique in Clinical Research" and "Getting started: Writing your first manuscript for publication" at the PhD School of the Faculty of Health and Medical Sciences, University of Copenhagen, Denmark, participate after giving informed consent.
Exclusion criteria for participants: Declining to participate and withdrawal of informed consent.
The PhD students at the PhD School of the Faculty of Health Sciences, University of Copenhagen, participate after informed consent, obtained by the daily project leader, allowing evaluation data from the course to be used in the project after pseudonymization. They are informed in a welcome letter approximately three weeks prior to the course and again in the introduction on the first course day. They register their consent on the first course day (Appendix 3). Declining to participate in the project does not influence their participation in the course.
Online course settings will be compared with onsite course settings. We test whether the onsite setting is different from the online setting. Online learning is increasing, but onsite learning is still the preferred educational setting in a medical context; in this case, onsite learning represents "usual care". The online course setting consists of meetings in Zoom using the available technicalities such as chat and breakout rooms. The onsite setting is the learning facilities at the Parker Institute, Bispebjerg and Frederiksberg Hospital, The Capital Region, University of Copenhagen, Denmark.
The course settings are not expected to harm the participants, but should a request be made to discontinue the course or change setting, this will be met and the participant taken out of the study. Course participants are allowed to take part in relevant concomitant courses or other interventions during the trial.
Course participants are motivated to complete the course irrespective of the setting because it carries ECTS credits that count toward the mandatory number for their PhD education. Thus, we expect adherence to be the same in both groups. However, we monitor their presence in the course and allocate time during class for testing the short-term outcomes (motivation, self-efficacy, preference, and learning). We encourage and, if necessary, repeatedly remind them to register with Google Scholar for our testing of the long-term outcome (academic achievement).
Outcomes are related to the Kirkpatrick model for evaluating learning (Fig. 2 ), which divides outcomes into four levels: Reaction, which includes, for example, motivation, self-efficacy, and preferences; Learning, which includes knowledge acquisition; Behaviour, the practical application of skills back on the job (not included in our outcomes); and Results, the impact for end-users, which includes, for example, academic achievements in the form of scientific articles [ 18 , 19 , 20 ].
The Kirkpatrick model
The primary outcome is short term learning (Kirkpatrick level 2).
Learning is assessed by a multiple-choice questionnaire (MCQ) developed prior to the RCT specifically for this setting (Appendix 4). First, the lecturers of the two courses were contacted and asked to provide five multiple-choice questions, each presented as a stem with three answer options: one correct answer and two distractors. The questions were to relate to core elements of their teaching under the heading of research training. They were set up to test the cognition of the students at the levels of "Knows" or "Knows how" according to Miller's Pyramid of Competence, not their behaviour [ 21 ]. Six of the course lecturers responded, and from this material all questions covering the curriculum of both courses were selected. The questionnaire was piloted on 10 PhD students and within the lecturer group, revised after an item analysis, and revised for English language. The final MCQ contains 25 questions and is filled in at baseline and repeated at the end of the course. The primary outcome based on the MCQ is the learning score, calculated as the number of correct answers out of 25 after the course. A decrease in MCQ points in the intervention groups denotes a deterioration of learning. The minimum MCQ score is 0 and the maximum is 25, with 19 indicating a pass.
Furthermore, as a secondary outcome, this measurement will be categorized as a binary outcome determining whether the course was passed or failed, with passing defined as 75% (19/25) correct answers.
The learning score will be computed at group and individual level. The continuous outcome will be compared with the Mann–Whitney test between the learning scores of the online and onsite groups. The binary learning outcome (passed/failed) will be analysed by Fisher's exact test on an intention-to-treat basis between the online and onsite groups. The results will be presented as median and range and as mean and standard deviation, for possible future use in meta-analyses.
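As an illustration of these two planned comparisons, the sketch below runs both tests on hypothetical MCQ scores; the scores, group sizes, and variable names are invented for illustration, and the trial data will of course differ.

```python
# Sketch of the planned primary-outcome analyses on invented MCQ scores (0-25).
from scipy.stats import mannwhitneyu, fisher_exact

PASS_MARK = 19  # 75% of 25 questions

online = [17, 20, 21, 15, 19, 22, 18, 20]  # hypothetical post-course scores
onsite = [19, 21, 23, 18, 20, 24, 19, 22]

# Continuous outcome: compare the score distributions between the two arms.
u_stat, p_scores = mannwhitneyu(online, onsite, alternative="two-sided")

# Binary outcome: passed (>= 19/25) vs failed, compared with Fisher's exact test.
def passed_failed(scores):
    passed = sum(s >= PASS_MARK for s in scores)
    return [passed, len(scores) - passed]

table = [passed_failed(online), passed_failed(onsite)]
odds_ratio, p_pass = fisher_exact(table)
print(f"Mann-Whitney p={p_scores:.3f}, Fisher p={p_pass:.3f}")
```

Both tests are nonparametric, matching the protocol's plan to report medians and ranges alongside means.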
Motivation assessment post course: the motivation level is measured by the Intrinsic Motivation Inventory (IMI) scale [ 22 ] (Appendix 5). The IMI items were randomized by random.org on 4 August 2022. The scale contains 12 items to be assessed by the students on a 7-point Likert scale, where 1 is "Not at all true", 4 is "Somewhat true" and 7 is "Very true". The motivation score will be computed at group and individual level and compared between the online and onsite groups with the Mann–Whitney test.
Self-efficacy assessment post course: the self-efficacy level is measured by a single-item measure developed and validated by Williams and Smith [ 23 ] (Appendix 6). It is assessed by the students on a scale from 1 to 10, where 1 is "Strongly disagree" and 10 is "Strongly agree". The self-efficacy score will be computed at group and individual level and compared between the online and onsite groups with a Mann–Whitney test.
Preference assessment post course: Preference is measured as part of the general course satisfaction evaluation with the question “If you had the option to choose, which form would you prefer this course to have?” with the options “onsite form” and “online form”.
Academic achievement assessment is based on 24 monthly measurements post course of the number of publications, number of citations, h-index, and i10-index. These data are collected through the Google Scholar Profiles [ 24 ] of the students, as this database covers most scientific journals. Associations between the onsite/online setting and long-term academic achievement will be examined with Kaplan–Meier curves and the log-rank test at a significance level of 0.05.
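To make the survival-analysis idea concrete, here is a minimal Kaplan–Meier sketch for time to first publication, using invented follow-up data (event = 1 if a publication appeared within the 24-month follow-up, 0 if censored); the real analysis would use a dedicated survival package and add the log-rank comparison between arms.

```python
# Minimal Kaplan-Meier estimator for time-to-first-publication (months).
def kaplan_meier(times, events):
    """Return (time, survival probability) pairs at each event time."""
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t, e in sorted(zip(times, events)):
        if e:  # a publication event: step the survival curve down
            surv *= (at_risk - 1) / at_risk
            curve.append((t, surv))
        at_risk -= 1  # censored or not, this student leaves the risk set
    return curve

# Invented months to first publication (24 = censored at end of follow-up).
times  = [3, 5, 8, 12, 18, 24, 24, 24]
events = [1, 1, 1, 1, 1, 0, 0, 0]
curve = kaplan_meier(times, events)
```

With these data the curve steps down at months 3, 5, 8, 12, and 18, ending at a "not yet published" probability of 0.375.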
Enrolment for the course at the Faculty of Health Sciences, University of Copenhagen, Denmark, becomes available when it is published in the course catalogue. In the course description, the course location is "To be announced". Approximately 3–4 weeks before the course begins, the participant list is finalized, and students receive a welcome letter containing course details, including their allocation to either the online or onsite setting. On the first day of the course, oral information is provided, and participants provide informed consent, baseline variables, and baseline knowledge scores.
On the last day of scheduled activities, the following scores are collected: knowledge, motivation, self-efficacy, setting preference, and academic achievement. To track students' long-term academic achievements, follow-ups are conducted monthly for a period of 24 months, with the first assessment occurring within one week of the last course day (Table 1 ).
The power calculation is based on the main outcome, short-term theoretical learning. For the sample size determination, we considered the 12 available seats per course. To achieve statistical power, we aimed for 8 clusters in each of the online and onsite arms (16 clusters in total) to detect an increase in the learning outcome of 20% (an increase of 5 points). We assumed an intraclass correlation coefficient of 0.02, a standard deviation of 10, a power of 80%, and a two-sided alpha level of 5%. The allocation ratio was set at 1, implying an equal number of subjects in the online and onsite groups.
Allowing for a dropout of up to 2 students per course, equivalent to 17%, we determined that a total of 112 participants would be needed. This calculation factored in 10 clusters of 12 participants per study arm, which we deemed sufficient to assess changes in the learning outcome.
The sample size was estimated using the function n4means from the R package CRTSize [ 25 ].
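As a rough cross-check, the same inputs can be plugged into the standard normal-approximation formula with a design effect. This is a simplification: n4means may apply further small-sample corrections, so its cluster count can differ slightly from this sketch.

```python
# Back-of-the-envelope cluster sample size via the design-effect formula.
from math import ceil
from statistics import NormalDist

def clusters_per_arm(delta, sd, icc, m, alpha=0.05, power=0.80):
    z = NormalDist().inv_cdf
    # Individually randomized two-sample size per arm (normal approximation).
    n_ind = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * sd ** 2 / delta ** 2
    deff = 1 + (m - 1) * icc       # variance inflation for clustering
    return ceil(n_ind * deff / m)  # whole clusters of size m per arm

# Protocol inputs: detect 5 points, SD 10, ICC 0.02, 12 students per course.
k = clusters_per_arm(delta=5, sd=10, icc=0.02, m=12)
```

This simple approximation yields 7 clusters per arm, in the same ballpark as the protocol's 8; the difference plausibly reflects the additional corrections or rounding in `n4means`.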
Participants are PhD students enrolled in 10 courses of “Practical Course in Systematic Review Technique in Clinical Research” and 10 courses of “Getting started: Writing your first manuscript for publication” at the PhD School of the Faculty of Health Sciences, University of Copenhagen, Denmark.
Randomization will be performed at course level. The courses are randomized by a computer random number generator [ 26 ]. To obtain a balanced randomization per year, two sets of two unique random integers each, drawn from the range 1–4, are requested.
The setting is not included in the course catalogue of the PhD School, and thus allocation to the online or onsite setting is concealed until 3–4 weeks before course commencement, when a welcome letter with course information, including the allocation, is distributed to the students. The lecturers are also informed of the course setting at this time point. If students withdraw from the course after being informed of the setting, a letter is sent to them enquiring about the reason for withdrawal, which is recorded (Appendix 7).
The allocation sequence is generated by a computer random number generator (random.org). The participants and the lecturers sign up for the course without knowing the course setting (online or onsite) until 3–4 weeks before the course.
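The course-level allocation can be sketched as drawing two of the year's four courses for the online arm and assigning the rest onsite. The trial itself uses random.org; the seed and course identifiers below are only illustrative, to make the sketch reproducible.

```python
# Illustrative balanced randomization of 4 courses per year into two arms.
import random

def randomize_year(course_ids, seed=None):
    rng = random.Random(seed)
    online = sorted(rng.sample(course_ids, k=len(course_ids) // 2))
    onsite = sorted(set(course_ids) - set(online))
    return {"online": online, "onsite": onsite}

allocation = randomize_year([1, 2, 3, 4], seed=2022)
```

Drawing without replacement guarantees the 2:2 split within each year, mirroring the "two sets of two unique integers" request to the random number generator.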
Due to the nature of the study, it is not possible to blind trial participants or lecturers. The outcomes are reported by the participants directly in an online form, so the outcome assessor is blinded but the individual participant is not. The data collection for the long-term follow-up regarding academic achievements is conducted without blinding. However, the external researcher analysing the data will be blinded.
Data will be collected by the project leader (Table 1 ). Baseline variables and post course knowledge, motivation, and self-efficacy are self-reported through questionnaires in SurveyXact® [ 27 ]. Academic achievements are collected through Google Scholar profiles of the participants.
Given that we use participant assessments and evaluations for research purposes, all data collection, except for the monthly follow-up of academic achievements after the course, takes place at the immediate beginning or end of the course; we therefore expect participant retention to be high.
Data will be downloaded from SurveyXact and stored in a locked and logged drive on a computer belonging to the Capital Region of Denmark. Only the project leader has access to the data.
The project is conducted in accordance with the Danish Data Protection Agency guidelines on the European GDPR throughout the trial. Following the end of the trial, data will be stored at the Danish National Data Archive, which fulfils Danish and European guidelines for data protection and management.
Data are anonymized and blinded before the analyses. Analyses are performed by a researcher not otherwise involved in inclusion, randomization, data collection, or data handling. All statistical tests will test the null hypothesis that the two arms of the trial are equal on the corresponding estimates. Analysis of the primary outcome, short-term learning, will begin once all data have been collected for all individuals in the last included course. Analyses of long-term academic achievement will begin at the end of follow-up.
Baseline characteristics including both course- and individual level information will be presented. Table 2 presents the available data on baseline.
We will use multivariate analysis to identify the most important predictors (motivation, self-efficacy, sex, educational background, and knowledge) of effect in the short and long term. The results will be presented as risk ratios (RR) with 95% confidence intervals (CI). A result will be considered significant if the CI does not include one.
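As a worked example of the reporting format, the sketch below computes an unadjusted risk ratio with a 95% CI from a hypothetical 2x2 pass/fail table (the counts are invented); the protocol's multivariate model would additionally adjust for the predictors listed above.

```python
# Risk ratio with 95% CI from a 2x2 table via the log-RR normal approximation.
from math import exp, log, sqrt

def risk_ratio(a, b, c, d, z=1.96):
    """a/b: events/non-events in one arm; c/d: events/non-events in the other."""
    rr = (a / (a + b)) / (c / (c + d))
    se = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))  # SE of log(RR)
    lo, hi = exp(log(rr) - z * se), exp(log(rr) + z * se)
    return rr, (lo, hi)

# Hypothetical example: 40/56 online students pass vs 48/56 onsite students.
rr, ci = risk_ratio(40, 16, 48, 8)
```

A CI straddling 1 (as in this invented example) would be read as non-significant under the protocol's criterion.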
All data processing and analyses will be conducted using R statistical software version 4.1.0, 2021-05-18 (R Foundation for Statistical Computing, Vienna, Austria).
If possible, all analyses will be performed separately for "Practical Course in Systematic Review Technique in Clinical Research" and for "Getting started: Writing your first manuscript for publication".
Primary analyses will follow the intention-to-treat approach and include all individuals with valid data regardless of whether they attended the complete course. Missing data will be handled with multiple imputation [ 28 ].
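The core idea of multiple imputation is to create several plausible completed datasets and pool the estimates across them. As a toy illustration only, here is a crude hot-deck sketch on invented scores; the trial's actual analyses would use a proper imputation model from a dedicated package.

```python
# Toy multiple imputation: fill each missing value by sampling observed values,
# repeat m times, and pool the per-dataset estimates (here, the mean).
import random
import statistics

def multiply_impute(values, m=5, seed=0):
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    return [
        [v if v is not None else rng.choice(observed) for v in values]
        for _ in range(m)
    ]

scores = [18, 21, None, 15, 22, None, 19]  # invented, None = missing
datasets = multiply_impute(scores)
pooled_mean = statistics.mean(statistics.mean(d) for d in datasets)
```

Real multiple imputation would draw from a model of the missing values given covariates and combine variances with Rubin's rules, not just average point estimates.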
Upon reasonable request, public access will be granted to the protocol, the datasets analysed during the current study, and the statistical code (Table 3 ).
This project is coordinated in collaboration between the WHO CC (DEN-62) at the Parker Institute, CAMES, and the PhD School at the Faculty of Health and Medical Sciences, University of Copenhagen. The project leader runs the day-to-day support of the trial. The steering committee of the trial includes principal investigators from WHO CC (DEN-62) and CAMES and the project leader and meets approximately three times a year.
Data monitoring is done on a daily basis by the project leader and controlled by an external independent researcher.
An adverse event is "a harmful and negative outcome that happens when a patient has been provided with medical care" [ 29 ]. Since this trial does not involve patients in medical care, we do not expect adverse events. If participants decline to take part in the course after receiving the information on the course setting, the reason for declining is sought. If the reason is the setting, this can be considered an unintended effect. Information on unintended effects of the online setting (the intervention) will be recorded. Participants are encouraged to contact the project leader with any response to the course in general, both during and after the course.
The trial description has been sent to the Scientific Ethical Committee of the Capital Region of Denmark (VEK) (21041907), which assessed that notification was not necessary and that the trial could proceed without permission from VEK according to Danish law and regulation of scientific research. The trial is registered with the Danish Data Protection Agency (Privacy) (P-2022-158). Important protocol modifications will be communicated to relevant parties as well as to VEK, the Joint Regional Information Security, and ClinicalTrials.gov within as short a timeframe as possible.
The results (positive, negative, or inconclusive) will be disseminated in educational, scientific, and clinical fora, in international scientific peer-reviewed journals, and clinicaltrials.gov will be updated upon completion of the trial. After scientific publication, the results will be disseminated to the public by the press, social media including the website of the hospital and other organizations – as well as internationally via WHO CC (DEN-62) at the Parker Institute and WHO Europe.
All authors will fulfil the ICMJE recommendations for authorship, and RR will be first author of the articles as a part of her PhD dissertation. Contributors who do not fulfil these recommendations will be offered acknowledgement in the article.
This cluster randomized trial investigates if an onsite setting of a research course for PhD students within the health and medical sciences is different from an online setting. The outcomes measured are learning of research methodology (primary), preference, motivation, and self-efficacy (secondary) on short term and academic achievements (secondary) on long term.
The results of this study will be discussed as follows:
Discussion of primary outcome
The primary outcome will be compared and contrasted with similar studies, including recent RCTs and mixed-methods studies on online and onsite research methodology courses within health and medical education [ 10 , 11 , 30 ] and, for inspiration, outside the field [ 31 , 32 ]: Tokalić finds similar outcomes for online and onsite settings, Krnic Martinic finds that a web-based educational intervention improves knowledge, Cheung concludes that the evidence is insufficient to say that the two modes have different learning outcomes, Kofoed finds the online setting to have a negative impact on learning, and Rahimi-Ardabili reports positive self-reported student knowledge. These conflicting results will be discussed in the context of this study's learning outcome. The literature may change if more relevant studies are published.
Discussion of secondary outcomes
Significant secondary outcomes will be compared and contrasted with similar studies.
It is a limitation of this study that an onsite curriculum for a full day is delivered identically online, as this may favour the onsite course due to screen fatigue [ 33 ]. At the same time, it is a strength that the time schedules are identical in both settings. The offer of coffee, tea, water, and a plain sandwich in the onsite course may better facilitate socializing. Another limitation is that the study is performed in Denmark within a specific educational culture, with institutional policies and resources that might affect the outcome and limit generalization to other geographical settings. However, international students are welcome in the class.
In educational interventions it is generally difficult to blind participants and this inherent limitation also applies to this trial [ 11 ]. Thus, the participants are not blinded to their assigned intervention, and neither are the lecturers in the courses. However, the external statistical expert will be blinded when doing the analyses.
We chose to compare the in-person onsite setting with a synchronous online setting. The results therefore cannot be expected to generalize to asynchronous online settings. Asynchronous delivery has in some cases shown positive results, possibly because students could move back and forth through the modules in the interface without a time limit [ 11 ].
We will report on all the outcomes defined prior to conducting the study to avoid selective reporting bias.
It is a strength of the study that it reports outcomes within levels 1, 2, and 4 of the Kirkpatrick conceptual framework, and not solely at level 1. It is also a strength that the study is cluster randomized, which reduces contamination between the two settings, has an adequately powered, calculated sample size, and targets a relevant educational difference of 20% between the online and onsite settings.
The results of this study may have implications for students' choice of educational setting. The learning and preference results have implications for which setting lecturers, course managers, and curriculum developers should plan for in health and medical education. They may also inspire teaching and training in other disciplines. From a societal perspective, the results also matter because we will know the effect and preferences of online learning in case of a future lockdown.
Future research could investigate academic achievements in online and onsite research training on the long run (Kirkpatrick 4); the effect of blended learning versus online or onsite (Kirkpatrick 2); lecturers’ preferences for online and onsite setting within health and medical education (Kirkpatrick 1) and resource use in synchronous and asynchronous online learning (Kirkpatrick 5).
This trial collected pilot data from August to September 2021 and opened for inclusion in January 2022. Completion of recruitment is expected in April 2024 and long-term follow-up in April 2026. Protocol version number 1 03.06.2022 with amendments 30.11.2023.
The project leader will have access to the final trial dataset which will be available upon reasonable request. Exception to this is the qualitative raw data that might contain information leading to personal identification.
AI: Artificial intelligence
CAMES: Copenhagen Academy for Medical Education and Simulation
CI: Confidence interval
COVID: Coronavirus disease
ECTS: European Credit Transfer and Accumulation System
ICMJE: International Committee of Medical Journal Editors
IMI: Intrinsic Motivation Inventory
MCQ: Multiple choice questionnaire
MD: Doctor of medicine
MSc: Master of science
RCT: Randomized controlled trial
VEK: Scientific Ethical Committee of the Capital Region of Denmark
WHO CC: WHO Collaborating Centre for Evidence-Based Clinical Health Promotion
Samara M, Algdah A, Nassar Y, Zahra SA, Halim M, Barsom RMM. How did online learning impact the academic. J Technol Sci Educ. 2023;13(3):869–85.
Nejadghaderi SA, Khoshgoftar Z, Fazlollahi A, Nasiri MJ. Medical education during the coronavirus disease 2019 pandemic: an umbrella review. Front Med (Lausanne). 2024;11:1358084. https://doi.org/10.3389/fmed.2024.1358084 .
Madi M, Hamzeh H, Abujaber S, Nawasreh ZH. Have we failed them? Online learning self-efficacy of physiotherapy students during COVID-19 pandemic. Physiother Res Int. 2023;5:e1992. https://doi.org/10.1002/pri.1992 .
Torda A. How COVID-19 has pushed us into a medical education revolution. Intern Med J. 2020;50(9):1150–3.
Alhat S. Virtual Classroom: A Future of Education Post-COVID-19. Shanlax Int J Educ. 2020;8(4):101–4.
Cook DA, Levinson AJ, Garside S, Dupras DM, Erwin PJ, Montori VM. Internet-based learning in the health professions: A meta-analysis. JAMA. 2008;300(10):1181–96. https://doi.org/10.1001/jama.300.10.1181 .
Pei L, Wu H. Does online learning work better than offline learning in undergraduate medical education? A systematic review and meta-analysis. Med Educ Online. 2019;24(1):1666538. https://doi.org/10.1080/10872981.2019.1666538 .
Richmond H, Copsey B, Hall AM, Davies D, Lamb SE. A systematic review and meta-analysis of online versus alternative methods for training licensed health care professionals to deliver clinical interventions. BMC Med Educ. 2017;17(1):227. https://doi.org/10.1186/s12909-017-1047-4 .
George PP, Zhabenko O, Kyaw BM, Antoniou P, Posadzki P, Saxena N, Semwal M, Tudor Car L, Zary N, Lockwood C, Car J. Online Digital Education for Postregistration Training of Medical Doctors: Systematic Review by the Digital Health Education Collaboration. J Med Internet Res. 2019;21(2):e13269. https://doi.org/10.2196/13269 .
Tokalić R, Poklepović Peričić T, Marušić A. Similar Outcomes of Web-Based and Face-to-Face Training of the GRADE Approach for the Certainty of Evidence: Randomized Controlled Trial. J Med Internet Res. 2023;25:e43928. https://doi.org/10.2196/43928 .
Krnic Martinic M, Čivljak M, Marušić A, Sapunar D, Poklepović Peričić T, Buljan I, et al. Web-Based Educational Intervention to Improve Knowledge of Systematic Reviews Among Health Science Professionals: Randomized Controlled Trial. J Med Internet Res. 2022;24(8): e37000.
https://www.mentimeter.com/ . Accessed 4 Dec 2023.
https://www.sendsteps.com/en/ . Accessed 4 Dec 2023.
https://da.padlet.com/ . Accessed 4 Dec 2023.
Zackoff MW, Real FJ, Abramson EL, Li STT, Klein MD, Gusic ME. Enhancing Educational Scholarship Through Conceptual Frameworks: A Challenge and Roadmap for Medical Educators. Acad Pediatr. 2019;19(2):135–41. https://doi.org/10.1016/j.acap.2018.08.003 .
https://zoom.us/ . Accessed 20 Aug 2024.
Raffing R, Larsen S, Konge L, Tønnesen H. From Targeted Needs Assessment to Course Ready for Implementation-A Model for Curriculum Development and the Course Results. Int J Environ Res Public Health. 2023;20(3):2529. https://doi.org/10.3390/ijerph20032529 .
https://www.kirkpatrickpartners.com/the-kirkpatrick-model/ . Accessed 12 Dec 2023.
Smidt A, Balandin S, Sigafoos J, Reed VA. The Kirkpatrick model: A useful tool for evaluating training outcomes. J Intellect Dev Disabil. 2009;34(3):266–74.
Campbell K, Taylor V, Douglas S. Effectiveness of online cancer education for nurses and allied health professionals; a systematic review using kirkpatrick evaluation framework. J Cancer Educ. 2019;34(2):339–56.
Miller GE. The assessment of clinical skills/competence/performance. Acad Med. 1990;65(9 Suppl):S63–7.
Ryan RM, Deci EL. Self-Determination Theory and the Facilitation of Intrinsic Motivation, Social Development, and Well-Being. Am Psychol. 2000;55(1):68–78. https://doi.org/10.1037//0003-066X.55.1.68 .
Williams GM, Smith AP. Using single-item measures to examine the relationships between work, personality, and well-being in the workplace. Psychology. 2016;07(06):753–67.
https://scholar.google.com/intl/en/scholar/citations.html . Accessed 4 Dec 2023.
Rotondi MA. CRTSize: sample size estimation functions for cluster randomized trials. R package version 1.0. 2015. Available from: https://cran.r-project.org/package=CRTSize .
Random.org. Available from: https://www.random.org/
https://rambollxact.dk/surveyxact . Accessed 4 Dec 2023.
Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. BMJ (Online). 2009;339:157–60.
Skelly C, Cassagnol M, Munakomi S. Adverse Events. StatPearls Treasure Island: StatPearls Publishing. 2023. Available from: https://www.ncbi.nlm.nih.gov/books/NBK558963/ .
Rahimi-Ardabili H, Spooner C, Harris MF, Magin P, Tam CWM, Liaw ST, et al. Online training in evidence-based medicine and research methods for GP registrars: a mixed-methods evaluation of engagement and impact. BMC Med Educ. 2021;21(1):1–14. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8439372/pdf/12909_2021_Article_2916.pdf .
Cheung YYH, Lam KF, Zhang H, Kwan CW, Wat KP, Zhang Z, et al. A randomized controlled experiment for comparing face-to-face and online teaching during COVID-19 pandemic. Front Educ. 2023;8. https://doi.org/10.3389/feduc.2023.1160430 .
Kofoed M, Gebhart L, Gilmore D, Moschitto R. Zooming to Class?: Experimental Evidence on College Students' Online Learning During Covid-19. SSRN Electron J. 2021;IZA Discussion Paper No. 14356.
Mutlu Aİ, Yüksel M. Listening effort, fatigue, and streamed voice quality during online university courses. Logop Phoniatr Vocol. 2024:1–8. https://doi.org/10.1080/14015439.2024.2317789 .
We thank the students who made their evaluations available for this trial, and MSc (Public Health) Mie Sylow Liljendahl for statistical support.
Open access funding provided by Copenhagen University. The Parker Institute, which hosts the WHO CC (DEN-62), receives a core grant from the Oak Foundation (OCAY-18-774-OFIL). The Oak Foundation had no role in the design of the study; in the collection, analysis, and interpretation of the data; or in writing the manuscript.
Authors and affiliations.
WHO Collaborating Centre (DEN-62), Clinical Health Promotion Centre, The Parker Institute, Bispebjerg & Frederiksberg Hospital, University of Copenhagen, Copenhagen, 2400, Denmark
Rie Raffing & Hanne Tønnesen
Copenhagen Academy for Medical Education and Simulation (CAMES), Centre for HR and Education, The Capital Region of Denmark, Copenhagen, 2100, Denmark
RR, LK and HT have made substantial contributions to the conception and design of the work; RR to the acquisition of data; and RR, LK and HT to the interpretation of data. RR has drafted the work, and RR, LK, and HT have substantively revised it, approved the submitted version, and agreed to be personally accountable for their own contributions and for ensuring that any questions relating to the accuracy or integrity of the work are adequately investigated, resolved and documented.
Correspondence to Rie Raffing .
Ethics approval and consent to participate.
The Danish National Committee on Health Research Ethics has assessed the study (Journal no. 21041907; date: 21-09-2021) without objections or comments. The study has been approved by the Danish Data Protection Agency (Journal no. P-2022-158; date: 04.05.2022).
All PhD students participate after informed consent. They can withdraw from the study at any time without explanation or consequences for their education. They will be offered information about the results at study completion. There are no risks for the course participants, as the measurements in the course follow routine procedure and participants are not affected by the follow-up in Google Scholar. However, the 15 minutes needed to fill in the forms may be considered inconvenient.
The project will follow the GDPR and the Joint Regional Information Security Policy. Names and ID numbers are stored on a secure and logged server at the Capital Region of Denmark to avoid the risk of data leaks. All outcomes are part of the routine evaluation at the courses, except the follow-up of academic achievement by publications and related indexes. However, the publications are publicly available per se.
The authors declare no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary materials 1–7.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Reprints and permissions
Cite this article.
Raffing, R., Konge, L. & Tønnesen, H. Learning effect of online versus onsite education in health and medical scholarship – protocol for a cluster randomized trial. BMC Med Educ 24 , 927 (2024). https://doi.org/10.1186/s12909-024-05915-z
Download citation
Received : 25 March 2024
Accepted : 14 August 2024
Published : 26 August 2024
DOI : https://doi.org/10.1186/s12909-024-05915-z
ISSN: 1472-6920
Alzheimer's disease (AD) is the most common form of dementia affecting the nervous system. In recent years, non-invasive early AD diagnosis has gained attention as a way to improve patient care and treatment outcomes, and researchers have explored imaging methods, electroencephalogram (EEG) tests, and speech evaluations. This review covers 60 papers published since 2020, comparing how they apply standard deep learning models such as CNN, LSTM, AlexNet, Inception Net, VGG19, and ResNet to identify AD. Few studies, however, combine more than one modality, such as imaging and EEG, EEG and speech, or imaging and speech. Data from the Scopus database makes the most recent work easy to survey, and it shows that multimodal approaches to AD detection receive comparatively little attention. Our review argues that combining the strengths of each modality in a hybrid approach could make Alzheimer's research considerably more useful and lead to better diagnostic methods. The paper also discusses current problems and opportunities in the field, as well as possible topics and issues for future research.
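The multimodal direction the review advocates can be illustrated with a minimal late-fusion sketch: each modality-specific classifier (imaging, EEG, speech) emits a probability of AD, and the fused score is their weighted average. The function below is a hypothetical illustration, not a system from any of the reviewed papers.

```python
# Minimal late-fusion sketch for multimodal AD screening.
# In practice each probability would come from a trained model
# (e.g. a CNN on MRI, an LSTM on EEG, a speech classifier);
# here they are plain numbers for illustration.

def fuse_probabilities(probs, weights=None):
    """Weighted average of per-modality AD probabilities."""
    if weights is None:
        weights = [1.0] * len(probs)
    total = sum(weights)
    return sum(p * w for p, w in zip(probs, weights)) / total

# Example: MRI model confident, EEG borderline, speech negative.
p_mri, p_eeg, p_speech = 0.9, 0.55, 0.2
fused = fuse_probabilities([p_mri, p_eeg, p_speech])  # about 0.55
decision = "AD-positive" if fused >= 0.5 else "AD-negative"
```

Real fusion systems would learn the weights (or fuse intermediate features rather than output probabilities), but the averaging step captures the core idea of letting each modality compensate for the others' blind spots.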
Source: www.scopus.com, accessed 20 December 2023.
All the databases are freely available and cited in the manuscript.
Prince M, Wimo A, Guerchet M, Ali G-C, Wu Y-T, Prina M. World Alzheimer report 2015. The global impact of dementia. An analysis of prevalence, incidence, cost, and trends. 2015.
Kumar Y. Recent advancement of machine learning and deep learning in the field of healthcare system. 2021; p. 77–98.
Ghazal TM, Hasan MK, Alshurideh MT, Alzoubi HM, Ahmad M, Akbar SS, Al Kurdi B, Akour IA. IoT for smart cities: Machine learning approaches in smart healthcare—a review. Future Internet. 2021;13(8):218.
Bind S, Tiwari A, Kumar A. A survey of machine learning-based approaches for Parkinson disease prediction. 2022.
Fleming R, Zeisel J, Bennett K. World Alzheimer Report 2020: Design Dignity Dementia: dementia-related design and the built environment, vol. 1. London: Alzheimer’s Disease International; 2020.
Twarowski B, Herbet M. Inflammatory processes in Alzheimer’s disease—pathomechanism, diagnosis, and treatment: a review. Int J Mol Sci. 2023;24:6518.
Muhammed Niyas KP, Thiyagarajan P. A systematic review on early prediction of mild cognitive impairment to Alzheimer using machine learning algorithms. Int J Intell Netw. 2023;4:74–88.
Khojaste-Sarakhsi M, Haghighi SS, Fatemi Ghomi SMT, Marchiori E. Deep learning for Alzheimer’s disease diagnosis: A survey. Artif Intell Med. 2022;130:102332.
Gao S, Lima D. A review of the application of deep learning in the detection of Alzheimer’s disease. Int J Cognit Comput Eng. 2022;3:1–8.
Jehosheba Margaret M, Masoodhu Banu NM. Performance analysis of EEG based emotion recognition using deep learning models. Brain-Comput Interfaces. 2023;10(2–4):79–98.
Kulkarni N. Color thresholding method for image segmentation of natural images. Int J Image Graph Signal Process. 2012;4:02.
Meng Lu, Zhang Q. Research on early diagnosis of Alzheimer’s disease based on dual fusion cluster graph convolutional network. Biomed Signal Process Control. 2023;86: 105212.
Chabib CM, Hadjileontiadis LJ, Al Shehhi A. Deepcurvmri: Deep convolutional curvelet transform-based mri approach for early detection of Alzheimer’s disease. IEEE Access. 2023;11:44650–9.
Minaee S, Kafieh R, Sonka M, Yazdani S, Soufi GJ. Deep-covid: Predicting COVID-19 from chest x-ray images using deep transfer learning. Med Image Anal. 2020;65:101794.
Kinge A, Oswal Y, Khangal T, Kulkarni N, Jha P. Comparative study on different classification models for customer churn problem. 2022; p. 153–164.
Mofrad SA, Lundervold A, Lundervold AS. A predictive framework based on brain volume trajectories enabling early detection of Alzheimer’s disease. Comput Med Imaging Graph. 2021;90:101910.
Khan A, Zubair S. Development of a three-tiered cognitive hybrid machine learning algorithm for effective diagnosis of Alzheimer’s disease. J King Saud Univ Comput Inform Sci. 2022;34(10, Part A):8000–18.
Sharma S, Guleria K, Tiwari S, Kumar S. A deep learning-based convolutional neural network model with vgg16 feature extractor for the detection of Alzheimer disease using mri scans. Meas Sens. 2022;24:100506.
Mccombe N, Ding X, Prasad G, Gillespie P, Finn DP, Todd S, Mcclean PL, Wong-Lin K. Alzheimer’s disease assessments optimized for diagnostic accuracy and administration time. IEEE J Transl Eng Health Med. 2022;10:1–9.
Akbar S, Ali H, Ahmad A, Sarker MR, Saeed A, Salwana E, Gul S, Khan A, Ali F. Prediction of amyloid proteins using embedded evolutionary ensemble feature selection based descriptors with extreme gradient boosting model. IEEE Access. 2023;11:39024–36.
Shukla A, Tiwari R, Tiwari S. Review on Alzheimer disease detection methods: Automatic pipelines and machine learning techniques. Science. 2023;5(1):13.
Chauhan N, Choi B-J. Classification of Alzheimer’s disease using maximal information coefficient-based functional connectivity with an extreme learning machine. Brain Sci. 2023;13(7):1046.
Dang M, Chen Q, Zhao X, Chen K, Li X, Zhang J, Jie Lu, Ai L, Chen Y, Zhang Z. Tau as a biomarker of cognitive impairment and neuropsychiatric symptom in Alzheimer’s disease. Hum Brain Mapp. 2022;44:08.
Khan A, Kulkarni N, Kumar A, Kamat A. D-cnn and image processing based approach for diabetic retinopathy classification. 2022; p. 283–291.
Kale A, Jawade I, Kakade P, Jadhav R, Kulkarni N. PairNet: a deep learning-based object detection and segmentation system. 2022; p. 423–436.
Marwa EL-Geneedy M, El-Din Moustafa H, Khalifa F, Khater H, AbdElhalim E. An MRI-based deep learning approach for accurate detection of Alzheimer’s disease. Alex Eng J. 2023;63:211–21.
Balaji P, Chaurasia M, Bilfaqih S, Muniasamy A, Elzubir L. Hybridized deep learning approach for detecting Alzheimer’s disease. Biomedicines. 2023;11:1–16.
Hu Z, Wang Z, Jin Y, Hou W. Vgg-tswinformer: Transformer-based deep learning model for early Alzheimer’s disease prediction. Comput Methods Programs Biomed. 2023;229:107291.
Sethuraman SK, Malaiyappan N, Ramalingam R, Basheer S, Rashid M, Ahmad N. Predicting Alzheimer disease using deep neuro-functional networks with resting-state fMRI. Electronics. 2023;12(4):1031.
Ismail WN, Fathimathul Rajeena PP, Ali MAS. A meta-heuristic multi-objective optimization method for Alzheimer’s disease detection based on multi-modal data. Mathematics. 2023;11(4):957.
Menagadevi M, Mangai S, Madian N, Thiyagarajan D. Automated prediction system for Alzheimer detection based on deep residual auto encoder and support vector machine. Optik. 2023;272:170212.
Leela M, Helenprabha K, Sharmila L. Prediction and classification of Alzheimer disease categories using integrated deep transfer learning approach. Meas Sensors. 2023;27:100749.
Vakli P, Weiss B, Szalma J, Barsi P, Gyuricza I, Kemenczky P, Somogyi E, Nárai A, Gál V, Petra H, Vidnyánszky Z. Automatic brain MRI motion artifact detection based on end-to-end deep learning is similarly effective as traditional machine learning trained on image quality metrics. Med Image Anal. 2023;88:102850.
Javed Mehedi Shamrat FM, Akter S, Azam S, Karim A, Ghosh P, Tasnim Z, Hasib KM, De Boer F, Ahmed K. Alzheimernet: An effective deep learning based proposition for Alzheimer’s disease stages classification from functional brain changes in magnetic resonance images. IEEE Access. 2023;11:16376–95.
Chelladurai A, Narayan DL, Divakarachari PB, Loganathan U. fMRI-based Alzheimer disease detection using the sas method with a multi-layer perceptron network. Brain Sci. 2023;13(6):893.
Imani M. Alzheimer’s disease diagnosis using fusion of high informative bilstm and cnn features of EEG signal. Biomed Signal Process Control. 2023;86: 105298.
Miltiadous A, Gionanidis E, Tzimourta KD, Giannakeas N, Tzallas AT. Dice-net: a novel convolution-transformer architecture for Alzheimer detection in EEG signals. IEEE Access. 2023;11:71840–58.
Fouad IA, Labib FEM. Identification of Alzheimer’s disease from central lobe eeg signals utilizing machine learning and residual neural network. Biomed Signal Process Control. 2023;86:105266.
Rajaguru S. A greedy optimized intelligent framework for early detection of Alzheimer’s disease using eeg signal. Comput Intell Neurosci. 2023;2023:1–10.
Jiao B, Li R, Zhou H, Qing K, Liu H, Pan H, Lei Y, Wenjin Fu, Wang X, Xiao X, Liu X, Yang Q, Liao X, Zhou Y, Fang L, Dong Y, Yang Y, Jiang H, Huang S, Shen Lu. Neural biomarker diagnosis and prediction to mild cognitive impairment and Alzheimer’s disease using eeg technology. Alzheimer’s Res Ther. 2023;15:02.
Khare SK, Rajendra Acharya U. Adazd-net: Automated adaptive and explainable Alzheimer’s disease detection system using EEG signals. Knowl-Based Syst. 2023;278:110858.
Kim M-j, Youn YC, Paik J. Deep learning-based EEG analysis to classify normal, mild cognitive impairment, and dementia: algorithms and dataset. Neuroimage. 2023;272:120054.
Alessandrini M, Biagetti G, Crippa P, Falaschetti L, Luzzi S, Turchetti C. EEG-based Alzheimer disease recognition using robust-pca and LSTM recurrent neural network. Sensors. 2022;22(10):3696.
Pirrone D, Weitschek E, Paolo P, De Salvo S, De Cola M. Eeg signal processing and supervised machine learning to early diagnose Alzheimer’s disease. Appl Sci. 2022;12:5413.
Al-Nuaimi AH, Blūma M, Al-Juboori SS, Eke CS, Jammeh E, Sun L, Ifeachor E. Robust eeg based biomarkers to detect Alzheimer’s disease. Brain Sci. 2021;11(8):1026.
Tripathi T, Kumar R. Speech-based detection of multi-class Alzheimer disease classification using machine learning. 2023.
Karande S, Kulkarni V. Automated prognosis of Alzheimer’s disease using machine learning classifiers on spontaneous speech features. Int J Intell Syst Appl Eng. 2023;11(2):245–51.
Mahajan P, Baths V. Acoustic and language-based deep learning approaches for Alzheimer’s dementia detection from spontaneous speech. Front Aging Neurosci. 2021;13:02.
Kurtz E, Zhu Y, Driesse T, Tran B, Batsis JA, Roth RM, Liang X. Early detection of cognitive decline using voice assistant commands. 2023; p. 1–5.
Yamada Y, Shinkawa K, Nemoto M, Nemoto K, Arai T. A mobile application using automatic speech analysis for classifying Alzheimer’s disease and mild cognitive impairment. Comput Speech Lang. 2023;81: 101514.
Liu J, Fu F, Li L, Yu J, Zhong D, Zhu S, Zhou Y, Liu B, Li J. Efficient pause extraction and encode strategy for Alzheimer’s disease detection using only acoustic features from spontaneous speech. Brain Sci. 2023;13(3):477.
Weiner MW (2004) Alzheimer’s Disease Neuroimaging Initiative. http://adni.loni.usc.edu . Accessed 28 Apr 2024.
Deenadayalan T, Shantharajah SP. An early-stage Alzheimer’s disease detection using various imaging modalities and techniques—a mini-review. J Integr Sci Technol. 2024;12(5):803.
OASIS: OASIS Brains Dataset https://www.oasis-brains.org/ . Accessed 28 Apr 2024.
AIBL: AIBL. https://aibl.csiro.au/ . Accessed 28 Apr 2024.
The authors acknowledge MIT Art, Design and Technology University, Pune, India for supporting this research by providing the facilities.
No funding was received for this research.
Authors and affiliations.
Computer Science & Engineering Department, MIT SoC, MIT Art, Design and Technology University, Pune, India
Sonali Deshpande & Nilima Kulkarni
All authors participated equally in this research.
Correspondence to Sonali Deshpande .
Conflict of interest.
The authors declare no conflict of interest.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Reprints and permissions
Deshpande, S., Kulkarni, N. Exploring Integration of Multimodal Deep Learning Approaches for Enhanced Alzheimer's Disease Diagnosis: A Review of Recent Literature. SN COMPUT. SCI. 5 , 852 (2024). https://doi.org/10.1007/s42979-024-03084-w
Download citation
Received : 09 June 2024
Accepted : 21 June 2024
Published : 02 September 2024
DOI : https://doi.org/10.1007/s42979-024-03084-w
Speaker 1: If you're in a STEM field, chances are you'll need to read primary literature, also known as research articles. And unlike books, effectively and efficiently reading a research paper requires a nuanced and systematic approach. When I first started reading research papers as a neuroscience major in college, it took considerable effort and time to make sense of it all. But since then, I've read through thousands of papers, published dozens of my own in peer-reviewed journals, and can now crank through them with ease. Here's the system I use. Dr. Jubbal, MedSchoolInsiders.com. First, determine the purpose of reading. Depending on the purpose and your goal in reading a research paper, your approach may differ considerably. Keep that in mind as we cover the following sections. If you're reading a paper as a requirement for class, like I initially had to for my neuroscience courses, you will be focusing on comprehension rather than on determining utility. You'll need to know the study hypothesis, the methods they used, the findings, and the limitations of their conclusions. As you proceed with your medical training, you will likely write many of your own research articles. After all, doing so is one of the most powerful ways to stand out and strengthen your medical school or residency application. In these instances, you are mostly referencing other research papers, and it becomes more important to quickly determine relevance and value prior to committing to a more in-depth reading and analysis. You will also use papers as tools to figure out other papers to read by taking advantage of their own reference list. Now that you've identified your primary goal, it's time to begin reading. Not every section in a research article is created equal. Unlike reading a traditional book, I don't advise you read a research paper in order. First, you need to read the title and the abstract to get an overview of the paper.
If at any time during your reading, you come across a word or acronym that you don't understand, stop and look it up. This is not like a novel where you can infer the meaning and likely not see the word again. The language in research articles is generally pretty straightforward, and terms you don't understand are often the scientific terms that are critical to your understanding of the paper and its findings. Next, dive into the conclusion. Again, this is a research paper, not a novel, so you're not running into any spoilers. The conclusion effectively summarizes the most pertinent findings. Now that you have a better idea of what the paper is about, spend as much time as you desire going over the figures, methods, results, and discussion sections. The discussion will likely be the highest-yield portion that requires the most amount of time, but to truly understand the paper, you must also go over the methods, results, and figures. A big element to reading papers is understanding the limitations of the study, which then allows you to more accurately determine the paper's significance. The biggest and most widespread mistake is jumping to the conclusion and not understanding the limitations and generalizability of the study. Look at any media article summarizing new, groundbreaking research, and you'll see what I mean. Towards the end of the discussion section in most any paper, you'll find the author's own interpretation of the limitations of their study. But there are always many more limitations beyond what they mention. There have been entire books dedicated to the nuances of statistics and extrapolating conclusions from research, and this is something I may consider making a dedicated future video on. Let me know in the comments below so I can gauge interest. Most people know about randomization, placebo-controlled, and single or double-blinded studies. That being said, there is so much more nuance to it. Here are a few examples. First, study design.
Is the study retrospective, meaning looking back historically, or prospective, starting with individuals that are then followed over time? Is it a case control, cohort, or cross-sectional study? Number two, what are the endpoints used? If the study draws conclusions about heart disease, but only uses HDL as a surrogate marker, understand that the surrogate is just that, an imperfect proxy. Number three, biases. There are too many to cover, but selection, recall, sampling, confounding, procedure, lead time, and the Hawthorne effect are all biases that you should familiarize yourself with. And number four, basic statistical analyses. Sensitivity vs. specificity, normal and skewed distributions, positive and negative predictive values, etc. Over time, you'll be reading dozens or even hundreds of research papers, and it becomes a challenge to keep everything straight. Again, depending on the purpose, there are a few options to consider. If you're reading as a class assignment, I recommend you print out the paper, highlight, and annotate in the margins as needed. More recently, I've done this on an iPad with the Apple Pencil, such that printing out was no longer necessary. On the other hand, if you're reading in order to write your own paper, start using a citation manager immediately from the beginning. EndNote is often referenced as the rather expensive gold standard, but Mendeley is a free and quite sufficient alternative. As soon as you begin reading papers, import them into your citation manager. In a separate Word document, begin jotting down the key points of your paper that are relevant to your own project. This notes document will now become the main resource from which you will begin writing your own paper. Trust me, it's much better this way, otherwise you'll spend considerable time and effort hunting for facts from the dozens of PDFs that you've read. 
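The basic statistical measures the speaker lists (sensitivity vs. specificity, positive and negative predictive values) all fall out of a 2x2 confusion table. A minimal sketch, with made-up screening numbers rather than figures from the video:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical screening test: 1,000 people, 10% disease prevalence.
m = diagnostic_metrics(tp=90, fp=90, fn=10, tn=810)
# Sensitivity and specificity are both 0.90, yet PPV is only 0.50:
# at low prevalence, half of all positive results are false alarms,
# exactly the kind of limitation a media summary tends to miss.
```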
Lastly, understand that a big part of reading speed in both regular books as well as research papers is your familiarity with the subject. I started off reading neuroscience papers quite slowly, but as my expertise in the area grew, I was able to breeze through them. I knew the anatomy and terminology like the back of my hand, and coming across terms like CA1 vs CA2 neurons of the hippocampus no longer required additional processing. Similarly, when I first started diving into plastic surgery research, I didn't know all the nuances of hand anatomy or the principles of aesthetic surgery. But as I grew to understand more, reading and understanding the literature became second nature, and once again I was able to breeze through them. It's important to keep this in mind to make sure you don't get discouraged. If you consistently apply yourself to reading research articles and follow these steps that I've outlined, you'll be tackling papers with ease in no time. Like it or not, being proficient in research is an essential skill if you want to go to a top medical school or residency program. In a certain way, there's a science but also an art to bolstering a solid research CV and securing impressive letters of recommendation from your PI. You can check out my own personal list of research articles, abstracts, and presentations on my personal website at kevinjubbal.com. It's currently over 60 and counting. Being proficient in research was a huge part of my own success getting into a top medical school and highly competitive residency. It's a challenging ordeal, and very few people know how to address this for maximal effectiveness. Through experimentation and very uncommon techniques, I was able to pump out over 30 items in less than 12 months and secure stellar letters of recommendation. Visit MedSchoolInsiders.com to learn how our team of top advisors can help you master your own research and present your best self in your application and interview. 
I created this video because you guys requested it. If you have any other requests for other research related videos, let me know down in the comments below. If you made it this far, then there is a lot more content on my Instagram that you will definitely enjoy. Check out at kevinjubbalmd and at MedSchoolInsiders. Thank you guys so much for watching, and I will see you all in that next one.
Nature volume 632, pages 1060–1066 (2024)
General circulation models (GCMs) are the foundation of weather and climate prediction 1 , 2 . GCMs are physics-based simulators that combine a numerical solver for large-scale dynamics with tuned representations for small-scale processes such as cloud formation. Recently, machine-learning models trained on reanalysis data have achieved comparable or better skill than GCMs for deterministic weather forecasting 3 , 4 . However, these models have not demonstrated improved ensemble forecasts, or shown sufficient stability for long-term weather and climate simulations. Here we present a GCM that combines a differentiable solver for atmospheric dynamics with machine-learning components and show that it can generate forecasts of deterministic weather, ensemble weather and climate on par with the best machine-learning and physics-based methods. NeuralGCM is competitive with machine-learning models for one- to ten-day forecasts, and with the European Centre for Medium-Range Weather Forecasts ensemble prediction for one- to fifteen-day forecasts. With prescribed sea surface temperature, NeuralGCM can accurately track climate metrics for multiple decades, and climate forecasts with 140-kilometre resolution show emergent phenomena such as realistic frequency and trajectories of tropical cyclones. For both weather and climate, our approach offers orders of magnitude computational savings over conventional GCMs, although our model does not extrapolate to substantially different future climates. Our results show that end-to-end deep learning is compatible with tasks performed by conventional GCMs and can enhance the large-scale physical simulations that are essential for understanding and predicting the Earth system.
Solving the equations for Earth’s atmosphere with general circulation models (GCMs) is the basis of weather and climate prediction 1 , 2 . Over the past 70 years, GCMs have been steadily improved with better numerical methods and more detailed physical models, while exploiting faster computers to run at higher resolution. Inside GCMs, the unresolved physical processes such as clouds, radiation and precipitation are represented by semi-empirical parameterizations. Tuning GCMs to match historical data remains a manual process 5 , and GCMs retain many persistent errors and biases 6 , 7 , 8 . The difficulty of reducing uncertainty in long-term climate projections 9 and estimating distributions of extreme weather events 10 presents major challenges for climate mitigation and adaptation 11 .
Recent advances in machine learning have presented an alternative for weather forecasting 3 , 4 , 12 , 13 . These models rely solely on machine-learning techniques, using roughly 40 years of historical data from the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis v5 (ERA5) 14 for model training and forecast initialization. Machine-learning methods have been remarkably successful, demonstrating state-of-the-art deterministic forecasts for 1- to 10-day weather prediction at a fraction of the computational cost of traditional models 3 , 4 . Machine-learning atmospheric models also require considerably less code, for example GraphCast 3 has 5,417 lines versus 376,578 lines for the National Oceanic and Atmospheric Administration’s FV3 atmospheric model 15 (see Supplementary Information section A for details).
Nevertheless, machine-learning approaches have noteworthy limitations compared with GCMs. Existing machine-learning models have focused on deterministic prediction, and surpass deterministic numerical weather prediction in terms of the aggregate metrics for which they are trained 3 , 4 . However, they do not produce calibrated uncertainty estimates 4 , which is essential for useful weather forecasts 1 . Deterministic machine-learning models using a mean-squared-error loss are rewarded for averaging over uncertainty, producing unrealistically blurry predictions when optimized for multi-day forecasts 3 , 13 . Unlike physical models, machine-learning models misrepresent derived (diagnostic) variables such as geostrophic wind 16 . Furthermore, although there has been some success in using machine-learning approaches on longer timescales 17 , 18 , these models have not demonstrated the ability to outperform existing GCMs.
Hybrid models that combine GCMs with machine learning are appealing because they build on the interpretability, extensibility and successful track record of traditional atmospheric models 19 , 20 . In the hybrid model approach, a machine-learning component replaces or corrects the traditional physical parameterizations of a GCM. Until now, the machine-learning component in such models has been trained ‘offline’, by learning parameterizations independently of their interaction with dynamics. These components are then inserted into an existing GCM. The lack of coupling between machine-learning components and the governing equations during training potentially causes serious problems, such as instability and climate drift 21 . So far, hybrid models have mostly been limited to idealized scenarios such as aquaplanets 22 , 23 . Under realistic conditions, machine-learning corrections have reduced some biases of very coarse GCMs 24 , 25 , 26 , but performance remains considerably worse than state-of-the-art models.
Here we present NeuralGCM, a fully differentiable hybrid GCM of Earth’s atmosphere. NeuralGCM is trained on forecasting up to 5-day weather trajectories sampled from ERA5. Differentiability enables end-to-end ‘online training’ 27 , with machine-learning components optimized in the context of interactions with the governing equations for large-scale dynamics, which we find enables accurate and stable forecasts. NeuralGCM produces physically consistent forecasts with accuracy comparable to best-in-class models across a range of timescales, from 1- to 15-day weather to decadal climate prediction.
A schematic of NeuralGCM is shown in Fig. 1 . The two key components of NeuralGCM are a differentiable dynamical core for solving the discretized governing dynamical equations and a learned physics module that parameterizes physical processes with a neural network, described in full detail in Methods , Supplementary Information sections B and C , and Supplementary Table 1 . The dynamical core simulates large-scale fluid motion and thermodynamics under the influence of gravity and the Coriolis force. The learned physics module (Supplementary Fig. 1 ) predicts the effect of unresolved processes, such as cloud formation, radiative transport, precipitation and subgrid-scale dynamics, on the simulated fields using a neural network.
Fig. 1 | a , Overall model structure, showing how forcings F t , noise z t (for stochastic models) and inputs y t are encoded into the model state x t . The model state is fed into the dynamical core, and alongside forcings and noise into the learned physics module. This produces tendencies (rates of change) used by an implicit–explicit ordinary differential equation (ODE) solver to advance the state in time. The new model state x t +1 can then be fed back into another time step, or decoded into model predictions. b , The learned physics module, which feeds data for individual columns of the atmosphere into a neural network used to produce physics tendencies in that vertical column.
The differentiable dynamical core in NeuralGCM allows an end-to-end training approach, whereby we advance the model multiple time steps before employing stochastic gradient descent to minimize discrepancies between model predictions and reanalysis (Supplementary Information section G.2 ). We gradually increase the rollout length from 6 hours to 5 days (Supplementary Information section G and Supplementary Table 5 ), which we found to be critical because our models are not accurate for multi-day prediction or stable for long rollouts early in training (Supplementary Information section H.6.2 and Supplementary Fig. 23 ). The extended back-propagation through hundreds of simulation steps enables our neural networks to take into account interactions between the learned physics and the dynamical core. We train deterministic and stochastic NeuralGCM models, each of which uses a distinct training protocol, described in full detail in Methods and Supplementary Table 4 .
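The end-to-end "online" training loop can be sketched as follows. This is an illustrative toy, not the NeuralGCM implementation: `dynamics_step`, `learned_physics` and the array shapes are placeholders. What it demonstrates is the structure described above, namely unrolling the hybrid dynamics-plus-physics step for many time steps and letting gradients flow through every solver step back to the network weights.

```python
# Minimal sketch of differentiable rollout training (illustrative placeholders,
# not the NeuralGCM model).
import jax
import jax.numpy as jnp

def dynamics_step(x):
    # Placeholder for one step of the differentiable dynamical core.
    return x + 0.01 * jnp.roll(x, 1)

def learned_physics(params, x):
    # Placeholder "learned physics": here just a linear map.
    return x @ params["w"]

def rollout(params, x0, n_steps):
    # Unroll the hybrid step; lax.scan keeps the whole trajectory differentiable.
    def step(x, _):
        x = dynamics_step(x) + learned_physics(params, x)
        return x, x
    _, trajectory = jax.lax.scan(step, x0, None, length=n_steps)
    return trajectory

def loss(params, x0, targets):
    # Mean squared error between the rollout and (re)analysis targets.
    trajectory = rollout(params, x0, targets.shape[0])
    return jnp.mean((trajectory - targets) ** 2)

# Gradients propagate through all unrolled solver steps ("online training").
grad_fn = jax.grad(loss)
```

In the real model, the rollout length in the loss is gradually increased from 6 hours to several days over the course of training, as described above.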
We train a range of NeuralGCM models at horizontal resolutions with grid spacing of 2.8°, 1.4° and 0.7° (Supplementary Fig. 7 ). We evaluate the performance of NeuralGCM at a range of timescales appropriate for weather forecasting and climate simulation. For weather, we compare against the best-in-class conventional physics-based weather models, ECMWF’s high-resolution model (ECMWF-HRES) and ensemble prediction system (ECMWF-ENS), and two of the recent machine-learning-based approaches, GraphCast 3 and Pangu 4 . For climate, we compare against a global cloud-resolving model and Atmospheric Model Intercomparison Project (AMIP) runs.
Our evaluation set-up focuses on quantifying accuracy and physical consistency, following WeatherBench2 12 . We regrid all forecasts to a 1.5° grid using conservative regridding, and average over all 732 forecasts made at noon and midnight UTC in the year 2020, which was held out from the training data for all machine-learning models. NeuralGCM, GraphCast and Pangu compare with ERA5 as the ground truth, whereas ECMWF-ENS and ECMWF-HRES compare with the ECMWF operational analysis (that is, HRES at 0-hour lead time), to avoid penalizing the operational forecasts for different biases than ERA5.
We use ECMWF’s ensemble (ENS) model as a reference baseline as it achieves the best performance across the majority of lead times 12 . We assess accuracy using (1) root-mean-squared error (RMSE), (2) root-mean-squared bias (RMSB), (3) continuous ranked probability score (CRPS) and (4) spread-skill ratio, with the results shown in Fig. 2 . We provide more in-depth evaluations including scorecards, metrics for additional variables and levels and maps in Extended Data Figs. 1 and 2 , Supplementary Information section H and Supplementary Figs. 9 – 22 .
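For concreteness, the evaluation metrics above can be written down directly. The sketch below uses the standard textbook definitions (WeatherBench2 additionally applies latitude/area weighting, which is omitted here for brevity):

```python
# Standard (unweighted) definitions of RMSE, ensemble CRPS and spread-skill
# ratio, for illustration only.
import numpy as np

def rmse(forecast, truth):
    return np.sqrt(np.mean((forecast - truth) ** 2))

def crps_ensemble(members, truth):
    # members: (M, ...) ensemble. CRPS = E|X - y| - 0.5 E|X - X'|.
    skill = np.mean(np.abs(members - truth))
    spread = np.mean(np.abs(members[:, None] - members[None, :]))
    return skill - 0.5 * spread

def spread_skill_ratio(members, truth):
    # Ensemble spread (stddev about the ensemble mean) divided by the
    # ensemble-mean RMSE; approximately 1 for a calibrated ensemble.
    spread = np.sqrt(np.mean(np.var(members, axis=0, ddof=1)))
    skill = rmse(members.mean(axis=0), truth)
    return spread / skill
```

CRPS reduces to the absolute error for a single-member "ensemble", which is why it is a natural generalization of deterministic scores to probabilistic forecasts.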
Fig. 2 | a , c , RMSE ( a ) and RMSB ( c ) for ECMWF-ENS, ECMWF-HRES, NeuralGCM-0.7°, NeuralGCM-ENS, GraphCast 3 and Pangu 4 on headline WeatherBench2 variables, as a percentage of the error of ECMWF-ENS. Deterministic and stochastic models are shown in solid and dashed lines respectively. e , g , CRPS relative to ECMWF-ENS ( e ) and spread-skill ratio for the ENS and NeuralGCM-ENS models ( g ). b , d , f , h , Spatial distributions of RMSE ( b ), bias ( d ), CRPS ( f ) and spread-skill ratio ( h ) for NeuralGCM-ENS and ECMWF-ENS models for 10-day forecasts of specific humidity at 700 hPa. Spatial plots of RMSE and CRPS show skill relative to a probabilistic climatology 12 with an ensemble member for each of the years 1990–2019. The grey areas indicate regions where climatological surface pressure on average is below 700 hPa.
Deterministic models that produce a single weather forecast for given initial conditions can be compared effectively using RMSE skill at short lead times. For the first 1–3 days, depending on the atmospheric variable, RMSE is minimized by forecasts that accurately track the evolution of weather patterns. At this timescale we find that NeuralGCM-0.7° and GraphCast achieve best results, with slight variations across different variables (Fig. 2a ). At longer lead times, RMSE rapidly increases owing to chaotic divergence of nearby weather trajectories, making RMSE less informative for deterministic models. RMSB calculates persistent errors over time, which provides an indication of how models would perform at much longer lead times. Here NeuralGCM models also compare favourably against previous approaches (Fig. 2c ), with notably much less bias for specific humidity in the tropics (Fig. 2d ).
Ensembles are essential for capturing intrinsic uncertainty of weather forecasts, especially at longer lead times. Beyond about 7 days, the ensemble means of ECMWF-ENS and NeuralGCM-ENS forecasts have considerably lower RMSE than the deterministic models, indicating that these models better capture the average of possible weather. A better metric for ensemble models is CRPS, which is a proper scoring rule that is sensitive to full marginal probability distributions 28 . Our stochastic model (NeuralGCM-ENS) running at 1.4° resolution has lower error compared with ECMWF-ENS across almost all variables, lead times and vertical levels for ensemble-mean RMSE, RMSB and CRPS (Fig. 2a,c,e and Supplementary Information section H ), with similar spatial patterns of skill (Fig. 2b,f ). Like ECMWF-ENS, NeuralGCM-ENS has a spread-skill ratio of approximately one (Fig. 2g ), which is a necessary condition for calibrated forecasts 29 .
An important characteristic of forecasts is their resemblance to realistic weather patterns. Figure 3 shows a case study that illustrates the performance of NeuralGCM on three important types of weather phenomena: tropical cyclones, atmospheric rivers and the Intertropical Convergence Zone. Figure 3a shows that all the machine-learning models make significantly blurrier forecasts than the source data ERA5 and physics-based ECMWF-HRES forecast, but NeuralGCM-0.7° outperforms the pure machine-learning models, despite its coarser resolution (0.7° versus 0.25° for GraphCast and Pangu). Blurry forecasts correspond to physically inconsistent atmospheric conditions and misrepresent extreme weather. Similar trends hold for other derived variables of meteorological interest (Supplementary Information section H.2 ). Ensemble-mean predictions, from both NeuralGCM and ECMWF, are closer to ERA5 in an average sense, and thus are inherently smooth at long lead times. In contrast, as shown in Fig. 3 and in Supplementary Information section H.3 , individual realizations from the ECMWF and NeuralGCM ensembles remain sharp, even at long lead times. Like ECMWF-ENS, NeuralGCM-ENS produces a statistically representative range of future weather scenarios for each weather phenomenon, despite its eight-times-coarser resolution.
Fig. 3 | All forecasts are initialized at 2020-08-22T12z, chosen to highlight Hurricane Laura, the most damaging Atlantic hurricane of 2020. a , Specific humidity at 700 hPa for 1-day, 5-day and 10-day forecasts over North America and the Northeast Pacific Ocean from ERA5 14 , ECMWF-HRES, NeuralGCM-0.7°, ECMWF-ENS (mean), NeuralGCM-ENS (mean), GraphCast 3 and Pangu 4 . b , Forecasts from individual ensemble members from ECMWF-ENS and NeuralGCM-ENS over regions of interest, including predicted tracks of Hurricane Laura from each of the 50 ensemble members (Supplementary Information section I.2 ). The track from ERA5 is plotted in black.
We can quantify the blurriness of different forecast models via their power spectra. Supplementary Figs. 17 and 18 show that the power spectra of NeuralGCM-0.7° are consistently closer to ERA5 than those of the other machine-learning forecast methods, but still blurrier than ECMWF’s physical forecasts. The spectra of NeuralGCM forecasts are also roughly constant over the forecast period, in stark contrast to GraphCast, which worsens with lead time. The spectrum of NeuralGCM becomes more accurate with increased resolution (Supplementary Fig. 22 ), which suggests the potential for further improvements of NeuralGCM models trained at higher resolutions.
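The blurriness diagnostic can be illustrated with a simplified version of this spectral analysis. The paper works with spherical-harmonic total-wavenumber spectra; the toy below instead takes a per-latitude zonal FFT of a lat-lon field, which captures the same qualitative signature, since blurred forecasts lose power at high wavenumbers:

```python
# Simplified blurriness diagnostic: zonal power spectrum of a lat-lon field
# (a stand-in for the spherical-harmonic spectra used in the paper).
import numpy as np

def zonal_power_spectrum(field):
    """field: (lat, lon) array. Returns mean power per zonal wavenumber."""
    coeffs = np.fft.rfft(field, axis=-1)
    power = np.abs(coeffs) ** 2 / field.shape[-1] ** 2
    return power.mean(axis=0)  # average over latitudes
```

Comparing such spectra between a forecast and ERA5 at increasing lead times reveals whether a model progressively damps small scales, as GraphCast does, or preserves them, as NeuralGCM largely does.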
In NeuralGCM, advection is handled by the dynamical core, while the machine-learning parameterization models local processes within vertical columns of the atmosphere. Thus, unlike pure machine-learning methods, local sources and sinks can be isolated from tendencies owing to horizontal transport and other resolved dynamics (Supplementary Fig. 3 ). This makes our results more interpretable and facilitates the diagnosis of the water budget. Specifically, we diagnose precipitation minus evaporation (Supplementary Information section H.5 ) rather than directly predicting these as in machine-learning-based approaches 3 . For short weather forecasts, the mean of precipitation minus evaporation has a realistic spatial distribution that is very close to ERA5 data (Extended Data Fig. 4c–e ). The precipitation-minus-evaporation rate distribution of NeuralGCM-0.7° closely matches the ERA5 distribution in the extratropics (Extended Data Fig. 4b ), although it underestimates extreme events in the tropics (Extended Data Fig. 4a ). It is noted that the current version of NeuralGCM directly predicts tendencies for an atmospheric column, and thus cannot distinguish between precipitation and evaporation.
We examined the extent to which NeuralGCM, GraphCast and ECMWF-HRES capture the geostrophic wind balance, the near-equilibrium between the dominant forces that drive large-scale dynamics in the mid-latitudes 30 . A recent study 16 highlighted that Pangu misrepresents the vertical structure of the geostrophic and ageostrophic winds and noted a deterioration at longer lead times. Similarly, we observe that GraphCast shows an error that worsens with lead time. In contrast, NeuralGCM more accurately depicts the vertical structure of the geostrophic and ageostrophic winds, as well as their ratio, compared with GraphCast across various rollouts, when compared against ERA5 data (Extended Data Fig. 3 ). However, ECMWF-HRES still shows a slightly closer alignment to ERA5 data than NeuralGCM does. Within NeuralGCM, the representation of the geostrophic wind’s vertical structure only slightly degrades in the initial few days, showing no noticeable changes thereafter, particularly beyond day 5.
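For reference, the balance being diagnosed here is the geostrophic relation on pressure surfaces, with the ageostrophic wind defined as the residual:

```latex
% Geostrophic balance: the Coriolis force balances the horizontal
% pressure-gradient force on pressure surfaces (\Phi is geopotential,
% f the Coriolis parameter, \hat{\mathbf{k}} the local vertical).
\mathbf{u}_g = \frac{1}{f}\,\hat{\mathbf{k}} \times \nabla_p \Phi,
\qquad
\mathbf{u}_a = \mathbf{u} - \mathbf{u}_g .
```

A forecast model that misrepresents this decomposition produces winds whose large-scale component is inconsistent with its own pressure field, which is the failure mode reported for Pangu and GraphCast above.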
Physically consistent weather models should still perform well for weather conditions for which they were not trained. We expect that NeuralGCM may generalize better than machine-learning-only atmospheric models, because NeuralGCM employs neural networks that act locally in space, on individual vertical columns of the atmosphere. To explore this hypothesis, we compare versions of NeuralGCM-0.7° and GraphCast trained on data up to 2017, evaluated on 5 years of weather forecasts beyond the training period (2018–2022), in Supplementary Fig. 36 . Unlike GraphCast, NeuralGCM does not show a clear trend of increasing error when initialized further into the future from the training data. To extend this test beyond 5 years, we trained a NeuralGCM-2.8° model using only data before 2000, and tested its skill for over 21 unseen years (Supplementary Fig. 35 ).
Although our deterministic NeuralGCM models are trained to predict weather up to 3 days ahead, they are generally capable of simulating the atmosphere far beyond medium-range weather timescales. For extended climate simulations, we prescribe historical sea surface temperature (SST) and sea-ice concentration. These simulations feature many emergent phenomena of the atmosphere on timescales from months to decades.
For climate simulations with NeuralGCM, we use 2.8° and 1.4° deterministic models, which are relatively inexpensive to train (Supplementary Information section G.7 ) and allow us to explore a larger parameter space to find stable models. Previous studies found that running extended simulations with hybrid models is challenging due to numerical instabilities and climate drift 21 . To quantify stability in our selected models, we run multiple initial conditions and report how many of them finish without instability.
To assess the capability of NeuralGCM to simulate various aspects of the seasonal cycle, we run 2-year simulations with NeuralGCM-1.4° for 37 different initial conditions spaced every 10 days for the year 2019. Out of these 37 initial conditions, 35 successfully complete the full 2 years without instability; for case studies of instability, see Supplementary Information section H.7 , and Supplementary Figs. 26 and 27 . We compare results from NeuralGCM-1.4° for 2020 with ERA5 data and with outputs from the X-SHiELD global cloud-resolving model, which is coupled to an ocean model nudged towards reanalysis 31 . This X-SHiELD run has been used as a target for training machine-learning climate models 24 . For comparison, we evaluate models after regridding predictions to 1.4° resolution. This comparison slightly favours NeuralGCM because NeuralGCM was tuned to match ERA5, but the discrepancy between ERA5 and the actual atmosphere is small relative to model error.
Figure 4a shows the temporal variation of the global mean temperature during 2020, as captured by 35 simulations from NeuralGCM, in comparison with the ERA5 reanalysis and standard climatology benchmarks. The seasonality and variability of the global mean temperature from NeuralGCM are quantitatively similar to those observed in ERA5. The ensemble-mean temperature RMSE for NeuralGCM stands at 0.16 K when benchmarked against ERA5, which is a significant improvement over the climatology’s RMSE of 0.45 K. We find that NeuralGCM accurately simulates the seasonal cycle, as evidenced by metrics such as the annual cycle of the global precipitable water (Supplementary Fig. 30a ) and global total kinetic energy (Supplementary Fig. 30b ). Furthermore, the model captures essential atmospheric dynamics, including the Hadley circulation and the zonal-mean zonal wind (Supplementary Fig. 28 ), as well as the spatial patterns of eddy kinetic energy in different seasons (Supplementary Fig. 31 ), and the distinctive seasonal behaviours of monsoon circulation (Supplementary Fig. 29 ; additional details are provided in Supplementary Information section I.1 ).
Fig. 4 | a , Global mean temperature for ERA5 14 (orange), 1990–2019 climatology (black) and NeuralGCM-1.4° (blue) for 2020 using 35 simulations initialized every 10 days during 2019 (thick line, ensemble mean; thin lines, different initial conditions). b , Yearly global mean temperature for ERA5 (orange), mean over 22 CMIP6 AMIP experiments 34 (violet; model details are in Supplementary Information section I.3 ) and NeuralGCM-2.8° for 22 AMIP-like simulations with prescribed SST initialized every 10 days during 1980 (thick line, ensemble mean; thin lines, different initial conditions). c , The RMSB of the 850-hPa temperature averaged between 1981 and 2014 for 22 NeuralGCM-2.8° AMIP runs (labelled NGCM), 22 CMIP6 AMIP experiments (labelled AMIP) and debiased 22 CMIP6 AMIP experiments (labelled AMIP*; bias was removed by removing the 850-hPa global temperature bias). In the box plots, the red line represents the median. The box delineates the first to third quartiles; the whiskers extend to 1.5 times the interquartile range (Q1 − 1.5IQR and Q3 + 1.5IQR), and outliers are shown as individual dots. d , Vertical profiles of tropical (20° S–20° N) temperature trends for 1981–2014. Orange, ERA5; black dots, Radiosonde Observation Correction using Reanalyses (RAOBCORE) 41 ; blue dots, mean trends for NeuralGCM; purple dots, mean trends from CMIP6 AMIP runs (grey and black whiskers, 25th and 75th percentiles for NeuralGCM and CMIP6 AMIP runs, respectively). e – g , Tropical cyclone tracks for ERA5 ( e ), NeuralGCM-1.4° ( f ) and X-SHiELD 31 ( g ). h – k , Mean precipitable water for ERA5 ( h ) and the precipitable water bias in NeuralGCM-1.4° ( i ), initialized 90 days before mid-January 2020 similarly to X-SHiELD, X-SHiELD ( j ) and climatology ( k ; averaged between 1990 and 2019). In d – i , quantities are calculated between mid-January 2020 and mid-January 2021 and all models were regridded to a 256 × 128 Gaussian grid before computation and tracking.
Next, we compare the annual biases of a single NeuralGCM realization with a single realization of X-SHiELD (the only one available), both initiated in mid-October 2019. We consider 19 January 2020 to 17 January 2021, the time frame for which X-SHiELD data are available. Global cloud-resolving models, such as X-SHiELD, are considered state of the art, especially for simulating the hydrological cycle, owing to their resolution being capable of resolving deep convection 32 . The annual bias in precipitable water for NeuralGCM (RMSE of 1.09 mm) is substantially smaller than the biases of both X-SHiELD (RMSE of 1.74 mm) and climatology (RMSE of 1.36 mm; Fig. 4i–k ). Moreover, NeuralGCM shows a lower temperature bias in the upper and lower troposphere than X-SHiELD (Extended Data Fig. 6 ). We also indirectly compare precipitation bias in X-SHiELD with precipitation-minus-evaporation bias in NeuralGCM-1.4°, which shows slightly larger bias and grid-scale artefacts for NeuralGCM (Extended Data Fig. 5 ).
Finally, to assess the capability of NeuralGCM to generate tropical cyclones in an annual model integration, we use the tropical cyclone tracker TempestExtremes 33 , as described in Supplementary Information section I.2 , Supplementary Fig. 34 and Supplementary Table 6 . Figure 4e–g shows that NeuralGCM, even at a coarse resolution of 1.4°, produces realistic tropical cyclone trajectories and counts (83 versus 86 in ERA5 for the corresponding period), whereas X-SHiELD, when regridded to 1.4° resolution, substantially underestimates the tropical cyclone count (40). Additional statistical analyses of tropical cyclones can be found in Extended Data Figs. 7 and 8 .
To assess the capability of NeuralGCM to simulate historical temperature trends, we conduct AMIP-like simulations over a duration of 40 years with NeuralGCM-2.8°. Out of 37 different runs with initial conditions spaced every 10 days during the year 1980, 22 simulations were stable for the entire 40-year period, and our analysis focuses on these results. We compare with 22 simulations run with prescribed SST from the Coupled Model Intercomparison Project Phase 6 (CMIP6) 34 , listed in Supplementary Information section I.3 .
We find that all 40-year simulations of NeuralGCM, as well as the mean of the 22 AMIP runs, accurately capture the global warming trends observed in ERA5 data (Fig. 4b ). There is a strong correlation in the year-to-year temperature trends with ERA5 data, suggesting that NeuralGCM effectively captures the impact of SST forcing on climate. When comparing spatial biases averaged over 1981–2014, we find that all 22 NeuralGCM-2.8° runs have smaller bias than the CMIP6 AMIP runs, and this result remains even when removing the global temperature bias in CMIP6 AMIP runs (Fig. 4c and Supplementary Figs. 32 and 33 ).
Next, we investigated the vertical structure of tropical warming trends, which climate models tend to overestimate in the upper troposphere 35 . As shown in Fig. 4d , the trends, calculated by linear regression, of NeuralGCM are closer to ERA5 than those of AMIP runs. In particular, the bias in the upper troposphere is reduced. However, NeuralGCM does show a wider spread in its predictions than the AMIP runs, even at levels near the surface where temperatures are typically more constrained by prescribed SST.
Lastly, we evaluated NeuralGCM’s capability to generalize to unseen warmer climates by conducting AMIP simulations with increased SST (Supplementary Information section I.4.2 ). We find that NeuralGCM shows some of the robust features of climate warming response to modest SST increases (+1 K and +2 K); however, for more substantial SST increases (+4 K), NeuralGCM’s response diverges from expectations (Supplementary Fig. 37 ). In addition, AMIP simulations with increased SST show climate drift, underscoring NeuralGCM’s limitations in this context (Supplementary Fig. 38 ).
NeuralGCM is a differentiable hybrid atmospheric model that combines the strengths of traditional GCMs with machine learning for weather forecasting and climate simulation. To our knowledge, NeuralGCM is the first machine-learning-based model to make accurate ensemble weather forecasts, with better CRPS than state-of-the-art physics-based models. It is also, to our knowledge, the first hybrid model that achieves comparable spatial bias to global cloud-resolving models, can simulate realistic tropical cyclone tracks and can run AMIP-like simulations with realistic historical temperature trends. Overall, NeuralGCM demonstrates that incorporating machine learning is a viable alternative to building increasingly detailed physical models 32 for improving GCMs.
Compared with traditional GCMs with similar skill, NeuralGCM is computationally efficient and low complexity. NeuralGCM runs at 8- to 40-times-coarser horizontal resolution than ECMWF’s Integrated Forecasting System and global cloud-resolving models, which enables 3 to 5 orders of magnitude savings in computational resources. For example, NeuralGCM-1.4° simulates 70,000 simulation days in 24 hours using a single tensor-processing-unit versus 19 simulated days on 13,824 central-processing-unit cores with X-SHiELD (Extended Data Table 1 ). This can be leveraged for previously impractical tasks such as large ensemble forecasting. NeuralGCM’s dynamical core uses global spectral methods 36 , and learned physics is parameterized with fully connected neural networks acting on single vertical columns. Substantial headroom exists to pursue higher accuracy using advanced numerical methods and machine-learning architectures.
Our results provide strong evidence for the disputed hypothesis 37 , 38 , 39 that learning to predict short-term weather is an effective way to tune parameterizations for climate. NeuralGCM models trained on 72-hour forecasts are capable of realistic multi-year simulation. When provided with historical SSTs, they capture essential atmospheric dynamics such as seasonal circulation, monsoons and tropical cyclones. However, we will probably need alternative training strategies 38 , 39 to learn important processes for climate with subtle impacts on weather timescales, such as cloud feedbacks.
The NeuralGCM approach is compatible with incorporating either more physics or more machine learning, as required for operational weather forecasts and climate simulations. For weather forecasting, we expect that end-to-end learning 40 with observational data will allow for better and more relevant predictions, including key variables such as precipitation. Such models could include neural networks acting as corrections to traditional data assimilation and model diagnostics. For climate projection, NeuralGCM will need to be reformulated to enable coupling with other Earth-system components (for example, ocean and land), and integrating data on the atmospheric chemical composition (for example, greenhouse gases and aerosols). There are also research challenges common to current machine-learning-based climate models 19 , including the capability to simulate unprecedented climates (that is, generalization), adhering to physical constraints, and resolving numerical instabilities and climate drift. NeuralGCM’s flexibility to incorporate physics-based models (for example, radiation) offers a promising avenue to address these challenges.
Models based on physical laws and empirical relationships are ubiquitous in science. We believe the differentiable hybrid modelling approach of NeuralGCM has the potential to transform simulation for a wide range of applications, such as materials discovery, protein folding and multiphysics engineering design.
NeuralGCM combines components of the numerical solver and flexible neural network parameterizations. Simulation in time is carried out in a coordinate system suitable for solving the dynamical equations of the atmosphere, describing large-scale fluid motion and thermodynamics under the influence of gravity and the Coriolis force.
Our differentiable dynamical core is implemented in JAX, a library for high-performance code in Python that supports automatic differentiation 42 . The dynamical core solves the hydrostatic primitive equations with moisture, using a horizontal pseudo-spectral discretization and vertical sigma coordinates 36 , 43 . We evolve seven prognostic variables: vorticity and divergence of horizontal wind, temperature, surface pressure, and three water species (specific humidity, and specific ice and liquid cloud water content).
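The seven prognostic variables could be grouped in a container along the following lines. This is a hypothetical illustration: field names and array shapes are our own, and the actual model stores spectral coefficients on sigma levels rather than grid-point arrays:

```python
# Hypothetical container for the seven prognostic variables named above
# (illustrative shapes; not the NeuralGCM data structure).
import dataclasses
import numpy as np

@dataclasses.dataclass
class PrognosticState:
    vorticity: np.ndarray         # (levels, lat, lon), horizontal wind vorticity
    divergence: np.ndarray        # (levels, lat, lon), horizontal wind divergence
    temperature: np.ndarray       # (levels, lat, lon)
    surface_pressure: np.ndarray  # (lat, lon)
    specific_humidity: np.ndarray # (levels, lat, lon)
    cloud_ice: np.ndarray         # (levels, lat, lon), specific ice water content
    cloud_liquid: np.ndarray      # (levels, lat, lon), specific liquid water content
```

Evolving vorticity and divergence rather than the wind components directly is standard in spectral dynamical cores, since both are scalars that transform cleanly under the spherical-harmonic basis.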
Our learned physics module uses the single-column approach of GCMs 2 , whereby information from only a single atmospheric column is used to predict the impact of unresolved processes occurring within that column. These effects are predicted using a fully connected neural network with residual connections, with weights shared across all atmospheric columns (Supplementary Information section C.4 ).
The inputs to the neural network include the prognostic variables in the atmospheric column, total incident solar radiation, sea-ice concentration and SST (Supplementary Information section C.1 ). We also provide horizontal gradients of the prognostic variables, which we found improves performance 44 . All inputs are standardized to have zero mean and unit variance using statistics precomputed during model initialization. The outputs are the prognostic variable tendencies scaled by the fixed unconditional standard deviation of the target field (Supplementary Information section C.5 ).
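The input/output scaling just described can be sketched for a single column as follows (function and argument names are hypothetical; `net` stands in for the column neural network):

```python
# Sketch of per-column input standardization and output rescaling.
import numpy as np

def column_physics(column_inputs, stats, net, target_std):
    """Predict physical tendencies for one atmospheric column (illustrative).

    column_inputs: raw feature vector for a single column.
    stats: (mean, std) precomputed at model initialization.
    net: any callable mapping standardized features to scaled outputs.
    target_std: fixed unconditional stddev of each target field.
    """
    mean, std = stats
    standardized = (column_inputs - mean) / std  # zero mean, unit variance
    return net(standardized) * target_std        # back to physical tendency units
```

Keeping the scaling statistics fixed at initialization means the network always sees inputs and produces outputs of order one, which helps optimization regardless of the physical units of each variable.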
To interface between ERA5 14 data stored in pressure coordinates and the sigma coordinate system of our dynamical core, we introduce encoder and decoder components (Supplementary Information section D ). These components perform linear interpolation between pressure levels and sigma coordinate levels. We additionally introduce learned corrections to both encoder and decoder steps (Supplementary Figs. 4–6 ), using the same column-based neural network architecture as the learned physics module. Importantly, the encoder enables us to eliminate the gravity waves from initialization shock 45 , which otherwise contaminate forecasts.
Figure 1a shows the sequence of steps that NeuralGCM takes to make a forecast. First, it encodes ERA5 data at t = t 0 on pressure levels to initial conditions on sigma coordinates. To perform a time step, the dynamical core and learned physics (Fig. 1b ) then compute tendencies, which are integrated in time using an implicit–explicit ordinary differential equation solver 46 (Supplementary Information section E and Supplementary Table 2 ). This is repeated to advance the model from t = t 0 to t = t final . Finally, the decoder converts predictions back to pressure levels.
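The forecast sequence of Fig. 1a, together with the practice (described below) of holding learned-physics tendencies fixed over several cheaper dynamics substeps, can be summarized schematically. All components here are toy placeholders, not the real encoder, solver or network:

```python
# Schematic of the NeuralGCM forecast loop (toy placeholder components).
import numpy as np

def encode(pressure_data):       # pressure levels -> sigma-level model state
    return np.asarray(pressure_data, dtype=float)

def decode(state):               # sigma-level model state -> pressure levels
    return state

def dynamics_tendency(state):    # placeholder dynamical-core tendency
    return -0.1 * state

def learned_physics(state):      # placeholder column-network tendency
    return 0.01 * np.ones_like(state)

def forecast(initial, n_outer, substeps=3, dt=0.1):
    x = encode(initial)
    for _ in range(n_outer):
        physics = learned_physics(x)   # evaluated once per outer step...
        for _ in range(substeps):      # ...and held constant over ODE substeps
            x = x + dt * (dynamics_tendency(x) + physics)
    return decode(x)
```

The real model replaces the explicit Euler update here with the implicit–explicit ODE solver, but the control flow (encode once, alternate physics evaluation with dynamics substeps, decode once) is the same.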
The time-step size of the ODE solver (Supplementary Table 3 ) is limited by the Courant–Friedrichs–Lewy condition on dynamics, and can be small relative to the timescale of atmospheric change. Evaluating learned physics is approximately 1.5 times as expensive as a time step of the dynamical core. Accordingly, following the typical practice for GCMs, we hold learned physics tendencies constant for multiple ODE time steps to reduce computational expense, typically corresponding to 30 minutes of simulation time.
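Holding the learned physics tendency fixed across several dynamics substeps can be sketched as below. Again a forward Euler update stands in for the real solver, and the function and its toy arguments are hypothetical; the point is simply that `physics` is evaluated once per outer step while `dycore` is evaluated every substep.

```python
import numpy as np

def step_with_held_physics(x, dycore, physics, dt_dyn, substeps):
    """Evaluate learned physics once, then reuse its tendency for several
    cheaper dynamics substeps."""
    phys_tendency = physics(x)            # expensive call, made once
    for _ in range(substeps):
        x = x + dt_dyn * (dycore(x) + phys_tendency)
    return x

x0 = np.array([1.0])
x1 = step_with_held_physics(x0,
                            dycore=lambda s: -s,
                            physics=lambda s: 0.1 * np.ones_like(s),
                            dt_dyn=0.01, substeps=6)
```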
We train deterministic NeuralGCM models using a combination of three loss functions (Supplementary Information section G.4 ) to encourage accuracy and sharpness while penalizing bias. During the main training phase, all losses are defined in a spherical harmonics basis. We use a standard mean squared error loss for promoting accuracy, modified to progressively filter out contributions from higher total wavenumbers at longer lead times (Supplementary Fig. 8 ). This filtering approach tackles the ‘double penalty problem’ 47 because it prevents the model from being penalized for predicting high-wavenumber features in incorrect locations at later times, especially beyond the predictability horizon. A second loss term encourages the spectrum to match the training data using squared loss on the total wavenumber spectrum of prognostic variables. These first two losses are evaluated on both sigma and pressure levels. Finally, a third loss term discourages bias by adding mean squared error on the batch-averaged mean amplitude of each spherical harmonic coefficient. For analysis of the impact of the various loss functions, refer to Supplementary Information section H.6.1 and Supplementary Figs. 23 and 24 . The combined action of the three training losses allows models trained on 3-day rollouts to remain stable during years-to-decades-long climate simulations. Before final evaluations, we perform additional fine-tuning of just the decoder component on short rollouts of 24 hours (Supplementary Information section G.5 ).
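The first two loss terms can be sketched in one dimension, with a Fourier basis standing in for the spherical harmonics of the paper; the cutoff, signals and function names are hypothetical, and the third (bias) term is omitted for brevity.

```python
import numpy as np

def filtered_spectral_mse(pred, target, cutoff):
    """MSE of the error spectrum, zeroing contributions above a cutoff
    wavenumber (a stand-in for lead-time-dependent filtering)."""
    err = np.fft.rfft(pred - target)
    k = np.arange(err.size)
    err = np.where(k <= cutoff, err, 0.0)
    return np.mean(np.abs(err) ** 2)

def spectrum_loss(pred, target):
    """Squared difference of amplitude spectra, encouraging realistic sharpness."""
    ps = np.abs(np.fft.rfft(pred))
    ts = np.abs(np.fft.rfft(target))
    return np.mean((ps - ts) ** 2)

x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
target = np.sin(x)
pred = np.sin(x) + 0.1 * np.sin(20 * x)  # error confined to wavenumber 20
low = filtered_spectral_mse(pred, target, cutoff=10)   # error filtered out
high = filtered_spectral_mse(pred, target, cutoff=30)  # error retained
```

Because the error lives entirely at wavenumber 20, the filtered loss with a low cutoff ignores it, which is the mechanism that avoids the double penalty for misplaced small-scale features.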
Stochastic NeuralGCM models incorporate inherent randomness in the form of additional random fields passed as inputs to neural network components. Our stochastic loss is based on the CRPS 28 , 48 , 49 . CRPS consists of a mean absolute error term that encourages accuracy, balanced by a similar term that encourages ensemble spread. For each variable we use a sum of CRPS in grid space and CRPS in the spherical harmonic basis below a maximum cut-off wavenumber (Supplementary Information section G.6 ). We compute CRPS on rollout lengths from 6 hours to 5 days. As illustrated in Fig. 1 , we inject noise into the learned encoder and the learned physics module by sampling from Gaussian random fields with learned spatial and temporal correlation (Supplementary Information section C.2 and Supplementary Fig. 2 ). For training, we generate two ensemble members per forecast, which suffices for an unbiased estimate of CRPS.
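The unbiased two-member CRPS estimate can be written out directly: a skill term (mean absolute error of the members) minus a spread term. The function name and sample values below are hypothetical; this sketches the scoring rule, not the paper's full grid-plus-spectral formulation.

```python
import numpy as np

def crps_two_members(x1, x2, y):
    """Unbiased CRPS estimate from a two-member ensemble:
    mean absolute error of the members minus half the member separation."""
    skill = 0.5 * (np.abs(x1 - y) + np.abs(x2 - y))
    spread = 0.5 * np.abs(x1 - x2)
    return float(np.mean(skill - spread))

y = np.array([1.0])
sym = crps_two_members(np.array([0.5]), np.array([1.5]), y)     # members bracket the truth
biased = crps_two_members(np.array([1.5]), np.array([2.5]), y)  # both members too high
```

An ensemble whose members bracket the truth is rewarded relative to one with the same spread that is systematically biased, which is how CRPS balances accuracy against spread.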
For training and evaluating the NeuralGCM models, we used the publicly available ERA5 dataset 14 , originally downloaded from https://cds.climate.copernicus.eu/ and available via Google Cloud Storage in Zarr format at gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3. To compare NeuralGCM with operational and data-driven weather models, we used forecast datasets distributed as part of WeatherBench2 12 at https://weatherbench2.readthedocs.io/en/latest/data-guide.html , to which we have added NeuralGCM forecasts for 2020. To compare NeuralGCM with atmospheric models in climate settings, we used CMIP6 data available at https://catalog.pangeo.io/browse/master/climate/ , as well as X-SHiELD 24 outputs available on Google Cloud Storage in a ‘requester pays’ bucket at gs://ai2cm-public-requester-pays/C3072-to-C384-res-diagnostics. The Radiosonde Observation Correction using Reanalyses (RAOBCORE) V1.9 dataset, used as the reference for tropical temperature trends, was downloaded from https://webdata.wolke.img.univie.ac.at/haimberger/v1.9/ . Base maps use freely available data from https://www.naturalearthdata.com/downloads/ .
The NeuralGCM code base is separated into two open source projects: Dinosaur and NeuralGCM, both publicly available on GitHub at https://github.com/google-research/dinosaur (ref. 50 ) and https://github.com/google-research/neuralgcm (ref. 51 ). The Dinosaur package implements a differentiable dynamical core used by NeuralGCM, whereas the NeuralGCM package provides machine-learning models and checkpoints of trained models. Evaluation code for NeuralGCM weather forecasts is included in WeatherBench2 12 , available at https://github.com/google-research/weatherbench2 (ref. 52 ).
Bauer, P., Thorpe, A. & Brunet, G. The quiet revolution of numerical weather prediction. Nature 525 , 47–55 (2015).
Balaji, V. et al. Are general circulation models obsolete? Proc. Natl Acad. Sci. USA 119 , e2202075119 (2022).
Lam, R. et al. Learning skillful medium-range global weather forecasting. Science 382 , 1416–1421 (2023).
Bi, K. et al. Accurate medium-range global weather forecasting with 3D neural networks. Nature 619 , 533–538 (2023).
Hourdin, F. et al. The art and science of climate model tuning. Bull. Am. Meteorol. Soc. 98 , 589–602 (2017).
Bony, S. & Dufresne, J.-L. Marine boundary layer clouds at the heart of tropical cloud feedback uncertainties in climate models. Geophys. Res. Lett. 32 , L20806 (2005).
Webb, M. J., Lambert, F. H. & Gregory, J. M. Origins of differences in climate sensitivity, forcing and feedback in climate models. Clim. Dyn. 40 , 677–707 (2013).
Sherwood, S. C., Bony, S. & Dufresne, J.-L. Spread in model climate sensitivity traced to atmospheric convective mixing. Nature 505 , 37–42 (2014).
Palmer, T. & Stevens, B. The scientific challenge of understanding and estimating climate change. Proc. Natl Acad. Sci. USA 116 , 24390–24395 (2019).
Fischer, E. M., Beyerle, U. & Knutti, R. Robust spatially aggregated projections of climate extremes. Nat. Clim. Change 3 , 1033–1038 (2013).
Field, C. B. Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation: Special Report of the Intergovernmental Panel on Climate Change (Cambridge Univ. Press, 2012).
Rasp, S. et al. WeatherBench 2: A benchmark for the next generation of data-driven global weather models. J. Adv. Model. Earth Syst. 16 , e2023MS004019 (2024).
Keisler, R. Forecasting global weather with graph neural networks. Preprint at https://arxiv.org/abs/2202.07575 (2022).
Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 146 , 1999–2049 (2020).
Zhou, L. et al. Toward convective-scale prediction within the next generation global prediction system. Bull. Am. Meteorol. Soc. 100 , 1225–1243 (2019).
Bonavita, M. On some limitations of current machine learning weather prediction models. Geophys. Res. Lett. 51 , e2023GL107377 (2024).
Weyn, J. A., Durran, D. R. & Caruana, R. Improving data-driven global weather prediction using deep convolutional neural networks on a cubed sphere. J. Adv. Model. Earth Syst. 12 , e2020MS002109 (2020).
Watt-Meyer, O. et al. ACE: a fast, skillful learned global atmospheric model for climate prediction. Preprint at https://arxiv.org/abs/2310.02074 (2023).
Bretherton, C. S. Old dog, new trick: reservoir computing advances machine learning for climate modeling. Geophys. Res. Lett. 50 , e2023GL104174 (2023).
Reichstein, M. et al. Deep learning and process understanding for data-driven Earth system science. Nature 566 , 195–204 (2019).
Brenowitz, N. D. & Bretherton, C. S. Spatially extended tests of a neural network parametrization trained by coarse-graining. J. Adv. Model. Earth Syst. 11 , 2728–2744 (2019).
Rasp, S., Pritchard, M. S. & Gentine, P. Deep learning to represent subgrid processes in climate models. Proc. Natl Acad. Sci. USA 115 , 9684–9689 (2018).
Yuval, J. & O’Gorman, P. A. Stable machine-learning parameterization of subgrid processes for climate modeling at a range of resolutions. Nat. Commun. 11 , 3295 (2020).
Kwa, A. et al. Machine-learned climate model corrections from a global storm-resolving model: performance across the annual cycle. J. Adv. Model. Earth Syst. 15 , e2022MS003400 (2023).
Arcomano, T., Szunyogh, I., Wikner, A., Hunt, B. R. & Ott, E. A hybrid atmospheric model incorporating machine learning can capture dynamical processes not captured by its physics-based component. Geophys. Res. Lett. 50 , e2022GL102649 (2023).
Han, Y., Zhang, G. J. & Wang, Y. An ensemble of neural networks for moist physics processes, its generalizability and stable integration. J. Adv. Model. Earth Syst. 15 , e2022MS003508 (2023).
Gelbrecht, M., White, A., Bathiany, S. & Boers, N. Differentiable programming for Earth system modeling. Geosci. Model Dev. 16 , 3123–3135 (2023).
Gneiting, T. & Raftery, A. E. Strictly proper scoring rules, prediction, and estimation. J. Am. Stat. Assoc. 102 , 359–378 (2007).
Fortin, V., Abaza, M., Anctil, F. & Turcotte, R. Why should ensemble spread match the RMSE of the ensemble mean? J. Hydrometeorol. 15 , 1708–1713 (2014).
Holton, J. R. An Introduction to Dynamic Meteorology 5th edn (Elsevier, 2004).
Cheng, K.-Y. et al. Impact of warmer sea surface temperature on the global pattern of intense convection: insights from a global storm resolving model. Geophys. Res. Lett. 49 , e2022GL099796 (2022).
Stevens, B. et al. DYAMOND: the dynamics of the atmospheric general circulation modeled on non-hydrostatic domains. Prog. Earth Planet. Sci. 6 , 61 (2019).
Ullrich, P. A. et al. TempestExtremes v2.1: a community framework for feature detection, tracking, and analysis in large datasets. Geosc. Model Dev. 14 , 5023–5048 (2021).
Eyring, V. et al. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev. 9 , 1937–1958 (2016).
Mitchell, D. M., Lo, Y. E., Seviour, W. J., Haimberger, L. & Polvani, L. M. The vertical profile of recent tropical temperature trends: persistent model biases in the context of internal variability. Environ. Res. Lett. 15 , 1040b4 (2020).
Bourke, W. A multi-level spectral model. I. Formulation and hemispheric integrations. Mon. Weather Rev. 102 , 687–701 (1974).
Ruiz, J. J., Pulido, M. & Miyoshi, T. Estimating model parameters with ensemble-based data assimilation: a review. J. Meteorol. Soc. Jpn Ser. II 91 , 79–99 (2013).
Schneider, T., Lan, S., Stuart, A. & Teixeira, J. Earth system modeling 2.0: a blueprint for models that learn from observations and targeted high-resolution simulations. Geophys. Res. Lett. 44 , 12–396 (2017).
Schneider, T., Leung, L. R. & Wills, R. C. J. Opinion: Optimizing climate models with process knowledge, resolution, and artificial intelligence. Atmos. Chem. Phys. 24 , 7041–7062 (2024).
Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. Adv. Neural Inf. Process. Syst. 27 , 3104–3112 (2014).
Haimberger, L., Tavolato, C. & Sperka, S. Toward elimination of the warm bias in historic radiosonde temperature records—some new results from a comprehensive intercomparison of upper-air data. J. Clim. 21 , 4587–4606 (2008).
Bradbury, J. et al. JAX: composable transformations of Python+NumPy programs. GitHub http://github.com/google/jax (2018).
Durran, D. R. Numerical Methods for Fluid Dynamics: With Applications to Geophysics Vol. 32, 2nd edn (Springer, 2010).
Wang, P., Yuval, J. & O’Gorman, P. A. Non-local parameterization of atmospheric subgrid processes with neural networks. J. Adv. Model. Earth Syst. 14 , e2022MS002984 (2022).
Daley, R. Normal mode initialization. Rev. Geophys. 19 , 450–468 (1981).
Whitaker, J. S. & Kar, S. K. Implicit–explicit Runge–Kutta methods for fast–slow wave problems. Mon. Weather Rev. 141 , 3426–3434 (2013).
Gilleland, E., Ahijevych, D., Brown, B. G., Casati, B. & Ebert, E. E. Intercomparison of spatial forecast verification methods. Weather Forecast. 24 , 1416–1430 (2009).
Rasp, S. & Lerch, S. Neural networks for postprocessing ensemble weather forecasts. Month. Weather Rev. 146 , 3885–3900 (2018).
Pacchiardi, L., Adewoyin, R., Dueben, P. & Dutta, R. Probabilistic forecasting with generative networks via scoring rule minimization. J. Mach. Learn. Res. 25 , 1–64 (2024).
Smith, J. A., Kochkov, D., Norgaard, P., Yuval, J. & Hoyer, S. google-research/dinosaur: 1.0.0. Zenodo https://doi.org/10.5281/zenodo.11376145 (2024).
Kochkov, D. et al. google-research/neuralgcm: 1.0.0. Zenodo https://doi.org/10.5281/zenodo.11376143 (2024).
Rasp, S. et al. google-research/weatherbench2: v0.2.0. Zenodo https://doi.org/10.5281/zenodo.11376271 (2023).
We thank A. Kwa, A. Merose and K. Shah for assistance with data acquisition and handling; L. Zepeda-Núñez for feedback on the paper; and J. Anderson, C. Van Arsdale, R. Chemke, G. Dresdner, J. Gilmer, J. Hickey, N. Lutsko, G. Nearing, A. Paszke, J. Platt, S. Ponda, M. Pritchard, D. Rothenberg, F. Sha, T. Schneider and O. Voicu for discussions.
These authors contributed equally: Dmitrii Kochkov, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Stephan Hoyer
Google Research, Mountain View, CA, USA
Dmitrii Kochkov, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Griffin Mooers, James Lottes, Stephan Rasp, Michael P. Brenner & Stephan Hoyer
Earth, Atmospheric and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
Milan Klöwer
European Centre for Medium-Range Weather Forecasts, Reading, UK
Peter Düben & Sam Hatfield
Google DeepMind, London, UK
Peter Battaglia, Alvaro Sanchez-Gonzalez & Matthew Willson
School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
Michael P. Brenner
D.K., J.Y., I.L., P.N., J.S. and S. Hoyer contributed equally to this work. D.K., J.Y., I.L., P.N., J.S., G.M., J.L. and S. Hoyer wrote the code. D.K., J.Y., I.L., P.N., G.M. and S. Hoyer trained models and analysed the data. M.P.B. and S. Hoyer managed and oversaw the research project. M.K., S.R., P.D., S. Hatfield, P.B. and M.P.B. contributed technical advice and ideas. M.W. ran experiments with GraphCast for comparison with NeuralGCM. A.S.-G. assisted with data preparation. D.K., J.Y., I.L., P.N. and S. Hoyer wrote the paper. All authors gave feedback and contributed to editing the paper.
Correspondence to Dmitrii Kochkov , Janni Yuval or Stephan Hoyer .
Competing interests.
D.K., J.Y., I.L., P.N., J.S., J.L., S.R., P.B., A.S.-G., M.W., M.P.B. and S. Hoyer are employees of Google. S. Hoyer, D.K., I.L., J.Y., G.M., P.N., J.S. and M.B. have filed international patent application PCT/US2023/035420 in the name of Google LLC, currently pending, relating to neural general circulation models.
Peer review information.
Nature thanks Karthik Kashinath and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended Data Fig. 1 Maps of bias for NeuralGCM-ENS and ECMWF-ENS forecasts.
Bias is averaged over all forecasts initialized in 2020.
Spread-skill ratio is averaged over all forecasts initialized in 2020.
Vertical profiles of the extratropical intensity (averaged between latitudes 30° and 70° in both hemispheres and over all forecasts initialized in 2020) of (a,d,g) geostrophic wind, (b,e,h) ageostrophic wind and (c,f,i) the ratio of the intensity of ageostrophic wind to geostrophic wind for ERA5 (black continuous line in all panels), (a,b,c) NeuralGCM-0.7°, (d,e,f) GraphCast and (g,h,i) ECMWF-HRES at lead times of 1 day, 5 days and 10 days.
(a) Tropical (latitudes −20° to 20°) precipitation minus evaporation (P minus E) rate distribution, (b) Extratropical (latitudes 30° to 70° in both hemispheres) P minus E, (c) mean P minus E for 2020 ERA5 14 and (d) NeuralGCM-0.7° (calculated from the third day of forecasts and averaged over all forecasts initialized in 2020), (e) the bias between NeuralGCM-0.7° and ERA5, (f-g) Snapshot of daily precipitation minus evaporation for 2020-01-04 for (f) NeuralGCM-0.7° (forecast initialized on 2020-01-02) and (g) ERA5.
Mean precipitation calculated between 2020-01-19 and 2021-01-17 for (a) ERA5 14 (c) X-SHiELD 31 and the biases in (e) X-SHiELD and (g) climatology (ERA5 data averaged over 1990-2019). Mean precipitation minus evaporation calculated between 2020-01-19 and 2021-01-17 for (b) ERA5 (d) NeuralGCM-1.4° (initialized in October 18th 2019) and the biases in (f) NeuralGCM-1.4° and (h) climatology (data averaged over 1990–2019).
Mean temperature between 2020-01-19 and 2021-01-17 for (a) ERA5 at 200 hPa and (b) 850 hPa. (c,d) The bias in temperature for NeuralGCM-1.4°, (e,f) the bias in X-SHiELD and (g,h) the bias in climatology (calculated from 1990–2019). NeuralGCM-1.4° was initialized on 18 October 2019 (as for X-SHiELD).
(a) Tropical Cyclone (TC) density from ERA5 14 data spanning 1987–2020. (b) TC density from NeuralGCM-1.4° for 2020, generated using 34 different initial conditions all initialized in 2019. (c) Box plot depicting the annual number of TCs across different regions, based on ERA5 data (1987–2020), NeuralGCM-1.4° for 2020 (34 initial conditions), and orange markers show ERA5 for 2020. In the box plots, the red line represents the median; the box delineates the first to third quartiles; the whiskers extend to 1.5 times the interquartile range (Q1 − 1.5IQR and Q3 + 1.5IQR), and outliers are shown as individual dots. Each year is defined from January 19th to January 17th of the following year, aligning with data availability from X-SHiELD. For NeuralGCM simulations, the 3 initial conditions starting in January 2019 exclude data for January 17th, 2021, as these runs spanned only two years.
Number of Tropical Cyclones (TCs) as a function of maximum wind speed at 850hPa across different regions, based on ERA5 data (1987–2020; in orange), and NeuralGCM-1.4° for 2020 (34 initial conditions; in blue). Each year is defined from January 19th to January 17th of the following year, aligning with data availability from X-SHiELD. For NeuralGCM simulations, the 3 initial conditions starting in January 2019 exclude data for January 17th, 2021, as these runs spanned only two years.
Supplementary information.
Supplementary Information (38 figures, 6 tables): (A) Lines of code in atmospheric models; (B) Dynamical core of NeuralGCM; (C) Learned physics of NeuralGCM; (D) Encoder and decoder of NeuralGCM; (E) Time integration; (F) Evaluation metrics; (G) Training; (H) Additional weather evaluations; (I) Additional climate evaluations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .
Cite this article.
Kochkov, D., Yuval, J., Langmore, I. et al. Neural general circulation models for weather and climate. Nature 632 , 1060–1066 (2024). https://doi.org/10.1038/s41586-024-07744-y
Received : 13 November 2023
Accepted : 15 June 2024
Published : 22 July 2024
Issue Date : 29 August 2024
DOI : https://doi.org/10.1038/s41586-024-07744-y
a, Overall model structure, showing how forcings F_t, noise z_t (for stochastic models) and inputs y_t are encoded into the model state x_t. The model state is fed into the dynamical core, and ...