Rater Errors in Rating Process and the Needs to Be Identified Among Student Raters
Keywords: rater errors, performance-based assessment, MFRM

Abstract
In performance-based language assessment, rater behaviour is a major contributor to measurement error: rater errors pose a threat to any rating process. The emphasis on students' self-directed learning following the implementation of the CEFR in the Malaysian English language syllabus requires students to be able to assess their own progress, and the validity and reliability of the resulting scores should be beyond question. However, because self-assessment is a new practice, there is a lack of awareness of the need to identify rater behaviour among students. This paper therefore discusses the types of rater error that occur in a rating process and highlights the importance of identifying these errors among secondary school students in Malaysia. It also proposes a conceptual framework of rater-mediated assessment using the Many-Facet Rasch Model (MFRM) that can be used to understand rater errors. The implication of this conceptual paper is that teachers will gain insight into the factors that threaten students' ability to become good raters. In addition, the conceptual framework of rater-mediated assessment using MFRM will help teachers understand the relationship between the three facets (task, examinee, and rater) through the outputs produced by MFRM. Future research should examine the factors that contribute to students' rater errors, which undoubtedly affect their judgements in a rating process, based on the proposed conceptual framework of rater-mediated assessment using MFRM.
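The three-facet structure described above can be made concrete. A standard formulation of the MFRM for a rating with examinee, task, and rater facets (a sketch in conventional Rasch notation; the symbols here are illustrative, not taken from the paper itself) models the log-odds of an examinee receiving category $k$ rather than $k-1$ as:

```latex
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right)
  = \theta_n - \delta_i - \lambda_j - \tau_k
```

where $\theta_n$ is the ability of examinee $n$, $\delta_i$ the difficulty of task $i$, $\lambda_j$ the severity of rater $j$, and $\tau_k$ the threshold between categories $k-1$ and $k$. Because rater severity $\lambda_j$ enters the model as its own parameter, MFRM output can separate a harsh or lenient rater from a genuinely weak or strong examinee, which is what makes it suitable for detecting rater errors.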
License
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.