How robust are cross-country comparisons of PISA scores to the scaling model used?


Bibliographic Details
Authors: Jerrim, John; Parker, Philip; Choi Mendizábal, Álvaro B. (Álvaro Borja); Chmielewski, Anna Katyn; Sälzer, Christine; Shure, Nikki
Resource type: article
Status: Version accepted for publication
Publication date: 2018
Country: Spain
Institution: Universidad de Barcelona
Repository: Dipòsit Digital de la UB
OAI Identifier: oai:diposit.ub.edu:2445/130879
Online access: https://hdl.handle.net/2445/130879
Access level: open access
Keywords: Rendiment acadèmic
Avaluació educativa
Mètode comparatiu
Academic achievement
Educational evaluation
Comparative method
Description
Summary: The Programme for International Student Assessment (PISA) is an important international study of 15‐year‐olds' knowledge and skills. New results are released every 3 years and have a substantial impact upon education policy. Yet, despite its influence, the methodology underpinning PISA has received significant criticism. Much of this criticism has focused upon the psychometric scaling model used to create the proficiency scores. The aim of this article is therefore to investigate the robustness of cross‐country comparisons of PISA scores to subtle changes to the underlying scaling model used. These changes include the specification of the item‐response model, whether the difficulty and discrimination of items are allowed to vary across countries (item‐by‐country interactions), and how test questions not reached by pupils are treated. Our key finding is that these technical choices make little substantive difference to the overall country‐level results.