Validity and Purpose of Assessment (C&M Core: Seminar 3)
The Seminar 3 task was to:
- Read Chapter 12 of Churchill et al. (2013), Chapters 2 and 3 of Brady and Kennedy (2012), and Klenowski & Wyatt-Smith (2010).
- List the various types of assessments you have identified, and answer the following two questions in 250 words.
Why is the purpose of an assessment task so important?
Why is validity, dependability, reliability and moderating so important to student assessment?
Chapter 2 of Brady and Kennedy (2012) lays out many of the different types of assessment clearly, summarising much of the discussion in the other readings. Several dichotomies are listed in their Table 2.1, which reproduces the ‘modes of assessment’ identified by the New South Wales Board of Studies (2007):
- Formal/ Informal
- Formative/ Summative
- Continuous/ Terminal
- Coursework/ Examination
- Process-oriented/ Product-oriented
- Internal/ External
These dichotomies, and the interactions between them, alone lend enormous complexity to the problem of categorising assessment. Wyatt-Smith and Ludwig (1998) offer an alternative categorisation, presented in Table 2.2 of Chapter 2 of Brady and Kennedy (2012), where assessment is divided into:
- Cohort testing
- Survey sampling
- Progress mapping, or
- School-based assessment.
Another important dichotomy discussed is that of ‘authentic assessment’, as described by Goodwin and MacDonald (1997, p. 223). All in all, the categorisation of assessment is a very complex topic, and even a brief understanding of these complexities leads to an appreciation of the discussion in Masters (2014) of the divisions amongst educators and education researchers about assessment.
I think considering the purpose of assessment is crucial because, if an assessment is to be effective and achieve its intended purpose, it must fundamentally be designed from the ground up around that purpose, and tailored to achieve it. To this end, consideration of all the above-mentioned (and more) subdivisions of assessment can usefully inform the designer of an assessment, but it is also important to keep in mind the message that
> The fundamental purpose of assessment in education is to establish and understand where learners are in an aspect of their learning at the time of assessment.
As for why validity, dependability, reliability and moderation are so important: validity is the most obvious. If an assessment does not measure the thing it was intended to measure, then the conclusions, actions, and policies drawn from its results rest on false premises, and in the best case are simply unjustified. Some of the worst-case consequences of such scenarios could be unconscionable.

The importance of dependability and reliability is perhaps slightly harder to pin down, because it is founded in concepts of variance. However much we might like to believe it is possible to construct a completely dependable and reliable assessment, it is fundamentally not: there will always be some amount of uncertainty, of unreliability. Nonetheless, this is an ethical issue similar to that of validity, in that assessment will often be used to make decisions; at the very least, conclusions will be drawn from its results, and those conclusions will inform future opinions. Hence, as designers of assessment we have a moral obligation to take every step possible to make assessment as reliable and dependable as we can, because the effects of unreliability can impact people's lives in potentially very harmful ways.

This is more crucial for some types of assessment than others. Specifically, assessment that carries consequences for individuals has the highest need for reliability and dependability, as the consequences are the most direct. Large-scale assessments such as PISA, used only for comparisons between countries, have the advantage of much larger replication (sample size), which can help average out unreliability, especially when analyses compare whole distributions of scores rather than just averages, in more comprehensive ways that capture information about that unreliability. Lastly, moderation.
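Before moving on: the sample-size point above can be made concrete with a toy simulation. The true score and noise level below are invented purely for illustration; the point is that the spread of a cohort-level estimate shrinks roughly as one over the square root of the sample size.

```python
import random
import statistics

random.seed(0)

TRUE_SCORE = 60.0  # hypothetical true cohort mean
NOISE_SD = 15.0    # hypothetical measurement unreliability

def observed_mean(n):
    """Mean of n noisy observations of the same underlying score."""
    return statistics.mean(random.gauss(TRUE_SCORE, NOISE_SD) for _ in range(n))

# Repeat the measurement many times at each sample size and watch the
# estimate's spread shrink (roughly as 1/sqrt(n)).
for n in (5, 50, 5000):
    estimates = [observed_mean(n) for _ in range(200)]
    print(f"n={n:4d}  spread of cohort estimates ≈ {statistics.stdev(estimates):.2f}")
```

Unreliability never disappears, but at PISA-like sample sizes its effect on cohort-level conclusions is far smaller than its effect on any individual's score.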
Both statistical moderation and so-called ‘social moderation’, as discussed by Klenowski & Wyatt-Smith (2010), can be crucial in particular cases. As with reliability and dependability, how crucial moderation is varies across types of assessment, and depends on the purpose and function of the assessment in question. Statistical moderation is essential for assessment intended to support comparisons between groups or individuals, such as tertiary entrance rankings. Social moderation, on the other hand, can be a powerful tool for ensuring a high degree of reliability and dependability by controlling for variability between, for example, different markers.
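As a sketch of what statistical moderation can look like in its simplest form, the snippet below linearly rescales a school's internal marks so their mean and spread match a common external exam. Real moderation procedures (such as those behind tertiary entrance ranks) are considerably more sophisticated, and every number here is invented for illustration.

```python
import statistics

def moderate(school_marks, exam_mean, exam_sd):
    """Linearly rescale internal marks so their mean and standard
    deviation match the reference (external exam) distribution."""
    m = statistics.mean(school_marks)
    s = statistics.stdev(school_marks)
    return [exam_mean + exam_sd * (x - m) / s for x in school_marks]

# A generously-marking school (internal mean 85) is pulled back onto the
# cohort-wide exam distribution (mean 65, sd 10).
raw = [78, 82, 85, 88, 92]
adjusted = moderate(raw, exam_mean=65.0, exam_sd=10.0)
print([round(a, 1) for a in adjusted])  # → [52.0, 59.4, 65.0, 70.6, 78.0]
```

Note that this kind of linear moderation preserves each student's rank within the school; it only adjusts the location and spread of the marks so that schools with different internal marking standards become comparable.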
TODO: Re-read Chapter 12 of Churchill et al. (2013) in more detail. TODO: Add Standards, etc.
References:
- Churchill, R., Ferguson, P., Godinho, S., Johnson, N. F., Keddie, A., Letts, W., Mackay, J., McGill, M., Moss, J., Nagel, M., & Nicholson, P. (2013). Teaching: Making a difference.
- Brady, L., & Kennedy, K. J. (2012). Assessment and reporting: Celebrating student achievement.
- Masters, G. N. (2014). Assessment: getting to the essence.