Z-score: Need for preventing erosion of trust in systems
May 29, 2014, 7:53 pm, The Island
By Sankha Muthu Poruthotage
Ph.D (Statistics)
We know that we have short memories, a fact that most of us Sri Lankans
readily acknowledge. Perhaps it became a part of the nation’s psyche
during the war. Perhaps it allowed us to move on, rather than keep
dwelling on the past. However, as we prepare to usher in an era of
sustained peace and economic prosperity, we can’t afford to be
forgetful. It is time to thoroughly investigate the root causes of the
problems we have had in the past and have now. Then we have to
formulate long-lasting solutions to fix them.
The use of the Z-Score for university admission is one such issue that
we all chose to forget, despite its being contentious enough to cause
numerous protest campaigns and eventually to require a verdict from the
country’s highest court. We may recall that it made headlines in all
national newspapers for months. Unfortunately, since we never cared to
address the root causes of the problem, it may only be a matter of time
before it flares up once again.
I intend to discuss some of the shortcomings of the Z-Score method and
propose a rough outline for a potential long-term, sustainable solution.
But first, a word of caution to the reader: the Z-Score is a statistical
technique which has its relative strengths and weaknesses, and hence
attracts both praise and criticism. In fact, this is common to all
scientific methodology. I personally do not know of a superior
statistical approach and am yet to see a convincing argument for an
alternative method. It was proposed in the absence of any other viable
alternative, as a solution to a peculiar situation that arose in our AL
examination process. In fact, the proposal deserves to be applauded
rather than criticized, since it may have performed better than the
next best method. My intention is to illustrate that the Z-Score is a
method which may work satisfactorily only under some strong
assumptions, and to emphasize the need to eliminate the use of
statistical methods in the critical process of university admission.
Some of us may already know how to calculate the Z-Scores of a given
set of data. We first calculate the mean (average) and the standard
deviation (S.D.) of the data set (the S.D. is a widely used statistical
measure that indicates the dispersion among data points). Then from
each data point, we subtract the mean and divide the answer by the S.D.
Note that for each original data point, we get a corresponding Z-Score.
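The calculation just described can be sketched in a few lines of Python. This is a minimal illustration only: the function name `z_scores` and the marks are invented for the sketch, and the use of the population S.D. (rather than the sample S.D.) is an assumption, since the article does not say which one the AL process uses.

```python
from statistics import mean, pstdev


def z_scores(marks):
    """Return the Z-Score of every mark: (mark - mean) / S.D."""
    mu = mean(marks)
    sigma = pstdev(marks)  # population S.D.; an assumption, not from the article
    if sigma == 0:
        raise ValueError("all marks are equal, so Z-Scores are undefined")
    return [(m - mu) / sigma for m in marks]


# Hypothetical marks, invented for this sketch only.
marks = [40, 50, 60, 70, 80]
print([round(z, 2) for z in z_scores(marks)])
# prints [-1.41, -0.71, 0.0, 0.71, 1.41]
```

Each mark maps to exactly one Z-Score, and the Z-Scores of any data set have a mean of zero and an S.D. of one.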
Now let us consider the following situation. The data I have presented
are completely imaginary and serve as an illustration only.
Suppose we select the 10 students who obtained the highest marks for
mathematics at the past GCE OL exam and the 10 students who obtained
the lowest. Then we give two different, but equally difficult, new
mathematics exams to the two groups. Let the group of good students be
called group 1 and be given exam 1, and the other group be called
group 2 and be given exam 2. Suppose the following are the exam marks
in the two groups. They appear the way we expect them to, with group 1
scoring much higher than group 2.
(Please see the table)
Thereafter let us calculate the Z-Score for each student, using the
mean and S.D. of that student’s own group. Now, if we rank all the
students by their Z-Scores, several students who are in group 2 will be
ranked higher than students in group 1. In fact, Marvan, who is in
group 2, will be the highest ranked overall!
This illustration, despite its extreme nature, exposes a critical
assumption one needs to make when using Z-Scores to compare two groups:
that the two groups are identical in every sense other than the
treatment of interest (here, the exam paper they sat). If they are not,
a ranking based on Z-Scores can produce outrageous results, as this
extreme example illustrates.
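The effect can be reproduced with a short Python sketch. The marks below are not the figures from the table referred to above; they are hypothetical numbers invented here, chosen only so that every student in group 1 outscores every student in group 2. Z-Scores are computed within each group, using the population S.D. as an assumption.

```python
from statistics import mean, pstdev


def z_scores(marks):
    """Z-Scores computed within one group: (mark - mean) / S.D."""
    mu, sigma = mean(marks), pstdev(marks)
    return [(m - mu) / sigma for m in marks]


# Hypothetical marks, invented for this sketch (not the article's table).
group1 = [95, 94, 93, 92, 91, 90, 89, 88, 87, 86]  # strong students, exam 1
group2 = [45, 20, 18, 17, 16, 15, 14, 13, 12, 10]  # weak students, exam 2

# Pool both groups and rank everyone by Z-Score, highest first.
ranked = sorted(
    [("group 1", m, z) for m, z in zip(group1, z_scores(group1))]
    + [("group 2", m, z) for m, z in zip(group2, z_scores(group2))],
    key=lambda row: row[2],
    reverse=True,
)

for group, mark, z in ranked[:3]:
    print(f"{group}: mark {mark}, Z-Score {z:.2f}")
```

With these invented marks the top-ranked student comes from group 2, even though that student's raw mark of 45 is far below every mark in group 1. The Z-Score only measures how far a student stands out within his or her own group.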
In our AL exam situation, willingly or not, we make several such
assumptions when we rank students based on their Z-Scores. For example:
* The new syllabus students (those who take the exam for the first
time) and the old syllabus students are identical in every sense other
than the exam papers they were given. In other words, we assume the two
groups are identical in terms of intelligence, motivation, exam
preparedness and everything else that one may think of as having a
potential impact on their exam performance.
* The two groups of students who take two alternative subjects, such as
Economics and French language, are once again identical in all aspects
except for the two different subject streams that they have chosen to
study.
These are strong assumptions and should be avoided if possible.
Even if we make such strong assumptions, there are further obstacles
down the road. A fundamental question one can ask is "Can it be
justified that a Z-Score of 2 is better than a Z-Score of 1.5?" If
these scores are for two candidates who sat for two different exams but
belong to two groups that are identical in every other sense, the
answer is "Yes". The justification is:
* The candidate with a Z-Score of 2 belongs to roughly the top 2.3% of
his group, while the candidate with a Z-Score of 1.5 has about 6.7% of
students above him. Since the two groups are assumed to be identical,
the candidate with a Z-Score of 2 is better than the one with a Z-Score
of 1.5.
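The shares quoted in this justification can be checked with a small Python sketch, assuming the marks in each group follow a normal distribution exactly. The helper name `share_above` is invented for the illustration; `NormalDist` is part of Python's standard `statistics` module.

```python
from statistics import NormalDist


def share_above(z):
    """Fraction of a normally distributed group scoring above Z-Score z."""
    return 1.0 - NormalDist().cdf(z)


for z in (1.5, 2.0):
    print(f"Z = {z}: about {share_above(z):.1%} of the group scores higher")
```

Under exact normality the shares come out near 6.7% for a Z-Score of 1.5 and near 2.3% for a Z-Score of 2, in line with the figures quoted above.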
For theoretical completeness, I will add the following. This
justification is also based on an assumption called "normality".
However, it is a more realistic assumption than those above in a
situation such as exam scores. Even without the normality assumption,
this can be justified using a well-known probability result called
"Tchebyshev’s Inequality", which bounds the proportion of students
above a given Z-Score.
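As a rough sketch of what the inequality provides, the classic two-sided form states that, for any distribution with a finite S.D., at most 1/k² of the group can lie k or more S.D.s away from the mean. The function name `chebyshev_bound` is invented for this illustration.

```python
def chebyshev_bound(k):
    """Upper bound on the share of any group with |Z-Score| >= k, for k > 0."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k**2)


for k in (1.5, 2.0):
    print(f"|Z| >= {k}: at most {chebyshev_bound(k):.1%} of the group")
```

The bound holds without any normality assumption, but it is much looser than the normal-distribution figures: at most 25% of a group can have a Z-Score of 2 or more in absolute value, and at most about 44% can have 1.5 or more.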
Nevertheless, when we add Z-Scores, things get a lot more complicated
and harder to justify (an aggregate Z-Score is calculated in the AL
ranking process). It is a lot harder to justify that candidate "A",
with an aggregate Z-Score of 4.0, should be prioritized over candidate
"B", with an aggregate Z-Score of 3.9!
At the very beginning of this article I mentioned that long-lasting
solutions can be devised only by investigating the root causes of a
problem. The need for Z-Scores (standardization) arose for two reasons.
* Sudden coexistence of new and old syllabuses.
It can be argued that education curriculums should evolve, not change.
Simply from an educational standpoint, comprehensive curriculum changes
once every decade or so should be avoided. They deprive one group of
students of any advancement in their respective fields for a prolonged
period of time, while burdening the other group of students and
educators with an extensive amount of new material. If curriculum
revision is made a continuous process, done annually or bi-annually,
the changes will be nominal from year to year, which will eliminate the
need for two different curriculums. This will in effect eradicate the
ambiguity around the use of a statistical method.
* Ability to compete for the same university degree while studying different subject combinations.
This is a complex situation that needs to be addressed from several
perspectives. It is not uncommon to have students from slightly
different prior educational backgrounds studying for the same degree in
a university. A common practice internationally is to evaluate students
for university admission based on a standardized exam, such as the SAT,
which generally covers core subject disciplines. However, we need to
recognize that there is no need to have a standardized exam for degrees
such as medicine and engineering, where a specific subject combination
is required for admission.
An alternative solution might be for higher education institutes and
regulatory bodies to decide on a quota for each subject combination. I
personally like an approach that requires both educators and students
to evaluate their programs and choices based on resource availability
and demand for employment. However, I acknowledge that there are
several ground realities that need to be factored in when devising such
a quota system.
Finally, I would like to point out that none of these solutions are
easy fixes. They require a great deal of careful planning, hard work,
dedication and determination. However, we owe it to our future
generations. As we have experienced repeatedly in the past, a breakdown
of trust in our systems, especially in the education system, can lead
to catastrophic consequences.