How does WeCP measure the difficulty, quality, and relevance of each question?
Written by Abhishek
Updated over a year ago

WeCP uses statistical analysis to determine the difficulty level, quality, and relevancy-to-skill of each question. This ensures that each question is appropriate for the intended candidates and that the assessment is fair and accurate.

Here's how WeCP measures the difficulty, quality and relevancy of each question:

  1. Item Difficulty (p-value): This measures the proportion of participants who answered a question correctly. A high p-value (e.g., 0.8) indicates that most participants answered the question correctly, so it is considered an "easy" question. A low p-value (e.g., 0.2) indicates that most participants answered it incorrectly, so it is considered a "hard" question. This metric is an indicator of the difficulty level of a question.

  2. Discrimination Index: This measures how well a question distinguishes between "high-performing" and "low-performing" candidates. A high discrimination index (e.g., 0.8) indicates that the question effectively separates high-performing from low-performing participants, so it is considered a "good" question. A low discrimination index (e.g., 0.2) indicates that the question does not separate the two groups well, so it is considered a "poor" question. This metric is an indicator of the quality of a question.

  3. Correlation: This measures the correlation between performance on the question and the total score. A high correlation (e.g., 0.8) indicates that the question is closely related to the overall concept, so it is considered a "good" question. A low correlation (e.g., 0.2) indicates that the question is not closely related to the overall concept, so it is considered a "poor" question. This metric is an indicator of the relevancy of a question to the skill and concept being assessed.

Using these three statistics, WeCP assigns a difficulty, quality, and relevancy rating to each question, so that each question can be matched to the intended audience.

Understanding P-value:

P-value, also known as item difficulty, is the proportion of participants who answered a multiple-choice question correctly. It is unrelated to the p-value used in statistical significance testing. It is calculated by dividing the number of correct answers by the total number of participants.

The formula for P-value is:

P-value = (number of correct answers) / (total number of participants)

P-value ranges between 0 and 1, with higher values indicating that the question is easier (i.e. more participants answered it correctly) and lower values indicating that the question is harder (i.e. fewer participants answered it correctly).

For example, if a question was answered correctly by 80 out of 100 participants, the P-value would be 0.8, indicating that the question is considered to be an "easy" question, as 80% of the participants answered it correctly.
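The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula, not WeCP's actual implementation; the function name `item_difficulty` and the list-of-booleans input are assumptions made for the example.

```python
def item_difficulty(responses):
    """Return the P-value: the proportion of participants who answered correctly.

    `responses` is a hypothetical input format: one boolean per participant,
    True if that participant answered the question correctly.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


# 80 correct answers out of 100 participants, as in the example above.
responses = [True] * 80 + [False] * 20
p_value = item_difficulty(responses)
print(p_value)  # 0.8
```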

It's important to note that P-value applies to multiple-choice questions. It is one of the most common ways to measure the difficulty level of a question, but other methods exist as well.

Understanding Discrimination Index:

Discrimination Index is a measure of how well a question is able to distinguish between high-performing and low-performing participants. It is calculated by comparing the proportion of high-performing participants who answered the question correctly to the proportion of low-performing participants who answered the question correctly.
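A minimal sketch of this group comparison is shown below. It assumes the high-performing and low-performing groups are the top and bottom 27% of participants by total score, which is a common convention in classical item analysis; the article does not say which cut-off WeCP uses, and the function name and parameters are hypothetical.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Proportion correct in the high group minus proportion correct in the low group.

    item_correct: 1/0 (or True/False) per participant for this question.
    total_scores: each participant's total score on the assessment.
    fraction:     assumed share of participants in each group (27% by convention).
    """
    if len(item_correct) != len(total_scores):
        raise ValueError("inputs must have the same length")
    n = len(total_scores)
    if n < 2:
        raise ValueError("at least two participants are required")
    k = max(1, round(n * fraction))  # size of each group
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest score first
    low, high = order[:k], order[-k:]
    p_high = sum(item_correct[i] for i in high) / k
    p_low = sum(item_correct[i] for i in low) / k
    return p_high - p_low


# Ten participants: total scores and whether each answered this question correctly.
total_scores = [95, 90, 85, 80, 70, 60, 50, 40, 30, 20]
item_correct = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
d = discrimination_index(item_correct, total_scores)
print(round(d, 2))  # 0.67
```

In this example all three top scorers answered correctly but only one of the three bottom scorers did, so the index is 1 - 1/3.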

There are different ways of calculating the discrimination index. One of the most common is the point-biserial correlation coefficient (PB), which is the Pearson correlation between the 0/1 score on the question and the total score. The formula for the point-biserial correlation coefficient is:

PB = ((M1 - M0) / s) * sqrt(p * (1 - p))

Where:

  • M1 is the mean total score of the participants who answered the question correctly

  • M0 is the mean total score of the participants who answered the question incorrectly

  • s is the standard deviation of the total scores of all participants (population form, dividing by N, the total number of participants)

  • p is the proportion of participants who answered the question correctly

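The PB coefficient ranges between -1 and 1. Positive values indicate that participants who answered the question correctly tend to have higher total scores, so the question distinguishes between high-performing and low-performing participants. Values near 0 indicate that the question does not distinguish between the two groups. Negative values indicate that low-performing participants answered the question correctly more often than high-performing participants, which usually means the question should be reviewed.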
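For illustration, here is a small Python sketch of the point-biserial coefficient in its standard textbook form, PB = ((M1 - M0) / s) * sqrt(p * (1 - p)), where M1 and M0 are the mean total scores of participants who answered correctly and incorrectly, s is the population standard deviation of all total scores, and p is the proportion who answered correctly. The function name and input format are assumptions for the example, not WeCP's API.

```python
import math


def point_biserial(item_correct, total_scores):
    """Point-biserial correlation between a 0/1 item score and the total score."""
    if len(item_correct) != len(total_scores):
        raise ValueError("inputs must have the same length")
    n = len(total_scores)
    right = [s for c, s in zip(item_correct, total_scores) if c]
    wrong = [s for c, s in zip(item_correct, total_scores) if not c]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect answer")

    mean_all = sum(total_scores) / n
    # Population standard deviation (divide by n) of all total scores.
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in total_scores) / n)
    if sd == 0:
        raise ValueError("total scores have no variance")

    m1 = sum(right) / len(right)  # mean total score, answered correctly
    m0 = sum(wrong) / len(wrong)  # mean total score, answered incorrectly
    p = len(right) / n            # proportion who answered correctly
    return (m1 - m0) / sd * math.sqrt(p * (1 - p))


# The two highest scorers answered correctly, the two lowest did not.
print(round(point_biserial([1, 1, 0, 0], [4, 3, 2, 1]), 3))   # 0.894
# Reversed pattern: only the lowest scorers answered correctly.
print(round(point_biserial([0, 0, 1, 1], [4, 3, 2, 1]), 3))   # -0.894
```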

It's important to note that the discrimination index is not limited to the point-biserial correlation coefficient. Other methods include the corrected item-total correlation, logistic discrimination, and more.

Understanding Correlation:

Correlation here refers to Pearson's correlation coefficient, which is represented by the symbol "r" and is calculated as:

r = (n ∑xy - ∑x ∑y) / √((n ∑x^2 - (∑x)^2) * (n ∑y^2 - (∑y)^2))

Where:

  • x and y are the scores for each participant on two variables (e.g. question performance and total score)

  • n is the number of participants

  • ∑x represents the sum of all x scores

  • ∑y represents the sum of all y scores

  • ∑xy represents the sum of the product of x and y scores for each participant

  • ∑x^2 and ∑y^2 represent the sums of the squared x scores and the squared y scores

The correlation coefficient (r) ranges from -1 to 1, with -1 indicating a perfect negative correlation, 0 indicating no correlation, and 1 indicating a perfect positive correlation. A positive correlation means that as the score on one variable increases, the score on the other variable also tends to increase. A negative correlation means that as the score on one variable increases, the score on the other variable tends to decrease.
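The formula above translates almost term by term into Python. The sketch below is illustrative only; the function name `pearson_r` and the sample data are assumptions, with x standing for the score on one question and y for the total score.

```python
import math


def pearson_r(x, y):
    """Pearson's r computed from the sums used in the formula above."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two participants are required")

    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)

    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variance")
    return (n * sum_xy - sum_x * sum_y) / denominator


question_scores = [1, 1, 0, 0]  # 1 = answered this question correctly
total_scores = [4, 3, 2, 1]     # total score on the assessment
print(round(pearson_r(question_scores, total_scores), 3))  # 0.894
print(pearson_r([1, 2, 3], [2, 4, 6]))                     # 1.0
```

When x is a 0/1 question score, this value is the same as the point-biserial coefficient for that question.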

It's important to note that correlation doesn't imply causation. It only indicates the strength and direction of the relationship between two variables.
