01.05.2021, 18:39
Let me reiterate why I think the sports analogy does not work here.
In a sports event (like platform diving), the idea of the rating system is that there is a single level on a linear scale which accurately describes the quality of the athlete's performance. Since such a level cannot be defined objectively, it is estimated statistically from a number of ratings. A rating which deviates a lot from the average is considered an outlier or an anomaly precisely because of this belief in a single performance level.
What we have here in the Puzzle Portal is a completely different rating approach. The different "judges" are not trying to reach a consensus about the quality of the puzzle. A simplified example: Suppose a Sudoku is published, and there are three categories of solvers: those who enjoy Sudokus very much, those who can tolerate them, and those who do not like Sudokus at all. These are three independent views. The last group is not an anomaly!
Calculating the average (or some other statistic) may give the impression that there is a single objective level that describes the quality of the puzzle, but in my opinion this is just not the case. A solver whose vote deviates a lot from the others cannot simply be assumed to have misjudged the quality of the puzzle; they may have an entirely different taste. And that element of "taste" does not occur in the sports analogy.
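To make the Sudoku example concrete, here is a small sketch with made-up numbers. The 1-5 scale, the vote values and the `summarize` helper are all my own assumptions for illustration; they are not the portal's actual scale or data.

```python
from collections import Counter
from statistics import mean

# Hypothetical votes on an assumed 1-5 scale, invented for illustration:
# four solvers who love Sudokus, three who tolerate them, three who dislike them.
votes = [5, 5, 5, 5, 3, 3, 3, 1, 1, 1]


def summarize(votes):
    """Return the mean vote and a count of how many solvers gave each vote."""
    counts = dict(sorted(Counter(votes).items()))
    return mean(votes), counts


average, counts = summarize(votes)
print(f"average vote: {average}")
print(f"solvers per vote: {counts}")
```

With these numbers the average comes out at 3.2, a value nobody actually voted, while the per-vote counts still show the three separate groups. Treating the three votes of 1 as outliers would discard a whole category of solvers rather than correct a misjudgement.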