There are a couple of main quality indicators that can easily be improved by understanding the underlying issues. The process of creating a well-performing project can be highly iterative, so these scores are meant to keep a pulse on what can be improved.
The issues queue is found in Quality Lab and serves as a centralized location where the Rapid platform automatically surfaces any quality-impacting issues that need to be resolved. Each issue is assigned one of several severity levels.
Within the issues queue, you can browse a sorted list of issues and resolve them on a case-by-case basis.
You find your Calibration Scores at the batch level. For each Calibration Batch, the score can be accessed through Batches > Calibration Batch.
You can only see your Calibration Score after you've finished a full audit of the calibration.
You find an overall picture of your project's accuracy in Metrics.
Keep in mind that while Evaluation Task Accuracies are intended to represent your project as a whole, they are only a summary of the tasks you selected as Evaluation Tasks.
It is important to maintain a healthy set of Evaluation Tasks in order to get high-quality data.
See more: Examples of various Evaluation Task curves and what they might indicate
Most healthy projects will have an Evaluation Task curve that looks like a bell curve centered around 70-80% accuracy. This indicates that the evaluation set has good coverage of the difficulty and breadth of the potential tasks, and thus the Evaluation Tasks will ensure proper quality from the Tasker workforce.
This is an example of a set of Evaluation Tasks with two centers, one on the low end and one on the high end, which may indicate a problem with the project definition. If there are many Evaluation Tasks under 40% or so, it can be a sign that you need to refine your project instructions and taxonomy.
A set of Evaluation Tasks that results in a curve centered around high accuracy, such as around 90%, can indicate one of two things. One, your instructions are clear and/or your dataset does not have a very large breadth of content or difficulty; in this case the curve is healthy. Two, if you notice that your audit results don't match up with the accuracy of your Evaluation Tasks, it may mean you need to add additional "harder" Evaluation Tasks to maintain quality.
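As a rough illustration of these rules of thumb, here is a minimal Python sketch that bins a list of per-task accuracies and flags the two unhealthy patterns described above. The input data, the field shape, and the flagging thresholds are assumptions for illustration only, not part of the Rapid platform.

```python
from collections import Counter

def summarize_eval_accuracies(accuracies):
    """Bin per-task accuracies (0.0-1.0) into 10% buckets and apply
    the rules of thumb described above. `accuracies` is a plain list
    of floats you have collected yourself (hypothetical data)."""
    buckets = Counter(min(int(a * 10), 9) for a in accuracies)
    total = len(accuracies)

    # Print a simple text histogram of the distribution.
    for b in range(10):
        count = buckets.get(b, 0)
        print(f"{b * 10:>3}-{b * 10 + 9}%: {'#' * count} ({count})")

    # Rule of thumb: many tasks under ~40% suggests instructions or
    # taxonomy that may need refinement.
    low = sum(1 for a in accuracies if a < 0.40)
    if low / total > 0.25:  # threshold is an assumption
        print("Many low-accuracy tasks: consider refining instructions/taxonomy.")

    # Rule of thumb: a cluster at 90%+ may mean the set needs
    # "harder" tasks if audit results don't match these accuracies.
    high = sum(1 for a in accuracies if a >= 0.90)
    if high / total > 0.5:  # threshold is an assumption
        print("Mostly high-accuracy tasks: verify against audits; "
              "you may need harder Evaluation Tasks.")

# Example usage with made-up accuracy values:
summarize_eval_accuracies([0.35, 0.62, 0.74, 0.78, 0.81, 0.92, 0.55, 0.70])
```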
You will also be able to see individual accuracies in the Quality Lab view.
Diving into an evaluation task type will bring up each task with its average accuracy, as well as its number of completions.
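For reference, here is a hedged sketch of the kind of per-task aggregation that view reflects. It assumes you have exported completion records yourself; the record shape (dicts with `task_id` and `correct` fields) is hypothetical, not the Rapid data model.

```python
from collections import defaultdict

def per_task_summary(completions):
    """Group completion records by task_id and compute each task's
    average accuracy and number of completions. The record format
    used here is a hypothetical example."""
    grouped = defaultdict(list)
    for c in completions:
        grouped[c["task_id"]].append(1.0 if c["correct"] else 0.0)

    return {
        task_id: {
            "avg_accuracy": sum(scores) / len(scores),
            "completions": len(scores),
        }
        for task_id, scores in grouped.items()
    }

# Example with made-up completion records:
records = [
    {"task_id": "eval-1", "correct": True},
    {"task_id": "eval-1", "correct": False},
    {"task_id": "eval-2", "correct": True},
]
print(per_task_summary(records))
```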