Debagreement: Reddit 50K

Open-source agreement / disagreement dataset of Reddit interactions developed in partnership with Oxford University.

overview

Agreement / Disagreement Dataset

Developed in partnership with Oxford University, this dataset contains comment-reply interactions across five subreddits: Democrats, Republicans, Black Lives Matter, Brexit, and Climate. Each comment-reply interaction is annotated with “agree,” “disagree,” “neutral,” or “unsure” labels by at least three raters.

Original Reddit Interactions

The original comments and subsequent interactions scraped from Reddit with no assigned labels.

Labeled Dataset: Full Agreement

Dataset where all annotators achieved full agreement and pairs have been checked by authors of the paper.

Labeled Dataset: 2/3 Agreement

Dataset where inter-annotator agreement is 2/3 or above.

Labeled Dataset: Challenge

Dataset where consensus was not reached among annotators, or pairs were labeled as “unsure.”
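The split into the three labeled subsets can be illustrated with a small sketch. It is not the authors' pipeline: the function name `assign_subset`, the subset names, and the rule that a majority "unsure" label sends a pair to the challenge set are all assumptions made for illustration, and the manual check by the paper's authors for the full-agreement subset is not modeled.

```python
from collections import Counter

# Label names taken from the taxonomy on this page.
LABELS = {"agree", "disagree", "neutral", "unsure"}


def assign_subset(ratings):
    """Return (subset, label) for one comment-reply pair.

    Hypothetical helper: subset names and tie handling are assumptions,
    not part of the released dataset.
    """
    if len(ratings) < 3:
        raise ValueError("expected at least three ratings per pair")
    unknown = set(ratings) - LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    label, votes = Counter(ratings).most_common(1)[0]
    total = len(ratings)

    # Integer comparison for "votes / total >= 2/3" avoids float rounding.
    reaches_two_thirds = 3 * votes >= 2 * total

    if label == "unsure" or not reaches_two_thirds:
        return ("challenge", None)
    if votes == total:
        return ("full_agreement", label)
    return ("two_thirds_agreement", label)


if __name__ == "__main__":
    print(assign_subset(["agree", "agree", "agree"]))
    print(assign_subset(["disagree", "disagree", "neutral"]))
    print(assign_subset(["agree", "disagree", "neutral"]))
```

The sketch returns the tightest subset a pair qualifies for. Read literally, the 2/3 description ("or above") would also cover unanimous pairs, so a consumer who wants that reading can treat both non-challenge results as members of the 2/3 subset.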

42,894 annotations

classes*

data collection

The data was collected from Reddit using Pushshift.


The pushshift.io Reddit API was designed and created by the /r/datasets mod team to provide enhanced functionality and search capabilities for Reddit comments and submissions. This RESTful API supports full search of Reddit data and can also create powerful data aggregations. With this API, you can quickly find the data you are interested in and find fascinating correlations.
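As a rough illustration of what a comment search against such a RESTful API might look like, the sketch below only builds a query URL and makes no network request. The endpoint path and the `subreddit`, `size`, `after`, and `before` parameters follow the historically documented Pushshift comment search and are assumptions here; availability and access rules may have changed, and this is not the collection script used for the dataset. The helper name `build_comment_query` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint, based on the historically documented Pushshift API.
BASE_URL = "https://api.pushshift.io/reddit/search/comment/"


def build_comment_query(subreddit, size=100, after=None, before=None):
    """Build a comment-search URL for one subreddit (hypothetical helper).

    `after` and `before` are optional time bounds, passed through unchanged.
    """
    params = {"subreddit": subreddit, "size": size}
    if after is not None:
        params["after"] = after
    if before is not None:
        params["before"] = before
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # "Brexit" is used as an illustrative subreddit name.
    print(build_comment_query("Brexit", size=25))
```

Keeping URL construction separate from the request itself makes it easy to log or inspect the exact query before sending it with an HTTP client of your choice.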

data annotation

Labeling Taxonomies

Agree

The reply shows approval towards the initial statement or initial author.

Some examples of phrases that express agreement are: “I agree”, “That’s absolutely right”, “That’s so true”, “Exactly”, “That’s how I feel”, etc.

Neutral

There is a topical exchange between authors but agreement / disagreement is not expressed.

Disagree

The reply shows disapproval or dislike towards the initial statement or initial author.

Some examples of phrases that express disagreement are: “No way”, “I don’t think so”, “Not necessarily”, “That’s not always true”, etc.

Unsure

It is not possible to make a decision based on the information at hand.

dev kit

Get Started with Oxford Dataset

Read the full Research Paper. If you use the dataset, please make sure to cite our paper.

Download Dataset

Get Started: Colab Notebook