Developed in partnership with Oxford University, this dataset contains comment-reply interactions across five subreddits: Democrats, Republicans, Black Lives Matter, Brexit, and Climate. Each comment-reply interaction is annotated with “agree,” “disagree,” “neutral,” or “unsure” labels by at least three raters.
Original Reddit Interactions
The original comments and subsequent interactions scraped from Reddit with no assigned labels.
Labeled Dataset: Full Agreement
Dataset where all annotators achieved full agreement and pairs have been checked by authors of the paper.
Labeled Dataset: 2/3 Agreement
Dataset where there is 2/3 inter-annotator agreement or above.
Labeled Dataset: Challenge
Dataset where consensus was not reached among annotators, or pairs were labeled as “unsure.”
Build better models faster with data you can trust. Maximize operational efficiency and productivity while reducing the cost of ML projects.
The reply shows approval towards the initial statement or initial author.
Some examples of phrases that express agreement are: “I agree”,
“That’s absolutely right”, “That’s so true”, “Exactly”, “That’s how I feel”, etc.
There is a topical exchange between authors but agreement / disagreement is not expressed.
The reply shows disapproval or dislike towards the initial statement or initial author.
Some examples of phrases that express disagreement are: “No way”,
“I don’t think so”, “Not necessarily”, “That’s not always true”, etc.
It is not possible to make a decision based on the information at hand.