This repository contains the data and explanations required to reconstruct the dataset as described in the paper “Detecting Persuasive Arguments based on Author-Reader Personality Traits and their Interaction”.


The data is in CSV format and contains 9747 quadruplets of the form:
comment_id, submission_id, comment_author_id, delta_reader_id

For example:
ceefsm0, 1u4fqn, jmsolerm, CleanMyWounds53013
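
The rows can be loaded with a few lines of Python. This is a minimal sketch: it assumes the file has no header row and that fields are separated by a comma followed by optional spaces, as in the example above. The `Record` type and the `read_records` helper are illustrative names, not part of the repository.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Record(NamedTuple):
    """One row of the dataset, in the column order given above."""
    comment_id: str
    submission_id: str
    comment_author_id: str
    delta_reader_id: str


def read_records(handle: TextIO) -> Iterator[Record]:
    """Yield one Record per well-formed line of the CSV data."""
    # skipinitialspace drops the space that follows each comma.
    reader = csv.reader(handle, skipinitialspace=True)
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        yield Record(*(field.strip() for field in row))


# The example row from this README, read from an in-memory buffer.
sample = io.StringIO("ceefsm0, 1u4fqn, jmsolerm, CleanMyWounds53013\n")
records = list(read_records(sample))
```

To read the real file, pass an open file handle instead of the in-memory buffer, for example `open(path, newline="")`, where `path` points to the CSV file in this repository.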

Extract the content using the Reddit API

  1. First access the comment content, and from there you can get:
    1. The content of the comment
    2. The link to the submission, by extracting the “parent_id” field
  2. Get the content of the submission by accessing it and extracting the “selftext” field.
  3. To access the comment_author_id or delta_reader_id content, retrieve that user's comments and that user's posts.
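
The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses Reddit's public JSON endpoints (`/api/info.json` and `/user/<name>/comments.json` or `/user/<name>/submitted.json`), which may not be the exact links the authors intended, and unauthenticated requests may be rate limited or refused, in which case the OAuth API or a client library would be needed. The helper names and the `USER_AGENT` string are hypothetical.

```python
import json
import urllib.request

BASE = "https://www.reddit.com"
USER_AGENT = "dataset-rebuild-sketch/0.1"  # hypothetical; choose your own


def info_url(fullname: str) -> str:
    """URL for one item by fullname: 't1_<id>' comment, 't3_<id>' submission."""
    return f"{BASE}/api/info.json?id={fullname}"


def user_listing_url(username: str, kind: str) -> str:
    """URL for a user's listing; kind is 'comments' or 'submitted' (posts)."""
    return f"{BASE}/user/{username}/{kind}.json"


def fetch_json(url: str) -> dict:
    """Download and decode one JSON document (performs a network request)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def first_item(listing: dict) -> dict:
    """Return the data of the first entry in a Reddit listing."""
    return listing["data"]["children"][0]["data"]


def comment_fields(listing: dict) -> dict:
    """Step 1: the comment text and its 'parent_id' field."""
    data = first_item(listing)
    return {"body": data["body"], "parent_id": data["parent_id"]}


def submission_selftext(listing: dict) -> str:
    """Step 2: the 'selftext' field of the submission."""
    return first_item(listing)["selftext"]


def rebuild_example() -> dict:
    """Run steps 1 to 3 for the example row in this README (needs network)."""
    comment = comment_fields(fetch_json(info_url("t1_ceefsm0")))
    selftext = submission_selftext(fetch_json(info_url("t3_1u4fqn")))
    author_comments = fetch_json(user_listing_url("jmsolerm", "comments"))
    reader_posts = fetch_json(user_listing_url("CleanMyWounds53013", "submitted"))
    return {
        "comment": comment,
        "selftext": selftext,
        "author_comments": author_comments,
        "reader_posts": reader_posts,
    }
```

Calling `rebuild_example()` issues four requests. User listings are paginated, so collecting a user's full history would require following the `after` token in each response, which this sketch does not do.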

To read more about the API, see:

If you use this data, please cite:

  author    = {Michal Shmueli{-}Scheuer and
               Jonathan Herzig and
               David Konopnicki and
               Tommy Sandbank},
  title     = {Detecting Persuasive Arguments based on Author-Reader Personality
               Traits and their Interaction},
  booktitle = {Proceedings of the 27th {ACM} Conference on User Modeling, Adaptation
               and Personalization, {UMAP} 2019, Larnaca, Cyprus, June 9-12, 2019.},
  pages     = {211--215},
  year      = {2019},
  url       = {},
  doi       = {10.1145/3320435.3320467}


Michal Shmueli-Scheuer, Information Retrieval Solutions, IBM Research - Haifa