This repository contains the data and instructions needed to reconstruct the dataset described in the paper “Detecting Persuasive Arguments based on Author-Reader Personality Traits and their Interaction”.

Dataset:

The data is in CSV format and contains 9747 quadruplets of the form:
comment_id, submission_id, comment_author_id, delta_reader_id

For example:
ceefsm0, 1u4fqn, jmsolerm, CleanMyWounds53013
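
As a minimal sketch of how these rows might be loaded, the snippet below parses each line into a named record. It assumes the file has no header row and that fields are separated by a comma with an optional following space, as in the example above. The names `Quadruplet` and `read_quadruplets` are illustrative and not part of the repository.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Quadruplet(NamedTuple):
    """One dataset row; field names follow the README (illustrative type)."""
    comment_id: str
    submission_id: str
    comment_author_id: str
    delta_reader_id: str


def read_quadruplets(handle: TextIO) -> Iterator[Quadruplet]:
    """Yield one Quadruplet per row that has exactly four fields."""
    # skipinitialspace drops the blank after each comma, as in the example row.
    for row in csv.reader(handle, skipinitialspace=True):
        fields = [field.strip() for field in row]
        if len(fields) != 4:
            continue  # assumption: blank or malformed lines are skipped
        yield Quadruplet(*fields)


# Demonstration on the example row from this README (no file access needed).
sample = io.StringIO("ceefsm0, 1u4fqn, jmsolerm, CleanMyWounds53013\n")
rows = list(read_quadruplets(sample))
print(rows[0].comment_id, rows[0].delta_reader_id)
```

To read the real file, pass an open file handle instead of the `io.StringIO` sample. If the file does turn out to have a header row, it would need to be skipped first.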

Extracting the content using the Reddit API

  1. First, retrieve the comment from https://www.reddit.com/api/info.json?id=comment_id. The API expects the ID as a full name, that is, prefixed with t1_ for comments (for example, t1_ceefsm0). From the response you can get:
    1. The content of the comment
    2. The link to the submission, by extracting the “parent_id” field (the submission ID is also listed directly in the dataset as submission_id)
  2. Get the content of the submission by accessing https://www.reddit.com/api/info.json?id=parent_id and extracting the “selftext” field.
  3. To access the content written by comment_author_id or delta_reader_id, use https://www.reddit.com/user/comment_author_id/comments/ to get the comments and https://www.reddit.com/user/comment_author_id/posts/ to get the posts, replacing comment_author_id with delta_reader_id for the reader.
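
The steps above can be sketched in Python roughly as follows. This is an illustration under several assumptions, not the authors' extraction code. It assumes that comment IDs need Reddit's `t1_` prefix, that the response is a listing whose items sit under `data.children[].data`, and that the comment text is in a `body` field. The helper names and the User-Agent string are hypothetical. The demonstration uses a hand-written mock response, so it runs without network access.

```python
import json
import urllib.request
from typing import Any, Dict

INFO_URL = "https://www.reddit.com/api/info.json?id={}"


def comment_info_url(comment_id: str) -> str:
    """Build the info URL for a comment, adding the 't1_' prefix if absent."""
    fullname = comment_id if comment_id.startswith("t1_") else "t1_" + comment_id
    return INFO_URL.format(fullname)


def user_urls(user_id: str) -> Dict[str, str]:
    """Return the comments and posts URLs for a user, as given in the README."""
    base = "https://www.reddit.com/user/{}/".format(user_id)
    return {"comments": base + "comments/", "posts": base + "posts/"}


def first_child(listing: Dict[str, Any]) -> Dict[str, Any]:
    """Return the 'data' dict of the first item in an info.json listing."""
    return listing["data"]["children"][0]["data"]


def fetch_json(url: str) -> Dict[str, Any]:
    """Download and decode a JSON document (requires network access)."""
    # The User-Agent value is a placeholder; choose one that identifies you.
    request = urllib.request.Request(
        url, headers={"User-Agent": "dataset-rebuild-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Offline demonstration with a mock response shaped like the assumed listing.
mock = {"data": {"children": [{"data": {"body": "example text",
                                        "parent_id": "t3_1u4fqn"}}]}}
comment = first_child(mock)                              # step 1
submission_url = INFO_URL.format(comment["parent_id"])   # step 2
print(comment_info_url("ceefsm0"))
print(submission_url)
print(user_urls("jmsolerm")["comments"])                 # step 3
```

For live data, replace `mock` with `fetch_json(comment_info_url(comment_id))`, then read `selftext` from `first_child(fetch_json(submission_url))`. Pausing between requests is advisable, because Reddit rate-limits unauthenticated clients.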

To read more about the API, see: https://www.reddit.com/dev/api/

If you use this data, please cite:

@inproceedings{umap-Shmueli-Scheuer19,
  author    = {Michal Shmueli{-}Scheuer and
               Jonathan Herzig and
               David Konopnicki and
               Tommy Sandbank},
  title     = {Detecting Persuasive Arguments based on Author-Reader Personality
               Traits and their Interaction},
  booktitle = {Proceedings of the 27th {ACM} Conference on User Modeling, Adaptation
               and Personalization, {UMAP} 2019, Larnaca, Cyprus, June 9-12, 2019.},
  pages     = {211--215},
  year      = {2019},
  url       = {https://doi.org/10.1145/3320435.3320467},
  doi       = {10.1145/3320435.3320467}
}


Contact

Michal Shmueli-Scheuer, Information Retrieval Solutions, IBM Research - Haifa