To respond to an opponent’s speech, the system must process the opponent’s voice and ‘understand’ its content. The datasets provided here focus on Automatic Speech Recognition (ASR) output and on the upstream tasks involved in understanding opponents’ speeches.
| Dataset | Reference | Speeches | Topics | Contents |
|---|---|---|---|---|
| | ACL 2020 | 3,562 | 440 | Recordings of expert debaters; automatic and manually-corrected transcripts of the speeches, in both raw and cleaned (processed) versions; an annotation specifying the response speeches recorded for each speech, and the type of the response (explicit/implicit); metadata describing the speeches, such as the topic discussed in each speech |
| | EMNLP 2019, EMNLP 2018 | 200 | 50 | Recordings of expert debaters; 55 general-purpose claim and rebuttal pairs written by an expert human debater; an annotation specifying, for each of the 50 controversial topics, which of the 55 general-purpose claims is relevant to the topic; an annotation of general-purpose claims relevant to a topic, specifying whether a relevant claim was mentioned in speeches discussing the topic; an annotation of general-purpose claims and sentences from speeches in which they were mentioned, specifying whether the claim was mentioned in the sentence; an annotation of general-purpose rebuttals, specifying whether they are a plausible response to general-purpose claims mentioned in speeches |
| | ArgMining 2019 @ACL, LREC 2018 | 400 | 200 | Recordings of expert debaters + mined claims annotated in a listening comprehension task |
| | EMNLP 2018, LREC 2018 | 200 | 50 | Recordings of expert debaters + arguments annotated in a listening comprehension task |
| | LREC 2018 | 60 | 16 | Recordings of 10 expert debaters |