Factuality concerns whether events described in text actually occurred, are occurring, or will occur. This semantic property is mediated by various linguistic devices including modal auxiliaries, negation, conditionality, and evidential markers that signal speaker commitment to the truth of propositions.
The annotation framework decomposes factuality into two orthogonal dimensions: the factuality value itself (ranging from "definitely did not happen" to "definitely happened") and the annotator's confidence in that assessment. This two-dimensional approach captures the uncertainty inherent in natural language, where speakers often hedge claims or report information from other sources.
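To make the two dimensions concrete, the sketch below represents a single judgment as a small record. The class name, field names, and numeric scales here are hypothetical conveniences for illustration, not the released dataset's schema; a -3 to +3 factuality range is assumed only to match the "definitely did not happen" to "definitely happened" endpoints.

```python
from dataclasses import dataclass

@dataclass
class FactualityAnnotation:
    """One annotator's judgment for a single event.

    All field names and scales are illustrative assumptions,
    not the released dataset's actual schema.
    """
    sentence_id: str        # identifier of the annotated sentence
    predicate_index: int    # token position of the event's head predicate
    factuality: float       # assumed scale: -3 (did not happen) to +3 (happened)
    confidence: float       # assumed scale: 0 (not confident) to 4 (totally confident)

# Example: "Jo may have left." -- the modal 'may' lowers both the
# factuality value and, plausibly, the annotator's confidence.
annotation = FactualityAnnotation(
    sentence_id="ex-001",
    predicate_index=3,      # the predicate "left"
    factuality=0.5,
    confidence=2.0,
)
print(annotation)
```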
Annotators assess factuality through questions about whether events happened in the actual world according to the speaker, with confidence ratings capturing epistemic uncertainty. The framework handles complex linguistic phenomena including embedded clauses, reported speech, and hypothetical scenarios.
The resulting annotations enable computational models to treat factuality prediction as a regression task rather than a classification task, better reflecting the gradient nature of certainty in natural language. Applications include information extraction systems that must distinguish factual claims from speculation, as well as misinformation detection systems.
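As a minimal illustration of the regression framing, the sketch below fits a bag-of-words ridge regressor to a few toy sentences with made-up factuality scores on an assumed [-3, 3] scale. This is not the neural architecture from the publications below; it only shows how graded numeric targets replace discrete factual/non-factual labels.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Toy training data: sentences paired with factuality scores on an
# assumed [-3, 3] scale (values invented for illustration only).
sentences = [
    "Jo left the party.",           # asserted: high factuality
    "Jo did not leave the party.",  # negated: low factuality
    "Jo may have left the party.",  # modal: uncertain
    "If Jo left, we missed her.",   # conditional: uncertain
]
scores = [3.0, -3.0, 0.5, 0.0]

# Regression (not classification) lets the model output graded
# factuality values rather than a hard binary label.
model = make_pipeline(TfidfVectorizer(), Ridge(alpha=1.0))
model.fit(sentences, scores)

print(model.predict(["Jo might not have left the party."]))
```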
Publications
- White, Aaron Steven, Dee Ann Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, and Benjamin Van Durme. 2016. Universal Decompositional Semantics on Universal Dependencies. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 1713–1723. Austin, Texas: Association for Computational Linguistics.
- Rudinger, Rachel, Aaron Steven White, and Benjamin Van Durme. 2018. Neural Models of Factuality. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 731–744. New Orleans, Louisiana: Association for Computational Linguistics.
- White, Aaron Steven, Rachel Rudinger, Kyle Rawlins, and Benjamin Van Durme. 2018. Lexicosyntactic Inference in Neural Models. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 4717–4724. Brussels, Belgium: Association for Computational Linguistics.