Resources
The most recent models are published on Hugging Face
[Benchmark, GitHub] MBIB – the first Media Bias Identification Benchmark Task and Dataset Collection
[Dataset, Hugging Face] Anno-lexical (Lexical bias)
[Dataset, GitHub] BABE – Bias Annotations By Experts
[Dataset, Paper] BAT – Bias And Twitter
[Scale/Questionnaire to measure bias perception] Do You Think It’s Biased? How To Ask For The Perception Of Media Bias (A set of tested questions to assess media bias perception to be used in any bias-related research)
[Dataset, Zenodo] MBIC – A Media Bias Annotation Dataset Including Annotator Characteristics
Publications
2021
Haak, Fabian; Engelmann, Björn
IRCologne at GermEval 2021: Toxicity Classification Proceedings Article
In: Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments, pp. 47–53, Association for Computational Linguistics, Duesseldorf, Germany, 2021.
@inproceedings{haak-engelmann-2021-ircologne,
title = {IRCologne at GermEval 2021: Toxicity Classification},
author = {Fabian Haak and Björn Engelmann},
url = {https://aclanthology.org/2021.germeval-1.7},
year = {2021},
date = {2021-09-01},
booktitle = {Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments},
pages = {47--53},
publisher = {Association for Computational Linguistics},
address = {Duesseldorf, Germany},
abstract = {In this paper, we describe the TH Köln's submission for the Shared Task on the Identification of Toxic Comments at GermEval 2021. Toxicity is a severe and latent problem in comments in online discussions. Complex language model based methods have shown the most success in identifying toxicity. However, these approaches lack explainability and might be insensitive to domain-specific renditions of toxicity. In the scope of the GermEval 2021 toxic comment classification task (Risch et al., 2021), we employed a simple but promising combination of term-frequency-based classification and rule-based labeling to produce effective but to no lesser degree explainable toxicity predictions.},
keywords = {2021 bias classification data engelmann haak nlp programming snorkel toxic},
pubstate = {published},
tppubtype = {inproceedings}
}