Projects

This is a collection of most of my publicly available software projects. The list mixes personal and research projects; the linked repository indicates which is which.
Posted on December 15, 2021

Measuring Fairness with Biased Rulers

A Survey on Quantifying Biases in Pretrained Language Models

An increasing awareness of biased patterns in natural language processing resources, like BERT, has motivated many metrics for quantifying 'bias' and 'fairness'. But comparing the results of different metrics, and of the works that evaluate with such metrics, remains difficult, if not outright impossible. We survey the existing literature on fairness metrics for pretrained language models and experimentally evaluate their compatibility, covering biases both in the language models themselves and in their downstream tasks. We do this through a mixture of traditional literature survey, correlation analysis, and empirical evaluation. We find that many metrics are not compatible with one another and depend heavily on templates, attribute and target seeds, and the choice of embeddings.
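
As a small illustration of the correlation analysis mentioned above (with entirely made-up scores, not results from the survey), one can compare how two bias metrics rank the same set of models:

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical per-model scores from two bias metrics (made-up numbers,
# for demonstration of the analysis only).
metric_a = [0.42, 0.55, 0.31, 0.60, 0.48]
metric_b = [0.10, 0.52, 0.15, 0.33, 0.47]

pearson, _ = pearsonr(metric_a, metric_b)
spearman, _ = spearmanr(metric_a, metric_b)
print(f"Pearson r = {pearson:.2f}, Spearman rho = {spearman:.2f}")
```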

Posted on April 21, 2021

Attitudes Towards COVID-19 Measures

Measuring Shifts in Belgium Using Multilingual BERT

We classify seven months' worth of Belgian COVID-related Tweets using multilingual BERT and relate them to the Belgian government's COVID measures. Tweets are labelled with their stated opinion on the government's curfew measures (too strict, ok, too loose). We examine how the topics discussed and views expressed change over time, and relate these shifts to dates of related events such as the implementation of new measures or COVID-19-related announcements in the media.
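
For illustration, here is a minimal sketch of this kind of stance classification with the Hugging Face transformers library. The checkpoint, label set, and example tweet are assumptions for demonstration, and the classification head below is untrained; it would need fine-tuning on labelled tweets before its predictions are meaningful.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["too strict", "ok", "too loose"]
checkpoint = "bert-base-multilingual-cased"  # base model, not the fine-tuned one

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=len(labels))

tweet = "De avondklok duurt veel te lang."  # Dutch: "The curfew is lasting far too long."
inputs = tokenizer(tweet, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[int(logits.argmax(dim=-1))])
```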

Posted on September 14, 2020

Ethical Adversaries

Towards Mitigating Unfairness with Adversarial Machine Learning

We offer a new framework that helps mitigate unfair representations in the training data. Our framework relies on adversaries to improve fairness. First, it evaluates a model for unfairness with respect to protected attributes and ensures that an adversary cannot guess such attributes from a given outcome, by optimizing the model's parameters for fairness while limiting utility losses. Second, the framework leverages evasion attacks from adversarial machine learning to perform adversarial retraining with new examples unseen by the model. We evaluate our framework on well-studied datasets from the fairness literature, where it can surpass other approaches in terms of demographic parity, equality of opportunity, and the model's utility.
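
As a rough illustration of the adversary idea (not the paper's exact architecture or training schedule), the PyTorch sketch below trains a predictor on a task while an adversary tries to recover a protected attribute from the predictor's output; the predictor is penalised when the adversary succeeds. All data and model sizes are synthetic placeholders.

```python
import torch
import torch.nn as nn

predictor = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(64, 10)                    # features (synthetic)
y = torch.randint(0, 2, (64, 1)).float()   # task label
a = torch.randint(0, 2, (64, 1)).float()   # protected attribute

for _ in range(100):
    # Adversary step: learn to predict the protected attribute from the output.
    with torch.no_grad():
        y_hat = predictor(x)
    opt_adv.zero_grad()
    adv_loss = bce(adversary(y_hat), a)
    adv_loss.backward()
    opt_adv.step()

    # Predictor step: fit the task while making the adversary fail.
    opt_pred.zero_grad()
    y_hat = predictor(x)
    loss = bce(y_hat, y) - 0.5 * bce(adversary(y_hat), a)
    loss.backward()
    opt_pred.step()
```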

Posted on January 20, 2020

RobBERT

A Dutch RoBERTa-based Language Model

Pre-trained language models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various complex natural language tasks. One of the most prominent pre-trained language models is BERT. Although the multilingual version of BERT performs well on many tasks, recent studies showed that BERT models trained on a single language significantly outperform the multilingual results. For this reason, we present a Dutch model based on RoBERTa, which we call RobBERT. We show that RobBERT improves state-of-the-art results on Dutch-specific language tasks.
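
A minimal usage sketch with the Hugging Face transformers library is shown below; the checkpoint identifier is an assumption based on the publicly shared RobBERT checkpoint and may differ from the exact release described here.

```python
from transformers import pipeline

# Fill-mask demo with a RobBERT checkpoint from the Hugging Face Hub
# (identifier assumed; RoBERTa-style models use the <mask> token).
fill_mask = pipeline("fill-mask", model="pdelobelle/robbert-v2-dutch-base")
for prediction in fill_mask("Er staat een <mask> in mijn tuin."):
    print(prediction["token_str"], round(prediction["score"], 3))
```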

Posted on August 01, 2019

Computational Ad Hominem Detection

Fallacies like the personal attack, also known as the ad hominem attack, are introduced in debates as an easy win, even though they provide no rhetorical contribution. Although their importance in argumentation mining is acknowledged, automated mining and analysis are still lacking. We show that TF-IDF approaches are insufficient for detecting ad hominem attacks. We therefore present a machine learning approach for information extraction, which achieves a recall of 80% on a social media data source. We also demonstrate our approach with an application that uses online learning.
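
As an illustration of this kind of pipeline (not the project's actual code or dataset), the sketch below pairs a streaming-friendly bag-of-words representation with a linear classifier that supports online updates via scikit-learn's partial_fit; the example comments and labels are made up.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# HashingVectorizer stands in for TF-IDF here because it needs no fitted
# vocabulary, which keeps the classifier updatable as new comments arrive.
vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)
classifier = SGDClassifier()

comments = ["You are too stupid to understand this argument.",
            "The sample size of that study is too small to support the claim."]
labels = [1, 0]  # 1 = ad hominem, 0 = not

# Initial pass, then incremental (online) updates with partial_fit.
classifier.partial_fit(vectorizer.transform(comments), labels, classes=[0, 1])
classifier.partial_fit(vectorizer.transform(["Only a fool would defend that position."]), [1])

print(classifier.predict(vectorizer.transform(["That statistic is outdated."])))
```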

Posted on July 01, 2019

Jerboa

A self-deployable chat app

A self-deployable chat application with support for custom emoji, math rendering, side-by-side PDF viewing, and more.

Posted on July 17, 2018

Rule Engine for Dart

An open source rule matching system

An open-source rule engine with a syntax inspired by Drools, but adapted for Flutter and Dart. The project contains a standard lexer and parser, whose output is then used to compile the described rules into a RETE network.
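
To give an intuition for what compiling rules into a network means (conditions shared between rules are defined and evaluated once), here is a heavily simplified conceptual sketch in Python rather than Dart; it is not the project's actual API and omits the incremental matching that makes RETE efficient.

```python
# Rules reference named conditions; each distinct condition is a shared node.
rules = {
    "discount":   ["senior", "member"],
    "newsletter": ["member"],
}

conditions = {
    "senior": lambda facts: facts["age"] >= 65,
    "member": lambda facts: facts["member"],
}

def fire(facts):
    # Evaluate each shared condition once, then fire every rule whose
    # conditions all hold for this set of facts.
    results = {name: test(facts) for name, test in conditions.items()}
    return [rule for rule, needed in rules.items() if all(results[c] for c in needed)]

print(fire({"age": 70, "member": True}))   # ['discount', 'newsletter']
print(fire({"age": 30, "member": True}))   # ['newsletter']
```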