Pieter Delobelle is Lead AI Scientist at Pleias and a postdoc at KU Leuven, working on pretraining, tokenization, and AI safety. He created RobBERT, the state-of-the-art family of Dutch language models (among the top 80 most-downloaded models on Hugging Face; 398 citations), and has collaborated on most of the open Dutch LLMs.
His work has been published at top AI and NLP venues, including research conducted at Apple and Aleph Alpha, and has been covered by WIRED, MIT Technology Review, De Tijd, and VTM Nieuws. He has completed research visits at MilaNLP (Bocconi University), HU Berlin, and the Weizenbaum Institute.
He contributes to AI policy as an NLP expert for the EU AI Office's code of practice, was a member of the KU Leuven GenAI advisory board, and sits on the KU Leuven council for research policy. He received his PhD from KU Leuven in 2023 and has accumulated 1,045+ citations (h-index 14) across 32 publications.
Last updated on February 1, 2026.
Working on synthetic data and small language models.
Working on large language models and fairness in AI systems.
Led R&D on translation and explainability techniques, demonstrated at the EU Parliament. Contributed to research on bias mitigation and LLM steering. Developed various inference features, e.g., multi-node DeepSeek inference over InfiniBand.
Led workshops on the inner workings of LLMs and AI safety for clients, e.g., senior management at KBC. Developed the LLMQ Python package for scheduling large LLM inference jobs. NLP expert for the EU AI Office's code of practice for foundation models.
Led tokenization research and released BübleLM-2B with HU Berlin. Research visit to Alan Akbik's group at HU Berlin. Lecturer for the NLP course at KULAK. Member of the KU Leuven GenAI committee.
Led the AurA project on controlling LLMs to reduce toxicity, published at ICML 2024. Worked with state-of-the-art models including Falcon 7B/40B, OPT, MPT, and Mistral. Collaborated with researchers in Barcelona, Paris, and Cambridge.
PhD on fairer foundation models at the DTAI lab, supervised by Prof. Luc De Raedt and Prof. Bettina Berendt. Developed RobBERT, a Dutch language model ranking in the global top 80 on Hugging Face. Experience with major language models (Falcon, OPT, MPT) and research visits to MilaNLP and the Weizenbaum Institute.
Conducted research on fault analysis in a distributed stream processing system.
Thesis on bias mitigation in large language models, supervised by Prof. Luc De Raedt and Prof. Bettina Berendt.
Focus on machine learning and computer vision. Thesis on obstacle detection with Prof. Johan Suykens and CNH Industrial.
Focus on distributed systems. Thesis on failure detection in stream processing at Nokia Bell Labs.