PI Alan Akbik and his HU team develop search engine for media quotes

What were the most relevant comments about the German elections? What did Donald Trump say about immigration? What do Berlin politicians think about the latest BVG strike? Social scientists and journalists ask questions like these every day. To help answer them, our PI Alan Akbik and his team at the Institute of Computer Science at Humboldt-Universität zu Berlin developed the Zitatsuchmaschine, an AI-driven search engine that enables users to search for quotes across German media.

Alan, who holds the Chair of Machine Learning at Humboldt-Universität, came up with the idea during his studies. “Back then, I often searched for quotes online and realized there was a need for an efficient search engine focused on quotes,” he explained. “At HU, we work on language models and on how computers can process human language (natural language processing, or NLP). We used this expertise to develop the project. Over the last four years, we’ve built a vast database of two million quotes from around 50 different journalistic sources. More than 10,000 additional quotes are added daily.”

The Zitatsuchmaschine goes beyond what conventional search engines offer. Unlike typical databases, which are manually indexed, Alan’s tool uses fully automated processes powered by AI models developed by his team. “The search engine scans German-language online news every day, making the results more comprehensive and up to date,” he said.

This project also addresses a key challenge at the university. “As an academic institution, we want to develop our own language models to reduce our dependency on models developed by companies like OpenAI. That’s why we’re working on NLP models that require minimal data and fewer resources to train. The quote search engine is an example of this research.” Currently, the search engine is available only in German, but the team is working on an English version.

