This repository provides an app that transcribes and translates debates in
which speakers take turns. Video or audio files of such debates, in mp4 or wav
format, can be uploaded via a dashboard for analysis.
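As a minimal sketch of the upload constraint above, a backend could check the file extension before accepting an upload. The names `is_supported_media` and `SUPPORTED_EXTENSIONS` are hypothetical and not taken from this repository; a real check would probably also inspect the file contents.

```python
from pathlib import Path

# Assumption: only the two formats named above are accepted.
SUPPORTED_EXTENSIONS = {".mp4", ".wav"}


def is_supported_media(filename: str) -> bool:
    """Return True if the file name ends in a supported extension (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_media("debate.mp4")` is accepted, while a file name without an extension is rejected.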
- The analysis is performed with the Hugging Face component odtp-pyannote-whisper, which was developed in the context of this project and can be accessed directly via Hugging Face.
- The results of that analysis are loaded into an S3-compatible object store (Garage).
- From there, the results are indexed into the Solr search engine. A MongoDB database is used to manage the media processing results and status.
- A dashboard makes all processing and results available through a common interface. It consists of a frontend, a backend, and a Redis queue for decoupled processing of the long-running media analysis jobs on Hugging Face.
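The flow described above (submit a job, analyze it, store the result, index it, track its status) can be illustrated with a small in-memory sketch. This is not the repository's implementation: the `Pipeline` class, the `Status` values, and the `analyze` placeholder are all hypothetical, and plain Python objects stand in for the Redis queue, the S3 object store, the Solr index, and the MongoDB status records.

```python
import queue
import threading
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    # Hypothetical job states, one per step of the flow.
    QUEUED = "queued"
    ANALYZED = "analyzed"
    STORED = "stored"
    INDEXED = "indexed"


@dataclass
class Pipeline:
    # In-memory stand-ins (assumptions) for the Redis queue, the S3 store,
    # the Solr index, and the MongoDB status collection.
    jobs: queue.Queue = field(default_factory=queue.Queue)
    object_store: dict = field(default_factory=dict)
    search_index: dict = field(default_factory=dict)
    status: dict = field(default_factory=dict)

    def submit(self, media_id: str) -> None:
        """Called by the backend: record the job and return immediately."""
        self.status[media_id] = Status.QUEUED
        self.jobs.put(media_id)

    def analyze(self, media_id: str) -> str:
        """Placeholder for the remote transcription and diarization call."""
        return f"transcript of {media_id}"

    def work(self) -> None:
        """Worker loop: process queued jobs until a None sentinel arrives."""
        while (media_id := self.jobs.get()) is not None:
            transcript = self.analyze(media_id)
            self.status[media_id] = Status.ANALYZED
            self.object_store[media_id] = transcript
            self.status[media_id] = Status.STORED
            self.search_index[media_id] = transcript
            self.status[media_id] = Status.INDEXED


def run_demo() -> Pipeline:
    pipeline = Pipeline()
    worker = threading.Thread(target=pipeline.work)
    worker.start()
    pipeline.submit("debate-001")  # returns at once; the worker does the rest
    pipeline.jobs.put(None)  # sentinel: stop the worker
    worker.join()
    return pipeline
```

The point of the queue is that `submit` returns without waiting for the analysis, so a frontend can poll the status record while a separate worker handles the long-running job.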
Installation and the available installation options are described in the documentation.
Usage is described in the documentation.
See the documentation.
This work was originally funded by SNSF Spark Grant number 221139, “Debating Human Rights” (SNSF Data Portal). Documentation: Political Debates.
The goal of that project was to create specialized components for the analysis of videos from United Nations Human Rights Council (UNHRC) debates.
- Sophisticated Transcription: Integrating and optimizing cutting-edge transcription models (e.g., Whisper 3.0) to ensure accurate, multilingual transcription of UNHRC debates.
- Multimodal Data Handling: Developing components tailored to video/audio processing, scene extraction, and diarization.
- Specialized Database Integration: Designing and deploying a database structure to effectively store debate transcripts, relevant metadata, and extracted features.
This repository was created as a wrap-up of that project, to make the processing and its results available in a more general form.
Copyright © 2025-2028 Swiss Data Science Center (SDSC), www.datascience.ch. All rights reserved. The SDSC is jointly established and legally represented by the École Polytechnique Fédérale de Lausanne (EPFL) and the Eidgenössische Technische Hochschule Zürich (ETH Zürich). This copyright encompasses all materials, software, documentation, and other content created and developed by the SDSC.