Dynamic Graph Transformers for Temporal Human Activity Recognition


Cedric Halbrunn

Abstract

Human activity recognition (HAR) from sequential sensor or video data is a fundamental problem in machine perception, with applications in surveillance, robotics, healthcare monitoring, and smart environments. Traditional models rely on static graph structures or recurrent architectures that struggle to capture dynamic spatial-temporal dependencies. In this paper, we propose a novel architecture, the Dynamic Graph Transformer (DGT), that integrates graph construction and temporal attention within a unified transformer framework. Unlike prior works that use pre-defined or fixed adjacency matrices, our model learns time-varying interaction graphs among human joints or entities through self-attention, enabling adaptive modeling of pose, motion, and contextual correlations. We introduce a dynamic graph encoder that computes attention-weighted edge strengths at each frame and a temporal transformer that aggregates node-level information across time. The model is fully end-to-end trainable and requires no manual graph design. Evaluations on three benchmark datasets, NTU RGB+D 60, Kinetics Skeleton, and SHREC, demonstrate that our approach significantly outperforms conventional graph convolutional networks and RNN-based models.
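
The abstract describes a two-stage design: per-frame self-attention among joints that yields attention-weighted edge strengths (a learned, time-varying graph), followed by a temporal transformer over frame-level representations. The following PyTorch sketch illustrates that structure under stated assumptions; module names, feature dimensions, and the joint-pooling strategy are illustrative choices, not the authors' released implementation.

```python
# Minimal sketch of the Dynamic Graph Transformer idea described in the abstract.
# All dimensions and the pooling strategy are assumptions for illustration.
import torch
import torch.nn as nn


class DynamicGraphEncoder(nn.Module):
    """Per-frame self-attention over joints; the attention weights act as a
    learned, time-varying adjacency (assumed formulation)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):  # x: (batch * frames, joints, dim)
        # Attention-weighted message passing among joints within one frame.
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)


class DynamicGraphTransformer(nn.Module):
    """Spatial (per-frame) dynamic graph encoder followed by a temporal
    transformer that aggregates node-level information across time."""

    def __init__(self, in_dim: int = 3, dim: int = 64, num_classes: int = 60,
                 heads: int = 4, temporal_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(in_dim, dim)        # joint coordinates -> features
        self.spatial = DynamicGraphEncoder(dim, heads)
        layer = nn.TransformerEncoderLayer(dim, heads, dim_feedforward=4 * dim,
                                           batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, temporal_layers)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):  # x: (batch, frames, joints, in_dim), e.g. 3D joint coords
        b, t, j, _ = x.shape
        h = self.embed(x)                          # (b, t, j, dim)
        h = self.spatial(h.reshape(b * t, j, -1))  # dynamic graph per frame
        h = h.reshape(b, t, j, -1).mean(dim=2)     # pool joints -> frame tokens
        h = self.temporal(h)                       # attention across time
        return self.head(h.mean(dim=1))            # clip-level class scores


if __name__ == "__main__":
    model = DynamicGraphTransformer()
    clip = torch.randn(2, 30, 25, 3)  # 2 clips, 30 frames, 25 joints (NTU layout), xyz
    print(model(clip).shape)          # torch.Size([2, 60])
```

Mean-pooling over joints is the simplest way to turn per-joint features into frame tokens; other aggregation schemes (learned pooling, a class token per frame) would fit the same two-stage outline.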

Article Details

How to Cite
Halbrunn, C. (2025). Dynamic Graph Transformers for Temporal Human Activity Recognition. Journal of Computer Science and Software Applications, 5(5). https://doi.org/10.5281/zenodo.15381917