Abstract: Fingerspelling is a component of sign languages in which words are spelled out letter by letter using specific hand poses. Automatic fingerspelling recognition plays a crucial role in bridging the communication gap between Deaf and hearing communities, yet it remains challenging due to signing-hand ambiguity, the lack of appropriate training losses, and the out-of-vocabulary (OOV) problem. Prior fingerspelling recognition methods rely on explicit signing-hand detection, which often leads to recognition failures, and on a connectionist temporal classification (CTC) loss, which exhibits the peaky behavior problem. To address these issues, we develop OpenFS, an open-source approach for fingerspelling recognition and synthesis. We propose a multi-hand-capable fingerspelling recognizer that supports both single- and multi-hand inputs and performs implicit signing-hand detection by incorporating a dual-level positional encoding and a signing-hand focus (SF) loss. The SF loss encourages cross-attention to focus on the signing hand, enabling implicit signing-hand detection during recognition. Furthermore, instead of relying on the CTC loss, we introduce a monotonic alignment (MA) loss that constrains the output letter sequence to follow the temporal order of the input pose sequence through cross-attention regularization. In addition, we propose a frame-wise letter-conditioned generator that synthesizes realistic fingerspelling pose sequences for OOV words. This generator enables the construction of a new synthetic benchmark, called FSNeo. Through comprehensive experiments, we demonstrate that our approach achieves state-of-the-art recognition performance and validate the effectiveness of the proposed recognizer and generator. Code and data are available at: https://github.com/JunukCha/OpenFS.
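The idea of regularizing cross-attention toward a monotonic alignment can be illustrated with a small sketch. The abstract does not give the definition of the MA loss, so the penalty below is only an assumed stand-in: it computes the expected frame index attended to by each output letter and penalizes any letter whose expected position precedes that of the previous letter. The names `expected_positions`, `monotonic_penalty`, and `margin` are hypothetical and do not come from the OpenFS code.

```python
def expected_positions(attn):
    """Expected frame index per output letter.

    attn[i][t] is the cross-attention weight of letter i on frame t;
    each row is assumed to sum to 1.
    """
    return [sum(t * w for t, w in enumerate(row)) for row in attn]


def monotonic_penalty(attn, margin=0.0):
    """Hinge penalty on letters whose expected position goes backwards.

    This is an illustrative regularizer (an assumption), not the
    MA loss defined in the paper.
    """
    centers = expected_positions(attn)
    return sum(
        max(0.0, centers[i] - centers[i + 1] + margin)
        for i in range(len(centers) - 1)
    )


# Three letters attending to four frames in temporal order.
monotonic_attn = [
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.0, 0.0, 0.1, 0.9],
]
# The same rows in reversed letter order violate monotonicity.
shuffled_attn = [monotonic_attn[2], monotonic_attn[1], monotonic_attn[0]]

print(monotonic_penalty(monotonic_attn))
print(monotonic_penalty(shuffled_attn))
```

In this toy case the ordered attention map incurs no penalty, while the reversed one is penalized in proportion to how far each letter's expected position falls behind its predecessor's.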




Abstract: We address the correspondence search problem among multiple graphs with complex properties while considering matching consistency. We describe each pair of graphs by combining multiple attributes, then jointly match them in a unified framework. The main contribution of this paper is twofold. First, we formulate the global correspondence search problem of multi-attributed graphs by utilizing a set of multi-layer structures. The proposed formulation describes each pair of graphs as a multi-layer structure and jointly considers all matching pairs. Second, we propose a robust multiple graph matching method based on the multi-layer random walks framework. The proposed framework synchronizes the movements of random walkers and leads them to consistent matching candidates. In our extensive experiments, the proposed method exhibits robust and accurate performance compared with state-of-the-art multiple graph matching algorithms.
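As a rough illustration of random walks over a multi-layer structure, the sketch below treats each candidate correspondence between two small graphs as a node, keeps one affinity matrix per attribute layer, and lets walkers step within each layer before averaging their distributions. The averaging step is an assumed stand-in for the synchronization described above, the sketch covers only a single pair of graphs (no cross-graph consistency), and all names and affinity values are hypothetical.

```python
def step(affinity, dist):
    """One random-walk step on a single layer, renormalized to sum to 1."""
    n = len(dist)
    moved = [sum(affinity[r][c] * dist[c] for c in range(n)) for r in range(n)]
    total = sum(moved)
    return [v / total for v in moved]


def multilayer_walk(layers, iters=50):
    """Walk in every layer, then average the layers (assumed synchronization)."""
    n = len(layers[0])
    dist = [1.0 / n] * n
    for _ in range(iters):
        per_layer = [step(affinity, dist) for affinity in layers]
        dist = [sum(p[k] for p in per_layer) / len(per_layer) for k in range(n)]
    return dist


def greedy_assignment(candidates, scores):
    """Pick high-scoring candidates while enforcing one-to-one matching."""
    used_src, used_dst, chosen = set(), set(), []
    for k in sorted(range(len(scores)), key=lambda k: -scores[k]):
        src, dst = candidates[k]
        if src not in used_src and dst not in used_dst:
            chosen.append((src, dst))
            used_src.add(src)
            used_dst.add(dst)
    return sorted(chosen)


# Candidate correspondences (node in graph 1, node in graph 2).
candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]

# One symmetric affinity matrix per attribute layer (toy values).
layer_a = [
    [0.5, 0.0, 0.0, 1.0],
    [0.0, 0.1, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.5],
]
layer_b = [
    [0.4, 0.0, 0.0, 0.8],
    [0.0, 0.2, 0.3, 0.0],
    [0.0, 0.3, 0.2, 0.0],
    [0.8, 0.0, 0.0, 0.4],
]

scores = multilayer_walk([layer_a, layer_b])
matching = greedy_assignment(candidates, scores)
print(matching)
```

Both toy layers favor the pairing of node 0 with node 0 and node 1 with node 1, so the walkers concentrate on those two candidates and the greedy step selects them.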




Abstract: Multi-attributed graph matching is the problem of finding correspondences between two sets of data while considering their complex properties described by multiple attributes. However, the information in multiple attributes is likely to be oversimplified when the attributes are merged into a single integrated attribute, which degrades matching accuracy. For that reason, a multi-layer graph structure-based algorithm has been proposed recently. It effectively avoids this problem by separating attributes into multiple layers. Nonetheless, several issues remain, such as a scalability problem caused by the huge matrix needed to describe the multi-layer structure and a back-projection problem caused by the continuous relaxation of the quadratic assignment problem. In this work, we propose a novel multi-attributed graph matching algorithm based on multi-layer graph factorization. We reformulate the problem so that it can be solved with several small matrices obtained by factorizing the multi-layer structure. Then, we solve the problem using a convex-concave relaxation procedure for the multi-layer structure. The proposed algorithm exhibits better performance than state-of-the-art algorithms based on the single-layer structure.