Fei Jiang

Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

May 10, 2023
Di Jin, Luzhi Wang, Yizhen Zheng, Guojie Song, Fei Jiang, Xiang Li, Wei Lin, Shirui Pan

Figures 1-4 for Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

Recommender systems are essential to various fields, e.g., e-commerce, e-learning, and streaming media. At present, graph neural networks (GNNs) for session-based recommendation can normally only recommend items that already exist in users' historical sessions. As a result, these GNNs have difficulty recommending items that users have never interacted with (new items), which leads to an information-cocoon phenomenon. It is therefore necessary to recommend new items to users. Because there is no interaction between new items and users, new items cannot be included when building session graphs for GNN session-based recommender systems, making it challenging to recommend new items with GNN-based methods. We term this challenge '\textbf{G}NN \textbf{S}ession-based \textbf{N}ew \textbf{I}tem \textbf{R}ecommendation (GSNIR)' and propose a dual-intent enhanced graph neural network to solve it. Since new items are not tied to historical sessions, users' intent toward them is difficult to predict. We design a dual-intent network that learns user intent from an attention mechanism and from the distribution of historical data respectively, which simulates users' decision-making process when interacting with a new item. To address the fact that new items cannot be learned by GNNs, and inspired by zero-shot learning (ZSL), we infer new item representations in the GNN space from their attributes. The model outputs probabilities for new items, which serve as recommendation scores, and the new items with the highest scores are recommended to users. Experiments on two representative real-world datasets show the superiority of our proposed method, and a real-world case study verifies the interpretability benefits brought by the dual-intent module and the new-item reasoning module. The code is available on GitHub: https://github.com/Ee1s/NirGNN
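The abstract's two core ideas, attention-derived user intent and attribute-based inference of new-item embeddings, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's actual model: the linear attribute map `W`, the `tanh` nonlinearity, and all dimensions are illustrative.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_intent(session_items, query):
    """Attention-style intent: weight the session's item embeddings by
    their relevance to a query (e.g. the last-clicked item) and sum."""
    weights = softmax(session_items @ query)   # (n,) attention distribution
    return weights @ session_items             # (d,) intent vector

def infer_new_item_embedding(attributes, W):
    """Zero-shot-style inference: map a new item's attribute vector into
    the item-embedding space through a learned transform W."""
    return np.tanh(attributes @ W)

rng = np.random.default_rng(0)
session = rng.normal(size=(5, 8))              # 5 historical items, dim 8
intent = attention_intent(session, session[-1])
W = 0.1 * rng.normal(size=(4, 8))              # attribute dim 4 -> embed dim 8
new_item = infer_new_item_embedding(rng.normal(size=4), W)
score = float(intent @ new_item)               # higher score => recommend
```

Ranking candidate new items by `score` mirrors the final step described above, where the highest-scoring new items are recommended.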

* 10 Pages, 6 figures, WWW'2023 

BPJDet: Extended Object Representation for Generic Body-Part Joint Detection

Apr 21, 2023
Huayi Zhou, Fei Jiang, Jiaxin Si, Yue Ding, Hongtao Lu

Figures 1-4 for BPJDet: Extended Object Representation for Generic Body-Part Joint Detection

Detection of the human body and its parts (e.g., head or hands) has been intensively studied. However, most of these CNN-based detectors are trained independently, making it difficult to associate detected parts with bodies. In this paper, we focus on the joint detection of the human body and its corresponding parts. Specifically, we propose a novel extended object representation that integrates the center offsets of body parts, and construct a dense one-stage generic Body-Part Joint Detector (BPJDet). In this way, body-part associations are neatly embedded in a unified object representation containing both semantic and geometric content. We can therefore perform multi-loss optimization to tackle these multiple tasks synergistically. BPJDet does not suffer from error-prone post-matching and keeps a better trade-off between speed and accuracy. Furthermore, BPJDet can be generalized to detect any one or more body parts. To verify the superiority of BPJDet, we conduct experiments on three body-part datasets (CityPersons, CrowdHuman and BodyHands) and one body-parts dataset, COCOHumanParts. While keeping high detection accuracy, BPJDet achieves state-of-the-art association performance on all datasets compared with its counterparts. Besides, we show the benefits of advanced body-part association by improving the performance of two representative downstream applications: accurate crowd head detection and hand contact estimation. Code is released at https://github.com/hnuzhy/BPJDet.
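The extended object representation described above can be sketched as the body box followed by each part center's offset, normalized by the body box size. This is a toy encode/decode pair under assumed field layout, not BPJDet's actual output head.

```python
import numpy as np

def encode(body_box, part_centers):
    """Extended representation: body box (cx, cy, w, h) followed by each
    part center's offset, normalized by the body box width/height."""
    cx, cy, w, h = body_box
    offsets = [((px - cx) / w, (py - cy) / h) for px, py in part_centers]
    return np.array([cx, cy, w, h] + [v for off in offsets for v in off])

def decode_parts(vec):
    """Recover absolute part centers from the extended representation."""
    cx, cy, w, h = vec[:4]
    return [(cx + ox * w, cy + oy * h) for ox, oy in vec[4:].reshape(-1, 2)]

body = (100.0, 120.0, 50.0, 150.0)        # one body box
parts = [(100.0, 60.0), (80.0, 140.0)]    # e.g. a head and one hand
vec = encode(body, parts)
```

Because the offsets live inside the same per-object vector as the box, association comes for free at decode time, which is why no post-matching step is needed.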

* 15 pages. arXiv admin note: text overlap with arXiv:2212.07652 

DirectMHP: Direct 2D Multi-Person Head Pose Estimation with Full-range Angles

Feb 14, 2023
Huayi Zhou, Fei Jiang, Hongtao Lu

Figures 1-4 for DirectMHP: Direct 2D Multi-Person Head Pose Estimation with Full-range Angles

Existing head pose estimation (HPE) methods mainly focus on a single person with a pre-detected frontal head, which limits their application in real, complex scenarios with multiple persons. We argue that these single-person HPE methods are fragile and inefficient for Multi-Person Head Pose Estimation (MPHPE), since they rely on a separately trained face detector that cannot generalize well to full viewpoints, especially for heads with invisible face areas. In this paper, we focus on the full-range MPHPE problem and propose a simple, direct, end-to-end baseline named DirectMHP. Due to the lack of datasets applicable to full-range MPHPE, we first construct two benchmarks by extracting ground-truth labels for head detection and head orientation from the public datasets AGORA and CMU Panoptic. They are rather challenging, containing many truncated, occluded, tiny and unevenly illuminated human heads. We then design a novel end-to-end trainable one-stage network architecture that jointly regresses the locations and orientations of multiple heads to address the MPHPE problem. Specifically, we regard pose as an auxiliary attribute of the head and append it after the traditional object prediction. This flexible design accepts arbitrary pose representations, such as Euler angles. We then jointly optimize the two tasks by sharing features and utilizing appropriate multiple losses. In this way, our method can implicitly benefit from more of the surroundings to improve HPE accuracy while maintaining head detection performance. We present comprehensive comparisons with state-of-the-art single-person HPE methods on public benchmarks, as well as superior baseline results on our constructed MPHPE datasets. Datasets and code are released at https://github.com/hnuzhy/DirectMHP.
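The "pose as an auxiliary attribute" design can be sketched as appending normalized Euler angles to the standard per-anchor prediction vector. The channel layout and the full-range normalization below (yaw in [-180, 180], pitch and roll in [-90, 90]) are illustrative assumptions, not the exact DirectMHP head.

```python
import numpy as np

def append_pose(obj_pred, euler_deg):
    """Append pitch/yaw/roll, each normalized to [0, 1], after the usual
    per-anchor vector (box + objectness + class scores)."""
    pitch, yaw, roll = euler_deg
    pose = [(pitch + 90.0) / 180.0,
            (yaw + 180.0) / 360.0,
            (roll + 90.0) / 180.0]
    return np.concatenate([np.asarray(obj_pred, dtype=float), pose])

def split_pose(vec):
    """Invert append_pose: separate the object prediction and the angles."""
    obj_pred, p = vec[:-3], vec[-3:]
    return obj_pred, (p[0] * 180.0 - 90.0,
                      p[1] * 360.0 - 180.0,
                      p[2] * 180.0 - 90.0)

pred = [0.5, 0.5, 0.2, 0.3, 0.9, 0.8]          # box, objectness, class score
vec = append_pose(pred, (10.0, -135.0, 5.0))   # full-range yaw is allowed
```

Since the pose channels ride along with each detection, location and orientation are predicted jointly per head, which is the core of the one-stage design described above.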

* 13 pages 

A Simple Baseline for Direct 2D Multi-Person Head Pose Estimation with Full-range Angles

Feb 02, 2023
Huayi Zhou, Fei Jiang, Hongtao Lu

Figures 1-4 for A Simple Baseline for Direct 2D Multi-Person Head Pose Estimation with Full-range Angles

Existing head pose estimation (HPE) methods mainly focus on a single person with a pre-detected frontal head, which limits their application in real, complex scenarios with multiple persons. We argue that these single-person HPE methods are fragile and inefficient for Multi-Person Head Pose Estimation (MPHPE), since they rely on a separately trained face detector that cannot generalize well to full viewpoints, especially for heads with invisible face areas. In this paper, we focus on the full-range MPHPE problem and propose a simple, direct, end-to-end baseline named DirectMHP. Due to the lack of datasets applicable to full-range MPHPE, we first construct two benchmarks by extracting ground-truth labels for head detection and head orientation from the public datasets AGORA and CMU Panoptic. They are rather challenging, containing many truncated, occluded, tiny and unevenly illuminated human heads. We then design a novel end-to-end trainable one-stage network architecture that jointly regresses the locations and orientations of multiple heads to address the MPHPE problem. Specifically, we regard pose as an auxiliary attribute of the head and append it after the traditional object prediction. This flexible design accepts arbitrary pose representations, such as Euler angles. We then jointly optimize the two tasks by sharing features and utilizing appropriate multiple losses. In this way, our method can implicitly benefit from more of the surroundings to improve HPE accuracy while maintaining head detection performance. We present comprehensive comparisons with state-of-the-art single-person HPE methods on public benchmarks, as well as superior baseline results on our constructed MPHPE datasets. Datasets and code are released at https://github.com/hnuzhy/DirectMHP.


On High dimensional Poisson models with measurement error: hypothesis testing for nonlinear nonconvex optimization

Dec 31, 2022
Fei Jiang, Yeqing Zhou, Jianxuan Liu, Yanyuan Ma

Figures 1-4 for On High dimensional Poisson models with measurement error: hypothesis testing for nonlinear nonconvex optimization

We study estimation and testing in the Poisson regression model with noisy high-dimensional covariates, which has wide applications in analyzing noisy big data. Correcting for the estimation bias due to the covariate noise leads to a non-convex target function to minimize. To handle the high dimensionality, we further augment the target function with an amenable penalty term, and propose to estimate the regression parameter by minimizing the penalized target function. We derive the L1 and L2 convergence rates of the estimator and prove variable selection consistency. We further establish the asymptotic normality of any subset of the parameters, where the subset can have infinitely many components as long as its cardinality grows sufficiently slowly. We develop Wald and score tests based on the asymptotic normality of the estimator, which permit testing of linear functions of the members of the subset. We examine the finite-sample performance of the proposed tests through extensive simulation. Finally, the proposed method is successfully applied to the Alzheimer's Disease Neuroimaging Initiative study, which motivated this work initially.
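In outline, the estimator minimizes a bias-corrected Poisson loss plus a penalty. The sketch below uses the standard corrected-score construction for additive Gaussian measurement error, $W_i = X_i + U_i$ with $U_i \sim N(0, \Sigma)$; the paper's exact target function and penalty may differ. Since $E\{e^{W_i^{\mathrm T}\beta} \mid X_i\} = e^{X_i^{\mathrm T}\beta + \beta^{\mathrm T}\Sigma\beta/2}$, replacing $e^{X_i^{\mathrm T}\beta}$ in the Poisson negative log-likelihood by $e^{W_i^{\mathrm T}\beta - \beta^{\mathrm T}\Sigma\beta/2}$ removes the bias but destroys convexity:

$$
\hat{\beta} \;=\; \arg\min_{\beta}\;
\frac{1}{n}\sum_{i=1}^{n}\left( e^{W_i^{\mathrm T}\beta - \beta^{\mathrm T}\Sigma\beta/2} \;-\; Y_i\, W_i^{\mathrm T}\beta \right)
\;+\; p_{\lambda}(\beta),
$$

where $p_{\lambda}$ is an amenable (folded-concave) penalty such as SCAD or MCP. The $-\beta^{\mathrm T}\Sigma\beta/2$ correction in the exponent is what makes the target non-convex, which is the difficulty the abstract refers to.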


Body-Part Joint Detection and Association via Extended Object Representation

Dec 15, 2022
Huayi Zhou, Fei Jiang, Hongtao Lu

Figures 1-4 for Body-Part Joint Detection and Association via Extended Object Representation

The detection of the human body and its related parts (e.g., face, head or hands) has been intensively studied and greatly improved since the breakthrough of deep CNNs. However, most of these detectors are trained independently, making it challenging to associate detected body parts with people. This paper focuses on the problem of joint detection of the human body and its corresponding parts. Specifically, we propose a novel extended object representation that integrates the center location offsets of the body or its parts, and construct a dense single-stage anchor-based Body-Part Joint Detector (BPJDet). Body-part associations in BPJDet are embedded into the unified representation, which contains both semantic and geometric information. Therefore, BPJDet does not suffer from error-prone association post-matching and achieves a better accuracy-speed trade-off. Furthermore, BPJDet can be seamlessly generalized to jointly detect any body part. To verify the effectiveness and superiority of our method, we conduct extensive experiments on the CityPersons, CrowdHuman and BodyHands datasets. The proposed BPJDet detector achieves state-of-the-art association performance on these three benchmarks while maintaining high detection accuracy. Code will be released to facilitate further studies.


An Intuitive and Unconstrained 2D Cube Representation for Simultaneous Head Detection and Pose Estimation

Dec 07, 2022
Huayi Zhou, Fei Jiang, Lili Xiong, Hongtao Lu

Figures 1-4 for An Intuitive and Unconstrained 2D Cube Representation for Simultaneous Head Detection and Pose Estimation

Most recent head pose estimation (HPE) methods are dominated by the Euler angle representation. To avoid its inherent ambiguity in rotation labels, alternative quaternion-based and vector-based representations have been introduced. However, neither is visually intuitive, and they are often derived from equivocal Euler angle labels. In this paper, we present a novel single-stage keypoint-based method using an {\it intuitive} and {\it unconstrained} 2D cube representation for joint head detection and pose estimation. The 2D cube is an orthogonal projection of a 3D regular hexahedron label roughly surrounding one head, and it itself contains the head location. It reflects the head orientation straightforwardly and unambiguously at any rotation angle. Unlike general 6-DoF object pose estimation, our 2D cube ignores the 3-DoF of head size but retains the 3-DoF of head pose. Based on the prior of equal side lengths, we can effortlessly obtain a closed-form solution for the Euler angles from the predicted 2D head cube, instead of applying the error-prone PnP algorithm. In experiments, our proposed method achieves results comparable to other representative methods on the public AFLW2000 and BIWI datasets. Besides, a novel test on the CMU Panoptic dataset shows that our method can be seamlessly adapted to the unconstrained full-view HPE task without modification.


StuArt: Individualized Classroom Observation of Students with Automatic Behavior Recognition and Tracking

Nov 06, 2022
Huayi Zhou, Fei Jiang, Jiaxin Si, Lili Xiong, Hongtao Lu

Figures 1-4 for StuArt: Individualized Classroom Observation of Students with Automatic Behavior Recognition and Tracking

Every student matters, but it is hard for instructors to observe all students during a course and immediately help those in need. In this paper, we present StuArt, a novel automatic system designed for individualized classroom observation, which empowers instructors to attend to the learning status of each student. StuArt can recognize five representative student behaviors (hand-raising, standing, sleeping, yawning, and smiling) that are highly related to engagement, and track their variation trends during the course. To protect student privacy, all variation trends are indexed by seat numbers without any personally identifying information. Furthermore, StuArt adopts various user-friendly visualization designs to help instructors quickly understand individual and whole-class learning status. Experimental results on real classroom videos have demonstrated the superiority and robustness of the embedded algorithms. We expect our system to promote the development of large-scale individualized guidance of students.

* Novel pedagogical approaches in signal processing for K-12 education 

SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection

Nov 04, 2022
Huayi Zhou, Fei Jiang, Hongtao Lu

Figures 1-4 for SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection

Domain adaptive object detection (DAOD) aims to alleviate the transfer performance degradation caused by cross-domain discrepancy. However, most existing DAOD methods are dominated by computationally intensive two-stage detectors, which are not the first choice for industrial applications. In this paper, we propose a novel semi-supervised domain adaptive YOLO (SSDA-YOLO) method that improves cross-domain detection performance by integrating the compact one-stage detector YOLOv5 with domain adaptation. Specifically, we adapt the knowledge distillation framework with the Mean Teacher model to assist the student model in obtaining instance-level features of the unlabeled target domain. We also utilize scene style transfer to cross-generate pseudo images in different domains to remedy image-level differences. In addition, an intuitive consistency loss is proposed to further align cross-domain predictions. We evaluate SSDA-YOLO on public benchmarks including PascalVOC, Clipart1k, Cityscapes, and Foggy Cityscapes. Moreover, to verify its generalization, we conduct experiments on yawning detection datasets collected from various classrooms. The results show considerable improvements of our method on these DAOD tasks. Our code is available at \url{https://github.com/hnuzhy/SSDA-YOLO}.
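The Mean Teacher component described above can be sketched as an exponential-moving-average weight update plus a consistency term. The L2 form of the loss and the momentum value here are illustrative assumptions, not the exact SSDA-YOLO losses.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Mean Teacher: teacher weights track an exponential moving average
    of the student weights; no gradients reach the teacher."""
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k]
            for k in teacher}

def consistency_loss(student_pred, teacher_pred):
    """Penalize disagreement between student predictions and the
    teacher's pseudo-predictions on target-domain images (L2 here)."""
    return float(np.mean((student_pred - teacher_pred) ** 2))

teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
teacher = ema_update(teacher, student)   # teacher drifts toward the student
loss = consistency_loss(np.array([0.2, 0.8]), np.array([0.2, 0.8]))
```

The EMA teacher changes slowly, so its pseudo-predictions on unlabeled target images are more stable than the student's, which is the usual motivation for this design.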

* submitted to CVIU 

Joint Multi-Person Body Detection and Orientation Estimation via One Unified Embedding

Oct 27, 2022
Huayi Zhou, Fei Jiang, Jiaxin Si, Hongtao Lu

Figures 1-4 for Joint Multi-Person Body Detection and Orientation Estimation via One Unified Embedding

Human body orientation estimation (HBOE) is widely applied in robotics, surveillance, pedestrian analysis and autonomous driving. Although many approaches have addressed the HBOE problem, from specific under-controlled scenes to challenging in-the-wild environments, they assume human instances are already detected and take a well-cropped sub-image as input. This setting is less efficient and prone to errors in real applications, such as crowds of people. In this paper, we propose a single-stage, end-to-end trainable framework for tackling the HBOE problem with multiple persons. By integrating the prediction of bounding boxes and direction angles in one embedding, our method can jointly estimate the locations and orientations of all bodies in one image directly. Our key idea is to integrate the HBOE task into the multi-scale anchor channel predictions of persons, so that it concurrently benefits from the engaged intermediate features. Therefore, our approach can naturally handle difficult instances involving low resolution and occlusion, as in object detection. We validated the efficiency and effectiveness of our method on the recently presented MEBOW benchmark with extensive experiments. Besides, we completed the ambiguous instances ignored by the MEBOW dataset and provided corresponding weak body-orientation labels to preserve its integrity and consistency for supporting studies toward multiple persons. Our work is available at \url{https://github.com/hnuzhy/JointBDOE}.
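One common way to fold a direction angle into a regression embedding alongside the box targets is the (sin, cos) encoding, which keeps the target continuous across the 0/360-degree wrap-around; whether JointBDOE uses exactly this parameterization is an assumption here, offered only to make the "one unified embedding" idea concrete.

```python
import numpy as np

def encode_orientation(theta_deg):
    """Encode an orientation angle as (sin, cos): a continuous regression
    target even across the 0/360-degree boundary."""
    t = np.deg2rad(theta_deg)
    return np.array([np.sin(t), np.cos(t)])

def decode_orientation(vec):
    """Recover the angle in [0, 360) from a (sin, cos) pair."""
    return float(np.rad2deg(np.arctan2(vec[0], vec[1])) % 360.0)

# One per-anchor target: box regression channels plus orientation channels.
target = np.concatenate([[0.4, 0.6, 0.1, 0.3],      # illustrative box target
                         encode_orientation(350.0)]) # orientation channels
```

A plain angle regression would see 359 and 1 degrees as far apart; the (sin, cos) pair removes that discontinuity, which matters for bodies facing near the wrap-around direction.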
