Dataset distillation (DD) has emerged as a widely adopted technique for crafting a synthetic dataset that captures the essential information of a training dataset, facilitating the training of accurate neural models. Its applications span various domains, including transfer learning, federated learning, and neural architecture search. The most popular methods for constructing the synthetic data rely on matching the convergence properties of training the model with the synthetic dataset and the training dataset. However, targeting the training dataset must be thought of as auxiliary in the same sense that the training set is an approximate substitute for the population distribution, and the latter is the data of interest. Yet despite its popularity, an aspect that remains unexplored is the relationship of DD to its generalization, particularly across uncommon subgroups. That is, how can we ensure that a model trained on the synthetic dataset performs well when faced with samples from regions with low population density? Here, the representativeness and coverage of the dataset become salient over the guaranteed training error at inference. Drawing inspiration from distributionally robust optimization, we introduce an algorithm that combines clustering with the minimization of a risk measure on the loss to conduct DD. We provide a theoretical rationale for our approach and demonstrate its effective generalization and robustness across subgroups through numerical experiments.
Existing learned video compression models employ flow net or deformable convolutional networks (DCN) to estimate motion information. However, the limited receptive fields of flow net and DCN inherently direct their attentiveness towards the local contexts. Global contexts, such as large-scale motions and global correlations among frames are ignored, presenting a significant bottleneck for capturing accurate motions. To address this issue, we propose a joint local and global motion compensation module (LGMC) for leaned video coding. More specifically, we adopt flow net for local motion compensation. To capture global context, we employ the cross attention in feature domain for motion compensation. In addition, to avoid the quadratic complexity of vanilla cross attention, we divide the softmax operations in attention into two independent softmax operations, leading to linear complexity. To validate the effectiveness of our proposed LGMC, we integrate it with DCVC-TCM and obtain learned video compression with joint local and global motion compensation (LVC-LGMC). Extensive experiments demonstrate that our LVC-LGMC has significant rate-distortion performance improvements over baseline DCVC-TCM.
* ICASSP (International Conference on Acoustics, Speech, and Signal
Processing) 2024 * Fix typos and Fig.1 and Fig.2. Accepted at ICASSP 2024. The first
attempt to use cross attention for bits-free motion estimation and motion
This paper presents a learned video compression method in response to video compression track of the 6th Challenge on Learned Image Compression (CLIC), at DCC 2024.Specifically, we propose a unified contextual video compression framework (UCVC) for joint P-frame and B-frame coding. Each non-intra frame refers to two neighboring decoded frames, which can be either both from the past for P-frame compression, or one from the past and one from the future for B-frame compression. In training stage, the model parameters are jointly optimized with both P-frames and B-frames. Benefiting from the designs, the framework can support both P-frame and B-frame coding and achieve comparable compression efficiency with that specifically designed for P-frame or B-frame.As for challenge submission, we report the optimal compression efficiency by selecting appropriate frame types for each test sequence. Our team name is PKUSZ-LVC.
With the development and popularity of sensors installed in manufacturing systems, complex data are collected during manufacturing processes, which brings challenges for traditional process control methods. This paper proposes a novel process control and monitoring method for the complex structure of high-dimensional image-based overlay errors (modeled in tensor form), which are collected in semiconductor manufacturing processes. The proposed method aims to reduce overlay errors using limited control recipes. We first build a high-dimensional process model and propose different tensor-on-vector regression algorithms to estimate parameters in the model to alleviate the curse of dimensionality. Then, based on the estimate of tensor parameters, the exponentially weighted moving average (EWMA) controller for tensor data is designed whose stability is theoretically guaranteed. Considering the fact that low-dimensional control recipes cannot compensate for all high-dimensional disturbances on the image, control residuals are monitored to prevent significant drifts of uncontrollable high-dimensional disturbances. Through extensive simulations and real case studies, the performances of parameter estimation algorithms and the EWMA controller in tensor space are evaluated. Compared with existing image-based feedback controllers, the superiority of our method is verified especially when disturbances are not stable.
Social relations are leveraged to tackle the sparsity issue of user-item interaction data in recommendation under the assumption of social homophily. However, social recommendation paradigms predominantly focus on homophily based on user preferences. While social information can enhance recommendations, its alignment with user preferences is not guaranteed, thereby posing the risk of introducing informational redundancy. We empirically discover that social graphs in real recommendation data exhibit low preference-aware homophily, which limits the effect of social recommendation models. To comprehensively extract preference-aware homophily information latent in the social graph, we propose Social Heterophily-alleviating Rewiring (SHaRe), a data-centric framework for enhancing existing graph-based social recommendation models. We adopt Graph Rewiring technique to capture and add highly homophilic social relations, and cut low homophilic (or heterophilic) relations. To better refine the user representations from reliable social relations, we integrate a contrastive learning method into the training of SHaRe, aiming to calibrate the user representations for enhancing the result of Graph Rewiring. Experiments on real-world datasets show that the proposed framework not only exhibits enhanced performances across varying homophily ratios but also improves the performance of existing state-of-the-art (SOTA) social recommendation models.
* This paper has been accepted by The Web Conference (WWW) 2024
The burgeoning volume of graph data poses significant challenges in storage, transmission, and particularly the training of graph neural networks (GNNs). To address these challenges, graph condensation (GC) has emerged as an innovative solution. GC focuses on synthesizing a compact yet highly representative graph, on which GNNs can achieve performance comparable to trained on the large original graph. The notable efficacy of GC and its broad prospects have garnered significant attention and spurred extensive research. This survey paper provides an up-to-date and systematic overview of GC, organizing existing research into four categories aligned with critical GC evaluation criteria: effectiveness, generalization, fairness, and efficiency. To facilitate an in-depth and comprehensive understanding of GC, we examine various methods under each category and thoroughly discuss two essential components within GC: optimization strategies and condensed graph generation. Additionally, we introduce the applications of GC in a variety of fields, and highlight the present challenges and novel insights in GC, promoting advancements in future research.
Cell-free massive multi-input multi-output (MIMO) has recently gained much attention for its potential in shaping the landscape of sixth-generation (6G) wireless systems. This paper proposes a hierarchical network architecture tailored for cell-free massive MIMO, seamlessly integrating co-located and distributed antennas. A central base station (CBS), equipped with an antenna array, positions itself near the center of the coverage area, complemented by distributed access points spanning the periphery. The proposed architecture remarkably outperforms conventional cell-free networks, demonstrating superior sum throughput while maintaining a comparable worst-case per-user spectral efficiency. Meanwhile, the implementation cost associated with the fronthaul network is substantially diminished.
* 2024 IEEE International Conference on Communications (ICC-2024)
This one page paper describes our method for the track of image compression. To achieve better perceptual quality, we use the adversarial loss to generate realistic textures, use region of interest (ROI) mask to guide the bit allocation for different regions. Our Team name is TLIC.
Recently, intelligent reflecting surface (IRS)-aided millimeter-wave (mmWave) and terahertz (THz) communications are considered in the wireless community. This paper aims to design a beam-based multiple-access strategy for this new paradigm. Its key idea is to make use of multiple sub-arrays over a hybrid digital-analog array to form independent beams, each of which is steered towards the desired direction to mitigate inter-user interference and suppress unwanted signal reflection. The proposed scheme combines the advantages of both orthogonal multiple access (i.e., no inter-user interference) and non-orthogonal multiple access (i.e., full time-frequency resource use). Consequently, it can substantially boost the system capacity, as verified by Monte-Carlo simulations.
* 2024 IEEE Wireless Communications and Networking Conference (WCNC)