Accurate cancer diagnosis remains a critical challenge in digital pathology, largely due to the gigapixel size and complex spatial relationships present in whole slide images. Traditional multiple instance learning (MIL) methods often struggle with these intricacies, especially in preserving the necessary context for accurate diagnosis. In response, we introduce a novel framework named Semantics-Aware Attention Guidance (SAG), which includes 1) a technique for converting diagnostically relevant entities into attention signals, and 2) a flexible attention loss that efficiently integrates various semantically significant information, such as tissue anatomy and cancerous regions. Our experiments on two distinct cancer datasets demonstrate consistent improvements in accuracy, precision, and recall with two state-of-the-art baseline models. Qualitative analysis further reveals that the incorporation of heuristic guidance enables the model to focus on regions critical for diagnosis. SAG is not only effective for the models discussed here, but its adaptability extends to any attention-based diagnostic model. This opens up exciting possibilities for further improving the accuracy and efficiency of cancer diagnostics.
Text-to-3D scene generation holds immense potential for the gaming, film, and architecture sectors. Despite significant progress, existing methods struggle with maintaining high quality, consistency, and editing flexibility. In this paper, we propose DreamScene, a 3D Gaussian-based novel text-to-3D scene generation framework, to tackle the aforementioned three challenges mainly via two strategies. First, DreamScene employs Formation Pattern Sampling (FPS), a multi-timestep sampling strategy guided by the formation patterns of 3D objects, to form fast, semantically rich, and high-quality representations. FPS uses 3D Gaussian filtering for optimization stability, and leverages reconstruction techniques to generate plausible textures. Second, DreamScene employs a progressive three-stage camera sampling strategy, specifically designed for both indoor and outdoor settings, to effectively ensure object-environment integration and scene-wide 3D consistency. Last, DreamScene enhances scene editing flexibility by integrating objects and environments, enabling targeted adjustments. Extensive experiments validate DreamScene's superiority over current state-of-the-art techniques, heralding its wide-ranging potential for diverse applications. Code and demos will be released at https://dreamscene-project.github.io .
We introduce a novel task, called Generalized Relation Discovery (GRD), for open-world relation extraction. GRD aims to identify unlabeled instances in existing pre-defined relations or discover novel relations by assigning instances to clusters as well as providing specific meanings for these clusters. The key challenges of GRD are how to mitigate the serious model biases caused by labeled pre-defined relations to learn effective relational representations and how to determine the specific semantics of novel relations during classifying or clustering unlabeled instances. We then propose a novel framework, SFGRD, for this task to solve the above issues by learning from semi-factuals in two stages. The first stage is semi-factual generation implemented by a tri-view debiased relation representation module, in which we take each original sentence as the main view and design two debiased views to generate semi-factual examples for this sentence. The second stage is semi-factual thinking executed by a dual-space tri-view collaborative relation learning module, where we design a cluster-semantic space and a class-index space to learn relational semantics and relation label indices, respectively. In addition, we devise alignment and selection strategies to integrate two spaces and establish a self-supervised learning loop for unlabeled data by doing semi-factual thinking across three views. Extensive experimental results show that SFGRD surpasses state-of-the-art models in terms of accuracy by 2.36\% $\sim$5.78\% and cosine similarity by 32.19\%$\sim$ 84.45\% for relation label index and relation semantic quality, respectively. To the best of our knowledge, we are the first to exploit the efficacy of semi-factuals in relation extraction.
Incorporating symmetry as an inductive bias into multi-agent reinforcement learning (MARL) has led to improvements in generalization, data efficiency, and physical consistency. While prior research has succeeded in using perfect symmetry prior, the realm of partial symmetry in the multi-agent domain remains unexplored. To fill in this gap, we introduce the partially symmetric Markov game, a new subclass of the Markov game. We then theoretically show that the performance error introduced by utilizing symmetry in MARL is bounded, implying that the symmetry prior can still be useful in MARL even in partial symmetry situations. Motivated by this insight, we propose the Partial Symmetry Exploitation (PSE) framework that is able to adaptively incorporate symmetry prior in MARL under different symmetry-breaking conditions. Specifically, by adaptively adjusting the exploitation of symmetry, our framework is able to achieve superior sample efficiency and overall performance of MARL algorithms. Extensive experiments are conducted to demonstrate the superior performance of the proposed framework over baselines. Finally, we implement the proposed framework in real-world multi-robot testbed to show its superiority.
Recently, 3D Gaussian, as an explicit 3D representation method, has demonstrated strong competitiveness over NeRF (Neural Radiance Fields) in terms of expressing complex scenes and training duration. These advantages signal a wide range of applications for 3D Gaussians in 3D understanding and editing. Meanwhile, the segmentation of 3D Gaussians is still in its infancy. The existing segmentation methods are not only cumbersome but also incapable of segmenting multiple objects simultaneously in a short amount of time. In response, this paper introduces a 3D Gaussian segmentation method implemented with 2D segmentation as supervision. This approach uses input 2D segmentation maps to guide the learning of the added 3D Gaussian semantic information, while nearest neighbor clustering and statistical filtering refine the segmentation results. Experiments show that our concise method can achieve comparable performances on mIOU and mAcc for multi-object segmentation as previous single-object segmentation methods.
Graph neural networks (GNNs) have shown advantages in graph-based analysis tasks. However, most existing methods have the homogeneity assumption and show poor performance on heterophilic graphs, where the linked nodes have dissimilar features and different class labels, and the semantically related nodes might be multi-hop away. To address this limitation, this paper presents GraphRARE, a general framework built upon node relative entropy and deep reinforcement learning, to strengthen the expressive capability of GNNs. An innovative node relative entropy, which considers node features and structural similarity, is used to measure mutual information between node pairs. In addition, to avoid the sub-optimal solutions caused by mixing useful information and noises of remote nodes, a deep reinforcement learning-based algorithm is developed to optimize the graph topology. This algorithm selects informative nodes and discards noisy nodes based on the defined node relative entropy. Extensive experiments are conducted on seven real-world datasets. The experimental results demonstrate the superiority of GraphRARE in node classification and its capability to optimize the original graph topology.
Multi-agent reinforcement learning (MARL) has achieved promising results in recent years. However, most existing reinforcement learning methods require a large amount of data for model training. In addition, data-efficient reinforcement learning requires the construction of strong inductive biases, which are ignored in the current MARL approaches. Inspired by the symmetry phenomenon in multi-agent systems, this paper proposes a framework for exploiting prior knowledge by integrating data augmentation and a well-designed consistency loss into the existing MARL methods. In addition, the proposed framework is model-agnostic and can be applied to most of the current MARL algorithms. Experimental tests on multiple challenging tasks demonstrate the effectiveness of the proposed framework. Moreover, the proposed framework is applied to a physical multi-robot testbed to show its superiority.
Modeling and predicting the performance of students in collaborative learning paradigms is an important task. Most of the research presented in literature regarding collaborative learning focuses on the discussion forums and social learning networks. There are only a few works that investigate how students interact with each other in team projects and how such interactions affect their academic performance. In order to bridge this gap, we choose a software engineering course as the study subject. The students who participate in a software engineering course are required to team up and complete a software project together. In this work, we construct an interaction graph based on the activities of students grouped in various teams. Based on this student interaction graph, we present an extended graph transformer framework for collaborative learning (CLGT) for evaluating and predicting the performance of students. Moreover, the proposed CLGT contains an interpretation module that explains the prediction results and visualizes the student interaction patterns. The experimental results confirm that the proposed CLGT outperforms the baseline models in terms of performing predictions based on the real-world datasets. Moreover, the proposed CLGT differentiates the students with poor performance in the collaborative learning paradigm and gives teachers early warnings, so that appropriate assistance can be provided.
In this paper, we consider waveform design for dualfunction radar-communication systems based on multiple-inputmultiple-out arrays. To achieve better Rician target detection performance, we use the relative entropy associated with the formulated detection problem as the design metric. We also impose a multiuser interference energy constraint on the waveforms to ensure the achievable sum-rate of the communications. Two algorithms are presented to tackle the nonlinear non-convex waveform design problem. In the first algorithm, we derive a quadratic function to minorize the objective function. To tackle the quadratically constrained quadratic programming problem at each iteration, a semidefinite relaxation approach followed by a rank-one decomposition procedure and an efficient alternating direction method of multipliers (ADMM) are proposed, respectively. In the second algorithm, we present a novel ADMM algorithm to tackle the optimization problem and employ an efficient minorization-maximization approach in the inner loop of the ADMM algorithm. Numerical results demonstrate the superiority of both algorithms. Moreover, the presented algorithms can be extended to synthesize peak-to-average-power ratio constrained waveforms, which allows the radio frequency amplifier to operate at an increased efficiency.