Compressed sensing (CS) is an efficient method to reconstruct MR image from small sampled data in $k$-space and accelerate the acquisition of MRI. In this work, we propose a novel deep geometric distillation network which combines the merits of model-based and deep learning-based CS-MRI methods, it can be theoretically guaranteed to improve geometric texture details of a linear reconstruction. Firstly, we unfold the model-based CS-MRI optimization problem into two sub-problems that consist of image linear approximation and image geometric compensation. Secondly, geometric compensation sub-problem for distilling lost texture details in approximation stage can be expanded by Taylor expansion to design a geometric distillation module fusing features of different geometric characteristic domains. Additionally, we use a learnable version with adaptive initialization of the step-length parameter, which allows model more flexibility that can lead to convergent smoothly. Numerical experiments verify its superiority over other state-of-the-art CS-MRI reconstruction approaches. The source code will be available at \url{https://github.com/fanxiaohong/Deep-Geometric-Distillation-Network-for-CS-MRI}
Deep neural networks (DNNs) could be very useful in blockchain applications such as DeFi and NFT trading. However, training / running large-scale DNNs as part of a smart contract is infeasible on today's blockchain platforms, due to two fundamental design issues of these platforms. First, blockchains nowadays typically require that each node maintain the complete world state at any time, meaning that the node must execute all transactions in every block. This is prohibitively expensive for computationally intensive smart contracts involving DNNs. Second, existing blockchain platforms expect smart contract transactions to have deterministic, reproducible results and effects. In contrast, DNNs are usually trained / run lock-free on massively parallel computing devices such as GPUs, TPUs and / or computing clusters, which often do not yield deterministic results. This paper proposes novel platform designs, collectively called A New Hope (ANH), that address the above issues. The main ideas are (i) computing-intensive smart contract transactions are only executed by nodes who need their results, or by specialized serviced providers, and (ii) a non-deterministic smart contract transaction leads to uncertain results, which can still be validated, though at a relatively high cost; specifically for DNNs, the validation cost can often be reduced by verifying properties of the results instead of their exact values. In addition, we discuss various implications of ANH, including its effects on token fungibility, sharding, private transactions, and the fundamental meaning of a smart contract.
In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location of a building unit for in-game customization of residential home complex. Given a construction layout, we propose a visual context-aware graph generation network that learns the implicit global relations among the scene components and infers the location of a new building unit. The proposed network takes as input the scene graph and the corresponding top-view depth image. It provides the location recommendations for a newly-added building units by learning an auto-regressive edge distribution conditioned on existing scenes. We also introduce a global graph-image matching loss to enhance the awareness of essential geometry semantics of the site. Qualitative and quantitative experiments demonstrate that the recommended location well reflects the implicit spatial rules of components in the residential estates, and it is instructive and practical to locate the building units in the 3D scene of the complex construction.
Given a graph G where each node is associated with a set of attributes, and a parameter k specifying the number of output clusters, k-attributed graph clustering (k-AGC) groups nodes in G into k disjoint clusters, such that nodes within the same cluster share similar topological and attribute characteristics, while those in different clusters are dissimilar. This problem is challenging on massive graphs, e.g., with millions of nodes and billions of edges. For such graphs, existing solutions either incur prohibitively high costs, or produce clustering results with compromised quality. In this paper, we propose ACMin, an effective approach to k-AGC that yields high-quality clusters with cost linear to the size of the input graph G. The main contributions of ACMin are twofold: (i) a novel formulation of the k-AGC problem based on an attributed multi-hop conductance quality measure custom-made for this problem setting, which effectively captures cluster coherence in terms of both topological proximities and attribute similarities, and (ii) a linear-time optimization solver that obtains high-quality clusters iteratively, based on efficient matrix operations such as orthogonal iterations, an alternative optimization approach, as well as an initialization technique that significantly speeds up the convergence of ACMin in practice. Extensive experiments, comparing 11 competitors on 6 real datasets, demonstrate that ACMin consistently outperforms all competitors in terms of result quality measured against ground-truth labels, while being up to orders of magnitude faster. In particular, on the Microsoft Academic Knowledge Graph dataset with 265.2 million edges and 1.1 billion attribute values, ACMin outputs high-quality results for 5-AGC within 1.68 hours using a single CPU core, while none of the 11 competitors finish within 3 days.
This paper proposes a novel active boundary loss for semantic segmentation. It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced in commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results using current network parameters, we formulate the boundary alignment problem as a differentiable direction vector prediction problem to guide the movement of predicted boundaries in each iteration. Our loss is model-agnostic and can be plugged into the training of segmentation networks to improve the boundary details. Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union on challenging image and video object segmentation datasets.
This paper proposes a novel location-aware deep learning-based single image reflection removal method. Our network has a reflection detection module to regress a probabilistic reflection confidence map, taking multi-scale Laplacian features as inputs. This probabilistic map tells whether a region is reflection-dominated or transmission-dominated. The novelty is that we use the reflection confidence map as the cues for the network to learn how to encode the reflection information adaptively and control the feature flow when predicting reflection and transmission layers. The integration of location information into the network significantly improves the quality of reflection removal results. Besides, a set of learnable Laplacian kernel parameters is introduced to facilitate the extraction of discriminative Laplacian features for reflection detection. We design our network as a recurrent network to progressively refine each iteration's reflection removal results. Extensive experiments verify the superior performance of the proposed method over state-of-the-art approaches.
Knowledge distillation has become an important technique for model compression and acceleration. The conventional knowledge distillation approaches aim to transfer knowledge from teacher to student networks by minimizing the KL-divergence between their probabilistic outputs, which only consider the mutual relationship between individual representations of teacher and student networks. Recently, the contrastive loss-based knowledge distillation is proposed to enable a student to learn the instance discriminative knowledge of a teacher by mapping the same image close and different images far away in the representation space. However, all of these methods ignore that the teacher's knowledge is multi-level, e.g., individual, relational and categorical level. These different levels of knowledge cannot be effectively captured by only one kind of supervisory signal. Here, we introduce Multi-level Knowledge Distillation (MLKD) to transfer richer representational knowledge from teacher to student networks. MLKD employs three novel teacher-student similarities: individual similarity, relational similarity, and categorical similarity, to encourage the student network to learn sample-wise, structure-wise and category-wise knowledge in the teacher network. Experiments demonstrate that MLKD outperforms other state-of-the-art methods on both similar-architecture and cross-architecture tasks. We further show that MLKD can improve the transferability of learned representations in the student network.
While the superior performance of second-order optimization methods such as Newton's method is well known, they are hardly used in practice for deep learning because neither assembling the Hessian matrix nor calculating its inverse is feasible for large-scale problems. Existing second-order methods resort to various diagonal or low-rank approximations of the Hessian, which often fail to capture necessary curvature information to generate a substantial improvement. On the other hand, when training becomes batch-based (i.e., stochastic), noisy second-order information easily contaminates the training procedure unless expensive safeguard is employed. In this paper, we adopt a numerical algorithm for second-order neural network training. We tackle the practical obstacle of Hessian calculation by using the complex-step finite difference (CSFD) -- a numerical procedure adding an imaginary perturbation to the function for derivative computation. CSFD is highly robust, efficient, and accurate (as accurate as the analytic result). This method allows us to literally apply any known second-order optimization methods for deep learning training. Based on it, we design an effective Newton Krylov procedure. The key mechanism is to terminate the stochastic Krylov iteration as soon as a disturbing direction is found so that unnecessary computation can be avoided. During the optimization, we monitor the approximation error in the Taylor expansion to adjust the step size. This strategy combines advantages of line search and trust region methods making our method preserves good local and global convergency at the same time. We have tested our methods in various deep learning tasks. The experiments show that our method outperforms exiting methods, and it often converges one-order faster. We believe our method will inspire a wide-range of new algorithms for deep learning and numerical optimization.
We solve a challenging yet practically useful variant of 3D Bin Packing Problem (3D-BPP). In our problem, the agent has limited information about the items to be packed into the bin, and an item must be packed immediately after its arrival without buffering or readjusting. The item's placement also subjects to the constraints of collision avoidance and physical stability. We formulate this online 3D-BPP as a constrained Markov decision process. To solve the problem, we propose an effective and easy-to-implement constrained deep reinforcement learning (DRL) method under the actor-critic framework. In particular, we introduce a feasibility predictor to predict the feasibility mask for the placement actions and use it to modulate the action probabilities output by the actor during training. Such supervisions and transformations to DRL facilitate the agent to learn feasible policies efficiently. Our method can also be generalized e.g., with the ability to handle lookahead or items with different orientations. We have conducted extensive evaluation showing that the learned policy significantly outperforms the state-of-the-art methods. A user study suggests that our method attains a human-level performance.