Zheng Zhang

DeepOHeat: Operator Learning-based Ultra-fast Thermal Simulation in 3D-IC Design

Feb 25, 2023
Ziyue Liu, Yixing Li, Jing Hu, Xinling Yu, Shinyu Shiau, Xin Ai, Zhiyu Zeng, Zheng Zhang

Thermal issues are a major concern in 3D integrated circuit (IC) design. Thermal optimization of 3D ICs often requires a large number of expensive PDE simulations. Neural network-based thermal prediction models can perform real-time prediction for many unseen designs. However, existing works either solve 2D temperature fields only or do not generalize well to new designs with unseen design configurations (e.g., heat sources and boundary conditions). In this paper, for the first time, we propose DeepOHeat, a physics-aware operator learning framework to predict the temperature field of a family of heat equations with multiple parametric or non-parametric design configurations. This framework learns a functional map from the function space of multiple key PDE configurations (e.g., boundary conditions, power maps, heat transfer coefficients) to the function space of the corresponding solution (i.e., temperature fields), enabling fast thermal analysis and optimization by changing key design configurations (rather than just some parameters). We test DeepOHeat on some industrial design cases and compare it against Celsius 3D from Cadence Design Systems. Our results show that, for the unseen testing cases, a well-trained DeepOHeat can produce accurate results with $1000\times$ to $300000\times$ speedup.

* Camera-ready for ACM/IEEE Design Automation Conference (DAC) 2023
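
The functional map described in the DeepOHeat abstract, from a sampled design configuration to a temperature field that can be queried at any coordinate, can be illustrated with a DeepONet-style branch/trunk network. The sketch below is a minimal, untrained illustration and not the authors' architecture: it assumes a DeepONet-style structure, and the layer sizes, the helper names `make_mlp`, `mlp_forward`, and `deeponet_predict`, and the four-sample power map are all hypothetical.

```python
import math
import random


def make_mlp(sizes, rng):
    """Build a small MLP as a list of (weights, biases) layers with random weights."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) / math.sqrt(n_in) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def mlp_forward(layers, x):
    """Apply the MLP: tanh on hidden layers, linear output layer."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def deeponet_predict(branch, trunk, config_samples, coord):
    """Field estimate: inner product of branch and trunk feature vectors."""
    b = mlp_forward(branch, config_samples)  # encodes the input function (e.g. a power map)
    t = mlp_forward(trunk, coord)            # encodes the query location
    return sum(bi * ti for bi, ti in zip(b, t))


rng = random.Random(0)
branch_net = make_mlp([4, 8, 6], rng)  # 4 sampled configuration values -> 6 features
trunk_net = make_mlp([3, 8, 6], rng)   # (x, y, z) coordinate -> 6 features
power_map = [0.2, 0.9, 0.4, 0.1]       # hypothetical power-map samples
value = deeponet_predict(branch_net, trunk_net, power_map, [0.5, 0.5, 0.1])
```

Once such a network is trained, evaluating a new design only needs one forward pass per query point, which is where a speedup over a PDE solver would come from.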

PIFON-EPT: MR-Based Electrical Property Tomography Using Physics-Informed Fourier Networks

Feb 24, 2023
Xinling Yu, José E. C. Serrallés, Ilias I. Giannakopoulos, Ziyue Liu, Luca Daniel, Riccardo Lattanzi, Zheng Zhang

Objective: In this paper, we introduce Physics-Informed Fourier Networks (PIFONs) for Electrical Properties (EP) Tomography (EPT). Our novel deep learning-based method is capable of learning EPs globally by solving an inverse scattering problem based on noisy and/or incomplete magnetic resonance (MR) measurements. Methods: We use two separate fully-connected neural networks, namely $B_1^{+}$ Net and EP Net, to learn the $B_1^{+}$ field and EPs at any location. A random Fourier features mapping is embedded into $B_1^{+}$ Net, which allows it to learn the $B_1^{+}$ field more efficiently. These two neural networks are trained jointly by minimizing the combination of a physics-informed loss and a data mismatch loss via gradient descent. Results: We showed that PIFON-EPT could provide physically consistent reconstructions of EPs and transmit field in the whole domain of interest even when half of the noisy MR measurements of the entire volume were missing. The average error was $2.49\%$, $4.09\%$ and $0.32\%$ for the relative permittivity, conductivity and $B_{1}^{+}$, respectively, over the entire volume of the phantom. In experiments that admitted a zero assumption of $B_z$, PIFON-EPT could yield accurate EP predictions near the interface between regions of different EP values without requiring any boundary conditions. Conclusion: This work demonstrated the feasibility of PIFON-EPT, suggesting it could be an accurate and effective method for electrical properties estimation. Significance: PIFON-EPT can efficiently de-noise MR measurements, which shows the potential to improve other MR-based EPT techniques. Furthermore, it is the first time that MR-based EPT methods can reconstruct the EPs and $B_{1}^{+}$ field simultaneously from incomplete simulated noisy MR measurements.

* 11 pages
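
The random Fourier features mapping mentioned in the PIFON-EPT abstract can be sketched as follows. This is the generic form of the technique (project the input with a fixed Gaussian matrix, then take cosines and sines), not the authors' exact configuration: the feature count, the scale `sigma`, and the function names are assumptions chosen for illustration.

```python
import math
import random


def make_frequency_matrix(num_features, input_dim, sigma, rng):
    """Sample a fixed projection matrix with i.i.d. N(0, sigma^2) entries."""
    return [[rng.gauss(0.0, sigma) for _ in range(input_dim)]
            for _ in range(num_features)]


def fourier_features(x, freq_matrix):
    """Map x to [cos(2*pi*B x), sin(2*pi*B x)] for a fixed matrix B."""
    projections = [2.0 * math.pi * sum(b * xi for b, xi in zip(row, x))
                   for row in freq_matrix]
    return [math.cos(p) for p in projections] + [math.sin(p) for p in projections]


rng = random.Random(0)
B = make_frequency_matrix(num_features=8, input_dim=3, sigma=1.0, rng=rng)
encoded = fourier_features([0.1, 0.2, 0.3], B)  # would be fed to the first dense layer
```

The matrix is sampled once and kept fixed, so the mapping adds no trainable parameters; a larger `sigma` exposes the network to higher spatial frequencies.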

Side Adapter Network for Open-Vocabulary Semantic Segmentation

Feb 23, 2023
Mengde Xu, Zheng Zhang, Fangyun Wei, Han Hu, Xiang Bai

This paper presents a new framework for open-vocabulary semantic segmentation with a pre-trained vision-language model, named Side Adapter Network (SAN). Our approach models the semantic segmentation task as a region recognition problem. A side network is attached to a frozen CLIP model with two branches: one for predicting mask proposals, and the other for predicting attention bias, which is applied in the CLIP model to recognize the class of masks. This decoupled design benefits CLIP in recognizing the class of mask proposals. Since the attached side network can reuse CLIP features, it can be very light. In addition, the entire network can be trained end-to-end, allowing the side network to be adapted to the frozen CLIP model, which makes the predicted mask proposals CLIP-aware. Our approach is fast, accurate, and adds only a few trainable parameters. We evaluate our approach on multiple semantic segmentation benchmarks. Our method significantly outperforms other counterparts, with up to 18 times fewer trainable parameters and 19 times faster inference speed. We hope our approach will serve as a solid baseline and help ease future research in open-vocabulary semantic segmentation. The code will be available at https://github.com/MendelXu/SAN.
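
One common way to apply an attention bias, as mentioned in the SAN abstract, is to add it to the attention logits before the softmax. The sketch below shows only that generic mechanism for a single query; how SAN actually injects the bias into CLIP is not specified in the abstract, so the function names and the single-head, single-query setup here are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(query, keys, values, bias):
    """Single-query attention where bias[i] is added to the logit of key i."""
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale + b
              for key, b in zip(keys, bias)]
    weights = softmax(logits)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
# A strongly negative bias removes the third position from the attended region.
output, weights = biased_attention([1.0, 1.0], keys, values, [0.0, 0.0, -1e9])
```

A bias of this kind steers a frozen attention layer toward a proposed region without changing the layer's own weights.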

Tensorized Optical Multimodal Fusion Network

Feb 17, 2023
Yequan Zhao, Xian Xiao, Geza Kurczveil, Raymond G. Beausoleil, Zheng Zhang

We propose the first tensorized optical multimodal fusion network architecture with a self-attention mechanism and low-rank tensor fusion. Simulation results show $51.3 \times$ less hardware requirement and $3.7\times 10^{13}$ MAC/J energy efficiency.

* CLEO 2023 Novel Applications in Integrated Photonics
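
Low-rank tensor fusion, as named in the abstract above, avoids building the outer product of the modality vectors by factorizing the fusion weight tensor. The sketch below shows the idea for two modalities and checks it against the full-tensor computation; it is a generic illustration rather than the paper's optical implementation, the dimensions and rank are arbitrary, and the constant-1 padding used in some formulations is omitted.

```python
import random


def matvec(matrix, vector):
    """Dense matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def low_rank_fusion(factors_a, factors_b, z_a, z_b):
    """Fuse as sum over rank r of (A_r z_a) * (B_r z_b), elementwise."""
    out_dim = len(factors_a[0])
    fused = [0.0] * out_dim
    for A, B in zip(factors_a, factors_b):
        pa, pb = matvec(A, z_a), matvec(B, z_b)
        for k in range(out_dim):
            fused[k] += pa[k] * pb[k]
    return fused


def full_tensor_fusion(factors_a, factors_b, z_a, z_b):
    """Reference: contract the full weight tensor W[k][i][j] with z_a and z_b."""
    fused = []
    for k in range(len(factors_a[0])):
        total = 0.0
        for i, a in enumerate(z_a):
            for j, b in enumerate(z_b):
                w_kij = sum(A[k][i] * B[k][j] for A, B in zip(factors_a, factors_b))
                total += w_kij * a * b
        fused.append(total)
    return fused


rng = random.Random(0)
rank, out_dim, dim_a, dim_b = 2, 4, 3, 5
factors_a = [[[rng.uniform(-1, 1) for _ in range(dim_a)] for _ in range(out_dim)]
             for _ in range(rank)]
factors_b = [[[rng.uniform(-1, 1) for _ in range(dim_b)] for _ in range(out_dim)]
             for _ in range(rank)]
z_a = [rng.uniform(-1, 1) for _ in range(dim_a)]
z_b = [rng.uniform(-1, 1) for _ in range(dim_b)]
fast = low_rank_fusion(factors_a, factors_b, z_a, z_b)
slow = full_tensor_fusion(factors_a, factors_b, z_a, z_b)
```

The factorized form stores `rank * out_dim * (dim_a + dim_b)` weights instead of `out_dim * dim_a * dim_b`, which is the source of the savings when the modality dimensions are large.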

Physical Layer Security in Near-Field Communications: What Will Be Changed?

Feb 15, 2023
Zheng Zhang, Yuanwei Liu, Zhaolin Wang, Xidong Mu, Jian Chen

A near-field secure transmission framework is proposed. Employing the hybrid beamforming architecture, a base station (BS) transmits confidential information to a legitimate user (U) against an eavesdropper (E) in the near field. A two-stage algorithm is proposed to maximize the near-field secrecy capacity. Based on the fully-digital beamformers obtained in the first stage, the optimal analog beamformers and baseband digital beamformers can be alternately derived in closed form in the second stage. Numerical results demonstrate that, in contrast to far-field secure communication, which relies on the angular disparity, near-field secure communication mainly relies on the distance disparity between U and E.

* 5 pages
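
The objective in the abstract above, the secrecy capacity, is commonly written as the positive part of the difference between the legitimate user's rate and the eavesdropper's rate. The sketch below evaluates that textbook expression for scalar SNRs; it ignores the beamforming design entirely, and the function name and the example SNR values are assumptions made for illustration.

```python
import math


def secrecy_rate(snr_user, snr_eavesdropper):
    """Secrecy rate in bit/s/Hz: max(0, log2(1 + SNR_U) - log2(1 + SNR_E))."""
    rate_user = math.log2(1.0 + snr_user)
    rate_eve = math.log2(1.0 + snr_eavesdropper)
    return max(0.0, rate_user - rate_eve)


# Linear (not dB) SNR values, chosen only for illustration.
advantage = secrecy_rate(snr_user=3.0, snr_eavesdropper=1.0)
no_advantage = secrecy_rate(snr_user=1.0, snr_eavesdropper=3.0)
```

The rate is positive only when the legitimate user sees a better channel than the eavesdropper, which is why a disparity between U and E (angular or distance-based) matters.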

Apples and Oranges? Assessing Image Quality over Content Recognition

Jan 31, 2023
Junyong You, Zheng Zhang

Image recognition and quality assessment are two important viewing tasks that potentially follow different visual mechanisms. This paper investigates whether the two tasks can be performed in a multitask learning manner. A sequential spatial-channel attention module is proposed to simulate the visual attention and contrast sensitivity mechanisms that are crucial for content recognition and quality assessment. Spatial attention is shared between content recognition and quality assessment, while channel attention is used solely for quality assessment. This attention module is integrated into a Transformer to build a uniform model for the two viewing tasks. The experimental results demonstrate that the proposed uniform model can achieve promising performance for both quality assessment and content recognition tasks.

ReFresh: Reducing Memory Access from Exploiting Stable Historical Embeddings for Graph Neural Network Training

Jan 19, 2023
Kezhao Huang, Haitian Jiang, Minjie Wang, Guangxuan Xiao, David Wipf, Xiang Song, Quan Gan, Zengfeng Huang, Jidong Zhai, Zheng Zhang

A key performance bottleneck when training graph neural network (GNN) models on large, real-world graphs is loading node features onto a GPU. Due to limited GPU memory, expensive data movement is necessary to facilitate the storage of these features on alternative devices with slower access (e.g. CPU memory). Moreover, the irregularity of graph structures contributes to poor data locality which further exacerbates the problem. Consequently, existing frameworks capable of efficiently training large GNN models usually incur a significant accuracy degradation because of the inevitable shortcuts involved. To address these limitations, we instead propose ReFresh, a general-purpose GNN mini-batch training framework that leverages a historical cache for storing and reusing GNN node embeddings instead of re-computing them through fetching raw features at every iteration. Critical to its success, the corresponding cache policy is designed, using a combination of gradient-based and staleness criteria, to selectively screen those embeddings which are relatively stable and can be cached, from those that need to be re-computed to reduce estimation errors and subsequent downstream accuracy loss. When paired with complementary system enhancements to support this selective historical cache, ReFresh is able to accelerate the training speed on large graph datasets such as ogbn-papers100M and MAG240M by 4.6x up to 23.6x and reduce the memory access by 64.5% (85.7% higher than a raw feature cache), with less than 1% influence on test accuracy.
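
A cache policy that combines a gradient-based criterion with a staleness criterion, as described in the ReFresh abstract, might look like the sketch below. The abstract does not give the actual rules, so everything here is an assumption: the thresholds, the per-node statistics (gradient norm and age in iterations), and the names `should_cache` and `split_nodes` are hypothetical.

```python
def should_cache(grad_norm, age, grad_threshold, max_staleness):
    """Treat an embedding as stable if its gradient is small and it is recent enough."""
    return grad_norm < grad_threshold and age <= max_staleness


def split_nodes(node_stats, grad_threshold=0.01, max_staleness=5):
    """Partition nodes into those served from cache and those to recompute.

    node_stats maps node id -> (gradient norm of its embedding, iterations since refresh).
    """
    cached, recompute = [], []
    for node_id, (grad_norm, age) in sorted(node_stats.items()):
        if should_cache(grad_norm, age, grad_threshold, max_staleness):
            cached.append(node_id)
        else:
            recompute.append(node_id)
    return cached, recompute


stats = {
    0: (0.001, 2),  # small gradient, fresh: reuse the cached embedding
    1: (0.500, 1),  # large gradient: embedding is still changing, recompute
    2: (0.002, 9),  # small gradient but too stale: recompute
}
cached_ids, recompute_ids = split_nodes(stats)
```

Only the nodes in `recompute_ids` would need their raw features fetched, which is how a policy of this kind can reduce memory access.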

STARS-ISAC: How Many Sensors Do We Need?

Jan 09, 2023
Zheng Zhang, Yuanwei Liu, Zhaolin Wang, Jian Chen

A simultaneously transmitting and reflecting surface (STARS) enabled integrated sensing and communications (ISAC) framework is proposed, where a novel bi-directional sensing-STARS architecture is devised to facilitate full-space communication and sensing. Based on the proposed framework, a joint optimization problem is formulated, where the Cramer-Rao bound (CRB) for estimating the two-dimensional direction-of-arrival of the sensing target is minimized. Two cases are considered for sensing performance enhancement. 1) For the two-user case, an alternating optimization algorithm is proposed. In particular, the maximum number of deployable sensors is obtained in closed form. 2) For the multi-user case, an extended CRB (ECRB) metric is proposed to characterize the impact of the number of sensors on the sensing performance. Based on the proposed metric, a novel penalty-based double-loop (PDL) algorithm is proposed to solve the ECRB minimization problem. To tackle the coupling of the ECRB, a general decoupling approach is proposed to convert it to a tractable weighted linear summation form. Simulation results reveal that 1) the proposed PDL algorithm can achieve near-optimal performance with consideration of sensor deployment; 2) without violating the communication quality of service requirements, reducing the receive antennas at the BS does not deteriorate the sensing performance; and 3) it is preferable to deploy more passive elements than sensors in terms of achieving optimal sensing performance.

* journal paper

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

Jan 05, 2023
Jia Ning, Chen Li, Zheng Zhang, Zigang Geng, Qi Dai, Kun He, Han Hu

Unlike language tasks, where the output space is usually limited to a set of tokens, the output space of visual tasks is more complicated, making it difficult to build a unified visual model for various visual tasks. In this paper, we seek to unify the output space of visual tasks, so that we can also build a unified model for visual tasks. To this end, we demonstrate a single unified model that simultaneously handles two typical visual tasks of instance segmentation and depth estimation, which have discrete/fixed-length and continuous/varied-length outputs, respectively. We propose several new techniques that take into account the particularity of visual tasks: 1) Soft token. We employ soft tokens to represent the task output. Unlike hard tokens in the common VQ-VAE, which are assigned one-hot to discrete codebooks/vocabularies, the soft token is assigned softly to the codebook embeddings. Soft tokens can improve the accuracy of both the next token inference and decoding of the task output; 2) Mask augmentation. Many visual tasks have corrupted, undefined, or invalid values in label annotations, e.g., occluded areas of depth maps. We show that a mask augmentation technique can greatly benefit these tasks. With these new techniques and other designs, we show that the proposed general-purpose task-solver can perform both instance segmentation and depth estimation well. In particular, we achieve 0.279 RMSE on the specific task of NYUv2 depth estimation, setting a new record on this benchmark. The general-purpose task-solver, dubbed AiT, is available at https://github.com/SwinTransformer/AiT.
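
The contrast between hard and soft tokens described in the abstract above can be sketched as follows: a hard token selects one codebook embedding, while a soft token is a probability-weighted mixture of all of them. This is a minimal illustration and not the AiT implementation; the two-entry codebook, the logits, and the function names are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_token(logits, codebook):
    """One-hot assignment: return the embedding of the highest-scoring code."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return list(codebook[best])


def soft_token(logits, codebook):
    """Soft assignment: probability-weighted average of the codebook embeddings."""
    weights = softmax(logits)
    dim = len(codebook[0])
    return [sum(w * entry[d] for w, entry in zip(weights, codebook))
            for d in range(dim)]


codebook = [[0.0, 0.0], [2.0, 4.0]]            # hypothetical 2-entry codebook
undecided = soft_token([0.0, 0.0], codebook)   # equal logits: midpoint of the entries
confident = soft_token([0.0, 50.0], codebook)  # peaked logits: close to the hard token
picked = hard_token([0.0, 50.0], codebook)
```

Because the soft token varies continuously with the logits, it can represent values between codebook entries, which suits continuous outputs such as depth.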
