Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yiwei Lyu

Step-Calibrated Diffusion for Biomedical Optical Image Restoration

Mar 26, 2024

Yiwei Lyu, Sung Jik Cha, Cheng Jiang, Asadur Chowdury, Xinhai Hou, Edward Harake, Akhil Kondepudi, Christian Freudiger, Honglak Lee, Todd C. Hollon

Figure 1 for Step-Calibrated Diffusion for Biomedical Optical Image Restoration

Figure 2 for Step-Calibrated Diffusion for Biomedical Optical Image Restoration

Figure 3 for Step-Calibrated Diffusion for Biomedical Optical Image Restoration

Figure 4 for Step-Calibrated Diffusion for Biomedical Optical Image Restoration

Abstract:High-quality, high-resolution medical imaging is essential for clinical care. Raman-based biomedical optical imaging uses non-ionizing infrared radiation to evaluate human tissues in real time and is used for early cancer detection, brain tumor diagnosis, and intraoperative tissue analysis. Unfortunately, optical imaging is vulnerable to image degradation due to laser scattering and absorption, which can result in diagnostic errors and misguided treatment. Restoration of optical images is a challenging computer vision task because the sources of image degradation are multi-factorial, stochastic, and tissue-dependent, preventing a straightforward method to obtain paired low-quality/high-quality data. Here, we present Restorative Step-Calibrated Diffusion (RSCD), an unpaired image restoration method that views the image restoration problem as completing the finishing steps of a diffusion-based image generation task. RSCD uses a step calibrator model to dynamically determine the severity of image degradation and the number of steps required to complete the reverse diffusion process for image restoration. RSCD outperforms other widely used unpaired image restoration methods on both image quality and perceptual evaluation metrics for restoring optical images. Medical imaging experts consistently prefer images restored using RSCD in blinded comparison experiments and report minimal to no hallucinations. Finally, we show that RSCD improves performance on downstream clinical imaging tasks, including automated brain tumor diagnosis and deep tissue imaging. Our code is available at https://github.com/MLNeurosurg/restorative_step-calibrated_diffusion.

Via

Access Paper or Ask Questions

A self-supervised framework for learning whole slide representations

Feb 09, 2024

Xinhai Hou, Cheng Jiang, Akhil Kondepudi, Yiwei Lyu, Asadur Zaman Chowdury, Honglak Lee, Todd C. Hollon

Figure 1 for A self-supervised framework for learning whole slide representations

Figure 2 for A self-supervised framework for learning whole slide representations

Figure 3 for A self-supervised framework for learning whole slide representations

Figure 4 for A self-supervised framework for learning whole slide representations

Abstract:Whole slide imaging is fundamental to biomedical microscopy and computational pathology. However, whole slide images (WSIs) present a complex computer vision challenge due to their gigapixel size, diverse histopathologic features, spatial heterogeneity, and limited/absent data annotations. These challenges highlight that supervised training alone can result in suboptimal whole slide representations. Self-supervised representation learning can achieve high-quality WSI visual feature learning for downstream diagnostic tasks, such as cancer diagnosis or molecular genetic prediction. Here, we present a general self-supervised whole slide learning (S3L) framework for gigapixel-scale self-supervision of WSIs. S3L combines data transformation strategies from transformer-based vision and language modeling into a single unified framework to generate paired views for self-supervision. S3L leverages the inherent regional heterogeneity, histologic feature variability, and information redundancy within WSIs to learn high-quality whole-slide representations. We benchmark S3L visual representations on two diagnostic tasks for two biomedical microscopy modalities. S3L significantly outperforms WSI baselines for cancer diagnosis and genetic mutation prediction. Additionally, S3L achieves good performance using both in-domain and out-of-distribution patch encoders, demonstrating good flexibility and generalizability.

* 18 pages, 11 figures

Via

Access Paper or Ask Questions

TOD-Flow: Modeling the Structure of Task-Oriented Dialogues

Dec 07, 2023

Sungryull Sohn, Yiwei Lyu, Anthony Liu, Lajanugen Logeswaran, Dong-Ki Kim, Dongsub Shim, Honglak Lee

Figure 1 for TOD-Flow: Modeling the Structure of Task-Oriented Dialogues

Figure 2 for TOD-Flow: Modeling the Structure of Task-Oriented Dialogues

Figure 3 for TOD-Flow: Modeling the Structure of Task-Oriented Dialogues

Figure 4 for TOD-Flow: Modeling the Structure of Task-Oriented Dialogues

Abstract:Task-Oriented Dialogue (TOD) systems have become crucial components in interactive artificial intelligence applications. While recent advances have capitalized on pre-trained language models (PLMs), they exhibit limitations regarding transparency and controllability. To address these challenges, we propose a novel approach focusing on inferring the TOD-Flow graph from dialogue data annotated with dialog acts, uncovering the underlying task structure in the form of a graph. The inferred TOD-Flow graph can be easily integrated with any dialogue model to improve its prediction performance, transparency, and controllability. Our TOD-Flow graph learns what a model can, should, and should not predict, effectively reducing the search space and providing a rationale for the model's prediction. We show that the proposed TOD-Flow graph better resembles human-annotated graphs compared to prior approaches. Furthermore, when combined with several dialogue policies and end-to-end dialogue models, we demonstrate that our approach significantly improves dialog act classification and end-to-end response generation performance in the MultiWOZ and SGD benchmarks. Code available at: https://github.com/srsohn/TOD-Flow

Via

Access Paper or Ask Questions

Code Models are Zero-shot Precondition Reasoners

Nov 16, 2023

Lajanugen Logeswaran, Sungryull Sohn, Yiwei Lyu, Anthony Zhe Liu, Dong-Ki Kim, Dongsub Shim, Moontae Lee, Honglak Lee

Figure 1 for Code Models are Zero-shot Precondition Reasoners

Figure 2 for Code Models are Zero-shot Precondition Reasoners

Figure 3 for Code Models are Zero-shot Precondition Reasoners

Figure 4 for Code Models are Zero-shot Precondition Reasoners

Abstract:One of the fundamental skills required for an agent acting in an environment to complete tasks is the ability to understand what actions are plausible at any given point. This work explores a novel use of code representations to reason about action preconditions for sequential decision making tasks. Code representations offer the flexibility to model procedural activities and associated constraints as well as the ability to execute and verify constraint satisfaction. Leveraging code representations, we extract action preconditions from demonstration trajectories in a zero-shot manner using pre-trained code models. Given these extracted preconditions, we propose a precondition-aware action sampling strategy that ensures actions predicted by a policy are consistent with preconditions. We demonstrate that the proposed approach enhances the performance of few-shot policy learning approaches across task-oriented dialog and embodied textworld benchmarks.

* Neurips Foundation Models for Decision Making Workshop 2023

Via

Access Paper or Ask Questions

MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning

Jun 28, 2023

Paul Pu Liang, Yiwei Lyu, Xiang Fan, Arav Agarwal, Yun Cheng, Louis-Philippe Morency, Ruslan Salakhutdinov

Figure 1 for MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning

Figure 2 for MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning

Abstract:Learning multimodal representations involves integrating information from multiple heterogeneous sources of data. In order to accelerate progress towards understudied modalities and tasks while ensuring real-world robustness, we release MultiZoo, a public toolkit consisting of standardized implementations of > 20 core multimodal algorithms and MultiBench, a large-scale benchmark spanning 15 datasets, 10 modalities, 20 prediction tasks, and 6 research areas. Together, these provide an automated end-to-end machine learning pipeline that simplifies and standardizes data loading, experimental setup, and model evaluation. To enable holistic evaluation, we offer a comprehensive methodology to assess (1) generalization, (2) time and space complexity, and (3) modality robustness. MultiBench paves the way towards a better understanding of the capabilities and limitations of multimodal models, while ensuring ease of use, accessibility, and reproducibility. Our toolkits are publicly available, will be regularly updated, and welcome inputs from the community.

* JMLR Open Source Software 2023, Code available at https://github.com/pliang279/MultiBench

Via

Access Paper or Ask Questions

Fine-grained Text Style Transfer with Diffusion-Based Language Models

Jun 12, 2023

Yiwei Lyu, Tiange Luo, Jiacheng Shi, Todd C. Hollon, Honglak Lee

Figure 1 for Fine-grained Text Style Transfer with Diffusion-Based Language Models

Figure 2 for Fine-grained Text Style Transfer with Diffusion-Based Language Models

Figure 3 for Fine-grained Text Style Transfer with Diffusion-Based Language Models

Figure 4 for Fine-grained Text Style Transfer with Diffusion-Based Language Models

Abstract:Diffusion probabilistic models have shown great success in generating high-quality images controllably, and researchers have tried to utilize this controllability into text generation domain. Previous works on diffusion-based language models have shown that they can be trained without external knowledge (such as pre-trained weights) and still achieve stable performance and controllability. In this paper, we trained a diffusion-based model on StylePTB dataset, the standard benchmark for fine-grained text style transfers. The tasks in StylePTB requires much more refined control over the output text compared to tasks evaluated in previous works, and our model was able to achieve state-of-the-art performance on StylePTB on both individual and compositional transfers. Moreover, our model, trained on limited data from StylePTB without external knowledge, outperforms previous works that utilized pretrained weights, embeddings, and external grammar parsers, and this may indicate that diffusion-based language models have great potential under low-resource settings.

* Accepted at Repl4NLP workshop at ACL 2023

Via

Access Paper or Ask Questions

Risk-aware Safe Control for Decentralized Multi-agent Systems via Dynamic Responsibility Allocation

May 22, 2023

Yiwei Lyu, Wenhao Luo, John M. Dolan

Abstract:Decentralized control schemes are increasingly favored in various domains that involve multi-agent systems due to the need for computational efficiency as well as general applicability to large-scale systems. However, in the absence of an explicit global coordinator, it is hard for distributed agents to determine how to efficiently interact with others. In this paper, we present a risk-aware decentralized control framework that provides guidance on how much relative responsibility share (a percentage) an individual agent should take to avoid collisions with others while moving efficiently without direct communications. We propose a novel Control Barrier Function (CBF)-inspired risk measurement to characterize the aggregate risk agents face from potential collisions under motion uncertainty. We use this measurement to allocate responsibility shares among agents dynamically and develop risk-aware decentralized safe controllers. In this way, we are able to leverage the flexibility of robots with lower risk to improve the motion flexibility for those with higher risk, thus achieving improved collective safety. We demonstrate the validity and efficiency of our proposed approach through two examples: ramp merging in autonomous driving and a multi-agent position-swapping game.

Via

Access Paper or Ask Questions

Active Probing and Influencing Human Behaviors Via Autonomous Agents

Apr 24, 2023

Shuangge Wang, Yiwei Lyu, John M. Dolan

Figure 1 for Active Probing and Influencing Human Behaviors Via Autonomous Agents

Figure 2 for Active Probing and Influencing Human Behaviors Via Autonomous Agents

Figure 3 for Active Probing and Influencing Human Behaviors Via Autonomous Agents

Figure 4 for Active Probing and Influencing Human Behaviors Via Autonomous Agents

Abstract:Autonomous agents (robots) face tremendous challenges while interacting with heterogeneous human agents in close proximity. One of these challenges is that the autonomous agent does not have an accurate model tailored to the specific human that the autonomous agent is interacting with, which could sometimes result in inefficient human-robot interaction and suboptimal system dynamics. Developing an online method to enable the autonomous agent to learn information about the human model is therefore an ongoing research goal. Existing approaches position the robot as a passive learner in the environment to observe the physical states and the associated human response. This passive design, however, only allows the robot to obtain information that the human chooses to exhibit, which sometimes doesn't capture the human's full intention. In this work, we present an online optimization-based probing procedure for the autonomous agent to clarify its belief about the human model in an active manner. By optimizing an information radius, the autonomous agent chooses the action that most challenges its current conviction. This procedure allows the autonomous agent to actively probe the human agents to reveal information that's previously unavailable to the autonomous agent. With this gathered information, the autonomous agent can interactively influence the human agent for some designated objectives. Our main contributions include a coherent theoretical framework that unifies the probing and influence procedures and two case studies in autonomous driving that show how active probing can help to create better participant experience during influence, like higher efficiency or less perturbations.

* 2023 IEEE International Conference on Robotics and Automation (ICRA)

Via

Access Paper or Ask Questions

Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning

Apr 13, 2023

Wenli Xiao, Yiwei Lyu, John Dolan

Figure 1 for Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning

Figure 2 for Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning

Figure 3 for Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning

Figure 4 for Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning

Abstract:Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize reward but do not have safety guarantees during the learning and deployment phases. Although shielding with Linear Temporal Logic (LTL) is a promising formal method to ensure safety in single-agent Reinforcement Learning (RL), it results in conservative behaviors when scaling to multi-agent scenarios. Additionally, it poses computational challenges for synthesizing shields in complex multi-agent environments. This work introduces Model-based Dynamic Shielding (MBDS) to support MARL algorithm design. Our algorithm synthesizes distributive shields, which are reactive systems running in parallel with each MARL agent, to monitor and rectify unsafe behaviors. The shields can dynamically split, merge, and recompute based on agents' states. This design enables efficient synthesis of shields to monitor agents in complex environments without coordination overheads. We also propose an algorithm to synthesize shields without prior knowledge of the dynamics model. The proposed algorithm obtains an approximate world model by interacting with the environment during the early stage of exploration, making our MBDS enjoy formal safety guarantees with high probability. We demonstrate in simulations that our framework can surpass existing baselines in terms of safety guarantees and learning performance.

* Accepted in AAMAS 2023

Via

Access Paper or Ask Questions

Minimally Constrained Multi-Robot Coordination with Line-of-sight Connectivity Maintenance

Mar 07, 2023

Yupeng Yang, Yiwei Lyu, Wenhao Luo

Abstract:In this paper, we consider a team of mobile robots executing simultaneously multiple behaviors by different subgroups, while maintaining global and subgroup line-of-sight (LOS) network connectivity that minimally constrains the original multi-robot behaviors. The LOS connectivity between pairwise robots is preserved when two robots stay within the limited communication range and their LOS remains occlusion-free from static obstacles while moving. By using control barrier functions (CBF) and minimum volume enclosing ellipsoids (MVEE), we first introduce the LOS connectivity barrier certificate (LOS-CBC) to characterize the state-dependent admissible control space for pairwise robots, from which their resulting motion will keep the two robots LOS connected over time. We then propose the Minimum Line-of-Sight Connectivity Constraint Spanning Tree (MLCCST) as a step-wise bilevel optimization framework to jointly optimize (a) the minimum set of LOS edges to actively maintain, and (b) the control revision with respect to a nominal multi-robot controller due to LOS connectivity maintenance. As proved in the theoretical analysis, this allows the robots to improvise the optimal composition of LOS-CBC control constraints that are least constraining around the nominal controllers, and at the same time enforce the global and subgroup LOS connectivity through the resulting preserved set of pairwise LOS edges. The framework thus leads to robots staying as close to their nominal behaviors, while exhibiting dynamically changing LOS-connected network topology that provides the greatest flexibility for the existing multi-robot tasks in real time. We demonstrate the effectiveness of our approach through simulations with up to 64 robots.

* Extended version of the paper to appear in ICRA 2023, 12 pages

Via

Access Paper or Ask Questions