Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Time": models, code, and papers

MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts

Apr 13, 2024
Yusheng Liao, Shuyang Jiang, Yu Wang, Yanfeng Wang

Large language models like ChatGPT have shown substantial progress in natural language understanding and generation, proving valuable across various disciplines, including the medical field. Despite advancements, challenges persist due to the complexity and diversity inherent in medical tasks which often require multi-task learning capabilities. Previous approaches, although beneficial, fall short in real-world applications because they necessitate task-specific annotations at inference time, limiting broader generalization. This paper introduces MING-MOE, a novel Mixture-of-Expert~(MOE)-based medical large language model designed to manage diverse and complex medical tasks without requiring task-specific annotations, thus enhancing its usability across extensive datasets. MING-MOE employs a Mixture of Low-Rank Adaptation (MoLoRA) technique, allowing for efficient parameter usage by maintaining base model parameters static while adapting through a minimal set of trainable parameters. We demonstrate that MING-MOE achieves state-of-the-art (SOTA) performance on over 20 medical tasks, illustrating a significant improvement over existing models. This approach not only extends the capabilities of medical language models but also improves inference efficiency.

* 15 pages, 3 figures

Via

Access Paper or Ask Questions

Flying with Photons: Rendering Novel Views of Propagating Light

Apr 10, 2024
Anagh Malik, Noah Juravsky, Ryan Po, Gordon Wetzstein, Kiriakos N. Kutulakos, David B. Lindell

We present an imaging and neural rendering technique that seeks to synthesize videos of light propagating through a scene from novel, moving camera viewpoints. Our approach relies on a new ultrafast imaging setup to capture a first-of-its kind, multi-viewpoint video dataset with picosecond-level temporal resolution. Combined with this dataset, we introduce an efficient neural volume rendering framework based on the transient field. This field is defined as a mapping from a 3D point and 2D direction to a high-dimensional, discrete-time signal that represents time-varying radiance at ultrafast timescales. Rendering with transient fields naturally accounts for effects due to the finite speed of light, including viewpoint-dependent appearance changes caused by light propagation delays to the camera. We render a range of complex effects, including scattering, specular reflection, refraction, and diffraction. Additionally, we demonstrate removing viewpoint-dependent propagation delays using a time warping procedure, rendering of relativistic effects, and video synthesis of direct and global components of light transport.

* Project page: https://anaghmalik.com/FlyingWithPhotons/

Via

Access Paper or Ask Questions

Efficient Test-Time Adaptation of Vision-Language Models

Mar 27, 2024
Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, Eric Xing

Test-time adaptation with pre-trained vision-language models has attracted increasing attention for tackling distribution shifts during the test time. Though prior studies have achieved very promising performance, they involve intensive computation which is severely unaligned with test-time adaptation. We design TDA, a training-free dynamic adapter that enables effective and efficient test-time adaptation with vision-language models. TDA works with a lightweight key-value cache that maintains a dynamic queue with few-shot pseudo labels as values and the corresponding test-sample features as keys. Leveraging the key-value cache, TDA allows adapting to test data gradually via progressive pseudo label refinement which is super-efficient without incurring any backpropagation. In addition, we introduce negative pseudo labeling that alleviates the adverse impact of pseudo label noises by assigning pseudo labels to certain negative classes when the model is uncertain about its pseudo label predictions. Extensive experiments over two benchmarks demonstrate TDA's superior effectiveness and efficiency as compared with the state-of-the-art. The code has been released in \url{https://kdiaaa.github.io/tda/}.

* Accepted to CVPR 2024. The code has been released in \url{https://kdiaaa.github.io/tda/}

Via

Access Paper or Ask Questions

Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation

Apr 09, 2024
Sidra Aleem, Fangyijie Wang, Mayug Maniparambil, Eric Arazo, Julia Dietlmeier, Kathleen Curran, Noel E. O'Connor, Suzanne Little

The Segment Anything Model (SAM) and CLIP are remarkable vision foundation models (VFMs). SAM, a prompt driven segmentation model, excels in segmentation tasks across diverse domains, while CLIP is renowned for its zero shot recognition capabilities. However, their unified potential has not yet been explored in medical image segmentation. To adapt SAM to medical imaging, existing methods primarily rely on tuning strategies that require extensive data or prior prompts tailored to the specific task, making it particularly challenging when only a limited number of data samples are available. This work presents an in depth exploration of integrating SAM and CLIP into a unified framework for medical image segmentation. Specifically, we propose a simple unified framework, SaLIP, for organ segmentation. Initially, SAM is used for part based segmentation within the image, followed by CLIP to retrieve the mask corresponding to the region of interest (ROI) from the pool of SAM generated masks. Finally, SAM is prompted by the retrieved ROI to segment a specific organ. Thus, SaLIP is training and fine tuning free and does not rely on domain expertise or labeled data for prompt engineering. Our method shows substantial enhancements in zero shot segmentation, showcasing notable improvements in DICE scores across diverse segmentation tasks like brain (63.46%), lung (50.11%), and fetal head (30.82%), when compared to un prompted SAM. Code and text prompts will be available online.

Via

Access Paper or Ask Questions

Prompts As Programs: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

Apr 02, 2024
Tobias Schnabel, Jennifer Neville

Large language models (LLMs) can now handle longer and more complex inputs, which facilitate the use of more elaborate prompts. However, prompts often require some tuning to improve performance for deployment. Recent work has proposed automatic prompt optimization methods, but as prompt complexity and LLM strength increase, many prompt optimization techniques are no longer sufficient and a new approach is needed to optimize {\em meta prompt programs}. To address this, we introduce SAMMO, a framework for {\em compile-time} optimizations of metaprompt programs, which represent prompts as structured objects that allows for a rich set of transformations that can be searched over during optimization. We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression, across several different LLMs. We make all code available open-source at https://github.com/microsoft/sammo .

Via

Access Paper or Ask Questions

Active Learning for Control-Oriented Identification of Nonlinear Systems

Apr 13, 2024
Bruce D. Lee, Ingvar Ziemann, George J. Pappas, Nikolai Matni

Model-based reinforcement learning is an effective approach for controlling an unknown system. It is based on a longstanding pipeline familiar to the control community in which one performs experiments on the environment to collect a dataset, uses the resulting dataset to identify a model of the system, and finally performs control synthesis using the identified model. As interacting with the system may be costly and time consuming, targeted exploration is crucial for developing an effective control-oriented model with minimal experimentation. Motivated by this challenge, recent work has begun to study finite sample data requirements and sample efficient algorithms for the problem of optimal exploration in model-based reinforcement learning. However, existing theory and algorithms are limited to model classes which are linear in the parameters. Our work instead focuses on models with nonlinear parameter dependencies, and presents the first finite sample analysis of an active learning algorithm suitable for a general class of nonlinear dynamics. In certain settings, the excess control cost of our algorithm achieves the optimal rate, up to logarithmic factors. We validate our approach in simulation, showcasing the advantage of active, control-oriented exploration for controlling nonlinear systems.

Via

Access Paper or Ask Questions

PraFFL: A Preference-Aware Scheme in Fair Federated Learning

Apr 13, 2024
Rongguang Ye, Ming Tang

Fairness in federated learning has emerged as a critical concern, aiming to develop an unbiased model for any special group (e.g., male or female) of sensitive features. However, there is a trade-off between model performance and fairness, i.e., improving fairness will decrease model performance. Existing approaches have characterized such a trade-off by introducing hyperparameters to quantify client's preferences for fairness and model performance. Nevertheless, these methods are limited to scenarios where each client has only a single pre-defined preference. In practical systems, each client may simultaneously have multiple preferences for the model performance and fairness. The key challenge is to design a method that allows the model to adapt to diverse preferences of each client in real time. To this end, we propose a Preference-aware scheme in Fair Federated Learning paradigm (called PraFFL). PraFFL can adaptively adjust the model based on each client's preferences to meet their needs. We theoretically prove that PraFFL can provide the optimal model for client's arbitrary preferences. Experimental results show that our proposed PraFFL outperforms five existing fair federated learning algorithms in terms of the model's capability in adapting to clients' different preferences.

* 10 pages, 10 figures, and 1 table. This paper has been submitted to MobiHoc'24

Via

Access Paper or Ask Questions

RIS-Assisted OTFS Communications: Phase Configuration via Received Energy Maximization

Apr 12, 2024
Mohamad H. Dinan, Arman Farhang

In this paper, we explore the integration of two revolutionary technologies, reconfigurable intelligent surfaces (RISs) and orthogonal time frequency space (OTFS) modulation, to enhance high-speed wireless communications. We introduce a novel phase shift design algorithm for RIS-assisted OTFS, optimizing energy reception and channel gain in dynamic environments. The study evaluates the proposed approach in a downlink scenario, demonstrating significant performance improvements compared to benchmark schemes in the literature, particularly in terms of bit error rate (BER). Our results showcase the potential of RIS to enhance the system's performance. Specifically, our proposed phase shift design technique outperforms the benchmark solutions by over 4 dB. Furthermore, even greater gains can be obtained as the number of RIS elements increases.

* 6 pages (double column), 5 figures, conference paper

Via

Access Paper or Ask Questions

Swing-Up of a Weakly Actuated Double Pendulum via Nonlinear Normal Modes

Apr 12, 2024
Arne Sachtler, Davide Calzolari, Maximilian Raff, Annika Schmidt, Yannik P. Wotte, Cosimo Della Santina, C. David Remy, Alin Albu-Schäffer

We identify the nonlinear normal modes spawning from the stable equilibrium of a double pendulum under gravity, and we establish their connection to homoclinic orbits through the unstable upright position as energy increases. This result is exploited to devise an efficient swing-up strategy for a double pendulum with weak, saturating actuators. Our approach involves stabilizing the system onto periodic orbits associated with the nonlinear modes while gradually injecting energy. Since these modes are autonomous system evolutions, the required control effort for stabilization is minimal. Even with actuator limitations of less than 1% of the maximum gravitational torque, the proposed method accomplishes the swing-up of the double pendulum by allowing sufficient time.

* Preprint of a paper to appear at the European Control Conference (ECC) 2024 in Stockholm, Sweden

Via

Access Paper or Ask Questions

Test-Time Domain Generalization for Face Anti-Spoofing

Mar 28, 2024
Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Shouhong Ding, Lizhuang Ma

Face Anti-Spoofing (FAS) is pivotal in safeguarding facial recognition systems against presentation attacks. While domain generalization (DG) methods have been developed to enhance FAS performance, they predominantly focus on learning domain-invariant features during training, which may not guarantee generalizability to unseen data that differs largely from the source distributions. Our insight is that testing data can serve as a valuable resource to enhance the generalizability beyond mere evaluation for DG FAS. In this paper, we introduce a novel Test-Time Domain Generalization (TTDG) framework for FAS, which leverages the testing data to boost the model's generalizability. Our method, consisting of Test-Time Style Projection (TTSP) and Diverse Style Shifts Simulation (DSSS), effectively projects the unseen data to the seen domain space. In particular, we first introduce the innovative TTSP to project the styles of the arbitrarily unseen samples of the testing distribution to the known source space of the training distributions. We then design the efficient DSSS to synthesize diverse style shifts via learnable style bases with two specifically designed losses in a hyperspherical feature space. Our method eliminates the need for model updates at the test time and can be seamlessly integrated into not only the CNN but also ViT backbones. Comprehensive experiments on widely used cross-domain FAS benchmarks demonstrate our method's state-of-the-art performance and effectiveness.

* Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Via

Access Paper or Ask Questions