Katsuki Fujisawa

Toward Ultra-Long-Horizon Sequential Model Editing

Jan 30, 2026

Mingda Liu, Zhenghan Zhu, Ze'an Miao, Katsuki Fujisawa

Abstract: Model editing has emerged as a practical approach for mitigating factual errors and outdated knowledge in large language models (LLMs). Among existing methods, the Locate-and-Edit (L&E) paradigm is the dominant framework: it locates MLP parameters implicated in expressing a target fact, and then performs a localized update to rewrite that fact. However, long sequences of edits often trigger abrupt model collapse in L&E beyond a critical point. We empirically identify a strong correlation between collapse and explosive growth of edited MLP weight norms, and formally prove that commonly used L&E update rules can induce exponential norm growth across sequential edits in the absence of explicit norm control. To address this issue, we propose Norm-Anchor Scaling (NAS), a plug-and-play norm-constrained strategy. Across extensive experiments, NAS delays the collapse point of representative L&E algorithms by more than 4 times and yields a 72.2% average relative gain in editing performance, requiring only a single additional line of code and incurring negligible computational overhead.
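The norm-growth argument in this abstract can be illustrated with a toy sketch. The abstract does not give the NAS formula, so everything below is an assumption: the rank-one update in `apply_edit` is a hypothetical stand-in for an L&E update rule, and `rescale_to_anchor` shows only one plausible form of norm control (rescaling the edited matrix back to its pre-edit Frobenius norm), not the authors' method.

```python
import math


def frobenius_norm(w):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in w for x in row))


def apply_edit(w, key, growth):
    """Hypothetical rank-one edit: W <- W + (growth * W key) key^T.

    This is a toy stand-in for an L&E update, chosen so that the
    component of W along `key` grows by (1 + growth) per edit.
    """
    u = [growth * sum(wij * kj for wij, kj in zip(row, key)) for row in w]
    return [[wij + ui * kj for wij, kj in zip(row, key)]
            for row, ui in zip(w, u)]


def rescale_to_anchor(w, anchor_norm):
    """Assumed norm control: rescale W so its norm equals the anchor norm."""
    norm = frobenius_norm(w)
    if norm == 0.0:
        return [row[:] for row in w]
    scale = anchor_norm / norm
    return [[x * scale for x in row] for row in w]


w0 = [[1.0, 2.0], [3.0, 4.0]]
anchor = frobenius_norm(w0)   # norm of the unedited weights
key = [1.0, 0.0]              # unit-length toy key vector

free, anchored = w0, w0
for _ in range(100):
    free = apply_edit(free, key, growth=0.1)
    # The extra rescaling step is the only difference between the two runs.
    anchored = rescale_to_anchor(apply_edit(anchored, key, growth=0.1), anchor)

print("uncontrolled norm:", frobenius_norm(free))
print("anchored norm:    ", frobenius_norm(anchored))
```

In this toy setting the uncontrolled norm grows geometrically with the number of edits, while the rescaled matrix stays at the anchor norm by construction. Whether such rescaling preserves editing quality is exactly what the paper evaluates; the sketch says nothing about that.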

More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization

Dec 31, 2025

Yuma Ichikawa, Yoshihiko Fujisawa, Yudai Fujimoto, Akira Sakai, Katsuki Fujisawa

Abstract: For extreme low-bit quantization of large language models (LLMs), Double Binary Factorization (DBF) is attractive as it enables efficient inference without sacrificing accuracy. However, the scaling parameters of DBF are too restrictive; after factoring out signs, all rank components share the same magnitude profile, resulting in performance saturation. We propose Multi-envelope DBF (MDBF), which retains a shared pair of 1-bit sign bases but replaces the single envelope with a rank-$l$ envelope. By sharing sign matrices among envelope components, MDBF effectively maintains a binary carrier and utilizes the limited memory budget for magnitude expressiveness. We also introduce a closed-form initialization and an alternating refinement method to optimize MDBF. Across the LLaMA and Qwen families, MDBF improves perplexity and zero-shot accuracy over previous binary formats at matched bits per weight while preserving the same deployment-friendly inference primitive.

* 14 pages, 2 figures

Enhancing Quantum-ready QUBO-based Suppression for Object Detection with Appearance and Confidence Features

Feb 05, 2025

Keiichiro Yamamura, Toru Mitsutake, Hiroki Ishikura, Daiki Kusuhara, Akihiro Yoshida, Katsuki Fujisawa

Abstract: Quadratic Unconstrained Binary Optimization (QUBO)-based suppression in object detection is known to outperform conventional Non-Maximum Suppression (NMS), especially in crowded scenes where NMS may suppress (partially) occluded true positives with low confidence scores. Although existing QUBO formulations are less likely to miss occluded objects than NMS, there is room for improvement because they naively consider confidence scores and pairwise scores based on spatial overlap between predictions. This study proposes new QUBO formulations that aim to distinguish whether the overlap between predictions is due to the occlusion of objects or due to redundancy in prediction, i.e., multiple predictions for a single object. The proposed QUBO formulation integrates two features into the pairwise score of the existing QUBO formulation: i) the appearance feature calculated by the image similarity metric and ii) the product of confidence scores. These features are derived from the hypothesis that redundant predictions share a similar appearance feature and (partially) occluded objects have low confidence scores, respectively. The proposed methods demonstrate significant advancement over state-of-the-art QUBO-based suppression without a notable increase in runtime, achieving up to 4.54 points improvement in mAP and 9.89 points gain in mAR.
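The idea of a QUBO-based suppression step can be sketched on a toy example. The abstract does not state the exact objective, so the form below is an assumption: keep the subset of detections that maximizes the summed confidence minus pairwise penalties. The detections, IoU values, similarity values, and the factor 2 in the appearance-weighted penalty are all hypothetical, and only the appearance feature is illustrated (the confidence-product feature is omitted). The solver is a brute-force enumeration, usable only for a handful of boxes.

```python
from itertools import product


def solve_qubo_suppression(conf, penalty):
    """Brute-force maximizer of sum_i conf[i]*x_i - sum_{i<j} penalty[i][j]*x_i*x_j.

    x_i = 1 keeps detection i, x_i = 0 suppresses it.
    """
    n = len(conf)
    best_x, best_val = None, float("-inf")
    for x in product((0, 1), repeat=n):
        val = sum(conf[i] * x[i] for i in range(n))
        val -= sum(penalty[i][j] * x[i] * x[j]
                   for i in range(n) for j in range(i + 1, n))
        if val > best_val:
            best_x, best_val = x, val
    return best_x


# Hypothetical scene: boxes 0 and 1 are redundant predictions of one object,
# box 2 is a partially occluded second object with a low confidence score.
conf = [0.9, 0.8, 0.4]
iou = [[0.0, 0.85, 0.5],
       [0.85, 0.0, 0.4],
       [0.5, 0.4, 0.0]]
sim = [[1.0, 0.9, 0.2],      # appearance similarity between box crops
       [0.9, 1.0, 0.2],
       [0.2, 0.2, 1.0]]

# Pairwise penalty from spatial overlap only.
overlap_only = iou
# Assumed appearance-aware penalty: overlap counts mostly when crops look alike.
appearance_aware = [[2.0 * iou[i][j] * sim[i][j] for j in range(3)]
                    for i in range(3)]

print(solve_qubo_suppression(conf, overlap_only))
print(solve_qubo_suppression(conf, appearance_aware))
```

With these made-up numbers, the overlap-only penalty keeps just box 0 and drops the occluded box 2, while the appearance-weighted penalty keeps boxes 0 and 2 and still suppresses the redundant box 1. This mirrors the distinction between occlusion and redundancy that the abstract describes, but it is not the paper's formulation.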

* 8 pages for main contents, 3 pages for appendix, 3 pages for reference

Enhancing Output Diversity Improves Conjugate Gradient-based Adversarial Attacks

Aug 07, 2024

Keiichiro Yamamura, Issa Oe, Hiroki Ishikura, Katsuki Fujisawa

Abstract: Deep neural networks are vulnerable to adversarial examples, and adversarial attacks that generate adversarial examples have been studied in this context. Existing studies imply that increasing the diversity of model outputs contributes to improving the attack performance. This study focuses on the Auto Conjugate Gradient (ACG) attack, which is inspired by the conjugate gradient method and has a high diversification performance. We hypothesized that increasing the distance between two consecutive search points would enhance the output diversity. To test our hypothesis, we propose Rescaling-ACG (ReACG), which automatically modifies the two components that significantly affect the distance between two consecutive search points, namely the search direction and step size. ReACG showed higher attack performance than ACG, and is particularly effective for ImageNet models with several classification classes. Experimental results show that the distance between two consecutive search points enhances the output diversity and may help develop new potent attacks. The code is available at https://github.com/yamamura-k/ReACG

* ICPRAI2024

Diversified Adversarial Attacks based on Conjugate Gradient Method

Jun 20, 2022

Keiichiro Yamamura, Haruki Sato, Nariaki Tateiwa, Nozomi Hata, Toru Mitsutake, Issa Oe, Hiroki Ishikura, Katsuki Fujisawa

Abstract: Deep learning models are vulnerable to adversarial examples, and adversarial attacks used to generate such examples have attracted considerable research interest. Although existing methods based on the steepest descent have achieved high attack success rates, ill-conditioned problems occasionally reduce their performance. To address this limitation, we utilize the conjugate gradient (CG) method, which is effective for this type of problem, and propose a novel attack algorithm inspired by the CG method, named the Auto Conjugate Gradient (ACG) attack. The results of large-scale evaluation experiments conducted on the latest robust models show that, for most models, ACG was able to find more adversarial examples with fewer iterations than the existing SOTA algorithm Auto-PGD (APGD). We investigated the difference in search performance between ACG and APGD in terms of diversification and intensification, and defined a measure called Diversity Index (DI) to quantify the degree of diversity. From the analysis of the diversity using this index, we show that the more diverse search of the proposed method remarkably improves its attack success rate.
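The motivation in this abstract, that steepest descent struggles on ill-conditioned problems where the conjugate gradient method does not, can be seen on a small quadratic. The sketch below is the classical linear CG method with a Fletcher-Reeves coefficient applied to a hypothetical 2x2 problem; it is not the ACG attack, which the abstract describes only as inspired by CG. The matrix, right-hand side, and function names are illustrative assumptions.

```python
def matvec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def minimize_quadratic(a, b, x0, steps, use_cg):
    """Minimize 0.5 * x^T A x - b^T x for symmetric positive definite A.

    use_cg=False gives steepest descent with exact line search;
    use_cg=True adds the conjugate direction correction.
    Returns the final iterate and the residual norm ||b - A x||.
    """
    x = list(x0)
    r = [bi - axi for bi, axi in zip(b, matvec(a, x))]  # negative gradient
    d = list(r)                                         # first search direction
    for _ in range(steps):
        rr = dot(r, r)
        if rr == 0.0:
            break
        ad = matvec(a, d)
        alpha = dot(r, d) / dot(d, ad)                  # exact line search
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r_new = [ri - alpha * adi for ri, adi in zip(r, ad)]
        beta = dot(r_new, r_new) / rr if use_cg else 0.0
        d = [rn + beta * di for rn, di in zip(r_new, d)]
        r = r_new
    return x, dot(r, r) ** 0.5


# Ill-conditioned toy problem: condition number 100.
A = [[1.0, 0.0], [0.0, 100.0]]
b = [1.0, 1.0]

x_sd, res_sd = minimize_quadratic(A, b, [0.0, 0.0], steps=2, use_cg=False)
x_cg, res_cg = minimize_quadratic(A, b, [0.0, 0.0], steps=2, use_cg=True)
print("steepest descent residual after 2 steps:", res_sd)
print("conjugate gradient residual after 2 steps:", res_cg)
```

On an n-dimensional quadratic, CG reaches the minimizer in at most n steps in exact arithmetic, so two steps suffice here, while steepest descent zigzags and is still far from the solution. Adversarial attack objectives are not quadratic, so this only illustrates the intuition behind preferring conjugate directions.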

Nested Subspace Arrangement for Representation of Relational Data

Jul 04, 2020

Nozomi Hata, Shizuo Kaji, Akihiro Yoshida, Katsuki Fujisawa

Abstract: Studies on acquiring appropriate continuous representations of discrete objects, such as graphs and knowledge base data, have been conducted by many researchers in the field of machine learning. In this study, we introduce Nested SubSpace (NSS) arrangement, a comprehensive framework for representation learning. We show that existing embedding techniques can be regarded as special cases of the NSS arrangement. Based on the concept of the NSS arrangement, we implement Disk-ANChor ARrangement (DANCAR), a representation learning method specialized for reproducing general graphs. Numerical experiments show that DANCAR successfully embeds WordNet in ${\mathbb R}^{20}$ with an F1 score of 0.993 in the reconstruction task. DANCAR is also suitable for visualization in understanding the characteristics of graphs.

* 11 pages, 13 figures, ICML 2020
