Abstract:Many recent reasoning gains in large language models can be explained as distribution sharpening: biasing generation toward high-likelihood trajectories already supported by the pretrained model, rather than modifying its weights. A natural formalization is the sequence-level power distribution $π_α(y\mid x)\propto p_θ(y\mid x)^α$ ($α>1$), which concentrates mass on whole sequences instead of adjusting token-level temperature. Prior work shows that Metropolis--Hastings (MH) sampling from this distribution recovers strong reasoning performance, but at order-of-magnitude inference slowdowns. We introduce Power-SMC, a training-free Sequential Monte Carlo scheme that targets the same objective while remaining close to standard decoding latency. Power-SMC advances a small particle set in parallel, corrects importance weights token-by-token, and resamples when necessary, all within a single GPU-friendly batched decode. We prove that temperature $τ=1/α$ is the unique prefix-only proposal minimizing incremental weight variance, interpret residual instability via prefix-conditioned Rényi entropies, and introduce an exponent-bridging schedule that improves particle stability without altering the target. On MATH500, Power-SMC matches or exceeds MH power sampling while reducing latency from $16$--$28\times$ to $1.4$--$3.3\times$ over baseline decoding.
Abstract:In this paper, we study a latticized optimization problem with fuzzy relational inequality constraints where the feasible region is formed as the intersection of two inequality fuzzy systems and Sugeno-Weber family of t-norms is considered as fuzzy composition. Sugeno-Weber family of t-norms and t-conorms is one of the most applied one in various fuzzy modelling problems. This family of t-norms and t-conorms was suggested by Weber for modeling intersection and union of fuzzy sets. Also, the t-conorms were suggested as addition rules by Sugeno for so-called alpha-fuzzy measures. The resolution of the feasible region of the problem is firstly investigated when it is defined with max-Sugeno-Weber composition and a necessary and sufficient condition is presented for determining the feasibility. Then, based on some theoretical properties of the problem, an algorithm is presented for solving this nonlinear problem. It is proved that the algorithm can find the exact optimal solution and an example is presented to illustrate the proposed algorithm.




Abstract:Understanding emotions in videos is a challenging task. However, videos contain several modalities which make them a rich source of data for machine learning and deep learning tasks. In this work, we aim to improve video sentiment classification by focusing on two key aspects: the video itself, the accompanying text, and the acoustic features. To address the limitations of relying on large labeled datasets, we are developing a method that utilizes clustering-based semi-supervised pre-training to extract meaningful representations from the data. This pre-training step identifies patterns in the video and text data, allowing the model to learn underlying structures and relationships without requiring extensive labeled information at the outset. Once these patterns are established, we fine-tune the system in a supervised manner to classify the sentiment expressed in videos. We believe that this multi-modal approach, combining clustering with supervised fine-tuning, will lead to more accurate and insightful sentiment classification, especially in cases where labeled data is limited.