Yuta Baba

Point-MF: One-step Point Cloud Generation from a Single Image via Mean Flows

Apr 27, 2026

Yuta Baba, Keiji Yanai

Abstract: Single-image point cloud reconstruction must infer complete 3D geometry, including occluded parts, from a single RGB image. While diffusion-based reconstructors achieve high accuracy, they typically require many denoising iterations, resulting in slow and expensive inference. We propose Point-MF, a Mean-Flow-based framework for low-NFE single-image point cloud reconstruction that couples a Mean-Flow-compatible architecture with an auxiliary loss. Specifically, Point-MF operates directly in point-cloud space to learn the mean velocity field and enables one-step reconstruction with a single network function evaluation (1-NFE), without relying on VAE-based latent representations. To make Mean Flow effective under large interval jumps, Point-MF employs a Diffusion Transformer tailored to the Mean-Flow setting, conditioned on frozen DINOv3 image features via a lightweight token adapter and equipped with explicit interval/time conditioning. Moreover, we introduce Denoised Space Anchor, a set-distance auxiliary loss on the denoised-space estimate $x_\theta$ induced by the predicted velocity field, to stabilize large-step generation and reduce outliers and density artifacts. On ShapeNet-R2N2 and Pix3D, Point-MF strikes a strong balance between reconstruction quality and inference speed compared to multi-step diffusion baselines and competitive feedforward models, while generating high-quality point clouds with millisecond-level latency.

* 28 pages, 14 figures
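
The two ideas named in the abstract, one-step sampling from a mean velocity field and a set-distance loss on the denoised estimate, can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the update rule follows the common Mean Flow convention x_r = x_t - (t - r) * u(x_t, r, t), the set distance is taken to be a symmetric Chamfer distance (the abstract does not say which set distance is used), and `one_step_sample`, `chamfer_distance`, and the toy oracle velocity are hypothetical names.

```python
import random


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets (lists of 3-tuples).

    Assumption: the abstract only says "set-distance"; Chamfer is one common choice.
    """
    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def one_way(src, dst):
        # Mean squared distance from each source point to its nearest target point.
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def one_step_sample(mean_velocity, noise, r=0.0, t=1.0):
    """Jump from time t to time r in a single function evaluation.

    Assumed convention: x_r = x_t - (t - r) * u(x_t, r, t), with noise at t = 1.
    `mean_velocity` is a hypothetical stand-in for the trained network.
    """
    u = mean_velocity(noise, r, t)
    return [
        tuple(zi - (t - r) * ui for zi, ui in zip(z, v))
        for z, v in zip(noise, u)
    ]


# Toy check: on the straight path x_t = (1 - t) * x + t * noise, the mean
# velocity is exactly noise - x, so one step recovers the target shape.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
rng = random.Random(0)
noise = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in target]


def oracle_velocity(points, r, t):
    return [tuple(zi - xi for zi, xi in zip(z, x)) for z, x in zip(points, target)]


reconstruction = one_step_sample(oracle_velocity, noise)
anchor_loss = chamfer_distance(reconstruction, target)
```

With a learned network in place of the oracle, the one-step output would play the role of the denoised-space estimate, and a loss like `anchor_loss` would penalize outliers and uneven density relative to the ground-truth point set.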

Accurate and Scalable Matching of Translators to Displaced Persons for Overcoming Language Barriers

Nov 30, 2020

Divyansh Agarwal, Yuta Baba, Pratik Sachdeva, Tanya Tandon, Thomas Vetterli, Aziz Alghunaim

Abstract: Residents of developing countries are disproportionately susceptible to displacement as a result of humanitarian crises. During such crises, language barriers impede aid workers in providing services to those displaced. To build resilience, such services must be flexible and robust to a host of possible languages. Tarjimly aims to overcome the barriers by providing a platform capable of matching bilingual volunteers to displaced persons or aid workers in need of translating. However, Tarjimly's large pool of translators comes with the challenge of selecting the right translator per request. In this paper, we describe a machine learning system that matches translator requests to volunteers at scale. We demonstrate that a simple logistic regression, operating on easily computable features, can accurately predict and rank translator response. In deployment, this lightweight system matches 82% of requests with a median response time of 59 seconds, allowing aid workers to accelerate their services supporting displaced persons.

* Presented at NeurIPS 2020 Workshop on Machine Learning for the Developing World
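
The ranking step described in the abstract, scoring each volunteer with a logistic regression and contacting the most likely responders first, might look roughly like the sketch below. The feature names, weights, and function names are hypothetical placeholders, since the abstract does not list the actual features or coefficients.

```python
import math


def response_probability(weights, bias, features):
    """Logistic-regression estimate of the chance that a volunteer responds."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def rank_volunteers(weights, bias, volunteers, top_k):
    """Return the names of the top_k volunteers by predicted response probability."""
    scored = [
        (response_probability(weights, bias, feats), name)
        for name, feats in volunteers.items()
    ]
    scored.sort(reverse=True)  # highest predicted probability first
    return [name for _, name in scored[:top_k]]


# Hypothetical features: [past response rate, days since last activity (scaled)].
weights = [2.0, -1.0]
bias = 0.0
volunteers = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.5, 0.5],
}
shortlist = rank_volunteers(weights, bias, volunteers, top_k=2)
```

A linear model of this kind keeps scoring cheap enough to run over a large volunteer pool on every request, which fits the abstract's emphasis on a lightweight system.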
