Boston Medical Sciences, Tokyo, Japan
Abstract: Patients awaiting invasive procedures often have unanswered pre-procedural questions; however, time-pressured workflows and privacy constraints limit personalized counseling. We present LENOHA (Low Energy, No Hallucination, Leave No One Behind Architecture), a safety-first, local-first system that routes inputs with a high-precision sentence-transformer classifier and returns verbatim answers from a clinician-curated FAQ for clinical queries, eliminating free-text generation in the clinical path. We evaluated two domains (tooth extraction and gastroscopy) using expert-reviewed validation sets (n=400/domain) for thresholding and independent test sets (n=200/domain). Among the four encoders evaluated, E5-large-instruct (560M) achieved an overall accuracy of 0.983 (95% CI 0.964-0.991), AUC 0.996, and seven total errors, a performance statistically indistinguishable from GPT-4o on this task; Gemini made no errors on this test set. Energy logging shows that the non-generative clinical path consumes ~1.0 mWh per input versus ~168 mWh per small-talk reply from a local 8B SLM, a ~170x difference, while maintaining ~0.10 s latency on a single on-prem GPU. These results indicate that near-frontier discrimination is achievable while generation-induced errors are structurally avoided in the clinical path by returning vetted FAQ answers verbatim, supporting privacy, sustainability, and equitable deployment in bandwidth-limited environments.
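The routing step described above (a similarity classifier that returns a vetted FAQ answer verbatim when the match clears a threshold, and defers to a small-talk path otherwise) can be sketched as follows. This is a minimal illustration with mock embeddings, not the paper's implementation; the threshold value and all names are hypothetical, and in practice the embeddings would come from an encoder such as E5-large-instruct.

```python
import numpy as np

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def route(query_emb, faq_embs, faq_answers, threshold=0.85):
    """Return the verbatim FAQ answer if the best match clears the
    threshold; otherwise return None to hand off to the small-talk path.
    No text is generated on the clinical path."""
    sims = [cosine_similarity(query_emb, e) for e in faq_embs]
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return faq_answers[best]   # clinical path: verbatim, no generation
    return None                    # non-clinical: defer to the local SLM

# Toy demo with mock 3-d embeddings (real embeddings are ~1024-d).
faq_embs = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
faq_answers = ["Answer A", "Answer B"]
print(route(np.array([0.9, 0.1, 0.0]), faq_embs, faq_answers))  # Answer A
```

Because the clinical path is a nearest-neighbor lookup rather than generation, its per-input cost is dominated by a single encoder forward pass, which is consistent with the ~1.0 mWh vs. ~168 mWh gap reported above.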
Abstract: Every generation of mobile devices strives to capture video at higher resolution and frame rate than previous ones. This quality increase also requires additional power and computation to capture and encode high-quality media. We propose a method to reduce the overall power consumption for capturing high-quality videos on mobile devices. Using video frame interpolation (VFI), sensors can be driven at a lower frame rate, which reduces sensor power consumption. With modern RGB hybrid event-based vision sensors (EVS), event data can be used to guide the interpolation, leading to much higher-quality results. If applied naively, interpolation methods can be expensive and produce large amounts of intermediate data before the video is encoded. This paper proposes a video encoder that generates a bitstream for high frame rate video without explicit interpolation. The proposed method estimates encoded video data (notably motion vectors) rather than frames. Thus, an encoded video file can be produced directly without explicitly producing intermediate frames.
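The key idea above, estimating the encoder's motion-vector fields directly instead of synthesizing intermediate frames, can be illustrated with a toy sketch. This is an assumption-laden simplification, not the paper's method: it assumes each event already carries a per-event flow estimate (e.g. from some event-based optical-flow stage) and simply averages flow into encoder-style macroblocks; all names and the block size are illustrative.

```python
import numpy as np

def block_motion_vectors(events, shape, block=8, dt=1.0):
    """Toy per-block motion-vector estimate from event data.

    events: iterable of (x, y, vx, vy), where (vx, vy) is a hypothetical
    per-event flow estimate. Returns an array of shape
    (H/block, W/block, 2) that could be fed to an encoder's MV fields
    in place of vectors derived from explicitly interpolated frames."""
    h, w = shape
    mvs = np.zeros((h // block, w // block, 2))
    counts = np.zeros((h // block, w // block))
    for x, y, vx, vy in events:
        bx, by = int(x) // block, int(y) // block
        mvs[by, bx] += (vx * dt, vy * dt)   # accumulate displacement
        counts[by, bx] += 1
    nonzero = counts > 0
    mvs[nonzero] /= counts[nonzero][:, None]  # average per block
    return mvs

# Two events in the same 8x8 block, both moving 2 px/frame to the right.
mv = block_motion_vectors([(1, 1, 2.0, 0.0), (3, 3, 2.0, 0.0)], (8, 8))
print(mv[0, 0])  # [2. 0.]
```

The point of the sketch is only that motion vectors are produced without materializing interpolated frames, which is what avoids the intermediate-data cost mentioned in the abstract.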