Sushma Rao

Apple Intelligence Foundation Language Models

Jul 29, 2024

Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu(+144 more)

Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used to train the model, the training process, how the models are optimized for inference, and the evaluation results. We highlight our focus on Responsible AI and how the principles are applied throughout the model development.

SemifreddoNets: Partially Frozen Neural Networks for Efficient Computer Vision Systems

Jun 12, 2020

Leo F Isikdogan, Bhavin V Nayak, Chyuan-Tyng Wu, Joao Peralta Moreira, Sushma Rao, Gilad Michael

Abstract: We propose a system composed of fixed-topology neural networks with partially frozen weights, named SemifreddoNets. SemifreddoNets work as fully pipelined hardware blocks that are optimized for an efficient hardware implementation. Those blocks freeze a certain portion of the parameters at every layer and replace the corresponding multipliers with fixed scalers. Fixing the weights reduces the silicon area, logic delay, and memory requirements, leading to significant savings in cost and power consumption. Unlike traditional layer-wise freezing approaches, SemifreddoNets strike a profitable trade-off between cost and flexibility by keeping some of the weights configurable at different scales and levels of abstraction in the model. Although fixing the topology and some of the weights somewhat limits the flexibility, we argue that the efficiency benefits of this strategy outweigh the advantages of a fully configurable model for many use cases. Furthermore, because our system uses repeatable blocks, it can adjust model complexity without requiring any hardware change. The hardware implementation of SemifreddoNets provides up to an order of magnitude reduction in silicon area and power consumption compared to an equivalent implementation on a general-purpose accelerator.

A Machine Learning Imaging Core using Separable FIR-IIR Filters

Jan 02, 2020

Masayoshi Asama, Leo F. Isikdogan, Sushma Rao, Bhavin V. Nayak, Gilad Michael

Abstract: We propose fixed-function neural network hardware that is designed to perform pixel-to-pixel image transformations in a highly efficient way. We use a fully trainable, fixed-topology neural network to build a model that can perform a wide variety of image processing tasks. Our model uses compressed skip lines and hybrid FIR-IIR blocks to reduce the latency and hardware footprint. Our proposed Machine Learning Imaging Core, dubbed MagIC, uses a silicon area of ~3 mm^2 (in TSMC 16 nm), which is orders of magnitude smaller than a comparable pixel-wise dense prediction model. MagIC requires no DDR bandwidth, no SRAM, and practically no external memory. Each MagIC core consumes 56 mW (215 mW max power) at 500 MHz and achieves an energy-efficient throughput of 23 TOPS/W/mm^2. MagIC can be used as a multi-purpose image processing block in an imaging pipeline, approximating compute-heavy image processing applications, such as image deblurring, denoising, and colorization, within the power and silicon area limits of mobile devices.

VisionISP: Repurposing the Image Signal Processor for Computer Vision Applications

Nov 14, 2019

Chyuan-Tyng Wu, Leo F. Isikdogan, Sushma Rao, Bhavin Nayak, Timo Gerasimow, Aleksandar Sutic, Liron Ain-kedem, Gilad Michael

Abstract: Traditional image signal processors (ISPs) are primarily designed and optimized to improve the image quality perceived by humans. However, optimal perceptual image quality does not always translate into optimal performance for computer vision applications. We propose a set of methods, which we collectively call VisionISP, to repurpose the ISP for machine consumption. VisionISP significantly reduces data transmission needs by reducing the bit-depth and resolution while preserving the relevant information. The blocks in VisionISP are simple, content-aware, and trainable. Experimental results show that VisionISP boosts the performance of a subsequent computer vision system trained to detect objects in an autonomous driving setting. The results demonstrate the potential and the practicality of VisionISP for computer vision applications.

* IEEE International Conference on Image Processing (ICIP), 2019, pp. 4624-4628

Automatic ISP image quality tuning using non-linear optimization

Feb 24, 2019

Jun Nishimura, Timo Gerasimow, Sushma Rao, Aleksandar Sutic, Chyuan-Tyng Wu, Gilad Michael

Abstract: An Image Signal Processor (ISP) comprises various blocks that reconstruct image sensor raw data into the final image consumed by the human visual system or by computer vision applications. Each block typically has many tuning parameters due to the complexity of the operation. These need to be hand-tuned by Image Quality (IQ) experts, which takes a considerable amount of time. In this paper, we present an automatic IQ tuning method using nonlinear optimization and automatic reference generation algorithms. The proposed method can produce high-quality results in minutes, compared with the weeks that hand tuning by IQ experts requires. In addition, the proposed method can work with any algorithm without being aware of its specific implementation. It was found successful on multiple different processing blocks such as noise reduction, demosaic, and sharpening.

* 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 2471-2475 (5 pages)
