Abstract: This document consolidates publicly reported technical details about Meta's Llama 4 model family. It summarizes (i) the released variants (Scout and Maverick) and the broader herd context, including the previewed Behemoth teacher model; (ii) architectural characteristics beyond a high-level MoE description, covering the routed/shared-expert structure, early-fusion multimodality, and the long-context design elements reported for Scout (iRoPE and length generalization strategies); (iii) training disclosures spanning pre-training, mid-training for long-context extension, and post-training methodology (lightweight SFT, online RL, and lightweight DPO) as described in the release materials; (iv) developer-reported benchmark results for both base and instruction-tuned checkpoints; and (v) practical deployment constraints observed across major serving environments, including provider-specific context limits and quantization packaging. The manuscript also summarizes licensing obligations relevant to redistribution and derivative naming, and reviews publicly described safeguards and evaluation practices. The goal is to provide a compact technical reference for researchers and practitioners who need precise, source-backed facts about Llama 4.




Abstract: A payment card (such as a debit or credit card) is one of the most convenient payment methods for purchasing goods and services. Hundreds of millions of card transactions take place across the globe every day, generating a massive volume of transaction data. These data render a holistic view of cardholder-merchant interactions and contain insights that can benefit various applications, such as payment fraud detection and merchant recommendation. However, utilizing these insights often requires additional information about merchants that is missing from the perspective of the data owner (i.e., the payment company). For example, payment companies do not know the exact type of product a merchant sells. Collecting merchant attributes from external sources for commercial purposes can be expensive. Motivated by this limitation, we aim to infer latent merchant attributes from transaction data. As a proof of concept, we concentrate on restaurants and present a framework for inferring their cuisine types from transaction data. The proposed framework consists of three steps. In the first step, we generate cuisine labels for a limited number of restaurants via weak supervision. In the second step, we extract a wide variety of statistical features and neural embeddings from the restaurant transactions. In the third step, we use deep neural networks (DNNs) to infer the cuisine types of the remaining restaurants. The proposed framework achieved 76.2% accuracy in classifying US restaurants. To the best of our knowledge, this is the first framework to infer the cuisine types of restaurants using transaction data as the only source.
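
As an illustration of the first step only, the sketch below shows one way weak supervision could assign cuisine labels to a subset of restaurants. The abstract does not say which weak signals are used, so everything here is an assumption: the idea of matching keywords in the merchant name, the keyword lists in `CUISINE_KEYWORDS`, and the function name `weak_label` are all hypothetical. The rule abstains (returns `None`) when no keyword matches or when two cuisines tie, which leaves those restaurants to be classified in the third step.

```python
from collections import Counter
from typing import Optional

# Hypothetical keyword lists (an assumption for illustration); the abstract
# does not describe the weak-supervision signals the authors actually use.
CUISINE_KEYWORDS = {
    "italian": ("pizza", "pasta", "trattoria"),
    "mexican": ("taco", "burrito", "cantina"),
    "japanese": ("sushi", "ramen", "izakaya"),
}


def weak_label(merchant_name: str) -> Optional[str]:
    """Return a cuisine label when one cuisine clearly wins, else None (abstain)."""
    name = merchant_name.lower()
    hits = Counter()
    for cuisine, keywords in CUISINE_KEYWORDS.items():
        # Count how many of this cuisine's keywords occur in the name.
        count = sum(1 for kw in keywords if kw in name)
        if count:
            hits[cuisine] = count
    if not hits:
        return None  # no evidence: leave the restaurant unlabeled
    ranked = hits.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top two cuisines: abstain
    return ranked[0][0]


if __name__ == "__main__":
    for name in ("Luigi's Pizza & Pasta", "Taco Sushi Fusion", "Main Street Diner"):
        print(name, "->", weak_label(name))
```

Abstaining on ambiguous names trades coverage for label precision, which suits a setting where only a limited number of restaurants need labels before a classifier handles the rest.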