In advanced manufacturing, the incorporation of sensing technology provides an opportunity to achieve efficient in-situ process monitoring using machine learning methods. Meanwhile, the advances of information technologies also enable a connected and decentralized environment for manufacturing systems, making different manufacturing units in the system collaborate more closely. In a decentralized manufacturing system, the involved units may fabricate same or similar products and deploy their own machine learning model for online process monitoring. However, due to the possible inconsistency of task progress during the operation, it is also common that some units are data-rich while some are data-poor. Thus, the learning progress of the machine learning-based process monitoring model for each unit may vary. Therefore, it is highly valuable to achieve efficient and secured knowledge sharing among the units in a decentralized manufacturing system. To realize this goal, this paper proposes a knowledge distillation-based information sharing (KD-IS) framework, which could distill informative knowledge from data-rich unit to improve the monitoring performance of data-poor unit. To validate the effectiveness of this method, a real-world case study is conducted in a connected fused filament fabrication (FFF)-based additive manufacturing (AM) platform. The experimental results show that the developed method is efficient in improving model monitoring performance at data-poor unit, with solid protection on potential data privacy.
The federated learning (FL) paradigm emerges to preserve data privacy during model training by only exposing clients' model parameters rather than original data. One of the biggest challenges in FL lies in the non-IID (not identical and independently distributed) data (a.k.a., data heterogeneity) distributed on clients. To address this challenge, various personalized FL (pFL) methods are proposed such as similarity-based aggregation and model decoupling. The former one aggregates models from clients of a similar data distribution. The later one decouples a neural network (NN) model into a feature extractor and a classifier. Personalization is captured by classifiers which are obtained by local training. To advance pFL, we propose a novel pFedSim (pFL based on model similarity) algorithm in this work by combining these two kinds of methods. More specifically, we decouple a NN model into a personalized feature extractor, obtained by aggregating models from similar clients, and a classifier, which is obtained by local training and used to estimate client similarity. Compared with the state-of-the-art baselines, the advantages of pFedSim include: 1) significantly improved model accuracy; 2) low communication and computation overhead; 3) a low risk of privacy leakage; 4) no requirement for any external public information. To demonstrate the superiority of pFedSim, extensive experiments are conducted on real datasets. The results validate the superb performance of our algorithm which can significantly outperform baselines under various heterogeneous data settings.
Communication compression is a common technique in distributed optimization that can alleviate communication overhead by transmitting compressed gradients and model parameters. However, compression can introduce information distortion, which slows down convergence and incurs more communication rounds to achieve desired solutions. Given the trade-off between lower per-round communication costs and additional rounds of communication, it is unclear whether communication compression reduces the total communication cost. This paper explores the conditions under which unbiased compression, a widely used form of compression, can reduce the total communication cost, as well as the extent to which it can do so. To this end, we present the first theoretical formulation for characterizing the total communication cost in distributed optimization with communication compression. We demonstrate that unbiased compression alone does not necessarily save the total communication cost, but this outcome can be achieved if the compressors used by all workers are further assumed independent. We establish lower bounds on the communication rounds required by algorithms using independent unbiased compressors to minimize smooth convex functions, and show that these lower bounds are tight by refining the analysis for ADIANA. Our results reveal that using independent unbiased compression can reduce the total communication cost by a factor of up to $\Theta(\sqrt{\min\{n, \kappa\}})$, where $n$ is the number of workers and $\kappa$ is the condition number of the functions being minimized. These theoretical findings are supported by experimental results.
We propose a learning-based method to recover normals, specularity, and roughness from a single diffuse image of a material, using microgeometry appearance as our primary cue. Previous methods that work on single images tend to produce over-smooth outputs with artifacts, operate at limited resolution, or train one model per class with little room for generalization. Previous methods that work on single images tend to produce over-smooth outputs with artifacts, operate at limited resolution, or train one model per class with little room for generalization. In contrast, in this work, we propose a novel capture approach that leverages a generative network with attention and a U-Net discriminator, which shows outstanding performance integrating global information at reduced computational complexity. We showcase the performance of our method with a real dataset of digitized textile materials and show that a commodity flatbed scanner can produce the type of diffuse illumination required as input to our method. Additionally, because the problem might be illposed -more than a single diffuse image might be needed to disambiguate the specular reflection- or because the training dataset is not representative enough of the real distribution, we propose a novel framework to quantify the model's confidence about its prediction at test time. Our method is the first one to deal with the problem of modeling uncertainty in material digitization, increasing the trustworthiness of the process and enabling more intelligent strategies for dataset creation, as we demonstrate with an active learning experiment.
While the world has been combating COVID-19 for over three years, an ongoing "Infodemic" due to the spread of fake news regarding the pandemic has also been a global issue. The existence of the fake news impact different aspect of our daily lives, including politics, public health, economic activities, etc. Readers could mistake fake news for real news, and consequently have less access to authentic information. This phenomenon will likely cause confusion of citizens and conflicts in society. Currently, there are major challenges in fake news research. It is challenging to accurately identify fake news data in social media posts. In-time human identification is infeasible as the amount of the fake news data is overwhelming. Besides, topics discussed in fake news are hard to identify due to their similarity to real news. The goal of this paper is to identify fake news on social media to help stop the spread. We present Deep Learning approaches and an ensemble approach for fake news detection. Our detection models achieved higher accuracy than previous studies. The ensemble approach further improved the detection performance. We discovered feature differences between fake news and real news items. When we added them into the sentence embeddings, we found that they affected the model performance. We applied a hybrid method and built models for recognizing topics from posts. We found half of the identified topics were overlapping in fake news and real news, which could increase confusion in the population.
Large Language Models (LLMs) have made significant progress in recent years, achieving remarkable results in question-answering tasks (QA). However, they still face two major challenges: hallucination and outdated information after the training phase. These challenges take center stage in critical domains like climate change, where obtaining accurate and up-to-date information from reliable sources in a limited time is essential and difficult. To overcome these barriers, one potential solution is to provide LLMs with access to external, scientifically accurate, and robust sources (long-term memory) to continuously update their knowledge and prevent the propagation of inaccurate, incorrect, or outdated information. In this study, we enhanced GPT-4 by integrating the information from the Sixth Assessment Report of the Intergovernmental (IPCC AR6), the most comprehensive, up-to-date, and reliable source in this domain. We present our conversational AI prototype, available at www.chatclimate.ai and demonstrate its ability to answer challenging questions accurately in three different QA scenarios: asking from 1) GPT-4, 2) chatClimate, and 3) hybrid chatClimate. The answers and their sources were evaluated by our team of IPCC authors, who used their expert knowledge to score the accuracy of the answers from 1 (very-low) to 5 (very-high). The evaluation showed that the hybrid chatClimate provided more accurate answers, highlighting the effectiveness of our solution. This approach can be easily scaled for chatbots in specific domains, enabling the delivery of reliable and accurate information.
Learning high quality sentence embeddings from dialogues has drawn increasing attentions as it is essential to solve a variety of dialogue-oriented tasks with low annotation cost. However, directly annotating and gathering utterance relationships in conversations are difficult, while token-level annotations, \eg, entities, slots and templates, are much easier to obtain. General sentence embedding methods are usually sentence-level self-supervised frameworks and cannot utilize token-level extra knowledge. In this paper, we introduce Template-aware Dialogue Sentence Embedding (TaDSE), a novel augmentation method that utilizes template information to effectively learn utterance representation via self-supervised contrastive learning framework. TaDSE augments each sentence with its corresponding template and then conducts pairwise contrastive learning over both sentence and template. We further enhance the effect with a synthetically augmented dataset that enhances utterance-template relation, in which entity detection (slot-filling) is a preliminary step. We evaluate TaDSE performance on five downstream benchmark datasets. The experiment results show that TaDSE achieves significant improvements over previous SOTA methods, along with a consistent Intent Classification task performance improvement margin. We further introduce a novel analytic instrument of Semantic Compression method, for which we discover a correlation with uniformity and alignment. Our code will be released soon.
In most urban and suburban areas, pole-like structures such as tree trunks or utility poles are ubiquitous. These structural landmarks are very useful for the localization of autonomous vehicles given their geometrical locations in maps and measurements from sensors. In this work, we aim at creating an accurate map for autonomous vehicles or robots with pole-like structures as the dominant localization landmarks, hence called pole-map. In contrast to the previous pole-based mapping or localization methods, we exploit the semantics of pole-like structures. Specifically, semantic segmentation is achieved by a new mask-range transformer network in a mask-classfication paradigm. With the semantics extracted for the pole-like structures in each frame, a multi-layer semantic pole-map is created by aggregating the detected pole-like structures from all frames. Given the semantic pole-map, we propose a semantic particle-filtering localization scheme for vehicle localization. Theoretically, we have analyzed why the semantic information can benefit the particle-filter localization, and empirically it is validated on the public SemanticKITTI dataset that the particle-filtering localization with semantics achieves much better performance than the counterpart without semantics when each particle's odometry prediction and/or the online observation is subject to uncertainties at significant levels.
Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of information, making binary judgments of quality inadequate, and (2) human evaluation is time-consuming and costly. In this paper, we introduce FActScore (Factual precision in Atomicity Score), a new evaluation that breaks a generation into a series of atomic facts and computes the percentage of atomic facts supported by a reliable knowledge source. We conduct an extensive human evaluation to obtain FActScores of people biographies generated by several state-of-the-art commercial LMs -- InstructGPT, ChatGPT, and the retrieval-augmented PerplexityAI -- and report new analysis demonstrating the need for such a fine-grained score (e.g., ChatGPT only achieves 58%). Since human evaluation is costly, we also introduce an automated model that estimates FActScore, using retrieval and a strong language model, with less than a 2% error rate. Finally, we use this automated metric to evaluate 6,500 generations from a new set of 13 recent LMs that would have cost $26K if evaluated by humans, with various findings: GPT-4 and ChatGPT are more factual than public models, and Vicuna and Alpaca are some of the best public models.
Latest developments in computer hardware, sensor technologies, and artificial intelligence can make virtual reality (VR) and virtual spaces an important part of human everyday life. Eye tracking offers not only a hands-free way of interaction but also the possibility of a deeper understanding of human visual attention and cognitive processes in VR. Despite these possibilities, eye-tracking data also reveal privacy-sensitive attributes of users when it is combined with the information about the presented stimulus. To address these possibilities and potential privacy issues, in this survey, we first cover major works in eye tracking, VR, and privacy areas between the years 2012 and 2022. While eye tracking in the VR part covers the complete pipeline of eye-tracking methodology from pupil detection and gaze estimation to offline use and analyses, as for privacy and security, we focus on eye-based authentication as well as computational methods to preserve the privacy of individuals and their eye-tracking data in VR. Later, taking all into consideration, we draw three main directions for the research community by mainly focusing on privacy challenges. In summary, this survey provides an extensive literature review of the utmost possibilities with eye tracking in VR and the privacy implications of those possibilities.