This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital. We define the Factuality Issue as the probability of LLMs to produce content inconsistent with established facts. We first delve into the implications of these inaccuracies, highlighting the potential consequences and challenges posed by factual errors in LLM outputs. Subsequently, we analyze the mechanisms through which LLMs store and process facts, seeking the primary causes of factual errors. Our discussion then transitions to methodologies for evaluating LLM factuality, emphasizing key metrics, benchmarks, and studies. We further explore strategies for enhancing LLM factuality, including approaches tailored for specific domains. We focus two primary LLM configurations standalone LLMs and Retrieval-Augmented LLMs that utilizes external data, we detail their unique challenges and potential enhancements. Our survey offers a structured guide for researchers aiming to fortify the factual reliability of LLMs.
As a common visual problem, co-saliency detection within a single image does not attract enough attention and yet has not been well addressed. Existing methods often follow a bottom-up strategy to infer co-saliency in an image, where salient regions are firstly detected using visual primitives such as color and shape, and then grouped and merged into a co-saliency map. However, co-saliency is intrinsically perceived in a complex manner with bottom-up and top-down strategies combined in human vision. To deal with this problem, a novel end-to-end trainable network is proposed in this paper, which includes a backbone net and two branch nets. The backbone net uses ground-truth masks as top-down guidance for saliency prediction, while the two branch nets construct triplet proposals for feature organization and clustering, which drives the network to be sensitive to co-salient regions in a bottom-up way. To evaluate the proposed method, we construct a new dataset of 2,019 nature images with co-saliency in each image. Experimental results show that the proposed method achieves a state-of-the-art accuracy with a running speed of 28fps.
Lane detection in driving scenes is an important module for autonomous vehicles and advanced driver assistance systems. In recent years, many sophisticated lane detection methods have been proposed. However, most methods focus on detecting the lane from one single image, and often lead to unsatisfactory performance in handling some extremely-bad situations such as heavy shadow, severe mark degradation, serious vehicle occlusion, and so on. In fact, lanes are continuous line structures on the road. Consequently, the lane that cannot be accurately detected in one current frame may potentially be inferred out by incorporating information of previous frames. To this end, we investigate lane detection by using multiple frames of a continuous driving scene, and propose a hybrid deep architecture by combining the convolutional neural network (CNN) and the recurrent neural network (RNN). Specifically, information of each frame is abstracted by a CNN block, and the CNN features of multiple continuous frames, holding the property of time-series, are then fed into the RNN block for feature learning and lane prediction. Extensive experiments on two large-scale datasets demonstrate that, the proposed method outperforms the competing methods in lane detection, especially in handling difficult situations.