Yang Yan

School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China

ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases

Jun 28, 2023
Jiaxi Cui, Zongjian Li, Yang Yan, Bohua Chen, Li Yuan

Large Language Models (LLMs) have shown the potential to revolutionize natural language processing tasks in various domains, sparking great interest in vertical-specific large models. However, unlike proprietary models such as BloombergGPT and FinGPT, which have leveraged their unique data accumulations to make strides in the finance domain, there have been few comparable large language models in the Chinese legal domain to facilitate its digital transformation. In this paper, we propose an open-source legal large language model named ChatLaw. Because data quality is critical, we carefully designed a legal-domain fine-tuning dataset. Additionally, to overcome the problem of model hallucinations in legal data screening during reference data retrieval, we introduce a method that combines vector database retrieval with keyword retrieval, effectively reducing the inaccuracy of relying solely on vector database retrieval. Furthermore, we propose a self-attention method to enhance the ability of large models to overcome errors present in reference data, further mitigating model hallucinations at the model level and improving the problem-solving capabilities of large models. We have also open-sourced our model and part of the data at https://github.com/PKU-YuanGroup/ChatLaw.
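
The hybrid retrieval idea can be illustrated with a minimal sketch. This is a hypothetical Python example, not the paper's actual implementation: the function name, the fusion weight, and the keyword-overlap scoring are all assumptions. It blends cosine-similarity scores from a vector index with simple keyword-overlap scores, so a statute missed by one signal can still be surfaced by the other.

```python
import numpy as np

def hybrid_retrieve(query_vec, query_terms, doc_vecs, doc_terms, alpha=0.6, top_k=5):
    """Blend dense (vector) similarity with sparse keyword overlap.

    query_vec : (d,) embedding of the query
    doc_vecs  : (n, d) embeddings of candidate statutes/cases
    query_terms, doc_terms : sets of keywords for the query and each document
    alpha     : weight of the dense score (assumed value, tune on held-out data)
    """
    # Dense score: cosine similarity between the query and each document embedding.
    dense = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    # Sparse score: fraction of query keywords that appear in each document.
    sparse = np.array([
        len(query_terms & terms) / max(len(query_terms), 1) for terms in doc_terms
    ])
    # Weighted fusion; documents strong on either signal rank high.
    score = alpha * dense + (1 - alpha) * sparse
    return np.argsort(-score)[:top_k]
```

In practice the keyword side would more likely use a legal-term dictionary or BM25 rather than raw overlap; the fusion step stays the same.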

SINCERE: Sequential Interaction Networks representation learning on Co-Evolving RiEmannian manifolds

May 06, 2023
Junda Ye, Zhongbao Zhang, Li Sun, Yang Yan, Feiyang Wang, Fuxin Ren

Sequential interaction networks (SIN) have been commonly adopted in many applications such as recommendation systems, search engines and social networks to describe the mutual influence between users and items/products. Efforts on representing SIN have mainly focused on capturing the dynamics of networks in Euclidean space, and recently plenty of work has extended to hyperbolic geometry for implicit hierarchical learning. Previous approaches that learn the embedding trajectories of users and items achieve promising results. However, a range of fundamental issues remain open. For example, is it appropriate to place user and item nodes in one identical space regardless of their inherent discrepancy? Instead of residing in a single fixed-curvature space, how will the representation spaces evolve when new interactions occur? To explore these issues for sequential interaction networks, we propose SINCERE, a novel method representing Sequential Interaction Networks on Co-Evolving RiEmannian manifolds. SINCERE not only takes the user and item embedding trajectories in their respective spaces into account, but also emphasizes how the spaces themselves evolve, i.e., how curvature changes over time. Specifically, we introduce a fresh cross-geometry aggregation which allows us to propagate information across different Riemannian manifolds without breaking conformal invariance, and a curvature estimator which is carefully designed to predict global curvatures effectively from the current local Ricci curvatures. Extensive experiments on several real-world datasets demonstrate the promising performance of SINCERE over state-of-the-art sequential interaction prediction methods.

* Accepted by ACM The Web Conference 2023 (WWW) 
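
As a rough illustration of the curvature-estimation idea only (a sketch under assumptions, not SINCERE's actual estimator): one can compute a coarse Forman-Ricci proxy for each edge from node degrees, then summarize the local values into a single bounded global curvature per interaction window. The scaling constant and the tanh aggregation below are illustrative choices.

```python
import numpy as np

def forman_curvature(degrees, edges):
    """Coarse local curvature proxy per edge: 4 - deg(u) - deg(v).

    degrees : dict mapping node id -> degree
    edges   : list of (u, v) pairs observed in the current interaction window
    """
    return np.array([4 - degrees[u] - degrees[v] for u, v in edges], dtype=float)

def estimate_global_curvature(local_curvatures, scale=0.1):
    """Map the mean local curvature to a bounded global curvature.

    tanh keeps the estimate in (-scale, scale); negative values suggest a
    hyperbolic space, positive values a spherical one. `scale` is an assumption.
    """
    return scale * np.tanh(local_curvatures.mean())

# Toy usage: a small star-shaped interaction snapshot.
edges = [(0, 1), (0, 2), (0, 3)]
degrees = {0: 3, 1: 1, 2: 1, 3: 1}
kappa = estimate_global_curvature(forman_curvature(degrees, edges))
print(f"estimated global curvature: {kappa:.4f}")
```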

Prediction of superconducting properties of materials based on machine learning models

Nov 06, 2022
Jie Hu, Yongquan Jiang, Yang Yan, Houchen Zuo

The application of superconducting materials is becoming more and more widespread. Traditionally, the discovery of new superconducting materials relies on the experience of experts and a large number of "trial and error" experiments, which not only increases the cost of experiments but also prolongs the period of discovering new superconducting materials. In recent years, machine learning has been increasingly applied to materials science. Building on this, this manuscript proposes the use of an XGBoost model to identify superconductors, the first application of a deep forest model to predict the critical temperature of superconductors, the first application of deep forest to predict the band gap of materials, and the application of a new sub-network model to predict the Fermi energy level of materials. Compared with similar work known to us, all of the above models reach state-of-the-art performance. Finally, this manuscript uses the above models to search the COD public dataset and identifies 50 candidate superconducting materials with possible critical temperatures greater than 90 K.
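
A minimal sketch of the superconductor-identification step, assuming a matrix of composition-derived descriptors and binary labels. The random data, feature count, and hyperparameters are placeholders standing in for the manuscript's real dataset and settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from xgboost import XGBClassifier

# X: composition-derived descriptors (e.g., mean atomic mass, valence counts);
# y: 1 if the material is a known superconductor, 0 otherwise. Both are placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))
y = rng.integers(0, 2, size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = XGBClassifier(
    n_estimators=500, max_depth=6, learning_rate=0.05,
    subsample=0.8, eval_metric="logloss",
)
clf.fit(X_tr, y_tr)
print("F1 on held-out split:", f1_score(y_te, clf.predict(X_te)))
```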

IDPL: Intra-subdomain adaptation adversarial learning segmentation method based on Dynamic Pseudo Labels

Oct 07, 2022
Xuewei Li, Weilun Zhang, Mankun Zhao, Ming Li, Yang Yan, Jian Yu

Unsupervised domain adaptation (UDA) has been applied to image semantic segmentation to address the problem of domain shift. However, for some difficult categories with poor recognition accuracy, segmentation results remain unsatisfactory. To this end, this paper proposes an intra-subdomain adaptation adversarial learning segmentation method based on dynamic pseudo labels (IDPL). The whole process consists of three steps. First, an instance-level dynamic pseudo-label generation module is proposed, which fuses class-matching information from global classes and local instances to adaptively generate an optimal threshold for each class and obtain high-quality pseudo labels. Second, a subdomain classifier module based on instance confidence is constructed, which dynamically divides the target domain into easy and difficult subdomains according to the relative proportion of easy and difficult instances. Finally, a self-attention-based subdomain adversarial learning module is proposed. It uses multi-head self-attention to set the easy and difficult subdomains against each other at the class level with the help of the generated high-quality pseudo labels, focusing on mining the features of difficult categories in high-entropy regions of target-domain images. This promotes class-level conditional distribution alignment between the subdomains and improves segmentation performance on difficult categories. Experimental results show that, on the difficult categories, IDPL improves significantly over other recent mainstream methods.

* Lecture Notes in Computer Science (LNCS) proceedings of ICONIP 2022  
* Accepted at The 29th International Conference on Neural Information Processing (ICONIP 2022) 
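
The per-class adaptive thresholding in the first step can be sketched as follows. This is a simplified PyTorch illustration under assumptions; IDPL's actual fusion of global-class and local-instance information is more involved, and the base threshold and floor values here are placeholders.

```python
import torch

def dynamic_pseudo_labels(probs, base_threshold=0.9, floor=0.5):
    """Generate pseudo labels with a per-class adaptive confidence threshold.

    probs : (N, C, H, W) softmax outputs of the segmentation network on
            target-domain images.
    Classes predicted less confidently on average get a lower threshold, so
    rare/difficult classes are not filtered out entirely. Ignored pixels are
    marked with 255, the usual "ignore" index in semantic segmentation.
    """
    conf, pred = probs.max(dim=1)                      # (N, H, W)
    num_classes = probs.shape[1]
    thresholds = torch.full((num_classes,), base_threshold)
    for c in range(num_classes):
        mask = pred == c
        if mask.any():
            # Scale the base threshold by the class's mean confidence.
            mean_conf = conf[mask].mean()
            thresholds[c] = torch.clamp(base_threshold * mean_conf, min=floor)
    pseudo = pred.clone()
    pseudo[conf < thresholds[pred]] = 255              # drop low-confidence pixels
    return pseudo, thresholds
```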

Hardness prediction of age-hardening aluminum alloy based on ensemble learning

Jun 16, 2022
Zuo Houchen, Jiang Yongquan, Yang Yan, Liu Baoying, Hu Jie

With the rapid development of artificial intelligence, the combination of materials databases and machine learning has driven the progress of materials informatics. Because aluminum alloys are widely used in many fields, predicting their properties is of practical significance. In this work, data for Al-Cu-Mg-X (X: Zn, Zr, etc.) alloys are used: the composition and aging conditions (time and temperature) serve as inputs to predict hardness. An ensemble learning solution based on automated machine learning and an attention mechanism introduced into a deep-neural-network secondary learner are proposed, respectively. The experimental results show that selecting the correct secondary learner can further improve the prediction accuracy of the model. This manuscript introduces the attention mechanism to improve the deep-neural-network secondary learner and obtains a fusion model with better performance. The R-squared of the best model is 0.9697 and the MAE is 3.4518 HV.
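
A stripped-down sketch of the stacking idea, in which base regressors' predictions are fused by a small secondary learner that learns attention-like weights over the base models. The architecture, base models, hyperparameters, and random data below are hypothetical, not the fusion model reported above.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Placeholder data: rows = alloy samples, columns = composition + aging time/temp.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12)).astype(np.float32)
y = rng.normal(loc=120.0, scale=20.0, size=500).astype(np.float32)  # hardness (HV)

# Level 0: two base regressors. For brevity we predict on the training set;
# proper stacking would use out-of-fold predictions.
base_models = [GradientBoostingRegressor(), RandomForestRegressor(n_estimators=200)]
base_preds = np.column_stack(
    [m.fit(X, y).predict(X) for m in base_models]
).astype(np.float32)

# Level 1: a tiny secondary learner producing softmax "attention" weights over
# the base predictions, conditioned on the input features.
class AttentionStacker(nn.Module):
    def __init__(self, n_features, n_models):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, n_models)
        )

    def forward(self, x, preds):
        weights = torch.softmax(self.scorer(x), dim=-1)   # (batch, n_models)
        return (weights * preds).sum(dim=-1)              # weighted fusion

model = AttentionStacker(X.shape[1], len(base_models))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xb, pb, yb = map(torch.from_numpy, (X, base_preds, y))
for _ in range(200):                                      # short training loop
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(xb, pb), yb)
    loss.backward()
    opt.step()
```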

InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Mar 08, 2022
Liwen Wang, Rumei Li, Yang Yan, Yuanmeng Yan, Sirui Wang, Wei Wu, Weiran Xu

Recently, prompt-based methods have achieved significant performance in few-shot learning scenarios by bridging the gap between language model pre-training and fine-tuning for downstream tasks. However, existing prompt templates are mostly designed for sentence-level tasks and are inappropriate for sequence labeling objectives. To address this issue, we propose a multi-task instruction-based generative framework, named InstructionNER, for low-resource named entity recognition. Specifically, we reformulate the NER task as a generation problem, which enriches source sentences with task-specific instructions and answer options, and then infers the entities and their types in natural language. We further propose two auxiliary tasks, entity extraction and entity typing, which enable the model to capture more boundary information about entities and deepen its understanding of entity type semantics, respectively. Experimental results show that our method consistently outperforms other baselines on five datasets in few-shot settings.

* Work in progress 
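
The reformulation of NER as generation can be illustrated with a hypothetical prompt template and a simple parser for the generated answer. The wording of the instruction and the output format below are assumptions, not the exact templates used in the paper.

```python
def build_prompt(sentence, entity_types):
    """Enrich a source sentence with a task instruction and answer options."""
    options = ", ".join(entity_types)
    return (
        f"Sentence: {sentence}\n"
        f"Instruction: please list all named entities in the sentence and their "
        f"types. Answer in the form 'entity is a type'.\n"
        f"Options: {options}\n"
        f"Answer:"
    )

def parse_answer(generated):
    """Turn a generated answer such as 'Paris is a location; UN is an organization'
    back into (entity, type) pairs."""
    pairs = []
    for chunk in generated.split(";"):
        chunk = chunk.strip()
        if " is a " in chunk:
            entity, etype = chunk.split(" is a ", 1)
            pairs.append((entity.strip(), etype.strip()))
    return pairs

prompt = build_prompt(
    "Yang Yan works at Southwest Jiaotong University in Chengdu.",
    ["person", "organization", "location"],
)
# The prompt would be fed to a seq2seq model; here we fake its output for illustration.
print(parse_answer(
    "Yang Yan is a person; Southwest Jiaotong University is an organization; "
    "Chengdu is a location"
))
```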

Jointly Adversarial Network to Wavelength Compensation and Dehazing of Underwater Images

Jul 12, 2019
Xueyan Ding, Yafei Wang, Yang Yan, Zheng Liang, Zetian Mi, Xianping Fu

Severe color casts, low contrast and blurriness of underwater images caused by light absorption and scattering make exploring underwater environments a difficult task. Different from most previous underwater image enhancement methods, which compute light attenuation along the object-camera path through a hazy image formation model, we propose a novel joint wavelength compensation and dehazing network (JWCDN) that simultaneously takes into account the wavelength attenuation along the surface-object path and the scattering along the object-camera path. By embedding a simplified underwater image formation model into a generative adversarial network, we can jointly estimate the transmission map, wavelength attenuation and background light via different network modules, and use the simplified underwater image formation model to recover degraded underwater images. In particular, a multi-scale densely connected encoder-decoder network is proposed to leverage features from multiple layers for estimating the transmission map. To further improve the recovered image, we use an edge-preserving network module to enhance its detail. Moreover, to train the proposed network, we propose a novel underwater image synthesis method that generates underwater images with the inherent optical properties of different water types. The synthesis method can simultaneously simulate the color, contrast and blurriness appearance of real-world underwater environments. Extensive experiments on synthetic and real-world underwater images demonstrate that the proposed method yields comparable or better results on both subjective and objective assessments, compared with several state-of-the-art methods.
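
The simplified image formation model such synthesis typically builds on is commonly written per color channel c as I_c = J_c * t_c + B_c * (1 - t_c), where the transmission t_c depends on a wavelength-dependent attenuation coefficient. A short sketch under assumptions (the attenuation coefficients below are illustrative, not measured values for any specific water type, and this is not the paper's synthesis pipeline):

```python
import numpy as np

def synthesize_underwater(clean_rgb, depth, background, beta=(0.35, 0.08, 0.04)):
    """Degrade a clean image with the simplified underwater formation model.

    clean_rgb  : (H, W, 3) float image in [0, 1]
    depth      : (H, W) object-camera distance in metres
    background : (3,) background (veiling) light per channel
    beta       : per-channel attenuation coefficients (R, G, B); red attenuates
                 fastest under water, so its coefficient is largest here.
    """
    beta = np.asarray(beta, dtype=float)
    # Wavelength-dependent transmission along the object-camera path.
    t = np.exp(-beta[None, None, :] * depth[..., None])          # (H, W, 3)
    # I = J * t + B * (1 - t): direct signal plus back-scattered light.
    return clean_rgb * t + np.asarray(background)[None, None, :] * (1.0 - t)
```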
