Large language models (LLMs) have demonstrated strong capabilities across a wide range of tasks. However, when they are applied to the highly specialized, safety-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal tasks. To address this gap, we propose LawBench, a comprehensive evaluation benchmark. LawBench is designed to assess LLMs' legal capabilities precisely at three cognitive levels: (1) Legal knowledge memorization: whether LLMs can memorize the necessary legal concepts, articles and facts; (2) Legal knowledge understanding: whether LLMs can comprehend entities, events and relationships within legal text; (3) Legal knowledge applying: whether LLMs can properly use their legal knowledge and carry out the necessary reasoning steps to solve realistic legal tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label classification (SLC), multi-label classification (MLC), regression, extraction and generation. We perform extensive evaluations of 51 LLMs on LawBench, including 20 multilingual LLMs, 22 Chinese-oriented LLMs and 9 legal-specific LLMs. The results show that GPT-4 remains the best-performing LLM in the legal domain, surpassing the others by a significant margin. While fine-tuning LLMs on legal-specific text brings some improvement, we are still a long way from obtaining usable and reliable LLMs for legal tasks. All data, model predictions and evaluation code are released at https://github.com/open-compass/LawBench/. We hope this benchmark provides an in-depth understanding of LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.
A cooperative driving strategy is proposed that realizes dynamic, real-time assignment of the driving privilege and gradual handover of the driving privilege. The first issue in cooperative driving is driving privilege assignment based on the risk level. Risk assessment methods are presented for two typical dangerous scenarios: the car-following scenario and the cut-in scenario. Naturalistic driving data are used to study the behavioral characteristics of the driver. TTC (time to collision) is defined as the obvious risk measure, whereas the time margin (TM), i.e. the time remaining before the host vehicle has to brake under the assumption that the target vehicle is braking, is defined as the potential risk measure. A risk assessment algorithm is proposed based on the obvious risk and the potential risk. The naturalistic driving data are applied to verify the effectiveness of the risk assessment algorithm. The risk assessment algorithm is found to perform better than TTC in terms of the ROC (receiver operating characteristic). The second issue in cooperative driving is the gradual handover of the driving privilege. During the gradual handover, the vehicle is jointly controlled by the driver and the automated driving system. Non-cooperative MPC (model predictive control) is employed to resolve conflicts between the driver and the automated driving system. It is found that the Nash equilibrium of the non-cooperative MPC can be obtained with a non-iterative method. The gradual handover is realized by updating the confidence matrices. Simulation results show that the cooperative driving strategy can realize gradual handover of the driving privilege between the driver and the automated system and can dynamically assign the driving privilege in real time according to the risk level.
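As a rough illustration of the two risk measures named above, the following sketch computes TTC and one possible form of the time margin for a car-following pair. The abstract does not give formulas, so everything here is an assumption: the function names, the constant-deceleration model, and the stop-to-stop simplification used for TM are hypothetical and are not necessarily the definitions used in this work.

```python
import math


def time_to_collision(gap_m, host_speed_mps, target_speed_mps):
    """TTC: current gap divided by the closing speed.

    Returns infinity when the host is not closing in on the target.
    """
    closing_speed = host_speed_mps - target_speed_mps
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


def time_margin(gap_m, host_speed_mps, target_speed_mps,
                host_decel_mps2, target_decel_mps2):
    """Hypothetical TM: how long the host may delay braking if the
    target starts braking now.

    Assumption (not from the source): both vehicles brake at constant
    deceleration until standstill, and a collision is avoided when the
    host's travel distance does not exceed the gap plus the target's
    stopping distance.
    """
    if host_speed_mps <= 0:
        return math.inf
    target_stop_dist = target_speed_mps ** 2 / (2.0 * target_decel_mps2)
    host_brake_dist = host_speed_mps ** 2 / (2.0 * host_decel_mps2)
    # Distance the host can still cover at constant speed before braking.
    free_distance = gap_m + target_stop_dist - host_brake_dist
    return free_distance / host_speed_mps


# Same speed, so TTC is infinite, yet the potential risk is finite.
print(time_to_collision(30.0, 20.0, 20.0))
print(time_margin(30.0, 20.0, 20.0, 5.0, 5.0))
```

In the usage lines, both vehicles travel at the same speed, so TTC reports no obvious risk, while the time margin is still finite. This is the kind of case in which a potential risk measure adds information beyond TTC.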