Abstract:The task of scholar name disambiguation is crucial in various real-world scenarios, including bibliometric-based candidate evaluation for awards, application material anti-fraud measures, and more. Despite significant advancements, current methods face limitations due to the complexity of heterogeneous data, often necessitating extensive human intervention. This paper proposes a novel approach by leveraging search-enhanced language models across multiple languages to improve name disambiguation. By utilizing the powerful query rewriting, intent recognition, and data indexing capabilities of search engines, our method can gather richer information for distinguishing between entities and extracting profiles, resulting in a more comprehensive data dimension. Given the strong cross-language capabilities of large language models(LLMs), optimizing enhanced retrieval methods with this technology offers substantial potential for high-efficiency information retrieval and utilization. Our experiments demonstrate that incorporating local languages significantly enhances disambiguation performance, particularly for scholars from diverse geographic regions. This multi-lingual, search-enhanced methodology offers a promising direction for more efficient and accurate active scholar name disambiguation.
Abstract:Large-scale online ride-sharing platforms have substantially transformed our lives by reallocating transportation resources to alleviate traffic congestion and promote transportation efficiency. An efficient fleet management strategy not only can significantly improve the utilization of transportation resources but also increase the revenue and customer satisfaction. It is a challenging task to design an effective fleet management strategy that can adapt to an environment involving complex dynamics between demand and supply. Existing studies usually work on a simplified problem setting that can hardly capture the complicated stochastic demand-supply variations in high-dimensional space. In this paper we propose to tackle the large-scale fleet management problem using reinforcement learning, and propose a contextual multi-agent reinforcement learning framework including two concrete algorithms, namely contextual deep Q-learning and contextual multi-agent actor-critic, to achieve explicit coordination among a large number of agents adaptive to different contexts. We show significant improvements of the proposed framework over state-of-the-art approaches through extensive empirical studies.
Abstract:Population migration is valuable information which leads to proper decision in urban-planning strategy, massive investment, and many other fields. For instance, inter-city migration is a posterior evidence to see if the government's constrain of population works, and inter-community immigration might be a prior evidence of real estate price hike. With timely data, it is also impossible to compare which city is more favorable for the people, suppose the cities release different new regulations, we could also compare the customers of different real estate development groups, where they come from, where they probably will go. Unfortunately these data was not available. In this paper, leveraging the data generated by positioning team in Didi, we propose a novel approach that timely monitoring population migration from community scale to provincial scale. Migration can be detected as soon as in a week. It could be faster, the setting of a week is for statistical purpose. A monitoring system is developed, then applied nation wide in China, some observations derived from the system will be presented in this paper. This new method of migration perception is origin from the insight that nowadays people mostly moving with their personal Access Point (AP), also known as WiFi hotspot. Assume that the ratio of AP moving to the migration of population is constant, analysis of comparative population migration would be feasible. More exact quantitative research would also be done with few sample research and model regression. The procedures of processing data includes many steps: eliminating the impact of pseudo-migration AP, for instance pocket WiFi, and second-hand traded router; distinguishing moving of population with moving of companies; identifying shifting of AP by the finger print clusters, etc..