Abstract: A major bottleneck in multi-agent AI is the lack of simulatable models for the bottom-up emergence of social structure under realistic behavioral constraints. Similarly, many foundational theories in economics and sociology, including the concepts of "institutions" and "norms", tend to describe social structures post hoc, often relying on implicit assumptions of shared culture, morality, or symbolic agreement. These concepts are often treated as primitives rather than reconstructed from agent-level behavior, leaving both their origins and their operational definitions under-specified. To address this, we propose a three-stage bottom-up framework: Reciprocal Dynamics, capturing individual-level reciprocal exchanges; Norm Stabilization, the consolidation of shared expectations; and Institutional Construction, the externalization of stable patterns into scalable structures. By grounding social emergence in agent-level reciprocity, our framework enables the systematic exploration of how moral, cultural, and institutional structures emerge from cognitively minimal interactions.
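The sketch below is one possible reading of the three stages as an agent-based simulation loop, not the paper's implementation; the class names (Agent, Institution), the ledger-based reciprocity bookkeeping, and the norm-stabilization criterion are all illustrative assumptions.

```python
# Minimal, hypothetical sketch of the three-stage framework.
import random
from collections import defaultdict

class Agent:
    def __init__(self, agent_id):
        self.id = agent_id
        self.ledger = defaultdict(float)  # Stage 1: per-partner reciprocity balance

    def exchange(self, partner, amount=1.0):
        """Reciprocal Dynamics: give to a partner and record the outstanding balance."""
        self.ledger[partner.id] += amount
        partner.ledger[self.id] -= amount

def norm_stabilized(agents, tolerance=0.5):
    """Norm Stabilization (toy criterion): expectations count as 'shared' once most
    pairwise balances stay within a small tolerance, i.e. giving is reliably returned."""
    balances = [b for a in agents for b in a.ledger.values()]
    return bool(balances) and sum(abs(b) < tolerance for b in balances) / len(balances) > 0.9

class Institution:
    """Institutional Construction: externalize a stable pattern as an explicit rule
    that new agents can adopt without re-learning it from pairwise interaction."""
    def __init__(self, rule):
        self.rule = rule

agents = [Agent(i) for i in range(10)]
for _ in range(200):                 # repeated reciprocal exchanges
    a, b = random.sample(agents, 2)
    a.exchange(b)
    b.exchange(a)                    # the returned favor keeps balances near zero

if norm_stabilized(agents):
    institution = Institution(rule="return what you receive")
```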
Abstract: A key challenge in multi-agent AI is modeling social cooperation under realistic behavioral constraints. Many foundational concepts in economics and ethics, such as "trust" or "morality", are often defined informally, without operational criteria or cognitive grounding, which limits their testability and their implementation in artificial agents. Drawing on converging empirical evidence from primate behavior, infant cognition, and economic anthropology, we propose a conceptual framework composed of three cognitively minimal mechanisms: individual recognition, reciprocal credence, and cost-return sensitivity. This framework reframes trust as a graded cognitive expectation, providing a simulatable basis for reciprocal exchange in artificial agents and enabling the bottom-up emergence of scalable cooperation and institutional dynamics.
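As a hedged illustration rather than the paper's implementation, the following sketch encodes the three mechanisms in a single agent class; the names TrustAgent and credence, and the linear update rule, are assumptions, with trust represented as a graded expectation in [0, 1].

```python
# Hypothetical sketch of the three cognitively minimal mechanisms.
class TrustAgent:
    def __init__(self, learning_rate=0.2, prior=0.5):
        self.credence = {}      # individual recognition: per-partner memory keyed by ID
        self.lr = learning_rate
        self.prior = prior

    def will_cooperate(self, partner_id, cost, benefit):
        """Cost-return sensitivity: cooperate only if the graded expectation that the
        partner reciprocates makes the expected return exceed the upfront cost."""
        p = self.credence.get(partner_id, self.prior)   # reciprocal credence (graded, not binary)
        return p * benefit > cost

    def observe(self, partner_id, reciprocated):
        """Update reciprocal credence from experienced outcomes with this partner."""
        p = self.credence.get(partner_id, self.prior)
        target = 1.0 if reciprocated else 0.0
        self.credence[partner_id] = p + self.lr * (target - p)

agent = TrustAgent()
agent.observe("B", reciprocated=True)
agent.observe("B", reciprocated=True)
print(agent.will_cooperate("B", cost=1.0, benefit=2.0))  # True once credence in B is high enough
```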
Abstract: LLMs have attracted attention across various research domains due to their exceptional performance on a wide range of complex tasks. Refined methods for evaluating the capabilities of LLMs are therefore needed to determine the tasks and responsibilities they should undertake. Our study mainly discusses how LLMs, as useful tools, should be effectively assessed. We propose a two-stage framework, from "core ability" to "agent", clearly explaining how LLMs can be applied based on their specific capabilities, along with the evaluation methods in each stage. Core ability refers to the capabilities that LLMs need in order to generate high-quality natural language texts. Once LLMs are confirmed to possess core abilities, they can solve real-world and complex tasks as agents. In the "core ability" stage, we discuss the reasoning ability, societal impact, and domain knowledge of LLMs. In the "agent" stage, we demonstrate embodied action, planning, and tool learning in LLM agent applications. Finally, we examine the challenges currently confronting evaluation methods for LLMs, as well as directions for future development.
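To make the two-stage structure concrete, here is a minimal, purely illustrative harness skeleton; the suite names, the run_suite callable, and the core-ability threshold are hypothetical placeholders, not benchmarks or criteria proposed in the study.

```python
# Hypothetical two-stage evaluation skeleton: "core ability" -> "agent".
CORE_ABILITY_SUITES = ["reasoning", "societal_impact", "domain_knowledge"]
AGENT_SUITES = ["embodied_action", "planning", "tool_learning"]

def evaluate(model, suites, run_suite):
    """Run each evaluation suite and return a name -> score mapping."""
    return {name: run_suite(model, name) for name in suites}

def two_stage_evaluation(model, run_suite, core_threshold=0.7):
    # Stage 1: confirm core abilities before treating the model as an agent.
    core_scores = evaluate(model, CORE_ABILITY_SUITES, run_suite)
    if min(core_scores.values()) < core_threshold:
        return {"stage": "core_ability", "scores": core_scores, "agent_ready": False}
    # Stage 2: only then assess agent-level behavior on complex, real-world tasks.
    agent_scores = evaluate(model, AGENT_SUITES, run_suite)
    return {"stage": "agent", "scores": {**core_scores, **agent_scores}, "agent_ready": True}
```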