We present an end-to-end reinforcement learning framework for the Assignment Problem, in which multiple tasks are mapped to a group of workers subject to multiple constraints. Tasks and workers have time constraints, and assigning a worker to a task incurs a cost. Each worker can perform multiple tasks until it exhausts its allowed time units (capacity). We train a reinforcement learning agent to find near-optimal solutions by minimizing the total assignment cost while satisfying the hard constraints, and we optimize the model parameters with proximal policy optimization. The reward is computed as the negative of the assignment cost. The model generates a sequence of actions in real time, each corresponding to the assignment of a task to a worker, and does not need to be retrained when the dynamic state of the environment changes. Using the same framework, we also demonstrate results on bin packing and the capacitated vehicle routing problem. On large problem instances, our approach outperforms the MIP and CP-SAT solvers in Google OR-Tools in terms of both solution quality and computation time.
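To make the problem setting concrete, the following minimal sketch shows one way the sequential assignment loop could look: infeasible workers are masked out to enforce the capacity constraint, and each step contributes the negative assignment cost as reward. It is an illustration under stated assumptions, not the framework described above. The function names (`feasible_mask`, `greedy_policy`, `rollout`) and the toy data are hypothetical, and a simple greedy rule stands in for the policy that would be trained with proximal policy optimization.

```python
import numpy as np


def feasible_mask(remaining_capacity, task_duration):
    """Mark workers that still have enough time units for the task."""
    return remaining_capacity >= task_duration


def greedy_policy(cost_row, mask):
    """Hypothetical stand-in for the trained policy: cheapest feasible worker."""
    if not mask.any():
        raise ValueError("no worker has enough remaining capacity")
    return int(np.argmin(np.where(mask, cost_row, np.inf)))


def rollout(cost, durations, capacity, policy=greedy_policy):
    """Assign tasks one at a time; return assignments and total reward."""
    remaining = np.asarray(capacity, dtype=float).copy()
    assignments, total_reward = [], 0.0
    for task, duration in enumerate(durations):
        mask = feasible_mask(remaining, duration)  # hard capacity constraint
        worker = policy(cost[task], mask)
        remaining[worker] -= duration
        total_reward += -float(cost[task][worker])  # reward = negative cost
        assignments.append(worker)
    return assignments, total_reward


# Toy instance (assumed data): cost[task][worker], three tasks, two workers.
cost = np.array([[1.0, 5.0], [2.0, 1.0], [1.0, 4.0]])
assignments, total_reward = rollout(cost, durations=[2, 2, 2], capacity=[3, 4])
print(assignments, total_reward)
```

In this toy instance the first worker runs out of capacity after one task, so the mask forces the remaining tasks onto the second worker even where that is more expensive. A learned policy would replace `greedy_policy` and could trade off early choices against later feasibility.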
In this paper, we present a novel approach to identifying linked fraudulent activities, or actors sharing similar attributes, using a Graph Convolutional Network (GCN). These linked activities can be represented as graphs that capture abstract concepts such as relationships and interactions, which makes GCNs well suited to identifying the graph edges that link fraudulent nodes. Traditional approaches have two drawbacks. Community detection requires strong links between fraudulent attempts, such as shared attributes, in order to find communities. Supervised solutions require large amounts of training data, which may not be available in fraud scenarios, and work best when only a binary separation between fraudulent and non-fraudulent activities is needed. Our approach overcomes these drawbacks: GCNs learn similarities between fraudulent nodes to identify clusters of similar attempts, and they require a much smaller dataset to train. We demonstrate our results on linked accounts with both strong and weak links, identifying fraud rings with high confidence. Our approach outperforms label propagation community detection and supervised GBT algorithms in terms of solution quality and computation time.
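As a rough illustration of how a graph convolution can surface similarity between linked nodes, the sketch below implements a single layer with the commonly used symmetric normalization, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), and scores node pairs by the cosine similarity of their embeddings. The abstract does not specify the architecture, so the propagation rule, the cosine scoring, the function names, and the hand-picked untrained weights are all assumptions made for this example.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with self-loops added."""
    a_tilde = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def gcn_layer(a_hat, features, weights):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_hat @ features @ weights, 0.0)


def link_scores(embeddings):
    """Cosine similarity between every pair of node embeddings."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)
    return unit @ unit.T


# Toy graph (assumed data): accounts 0 and 1 are linked, account 2 is isolated.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])
features = np.eye(3)                      # one-hot node features
weights = np.array([[1.0, 0.0],           # hand-picked, untrained weights
                    [1.0, 0.0],
                    [0.0, 1.0]])

a_hat = normalize_adjacency(adj)
embeddings = gcn_layer(a_hat, features, weights)
scores = link_scores(embeddings)
print(scores.round(2))
```

In a trained model the weights would be learned from labeled fraud examples, and pairs with high similarity scores would be candidates for membership in the same fraud ring.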