Governments' net zero emission target aims at increasing the share of renewable energy sources as well as influencing the behaviours of consumers to support the cost-effective balancing of energy supply and demand. These will be achieved by the advanced information and control infrastructures of smart grids which allow the interoperability among various stakeholders. Under this circumstance, increasing number of consumers produce, store, and consume energy, giving them a new role of prosumers. The integration of prosumers and accommodation of incurred bidirectional flows of energy and information rely on two key factors: flexible structures of energy markets and intelligent operations of power systems. The blockchain and artificial intelligence (AI) are innovative technologies to fulfil these two factors, by which the blockchain provides decentralised trading platforms for energy markets and the AI supports the optimal operational control of power systems. This paper attempts to address how to incorporate the blockchain and AI in the smart grids for facilitating prosumers to participate in energy markets. To achieve this objective, first, this paper reviews how policy designs price carbon emissions caused by the fossil-fuel based generation so as to facilitate the integration of prosumers with renewable energy sources. Second, the potential structures of energy markets with the support of the blockchain technologies are discussed. Last, how to apply the AI for enhancing the state monitoring and decision making during the operations of power systems is introduced.
In a federated learning system, the clients, e.g. mobile devices and organization participants, usually have different personal preferences or behavior patterns, namely Non-IID data problems across clients. Clustered federated learning is to group users into different clusters that the clients in the same group will share the same or similar behavior patterns that are to satisfy the IID data assumption for most traditional machine learning algorithms. Most of the existing clustering methods in FL treat every client equally that ignores the different importance contributions among clients. This paper proposes a novel weighted client-based clustered FL algorithm to leverage the client's group and each client in a unified optimization framework. Moreover, the paper proposes convergence analysis to the proposed clustered FL method. The experimental analysis has demonstrated the effectiveness of the proposed method.
High-speed train (HST) communications with orthogonal frequency division multiplexing (OFDM) techniques have received significant attention in recent years. Besides, cell-free (CF) massive multiple-input multiple-output (MIMO) is considered a promising technology to achieve the ultimate performance limit. In this paper, we focus on the performance of CF massive MIMO-OFDM systems with both matched filter and large-scale fading decoding (LSFD) receivers in HST communications. HST communications with small cell and cellular massive MIMO-OFDM systems are also analyzed for comparison. Considering the bad effect of Doppler frequency offset (DFO) on system performance, exact closed-form expressions for uplink spectral efficiency (SE) of all systems are derived. According to the simulation results, we find that the CF massive MIMO-OFDM system with LSFD achieves both larger SE and lower SE drop percentages than other systems. In addition, increasing the number of access points (APs) and antennas per AP can effectively compensate for the performance loss from the DFO. Moreover, there is an optimal vertical distance between APs and HST to achieve the maximum SE.
Imbalanced Learning (IL) is an important problem that widely exists in data mining applications. Typical IL methods utilize intuitive class-wise resampling or reweighting to directly balance the training set. However, some recent research efforts in specific domains show that class-imbalanced learning can be achieved without class-wise manipulation. This prompts us to think about the relationship between the two different IL strategies and the nature of the class imbalance. Fundamentally, they correspond to two essential imbalances that exist in IL: the difference in quantity between examples from different classes as well as between easy and hard examples within a single class, i.e., inter-class and intra-class imbalance. Existing works fail to explicitly take both imbalances into account and thus suffer from suboptimal performance. In light of this, we present Duple-Balanced Ensemble, namely DUBE , a versatile ensemble learning framework. Unlike prevailing methods, DUBE directly performs inter-class and intra-class balancing without relying on heavy distance-based computation, which allows it to achieve competitive performance while being computationally efficient. We also present a detailed discussion and analysis about the pros and cons of different inter/intra-class balancing strategies based on DUBE . Extensive experiments validate the effectiveness of the proposed method. Code and examples are available at https://github.com/ICDE2022Sub/duplebalance.
imbalanced-ensemble, abbreviated as imbens, is an open-source Python toolbox for quick implementing and deploying ensemble learning algorithms on class-imbalanced data. It provides access to multiple state-of-art ensemble imbalanced learning (EIL) methods, visualizer, and utility functions for dealing with the class imbalance problem. These ensemble methods include resampling-based, e.g., under/over-sampling, and reweighting-based ones, e.g., cost-sensitive learning. Beyond the implementation, we also extend conventional binary EIL algorithms with new functionalities like multi-class support and resampling scheduler, thereby enabling them to handle more complex tasks. The package was developed under a simple, well-documented API design follows that of scikit-learn for increased ease of use. imbens is released under the MIT open-source license and can be installed from Python Package Index (PyPI). Source code, binaries, detailed documentation, and usage examples are available at https://github.com/ZhiningLiu1998/imbalanced-ensemble.
Offline reinforcement learning (RL) aims to learn the optimal policy from a pre-collected dataset without online interactions. Most of the existing studies focus on distributional shift caused by out-of-distribution actions. However, even in-distribution actions can raise serious problems. Since the dataset only contains limited information about the underlying model, offline RL is vulnerable to spurious correlations, i.e., the agent tends to prefer actions that by chance lead to high returns, resulting in a highly suboptimal policy. To address such a challenge, we propose a practical and theoretically guaranteed algorithm SCORE that reduces spurious correlations by combing an uncertainty penalty into policy evaluation. We show that this is consistent with the pessimism principle studied in theory, and the proposed algorithm converges to the optimal policy with a sublinear rate under mild assumptions. By conducting extensive experiments on existing benchmarks, we show that SCORE not only benefits from a solid theory but also obtains strong empirical results on a variety of tasks.
While diverse question answering (QA) datasets have been proposed and contributed significantly to the development of deep learning models for QA tasks, the existing datasets fall short in two aspects. First, we lack QA datasets covering complex questions that involve answers as well as the reasoning processes to get the answers. As a result, the state-of-the-art QA research on numerical reasoning still focuses on simple calculations and does not provide the mathematical expressions or evidences justifying the answers. Second, the QA community has contributed much effort to improving the interpretability of QA models. However, these models fail to explicitly show the reasoning process, such as the evidence order for reasoning and the interactions between different pieces of evidence. To address the above shortcomings, we introduce NOAHQA, a conversational and bilingual QA dataset with questions requiring numerical reasoning with compound mathematical expressions. With NOAHQA, we develop an interpretable reasoning graph as well as the appropriate evaluation metric to measure the answer quality. We evaluate the state-of-the-art QA models trained using existing QA datasets on NOAHQA and show that the best among them can only achieve 55.5 exact match scores, while the human performance is 89.7. We also present a new QA model for generating a reasoning graph where the reasoning graph metric still has a large gap compared with that of humans, e.g., 28 scores.
By constructing digital twins (DT) of an integrated energy system (IES), one can benefit from DT's predictive capabilities to improve coordinations among various energy converters, hence enhancing energy efficiency, cost savings and carbon emission reduction. This paper is motivated by the fact that practical IESs suffer from multiple uncertainty sources, and complicated surrounding environment. To address this problem, a novel DT-based day-ahead scheduling method is proposed. The physical IES is modelled as a multi-vector energy system in its virtual space that interacts with the physical IES to manipulate its operations. A deep neural network is trained to make statistical cost-saving scheduling by learning from both historical forecasting errors and day-ahead forecasts. Case studies of IESs show that the proposed DT-based method is able to reduce the operating cost of IES by 63.5%, comparing to the existing forecast-based scheduling methods. It is also found that both electric vehicles and thermal energy storages play proactive roles in the proposed method, highlighting their importance in future energy system integration and decarbonisation.
Distant supervision uses triple facts in knowledge graphs to label a corpus for relation extraction, leading to wrong labeling and long-tail problems. Some works use the hierarchy of relations for knowledge transfer to long-tail relations. However, a coarse-grained relation often implies only an attribute (e.g., domain or topic) of the distant fact, making it hard to discriminate relations based solely on sentence semantics. One solution is resorting to entity types, but open questions remain about how to fully leverage the information of entity types and how to align multi-granular entity types with sentences. In this work, we propose a novel model to enrich distantly-supervised sentences with entity types. It consists of (1) a pairwise type-enriched sentence encoding module injecting both context-free and -related backgrounds to alleviate sentence-level wrong labeling, and (2) a hierarchical type-sentence alignment module enriching a sentence with the triple fact's basic attributes to support long-tail relations. Our model achieves new state-of-the-art results in overall and long-tail performance on benchmarks.