Yongjae Lee

NFTs to MARS: Multi-Attention Recommender System for NFTs

Jun 13, 2023
Seonmi Kim, Youngbin Lee, Yejin Kim, Joohwan Hong, Yongjae Lee

Recommender systems have become essential tools for enhancing user experiences across various domains. While extensive research has been conducted on recommender systems for movies, music, and e-commerce, the rapidly growing and economically significant Non-Fungible Token (NFT) market remains underexplored. The unique characteristics and increasing prominence of the NFT market highlight the importance of developing tailored recommender systems to cater to its specific needs and unlock its full potential. In this paper, we examine the distinctive characteristics of NFTs and propose the first recommender system specifically designed to address the challenges of the NFT market. Specifically, we develop a Multi-Attention Recommender System for NFTs (NFT-MARS) with three key characteristics: (1) graph attention to handle sparse user-item interactions, (2) multi-modal attention to incorporate users' feature preferences, and (3) multi-task learning to consider the dual nature of NFTs as both artworks and financial assets. We demonstrate the effectiveness of NFT-MARS against various baseline models using actual NFT transaction data collected directly from the blockchain for four of the most popular NFT collections. The source code and data are available at https://anonymous.4open.science/r/RecSys2023-93ED.
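
To make two of the three components more concrete, here is a minimal PyTorch sketch of the multi-modal attention and the multi-task heads (the graph-attention part is omitted for brevity). All module and variable names are hypothetical illustrations of the general technique, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiModalAttention(nn.Module):
    """Attend over per-item modality embeddings (e.g. image, text,
    transaction features), conditioned on the user embedding."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)

    def forward(self, user, modal_feats):
        # user: (B, D); modal_feats: (B, M, D) for M modalities
        q = self.query(user).unsqueeze(1)                                # (B, 1, D)
        attn = F.softmax((q * modal_feats).sum(-1) / user.size(-1) ** 0.5, dim=-1)
        return (attn.unsqueeze(-1) * modal_feats).sum(1)                 # (B, D)

class NFTMarsSketch(nn.Module):
    """Two heads reflect the dual nature of NFTs: one scores the
    user-item interaction, the other regresses a price signal."""
    def __init__(self, dim):
        super().__init__()
        self.mm_attn = MultiModalAttention(dim)
        self.interact_head = nn.Linear(2 * dim, 1)
        self.price_head = nn.Linear(dim, 1)

    def forward(self, user, modal_feats):
        item = self.mm_attn(user, modal_feats)
        logit = self.interact_head(torch.cat([user, item], -1)).squeeze(-1)
        return logit, self.price_head(item).squeeze(-1)

def multitask_loss(logit, y_click, price_pred, y_price, lam=0.5):
    # Multi-task objective: interaction BCE plus lam-weighted price MSE;
    # lam is an assumed weighting hyperparameter.
    return (F.binary_cross_entropy_with_logits(logit, y_click)
            + lam * F.mse_loss(price_pred, y_price))

model = NFTMarsSketch(dim=64)
user = torch.randn(32, 64)        # 32 users, 64-dim embeddings
modal = torch.randn(32, 3, 64)    # 3 modalities per candidate NFT
logit, price = model(user, modal)
```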

Mean-Variance Efficient Collaborative Filtering for Stock Recommendation

Jun 11, 2023
Munki Chung, Yongjae Lee, Woo Chang Kim

The rise of FinTech has moved financial services onto online platforms, yet stock investment recommender systems have received limited attention compared to those in other industries. Personalized stock recommendations can significantly impact customer engagement and satisfaction. However, traditional investment recommendations focus on high-return stocks or highly diversified portfolios based on modern portfolio theory, often neglecting user preferences. On the other hand, collaborative filtering (CF) methods may not be directly applicable to stock recommendation either, because it is inappropriate to simply recommend stocks that users like. The key is to optimally blend users' preferences with portfolio theory. However, research on stock recommendation within the recommender system domain remains comparatively limited, and no existing model considers both the preferences of users and the risk-return characteristics of stocks. In this regard, we propose a mean-variance efficient collaborative filtering (MVECF) model for stock recommendation that considers both aspects. Our model is specifically designed to improve Pareto optimality (mean-variance efficiency) in the trade-off between risk (variance of return) and return (mean return) by systematically handling uncertainties in stock prices. These improvements are incorporated into the MVECF model using regularization, and the model is restructured to fit the ordinary matrix factorization scheme to boost computational efficiency. Experiments on real-world fund holdings data show that our model can increase the mean-variance efficiency of suggested portfolios while sacrificing just a small amount of mean average precision and recall. Finally, we show that MVECF is easily applicable to state-of-the-art graph-based ranking models.
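
As a rough illustration of how a mean-variance term can be blended into a collaborative filtering objective, the sketch below adds a negative mean-variance-utility penalty to a plain matrix factorization reconstruction loss. Treating softmax-normalized scores as implied portfolio weights, along with the lam and gamma hyperparameters, is an assumption for illustration, not the paper's exact formulation:

```python
import torch

def mvecf_loss(R_hat, R_obs, mask, mu, Sigma, lam=0.1, gamma=3.0):
    """Hypothetical mean-variance-regularized MF objective.
    R_hat, R_obs, mask: (U, N) predicted scores, observed holdings, observation mask.
    mu: (N,) mean returns; Sigma: (N, N) return covariance."""
    # Standard weighted matrix factorization reconstruction error
    rec = ((mask * (R_hat - R_obs)) ** 2).sum()
    # Treat normalized scores as implied portfolio weights per user
    w = torch.softmax(R_hat, dim=-1)                    # (U, N)
    risk = torch.einsum('un,nm,um->u', w, Sigma, w)     # w' Sigma w per user
    ret = w @ mu                                        # mu' w per user
    mv_penalty = (0.5 * gamma * risk - ret).sum()       # negative MV utility
    return rec + lam * mv_penalty
```

In this toy formulation, increasing lam pushes recommended portfolios toward the mean-variance frontier at the cost of reconstruction fidelity, which mirrors the precision/recall trade-off the abstract reports.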

* 12 pages, 4 figures, preprint, under review 

DarkBERT: A Language Model for the Dark Side of the Internet

May 18, 2023
Youngjin Jin, Eugene Jang, Jian Cui, Jin-Woo Chung, Yongjae Lee, Seungwon Shin

Recent research has suggested that there are clear differences between the language used on the Dark Web and that of the Surface Web. As studies of the Dark Web commonly require textual analysis of the domain, language models specific to the Dark Web may provide valuable insights to researchers. In this work, we introduce DarkBERT, a language model pretrained on Dark Web data. We describe the steps taken to filter and compile the text data used to train DarkBERT, in order to combat the extreme lexical and structural diversity of the Dark Web that may be detrimental to building a proper representation of the domain. We evaluate DarkBERT and its vanilla counterpart along with other widely used language models to validate the benefits that a Dark Web domain-specific model offers in various use cases. Our evaluations show that DarkBERT outperforms current language models and may serve as a valuable resource for future research on the Dark Web.
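
Since the abstract describes DarkBERT as a model pretrained on Dark Web text, a generic continued-pretraining (masked language modeling) sketch with HuggingFace Transformers may help make the setup concrete. Starting from roberta-base, the corpus file name and all hyperparameters below are placeholders, not the authors' pipeline:

```python
# Generic domain-adaptive MLM pretraining sketch; "darkweb_corpus.txt" is a
# hypothetical local file standing in for a filtered Dark Web corpus.
from datasets import load_dataset
from transformers import (DataCollatorForLanguageModeling, RobertaForMaskedLM,
                          RobertaTokenizerFast, Trainer, TrainingArguments)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

raw = load_dataset("text", data_files={"train": "darkweb_corpus.txt"})
tokenized = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens, the standard MLM objective
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
args = TrainingArguments(output_dir="darkbert-sketch",
                         per_device_train_batch_size=8, num_train_epochs=1)
Trainer(model=model, args=args, train_dataset=tokenized,
        data_collator=collator).train()
```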

* 9 pages (main paper), 17 pages (including bibliography and appendix), to appear at the ACL 2023 Main Conference 

MF-NeRF: Memory Efficient NeRF with Mixed-Feature Hash Table

Apr 27, 2023
Yongjae Lee, Li Yang, Deliang Fan

Neural radiance field (NeRF) has shown remarkable performance in generating photo-realistic novel views. Since the emergence of NeRF, many studies have followed, among which managing features with explicit structures such as grids has achieved exceptionally fast training by reducing the complexity of the multilayer perceptron (MLP) network. However, storing features in dense grids requires a significantly large amount of memory, which creates a memory bottleneck in computer systems and thus increases training time. To address this issue, we propose MF-NeRF, a memory-efficient NeRF framework that employs a mixed-feature hash table to reduce memory usage and training time while maintaining reconstruction quality. We first design a mixed-feature hash table that adaptively mixes part of the multi-level feature grids into one and maps them to a single hash table. Then, to obtain the correct index of a grid point, we further design an index transformation method that transforms the indices of an arbitrary-level grid to those of a canonical grid. Extensive experiments benchmarking against the state-of-the-art Instant-NGP, TensoRF, and DVGO indicate that MF-NeRF achieves the fastest training time on the same GPU hardware with similar or even higher reconstruction quality. The source code is available at https://github.com/nfyfamr/MF-NeRF.
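
To illustrate what sharing one hash table across grid levels might look like, here is a loose PyTorch sketch. Folding the level index into an Instant-NGP-style spatial hash stands in for the paper's index transformation, and the nearest-corner lookup skips the trilinear interpolation a real encoder would perform; all names and constants are illustrative assumptions, not the released code (see the linked repository for that):

```python
import torch
import torch.nn as nn

# Spatial-hash primes in the style of Instant-NGP
PRIMES = torch.tensor([1, 2654435761, 805459861])

class MixedFeatureHashSketch(nn.Module):
    """Single learnable hash table shared by all grid levels ("mixed features")."""
    def __init__(self, table_size=2**19, feat_dim=2, levels=8, base_res=16):
        super().__init__()
        self.table = nn.Parameter(torch.randn(table_size, feat_dim) * 1e-4)
        self.table_size = table_size
        self.resolutions = [base_res * 2**l for l in range(levels)]

    def hash_coords(self, ij, level):
        # ij: (..., 3) integer grid coords; folding the level into the hash
        # lets every level index the same table.
        h = (ij * PRIMES.to(ij.device)).sum(-1) ^ (level * 0x9E3779B9)
        return h % self.table_size

    def forward(self, x):
        # x: (..., 3) points in [0, 1]^3; nearest-corner lookup per level
        # (a real encoder would trilinearly interpolate 8 corners).
        feats = []
        for lvl, res in enumerate(self.resolutions):
            ij = (x * res).long()
            feats.append(self.table[self.hash_coords(ij, lvl)])
        return torch.cat(feats, dim=-1)

enc = MixedFeatureHashSketch()
features = enc(torch.rand(1024, 3))   # (1024, levels * feat_dim)
```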

Shedding New Light on the Language of the Dark Web

Apr 14, 2022
Youngjin Jin, Eugene Jang, Yongjae Lee, Seungwon Shin, Jin-Woo Chung

The hidden nature and limited accessibility of the Dark Web, combined with the lack of public datasets in this domain, make it difficult to study its inherent characteristics, such as its linguistic properties. Previous work on text classification in the Dark Web domain has suggested that the use of deep neural models may be ineffective, potentially due to the linguistic differences between the Dark and Surface Webs. However, little work has been done to uncover the linguistic characteristics of the Dark Web. This paper introduces CoDA, a publicly available Dark Web dataset consisting of 10,000 web documents tailored toward text-based Dark Web analysis. Leveraging CoDA, we conduct a thorough linguistic analysis of the Dark Web and examine the textual differences between the Dark Web and the Surface Web. We also assess the performance of various methods of Dark Web page classification. Finally, we compare CoDA with an existing public Dark Web dataset and evaluate their suitability for various use cases.
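
Page-classification comparisons of this kind typically include conventional baselines; as one concrete example of that kind, here is a TF-IDF plus linear-SVM sketch. The load_coda() helper is hypothetical, since the dataset's loading format is not shown here:

```python
# Conventional text-classification baseline sketch (scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs, labels = load_coda()  # hypothetical loader: page texts and category labels
X_tr, X_te, y_tr, y_te = train_test_split(docs, labels, test_size=0.2,
                                          stratify=labels, random_state=0)

clf = make_pipeline(TfidfVectorizer(max_features=50000, ngram_range=(1, 2)),
                    LinearSVC())
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```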

* To appear at NAACL 2022 (main conference) 

DEEP-BO for Hyperparameter Optimization of Deep Networks

May 23, 2019
Hyunghun Cho, Yongjin Kim, Eunjung Lee, Daeyoung Choi, Yongjae Lee, Wonjong Rhee

The performance of deep neural networks (DNNs) is very sensitive to the particular choice of hyperparameters. To make matters worse, the shape of the learning curve can be significantly affected when a technique like batch normalization is used. As a result, hyperparameter optimization of deep networks can be much more challenging than for traditional machine learning models. In this work, we start from well-known Bayesian optimization solutions and provide enhancement strategies specifically designed for hyperparameter optimization of deep networks. The resulting algorithm is named DEEP-BO (Diversified, Early-termination-Enabled, and Parallel Bayesian Optimization). When evaluated over six DNN benchmarks, DEEP-BO easily outperforms or shows comparable performance to well-known solutions including GP-Hedge, Hyperband, BOHB, the Median Stopping Rule, and Learning Curve Extrapolation. The code is publicly available at https://github.com/snu-adsl/DEEP-BO.
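
The acronym names the ingredients: diversification across acquisition strategies, early termination of unpromising runs, and parallel evaluation. Below is a loose NumPy/scikit-learn sketch of the first two only; it alternates between expected improvement and a pure-exploration pick and applies a median-style stopping rule, and it is a simplification for illustration rather than the published algorithm:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(gp, X_cand, y_best):
    # EI for minimization under a fitted Gaussian process surrogate
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return sigma * (z * norm.cdf(z) + norm.pdf(z))

def suggest(gp, X_cand, y_best, step):
    # "Diversification": alternate between exploitation (EI) and
    # exploration (max posterior uncertainty) instead of one strategy.
    if step % 2 == 0:
        return X_cand[np.argmax(expected_improvement(gp, X_cand, y_best))]
    _, sigma = gp.predict(X_cand, return_std=True)
    return X_cand[np.argmax(sigma)]

def should_stop(curve, histories, step):
    # Median-style early termination: stop a training run whose partial
    # loss is worse than the median of previous runs at the same step.
    past = [h[step] for h in histories if len(h) > step]
    return len(past) >= 3 and curve[step] > np.median(past)

# Toy usage with random data in place of real hyperparameter evaluations
X_seen, y_seen = np.random.rand(5, 2), np.random.rand(5)
gp = GaussianProcessRegressor().fit(X_seen, y_seen)
x_next = suggest(gp, np.random.rand(100, 2), y_seen.min(), step=0)
```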

* 26 pages, NeurIPS19 under review 