Olga Baysal

Leveraging Deep Learning for Abstractive Code Summarization of Unofficial Documentation

Nov 07, 2023

AmirHossein Naghshzan, Latifa Guerrouj, Olga Baysal

Abstract: Usually, programming languages have official documentation to guide developers with APIs, methods, and classes. However, researchers identified insufficient or inadequate documentation examples and flaws with the API's complex structure as barriers to learning an API. As a result, developers may consult other sources (StackOverflow, GitHub, etc.) to learn more about an API. Recent research studies have shown that unofficial documentation is a valuable source of information for generating code summaries. We, therefore, have been motivated to leverage such a type of documentation along with deep learning techniques towards generating high-quality summaries for APIs discussed in informal documentation. This paper proposes an automatic approach using the BART algorithm, a state-of-the-art transformer model, to generate summaries for APIs discussed in StackOverflow. We built an oracle of human-generated summaries to evaluate our approach against it using ROUGE and BLEU metrics, which are the most widely used evaluation metrics in text summarization. Furthermore, we evaluated our summaries empirically against a previous work in terms of quality. Our findings demonstrate that using deep learning algorithms can improve summaries' quality and outperform the previous work by an average of 57% for Precision, 66% for Recall, and 61% for F-measure, and it runs 4.4 times faster.
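To make the Precision, Recall, and F-measure figures in this abstract concrete, the following is a minimal sketch of a simplified ROUGE-1 score: unigram overlap between a generated summary and a human-written reference. It is an illustration only, not the paper's evaluation pipeline. The function name `rouge1`, the whitespace tokenization, and the example sentences are assumptions made for this sketch.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: unigram-overlap precision, recall, and F-measure.

    Assumption: lowercase whitespace tokenization, no stemming or stopword removal.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: a token is matched at most as often as it occurs in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Hypothetical generated summary and reference summary.
scores = rouge1(
    "returns the current activity context",
    "returns the context of the current activity",
)
print(scores)
```

Here every candidate token also appears in the reference, so precision is 1.0, while recall is 5/7 because two reference tokens are left unmatched.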

Leveraging Data Mining Algorithms to Recommend Source Code Changes

Apr 29, 2023

AmirHossein Naghshzan, Saeed Khalilazar, Pierre Poilane, Olga Baysal, Latifa Guerrouj, Foutse Khomh

Abstract: Context: Recent research has used data mining to develop techniques that can guide developers through source code changes. To the best of our knowledge, very few studies have investigated data mining techniques and/or compared their results with other algorithms or a baseline. Objectives: This paper proposes an automatic method for recommending source code changes using four data mining algorithms. We not only use these algorithms to recommend source code changes, but we also conduct an empirical evaluation. Methods: Our investigation includes seven open-source projects from which we extracted source change history at the file level. We used four widely used data mining algorithms, i.e., Apriori, FP-Growth, Eclat, and Relim, to compare the algorithms in terms of performance (Precision, Recall and F-measure) and execution time. Results: Our findings provide empirical evidence that while some Frequent Pattern Mining algorithms, such as Apriori, may outperform other algorithms in some cases, the results are not consistent throughout all the software projects, which is more likely due to the nature and characteristics of the studied projects, in particular their change history. Conclusion: Apriori seems appropriate for large-scale projects, whereas Eclat appears to be suitable for small-scale projects. Moreover, FP-Growth seems an efficient approach in terms of execution time.
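As an illustration of frequent pattern mining over file-level change history, the sketch below runs a small Apriori-style search on a handful of commits and uses the frequent file pairs to suggest co-changed files. It is a minimal sketch under stated assumptions, not the authors' implementation: the commit data, file names, the `min_support` threshold, and the `recommend` helper are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset seen in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every individual file that appears in some commit.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many commits contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def recommend(changed_file, frequent):
    """Hypothetical helper: rank files that frequently co-change with changed_file."""
    base = frequent.get(frozenset([changed_file]))
    if not base:
        return []
    scores = {}
    for itemset, count in frequent.items():
        if len(itemset) == 2 and changed_file in itemset:
            (other,) = itemset - {changed_file}
            scores[other] = count / base  # confidence of changed_file -> other
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Hypothetical change history: each set is the files touched by one commit.
commits = [
    {"Parser.java", "Lexer.java"},
    {"Parser.java", "Lexer.java", "Ast.java"},
    {"Parser.java", "Ast.java"},
    {"Lexer.java", "Parser.java"},
    {"README.md"},
]
frequent = apriori(commits, min_support=2)
print(recommend("Parser.java", frequent))
```

With these commits, `Lexer.java` co-changes with `Parser.java` in three of the four commits that touch `Parser.java`, so it is ranked ahead of `Ast.java`.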

Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow

Nov 27, 2021

AmirHossein Naghshzan, Latifa Guerrouj, Olga Baysal

Abstract: Automated source code summarization is a task that generates summarized information about the purpose, usage, and/or implementation of methods and classes to support understanding of these code entities. Multiple approaches and techniques have been proposed for supervised and unsupervised learning in code summarization; however, they were mostly focused on generating a summary for a piece of code. In addition, very few works have leveraged unofficial documentation. This paper proposes an automatic and novel approach for summarizing Android API methods discussed in Stack Overflow that we consider as unofficial documentation in this research. Our approach takes the API method's name as an input and generates a natural language summary based on Stack Overflow discussions of that API method. We have conducted a survey that involves 16 Android developers to evaluate the quality of our automatically generated summaries and compare them with the official Android documentation. Our results demonstrate that while developers find the official documentation more useful in general, the generated summaries are also competitive, in particular for offering implementation details, and can be used as a complementary source for guiding developers in software development and maintenance tasks.

* 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM), 2021, pp. 142-152
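The input/output shape described in this abstract (an API method name in, a natural language summary out) can be sketched with a simple unsupervised extractive summarizer. The abstract does not name the algorithm used, so the word-frequency scoring below is a stand-in technique chosen for illustration, not the paper's method. The function name `summarize_api`, the stopword list, and the example posts are assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "to", "of", "in", "is", "it", "and", "or",
             "you", "for", "on", "this", "that", "with"}


def _words(sentence):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def summarize_api(method_name, posts, max_sentences=2):
    """Pick the highest-scoring sentences that mention method_name.

    Score = average corpus frequency of a sentence's words, where the corpus
    is the set of sentences mentioning the method.
    """
    sentences = []
    for post in posts:
        # Naive sentence split on whitespace following ., ! or ?
        for s in re.split(r"(?<=[.!?])\s+", post.strip()):
            if method_name.lower() in s.lower():
                sentences.append(s)

    freq = Counter(w for s in sentences for w in _words(s))

    def score(s):
        ws = _words(s)
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    top = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(top[:max_sentences])  # restore original order for readability
    return " ".join(sentences[i] for i in chosen)


# Hypothetical Stack Overflow answer text.
posts = [
    "Call findViewById after setContentView. Otherwise it returns null.",
    "I love Android.",
]
print(summarize_api("findViewById", posts))
```

Sentences that never mention the method are dropped before scoring, so off-topic remarks in a thread cannot reach the summary.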
