
· One min read

How do smaller, open-source Large Language Models compare to ChatGPT in refining code? Our study dives into code reviews, a cornerstone of modern software development. While code reviews are indispensable for ensuring quality and transferring knowledge, they can also become bottlenecks in large-scale projects.

  • Inspired by a recent paper by Guo et al., we explore how open-source models like CodeLlama and Llama 2 (7B parameters) measure up against proprietary solutions like ChatGPT for automating code refinement tasks.
  • Our findings show that with proper tuning, these open-source models can offer an interesting balance between performance, cost-efficiency, and privacy.
  • This research not only opens doors for privacy-conscious and cost-effective solutions but also sheds light on where current AI models shine—and where they still need a human touch.
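To make the task concrete, here is a minimal sketch of how one might query such a model for code refinement: a zero-shot prompt that pairs a code snippet with a reviewer comment. The template and function name are illustrative assumptions, not the exact prompts used in the study.

```python
# Hypothetical prompt-construction helper for LLM-based code refinement.
# The template wording is an assumption for illustration only.

def build_refinement_prompt(code: str, review_comment: str) -> str:
    """Combine a code snippet and a reviewer comment into one prompt."""
    return (
        "You are a code reviewer's assistant.\n"
        "Revise the code below so that it addresses the review comment.\n\n"
        f"Review comment:\n{review_comment}\n\n"
        f"Code:\n{code}\n\n"
        "Refined code:"
    )

prompt = build_refinement_prompt(
    code="def add(a, b):\n    return a - b",
    review_comment="The function should add the operands, not subtract them.",
)
```

The same prompt string could then be sent to CodeLlama, Llama 2, or ChatGPT, which is what makes a head-to-head comparison of the models possible.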
info

Interested? You can find a pre-print of our paper here. Our replication package is available here.

· One min read

Great news! Concordia has decided to fund the second edition of REANIMATE, our summer school on Retro Gaming History, Criticism, and Development. These funds are part of the Aid to Research Related Events, Exhibition, Publication and Dissemination Activities (ARRE) Program.

The first edition, Reanimate'24, was organized by Prof. Yann Gael and his team and featured a rich program: 11 speakers from academia and industry shared their knowledge with participants over 5 full days of talks, game jams, and workshops. The event was a success. I am thrilled to join the organization of the second edition, and glad that Concordia will be able to host the summer school again in 2025.

· One min read

Yasmine's Presentation

Yasmine presented her summer internship work at the Undergraduate Research Showcase at Concordia University. Her work, entitled "Can ChatGPT Migrate My Code", explores the idea of using ChatGPT to migrate code that uses third-party libraries. The experiment consisted of prompting ChatGPT to migrate code from one library version to another and evaluating whether the generated code was correct. The results were promising, with ChatGPT achieving a much higher degree of success than we originally anticipated.

The poster presentation was a great success, way to go Yasmine! If you are interested in the details of this project, keep an eye out: we are preparing a paper submission soon.

· One min read

As the internship period comes to an end, we gathered to bid a fond farewell to our interns, Adam and Yasmine. Over the past few months, their contributions have been invaluable, and they have truly become part of our lab family. To celebrate their hard work and dedication, we shared a delicious cake and exchanged heartfelt best wishes written on personalized cards.

We are especially excited for Adam as he embarks on his upcoming internship at TMX, and we wish Yasmine tremendous success as she begins her new academic year tomorrow. This isn't goodbye but more of a "see you later," as we hope to welcome both Adam and Yasmine back in the future. The day was filled with gratitude, memories, and well wishes for their bright futures ahead. We captured this special moment with a group picture in front of the cake, a token of appreciation for all they've brought to our team. Until we meet again, we wish Adam and Yasmine all the best in their future endeavors!

Group Photo

· One min read

The efficiency of a Pull Request (PR) process hinges on how quickly maintainers and contributors respond to each other. Knowing how long this might take can improve interactions and manage expectations.

Our new study introduces a machine-learning method to predict these response times by analyzing data from 20 popular open-source GitHub projects. We examined various features of the projects and PRs and identified key factors that influence response times.

  • PRs submitted earlier in the week, with a moderate number of commits and clear descriptions, tend to get quicker responses.
  • Contributors who are more engaged and have a good track record also tend to respond faster.
  • We also highlight how understanding and predicting response times can enhance the PR review process.
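The features listed above can be sketched as a simple extraction step, turning a raw PR record into numeric inputs for a response-time model. The field names below are assumptions for illustration, not the study's exact feature set.

```python
# Illustrative sketch: deriving PR features (submission weekday, commit count,
# description length, author track record) from a raw pull-request record.
from datetime import datetime

def pr_features(pr: dict) -> dict:
    """Turn a raw pull-request record into numeric model features."""
    created = datetime.fromisoformat(pr["created_at"])
    return {
        "weekday": created.weekday(),                # 0 = Monday
        "num_commits": pr["num_commits"],
        "description_words": len(pr["description"].split()),
        "author_prior_prs": pr["author_prior_prs"],  # proxy for track record
    }

features = pr_features({
    "created_at": "2024-07-01T09:30:00",  # a Monday
    "num_commits": 3,
    "description": "Fix null check in the session cache",
    "author_prior_prs": 12,
})
```

A regressor or classifier trained on such feature vectors could then estimate how long a PR is likely to wait for its first response.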
info

Interested? You can find a pre-print of our paper here.

· 2 min read

The REALISE Lab had an amazing experience at the CREATE SE4AI retreat in Kingston this July 2024. The CREATE SE4AI is a training initiative focused on the development, deployment, and servicing of artificial intelligence-based software systems. The retreat provided us with a wonderful opportunity to connect with researchers and professors from the program, sharing our research ideas and experiences in person. We spent a day and night at the scenic Delta Inn hotel surrounded by beautiful landscapes. The retreat was highlighted by engaging presentations from students, including Rachna from our lab, who shared her work on her recently published paper at MSR 2024. The evening was capped off with a delightful dinner and a charming walk through the streets of Kingston, a town full of natural beauty and perfect for relaxation.

Retreat Group Photo

Following the retreat, we participated in the first day of the Canadian Software Engineering Research (CSER) conference. This event brought together renowned researchers in the field of software engineering. We gained new insights and perspectives from the talks, particularly those from young professors, which were both inspiring and informative. Prof. Costa's presentation on Dependency Management offered valuable insights into a critical area of our work. The conference provided us with valuable opportunities to network, learn, and collaborate with fellow researchers. The CREATE SE4AI retreat and CSER conference were memorable and enriching experiences for the REALISE Lab. They offered us the chance to connect with the broader research community, learn from leading experts, and showcase our work.

Prof. Costa's Presentation

· One min read

Finding performance regressions usually requires the execution of long and costly performance test suites. This is because performance tests often have to exercise the system end-to-end. Could we reduce testing costs by testing locally (e.g., a module, a service, or a method) and using a model to predict the impact of local changes on the system as a whole?
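As a toy illustration of the idea (not the paper's actual model), local per-operation latencies can be combined with an architectural model, reduced here to call counts along an end-to-end scenario, to estimate system-level response time without a full end-to-end run:

```python
# Toy sketch: estimate end-to-end latency as the call-count-weighted sum of
# locally measured per-operation latencies. Names and numbers are made up.

def predict_end_to_end_ms(local_latency_ms: dict, call_counts: dict) -> float:
    """Weighted sum of local latencies over an end-to-end call scenario."""
    return sum(local_latency_ms[op] * n for op, n in call_counts.items())

scenario = {"parse": 1, "query_db": 3, "render": 1}

before = predict_end_to_end_ms(
    {"parse": 2.0, "query_db": 10.0, "render": 5.0}, scenario
)
after = predict_end_to_end_ms(
    {"parse": 2.0, "query_db": 14.0, "render": 5.0},  # local slowdown in query_db
    scenario,
)
```

Comparing the two estimates flags the local `query_db` slowdown as an end-to-end regression before any system-level test is run.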

Our new paper proposes exactly this! The paper, entitled "Early Detection of Performance Regressions by Bridging Local Performance Data and Architectural Models", has been accepted at the 47th IEEE/ACM International Conference on Software Engineering (ICSE 2025).

We are currently finalizing the camera-ready version of the paper, and we will share the preprint soon. Stay tuned for more updates!

· One min read

Are you interested in training chatbots for Software Engineering tasks? Our paper "A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets" has been accepted at the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2024).

  • In this paper, we propose an approach to augment chatbot training datasets tailored for Software Engineering tasks.
info

Interested? You can find a pre-print of our paper here.

· One min read

Zakaria presented his work in progress, entitled "Choosing the Right Bias Mitigation Strategy: Insights from a Comprehensive Benchmark." In this work, we conduct an experimental assessment of bias mitigation methods over 21 scenarios, considering different machine learning models and datasets.

Poster Presentation

Our goal is to establish a large benchmarking framework to help practitioners choose the right bias mitigation strategy for their specific use case. The poster was presented at the 2024 edition of the Software Engineering for Machine Learning Applications (SEMLA) symposium.
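To give a flavor of what such a benchmark measures, here is a hedged sketch of one common fairness metric, the demographic parity difference (the gap in positive-prediction rates between two groups). The predictions and group labels below are made up for the example; the paper's actual metrics and scenarios may differ.

```python
# Illustrative fairness metric: absolute gap in positive-prediction rate
# between two demographic groups (labels 0 and 1). Data is synthetic.

def demographic_parity_diff(preds, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1."""
    def rate(g):
        members = [p for p, grp in zip(preds, groups) if grp == g]
        return sum(members) / len(members)
    return abs(rate(0) - rate(1))

gap = demographic_parity_diff(
    preds=[1, 0, 1, 1, 0, 0, 1, 0],   # model's binary predictions
    groups=[0, 0, 0, 0, 1, 1, 1, 1],  # group membership per instance
)
```

A benchmark like ours would compute such metrics before and after applying each mitigation strategy, across models and datasets, to see which strategy narrows the gap at the least cost in accuracy.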

· One min read

I am thrilled to announce that FRQNT has funded our research program “Harnessing Software Ecosystems to Support Library Maintainers”, as part of the Support for New Academics program.

This is a 2-year program, designed to support maintainers in improving the reliability and quality of open-source software libraries by leveraging their dependent ecosystem. This project's goal is to provide solutions that help maintainers:

  1. understand how dependents use their library to plan their library evolution and
  2. harness the tests of the dependent ecosystem to improve the quality of their library project.