Researcher profile

Timofey Bryksin

Timofey Bryksin contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
22works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

22 published item(s)

preprint2026arXiv

Developer Needs and Feasible Features for AI Assistants in IDEs

Despite the increasing presence of AI assistants in Integrated Development Environments (IDEs), it remains unclear what different groups of developers actually need from these tools and which features are likely to be implemented in practice. To investigate this gap, we conducted a two-phase study. First, we interviewed 35 professional developers from three user groups (Adopters, Churners, and Non-Users) to uncover unmet needs and expectations. Our analysis revealed five key areas of need distinctly distributed across practitioners' groups: Technology Improvement, Interaction, and Customization, as well as Simplifying Skill Building, and Programming Tasks. We then examined the feasibility of addressing selected needs through an internal prediction market involving 102 practitioners. The results demonstrate a strong alignment between the developers' needs and the practitioners' judgment for features focused on implementation and context awareness. However, features related to proactivity and maintenance remain both underestimated and technically unaddressed. Our findings reveal gaps in current AI support and provide practical directions for developing more effective and sustainable in-IDE AI systems

preprint2026arXiv

Multi-Agent Coordinated Rename Refactoring

The primary value of AI agents in software development lies in their ability to extend the developer's capacity for reasoning and action, not to supplant human involvement. To showcase how to use agents working in tandem with developers, we designed a novel approach for carrying out coordinated renaming. Coordinated renaming, where a single rename refactoring triggers refactorings in multiple, related identifiers, is a frequent yet challenging task. Developers must manually propagate these rename refactorings across numerous files and contexts, a process that is both tedious and highly error-prone. State-of-the-art heuristic-based approaches produce an overwhelming number of false positives, while vanilla Large Language Models (LLMs) provide incomplete suggestions due to their limited context and inability to interact with refactoring tools. This leaves developers with incomplete refactorings or burdens them with filtering too many false positives. Coordinated renaming is exactly the kind of repetitive task that agents can significantly reduce the developers' burden while keeping them in the driver's seat. We designed, implemented, and evaluated the first multi-agent framework that automates coordinated renaming. It operates on a key insight: a developer's initial refactoring is a clue to infer the scope of related refactorings. Our Scope Inference Agent first transforms this clue into an explicit, natural-language Declared Scope. The Planned Execution Agent then uses this as a strict plan to identify program elements that should undergo refactoring and safely executes the changes by invoking the IDE's own trusted refactoring APIs. Finally, the Replication Agent uses it to guide the project-wide search. We first conducted a formative study on the practice of coordinated renaming in 609K commits in 100 open-source projects and surveyed 205 developers ...

preprint2023arXiv

Predicting Tags For Programming Tasks by Combining Textual And Source Code Data

Competitive programming remains a very popular activity that combines both software engineering and education. In order to prepare and to practice, contestants use extensive archives of problems from past contents available on various competitive programming platforms. One way to make this process more effective is to provide an automatic tag system for the tasks. Prior works do that by either using the tasks' problem statements or the code of their solutions. In this study, we investigate which information source is more valuable for tag prediction. To answer that question, we compare existing approaches of both types on the same dataset and with the same set of tags. Then, we propose a novel approach, which is an ensemble of the Gated Graph Neural Network model for analyzing solutions and the Bidirectional Encoder Representations from Transformers model for processing statements. Our experiments show that our approach outperforms previously proposed models by 0.175 of the PR-AUC metric.

preprint2022arXiv

A Large-Scale Comparison of Python Code in Jupyter Notebooks and Scripts

In recent years, Jupyter notebooks have grown in popularity in several domains of software engineering, such as data science, machine learning, and computer science education. Their popularity has to do with their rich features for presenting and visualizing data, however, recent studies show that notebooks also share a lot of drawbacks: high number of code clones, low reproducibility, etc. In this work, we carry out a comparison between Python code written in Jupyter Notebooks and in traditional Python scripts. We compare the code from two perspectives: structural and stylistic. In the first part of the analysis, we report the difference in the number of lines, the usage of functions, as well as various complexity metrics. In the second part, we show the difference in the number of stylistic issues and provide an extensive overview of the 15 most frequent stylistic issues in the studied mediums. Overall, we demonstrate that notebooks are characterized by the lower code complexity, however, their code could be perceived as more entangled than in the scripts. As for the style, notebooks tend to have 1.4 times more stylistic issues, but at the same time, some of them are caused by specific coding practices in notebooks and should be considered as false positives. With this research, we want to pave the way to studying specific problems of notebooks that should be addressed by the development of notebook-specific tools, and provide various insights that can be useful in this regard.

preprint2022arXiv

All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs

In this work, we propose an approach for collecting completion usage logs from the users in an IDE and using them to train a machine learning based model for ranking completion candidates. We developed a set of features that describe completion candidates and their context, and deployed their anonymized collection in the Early Access Program of IntelliJ-based IDEs. We used the logs to collect a dataset of code completions from users, and employed it to train a ranking CatBoost model. Then, we evaluated it in two settings: on a held-out set of the collected completions and in a separate A/B test on two different groups of users in the IDE. Our evaluation shows that using a simple ranking model trained on the past user behavior logs significantly improved code completion experience. Compared to the default heuristics-based ranking, our model demonstrated a decrease in the number of typing actions necessary to perform the completion in the IDE from 2.073 to 1.832. The approach adheres to privacy requirements and legal constraints, since it does not require collecting personal information, performing all the necessary anonymization on the client's side. Importantly, it can be improved continuously: implementing new features, collecting new data, and evaluating new models - this way, we have been using it in production since the end of 2020.

preprint2022arXiv

AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE

We developed a plugin for IntelliJ IDEA called AntiCopyPaster, which tracks the pasting of code fragments inside the IDE and suggests the appropriate Extract Method refactoring to combat the propagation of duplicates. Unlike the existing approaches, our tool is integrated with the developer's workflow, and pro-actively recommends refactorings. Since not all code fragments need to be extracted, we develop a classification model to make this decision. When a developer copies and pastes a code fragment, the plugin searches for duplicates in the currently opened file, waits for a short period of time to allow the developer to edit the code, and finally inferences the refactoring decision based on a number of features. Our experimental study on a large dataset of 18,942 code fragments mined from 13 Apache projects shows that AntiCopyPaster correctly recommends Extract Method refactorings with an F-score of 0.82. Furthermore, our survey of 59 developers reflects their satisfaction with the developed plugin's operation. The plugin and its source code are publicly available on GitHub at https://github.com/JetBrains-Research/anti-copy-paster. The demonstration video can be found on YouTube: https://youtu.be/_wwHg-qFjJY.

preprint2022arXiv

Assessing Project-Level Fine-Tuning of ML4SE Models

Machine Learning for Software Engineering (ML4SE) is an actively growing research area that focuses on methods that help programmers in their work. In order to apply the developed methods in practice, they need to achieve reasonable quality in order to help rather than distract developers. While the development of new approaches to code representation and data collection improves the overall quality of the models, it does not take into account the information that we can get from the project at hand. In this work, we investigate how the model's quality can be improved if we target a specific project. We develop a framework to assess quality improvements that models can get after fine-tuning for the method name prediction task on a particular project. We evaluate three models of different complexity and compare their quality in three settings: trained on a large dataset of Java projects, further fine-tuned on the data from a particular project, and trained from scratch on this data. We show that per-project fine-tuning can greatly improve the models' quality as they capture the project's domain and naming conventions. We open-source the tool we used for data collection, as well as the code to run the experiments: https://zenodo.org/record/6040745.

preprint2022arXiv

DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation

The task of finding the best developer to fix a bug is called bug triage. Most of the existing approaches consider the bug triage task as a classification problem, however, classification is not appropriate when the sets of classes change over time (as developers often do in a project). Furthermore, to the best of our knowledge, all the existing models use textual sources of information, i.e., bug descriptions, which are not always available. In this work, we explore the applicability of existing solutions for the bug triage problem when stack traces are used as the main data source of bug reports. Additionally, we reformulate this task as a ranking problem and propose new deep learning models to solve it. The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network, with the weights of the models optimized using a ranking loss function. To improve the quality of ranking, we propose using additional information from version control system annotations. Two approaches are proposed for extracting features from annotations: manual and using an additional neural network. To evaluate our models, we collected two datasets of real-world stack traces. Our experiments show that the proposed models outperform existing models adapted to handle stack traces. To facilitate further research in this area, we publish the source code of our models and one of the collected datasets.

preprint2022arXiv

Evaluating the Impact of Source Code Parsers on ML4SE Models

As researchers and practitioners apply Machine Learning to increasingly more software engineering problems, the approaches they use become more sophisticated. A lot of modern approaches utilize internal code structure in the form of an abstract syntax tree (AST) or its extensions: path-based representation, complex graph combining AST with additional edges. Even though the process of extracting ASTs from code can be done with different parsers, the impact of choosing a parser on the final model quality remains unstudied. Moreover, researchers often omit the exact details of extracting particular code representations. In this work, we evaluate two models, namely Code2Seq and TreeLSTM, in the method name prediction task backed by eight different parsers for the Java language. To unify the process of data preparation with different parsers, we develop SuperParser, a multi-language parser-agnostic library based on PathMiner. SuperParser facilitates the end-to-end creation of datasets suitable for training and evaluation of ML models that work with structural information from source code. Our results demonstrate that trees built by different parsers vary in their structure and content. We then analyze how this diversity affects the models' quality and show that the quality gap between the most and least suitable parsers for both models turns out to be significant. Finally, we discuss other features of the parsers that researchers and practitioners should take into account when selecting a parser along with the impact on the models' quality. The code of SuperParser is publicly available at https://doi.org/10.5281/zenodo.6366591. We also publish Java-norm, the dataset we use to evaluate the models: https://doi.org/10.5281/zenodo.6366599.

preprint2022arXiv

Evaluation of Contrastive Learning with Various Code Representations for Code Clone Detection

Code clones are pairs of code snippets that implement similar functionality. Clone detection is a fundamental branch of automatic source code comprehension, having many applications in refactoring recommendation, plagiarism detection, and code summarization. A particularly interesting case of clone detection is the detection of semantic clones, i.e., code snippets that have the same functionality but significantly differ in implementation. A promising approach to detecting semantic clones is contrastive learning (CL), a machine learning paradigm popular in computer vision but not yet commonly adopted for code processing. Our work aims to evaluate the most popular CL algorithms combined with three source code representations on two tasks. The first task is code clone detection, which we evaluate on the POJ-104 dataset containing implementations of 104 algorithms. The second task is plagiarism detection. To evaluate the models on this task, we introduce CodeTransformator, a tool for transforming source code. We use it to create a dataset that mimics plagiarised code based on competitive programming solutions. We trained nine models for both tasks and compared them with six existing approaches, including traditional tools and modern pre-trained neural models. The results of our evaluation show that proposed models perform diversely in each task, however the performance of the graph-based models is generally above the others. Among CL algorithms, SimCLR and SwAV lead to better results, while Moco is the most robust approach. Our code and trained models are available at https://doi.org/10.5281/zenodo.6360627, https://doi.org/10.5281/zenodo.5596345.

preprint2022arXiv

IntelliTC: Automating Type Changes in IntelliJ IDEA

Developers often change types of program elements. Such refactoring often involves updating not only the type of the element itself, but also the API of all type-dependent references in the code, thus it is tedious and time-consuming. Despite type changes being more frequent than renamings, just a few current IDE tools provide partially-automated support only for a small set of hard-coded types. Researchers have recently proposed a data-driven approach to inferring API rewrite rules for type change patterns in Java using code commits history. In this paper, we build upon these recent advances and introduce IntelliTC - a tool to perform Java type change refactoring. We implemented it as a plugin for IntelliJ IDEA, a popular Java IDE developed by JetBrains. We present 3 different ways of providing support for such a refactoring from the standpoint of the user experience: Classic mode, Suggested Refactoring, and Inspection mode. To evaluate these modalities of using IntelliTC, we surveyed 22 experienced software developers. They positively rated the usefulness of the tool. The source code and distribution of the plugin are available on GitHub: https://github.com/JetBrains-Research/data-driven-type-migration. A demonstration video is on YouTube: https://youtu.be/pdcfvADA1PY.

preprint2022arXiv

Lupa: A Framework for Large Scale Analysis of the Programming Language Usage

In this paper, we present Lupa - a framework for large-scale analysis of the programming language usage. Lupa is a command line tool that uses the power of the IntelliJ Platform under the hood, which gives it access to powerful static analysis tools used in modern IDEs. The tool supports custom analyzers that process the rich concrete syntax tree of the code and can calculate its various features: the presence of entities, their dependencies, definition-usage chains, etc. Currently, Lupa supports analyzing Python and Kotlin, but can be extended to other languages supported by IntelliJ-based IDEs. We explain the internals of the tool, show how it can be extended and customized, and describe an example analysis that we carried out with its help: analyzing the syntax of ranges in Kotlin.

preprint2022arXiv

On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages

A recent study by Ahmed and Devanbu reported that using a corpus of code written in multilingual datasets to fine-tune multilingual Pre-trained Language Models (PLMs) achieves higher performance as opposed to using a corpus of code written in just one programming language. However, no analysis was made with respect to fine-tuning monolingual PLMs. Furthermore, some programming languages are inherently different and code written in one language usually cannot be interchanged with the others, i.e., Ruby and Java code possess very different structure. To better understand how monolingual and multilingual PLMs affect different programming languages, we investigate 1) the performance of PLMs on Ruby for two popular Software Engineering tasks: Code Summarization and Code Search, 2) the strategy (to select programming languages) that works well on fine-tuning multilingual PLMs for Ruby, and 3) the performance of the fine-tuned PLMs on Ruby given different code lengths. In this work, we analyze over a hundred of pre-trained and fine-tuned models. Our results show that 1) multilingual PLMs have a lower Performance-to-Time Ratio (the BLEU, METEOR, or MRR scores over the fine-tuning duration) as compared to monolingual PLMs, 2) our proposed strategy to select target programming languages to fine-tune multilingual PLMs is effective: it reduces the time to fine-tune yet achieves higher performance in Code Summarization and Code Search tasks, and 3) our proposed strategy consistently shows good performance on different code lengths.

preprint2022arXiv

Reflekt: a Library for Compile-Time Reflection in Kotlin

Reflection in Kotlin is a powerful mechanism to introspect program behavior during its execution at run-time. However, among the variety of practical tasks involving reflection, there are scenarios when the poor performance of run-time approaches becomes a significant disadvantage. This problem manifests itself in Kotless, a popular framework for developing serverless applications, because the faster the applications launch, the less their cloud infrastructure costs. In this paper, we present Reflekt - a compile-time reflection library which allows to perform the search among classes, object expressions (which in Kotlin are implemented as singleton classes), and functions in Kotlin code based on the given search query. It comes with a convenient DSL and better performance comparing to the existing run-time reflection approaches. Our experiments show that replacing run-time reflection calls with Reflekt in serverless applications created with Kotless resulted in a significant performance boost in start-up time of these applications.

preprint2022arXiv

So Much in So Little: Creating Lightweight Embeddings of Python Libraries

In software engineering, different approaches and machine learning models leverage different types of data: source code, textual information, historical data. An important part of any project is its dependencies. The list of dependencies is relatively small but carries a lot of semantics with it, which can be used to compare projects or make judgements about them. In this paper, we focus on Python projects and their PyPi dependencies in the form of requirements.txt files. We compile a dataset of 7,132 Python projects and their dependencies, as well as use Git to pull their versions from previous years. Using this data, we build 32-dimensional embeddings of libraries by applying Singular Value Decomposition to the co-occurrence matrix of projects and libraries. We then cluster the embeddings and study their semantic relations. To showcase the usefulness of such lightweight library embeddings, we introduce a prototype tool for suggesting relevant libraries to a given project. The tool computes project embeddings and uses dependencies of projects with similar embeddings to form suggestions. To compare different library recommenders, we have created a benchmark based on the evolution of dependency sets in open-source projects. Approaches based on the created embeddings significantly outperform the baseline of showing the most popular libraries in a given year. We have also conducted a user study that showed that the suggestions differ in quality for different project domains and that even relevant suggestions might be not particularly useful. Finally, to facilitate potentially more useful recommendations, we extended the recommender system with an option to suggest rarer libraries.

preprint2021arXiv

Multi-threshold token-based code clone detection

Clone detection plays an important role in software engineering. Finding clones within a single project introduces possible refactoring opportunities, and between different projects it could be used for detecting code reuse or possible licensing violations. In this paper, we propose a modification to bag-of-tokens based clone detection that allows detecting more clone pairs of greater diversity without losing precision by implementing a multi-threshold search, i.e. conducting the search several times, aimed at different groups of clones. To combat the increase in operation time that this approach brings about, we propose an optimization that allows to significantly decrease the overlap in detected clones between the searches. We evaluate the method for two different popular clone detection tools on two datasets of different sizes. The implementation of the technique allows to increase the number of detected clones by 40.5-56.6% for different datasets. BigCloneBench evaluation also shows that the recall of detecting Strongly Type-3 clones increases from 37.5% to 59.6%.

preprint2021arXiv

ReSplit: Improving the Structure of Jupyter Notebooks by Re-Splitting Their Cells

Jupyter notebooks represent a unique format for programming - a combination of code and Markdown with rich formatting, separated into individual cells. We propose to perceive a Jupyter Notebook cell as a simplified and raw version of a programming function. Similar to functions, Jupyter cells should strive to contain singular, self-contained actions. At the same time, research shows that real-world notebooks fail to do so and suffer from the lack of proper structure. To combat this, we propose ReSplit, an algorithm for an automatic re-splitting of cells in Jupyter notebooks. The algorithm analyzes definition-usage chains in the notebook and consists of two parts - merging and splitting the cells. We ran the algorithm on a large corpus of notebooks to evaluate its performance and its overall effect on notebooks, and evaluated it by human experts: we showed them several notebooks in their original and the re-split form. In 29.5% of cases, the re-split notebook was selected as the preferred way of perceiving the code. We analyze what influenced this decision and describe several individual cases in detail.

preprint2020arXiv

A Study of Potential Code Borrowing and License Violations in Java Projects on GitHub

With an ever-increasing amount of open source software, the popularity of services like GitHub that facilitate code reuse, and common misconceptions about the licensing of open source software, the problem of license violations in the code is getting more and more prominent. In this study, we compile an extensive corpus of popular Java projects from GitHub, search it for code clones and perform an original analysis of possible code borrowing and license violations on the level of code fragments. We chose Java as a language because of its popularity in industry, where the plagiarism problem is especially relevant because of possible legal action. We analyze and discuss distribution of 94 different discovered and manually evaluated licenses in files and projects, differences in the licensing of files, distribution of potential code borrowing between licenses, various types of possible license violations, most violated licenses, etc. Studying possible license violations in specific blocks of code, we have discovered that 29.6% of them might be involved in potential code borrowing and 9.4% of them could potentially violate original licenses.

preprint2020arXiv

Building Implicit Vector Representations of Individual Coding Style

With the goal of facilitating team collaboration, we propose a new approach to building vector representations of individual developers by capturing their individual contribution style, or coding style. Such representations can find use in the next generation of software development team collaboration tools, for example by enabling the tools to track knowledge transfer in teams. The key idea of our approach is to avoid using explicitly defined metrics of coding style and instead build the representations through training a model for authorship recognition and extracting the representations of individual developers from the trained model. By empirically evaluating the output of our approach, we find that implicitly built individual representations reflect some properties of team structure: developers who report learning from each other are represented closer to each other.

preprint2020arXiv

Recommendation of Move Method Refactoring Using Path-Based Representation of Code

Software refactoring plays an important role in increasing code quality. One of the most popular refactoring types is the Move Method refactoring. It is usually applied when a method depends more on members of other classes than on its own original class. Several approaches have been proposed to recommend Move Method refactoring automatically. Most of them are based on heuristics and have certain limitations (e.g., they depend on the selection of metrics and manually-defined thresholds). In this paper, we propose an approach to recommend Move Method refactoring based on a path-based representation of code called code2vec that is able to capture the syntactic structure and semantic information of a code fragment. We use this code representation to train a machine learning classifier suggesting to move methods to more appropriate classes. We evaluate the approach on two publicly available datasets: a manually compiled dataset of well-known open-source projects and a synthetic dataset with automatically injected code smell instances. The results show that our approach is capable of recommending accurate refactoring opportunities and outperforms JDeodorant and JMove, which are state of the art tools in this field.

preprint2020arXiv

Sosed: a tool for finding similar software projects

In this paper, we present Sosed, a tool for discovering similar software projects. We use fastText to compute the embeddings of subtokens into a dense space for 120,000 GitHub repositories in 200 languages. Then, we cluster embeddings to identify groups of semantically similar sub-tokens that reflect topics in source code. We use a dataset of 9 million GitHub projects as a reference search base. To identify similar projects, we compare the distributions of clusters among their sub-tokens. The tool receives an arbitrary project as input, extracts sub-tokens in 16 most popular programming languages, computes cluster distribution, and finds projects with the closest distribution in the search base. We labeled subtoken clusters with short descriptions to enable Sosed to produce interpretable output. Sosed is available at https://github.com/JetBrains-Research/sosed/. The tool demo is available at https://www.youtube.com/watch?v=LYLkztCGRt8. The multi-language extractor of sub-tokens is available separately at https://github.com/JetBrains-Research/buckwheat/.

preprint2020arXiv

Visualization of Methods Changeability Based on VCS Data

Software engineers have a wide variety of tools and techniques that can help them improve the quality of their code, but still, a lot of bugs remain undetected. In this paper we build on the idea that if a particular fragment of code is changed too often, it could be caused by some technical or architectural issues, therefore, this fragment requires additional attention from developers. Most teams nowadays use version control systems to track changes in their code and organize cooperation between developers. We propose to use data from version control systems to track the number of changes for each method in a project for a selected time period and display this information within the IDE's code editor. The paper describes such a tool called Topias built as a plugin for IntelliJ IDEA. Its source code is available at https://github.com/JetBrains-Research/topias. A demonstration video can be found at https://www.youtube.com/watch?v=xsqc4gCTxfA.