Source author record

Rajdeep Mukherjee

Rajdeep Mukherjee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Logic in Computer Science Computation and Language Formal Languages and Automata Theory Information Retrieval Other Computer Science Programming Languages Social and Information Networks Software Engineering

Catalog footprint

What is connected

6works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Static Analysis for AWS Best Practices in Python Code

Amazon Web Services (AWS) is a comprehensive and broadly adopted cloud provider, offering over 200 fully featured services, including compute, database, storage, networking and content delivery, machine learning, Internet of Things and many others. AWS SDKs provide access to AWS services through API endpoints. However, incorrect use of these APIs can lead to code defects, crashes, performance issues, and other problems. This paper presents automated static analysis rules, developed in the context of a commercial service for detection of code defects and security vulnerabilities, to identify deviations from AWS best practices in Python applications that use the AWS SDK. Such applications use the AWS SDK for Python, called "Boto3", to access AWS cloud services. However, precise static analysis of Python applications that use cloud SDKs requires robust type inference for inferring the types of cloud service clients. The dynamic style of Boto3 APIs poses unique challenges for type resolution, as does the interprocedural style in which service clients are used in practice. In support of our best-practices goal, we present a layered strategy for type inference that combines multiple type-resolution and tracking strategies in a staged manner. From our experiments across >3,000 popular Python GitHub repos that make use of the AWS SDK, our layered type inference system achieves 85% precision and 100% recall in inferring Boto3 clients in Python client code. Additionally, we present a representative sample of eight AWS best-practice rules that detect a wide range of issues including pagination, polling, and batch operations. We have assessed the efficacy of these rules based on real-world developer feedback. Developers have accepted more than 85% of the recommendations made by five out of eight Python rules, and almost 83% of all recommendations.

preprint2020arXiv

Hardware/Software Co-verification Using Path-based Symbolic Execution

Conventional tools for formal hardware/software co-verification use bounded model checking techniques to construct a single monolithic propositional formula. Formulas generated in this way are extremely complex and contain a great deal of irrelevant logic, hence are difficult to solve even by the state-of-the-art Satis ability (SAT) solvers. In a typical hardware/software co-design the firmware only exercises a fraction of the hardware state-space, and we can use this observation to generate simpler and more concise formulas. In this paper, we present a novel verification algorithm for hardware/software co-designs that identify partitions of the firmware and the hardware logic pertaining to the feasible execution paths by means of path-based symbolic simulation with custom path-pruning, property-guided slicing and incremental SAT solving. We have implemented this approach in our tool COVERIF. We have experimentally compared COVERIF with HW-CBMC, a monolithic BMC based co-verification tool, and observed an average speed-up of 5X over HW-CBMC for proving safety properties as well as detecting critical co-design bugs in an open-source Universal Asynchronous Receiver Transmitter design and a large SoC design.

preprint2020arXiv

How Have We Reacted To The COVID-19 Pandemic? Analyzing Changing Indian Emotions Through The Lens of Twitter

Since its outbreak, the ongoing COVID-19 pandemic has caused unprecedented losses to human lives and economies around the world. As of 18th July 2020, the World Health Organization (WHO) has reported more than 13 million confirmed cases including close to 600,000 deaths across 216 countries and territories. Despite several government measures, India has gradually moved up the ranks to become the third worst-hit nation by the pandemic after the US and Brazil, thus causing widespread anxiety and fear among her citizens. As majority of the world's population continues to remain confined to their homes, more and more people have started relying on social media platforms such as Twitter for expressing their feelings and attitudes towards various aspects of the pandemic. With rising concerns of mental well-being, it becomes imperative to analyze the dynamics of public affect in order to anticipate any potential threats and take precautionary measures. Since affective states of human mind are more nuanced than meager binary sentiments, here we propose a deep learning-based system to identify people's emotions from their tweets. We achieve competitive results on two benchmark datasets for multi-label emotion classification. We then use our system to analyze the evolution of emotional responses among Indians as the pandemic continues to spread its wings. We also study the development of salient factors contributing towards the changes in attitudes over time. Finally, we discuss directions to further improve our work and hope that our analysis can aid in better public health monitoring.

preprint2020arXiv

Read what you need: Controllable Aspect-based Opinion Summarization of Tourist Reviews

Manually extracting relevant aspects and opinions from large volumes of user-generated text is a time-consuming process. Summaries, on the other hand, help readers with limited time budgets to quickly consume the key ideas from the data. State-of-the-art approaches for multi-document summarization, however, do not consider user preferences while generating summaries. In this work, we argue the need and propose a solution for generating personalized aspect-based opinion summaries from large collections of online tourist reviews. We let our readers decide and control several attributes of the summary such as the length and specific aspects of interest among others. Specifically, we take an unsupervised approach to extract coherent aspects from tourist reviews posted on TripAdvisor. We then propose an Integer Linear Programming (ILP) based extractive technique to select an informative subset of opinions around the identified aspects while respecting the user-specified values for various control parameters. Finally, we evaluate and compare our summaries using crowdsourcing and ROUGE-based metrics and obtain competitive results.

preprint2016arXiv

Equivalence Checking a Floating-point Unit against a High-level C Model (Extended Version)

Semiconductor companies have increasingly adopted a methodology that starts with a system-level design specification in C/C++/SystemC. This model is extensively simulated to ensure correct functionality and performance. Later, a Register Transfer Level (RTL) implementation is created in Verilog, either manually by a designer or automatically by a high-level synthesis tool. It is essential to check that the C and Verilog programs are consistent. In this paper, we present a two-step approach, embodied in two equivalence checking tools, VERIFOX and HW-CBMC, to validate designs at the software and RTL levels, respectively. VERIFOX is used for equivalence checking of an untimed software model in C against a high-level reference model in C. HW-CBMC verifies the equivalence of a Verilog RTL implementation against an untimed software model in C. To evaluate our tools, we applied them to a commercial floating-point arithmetic unit (FPU) from ARM and an open-source dual-path floating-point adder.

preprint2013arXiv

A Multi-objective Perspective for Operator Scheduling using Fine-grained DVS Architecture

The stringent power budget of fine grained power managed digital integrated circuits have driven chip designers to optimize power at the cost of area and delay, which were the traditional cost criteria for circuit optimization. The emerging scenario motivates us to revisit the classical operator scheduling problem under the availability of DVFS enabled functional units that can trade-off cycles with power. We study the design space defined due to this trade-off and present a branch-and-bound(B/B) algorithm to explore this state space and report the pareto-optimal front with respect to area and power. The scheduling also aims at maximum resource sharing and is able to attain sufficient area and power gains for complex benchmarks when timing constraints are relaxed by sufficient amount. Experimental results show that the algorithm that operates without any user constraint(area/power) is able to solve the problem for most available benchmarks, and the use of power budget or area budget constraints leads to significant performance gain.

Rajdeep Mukherjee

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

Static Analysis for AWS Best Practices in Python Code

Hardware/Software Co-verification Using Path-based Symbolic Execution

How Have We Reacted To The COVID-19 Pandemic? Analyzing Changing Indian Emotions Through The Lens of Twitter

Read what you need: Controllable Aspect-based Opinion Summarization of Tourist Reviews

Equivalence Checking a Floating-point Unit against a High-level C Model (Extended Version)

A Multi-objective Perspective for Operator Scheduling using Fine-grained DVS Architecture