Researcher profile

Ankur Jain

Ankur Jain contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
2 topics
3 close collaborators

Published work

2 published items

Preprint, 2025, arXiv

Exact analysis of potential flow past bodies of irregular shapes

Fluid flow past one or more solid bodies is a fundamental problem of much practical importance. Standard solutions of simplified problems involving incompressible inviscid irrotational flow past common geometries such as circular cylinders and airfoils are commonly available. This work presents exact analysis of a potential flow problem involving fluid flow past one or more bodies of irregular shapes. The problem is solved by expressing the shape of each body using Heaviside functions, and writing the potential function as an eigenfunction-based series. Using the properties of Heaviside functions, the series coefficients are determined by deriving a set of linear algebraic equations that govern the coefficients. Benchmarking of the analytical technique against well-known solutions of standard problems is carried out, showing excellent agreement. Good agreement with past work on the specific problem of potential flow past multiple circular cylinders further establishes the accuracy of the analytical technique. Illustrative problems of flow past complicated geometries are solved. Implementation aspects and limitations of the analytical technique are discussed.
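For orientation, the kind of standard solution the abstract benchmarks against can be sketched in a few lines. The snippet below evaluates the textbook closed-form potential flow past a single circular cylinder of radius a in a uniform stream of speed U. It is not the Heaviside-function series method of the paper, and the function names and default parameter values are illustrative assumptions.

```python
import math


def cylinder_velocity(r, theta, U=1.0, a=1.0):
    """Polar velocity components (u_r, u_theta) for inviscid, incompressible,
    irrotational flow past a circular cylinder of radius a.

    Derived from the textbook potential phi = U * (r + a**2 / r) * cos(theta).
    U and a are illustrative defaults, not values taken from the paper.
    """
    u_r = U * (1.0 - a**2 / r**2) * math.cos(theta)
    u_theta = -U * (1.0 + a**2 / r**2) * math.sin(theta)
    return u_r, u_theta


def surface_pressure_coefficient(theta):
    """Pressure coefficient on the cylinder surface from Bernoulli's equation:
    Cp = 1 - (speed / U)**2 = 1 - 4 * sin(theta)**2."""
    return 1.0 - 4.0 * math.sin(theta) ** 2


# No flow crosses the surface: the radial velocity vanishes at r = a.
u_r_surface, _ = cylinder_velocity(1.0, 0.3)

# Stagnation point (theta = 0) and the top of the cylinder (theta = pi / 2).
cp_stagnation = surface_pressure_coefficient(0.0)
cp_top = surface_pressure_coefficient(math.pi / 2)
```

A series-based method such as the one described above would be expected to reproduce these surface values when the body is a single circle, which is what makes this case a convenient check.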

Preprint, 2011, arXiv

CBLOCK: An Automatic Blocking Mechanism for Large-Scale De-duplication Tasks

De-duplication, the identification of distinct records referring to the same real-world entity, is a well-known challenge in data integration. Since very large datasets prohibit the comparison of every pair of records, blocking has been identified as a technique of dividing the dataset for pairwise comparisons, thereby trading off recall of identified duplicates for efficiency. Traditional de-duplication tasks, while challenging, typically involved a fixed schema such as Census data or medical records. However, with the presence of large, diverse sets of structured data on the web and the need to organize it effectively on content portals, de-duplication systems need to scale in a new dimension to handle a large number of schemas, tasks and data sets, while handling ever larger problem sizes. In addition, when working in a map-reduce framework it is important that canopy formation be implemented as a hash function, making the canopy design problem more challenging. We present CBLOCK, a system that addresses these challenges. CBLOCK learns hash functions automatically from attribute domains and a labeled dataset consisting of duplicates. Subsequently, CBLOCK expresses blocking functions using a hierarchical tree structure composed of atomic hash functions. The application may guide the automated blocking process based on architectural constraints, such as by specifying a maximum size of each block (based on memory requirements), imposing disjointness of blocks (in a grid environment), or specifying a particular objective function trading off recall for efficiency. As a post-processing step to automatically generated blocks, CBLOCK rolls up smaller blocks to increase recall. We present experimental results on two large-scale de-duplication datasets at Yahoo!, consisting of over 140K movies and 40K restaurants respectively, and demonstrate the utility of CBLOCK.
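The general idea of hash-based blocking under a maximum block size can be illustrated with a small sketch. This is not CBLOCK itself: the two hash functions are written by hand here, whereas the abstract describes learning them from labeled duplicates, and every name in the snippet (`title_prefix`, `year_key`, `build_blocks`, the toy records) is a hypothetical chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def title_prefix(record, n=3):
    """Hypothetical atomic hash function: first n letters of the title."""
    return record["title"].lower().replace(" ", "")[:n]


def year_key(record):
    """Hypothetical atomic hash function: the release year as a string."""
    return str(record["year"])


def build_blocks(records, hash_fns, max_size):
    """Apply hash functions in order, splitting only blocks larger than max_size.

    The result is a flat dict keyed by the path of hash values, which mimics
    the leaves of a tree of atomic hash functions. Blocks are disjoint.
    """
    blocks = {(): list(records)}
    for fn in hash_fns:
        next_blocks = {}
        for key, members in blocks.items():
            if len(members) <= max_size:
                next_blocks[key] = members  # small enough, keep as a leaf
                continue
            split = defaultdict(list)
            for rec in members:
                split[key + (fn(rec),)].append(rec)
            next_blocks.update(split)
        blocks = next_blocks
    return blocks


def candidate_pairs(blocks):
    """Only records sharing a block are compared pairwise."""
    pairs = set()
    for members in blocks.values():
        for a, b in combinations(members, 2):
            pairs.add((a["id"], b["id"]))
    return pairs


# Toy data invented for this example.
records = [
    {"id": 1, "title": "The Matrix", "year": 1999},
    {"id": 2, "title": "The Matrix ", "year": 1999},
    {"id": 3, "title": "The Terminator", "year": 1984},
    {"id": 4, "title": "Heat", "year": 1995},
    {"id": 5, "title": "Heat", "year": 1995},
]

blocks = build_blocks(records, [title_prefix, year_key], max_size=2)
pairs = candidate_pairs(blocks)
```

With five records, comparing every pair would take 10 comparisons; the blocks here leave 2 candidate pairs. The price is recall: a duplicate whose title or year was recorded differently would land in another block and never be compared, which is the trade-off the abstract describes.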