Source author record

Yin Yang

Yin Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Databases Computer Vision Graphics Machine Learning Distributed, Parallel, and Cluster Computing Artificial Intelligence Cryptography and Security Social and Information Networks Computation and Language cond-mat.other Data Structures and Algorithms eess.SP Information Theory math-ph math.IT math.MP math.OC Networking and Internet Architecture

Catalog footprint

What is connected

31works

18topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Accurate Table Question Answering with Accessible LLMs

Given a table T in a database and a question Q in natural language, the table question answering (TQA) task aims to return an accurate answer to Q based on the content of T. Recent state-of-the-art solutions leverage large language models (LLMs) to obtain high-quality answers. However, most rely on proprietary, large-scale LLMs with costly API access, posing a significant financial barrier. This paper instead focuses on TQA with smaller, open-weight LLMs that can run on a desktop or laptop. This setting is challenging, as such LLMs typically have weaker capabilities than large proprietary models, leading to substantial performance degradation with existing methods. We observe that a key reason for this degradation is that prior approaches often require the LLM to solve a highly sophisticated task using long, complex prompts, which exceed the capabilities of small open-weight LLMs. Motivated by this observation, we present Orchestra, a multi-agent approach that unlocks the potential of accessible LLMs for high-quality, cost-effective TQA. Orchestra coordinates a group of LLM agents, each responsible for a relatively simple task, through a structured, layered workflow to solve complex TQA problems -- akin to an orchestra. By reducing the prompt complexity faced by each agent, Orchestra significantly improves output reliability. We implement Orchestra on top of AgentScope, an open-source multi-agent framework, and evaluate it on multiple TQA benchmarks using a wide range of open-weight LLMs. Experimental results show that Orchestra achieves strong performance even with small- to medium-sized models. For example, with Qwen2.5-14B, Orchestra reaches 72.1% accuracy on WikiTQ, approaching the best prior result of 75.3% achieved with GPT-4; with larger Qwen, Llama, or DeepSeek models, Orchestra outperforms all prior methods and establishes new state-of-the-art results across all benchmarks.

preprint2026arXiv

Sparse-to-Complete: From Sparse Image Captures to Complete 3D Scenes

We introduce S2C-3D, a novel sparse-view 3D reconstruction framework for high-fidelity and complete scene reconstruction from as few as six to eight images. Our framework features three components: a specialized diffusion model for scene-specific image restoration, a training-free view-consistency conditioned sampling process in the diffusion model for refined Gaussian optimization, and a camera trajectory planning scheme to ensure comprehensive scene coverage. The specialized diffusion model is developed by finetuning a pretrained architecture on the input views and their corresponding degraded counterparts. The adaptation to the scene distribution allows the model to repair Gaussian renderings while effectively eliminating domain gaps. Meanwhile, the trajectory planning scheme optimizes scene coverage by connecting each newly sampled camera to its two nearest neighbors. By iteratively constructing paths and retaining only those that significantly enhance visibility, the scheme establishes a trajectory that covers the entire scene. To address multi-view conflicts, the view-consistency conditioned sampling process quantifies the consistency between neighboring repaired images. This information is injected as a condition into the sampling process of the frozen diffusion model, facilitating the generation of view-consistent images without additional training. Consequently, our approach produces high-fidelity 3D Gaussians that are robust to artifacts. Experimental results demonstrate that S2C-3D outperforms state-of-the-art methods, constructing high-quality scenes that are free from missing regions, blurring, or other artifacts with very sparse inputs. The source code and data are available at https://gapszju.github.io/S2C-3D.

preprint2026arXiv

WorldParticle: Unified Simulation of Lagrangian Particle Dynamics via Transformer

A unified simulator that can model diverse physical phenomena without solver-specific redesign is a long-standing goal across simulation science. We present a learning-based particle simulator built on a single transformer architecture to model cloth, elastic solds, Newtonian and non-Newtonian fluids, granular materials, and molecular dynamics. Our model follows a prediction-correction design on a shared Lagrangian particle representation. An explicit predictor first advances particles under the known external forces, producing an intermediate state that captures externally driven motion but not inter-particle interactions. A learned corrector then predicts the residual position and velocity updates through three stages: a particle tokenizer that encodes local particle-particle, particle-boundary, and topology-guided interactions; a super-token encoder that hierarchically merges particle tokens into a compact set of super tokens via alternating self-attention and token merging; and a super-token decoder that lifts these super tokens back to particle resolution through cross-attention to predict per-particle position and velocity corrections. Progressive token merging reduces the attention cost at successive encoder layers by halving the token count at each level, and the decoder communicates through the compact super-token set rather than full particle-to-particle attention. Across the six dynamics categories, the same architecture generalizes to unseen materials, boundary configurations, initial conditions, and external forces. We further demonstrate downstream interactive control, inverse design, and learning from real-world manipulation data, reducing the need for per-phenomenon solver engineering.

preprint2023arXiv

A Contact Proxy Splitting Method for Lagrangian Solid-Fluid Coupling

We present a robust and efficient method for simulating Lagrangian solid-fluid coupling based on a new operator splitting strategy. We use variational formulations to approximate fluid properties and solid-fluid interactions, and introduce a unified two-way coupling formulation for SPH fluids and FEM solids using interior point barrier-based frictional contact. We split the resulting optimization problem into a fluid phase and a solid-coupling phase using a novel time-splitting approach with augmented contact proxies, and propose efficient custom linear solvers. Our technique accounts for fluids interaction with nonlinear hyperelastic objects of different geometries and codimensions, while maintaining an algorithmically guaranteed non-penetrating criterion. Comprehensive benchmarks and experiments demonstrate the efficacy of our method.

preprint2022arXiv

Active Boundary Loss for Semantic Segmentation

This paper proposes a novel active boundary loss for semantic segmentation. It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced in commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results using current network parameters, we formulate the boundary alignment problem as a differentiable direction vector prediction problem to guide the movement of predicted boundaries in each iteration. Our loss is model-agnostic and can be plugged in to the training of segmentation networks to improve the boundary details. Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union on challenging image and video object segmentation datasets.

preprint2022arXiv

Affine Body Dynamics: Fast, Stable & Intersection-free Simulation of Stiff Materials

Simulating stiff materials in applications where deformations are either not significant or can safely be ignored is a pivotal task across fields. Rigid body modeling has thus long remained a fundamental tool and is, by far, the most popular simulation strategy currently employed for modeling stiff solids. At the same time, numerical models of a rigid body continue to pose a number of known challenges and trade-offs including intersections, instabilities, inaccuracies, and/or slow performances that grow with contact-problem complexity. In this paper we revisit this problem and present ABD, a simple and highly effective affine body dynamics framework, which significantly improves state-of-the-art stiff simulations. We trace the challenges in the rigid-body IPC (incremental potential contact) method to the necessity of linearizing piecewise-rigid (SE(3)) trajectories and subsequent constraints. ABD instead relaxes the unnecessary (and unrealistic) constraint that each body's motion be exactly rigid with a stiff orthogonality potential, while preserving the rigid body model's key feature of a small coordinate representation. In doing so ABD replaces piecewise linearization with piecewise linear trajectories. This, in turn, combines the best from both parties: compact coordinates ensure small, sparse system solves, while piecewise-linear trajectories enable efficient and accurate constraint (contact and joint) evaluations. Beginning with this simple foundation, ABD preserves all guarantees of the underlying IPC model e.g., solution convergence, guaranteed non-intersection, and accurate frictional contact. Over a wide range and scale of simulation problems we demonstrate that ABD brings orders of magnitude performance gains (two- to three-order on the CPU and an order more utilizing the GPU, which is 10,000x speedups) over prior IPC-based methods with a similar or higher simulation quality.

preprint2022arXiv

Automatic Quantization for Physics-Based Simulation

Quantization has proven effective in high-resolution and large-scale simulations, which benefit from bit-level memory saving. However, identifying a quantization scheme that meets the requirement of both precision and memory efficiency requires trial and error. In this paper, we propose a novel framework to allow users to obtain a quantization scheme by simply specifying either an error bound or a memory compression rate. Based on the error propagation theory, our method takes advantage of auto-diff to estimate the contributions of each quantization operation to the total error. We formulate the task as a constrained optimization problem, which can be efficiently solved with analytical formulas derived for the linearized objective function. Our workflow extends the Taichi compiler and introduces dithering to improve the precision of quantized simulations. We demonstrate the generality and efficiency of our method via several challenging examples of physics-based simulation, which achieves up to 2.5x memory compression without noticeable degradation of visual quality in the results. Our code and data are available at https://github.com/Hanke98/AutoQuantizer.

preprint2022arXiv

Online 3D Bin Packing with Constrained Deep Reinforcement Learning

We solve a challenging yet practically useful variant of 3D Bin Packing Problem (3D-BPP). In our problem, the agent has limited information about the items to be packed into the bin, and an item must be packed immediately after its arrival without buffering or readjusting. The item's placement also subjects to the constraints of collision avoidance and physical stability. We formulate this online 3D-BPP as a constrained Markov decision process. To solve the problem, we propose an effective and easy-to-implement constrained deep reinforcement learning (DRL) method under the actor-critic framework. In particular, we introduce a feasibility predictor to predict the feasibility mask for the placement actions and use it to modulate the action probabilities output by the actor during training. Such supervisions and transformations to DRL facilitate the agent to learn feasible policies efficiently. Our method can also be generalized e.g., with the ability to handle lookahead or items with different orientations. We have conducted extensive evaluation showing that the learned policy significantly outperforms the state-of-the-art methods. A user study suggests that our method attains a human-level performance.

preprint2022arXiv

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

Generating new images with desired properties (e.g. new view/poses) from source images has been enthusiastically pursued recently, due to its wide range of potential applications. One way to ensure high-quality generation is to use multiple sources with complementary information such as different views of the same object. However, as source images are often misaligned due to the large disparities among the camera settings, strong assumptions have been made in the past with respect to the camera(s) or/and the object in interest, limiting the application of such techniques. Therefore, we propose a new general approach which models multiple types of variations among sources, such as view angles, poses, facial expressions, in a unified framework, so that it can be employed on datasets of vastly different nature. We verify our approach on a variety of data including humans bodies, faces, city scenes and 3D objects. Both the qualitative and quantitative results demonstrate the better performance of our method than the state of the art.

preprint2022arXiv

Topology Optimization with Frictional Self-Contact

Contact-aware topology optimization faces challenges in robustness, accuracy, and applicability to internal structural surfaces under self-contact. This work builds on the recently proposed barrier-based Incremental Potential Contact (IPC) model and presents a new self-contact-aware topology optimization framework. A combination of SIMP, adjoint sensitivity analysis, and the IPC frictional-contact model is investigated. Numerical examples for optimizing varying objective functions under contact are presented. The resulting algorithm proposed solves topology optimization for large deformation and complex frictionally contacting scenarios with accuracy and robustness.

preprint2021arXiv

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location of a building unit for in-game customization of residential home complex. Given a construction layout, we propose a visual context-aware graph generation network that learns the implicit global relations among the scene components and infers the location of a new building unit. The proposed network takes as input the scene graph and the corresponding top-view depth image. It provides the location recommendations for a newly-added building units by learning an auto-regressive edge distribution conditioned on existing scenes. We also introduce a global graph-image matching loss to enhance the awareness of essential geometry semantics of the site. Qualitative and quantitative experiments demonstrate that the recommended location well reflects the implicit spatial rules of components in the residential estates, and it is instructive and practical to locate the building units in the 3D scene of the complex construction.

preprint2020arXiv

DCAF: A Dynamic Computation Allocation Framework for Online Serving System

Modern large-scale systems such as recommender system and online advertising system are built upon computation-intensive infrastructure. The typical objective in these applications is to maximize the total revenue, e.g. GMV~(Gross Merchandise Volume), under a limited computation resource. Usually, the online serving system follows a multi-stage cascade architecture, which consists of several stages including retrieval, pre-ranking, ranking, etc. These stages usually allocate resource manually with specific computing power budgets, which requires the serving configuration to adapt accordingly. As a result, the existing system easily falls into suboptimal solutions with respect to maximizing the total revenue. The limitation is due to the face that, although the value of traffic requests vary greatly, online serving system still spends equal computing power among them. In this paper, we introduce a novel idea that online serving system could treat each traffic request differently and allocate "personalized" computation resource based on its value. We formulate this resource allocation problem as a knapsack problem and propose a Dynamic Computation Allocation Framework~(DCAF). Under some general assumptions, DCAF can theoretically guarantee that the system can maximize the total revenue within given computation budget. DCAF brings significant improvement and has been deployed in the display advertising system of Taobao for serving the main traffic. With DCAF, we are able to maintain the same business performance with 20\% computation resource reduction.

preprint2020arXiv

Homogeneous Network Embedding for Massive Graphs via Reweighted Personalized PageRank

Given an input graph G and a node v in G, homogeneous network embedding (HNE) maps the graph structure in the vicinity of v to a compact, fixed-dimensional feature vector. This paper focuses on HNE for massive graphs, e.g., with billions of edges. On this scale, most existing approaches fail, as they incur either prohibitively high costs, or severely compromised result utility. Our proposed solution, called Node-Reweighted PageRank (NRP), is based on a classic idea of deriving embedding vectors from pairwise personalized PageRank (PPR) values. Our contributions are twofold: first, we design a simple and efficient baseline HNE method based on PPR that is capable of handling billion-edge graphs on commodity hardware; second and more importantly, we identify an inherent drawback of vanilla PPR, and address it in our main proposal NRP. Specifically, PPR was designed for a very different purpose, i.e., ranking nodes in G based on their relative importance from a source node's perspective. In contrast, HNE aims to build node embeddings considering the whole graph. Consequently, node embeddings derived directly from PPR are of suboptimal utility. The proposed NRP approach overcomes the above deficiency through an effective and efficient node reweighting algorithm, which augments PPR values with node degree information, and iteratively adjusts embedding vectors accordingly. Overall, NRP takes O(mlogn) time and O(m) space to compute all node embeddings for a graph with m edges and n nodes. Our extensive experiments that compare NRP against 18 existing solutions over 7 real graphs demonstrate that NRP achieves higher result utility than all the solutions for link prediction, graph reconstruction and node classification, while being up to orders of magnitude faster. In particular, on a billion-edge Twitter graph, NRP terminates within 4 hours, using a single CPU core.

preprint2020arXiv

LinSBFT: Linear-Communication One-Step BFT Protocol for Public Blockchains

This paper presents LinSBFT, a Byzantine Fault Tolerance (BFT) protocol with the capacity of processing over 2000 smart contract transactions per second in production. LinSBFT applies to a permissionless, public blockchain system, in which there is no public-key infrastructure, based on the classic PBFT with 4 improvements: (\romannumeral1) LinSBFT achieves $O(n)$ worst-case communication volume, in contract to PBFT's $O(n^4)$; (\romannumeral2) LinSBFT rotates the leader of protocol randomly to reduce the risk of denial-of-service attacks on leader; and (\romannumeral3) each run of LinSBFT finalizes one block, which is robust against participants that are honest in one run of the protocol, and dishonest in another, and the set of participants is dynamic, which is update periodically. (\romannumeral4) LinSBFT helps the delayed nodes to catch up via a synchronization mechanism to promise the liveness. Further, in the ordinary case, LinSBFT involves only a single round of voting instead of two in PBFT, which reduces both communication overhead and confirmation time, and employs the \emph{proof-of-stake} scheme to reward all participants. Extensive experiments using data obtained from the Ethereum demonstrate that LinSBFT consistently and significantly outperforms existing in-production BFT protocols for blockchains.

preprint2020arXiv

Outage Probability and Capacity Scaling Law of Multiple RIS-Aided Cooperative Networks

In this letter, we consider a dual-hop cooperative network assisted by multiple reconfigurable intelligent surfaces (RISs). Assuming that the RIS with the highest instantaneous end-to-end signal-to-noise ratio (SNR) is selected to aid the communication, the outage probability (OP) and average sum-rate are investigated. Specifically, an exact analysis for the OP is developed. In addition, relying on the extreme value theory, closed-form expressions for the asymptotic OP and asymptotic sum-rate are derived, based on which the capacity scaling law is established. Our results are corroborated through simulations and insightful discussions are provided. In particular, our analysis shows that the number of RISs as well as the number of reflecting elements play a crucial role in the capacity scaling law of multiple RIS-aided cooperative networks. Also, comparisons with relay-aided systems are carried out to demonstrate that the proposed system setup outperforms relaying schemes both in terms of the OP and average sum-rate.

preprint2020arXiv

Realtime Index-Free Single Source SimRank Processing on Web-Scale Graphs

Given a graph G and a node u in G, a single source SimRank query evaluates the similarity between u and every node v in G. Existing approaches to single source SimRank computation incur either long query response time, or expensive pre-computation, which needs to be performed again whenever the graph G changes. Consequently, to our knowledge none of them is ideal for scenarios in which (i) query processing must be done in realtime, and (ii) the underlying graph G is massive, with frequent updates. Motivated by this, we propose SimPush, a novel algorithm that answers single source SimRank queries without any pre-computation, and at the same time achieves significantly higher query processing speed than even the fastest known index-based solutions. Further, SimPush provides rigorous result quality guarantees, and its high performance does not rely on any strong assumption of the underlying graph. Specifically, compared to existing methods, SimPush employs a radically different algorithmic design that focuses on (i) identifying a small number of nodes relevant to the query, and subsequently (ii) computing statistics and performing residue push from these nodes only. We prove the correctness of SimPush, analyze its time complexity, and compare its asymptotic performance with that of existing methods. Meanwhile, we evaluate the practical performance of SimPush through extensive experiments on 8 real datasets. The results demonstrate that SimPush consistently outperforms all existing solutions, often by over an order of magnitude. In particular, on a commodity machine, SimPush answers a single source SimRank query on a web graph containing over 133 million nodes and 5.4 billion edges in under 62 milliseconds, with 0.00035 empirical error, while the fastest index-based competitor needs 1.18 seconds.

preprint2020arXiv

Second-order Neural Network Training Using Complex-step Directional Derivative

While the superior performance of second-order optimization methods such as Newton's method is well known, they are hardly used in practice for deep learning because neither assembling the Hessian matrix nor calculating its inverse is feasible for large-scale problems. Existing second-order methods resort to various diagonal or low-rank approximations of the Hessian, which often fail to capture necessary curvature information to generate a substantial improvement. On the other hand, when training becomes batch-based (i.e., stochastic), noisy second-order information easily contaminates the training procedure unless expensive safeguard is employed. In this paper, we adopt a numerical algorithm for second-order neural network training. We tackle the practical obstacle of Hessian calculation by using the complex-step finite difference (CSFD) -- a numerical procedure adding an imaginary perturbation to the function for derivative computation. CSFD is highly robust, efficient, and accurate (as accurate as the analytic result). This method allows us to literally apply any known second-order optimization methods for deep learning training. Based on it, we design an effective Newton Krylov procedure. The key mechanism is to terminate the stochastic Krylov iteration as soon as a disturbing direction is found so that unnecessary computation can be avoided. During the optimization, we monitor the approximation error in the Taylor expansion to adjust the step size. This strategy combines advantages of line search and trust region methods making our method preserves good local and global convergency at the same time. We have tested our methods in various deep learning tasks. The experiments show that our method outperforms exiting methods, and it often converges one-order faster. We believe our method will inspire a wide-range of new algorithms for deep learning and numerical optimization.

preprint2018arXiv

DeepWarp: DNN-based Nonlinear Deformation

DeepWarp is an efficient and highly re-usable deep neural network (DNN) based nonlinear deformable simulation framework. Unlike other deep learning applications such as image recognition, where different inputs have a uniform and consistent format (e.g. an array of all the pixels in an image), the input for deformable simulation is quite variable, high-dimensional, and parametrization-unfriendly. Consequently, even though DNN is known for its rich expressivity of nonlinear functions, directly using DNN to reconstruct the force-displacement relation for general deformable simulation is nearly impossible. DeepWarp obviates this difficulty by partially restoring the force-displacement relation via warping the nodal displacement simulated using a simplistic constitutive model -- the linear elasticity. In other words, DeepWarp yields an incremental displacement fix based on a simplified (therefore incorrect) simulation result other than returning the unknown displacement directly. We contrive a compact yet effective feature vector including geodesic, potential and digression to sort training pairs of per-node linear and nonlinear displacement. DeepWarp is robust under different model shapes and tessellations. With the assistance of deformation substructuring, one DNN training is able to handle a wide range of 3D models of various geometries including most examples shown in the paper. Thanks to the linear elasticity and its constant system matrix, the underlying simulator only needs to perform one pre-factorized matrix solve at each time step, and DeepWarp is able to simulate large models in real time.

preprint2016arXiv

Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy

Organizations with a large user base, such as Samsung and Google, can potentially benefit from collecting and mining users' data. However, doing so raises privacy concerns, and risks accidental privacy breaches with serious consequences. Local differential privacy (LDP) techniques address this problem by only collecting randomized answers from each user, with guarantees of plausible deniability; meanwhile, the aggregator can still build accurate models and predictors by analyzing large amounts of such randomized data. So far, existing LDP solutions either have severely restricted functionality, or focus mainly on theoretical aspects such as asymptotical bounds rather than practical usability and performance. Motivated by this, we propose Harmony, a practical, accurate and efficient system for collecting and analyzing data from smart device users, while satisfying LDP. Harmony applies to multi-dimensional data containing both numerical and categorical attributes, and supports both basic statistics (e.g., mean and frequency estimates), and complex machine learning tasks (e.g., linear regression, logistic regression and SVM classification). Experiments using real data confirm Harmony's effectiveness.

preprint2016arXiv

Contour-based 3d tongue motion visualization using ultrasound image sequences

This article describes a contour-based 3D tongue deformation visualization framework using B-mode ultrasound image sequences. A robust, automatic tracking algorithm characterizes tongue motion via a contour, which is then used to drive a generic 3D Finite Element Model (FEM). A novel contour-based 3D dynamic modeling method is presented. Modal reduction and modal warping techniques are applied to model the deformation of the tongue physically and efficiently. This work can be helpful in a variety of fields, such as speech production, silent speech recognition, articulation training, speech disorder study, etc.

preprint2016arXiv

Convex Optimization for Linear Query Processing under Approximate Differential Privacy

Differential privacy enables organizations to collect accurate aggregates over sensitive data with strong, rigorous guarantees on individuals' privacy. Previous work has found that under differential privacy, computing multiple correlated aggregates as a batch, using an appropriate \emph{strategy}, may yield higher accuracy than computing each of them independently. However, finding the best strategy that maximizes result accuracy is non-trivial, as it involves solving a complex constrained optimization program that appears to be non-linear and non-convex. Hence, in the past much effort has been devoted in solving this non-convex optimization program. Existing approaches include various sophisticated heuristics and expensive numerical solutions. None of them, however, guarantees to find the optimal solution of this optimization problem. This paper points out that under ($ε$, $δ$)-differential privacy, the optimal solution of the above constrained optimization problem in search of a suitable strategy can be found, rather surprisingly, by solving a simple and elegant convex optimization program. Then, we propose an efficient algorithm based on Newton's method, which we prove to always converge to the optimal solution with linear global convergence rate and quadratic local convergence rate. Empirical evaluations demonstrate the accuracy and efficiency of the proposed solution.

preprint2016arXiv

Development of a 3D tongue motion visualization platform based on ultrasound image sequences

This article describes the development of a platform designed to visualize the 3D motion of the tongue using ultrasound image sequences. An overview of the system design is given and promising results are presented. Compared to the analysis of motion in 2D image sequences, such a system can provide additional visual information and a quantitative description of the tongue 3D motion. The platform can be useful in a variety of fields, such as speech production, articulation training, etc.

preprint2015arXiv

DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

In a data stream management system (DSMS), users register continuous queries, and receive result updates as data arrive and expire. We focus on applications with real-time constraints, in which the user must receive each result update within a given period after the update occurs. To handle fast data, the DSMS is commonly placed on top of a cloud infrastructure. Because stream properties such as arrival rates can fluctuate unpredictably, cloud resources must be dynamically provisioned and scheduled accordingly to ensure real-time response. It is quite essential, for the existing systems or future developments, to possess the ability of scheduling resources dynamically according to the current workload, in order to avoid wasting resources, or failing in delivering correct results on time. Motivated by this, we propose DRS, a novel dynamic resource scheduler for cloud-based DSMSs. DRS overcomes three fundamental challenges: (a) how to model the relationship between the provisioned resources and query response time (b) where to best place resources; and (c) how to measure system load with minimal overhead. In particular, DRS includes an accurate performance model based on the theory of \emph{Jackson open queueing networks} and is capable of handling \emph{arbitrary} operator topologies, possibly with loops, splits and joins. Extensive experiments with real data confirm that DRS achieves real-time response with close to optimal resource consumption.

preprint2015arXiv

Optimal Operator State Migration for Elastic Data Stream Processing

A cloud-based data stream management system (DSMS) handles fast data by utilizing the massively parallel processing capabilities of the underlying platform. An important property of such a DSMS is elasticity, meaning that nodes can be dynamically added to or removed from an application to match the latter's workload, which may fluctuate in an unpredictable manner. For an application involving stateful operations such as aggregates, the addition / removal of nodes necessitates the migration of operator states. Although the importance of migration has been recognized in existing systems, two key problems remain largely neglected, namely how to migrate and what to migrate, i.e., the migration mechanism that reduces synchronization overhead and result delay during migration, and the selection of the optimal task assignment that minimizes migration costs. Consequently, migration in current systems typically incurs a high spike in result delay caused by expensive synchronization barriers and suboptimal task assignments. Motivated by this, we present the first comprehensive study on efficient operator states migration, and propose designs and algorithms that enable live, progressive, and optimized migrations. Extensive experiments using real data justify our performance claims.

preprint2015arXiv

Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy

Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably \cite{LHR+10}, has suggested that with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., $ε$-) and approximate (i.e., ($ε$, $δ$)-) differential privacy definitions. We derive the utility guarantees of LRM, and provide guidance on how to set the privacy parameters given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.

preprint2014arXiv

Evaluation of an ensemble-based incremental variational data assimilation

In this work, we aim at studying ensemble based optimal control strategies for data assimilation. Such formulation nicely combines the ingredients of ensemble Kalman filters and variational data assimilation (4DVar). In the same way as variational assimilation schemes, it is formulated as the minimization of an objective function, but similarly to ensemble filter, it introduces in its objective function an empirical ensemble-based background-error covariance and works in an off-line smoothing mode rather than sequentially like sequential filters. These techniques have the great advantage to avoid the introduction of tangent linear and adjoint models, which are necessary for standard incremental variational techniques. They also allow handling a time varying background covariance matrix representing the error evolution between the estimated solution and a background solution. As this background error covariance matrix -- of reduced rank in practice -- plays a key role in the variational process, our study particularly focuses on the generation of the analysis ensemble state with localization techniques. Besides, to clarify well the differences between the different methods and to highlight the potential pitfall and advantages of the different methods, we present key theoretical properties associated to different choices involved in their setup. We compared experimentally the performances of several variations of an ensemble technique of interest with an incremental 4DVar method. The comparisons have been leaded on the basis of a Shallow Water model and have been carried out both with synthetic data and through a close experimental setup. The cases where the system's components are either fully observed or only partially have been in particular addressed.

preprint2012arXiv

Functional Mechanism: Regression Analysis under Differential Privacy

ε-differential privacy is the state-of-the-art model for releasing sensitive information while protecting privacy. Numerous methods have been proposed to enforce epsilon-differential privacy in various analytical tasks, e.g., regression analysis. Existing solutions for regression analysis, however, are either limited to non-standard types of regression or unable to produce accurate regression results. Motivated by this, we propose the Functional Mechanism, a differentially private method designed for a large class of optimization-based analyses. The main idea is to enforce epsilon-differential privacy by perturbing the objective function of the optimization problem, rather than its results. As case studies, we apply the functional mechanism to address two most widely used regression models, namely, linear regression and logistic regression. Both theoretical analysis and thorough experimental evaluations show that the functional mechanism is highly effective and efficient, and it significantly outperforms existing solutions.

preprint2012arXiv

Low Rank Mechanism for Optimizing Batch Queries under Differential Privacy

Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably \cite{LHR+10}, has suggested that with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose the \emph{Low-Rank Mechanism} (LRM), the first practical differentially private technique for answering batch queries with high accuracy, based on a \emph{low rank approximation} of the workload matrix. We prove that the accuracy provided by LRM is close to the theoretical lower bound for any mechanism to answer a batch of queries under differential privacy. Extensive experiments using real data demonstrate that LRM consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.

preprint2012arXiv

Low-Rank Mechanism: Optimizing Batch Queries under Differential Privacy

Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably the matrix mechanism, has suggested that processing a batch of correlated queries as a whole can potentially achieve considerable accuracy gains, compared to answering them individually. However, as we point out in this paper, the matrix mechanism is mainly of theoretical interest; in particular, several inherent problems in its design limit its accuracy in practice, which almost never exceeds that of naive methods. In fact, we are not aware of any existing solution that can effectively optimize a query batch under differential privacy. Motivated by this, we propose the Low-Rank Mechanism (LRM), the first practical differentially private technique for answering batch queries with high accuracy, based on a low rank approximation of the workload matrix. We prove that the accuracy provided by LRM is close to the theoretical lower bound for any mechanism to answer a batch of queries under differential privacy. Extensive experiments using real data demonstrate that LRM consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.

preprint2011arXiv

Compressive Mechanism: Utilizing Sparse Representation in Differential Privacy

Differential privacy provides the first theoretical foundation with provable privacy guarantee against adversaries with arbitrary prior knowledge. The main idea to achieve differential privacy is to inject random noise into statistical query results. Besides correctness, the most important goal in the design of a differentially private mechanism is to reduce the effect of random noise, ensuring that the noisy results can still be useful. This paper proposes the \emph{compressive mechanism}, a novel solution on the basis of state-of-the-art compression technique, called \emph{compressive sensing}. Compressive sensing is a decent theoretical tool for compact synopsis construction, using random projections. In this paper, we show that the amount of noise is significantly reduced from $O(\sqrt{n})$ to $O(\log(n))$, when the noise insertion procedure is carried on the synopsis samples instead of the original database. As an extension, we also apply the proposed compressive mechanism to solve the problem of continual release of statistical results. Extensive experiments using real datasets justify our accuracy claims.

preprint2009arXiv

Off-diagonal Long-Range Order and Supersolidity in a Quantum Solid with Vacancies

We consider a lattice of bosonic atoms, whose number N may be smaller than the number of lattice sites M. We study the Hartree-Fock wave function built up from localized wave functios w(\mathbf{r}) of single atoms, with nearest neighboring overlap. The zero-momentum particle number is expressed in terms of permanents of matrices. In one dimension, it is analytically calculated to be α*N(M-N+1)/M, with α=|\int w(\mathbf{r})dΩ|^2/[(1+2a)l], where a is the nearest-neighboring overlap, l is the lattice constant. αis of the order of 1. The result indicates that the condensate fraction is proportional to and of the same order of magnitude as that of the vacancy concentration, hence there is off-diagonal long-range order or Bose-Einstein condensation of atoms when the number of vacancies M-N is a finite fraction of the number of the lattice sites M.

Yin Yang

What is connected

Connect this record

See the researcher in context

Building this map preview

31 published item(s)

Accurate Table Question Answering with Accessible LLMs

Sparse-to-Complete: From Sparse Image Captures to Complete 3D Scenes

WorldParticle: Unified Simulation of Lagrangian Particle Dynamics via Transformer

A Contact Proxy Splitting Method for Lagrangian Solid-Fluid Coupling

Active Boundary Loss for Semantic Segmentation

Affine Body Dynamics: Fast, Stable & Intersection-free Simulation of Stiff Materials

Automatic Quantization for Physics-Based Simulation

Online 3D Bin Packing with Constrained Deep Reinforcement Learning

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

Topology Optimization with Frictional Self-Contact

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

DCAF: A Dynamic Computation Allocation Framework for Online Serving System

Homogeneous Network Embedding for Massive Graphs via Reweighted Personalized PageRank

LinSBFT: Linear-Communication One-Step BFT Protocol for Public Blockchains

Outage Probability and Capacity Scaling Law of Multiple RIS-Aided Cooperative Networks

Realtime Index-Free Single Source SimRank Processing on Web-Scale Graphs

Second-order Neural Network Training Using Complex-step Directional Derivative

DeepWarp: DNN-based Nonlinear Deformation

Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy

Contour-based 3d tongue motion visualization using ultrasound image sequences

Convex Optimization for Linear Query Processing under Approximate Differential Privacy

Development of a 3D tongue motion visualization platform based on ultrasound image sequences

DRS: Dynamic Resource Scheduling for Real-Time Analytics over Fast Streams

Optimal Operator State Migration for Elastic Data Stream Processing

Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy

Evaluation of an ensemble-based incremental variational data assimilation

Functional Mechanism: Regression Analysis under Differential Privacy

Low Rank Mechanism for Optimizing Batch Queries under Differential Privacy

Low-Rank Mechanism: Optimizing Batch Queries under Differential Privacy

Compressive Mechanism: Utilizing Sparse Representation in Differential Privacy

Off-diagonal Long-Range Order and Supersolidity in a Quantum Solid with Vacancies