Source author record

Carlee Joe-Wong

Carlee Joe-Wong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Machine Learning; Networking and Internet Architecture; Artificial Intelligence; cs.CY; Social and Information Networks; Computer Science and Game Theory; Computation and Language; Computer Vision; Cryptography and Security; Distributed, Parallel, and Cluster Computing; eess.SY; Information Retrieval; Multiagent Systems; physics.chem-ph

Catalog footprint

What is connected

20 works

14 topics

4 close collaborators

preprint, 2026, arXiv

Cost-Ordered Feasibility for Multi-Armed Bandits with Cost Subsidy

The classic multi-armed bandit (MAB) problem tackles the challenge of accruing maximum reward while making decisions under uncertainty. However, in applications, the goal is often to minimize cost subject to a constraint on the minimum permissible reward, an objective captured by multi-armed bandits with cost-subsidy (MAB-CS). Of interest to this paper is the setting where the quality (reward) constraint is specified relative to the unknown best reward and the cost of each arm is known. We characterize the expected sub-optimal samples required by any policy by proving instance-dependent lower bounds that offer new insight into the problem and are a strict generalization of prior bounds. Then, we propose an algorithm called Cost-Ordered Feasibility (COF) that leverages our insight and intelligently combines samples from all arms to gauge the feasibility of a cheap arm. Thereafter, we analyze COF to establish instance-dependent upper bounds on its expected cumulative cost and quality regret, i.e., relative to the cheapest feasible arm. Finally, we empirically validate the merits of COF, comparing it to baselines from the literature through extensive simulation experiments on the MovieLens and Goodreads datasets as well as representative synthetic instances. Not only does our paper develop qualitatively better theoretical regret upper bounds, but COF also convincingly demonstrates improved empirical performance.
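As a rough illustration of the benchmark this abstract refers to (the cheapest feasible arm), the sketch below computes that arm when the true mean rewards are known. It is not the COF algorithm, which must learn the means from samples. The feasibility rule `mean >= (1 - alpha) * best_mean` and the name `alpha` are assumptions made for illustration; the abstract only says the constraint is relative to the best reward.

```python
from typing import Sequence


def cheapest_feasible_arm(
    means: Sequence[float], costs: Sequence[float], alpha: float
) -> int:
    """Return the index of the cheapest arm that meets the quality constraint.

    Assumed constraint (hypothetical form): an arm is feasible when its mean
    reward is at least (1 - alpha) times the best mean reward.
    """
    if not means or len(means) != len(costs):
        raise ValueError("means and costs must be non-empty and equally long")
    threshold = (1.0 - alpha) * max(means)
    feasible = [i for i, mu in enumerate(means) if mu >= threshold]
    # Break cost ties by the lower index so the result is deterministic.
    return min(feasible, key=lambda i: (costs[i], i))


if __name__ == "__main__":
    means = [0.9, 0.8, 0.5]   # true mean rewards (unknown to a real learner)
    costs = [3.0, 1.0, 0.5]   # known per-arm costs
    print(cheapest_feasible_arm(means, costs, alpha=0.2))
```

With a tolerance of 0.2, the second arm is feasible and cheaper than the best arm, so an oracle would play it. A learning policy incurs regret whenever it plays a costlier or an infeasible arm instead.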

preprint, 2026, arXiv

Emergent and Subliminal Misalignment Through the Lens of Data-Mediated Transfer

Fine-tuning LLMs on narrow harmful datasets can induce Emergent Misalignment (EM), where models exhibit misaligned behavior far beyond the fine-tuning distribution. We argue that emergent misalignment can be better understood as a data-mediated transfer phenomenon: harmful fine-tuning examples do not induce uniform behavioral spillover, but interact with the structural properties of the dataset and the difficulty of the tasks relative to the model. Across our experiments, we find that misalignment appears more readily when fine-tuning and evaluation prompts share similar underlying functional structure, when prompts leave more room for coherent harmful completions, and when the target behavior has been more reliably learned by the model. The training pipeline itself also matters: pretraining composition shapes later misalignment. We further study Subliminal Learning (SL), where misalignment is transmitted by fine-tuning on seemingly benign data generated by a harmful teacher. Moving beyond the standard SFT setting, we compare this transfer, for the first time, under off-policy and on-policy distillation as well, allowing us to separate the roles of the teacher guidance and the training data distribution in transmitting misalignment. Together, these results argue for a data-centric view: emergent and subliminal misalignment should not be treated as a simple consequence of isolated harmful fine-tuning examples, but as the result of interactions between fine-tuning data structure, pretraining distributions, and training channels.

preprint, 2026, arXiv

Matrix-Space Reinforcement Learning for Reusing Local Transition Geometry

Compositional generalization in sequential decision-making requires identifying which parts of prior rollouts remain useful for new tasks. Existing methods reuse skills or predictive models, but often overlook rich local transition geometry and dynamics. We propose Matrix-Space Reinforcement Learning (MSRL), a geometric abstraction that represents trajectory segments through positive semidefinite matrix descriptors aggregating first- and second-order statistics of lifted one-step transitions. These descriptors expose shared hidden structure, support algebraic composition in an abstract matrix space, and reveal opportunities for transfer. We prove that the descriptor is well defined up to coordinate gauge, complete for the induced low-order additive signal class, additive under valid segment composition, and minimally sufficient among admissible additive descriptors. We further show that conditioning value functions on the trajectory-segment matrix yields a first-order smooth approximation of action values, enabling source-learned matrix-to-value mappings to bootstrap learning in new tasks. MSRL is plug-in compatible with standard model-free and model-based methods, while obstruction filtering rejects implausible compositions. Empirically, MSRL achieves the best average finite-budget target AUC of 0.73, outperforming MSRL from scratch (0.65), TD-MPC-PT+FT (0.63), and TD-MPC (0.57).

preprint, 2026, arXiv

Trustworthy Agent Network: Trust in Agent Networks Must Be Baked In, Not Bolted On

The rapid advancement of Large Language Models has given rise to autonomous LLM-based agents capable of complex reasoning and execution. As these agents transition from isolated operation to collaborative ecosystems, we witness the emergence of the Agent-to-Agent (A2A) network, a paradigm where heterogeneous agents autonomously coordinate to solve multi-step tasks. While these networks may offer better task performance compared to simply using one agent to complete the entire task, they introduce systemic vulnerabilities, such as adversarial composition, semantic misalignment, and cascading operational failures, that existing agent alignment techniques cannot address. In this vision paper, we argue that the trustworthiness of A2A networks cannot be fully guaranteed via retrofitting on existing protocols that are largely designed for individual agents. Rather, it must be architected from the very beginning of the A2A coordination framework. We present a comprehensive conceptual framework that situates trust in A2A systems through four design pillars.

preprint, 2023, arXiv

Predicting Learning Interactions in Social Learning Networks: A Deep Learning Enabled Approach

We consider the problem of predicting link formation in Social Learning Networks (SLN), a type of social network that forms when people learn from one another through structured interactions. While link prediction has been studied for general types of social networks, the evolution of SLNs over their lifetimes, coupled with their dependence on which topics are being discussed, presents new challenges for this type of network. To address these challenges, we develop a series of autonomous link prediction methodologies that utilize spatial and time-evolving network architectures to pass network state between space and time periods, and that model three types of SLN features updated in each period: neighborhood-based (e.g., resource allocation), path-based (e.g., shortest path), and post-based (e.g., topic similarity). Through evaluation on six real-world datasets from Massive Open Online Course (MOOC) discussion forums and from Purdue University, we find that our method obtains substantial improvements over Bayesian models, linear classifiers, and graph neural networks, with AUCs typically above 0.91 and reaching 0.99 depending on the dataset. Our feature importance analysis shows that while neighborhood and path-based features contribute the most to the results, post-based features add additional information that may not always be relevant for link prediction.
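To make two of the feature families named in this abstract concrete, the sketch below computes the standard resource allocation index (a neighborhood-based feature) and the shortest path length (a path-based feature) for a candidate link. The toy graph and function names are hypothetical; this is not the paper's pipeline, only the textbook definitions of the two features.

```python
from collections import deque
from typing import Dict, Hashable, Optional, Set

Graph = Dict[Hashable, Set[Hashable]]


def resource_allocation_index(adj: Graph, u: Hashable, v: Hashable) -> float:
    """Sum of 1/degree over the common neighbors of u and v."""
    return sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])


def shortest_path_length(adj: Graph, source: Hashable, target: Hashable) -> Optional[int]:
    """Breadth-first search hop count, or None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in adj[node]:
            if nbr == target:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


# Hypothetical undirected forum graph: learners linked by past interactions.
adj: Graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}

if __name__ == "__main__":
    print(resource_allocation_index(adj, "a", "b"))  # 1/2 + 1/3
    print(shortest_path_length(adj, "a", "e"))
```

In a link prediction setting, features like these would be recomputed for each candidate pair in every time period and fed to a classifier.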

preprint, 2022, arXiv

Can we Generalize and Distribute Private Representation Learning?

We study the problem of learning representations that are private yet informative, i.e., provide information about intended "ally" targets while hiding sensitive "adversary" attributes. We propose Exclusion-Inclusion Generative Adversarial Network (EIGAN), a generalized private representation learning (PRL) architecture that, unlike existing PRL solutions, accounts for multiple ally and adversary attributes. While a centrally aggregated dataset is a prerequisite for most PRL techniques, data in the real world is often siloed across multiple distributed nodes that are unwilling to share the raw data because of privacy concerns. We address this practical constraint by developing D-EIGAN, the first distributed PRL method that learns representations at each node without transmitting the source data. We theoretically analyze the behavior of adversaries under the optimal EIGAN and D-EIGAN encoders and the impact of dependencies among ally and adversary tasks on the optimization objective. Our experiments on various datasets demonstrate the advantages of EIGAN in terms of performance, robustness, and scalability. In particular, EIGAN outperforms the previous state-of-the-art by a significant accuracy margin (47% improvement), and D-EIGAN's performance is consistently on par with EIGAN under different network settings.

preprint, 2022, arXiv

Dynamic Coupling Strategy for Interdependent Network Systems Against Cascading Failures

Cascading failures are a common phenomenon in complex networked systems where failures at only a few nodes may trigger a process of sequential failure. We applied a flow redistribution model to investigate the robustness against cascading failures in modern systems carrying flows/loads (e.g., power grids and transportation systems) that contain multiple interdependent networks. In such a system, the coupling coefficients between networks, which determine how much flow/load is redistributed between networks, are a key factor determining the robustness to cascading failures. We derive recursive expressions to characterize the evolution of such a system under dynamic network coupling. Using these expressions, we enhance the robustness of interdependent network systems by dynamically adjusting the coupling coefficients based on current system situations, minimizing the subsequent failures. The analytical and simulation results show a significant improvement in robustness compared to prior work, which considers only fixed coupling coefficients. Our proposed Step-wise Optimization (SWO) method not only shows good performance against cascading failures, but also offers better computational complexity, scalability to multiple networks, and flexibility to different attack types. We show in simulation that SWO provides robustness against cascading failures for multiple different network topologies.

preprint, 2022, arXiv

Faithful Explanations for Deep Graph Models

This paper studies faithful explanations for Graph Neural Networks (GNNs). First, we provide a new and general method for formally characterizing the faithfulness of explanations for GNNs. It applies to existing explanation methods, including feature attributions and subgraph explanations. Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the nonlinear effect of edge features, while existing subgraph explanation methods are not faithful. Third, we introduce k-hop Explanation with a Convolutional Core (KEC), a new explanation method that provably maximizes faithfulness to the original GNN by leveraging information about the graph structure in its adjacency matrix and its k-th power. Lastly, our empirical results over both synthetic and real-world datasets for classification and anomaly detection tasks with GNNs demonstrate the effectiveness of our approach.
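The abstract mentions the adjacency matrix and its k-th power. As background only, the sketch below shows the standard fact that entry (i, j) of the k-th power of an adjacency matrix counts the walks of length k from node i to node j, which is why the power captures k-hop structure. It is not an implementation of KEC; the helper names and the path graph are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square integer matrices of the same size."""
    n = len(a)
    return [
        [sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
        for i in range(n)
    ]


def adjacency_power(adj: Matrix, k: int) -> Matrix:
    """Return adj raised to the k-th power (k >= 0) by repeated multiplication."""
    n = len(adj)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, adj)
    return result


# Path graph 0 - 1 - 2 - 3 (undirected, so the matrix is symmetric).
PATH: Matrix = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

if __name__ == "__main__":
    for row in adjacency_power(PATH, 2):
        print(row)
```

In the squared matrix, node 0 reaches node 2 by exactly one two-step walk, while node 3 only becomes reachable from node 0 at the third power.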

preprint, 2022, arXiv

Hierarchical Conversational Preference Elicitation with Bandit Feedback

Recent advances in conversational recommendation provide a promising way to efficiently elicit users' preferences via conversational interactions. To achieve this, the recommender system conducts conversations with users, asking about their preferences for different items or item categories. Most existing conversational recommender systems for cold-start users utilize a multi-armed bandit framework to learn users' preferences in an online manner. However, they rely on a pre-defined conversation frequency for asking about item categories instead of individual items, which may incur excessive conversational interactions that hurt user experience. To enable more flexible questioning about key-terms, we formulate a new conversational bandit problem that allows the recommender system to choose either a key-term or an item to recommend at each round and explicitly models the rewards of these actions. This motivates us to handle a new exploration-exploitation (EE) trade-off between key-term asking and item recommendation, which requires us to accurately model the relationship between key-term and item rewards. We conduct a survey and analyze a real-world dataset to find that, unlike assumptions made in prior works, key-term rewards are mainly affected by rewards of representative items. We propose two bandit algorithms, Hier-UCB and Hier-LinUCB, that leverage this observed relationship and the hierarchical structure between key-terms and items to efficiently learn which items to recommend. We theoretically prove that our algorithms can reduce the regret bound's dependency on the total number of items from previous work. We validate our proposed algorithms and regret bound on both synthetic and real-world data.

preprint, 2022, arXiv

Online Competitive Influence Maximization

Online influence maximization has attracted much attention as a way to maximize influence spread through a social network while learning the values of unknown network parameters. Most previous works focus on single-item diffusion. In this paper, we introduce a new Online Competitive Influence Maximization (OCIM) problem, where two competing items (e.g., products, news stories) propagate in the same network and influence probabilities on edges are unknown. We adopt a combinatorial multi-armed bandit (CMAB) framework for OCIM, but unlike the non-competitive setting, the important monotonicity property (influence spread increases when influence probabilities on edges increase) no longer holds due to the competitive nature of propagation, which brings a significant new challenge to the problem. We provide a nontrivial proof showing that the Triggering Probability Modulated (TPM) condition for CMAB still holds in OCIM, which is instrumental for our proposed algorithms OCIM-TS and OCIM-OFU to achieve sublinear Bayesian and frequentist regret, respectively. We also design an OCIM-ETC algorithm that requires less feedback and easier offline computation, at the expense of a worse frequentist regret bound. Experimental evaluations demonstrate the effectiveness of our algorithms.

preprint, 2021, arXiv

Reconstructing Actions To Explain Deep Reinforcement Learning

Feature attribution has been a foundational building block for explaining the input feature importance in supervised learning with Deep Neural Networks (DNNs), but faces new challenges when applied to deep Reinforcement Learning (RL). We propose a new approach to explaining deep RL actions by defining a class of action reconstruction functions that mimic the behavior of a network in deep RL. This approach allows us to answer more complex explainability questions than direct application of DNN attribution methods, which we adapt to behavior-level attributions in building our action reconstructions. It also allows us to define agreement, a metric for quantitatively evaluating the explainability of our methods. Our experiments on a variety of Atari games suggest that perturbation-based attribution methods are significantly more suitable in reconstructing actions to explain the deep RL agent than alternative attribution methods, and show greater agreement than existing explainability work utilizing attention. We further show that action reconstruction allows us to demonstrate how a deep agent learns to play the Pac-Man game.

preprint, 2021, arXiv

Towards Flexible Device Participation in Federated Learning

Traditional federated learning algorithms impose strict requirements on the participation rates of devices, which limit the potential reach of federated learning. This paper extends the current learning paradigm to include devices that may become inactive, compute incomplete updates, and depart or arrive in the middle of training. We derive analytical results to illustrate how allowing more flexible device participation can affect the learning convergence when data is not independently and identically distributed (non-IID). We then propose a new federated aggregation scheme that converges even when devices may be inactive or return incomplete updates. We also study how the learning process can adapt to early departures or late arrivals, and analyze their impacts on the convergence.
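For readers unfamiliar with the setting, the sketch below shows one naive way a server could aggregate a round in which some devices are inactive: average only the updates that arrived, weighted by each device's data size. This is a baseline for illustration, not the aggregation scheme proposed in the paper, whose details the abstract does not give. All names, and the use of `None` to mark an inactive device, are assumptions.

```python
from typing import Dict, List, Optional


def aggregate_active(
    updates: Dict[str, Optional[List[float]]],
    weights: Dict[str, float],
) -> Optional[List[float]]:
    """Weighted average of the updates from devices that responded.

    A value of None marks a device that was inactive this round (hypothetical
    convention). Weights are renormalized over the active devices only.
    Returns None if no device responded.
    """
    active = {dev: upd for dev, upd in updates.items() if upd is not None}
    if not active:
        return None
    total_weight = sum(weights[dev] for dev in active)
    dim = len(next(iter(active.values())))
    return [
        sum(weights[dev] * upd[j] for dev, upd in active.items()) / total_weight
        for j in range(dim)
    ]


if __name__ == "__main__":
    round_updates = {"a": [1.0, 2.0], "b": None, "c": [3.0, 4.0]}
    data_sizes = {"a": 1.0, "b": 5.0, "c": 3.0}
    print(aggregate_active(round_updates, data_sizes))
```

Note that dropping device `b` shifts the average toward the data held by `a` and `c`. With non-IID data this kind of shift is one reason flexible participation can affect convergence, which is the question the abstract says the paper analyzes.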

preprint, 2020, arXiv

Machine Learning on Volatile Instances

Due to the massive size of the neural network models and training datasets used in machine learning today, it is imperative to distribute stochastic gradient descent (SGD) by splitting up tasks such as gradient evaluation across multiple worker nodes. However, running distributed SGD can be prohibitively expensive because it may require specialized computing resources such as GPUs for extended periods of time. We propose cost-effective strategies to exploit volatile cloud instances that are cheaper than standard instances, but may be interrupted by higher priority workloads. To the best of our knowledge, this work is the first to quantify how variations in the number of active worker nodes (as a result of preemption) affect SGD convergence and the time to train the model. By understanding these trade-offs between preemption probability of the instances, accuracy, and training time, we are able to derive practical strategies for configuring distributed SGD jobs on volatile instances such as Amazon EC2 spot instances and other preemptible cloud instances. Experimental results show that our strategies achieve good training performance at substantially lower cost.

preprint, 2020, arXiv

Paid Prioritization with Content Competition

We study the effects of allowing paid prioritization arrangements in a market with content provider (CP) competition. We consider competing CPs who pay prioritization fees to a monopolistic ISP so as to offset the ISP's cost for investing in infrastructure to support fast lanes. Unlike prior works, our proposed model of users' content consumption accounts for multi-purchasing (i.e., users simultaneously subscribing to more than one CP). This model allows us to account for the "attention" received by each CP, and consequently to draw a contrast between how subscription-revenues and ad-revenues are impacted by paid prioritization. We show that there exist incentives for the ISP to build additional fast lanes subsidized by CPs with sufficiently high revenue (from either subscription fees or advertisements). We show that non-prioritized content providers need not lose users, yet may lose revenue from advertisements due to decreased attention from users. We further show that users will consume a wider variety of content in a prioritized regime, and that they can attain higher welfare provided that non-prioritized traffic is not throttled. We discuss some policy and practical implications of these findings and numerically validate them.

preprint, 2020, arXiv

PayPlace: Secure and Flexible Operator-Mediated Payments in Blockchain Marketplaces at Scale

Decentralized marketplace applications demand fast, cheap and easy-to-use cryptocurrency payment mechanisms to facilitate high transaction volumes. The standard solution for off-chain payments, state channels, is optimized for frequent transactions between two entities and imposes prohibitive liquidity and capital requirements on payment senders for marketplace transactions. We propose PayPlace, a scalable off-chain protocol for payments between consumers and sellers. Using PayPlace, consumers establish a virtual unidirectional payment channel with an intermediary operator to pay for their transactions. Unlike state channels, however, the PayPlace operator can reference the custodial funds accrued off-chain in these channels to in turn make tamper-proof off-chain payments to merchants, without locking up corresponding capital in channels with merchants. Our design ensures that new payments made to merchants are guaranteed to be safe once notarized, provably mitigates well-known drawbacks in previous constructions such as the data availability attack, and ensures that neither consumers nor merchants need to be online to ensure continued safety of their notarized funds. We show that the on-chain monetary and computational costs of PayPlace are O(1) in the number of payment transactions processed, and are near-constant in other parameters in most scenarios. PayPlace can hence scale the payment throughput for large-scale marketplaces at no marginal cost and is orders of magnitude cheaper than the state-of-the-art solution for non-pairwise off-chain payments, Zero Knowledge Rollups.

preprint, 2014, arXiv

Offering Supplementary Network Technologies: Adoption Behavior and Offloading Benefits

To alleviate the congestion caused by rapid growth in demand for mobile data, wireless service providers (WSPs) have begun encouraging users to offload some of their traffic onto supplementary network technologies, e.g., offloading from 3G or 4G to WiFi or femtocells. With the growing popularity of such offerings, a deeper understanding of the underlying economic principles and their impact on technology adoption is necessary. To this end, we develop a model for user adoption of a base technology (e.g., 3G) and a bundle of the base plus a supplementary technology (e.g., 3G + WiFi). Users individually make their adoption decisions based on several factors, including the technologies' intrinsic qualities, negative congestion externalities from other subscribers, and the flat access rates that a WSP charges. We then show how these user-level decisions translate into aggregate adoption dynamics and prove that these converge to a unique equilibrium for a given set of exogenously determined system parameters. We fully characterize these equilibria and study adoption behaviors of interest to a WSP. We then derive analytical expressions for the revenue-maximizing prices and optimal coverage factor for the supplementary technology and examine some resulting non-intuitive user adoption behaviors. Finally, we develop a mobile app to collect empirical 3G/WiFi usage data and numerically investigate the profit-maximizing adoption levels when a WSP accounts for its cost of deploying the supplemental technology and savings from offloading traffic onto this technology.

preprint, 2013, arXiv

A Survey of Smart Data Pricing: Past Proposals, Current Plans, and Future Trends

Traditionally, network operators have used simple flat-rate broadband data plans for both wired and wireless network access. But today, with the popularity of mobile devices and exponential growth of apps, videos, and clouds, service providers are gradually moving towards more sophisticated pricing schemes. This decade will therefore likely witness a major change in the ways in which network resources are managed, and the role of economics in allocating these resources. This survey reviews some of the well-known past broadband pricing proposals (both static and dynamic), including their current realizations in various consumer data plans around the world, and discusses several research problems and open questions. By exploring the benefits and challenges of pricing data, this paper attempts to facilitate both the industrial and the academic communities' efforts in understanding the existing literature, recognizing new trends, and shaping an appropriate and timely research agenda.

preprint, 2013, arXiv

Mind Your Own Bandwidth: An Edge Solution to Peak-hour Broadband Congestion

Motivated by recent increases in network traffic, we propose a decentralized network edge-based solution to peak-hour broadband congestion that incentivizes users to moderate their bandwidth demands to their actual needs. Our solution is centered on smart home gateways that allocate bandwidth in a two-level hierarchy: first, a gateway purchases guaranteed bandwidth from the Internet Service Provider (ISP) with virtual credits. It then self-limits its bandwidth usage and distributes the bandwidth among its apps and devices according to their relative priorities. To this end, we design a credit allocation and redistribution mechanism for the first level, and implement our gateways on commodity wireless routers for the second level. We demonstrate our system's effectiveness and practicality with theoretical analysis, simulations and experiments on real traffic. Compared to a baseline equal sharing algorithm, our solution significantly improves users' overall satisfaction and yields a fair allocation of bandwidth across users.
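The second level of the hierarchy described in this abstract, a gateway distributing its bandwidth among apps and devices by relative priority, can be pictured as a proportional share. The sketch below is a simplified reading under that assumption. It ignores per-device demand caps and the credit mechanism of the first level, and the names and units are hypothetical.

```python
from typing import Dict


def allocate_by_priority(
    total_bandwidth: float, priorities: Dict[str, float]
) -> Dict[str, float]:
    """Split total_bandwidth among devices in proportion to their priorities.

    Assumes every device can use its full share; a real gateway would also
    cap each share at the device's actual demand and redistribute the rest.
    """
    total_priority = sum(priorities.values())
    if total_priority <= 0:
        raise ValueError("at least one device needs a positive priority")
    return {
        device: total_bandwidth * priority / total_priority
        for device, priority in priorities.items()
    }


if __name__ == "__main__":
    # Hypothetical household: 100 Mbps purchased from the ISP this period.
    shares = allocate_by_priority(100.0, {"tv": 3.0, "laptop": 1.0, "phone": 1.0})
    print(shares)
```

Under these priorities the TV receives three fifths of the purchased bandwidth and the other two devices one fifth each.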

preprint, 2013, arXiv

Topology of Classical Molecular Optimal Control Landscapes in Phase Space

Optimal control of molecular dynamics is commonly expressed from a quantum mechanical perspective. However, in most contexts the preponderance of molecular dynamics studies utilize classical mechanical models. This paper treats laser-driven optimal control of molecular dynamics in a classical framework. We consider the objective of steering a molecular system from an initial point in phase space to a target point, subject to the dynamic constraint of Hamilton's equations. The classical control landscape corresponding to this objective is a functional of the control field, and the topology of the landscape is analyzed through its gradient and Hessian with respect to the control. Under specific assumptions on the regularity of the control fields, the classical control landscape is found to be free of traps that could hinder reaching the objective. The Hessian associated with an optimal control field is shown to have finite rank, indicating the presence of an inherent degree of robustness to control noise. Extensive numerical simulations are performed to illustrate the theoretical principles on a) a model diatomic molecule, b) two coupled Morse oscillators, and c) a chaotic system with a coupled quartic oscillator, confirming the absence of traps in the classical control landscape. We compare the classical formulation with the mathematically analogous state-to-state transition probability control landscape of N-level quantum systems. The absence of traps in both circumstances provides a broader basis to understand the growing number of successful control experiments with complex molecules, which can have dynamics that transcend the classical and quantum regimes.

preprint, 2012, arXiv

Mathematical Frameworks for Pricing in the Cloud: Revenue, Fairness, and Resource Allocations

As more and more users begin to use the cloud for their computing needs, datacenter operators are increasingly pressed to effectively allocate their resources among these client users. Yet while much work has been done in this area, relatively little attention has been paid to studying perhaps the ultimate lever of resource allocation: pricing. Most data centers today charge users by "bundling" heterogeneous resources together in a fixed ratio and selling these bundles to their clients. But bundling masks the fact that different users require different combinations of resources (e.g., CPUs, memory, bandwidth) to process their jobs. The presence of multiple resources in fact allows an operator to offer many different types of pricing strategies, which may have different effects on its revenue. Moreover, to avoid user dissatisfaction, operators must consider the impact of their chosen prices on the fairness of the jobs processed for different users. In this paper, we develop an analytical framework that accounts for the fairness and revenue tradeoffs that arise in a datacenter's multi-resource setting and the impact that different pricing plans can have on this tradeoff. We characterize the implications of different pricing plans on various fairness metrics and derive analytical limits on the operator's fairness-revenue tradeoff. We then provide an algorithm to navigate this tradeoff and compare the tradeoff points for different pricing strategies on a data trace taken from a Google cluster.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Source provenance

Where this author record came from

arxiv, confidence 95%

external id: arxiv:2605.07171:author:2:carlee-joe-wong

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2605.12798:author:6:carlee-joe-wong

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2605.14304:author:2:carlee-joe-wong

Imported May 20, 2026. Synced May 21, 2026.

arxiv, confidence 95%

external id: arxiv:2605.19035:author:8:carlee-joe-wong

Imported May 20, 2026. Synced May 20, 2026.

3 works

Mung Chiang

Researcher

Mung Chiang contributes to research discovery and scholarly infrastructure.

3 works

Sangtae Ha

Researcher

Sangtae Ha contributes to research discovery and scholarly infrastructure.

3 works

Soumya Sen

Researcher

Soumya Sen contributes to research discovery and scholarly infrastructure.

2 works

Anupam Datta

Researcher

Anupam Datta contributes to research discovery and scholarly infrastructure.
