Source author record

Shanshan Wu

Shanshan Wu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; cond-mat.mtrl-sci; Information Theory; math.IT; Biological Physics; Biomolecules; Computer Vision; cond-mat.mes-hall; Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing; math.OC

Catalog footprint

What is connected

7 works

11 topics

4 close collaborators

preprint · 2022 · arXiv

Federated Reconstruction: Partially Local Federated Learning

Personalization methods in federated learning aim to balance the benefits of federated and local training for data availability, communication cost, and robustness to client heterogeneity. Approaches that require clients to communicate all model parameters can be undesirable due to privacy and communication constraints. Other approaches require always-available or stateful clients, impractical in large-scale cross-device settings. We introduce Federated Reconstruction, the first model-agnostic framework for partially local federated learning suitable for training and inference at scale. We motivate the framework via a connection to model-agnostic meta learning, empirically demonstrate its performance over existing approaches for collaborative filtering and next word prediction, and release an open-source library for evaluating approaches in this setting. We also describe the successful deployment of this approach at scale for federated collaborative filtering in a mobile keyboard application.

preprint · 2022 · arXiv

Implicit Regularization and Convergence for Weight Normalization

Normalization methods such as batch [Ioffe and Szegedy, 2015], weight [Salimans and Kingma, 2016], instance [Ulyanov et al., 2016], and layer normalization [Ba et al., 2016] have been widely used in modern machine learning. Here, we study the weight normalization (WN) method [Salimans and Kingma, 2016] and a variant called reparametrized projected gradient descent (rPGD) for overparametrized least-squares regression. WN and rPGD reparametrize the weights with a scale g and a unit vector w, and thus the objective function becomes non-convex. We show that this non-convex formulation has beneficial regularization effects compared to gradient descent on the original objective. These methods adaptively regularize the weights and converge close to the minimum l2 norm solution, even for initializations far from zero. For certain stepsizes of g and w, we show that they can converge close to the minimum norm solution. This is different from the behavior of gradient descent, which converges to the minimum norm solution only when started at a point in the range space of the feature matrix, and is thus more sensitive to initialization.
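
As a rough illustration of the initialization sensitivity described in the abstract above, the following minimal sketch runs plain gradient descent (not WN or rPGD) on a least-squares problem with one equation and two unknowns. The function name `gradient_descent`, the data, and the step size are illustrative assumptions and are not taken from the paper.

```python
def gradient_descent(x_row, y, w0, step=0.1, iters=500):
    """Plain gradient descent on 0.5 * (x_row . w - y)**2 for one data row."""
    w = list(w0)
    for _ in range(iters):
        # Residual of the single equation x_row . w = y at the current iterate.
        residual = sum(xi * wi for xi, wi in zip(x_row, w)) - y
        # The gradient is residual * x_row, so iterates only move along x_row.
        w = [wi - step * residual * xi for xi, wi in zip(x_row, w)]
    return w


# Hypothetical toy problem: one equation, two unknowns (overparametrized).
x_row = [1.0, 2.0]
y = 5.0

# Minimum-norm interpolating solution: x_row * y / ||x_row||^2.
sq_norm = sum(v * v for v in x_row)
min_norm = [v * y / sq_norm for v in x_row]

# Start at zero, which lies in the span of the data row.
w_from_zero = gradient_descent(x_row, y, [0.0, 0.0])

# Start at a point orthogonal to the data row.
w_from_offset = gradient_descent(x_row, y, [2.0, -1.0])

print(min_norm, w_from_zero, w_from_offset)
```

Started at zero, the iterates approach the minimum-norm solution [1, 2]. Started at [2, -1], which is orthogonal to the data row, they approach [3, 1]: this also fits the data exactly but has a larger norm, because gradient descent never removes the component of the starting point that lies outside the span of the data.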

preprint · 2020 · arXiv

A novel mechanism for energy activation in biomolecules

An activated process consists of energy activation and barrier crossing; the former is a prerequisite for the latter. Barrier crossing has been studied extensively, but energy activation has been overlooked due to a lack of means to gauge its progress. We define reaction stability as the probability that reactive trajectories pass a vicinity in phase space; it enabled us to analyze energy activation of a biomolecular isomerization. This process follows a mechanism fundamentally different from presumed mechanisms in standard reaction rate theories: it features accumulation of high kinetic energy in reaction coordinates, achieved by precise synergy between them coordinated by momentum space.

preprint · 2016 · arXiv

Single Pass PCA of Matrix Products

In this paper we present a new algorithm for computing a low-rank approximation of the product $A^TB$ in a single pass over the two matrices $A$ and $B$. The straightforward way to do this is to (a) first sketch $A$ and $B$ individually, and then (b) find the top components using PCA on the sketch. Our algorithm, in contrast, retains additional summary information about $A$ and $B$ (e.g., row and column norms) and uses this additional information to obtain an improved approximation from the sketches. Our main analytical result establishes a spectral norm guarantee comparable to that of existing two-pass methods; in addition, we provide results from an Apache Spark implementation that show better computational and statistical performance on real-world and synthetic evaluation datasets.
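
The "straightforward" baseline in step (a) above might be sketched as follows, assuming a shared Gaussian random projection is applied to both matrices as their rows stream by. This only illustrates one-pass sketching and is not the paper's algorithm: it omits the norm-based summary information and the PCA step. The names `one_pass_sketches` and `transpose_product`, the sketch size `k`, and the data are hypothetical.

```python
import random


def one_pass_sketches(rows_a, rows_b, k, seed=0):
    """Accumulate S @ A and S @ B while reading each row pair exactly once.

    S is a k-by-n Gaussian matrix whose columns are generated on the fly,
    so neither A, B, nor S has to be stored in full.
    """
    rng = random.Random(seed)
    sketch_a, sketch_b = None, None
    for a_row, b_row in zip(rows_a, rows_b):
        if sketch_a is None:
            sketch_a = [[0.0] * len(a_row) for _ in range(k)]
            sketch_b = [[0.0] * len(b_row) for _ in range(k)]
        # One column of S, scaled so that the expectation of S^T S is the identity.
        s_col = [rng.gauss(0.0, 1.0) / k ** 0.5 for _ in range(k)]
        for j, s in enumerate(s_col):
            for c, a in enumerate(a_row):
                sketch_a[j][c] += s * a
            for c, b in enumerate(b_row):
                sketch_b[j][c] += s * b
    return sketch_a, sketch_b


def transpose_product(m1, m2):
    """Return m1^T @ m2 for two matrices with the same number of rows."""
    return [
        [sum(r1[i] * r2[j] for r1, r2 in zip(m1, m2)) for j in range(len(m2[0]))]
        for i in range(len(m1[0]))
    ]


# Hypothetical data: A is 4x2 and B is 4x3, so A^T B is 2x3.
rows_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
rows_b = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 2.0, 2.0]]

sketch_a, sketch_b = one_pass_sketches(rows_a, rows_b, k=8)
approx = transpose_product(sketch_a, sketch_b)  # estimate of A^T B
exact = transpose_product(rows_a, rows_b)       # for comparison only
```

Because both sketches use the same projection, `approx` is an unbiased but noisy estimate of `exact`. On its own this step gives no quality guarantee for small `k`, which is the gap the abstract says the additional summary information is meant to close.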

preprint · 2013 · arXiv

A First-Principles Study of CdSe Nanoclusters Capped by Thiol Ligands

A first-principles study of small CdnSen quantum dots (QDs) (n = 6, 13, and 33) has been performed for the study of QD-sensitized solar cells. We assessed the effects of the passivating thiol-radical ligands on the optimized structure, the energy gap, and the absorption spectrum. The simplest thiol, methanethiol, and four other thiol-type ligands, namely cysteine (Cys), mercaptopropionic acid (MPA), and their reduced-chain analogues, were investigated. We have come to the following conclusions. (a) Thiol-radical ligands had greater effects on the structure and electronic properties of the CdSe QDs than thiol ligands alone. (b) The sulfur 3p orbitals were localized as the midgap states for the thiol-radical-ligated complex, which altered the absorption spectrum of bare Cd6Se6 by inducing a new lower-energy absorption peak at 2.37 eV. (c) The thiol-radical-ligated complex was also found to be sensitive to the position and number of ligands. (d) Both the amine group on Cys and the carboxyl group on Cys and MPA showed a strong tendency to bond with the neighboring Cd atom, especially when the length of the ligand was reduced. This formation of Cd-N and Cd-O bonds resulted in smaller HOMO-LUMO gaps and stronger binding between the ligands and the surface atoms of CdSe nanoclusters.

preprint · 2013 · arXiv

A First-Principles Study of Thiol Ligated CdSe Nanoclusters

A first-principles study of small CdnSen quantum dots (QDs) (n = 6, 12, 13, and 33) has been performed for application to QD solar cell development. We separately assess the effects of the particle size and the passivating ligands upon the optimized structure and the energy gap (from a density functional theory (DFT) calculation) and the corresponding absorption spectrum (from a time-dependent density functional theory (TDDFT) calculation). The structures of four thiol ligands, namely cysteine (Cys), mercaptopropionic acid (MPA), and their reduced-chain analogues, are investigated. We have documented significant passivation effects of the surfactants upon the structure and the optical absorption properties of the CdSe quantum dots: the surface Cd-Se bonds are weakened, whereas the core bonds are strengthened. A blue shift of the absorption spectrum by ~0.2 eV is observed. Also, the optical absorption intensity is enhanced by the passivation. By contrast, we have observed that varying the length of ligands yields only a minor effect upon the absorption properties: a shorter alkane chain might induce a slightly stronger interaction between the -NH2 group and the nearest surface Se atom, which is observed as a stronger ligand binding energy. For Cd12Se12, which is regarded as the 'non-magic' size QD, neither the self-relaxation nor the ligand passivation could fully stabilize the structure or improve the poor electronic properties. We also observe that the category of thiol ligands possesses a better ability to open the band gap of CdSe QDs than either phosphine oxide or amine ligands. Our estimate of the absorption peak of the Cys-capped QDs ranges from 413 nm to 460 nm, which is consistent with the experimental peak at 422 nm.

preprint · 2012 · arXiv

Information-Theoretic Study on Routing Path Selection in Two-Way Relay Networks

Two-way relaying is a promising technique to improve network throughput. However, how to apply it to a wireless network remains an unresolved issue. Particularly, challenges lie in the joint design between the physical layer and the routing protocol. Applying an existing routing protocol to a two-way relay network can easily compromise the advantages of two-way relaying. Considering routing path selection and two-way relaying together can be formulated as a network optimization problem, but it is usually NP-hard. In this paper, we take a different approach to study routing path selection for two-way relay networks. Instead of solving the joint optimization problem, we study the fundamental characteristics of a routing path consisting of multihop two-way relaying nodes. Information theoretical analysis is carried out to derive bandwidth efficiency and energy efficiency of a routing path in a two-way relay network. Such analysis provides a framework of routing path selection by considering bandwidth efficiency, energy efficiency and latency subject to physical layer constraints such as the transmission rate, transmission power, path loss exponent, path length, and the number of relays. This framework provides insightful guidelines on routing protocol design of a two-way relay network. Our analytical framework and insights are illustrated by extensive numerical results.
