Researcher profile

Jian Du

Jian Du contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified) | Verification L1 | Unclaimed author
5 works
0 followers
8 topics
4 close collaborators

Published work

5 published items

preprint | 2022 | arXiv

DP-PSI: Private and Secure Set Intersection

One way to classify private set intersection (PSI) protocols for secure 2-party computation (S2PC) is by whether the intersection is (a) revealed to both parties or (b) hidden from both parties, with only the output of a function computed over the matched payload exposed. Both aim to provide cryptographic security while keeping each party's unmatched elements hidden from the other. They may, however, be insufficient to achieve security and privacy in one practical scenario: when the intersection itself is required and the information leaked through the function's output must be considered for legal, ethical, and competitive reasons. Two parties, such as an advertiser and an ads supplier, hold sets of users for PSI computation, for example, to reveal common users to the ads supplier in joint marketing applications. In addition to the security guarantees that standard PSI provides for unmatched elements, neither party is allowed to "single out" whether an element/user belongs to the other party or not, even though common users are required for joint advertising. This is a problem that none of the existing PSI techniques solves. In light of this shortcoming, we compose differential privacy (DP) and S2PC to provide the best of both worlds and propose differentially-private PSI (DP-PSI), a new privacy model that shares PSI's strong security protection while adhering to the recent formalization of the GDPR notion of "singling out", preventing such attacks by either party except with very low probability.

preprint | 2022 | arXiv

Dynamic Differential-Privacy Preserving SGD

The vanilla Differentially-Private Stochastic Gradient Descent (DP-SGD), including DP-Adam and other variants, ensures the privacy of training data by uniformly distributing privacy costs across training steps. Keeping the same gradient clipping threshold and noise power in each step, and hence the same per-step privacy cost, results in unstable updates and lower model accuracy than the non-DP counterpart. In this paper, we propose dynamic DP-SGD (along with dynamic DP-Adam, and others) to narrow this performance gap while maintaining privacy by dynamically adjusting clipping thresholds and noise powers under a total privacy budget constraint. Extensive experiments on a variety of deep learning tasks, including image classification, natural language processing, and federated learning, demonstrate that the proposed dynamic DP-SGD algorithm stabilizes updates and, as a result, significantly improves model accuracy in the strong privacy protection regime when compared to the vanilla DP-SGD. We also conduct theoretical analysis to better understand the privacy-utility trade-off with dynamic DP-SGD, as well as to learn why dynamic DP-SGD can outperform vanilla DP-SGD.
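
To make the clipping-and-noise mechanism in this abstract concrete, here is a minimal NumPy sketch of a single DP-SGD update, plus a step-dependent schedule for the clipping threshold. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify the schedule, so the exponential decay in `decayed` is hypothetical, as are all function and parameter names. No privacy accounting is performed, so the sketch makes no claim about the resulting privacy budget.

```python
import numpy as np


def clip_gradients(per_example_grads, clip_norm):
    """Scale each example's gradient so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return per_example_grads * scale


def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update: clip per-example gradients, sum, add Gaussian noise, average."""
    clipped = clip_gradients(per_example_grads, clip_norm)
    batch_size = per_example_grads.shape[0]
    # Noise standard deviation is proportional to the clipping threshold.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return params - lr * noisy_mean


def decayed(initial_value, rate, step):
    """Hypothetical exponential schedule; the paper's actual schedule may differ."""
    return initial_value * rate ** step


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = np.zeros(3)
    for step in range(5):
        grads = rng.normal(size=(8, 3))  # stand-in for real per-example gradients
        clip_norm = decayed(1.0, 0.9, step)  # threshold changes from step to step
        params = dp_sgd_step(params, grads, clip_norm, 1.1, 0.1, rng)
    print(params)
```

Vanilla DP-SGD corresponds to calling `dp_sgd_step` with the same `clip_norm` and `noise_multiplier` at every step; the dynamic variant described above varies them over time while keeping the total privacy cost within a fixed budget.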

preprint | 2020 | arXiv

Impact of JD Bernal Thoughts in the Science of Science upon China: Implications for Quantitative Studies of Science Today

John Desmond Bernal (1901-1970) was one of the most eminent scientists in molecular biology, and is also regarded as the founding father of the Science of Science. His book The Social Function of Science laid the theoretical foundations for the discipline. In this article, we summarize four chief characteristics of his ideas in the Science of Science: the socio-historical perspective, theoretical models, qualitative and quantitative approaches, and studies of science planning and policy. China has constantly reformed its scientific and technological system based on research evidence from the Science of Science. Therefore, we analyze the impact of Bernal's Science-of-Science thoughts on the development of the Science of Science in China, and discuss how they might usefully be taken still further in quantitative studies of science.

preprint | 2020 | arXiv

Paper-Patent Citation Linkages as Early Signs for Predicting Delayed Recognized Knowledge: Macro and Micro Evidence

In this study, we investigate the extent to which patent citations to papers can serve as early signs for predicting delayed recognized knowledge in science, using a comparative study with a control group, i.e., instant recognition papers. We identify the two opposite groups of papers by the Bcp measure, a parameter-free index for identifying papers which were recognized with delay. We provide macro (Science/Nature papers dataset) and micro (a case chosen from the dataset) evidence on paper-patent citation linkages as early signs for predicting delayed recognized knowledge in science. It appears that papers with delayed recognition show a stronger and longer technical impact than instant recognition papers. We provide indications that in more recent years papers with delayed recognition are awakened more often and earlier by a patent rather than by a scientific paper (also called a "prince"). We also found that patent citations seem to play an important role in keeping instant recognition papers from leveling off or becoming a so-called "flash in the pan", i.e., instant recognition. It also appears that the sleeping beauties may first encounter negative citations, then patent citations, and finally become widely recognized. In contrast to the two focused fields (biology and chemistry) for instant recognition papers, delayed recognition papers are rather evenly distributed across biology, chemistry, psychology, geology, materials science, and physics. We discovered several pairs of "science sleeping"-"technology [...]. We propose that further research discover potential ahead-of-time and transformative research by using citation delay analysis, patent & NPL analysis, and citation context analysis.

preprint | 2020 | arXiv

Variational Optimization for the Submodular Maximum Coverage Problem

We examine the submodular maximum coverage problem (SMCP), which is related to a wide range of applications. We provide the first variational approximation for this problem based on the Nemhauser divergence, and show that it can be solved efficiently using variational optimization. The algorithm alternates between two steps: (1) an E step that estimates a variational parameter to maximize a parameterized modular lower bound; and (2) an M step that updates the solution by solving the local approximate problem. We provide theoretical analysis of the performance of the proposed approach and its curvature-dependent approximation factor, and empirically evaluate it on a number of public data sets and several application tasks.
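
For readers unfamiliar with coverage objectives, the sketch below shows the classic greedy heuristic on the set-coverage special case with a cardinality budget `k`. This is a different, standard technique and not the variational E/M procedure described in the abstract; it is included only to illustrate what maximizing a coverage function means. The function name and the example sets are hypothetical.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets, each time taking the one that covers the most new elements."""
    covered = set()
    chosen = []
    for _ in range(min(k, len(sets))):
        candidates = [i for i in range(len(sets)) if i not in chosen]
        # Marginal gain of a set is the number of elements it adds to the cover.
        best = max(candidates, key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


if __name__ == "__main__":
    example_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
    print(greedy_max_coverage(example_sets, 2))
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to achieve a (1 - 1/e) approximation, which makes it a common baseline for coverage problems.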