Researcher profile

Jessica Lin

Jessica Lin contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
10 topics
4 close collaborators

Published work

6 published items

Preprint | 2023 | arXiv

PMP: Privacy-Aware Matrix Profile against Sensitive Pattern Inference for Time Series

The recent rapid development of sensor technology has allowed massive fine-grained time series (TS) data to be collected and has set the foundation for the development of data-driven services and applications. During the process, data sharing is often involved to allow third-party modelers to perform specific time series data mining (TSDM) tasks based on the needs of the data owner. The high resolution of TS brings new challenges in protecting privacy. While meaningful information in high-resolution TS shifts from concrete point values to local shape-based segments, numerous studies have found that long shape-based patterns could contain more sensitive information and may potentially be extracted and misused by a malicious third party. However, the privacy issue for TS patterns is surprisingly seldom explored in the privacy-preserving literature. In this work, we consider a new privacy-preserving problem: preventing malicious inference on long shape-based patterns while preserving short segment information for the utility task performance. To address this challenge, we investigate an alternative approach: sharing the Matrix Profile (MP), which is a non-linear transformation of the original data and a versatile data structure that supports many data mining tasks. We found that while MP can prevent concrete shape leakage, the canonical correlation in the MP index can still reveal the location of a sensitive long pattern. Based on this observation, we design two attacks, named Location Attack and Entropy Attack, to extract the pattern location from MP. To further protect MP from these two attacks, we propose a Privacy-Aware Matrix Profile (PMP) that perturbs the local correlation and breaks the canonical correlation in the MP index vector. We evaluate our proposed PMP against baseline noise-adding methods through quantitative analysis and real-world case studies to show the effectiveness of the proposed method.
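For readers unfamiliar with the Matrix Profile mentioned in this abstract, here is a minimal brute-force sketch of the plain data structure. It is not the PMP method or the authors' code. The function names, the exclusion-zone width (m // 2) and the toy series are assumptions made for illustration only.

```python
import math

def znorm(seq):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]

def matrix_profile(ts, m):
    """Brute-force matrix profile, O(n^2 * m) time.

    Returns (profile, index): profile[i] is the z-normalized Euclidean
    distance from subsequence i to its nearest non-trivial neighbor,
    and index[i] is the start position of that neighbor.
    """
    n_sub = len(ts) - m + 1
    subs = [znorm(ts[i:i + m]) for i in range(n_sub)]
    excl = max(1, m // 2)  # assumed exclusion zone to skip trivial matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) < excl:
                continue  # overlapping neighbor, ignore
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i] = d
                index[i] = j
    return profile, index

# Toy series (hypothetical): the shape [0, 1, 3, 1] occurs at positions 0 and 9.
toy = [0, 1, 3, 1, 0, 0, 7, 2, 0, 0, 1, 3, 1, 0]
profile, index = matrix_profile(toy, 4)
print(index[0], index[9])  # the two occurrences point at each other
```

The index vector returned here is the part the abstract describes as leaking pattern locations: repeated shapes point at each other, so the index alone shows where they occur in the series.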

Preprint | 2022 | arXiv

Leveraging World Knowledge in Implicit Hate Speech Detection

While much attention has been paid to identifying explicit hate speech, implicit hateful expressions that are disguised in coded or indirect language are pervasive and remain a major challenge for existing hate speech detection systems. This paper presents the first attempt to apply Entity Linking (EL) techniques to both explicit and implicit hate speech detection, where we show that such real world knowledge about entity mentions in a text does help models better detect hate speech, and the benefit of adding it into the model is more pronounced when explicit entity triggers (e.g., rally, KKK) are present. We also discuss cases where real world knowledge does not add value to hate speech detection, which provides more insights into understanding and modeling the subtleties of hate speech.

Preprint | 2022 | arXiv

Symmetric cooperative motion in one dimension

We explore the relationship between recursive distributional equations and convergence results for finite difference schemes of parabolic partial differential equations (PDEs). We focus on a family of random processes called symmetric cooperative motions, which generalize the symmetric simple random walk and the symmetric hipster random walk introduced in [Addario-Berry, Cairns, Devroye, Kerriou and Mitchell, arXiv:1909.07367]. We obtain a distributional convergence result for symmetric cooperative motions and, along the way, obtain a novel proof of the Bernoulli central limit theorem. In addition, we prove a PDE result relating distributional solutions and viscosity solutions of the porous medium equation and the parabolic $p$-Laplace equation, respectively, in one dimension.

Preprint | 2020 | arXiv

Ensemble Grammar Induction For Detecting Anomalies in Time Series

Time series anomaly detection is an important task, with applications in a broad variety of domains. Many approaches have been proposed in recent years, but they often require that the length of the anomalies be known in advance and provided as an input parameter. This limits the practicality of the algorithms, as such information is often unknown in advance, or anomalies with different lengths might co-exist in the data. To address this limitation, a linear-time anomaly detection algorithm based on grammar induction was previously proposed. While the algorithm can find variable-length patterns, it still requires preselecting values for at least two parameters at the discretization step. How to choose these parameter values properly is still an open problem. In this paper, we introduce a grammar-induction-based anomaly detection method utilizing ensemble learning. Instead of using a particular choice of parameter values for anomaly detection, the method generates the final result based on a set of results obtained using different parameter values. We demonstrate that the proposed ensemble approach can outperform existing grammar-induction-based approaches with different criteria for selection of parameter values. We also show that the proposed approach can achieve performance similar to that of the state-of-the-art distance-based anomaly detection algorithm.

Preprint | 2020 | arXiv

Nanoparticle seeded glancing-angle deposition of tip-handle heterostructures for manipulation of individual nanoparticles

The controllable handling of an arbitrary single particle of matter with sub-100 nanometer (nm) dimensions is an essential but unsolved scientific challenge. We demonstrate nanoparticle-seeded glancing angle deposition using 10-100 nm diameter nanoparticle seeds (Er2O3, Fe@C, and Fe). The products are nanoparticle-nanowire heterostructures composed of arbitrary nanoscale tips attached to micron-length nanowire handles. Optical micromanipulation of the micron-scale handles enables concurrent manipulation of the attached nanoscale particles of matter.

Preprint | 2020 | arXiv

Semantic Discord: Finding Unusual Local Patterns for Time Series

Finding an anomalous subsequence in a long time series is a very important but difficult problem. Existing state-of-the-art methods have focused on searching for the subsequence that is the most dissimilar to the rest of the subsequences; however, they do not take into account the background patterns that contain the anomalous candidates. As a result, such approaches are likely to miss local anomalies. We introduce a new definition named semantic discord, which incorporates the context information from larger subsequences containing the anomaly candidates. We propose an efficient algorithm with a derived lower bound that is up to 3 orders of magnitude faster than the brute-force algorithm on real-world data. We demonstrate through extensive experiments that our method significantly outperforms the state-of-the-art methods in locating anomalies. We further explain the interpretability of semantic discord.