Source author record

Cheng-Yu Hsieh

Cheng-Yu Hsieh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Applications, Machine Learning, Artificial Intelligence, Data Structures and Algorithms

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators

Preprint, 2022, arXiv

A Survey on Programmatic Weak Supervision

Labeling training data has become one of the major roadblocks to using machine learning. Among various weak supervision paradigms, programmatic weak supervision (PWS) has achieved remarkable success in easing the manual labeling bottleneck by programmatically synthesizing training labels from multiple potentially noisy supervision sources. This paper presents a comprehensive survey of recent advances in PWS. In particular, we give a brief introduction of the PWS learning paradigm, and review representative approaches for each component within PWS's learning workflow. In addition, we discuss complementary learning paradigms for tackling limited labeled data scenarios and how these related approaches can be used in conjunction with PWS. Finally, we identify several critical challenges that remain under-explored in the area to hopefully inspire future research directions in the field.

Preprint, 2022, arXiv

Understanding Programmatic Weak Supervision via Source-aware Influence Function

Programmatic Weak Supervision (PWS) aggregates the source votes of multiple weak supervision sources into probabilistic training labels, which are in turn used to train an end model. With its increasing popularity, it is critical to have a tool for users to understand the influence of each component (e.g., the source vote or training data) in the pipeline and interpret the end model's behavior. To achieve this, we build on Influence Function (IF) and propose source-aware IF, which leverages the generation process of the probabilistic labels to decompose the end model's training objective and then calculate the influence associated with each (data, source, class) tuple. These primitive influence scores can then be used to estimate the influence of individual components of PWS, such as source vote, supervision source, and training data. On datasets of diverse domains, we demonstrate multiple use cases: (1) interpreting incorrect predictions from multiple angles, which reveals insights for debugging the PWS pipeline, (2) identifying mislabeling of sources with a gain of 9%-37% over baselines, and (3) improving the end model's generalization performance by removing harmful components in the training objective (13%-24% better than ordinary IF).
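
The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible label model, normalized vote counts, which is not necessarily the one used in the paper; the `ABSTAIN` convention, the function name `aggregate_votes`, and the toy vote matrix are all hypothetical.

```python
from collections import Counter

ABSTAIN = -1  # assumed convention: a source that casts no vote on an example


def aggregate_votes(votes, num_classes):
    """Turn one example's source votes into a probabilistic label by vote share."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    total = sum(counts.values())
    if total == 0:
        # No source voted: fall back to a uniform distribution (an assumption).
        return [1.0 / num_classes] * num_classes
    return [counts.get(c, 0) / total for c in range(num_classes)]


# Rows are examples, columns are three hypothetical supervision sources.
vote_matrix = [
    [1, 1, 0],
    [ABSTAIN, 0, 0],
    [ABSTAIN, ABSTAIN, ABSTAIN],
]
soft_labels = [aggregate_votes(row, num_classes=2) for row in vote_matrix]
```

Each entry of `soft_labels` is a distribution over classes that an end model could be trained on. Because every probability here is a sum of individual source votes, a per-(data, source, class) attribution of the kind the abstract mentions is at least plausible for this toy label model.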

Preprint, 2016, arXiv

A Linear-Time Algorithm for the Weighted Paired-Domination Problem on Block Graphs

In a graph $G = (V,E)$, a vertex subset $S\subseteq V(G)$ is said to be a dominating set of $G$ if every vertex not in $S$ is adjacent to a vertex in $S$. A dominating set $S$ of $G$ is called a paired-dominating set of $G$ if the induced subgraph $G[S]$ contains a perfect matching. In this paper, we propose an $O(n+m)$-time algorithm for the weighted paired-domination problem on block graphs using dynamic programming, which strengthens the results in [Theoret. Comput. Sci., 410(47--49):5063--5071, 2009] and [J. Comb. Optim., 19(4):457--470, 2010]. Moreover, the algorithm can be completed in $O(n)$ time if the block-cut-vertex structure of $G$ is given.
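
The two definitions in this abstract can be checked directly on small graphs. The sketch below is a brute-force verifier written from the definitions only; it is not the linear-time dynamic program the paper proposes, and the perfect-matching test is exponential in $|S|$. The adjacency-set representation and all function names are assumptions made for illustration.

```python
def is_dominating(adj, s):
    """Every vertex outside s has at least one neighbour inside s."""
    return all(v in s or adj[v] & s for v in adj)


def has_perfect_matching(adj, s):
    """Brute-force check that the induced subgraph G[s] has a perfect matching."""
    if not s:
        return True
    if len(s) % 2:
        return False
    v = next(iter(s))
    rest = s - {v}
    # Try pairing v with each of its neighbours that is still unmatched in s.
    return any(has_perfect_matching(adj, rest - {u}) for u in adj[v] & rest)


def is_paired_dominating(adj, s):
    """s is dominating and G[s] contains a perfect matching."""
    s = set(s)
    return is_dominating(adj, s) and has_perfect_matching(adj, s)


# The path 0-1-2-3, a block graph whose blocks are single edges.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

On this path, `{1, 2}` is a paired-dominating set, while `{0, 3}` dominates every vertex but induces no edge and therefore has no perfect matching.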