Researcher profile

Dominic Dotterrer

Dominic Dotterrer contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
5 topics
4 close collaborators

Published work

3 published item(s)

preprint | 2021 | arXiv

Long Document Summarization in a Low Resource Setting using Pretrained Language Models

Abstractive summarization is the task of compressing a long document into a coherent short document while retaining salient information. Modern abstractive summarization methods are based on deep neural networks which often require large training datasets. Since collecting summarization datasets is an expensive and time-consuming task, practical industrial settings are usually low-resource. In this paper, we study a challenging low-resource setting of summarizing long legal briefs with an average source document length of 4268 words and only 120 available (document, summary) pairs. To account for data scarcity, we used a modern pretrained abstractive summarizer BART (Lewis et al., 2020), which only achieves 17.9 ROUGE-L as it struggles with long documents. We thus attempt to compress these long documents by identifying salient sentences in the source which best ground the summary, using a novel algorithm based on GPT-2 (Radford et al., 2019) language model perplexity scores, that operates within the low resource regime. On feeding the compressed documents to BART, we observe a 6.0 ROUGE-L improvement. Our method also beats several competitive salience detection baselines. Furthermore, the identified salient sentences tend to agree with an independent human labeling by domain experts.

preprint | 2017 | arXiv

Quantitative null-cobordism

For a given null-cobordant Riemannian $n$-manifold, how does the minimal geometric complexity of a null-cobordism depend on the geometric complexity of the manifold? In [Gro99], Gromov conjectured that this dependence should be linear. We show that it is at most a polynomial whose degree depends on $n$. This construction relies on another of independent interest. Take $X$ and $Y$ to be sufficiently nice compact metric spaces, such as Riemannian manifolds or simplicial complexes. Suppose $Y$ is simply connected and rationally homotopy equivalent to a product of Eilenberg-MacLane spaces: for example, any simply connected Lie group. Then two homotopic $L$-Lipschitz maps $f, g : X \rightarrow Y$ are homotopic via a $CL$-Lipschitz homotopy. We present a counterexample to show that this is not true for larger classes of spaces $Y$.