Source author record

Michael Sollami

Michael Sollami appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computation and Language; Computer Vision; Formal Languages and Automata Theory; Combinatorics (math.CO)

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators

Preprint, 2022, arXiv

XFBoost: Improving Text Generation with Controllable Decoders

Multimodal conditionality in transformer-based natural language models has demonstrated state-of-the-art performance in the task of product description generation. Recent approaches condition a language model on one or more images and other textual metadata to achieve near-human performance for describing products from e-commerce stores. However, generated descriptions may be inaccurate to varying degrees or may even contradict the inputs for a given product. In this paper, we propose a controllable language generation framework called Extract-Finetune-Boost (XFBoost), which addresses the problem of inaccurate, low-quality inference. By using visual semantic attributes as constraints at the decoding stage of the generation process and finetuning the language model with policy gradient techniques, the XFBoost framework is found to produce significantly more descriptive text with higher image relevancy, outperforming baselines and lowering the frequency of factually inaccurate descriptions. We further demonstrate the application of XFBoost to online learning, wherein human-in-the-loop critics improve language models with active feedback.

Preprint, 2020, arXiv

GarmentGAN: Photo-realistic Adversarial Fashion Transfer

The garment transfer problem comprises two tasks: learning to separate a person's body (pose, shape, color) from their clothing (garment type, shape, style) and then generating new images of the wearer dressed in arbitrary garments. We present GarmentGAN, a new algorithm that performs image-based garment transfer through generative adversarial methods. The GarmentGAN framework allows users to virtually try on items before purchase and generalizes to various apparel types. GarmentGAN requires only two images as input: a picture of the target fashion item and an image containing the customer. The output is a synthetic image in which the customer is wearing the target apparel. To make the generated image look photo-realistic, we employ novel generative adversarial techniques. GarmentGAN improves on existing methods in the realism of generated imagery and solves various problems related to self-occlusions. Our proposed model incorporates additional information during training, utilizing both segmentation maps and body key-point information. We show qualitative and quantitative comparisons to several other networks to demonstrate the effectiveness of this technique.

Preprint, 2016, arXiv

An Improved Lower Bound for $n$-Brinkhuis $k$-Triples

Let $s_n$ be the number of words of length $n$ over the ternary alphabet $\{0, 1, 2\}$ such that no subword (or factor) is a square (a word concatenated with itself, e.g., $11$, $1212$, or $102102$). From computational evidence, $s_n$ grows exponentially at a rate of about $1.317277^n$. While known upper bounds are already relatively close to the conjectured rate, effective lower bounds are much more difficult to obtain. In this paper, we construct a $54$-Brinkhuis $952$-triple, which leads to an improved lower bound on the number of $n$-letter ternary squarefree words: $952^{n/53} \approx 1.1381531^n$.
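
To make the quantities in this abstract concrete, the following is a minimal sketch (not the paper's method) that counts ternary squarefree words of a given length by brute-force backtracking and evaluates the base of the stated lower bound, $952^{1/53}$. The function names `ends_with_square` and `count_squarefree` are illustrative assumptions, and the approach is only practical for small $n$.

```python
def ends_with_square(word):
    """Return True if some suffix of `word` is a square (a block repeated twice)."""
    n = len(word)
    for half in range(1, n // 2 + 1):
        # Compare the last `half` letters with the `half` letters before them.
        if word[n - 2 * half : n - half] == word[n - half :]:
            return True
    return False


def count_squarefree(n, alphabet="012"):
    """Count words of length n over `alphabet` containing no square factor.

    Words are grown one letter at a time. A squarefree word extended by one
    letter can only contain a new square as a suffix, so checking suffixes
    at each step is sufficient.
    """
    def extend(word):
        if len(word) == n:
            return 1
        total = 0
        for letter in alphabet:
            candidate = word + letter
            if not ends_with_square(candidate):
                total += extend(candidate)
        return total

    return extend("")


# Base of the lower bound quoted in the abstract: 952^(n/53) = (952^(1/53))^n.
lower_bound_base = 952 ** (1 / 53)

if __name__ == "__main__":
    for length in range(1, 11):
        print(length, count_squarefree(length))
    print(f"952^(1/53) = {lower_bound_base:.7f}")
```

Brute-force counts like these only illustrate the definition of $s_n$; the lower bound itself comes from the Brinkhuis-triple construction described in the abstract, which this sketch does not attempt to reproduce.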