Source author record

Yi-Ling Chen

Yi-Ling Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Computation and Language, Computer Vision, eess.AS, Machine Learning, Neurons and Cognition, Social and Information Networks

Catalog footprint

What is connected

3 works

7 topics

4 close collaborators

Preprint, 2022, arXiv

i-Code: An Integrative and Composable Multimodal Learning Framework

Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview. Most current pretraining methods, however, are limited to one or two modalities. We present i-Code, a self-supervised pretraining framework where users may flexibly combine the modalities of vision, speech, and language into unified and general-purpose vector representations. In this framework, data from each modality are first given to pretrained single-modality encoders. The encoder outputs are then integrated with a multimodal fusion network, which uses novel attention mechanisms and other architectural innovations to effectively combine information from the different modalities. The entire system is pretrained end-to-end with new objectives including masked modality unit modeling and cross-modality contrastive learning. Unlike previous research using only video for pretraining, the i-Code framework can dynamically process single, dual, and triple-modality data during training and inference, flexibly projecting different combinations of modalities into a single representation space. Experimental results demonstrate how i-Code can outperform state-of-the-art techniques on five video understanding tasks and the GLUE NLP benchmark, improving by as much as 11% and demonstrating the power of integrative multimodal pretraining.
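The abstract names cross-modality contrastive learning as one pretraining objective but does not give its exact form. As a minimal sketch only, the block below shows a generic symmetric InfoNCE-style loss between two batches of embeddings; the function name `contrastive_loss`, the `temperature` value, and the use of cosine similarity are assumptions for illustration and are not taken from the i-Code paper.

```python
import math

def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def _log_softmax_at(scores, k):
    """Numerically stable log-softmax of scores, evaluated at index k."""
    top = max(scores)
    return scores[k] - top - math.log(sum(math.exp(s - top) for s in scores))

def contrastive_loss(emb_a, emb_b, temperature=0.1):
    """Symmetric InfoNCE-style loss (assumed form, not the paper's definition).

    emb_a[i] and emb_b[i] are embeddings of the same sample in two modalities;
    every other pairing in the batch is treated as a negative.
    """
    a = [_normalize(v) for v in emb_a]
    b = [_normalize(v) for v in emb_b]
    n = len(a)
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]
    # Direction a -> b: each row should pick out its own column.
    a_to_b = -sum(_log_softmax_at(logits[i], i) for i in range(n)) / n
    # Direction b -> a: each column should pick out its own row.
    b_to_a = -sum(_log_softmax_at([logits[i][j] for i in range(n)], j)
                  for j in range(n)) / n
    return 0.5 * (a_to_b + b_to_a)

# Toy batch: two "vision" embeddings and two "speech" embeddings.
vision = [[1.0, 0.0], [0.0, 1.0]]
speech_aligned = [[0.9, 0.1], [0.1, 0.9]]
speech_swapped = [speech_aligned[1], speech_aligned[0]]
print(contrastive_loss(vision, speech_aligned))
print(contrastive_loss(vision, speech_swapped))
```

In this toy batch the correctly paired embeddings yield a much lower loss than the swapped ones, which is the property a contrastive objective is meant to reward.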

Preprint, 2021, arXiv

Ubiquitous proximity to a critical state for collective neural activity in the CA1 region of freely moving mice

Using miniscope recordings of calcium fluorescence signals in the CA1 region of the hippocampus of mice, we monitor the neural activity of hippocampal regions while the animals are freely moving in an open chamber. Using a data-driven statistical modeling approach, the statistical properties of the recorded data are mapped to spin-glass models with pairwise interactions. Considering the parameter space of the model, the observed system is generally near a critical state between two distinct phases. This proximity to criticality is found to be robust against different ways of sampling and segmenting the measured data. By independently altering the coupling distribution and the network structure of the statistical model, the network structure is found to be vital for maintaining the proximity to the critical state. We further find that the observed assignment of the coupling strengths makes the net coupling at each site more balanced, with only slight variation, which likely helps maintain the critical state. Network analysis of the connectivity obtained by thresholding the coupling strengths finds that the connectivity of the networks is well described by a random network model. These results are consistent across different experiments and across the sampling and segmentation choices in our analysis.
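To make the quantities in this abstract concrete, here is a small sketch of a pairwise spin model: the energy of one configuration, the net coupling at each site, and the edges obtained by thresholding coupling strengths. The sign convention, the choice of spins in {-1, +1}, thresholding on absolute value, and all function names are assumptions for illustration; the abstract does not specify them.

```python
def energy(spins, fields, couplings):
    """Energy of one configuration under an assumed pairwise model:
    E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j, with s_i in {-1, +1}."""
    n = len(spins)
    total = -sum(fields[i] * spins[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            total -= couplings[i][j] * spins[i] * spins[j]
    return total

def net_coupling(couplings):
    """Sum of couplings attached to each site (diagonal ignored)."""
    n = len(couplings)
    return [sum(couplings[i][j] for j in range(n) if j != i) for i in range(n)]

def threshold_edges(couplings, threshold):
    """Undirected edges (i, j), i < j, whose |J_ij| is at least the threshold."""
    n = len(couplings)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(couplings[i][j]) >= threshold]

# Hypothetical 3-neuron example with a symmetric coupling matrix.
J = [[0.0, 0.5, -0.2],
     [0.5, 0.0, 0.1],
     [-0.2, 0.1, 0.0]]
h = [0.1, 0.0, 0.0]
print(energy([1, 1, -1], h, J))
print(net_coupling(J))
print(threshold_edges(J, 0.15))
```

The thresholded edge list is the kind of object that a network analysis, such as comparison against a random network model, would then take as input.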

Preprint, 2011, arXiv

On Social-Temporal Group Query with Acquaintance Constraint

Three criteria are essential for activity planning: (1) finding a group of attendees familiar with the initiator, (2) ensuring that each attendee has tight social relations with most of the other members of the group, and (3) selecting an activity period in which all attendees are available. This paper therefore proposes the Social-Temporal Group Query, which finds the activity time and the attendees with the minimum total social distance to the initiator. The query also incorporates an acquaintance constraint to avoid returning a group of mutually unfamiliar attendees. Processing the social-temporal group query efficiently is very challenging. We prove that the problem is NP-hard and formulate it with Integer Programming. We then propose two efficient algorithms for finding the optimal solutions, SGSelect and STGSelect, which include effective pruning techniques and employ the idea of pivot time slots to substantially reduce the running time. Experimental results indicate that the proposed algorithms are much more efficient and scalable. In the comparison of solution quality, we show that STGSelect outperforms the algorithm that represents manual coordination by the initiator.
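The three criteria can be illustrated with an exhaustive baseline. The sketch below is not SGSelect or STGSelect: it has no pruning and no pivot time slots, and simply enumerates every candidate group. The parameter names (`group_size`, `max_strangers`, `num_slots`, `length`) and the data layout are hypothetical, and social distances to the initiator are assumed to be precomputed.

```python
from itertools import combinations

def brute_force_group_query(initiator, distance, friends, free_slots,
                            group_size, max_strangers, num_slots, length):
    """Exhaustively search for the cheapest feasible group and start slot.

    distance[v]   : social distance from the initiator to candidate v
    friends[v]    : set of people v is acquainted with
    free_slots[v] : set of integer time slots in which v is available
    Returns (total_distance, group, start_slot), or None if infeasible.
    """
    candidates = [v for v in distance if v != initiator]
    best = None
    for others in combinations(candidates, group_size - 1):
        group = (initiator,) + others
        # Acquaintance constraint: nobody may be unfamiliar with more
        # than max_strangers of the other group members.
        if any(sum(1 for u in group if u != v and u not in friends[v])
               > max_strangers for v in group):
            continue
        # Earliest run of `length` consecutive slots free for everyone.
        common = set.intersection(*(free_slots[v] for v in group))
        start = next((t for t in range(num_slots - length + 1)
                      if all(t + d in common for d in range(length))), None)
        if start is None:
            continue
        total = sum(distance[v] for v in others)
        if best is None or total < best[0]:
            best = (total, group, start)
    return best

# Hypothetical four-person example; "a" is the initiator.
friends = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
distance = {"b": 1, "c": 2, "d": 1}
free_slots = {"a": {0, 1, 2, 3}, "b": {1, 2, 3}, "c": {1, 2}, "d": {1, 2, 3}}
print(brute_force_group_query("a", distance, friends, free_slots, 3, 0, 4, 2))
print(brute_force_group_query("a", distance, friends, free_slots, 3, 1, 4, 2))
```

With `max_strangers=0` the only feasible group is the one whose members all know each other; allowing one stranger admits a group that is closer to the initiator. The enumeration grows combinatorially with the number of candidates, which is consistent with the abstract's point that efficient processing requires pruning.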