Researcher profile

Ziheng Zhang

Ziheng Zhang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

WARDEN: Endangered Indigenous Language Transcription and Translation with 6 Hours of Training Data

This paper introduces WARDEN, an early language model system capable of transcribing and translating Wardaman, an endangered Australian indigenous language into English. The significant challenge we face is the lack of large-scale training data: in fact, we only have 6 hours of annotated audio. Therefore, while it is common practice to train a single model for transcription and translation using large datasets (like English to French), this practice is no longer viable in the Wardaman to English context. To tackle the low-resource challenge, we design WARDEN to have separate transcription and translation models: WARDEN first turns a Wardaman audio input into phonemic transcription, and then the transcription into English translation. Further, we propose two useful techniques to enhance performance. For transcription, we initialize the Wardaman token from Sundanese, a language that shares similar phonemes with Wardaman, to accelerate fine-tuning of the transcription model. For translation, we compile a Wardaman-English dictionary from expert annotations, and provide this domain-specific knowledge to a large language model (LLM) to reason and decide the final output. We empirically demonstrate that this two-stage design works better than data-hungry unified approaches in extremely low data settings. Using a mere 6 hours of annotated data, WARDEN outperforms larger open-source and proprietary models and establishes a strong baseline. Data and code are available.

preprint2022arXiv

Multi-modal Contrastive Representation Learning for Entity Alignment

Multi-modal entity alignment aims to identify equivalent entities between two different multi-modal knowledge graphs, which consist of structural triples and images associated with entities. Most previous works focus on how to utilize and encode information from different modalities, while it is not trivial to leverage multi-modal knowledge in entity alignment because of the modality heterogeneity. In this paper, we propose MCLEA, a Multi-modal Contrastive Learning based Entity Alignment model, to obtain effective joint representations for multi-modal entity alignment. Different from previous works, MCLEA considers task-oriented modality and models the inter-modal relationships for each entity representation. In particular, MCLEA firstly learns multiple individual representations from multiple modalities, and then performs contrastive learning to jointly model intra-modal and inter-modal interactions. Extensive experimental results show that MCLEA outperforms state-of-the-art baselines on public datasets under both supervised and unsupervised settings.

preprint2022arXiv

Neuro-Symbolic Learning: Principles and Applications in Ophthalmology

Neural networks have been rapidly expanding in recent years, with novel strategies and applications. However, challenges such as interpretability, explainability, robustness, safety, trust, and sensibility remain unsolved in neural network technologies, despite the fact that they will unavoidably be addressed for critical applications. Attempts have been made to overcome the challenges in neural network computing by representing and embedding domain knowledge in terms of symbolic representations. Thus, the neuro-symbolic learning (NeSyL) notion emerged, which incorporates aspects of symbolic representation and bringing common sense into neural networks (NeSyL). In domains where interpretability, reasoning, and explainability are crucial, such as video and image captioning, question-answering and reasoning, health informatics, and genomics, NeSyL has shown promising outcomes. This review presents a comprehensive survey on the state-of-the-art NeSyL approaches, their principles, advances in machine and deep learning algorithms, applications such as opthalmology, and most importantly, future perspectives of this emerging field.