Language-based Audio Retrieval with Converging Tied Layers and Contrastive Loss

Authors: A. Koh, C. E. Siong

Published in: APSIPA Annual Summit and Conference (ASC) 2022

DOI: 10.23919/APSIPAASC55919.2022.9979840

Abstract

This paper presents an approach to language-based audio retrieval that combines converging tied layers with a contrastive loss. Given a natural-language query, the goal is to retrieve the audio content that best matches it, and the work explores efficient methods for doing so.

Key Contributions

Introduction of converging tied layers architecture for cross-modal retrieval
Application of contrastive loss for audio-text alignment
Improved retrieval performance on benchmark datasets
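
To make the audio-text alignment idea concrete, here is a minimal sketch of a symmetric contrastive loss over a batch of paired embeddings. It assumes a common InfoNCE-style formulation with cosine similarity and a temperature; this page does not give the paper's exact loss, so the function names, the `temperature` value, and the toy embeddings are all illustrative assumptions rather than the authors' implementation.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(audio_embs, text_embs, temperature=0.07):
    """Hypothetical helper: audio_embs[i] and text_embs[i] form a matching pair.

    The temperature default is an assumption, not a value from the paper.
    """
    audio = [l2_normalize(a) for a in audio_embs]
    text = [l2_normalize(t) for t in text_embs]
    # logits[i][j]: cosine similarity of audio i and text j, scaled by temperature
    logits = [
        [sum(x * y for x, y in zip(a, t)) / temperature for t in text]
        for a in audio
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-audio direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy batch: two audio clips and their two captions (made-up 2-D embeddings).
audio_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = symmetric_contrastive_loss(audio_batch, aligned_text)
loss_swapped = symmetric_contrastive_loss(audio_batch, swapped_text)
```

In this toy batch the loss is close to zero when each caption sits next to its own clip in embedding space and large when the captions are swapped, which is the behavior a contrastive objective is meant to reward.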

Technologies & Methods

Deep learning architectures for cross-modal learning
Contrastive learning frameworks
Audio feature extraction and text embeddings
Neural network optimization techniques
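
Once audio and text share an embedding space, retrieval reduces to ranking clips by their similarity to the query. The sketch below shows that ranking step under stated assumptions: the embeddings are made-up 3-D vectors standing in for the outputs of trained audio and text encoders, cosine similarity is assumed as the scoring function, and `rank_audio_clips` and the file names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_audio_clips(query_emb, clip_embs, top_k=3):
    """Return the top_k (clip_id, score) pairs, highest similarity first."""
    scored = [
        (clip_id, cosine_similarity(query_emb, emb))
        for clip_id, emb in clip_embs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Placeholder embeddings; a real system would compute these with its encoders.
clips = {
    "rain.wav": [0.9, 0.1, 0.0],
    "dog_bark.wav": [0.0, 1.0, 0.2],
    "traffic.wav": [0.3, 0.2, 0.9],
}
query = [1.0, 0.2, 0.1]  # stands in for the embedding of a text query

top = rank_audio_clips(query, clips, top_k=2)
```

For a large collection, the clip embeddings can be computed once and stored, so that only the query needs to be encoded at search time.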

Citation

A. Koh and C. E. Siong, "Language-based audio retrieval with converging tied layers and contrastive loss,"
2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC),
2022, doi: 10.23919/APSIPAASC55919.2022.9979840.