Language-based Audio Retrieval with Converging Tied Layers and Contrastive Loss

Authors: A. Koh, C. E. Siong

Published in: APSIPA Annual Summit and Conference (ASC) 2022

DOI: 10.23919/APSIPAASC55919.2022.9979840

Abstract

This paper presents an approach to language-based audio retrieval that combines converging tied layers with a contrastive loss. It explores efficient methods for matching natural-language queries to audio content.

Key Contributions

  • Introduction of a converging tied layers architecture for cross-modal retrieval
  • Application of a contrastive loss for audio-text alignment
  • Improved retrieval performance on benchmark datasets
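To make the audio-text alignment idea concrete, here is a minimal sketch of a symmetric contrastive loss over a batch of paired audio and caption embeddings. It is a generic InfoNCE-style formulation, not necessarily the exact loss used in the paper. The function names, the temperature value of 0.07, and the toy embeddings are all assumptions chosen for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def similarity_matrix(audio_embs, text_embs):
    """Cosine similarity between every audio clip (rows) and every caption (columns)."""
    audio = [l2_normalize(a) for a in audio_embs]
    text = [l2_normalize(t) for t in text_embs]
    return [[sum(x * y for x, y in zip(a, t)) for t in text] for a in audio]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(audio_embs, text_embs, temperature=0.07):
    """Average of the audio-to-text and text-to-audio losses.

    Assumes audio_embs[i] and text_embs[i] form a matching pair and that
    every other pairing in the batch is a negative. The temperature is an
    assumed hyperparameter, not a value taken from the paper.
    """
    sims = similarity_matrix(audio_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sims]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-audio direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy check: aligned pairs should score a lower loss than shuffled pairs.
audio = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
shuffled_text = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = symmetric_contrastive_loss(audio, aligned_text)
loss_shuffled = symmetric_contrastive_loss(audio, shuffled_text)
```

In this sketch the loss is small when each clip is closest to its own caption and grows when the pairs are mismatched, which is the behavior a contrastive objective relies on to pull matching audio and text together.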

Technologies & Methods

  • Deep learning architectures for cross-modal learning
  • Contrastive learning frameworks
  • Audio feature extraction and text embeddings
  • Neural network optimization techniques
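Once audio clips and text queries share an embedding space, retrieval can be treated as ranking clips by similarity to the query. The sketch below illustrates that step together with recall@k, a metric commonly used for retrieval tasks. This page does not state the paper's evaluation protocol, so the metric choice, the function names, and the toy embeddings are assumptions for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank_audio(query_emb, audio_embs):
    """Return audio indices sorted from most to least similar to the query."""
    scores = [cosine(query_emb, a) for a in audio_embs]
    return sorted(range(len(audio_embs)), key=lambda i: scores[i], reverse=True)

def recall_at_k(query_embs, audio_embs, targets, k):
    """Fraction of queries whose target clip appears among the top-k results."""
    hits = sum(
        1
        for query, target in zip(query_embs, targets)
        if target in rank_audio(query, audio_embs)[:k]
    )
    return hits / len(query_embs)

# Hypothetical 2-D embeddings: three audio clips and two text queries.
audio_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_embs = [[0.9, 0.1], [0.1, 0.9]]
targets = [0, 1]  # index of the clip that each query describes

ranking = rank_audio(query_embs[0], audio_embs)
r_at_1 = recall_at_k(query_embs, audio_embs, targets, k=1)
```

With real models, the vectors would come from the audio and text encoders rather than being written by hand; the ranking and scoring logic stays the same.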

Citation

A. Koh and C. E. Siong, "Language-based audio retrieval with converging tied layers and contrastive loss,"
in 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC),
2022, doi: 10.23919/APSIPAASC55919.2022.9979840.