Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning

Authors: A. Koh, S. Tiwari, C. E. Siong

Published in: APSIPA Annual Summit and Conference (ASC) 2022

DOI: 10.23919/APSIPAASC55919.2022.9980242

Abstract

This paper introduces a curriculum learning approach for automated audio captioning that uses epochal difficult captions to progressively train the model. The method improves caption quality by strategically ordering training samples based on difficulty.

Key Contributions

  • Novel curriculum learning strategy for audio captioning
  • Epochal difficulty measurement for caption samples
  • Improved training efficiency and caption quality
  • Progressive learning from easy to difficult examples
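
The easy-to-difficult progression listed above can be illustrated with a generic curriculum schedule. This is only a minimal sketch under stated assumptions, not the paper's epochal difficult captions method: the function name `curriculum_subsets`, the `start_fraction` parameter, the linear growth schedule, and the per-sample difficulty scores are all hypothetical and chosen purely for illustration.

```python
import math
from typing import Iterator, List, Sequence


def curriculum_subsets(
    difficulties: Sequence[float],
    num_epochs: int,
    start_fraction: float = 0.25,
) -> Iterator[List[int]]:
    """Yield, for each epoch, the indices of the samples to train on.

    Hypothetical helper: samples are sorted from easy to difficult, and the
    share of the dataset exposed to the model grows linearly from
    `start_fraction` in the first epoch to 1.0 in the last one.
    """
    if num_epochs < 1:
        raise ValueError("num_epochs must be at least 1")
    if not 0.0 < start_fraction <= 1.0:
        raise ValueError("start_fraction must be in (0, 1]")

    # Indices ordered by ascending difficulty (easiest sample first).
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])

    for epoch in range(num_epochs):
        # Progress runs from 0.0 (first epoch) to 1.0 (last epoch).
        progress = epoch / max(1, num_epochs - 1)
        fraction = start_fraction + (1.0 - start_fraction) * progress
        count = max(1, math.ceil(fraction * len(order)))
        yield order[:count]


# Assumed difficulty scores, e.g. a per-caption loss from an earlier pass.
scores = [0.9, 0.1, 0.5, 0.3]
for epoch, indices in enumerate(curriculum_subsets(scores, num_epochs=3, start_fraction=0.5)):
    print(epoch, indices)
```

In a real training loop the yielded indices would select the batch pool for that epoch. How difficulty is actually measured, and how the schedule evolves across epochs, is specific to the paper and is not reproduced here.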

Technologies & Methods

  • Deep learning for audio captioning
  • Curriculum learning frameworks
  • Automatic difficulty assessment
  • Sequence-to-sequence modeling
  • Audio feature extraction

Research Impact

This work demonstrates how curriculum learning principles can be effectively applied to automated audio captioning, leading to better model performance and training stability.

Citation

A. Koh, S. Tiwari and C. E. Siong, "Automated audio captioning with epochal difficult captions for curriculum learning,"
2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC),
2022, doi: 10.23919/APSIPAASC55919.2022.9980242.