Education Spotlight
2022 Eighth Frederick Jelinek Memorial Summer Workshop
Participation in JSALT 2022 by Prof. Hung-yi Lee and NTU Students

From June to August 2022, Associate Professor Hung-yi Lee of the Department of Electrical Engineering led students from NTU’s College of Electrical Engineering and Computer Science to participate in the 2022 Eighth Frederick Jelinek Memorial Summer Workshop (JSALT). The team built close international collaborations and achieved notable breakthroughs in speech processing. The following is based on an interview with Prof. Lee.

About JSALT

JSALT is an annual event in the speech processing field that brings researchers together to tackle key challenges. It began in 1995, initiated by the Center for Language and Speech Processing (CLSP) at Johns Hopkins University (JHU), and in recent years has alternated between CLSP and other hosting teams. Many widely used tools and techniques, such as the speech recognition toolkit Kaldi, the machine translation toolkit Moses, and the speaker recognition method i-vector, were first developed at JSALT. The 2022 workshop was hosted by CLSP with sponsorship from Amazon, Google, and Microsoft.

Proposal Stage

Each JSALT cycle starts with proposals submitted the preceding October. “Although the proposal itself is just a one-page document, the review process is extremely rigorous,” said Prof. Lee. The review spans a three-day workshop with 34 committee members (12 from academia, 20 from industry, and 2 from the U.S. government). Each proposal is presented three times, revised twice, and finalized through voting. Only the top three proposals are selected for execution.

Prof. Lee’s proposal, “Leveraging Pre-Training Models for Speech Processing,” focused on self-supervised learning. Unlike supervised learning, which requires large amounts of labeled data, self-supervised learning allows models to learn from raw speech or video, similar to how children acquire language naturally. With just a small amount of labeled data, these models can then adapt to downstream tasks (e.g., speech recognition, speaker verification). Prof. Lee’s proposal was one of the three selected.
The other two were:

● “Speech Translation for Under-Resourced Languages” (Dr. Anthony Larcher)
● “Multilingual and Code-Switching Speech Recognition” (Dr. Ahmed Ali)
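The two-stage pattern described above — learn from plentiful unlabeled data first, then solve the downstream task with only a few labels — can be sketched in a toy form. This is a hypothetical illustration, not the workshop's actual models: simple feature normalization learned from unlabeled data stands in for self-supervised representation learning, and a nearest-centroid classifier stands in for the downstream task head.

```python
# Toy sketch of "pretrain on unlabeled data, then adapt with few labels".
# Hypothetical stand-in: normalization plays the role of representation
# learning; nearest-centroid classification plays the downstream task.
import random
import statistics

random.seed(0)

# Stage 1: "self-supervised" stage — estimate feature statistics from a
# large pool of unlabeled samples (no labels are used here).
unlabeled = [[random.gauss(5.0, 2.0), random.gauss(-3.0, 1.0)]
             for _ in range(1000)]
means = [statistics.mean(col) for col in zip(*unlabeled)]
stdevs = [statistics.stdev(col) for col in zip(*unlabeled)]

def encode(x):
    """Map raw features into the 'pretrained' normalized space."""
    return [(xi - m) / s for xi, m, s in zip(x, means, stdevs)]

# Stage 2: downstream task with only four labeled examples.
labeled = [([4.0, -3.5], 0), ([3.5, -4.0], 0),
           ([6.5, -2.0], 1), ([7.0, -2.5], 1)]
centroids = {}
for label in (0, 1):
    points = [encode(x) for x, y in labeled if y == label]
    centroids[label] = [statistics.mean(c) for c in zip(*points)]

def classify(x):
    """Assign the label of the nearest class centroid in encoded space."""
    z = encode(x)
    return min(centroids,
               key=lambda c: sum((zi - ci) ** 2
                                 for zi, ci in zip(z, centroids[c])))

print(classify([3.8, -3.8]), classify([6.8, -2.2]))
```

Real systems replace both stands-ins with neural networks (e.g., a pretrained speech encoder plus a small fine-tuned head), but the division of labor is the same: the expensive learning happens without labels, so the labeled data only has to steer the final task.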

Project Execution

Once approved, the proposer becomes team lead. Prof. Lee’s team included members from NTU, the University of Edinburgh, University of Texas at Austin, UT El Paso, Carnegie Mellon University, JHU, A*STAR Singapore, NUS, Penn State, University of Maryland, MIT,
