[voice agent] Add audio logging to NeMo Voice Agent #15279

Merged: tango4j merged 29 commits into main from tango4j/add_va_audio_log on Jan 13, 2026

@tango4j (Collaborator) commented Jan 9, 2026

What does this PR do?

This PR adds an audio logging feature to NeMo Voice Agent.
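
For readers unfamiliar with the idea, here is a minimal sketch of what per-segment audio logging can look like. It is not the code added in this PR: the `SimpleAudioLogger` class, its method names, the file naming scheme, and the 16 kHz mono 16-bit PCM format are all assumptions made for illustration, using only Python's standard `wave` and `json` modules.

```python
import json
import wave
from pathlib import Path


class SimpleAudioLogger:
    """Hypothetical helper (not the PR's implementation).

    Writes one WAV file plus a JSON sidecar for each logged audio segment.
    Assumes the audio is raw 16-bit little-endian PCM.
    """

    def __init__(self, log_dir, sample_rate=16000, num_channels=1):
        self.log_dir = Path(log_dir)
        self.log_dir.mkdir(parents=True, exist_ok=True)
        self.sample_rate = sample_rate
        self.num_channels = num_channels
        self._counter = 0  # running index so files sort in logging order

    def log_segment(self, speaker, pcm16_bytes, text=""):
        """Save one segment and return the path of the WAV file."""
        self._counter += 1
        stem = f"{self._counter:05d}_{speaker}"
        wav_path = self.log_dir / f"{stem}.wav"

        with wave.open(str(wav_path), "wb") as wf:
            wf.setnchannels(self.num_channels)
            wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
            wf.setframerate(self.sample_rate)
            wf.writeframes(pcm16_bytes)

        # One frame holds one 2-byte sample per channel.
        num_frames = len(pcm16_bytes) // (2 * self.num_channels)
        meta = {
            "speaker": speaker,
            "text": text,
            "sample_rate": self.sample_rate,
            "duration_sec": num_frames / self.sample_rate,
        }
        (self.log_dir / f"{stem}.json").write_text(json.dumps(meta, indent=2))
        return wav_path
```

In a sketch like this, the STT side would call `log_segment("user", ...)` with the incoming audio and transcript, and the TTS side would call `log_segment("agent", ...)` with the synthesized audio; how the PR actually wires this into `stt.py`, `tts.py`, and `turn_taking.py` is not shown here.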

Collection: ASR

Changelog

Changed the following files:

  • nemo/agents/voice_agent/pipecat/services/nemo/turn_taking.py
  • nemo/agents/voice_agent/pipecat/services/nemo/tts.py
  • nemo/agents/voice_agent/pipecat/services/nemo/stt.py
  • examples/voice_agent/server/bot_websocket_server.py

Usage

Run the NeMo Voice Agent by following the instructions in its README file.

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove the label and add it again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

tango4j and others added 21 commits October 9, 2025 12:17
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Comment thread nemo/agents/voice_agent/pipecat/services/nemo/turn_taking.py Fixed
Comment thread nemo/agents/voice_agent/pipecat/services/nemo/turn_taking.py Fixed
tango4j and others added 3 commits January 8, 2026 18:24
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: taejinp <tango4j@gmail.com>
@stevehuang52 stevehuang52 self-requested a review January 12, 2026 15:29
Comment thread nemo/agents/voice_agent/pipecat/services/nemo/turn_taking.py

pipeline = Pipeline(pipeline)

rtvi_params = RTVIObserverParams(bot_llm_enabled=False)
@stevehuang52 (Collaborator), Jan 12, 2026

Why did you remove this? It is used to log only the LLM messages that were actually delivered to the user, so that messages generated after a user interruption do not appear in the log.

@tango4j (Collaborator, Author)

This was removed by mistake while merging main. I am reverting the change and restoring rtvi_params.

tango4j and others added 2 commits January 12, 2026 14:50
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
@github-actions (Contributor)

[🤖]: Hi @tango4j 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

//cc @chtruong814 @ko3n1g @pablo-garay @thomasdhc

@tango4j tango4j merged commit 8d9b2ad into main Jan 13, 2026
57 checks passed
@tango4j tango4j deleted the tango4j/add_va_audio_log branch January 13, 2026 17:59
AkCodes23 pushed a commit to AkCodes23/NeMo that referenced this pull request Jan 28, 2026
* Adding Kokoro TTS to TTS options

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding environment and req

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removed unused import

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding eSpeakNG and nvidia yaml

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding audio logger draft

Signed-off-by: taejinp <tango4j@gmail.com>

* check vad state in STT

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding audio logger draft for checking

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* fix text aggregation, eob handling, logging

Signed-off-by: stevehuang52 <heh@nvidia.com>

* improve text segmentation logic

Signed-off-by: stevehuang52 <heh@nvidia.com>

* improve text segmentation logic

Signed-off-by: stevehuang52 <heh@nvidia.com>

* revert default cfg

Signed-off-by: stevehuang52 <heh@nvidia.com>

* adding only essential audio logger files

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixed files that are wrongly updated

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed all issues and updating default.yaml

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Solved linting issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Solved another f string issue

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved backchannel start_time issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
Signed-off-by: Akhil Varanasi <akhilvaranasi23@gmail.com>
nune-tadevosyan pushed a commit to nune-tadevosyan/NeMo that referenced this pull request Mar 13, 2026

3 participants