fix: bound json/regex/whitespace_pattern guided-decoding inputs by dagil-nvidia · Pull Request #8349 · ai-dynamo/dynamo

dagil-nvidia · 2026-04-19T17:24:11Z

Overview

Adds byte-length and nesting-depth caps to GuidedDecodingOptions::validate() so the OpenAI guided-decoding path bounds pathological inputs before they reach the downstream guided decoding backend (notably xgrammar, the default for vLLM, SGLang, and TRT-LLM workers). Without these caps, sufficiently deep / oversized guided_json, guided_regex, guided_grammar, or guided_whitespace_pattern inputs can drive unbounded recursion or runaway parsing in the worker.

Note: This PR targets release/1.0.2. The branch has been rebased onto release/1.0.2 so the diff is isolated to the guided-decoding change.

Details

New constants in lib/llm/src/protocols/common.rs:

Field	Cap
`guided_grammar`	64 KiB
`guided_regex`	32 KiB
`guided_whitespace_pattern`	1 KiB
`guided_json` (serialized)	256 KiB
`guided_json` nesting depth	64

CountingWriter is a zero-alloc std::io::Write adapter used to measure JSON serialized length without paying for to_string.
json_exceeds_nesting_depth is an iterative early-return walker: O(visited_until_breach), so wide-flat schemas bail out at the first over-depth child rather than fully traversing.
GuidedDecodingOptions::new() doc comment now explicitly states it bypasses validation by design (used by migration.rs); public callers building from external input must use validated() / from_optional().

Defaults are deliberately generous; they exist to bound pathological inputs, not to gate normal traffic.

Where should the reviewer start?

lib/llm/src/protocols/common.rs — the new constants, CountingWriter, json_exceeds_nesting_depth, and the additions to GuidedDecodingOptions::validate().
The new tests at the bottom of the same file's tests module.

Test Plan

cargo test -p dynamo-llm protocols::common::tests — 16 passed, 0 failed (8 pre-existing + 8 new at-limit/just-over cases for every new bound)
cargo clippy -p dynamo-llm --all-targets -- -D warnings — clean
CI green (19 pass / 0 fail on the release/1.0.2 target)

Related Issues

Relates to: guided-decoding input hardening

Made with Cursor

Summary by CodeRabbit

Bug Fixes
- Enhanced validation of guided-decoding inputs to enforce stricter size and complexity limits.
Documentation
- Updated documentation for guided-decoding configuration options.
Tests
- Added validation tests for guided-decoding parameters.

Link to Devin session: https://nvidia.devinenterprise.com/sessions/da7195f7c9ca45bd875c180ffb6c29f6
Requested by: @dagil-nvidia

coderabbitai · 2026-04-19T17:30:18Z

Walkthrough

Added validation guards to GuidedDecodingOptions::validate() enforcing byte-length caps on grammar, regex, and whitespace pattern inputs, plus byte-length and JSON nesting depth caps on JSON inputs. Introduced CountingWriter utility and json_exceeds_nesting_depth helper function, with comprehensive unit test coverage.

Changes

Cohort / File(s)	Summary
Guided Decoding Validation Enhancements `lib/llm/src/protocols/common.rs`	Added size and depth validation guards for guided-decoding inputs with new byte-length constants and JSON nesting depth checks. Introduced `CountingWriter` Write adapter for serialization measurement and `json_exceeds_nesting_depth` helper. Updated `GuidedDecodingOptions::new` documentation. Added unit tests covering boundary conditions and depth constraints.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding bounds/validation for JSON, regex, and whitespace pattern inputs in guided-decoding to prevent security/stability issues.
Description check	✅ Passed	The PR description provides comprehensive coverage of all required sections including overview, detailed changes with a reference table, specific files to review, and related issues context.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

lib/llm/src/protocols/common.rs (1)

1071-1094: Add JSON at-limit acceptance cases.

JSON currently only has rejection tests. Add exact-boundary tests for serialized byte length and nesting depth to lock the intended off-by-one behavior.

🧪 Proposed boundary tests

     fn test_guided_json_byte_length_rejected() {
         // A JSON string of (LIMIT+1) `a`s serializes to LIMIT+3 bytes (two
         // surrounding quotes), which trips the byte-length cap before any
         // depth check.
         let big_string = "a".repeat(MAX_JSON_SCHEMA_BYTE_LENGTH + 1);
         let json = serde_json::Value::String(big_string);
         let result = GuidedDecodingOptions::validated(Some(json), None, None, None, None, None);
         assert!(result.is_err());
         assert!(result.unwrap_err().to_string().contains("byte length"));
     }
+
+    #[test]
+    fn test_guided_json_byte_length_at_limit_accepted() {
+        // Two bytes are used by the surrounding JSON string quotes.
+        let s = "a".repeat(MAX_JSON_SCHEMA_BYTE_LENGTH - 2);
+        let json = serde_json::Value::String(s);
+        let result = GuidedDecodingOptions::validated(Some(json), None, None, None, None, None);
+        assert!(result.is_ok());
+    }
 
     #[test]
     fn test_guided_json_nesting_depth_rejected() {
         // Build a chain of `MAX_JSON_SCHEMA_NESTING_DEPTH + 1` nested objects,
         // each `{"a": ...}`. The serialized form stays well under the byte cap.
         let mut value = serde_json::json!({});
         for _ in 0..(MAX_JSON_SCHEMA_NESTING_DEPTH + 1) {
             value = serde_json::json!({ "a": value });
         }
         let result = GuidedDecodingOptions::validated(Some(value), None, None, None, None, None);
         assert!(result.is_err());
         assert!(result.unwrap_err().to_string().contains("nesting depth"));
     }
+
+    #[test]
+    fn test_guided_json_nesting_depth_at_limit_accepted() {
+        let mut value = serde_json::json!("leaf");
+        for _ in 0..MAX_JSON_SCHEMA_NESTING_DEPTH {
+            value = serde_json::json!({ "a": value });
+        }
+        let result = GuidedDecodingOptions::validated(Some(value), None, None, None, None, None);
+        assert!(result.is_ok());
+    }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/llm/src/protocols/common.rs` around lines 1071 - 1094, Add two tests that
assert acceptance at the exact boundaries: for byte length, create a string of
length MAX_JSON_SCHEMA_BYTE_LENGTH - 2 (so serialized JSON string including
quotes equals MAX_JSON_SCHEMA_BYTE_LENGTH), wrap it as
serde_json::Value::String, call GuidedDecodingOptions::validated(Some(...),
None, None, None, None, None) and assert Ok (e.g.,
test_guided_json_byte_length_accepted_at_limit); for nesting depth, build a
chain of exactly MAX_JSON_SCHEMA_NESTING_DEPTH nested objects (not +1), call
GuidedDecodingOptions::validated with that value and assert Ok (e.g.,
test_guided_json_nesting_depth_accepted_at_limit). Use the existing
MAX_JSON_SCHEMA_BYTE_LENGTH, MAX_JSON_SCHEMA_NESTING_DEPTH, and
GuidedDecodingOptions::validated identifiers to locate where to add them.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/llm/src/protocols/common.rs`:
- Around line 77-98: The function json_exceeds_nesting_depth currently checks
depth before matching the Value, causing scalar leaves to be counted as extra
nesting; in json_exceeds_nesting_depth, remove or move the `if d > max_depth {
return true; }` check out of the top of the loop and instead check depth only
when processing container nodes: inside the serde_json::Value::Object and
::Array arms, before pushing children, test whether d + 1 > max_depth and return
true if so, otherwise push children with depth d + 1; this ensures scalar
variants (the `_` arm) do not cause false positives while preserving correct
container-depth detection.
- Around line 57-70: The error is leaking attacker-controlled payloads and
CountingWriter doesn't enforce a size cap; update CountingWriter to hold a
max_bytes field and return an error (std::io::ErrorKind::Other or a custom
error) once buf.len() would push the count past max_bytes so serialization
terminates early, and replace any error messages that use debug-printing of
GuidedDecodingOptions (e.g., places referencing GuidedDecodingOptions in error
returns/logs) with a generic, non-payload-bearing message; apply the same
no-debug-message change to the other occurrences noted (around the blocks at
~521-525 and ~586-600) so errors never echo the full struct.

---

Nitpick comments:
In `@lib/llm/src/protocols/common.rs`:
- Around line 1071-1094: Add two tests that assert acceptance at the exact
boundaries: for byte length, create a string of length
MAX_JSON_SCHEMA_BYTE_LENGTH - 2 (so serialized JSON string including quotes
equals MAX_JSON_SCHEMA_BYTE_LENGTH), wrap it as serde_json::Value::String, call
GuidedDecodingOptions::validated(Some(...), None, None, None, None, None) and
assert Ok (e.g., test_guided_json_byte_length_accepted_at_limit); for nesting
depth, build a chain of exactly MAX_JSON_SCHEMA_NESTING_DEPTH nested objects
(not +1), call GuidedDecodingOptions::validated with that value and assert Ok
(e.g., test_guided_json_nesting_depth_accepted_at_limit). Use the existing
MAX_JSON_SCHEMA_BYTE_LENGTH, MAX_JSON_SCHEMA_NESTING_DEPTH, and
GuidedDecodingOptions::validated identifiers to locate where to add them.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 827390f2-4d09-40bb-8ba0-6376b29be0d6

📥 Commits

Reviewing files that changed from the base of the PR and between d94b350 and def3841.

📒 Files selected for processing (1)

lib/llm/src/protocols/common.rs

…puts Add byte-length caps for guided_grammar (64 KiB), guided_regex (32 KiB), guided_whitespace_pattern (1 KiB), and guided_json (256 KiB serialized), plus a JSON nesting-depth cap (64) on guided_json. These bound pathological inputs that can otherwise drive unbounded recursion / runaway parsing in the downstream guided decoding backend (notably xgrammar, the default for vLLM, SGLang, and TRT-LLM workers). GuidedDecodingOptions::validate() is the single choke point on the OpenAI ingest path for all three workers. GuidedDecodingOptions::new() bypasses validation by design (used by migration.rs to re-wrap already-validated requests); the doc comment now states this explicitly. Defaults are deliberately generous and only intended to bound pathological input. Adds 8 unit tests covering each new bound at-limit and just-over. Signed-off-by: Dan Gil <dagil@nvidia.com> Made-with: Cursor

- json_exceeds_nesting_depth: count container levels only; scalar leaves no longer over-count by one (a 64-deep object chain ending in a scalar is now correctly at-limit instead of over-limit). - CountingWriter: short-circuit serde_json::to_writer once the byte cap is exceeded (returns ErrorKind::InvalidData). Pathological multi-MB schemas now cost O(limit) to reject instead of O(input). - GuidedDecodingOptions::validate exclusivity check: drop the {:?} debug print of self from the error message. Without this, a request that sets multiple guided_* fields would echo the (now-capped but still up to 256 KiB) attacker-supplied payload back into error responses and logs. - Tests: add at-limit acceptance cases for guided_json byte length (MAX-2 chars + 2 quotes = MAX) and nesting depth (MAX container levels around a scalar leaf), to lock the off-by-one boundaries. Signed-off-by: Dan Gil <dagil@nvidia.com> Made-with: Cursor

copy-pr-bot · 2026-04-20T02:06:12Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

nnshah1

A few simplification suggestions. Bottom line: the bounds are correct and the O(cap) short-circuit via CountingWriter is sound. Main wins would be removing one unreachable error branch, collapsing five duplicated byte-cap blocks into a helper, trimming narrative comments, and table-driving the boundary tests. Net ~40 lines lighter with no behavior change.

@nnshah1

Address @nnshah1 simplification suggestions on top of the existing bounds: - Collapse the unreachable JSON error branches: CountingWriter only errors when the byte cap is exceeded and serde_json::Value's Serialize impl is infallible, so the post-Ok re-check and "failed to measure" branch can never fire. One Err -> one return. - Extract check_byte_len helper for the three string-field byte caps (guided_grammar, guided_regex, guided_whitespace_pattern); each call site collapses to one line. - Trim narrative comments that read like commit messages: shorten the module-level rationale, drop the migration.rs reference from the new() doc, and remove the +1/-2 arithmetic narration in the JSON tests. Keep the load-bearing WHYs (ajv default of 64, grammar-scan quote-handling NOTE). - Table-drive the three string-length test pairs into one loop over (field, limit, build) tuples. JSON tests stay separate -- they exercise the writer and depth paths specifically. Net: 54 insertions, 112 deletions. No behavior change. Signed-off-by: Dan Gil <dagil@nvidia.com> Made-with: Cursor

dagil-nvidia requested a review from a team as a code owner April 19, 2026 17:24

pull-request-size Bot added the size/L label Apr 19, 2026

github-actions Bot added fix frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` labels Apr 19, 2026

coderabbitai Bot reviewed Apr 19, 2026

View reviewed changes

Comment thread lib/llm/src/protocols/common.rs Outdated

Comment thread lib/llm/src/protocols/common.rs

copy-pr-bot Bot temporarily deployed to GITLAB April 19, 2026 17:54 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB April 19, 2026 17:55 Inactive

dagil-nvidia changed the title ~~fix(security): bound json/regex/whitespace_pattern guided-decoding inputs~~ fix: bound json/regex/whitespace_pattern guided-decoding inputs Apr 20, 2026

devin-ai-integration Bot changed the base branch from main to release/1.0.2 April 20, 2026 02:04

devin-ai-integration Bot requested review from a team as code owners April 20, 2026 02:04

dagil-nvidia added 2 commits April 20, 2026 02:05

devin-ai-integration Bot force-pushed the dagil/bound-guided-decoding-inputs branch from db4664e to 44d5b31 Compare April 20, 2026 02:06

nnshah1 reviewed Apr 20, 2026

View reviewed changes

Comment thread lib/llm/src/protocols/common.rs Outdated

Comment thread lib/llm/src/protocols/common.rs Outdated

Comment thread lib/llm/src/protocols/common.rs Outdated

Comment thread lib/llm/src/protocols/common.rs

pvijayakrish merged commit 5dc5420 into release/1.0.2 Apr 21, 2026
19 checks passed

pvijayakrish deleted the dagil/bound-guided-decoding-inputs branch April 21, 2026 16:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: bound json/regex/whitespace_pattern guided-decoding inputs#8349

fix: bound json/regex/whitespace_pattern guided-decoding inputs#8349
pvijayakrish merged 3 commits intorelease/1.0.2from
dagil/bound-guided-decoding-inputs

dagil-nvidia commented Apr 19, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 19, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

copy-pr-bot Bot commented Apr 20, 2026

Uh oh!

nnshah1 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

dagil-nvidia commented Apr 19, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Details

Where should the reviewer start?

Test Plan

Related Issues

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

copy-pr-bot Bot commented Apr 20, 2026

Uh oh!

nnshah1 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

dagil-nvidia commented Apr 19, 2026 •

edited by devin-ai-integration Bot

Loading

coderabbitai Bot commented Apr 19, 2026 •

edited

Loading