Skip to content

fix: bound json/regex/whitespace_pattern guided-decoding inputs#8349

Merged
pvijayakrish merged 3 commits intorelease/1.0.2from
dagil/bound-guided-decoding-inputs
Apr 21, 2026
Merged

fix: bound json/regex/whitespace_pattern guided-decoding inputs#8349
pvijayakrish merged 3 commits intorelease/1.0.2from
dagil/bound-guided-decoding-inputs

Conversation

@dagil-nvidia
Copy link
Copy Markdown
Collaborator

@dagil-nvidia dagil-nvidia commented Apr 19, 2026

Overview

Adds byte-length and nesting-depth caps to GuidedDecodingOptions::validate() so the OpenAI guided-decoding path bounds pathological inputs before they reach the downstream guided decoding backend (notably xgrammar, the default for vLLM, SGLang, and TRT-LLM workers). Without these caps, sufficiently deep / oversized guided_json, guided_regex, guided_grammar, or guided_whitespace_pattern inputs can drive unbounded recursion or runaway parsing in the worker.

Note: This PR targets release/1.0.2. The branch has been rebased onto release/1.0.2 so the diff is isolated to the guided-decoding change.

Details

New constants in lib/llm/src/protocols/common.rs:

Field Cap
guided_grammar 64 KiB
guided_regex 32 KiB
guided_whitespace_pattern 1 KiB
guided_json (serialized) 256 KiB
guided_json nesting depth 64
  • CountingWriter is a zero-alloc std::io::Write adapter used to measure JSON serialized length without paying for to_string.
  • json_exceeds_nesting_depth is an iterative early-return walker: O(visited_until_breach), so wide-flat schemas bail out at the first over-depth child rather than fully traversing.
  • GuidedDecodingOptions::new() doc comment now explicitly states it bypasses validation by design (used by migration.rs); public callers building from external input must use validated() / from_optional().

Defaults are deliberately generous; they exist to bound pathological inputs, not to gate normal traffic.

Where should the reviewer start?

  • lib/llm/src/protocols/common.rs — the new constants, CountingWriter, json_exceeds_nesting_depth, and the additions to GuidedDecodingOptions::validate().
  • The new tests at the bottom of the same file's tests module.

Test Plan

  • cargo test -p dynamo-llm protocols::common::tests — 16 passed, 0 failed (8 pre-existing + 8 new at-limit/just-over cases for every new bound)
  • cargo clippy -p dynamo-llm --all-targets -- -D warnings — clean
  • CI green (19 pass / 0 fail on the release/1.0.2 target)

Related Issues

  • Relates to: guided-decoding input hardening

Made with Cursor

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced validation of guided-decoding inputs to enforce stricter size and complexity limits.
  • Documentation

    • Updated documentation for guided-decoding configuration options.
  • Tests

    • Added validation tests for guided-decoding parameters.

Link to Devin session: https://nvidia.devinenterprise.com/sessions/da7195f7c9ca45bd875c180ffb6c29f6
Requested by: @dagil-nvidia

@dagil-nvidia dagil-nvidia requested a review from a team as a code owner April 19, 2026 17:24
@github-actions github-actions Bot added fix frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` labels Apr 19, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 19, 2026

Walkthrough

Added validation guards to GuidedDecodingOptions::validate() enforcing byte-length caps on grammar, regex, and whitespace pattern inputs, plus byte-length and JSON nesting depth caps on JSON inputs. Introduced CountingWriter utility and json_exceeds_nesting_depth helper function, with comprehensive unit test coverage.

Changes

Cohort / File(s) Summary
Guided Decoding Validation Enhancements
lib/llm/src/protocols/common.rs
Added size and depth validation guards for guided-decoding inputs with new byte-length constants and JSON nesting depth checks. Introduced CountingWriter Write adapter for serialization measurement and json_exceeds_nesting_depth helper. Updated GuidedDecodingOptions::new documentation. Added unit tests covering boundary conditions and depth constraints.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding bounds/validation for JSON, regex, and whitespace pattern inputs in guided-decoding to prevent security/stability issues.
Description check ✅ Passed The PR description provides comprehensive coverage of all required sections including overview, detailed changes with a reference table, specific files to review, and related issues context.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
lib/llm/src/protocols/common.rs (1)

1071-1094: Add JSON at-limit acceptance cases.

JSON currently only has rejection tests. Add exact-boundary tests for serialized byte length and nesting depth to lock the intended off-by-one behavior.

🧪 Proposed boundary tests
     fn test_guided_json_byte_length_rejected() {
         // A JSON string of (LIMIT+1) `a`s serializes to LIMIT+3 bytes (two
         // surrounding quotes), which trips the byte-length cap before any
         // depth check.
         let big_string = "a".repeat(MAX_JSON_SCHEMA_BYTE_LENGTH + 1);
         let json = serde_json::Value::String(big_string);
         let result = GuidedDecodingOptions::validated(Some(json), None, None, None, None, None);
         assert!(result.is_err());
         assert!(result.unwrap_err().to_string().contains("byte length"));
     }
+
+    #[test]
+    fn test_guided_json_byte_length_at_limit_accepted() {
+        // Two bytes are used by the surrounding JSON string quotes.
+        let s = "a".repeat(MAX_JSON_SCHEMA_BYTE_LENGTH - 2);
+        let json = serde_json::Value::String(s);
+        let result = GuidedDecodingOptions::validated(Some(json), None, None, None, None, None);
+        assert!(result.is_ok());
+    }
 
     #[test]
     fn test_guided_json_nesting_depth_rejected() {
         // Build a chain of `MAX_JSON_SCHEMA_NESTING_DEPTH + 1` nested objects,
         // each `{"a": ...}`. The serialized form stays well under the byte cap.
         let mut value = serde_json::json!({});
         for _ in 0..(MAX_JSON_SCHEMA_NESTING_DEPTH + 1) {
             value = serde_json::json!({ "a": value });
         }
         let result = GuidedDecodingOptions::validated(Some(value), None, None, None, None, None);
         assert!(result.is_err());
         assert!(result.unwrap_err().to_string().contains("nesting depth"));
     }
+
+    #[test]
+    fn test_guided_json_nesting_depth_at_limit_accepted() {
+        let mut value = serde_json::json!("leaf");
+        for _ in 0..MAX_JSON_SCHEMA_NESTING_DEPTH {
+            value = serde_json::json!({ "a": value });
+        }
+        let result = GuidedDecodingOptions::validated(Some(value), None, None, None, None, None);
+        assert!(result.is_ok());
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/llm/src/protocols/common.rs` around lines 1071 - 1094, Add two tests that
assert acceptance at the exact boundaries: for byte length, create a string of
length MAX_JSON_SCHEMA_BYTE_LENGTH - 2 (so serialized JSON string including
quotes equals MAX_JSON_SCHEMA_BYTE_LENGTH), wrap it as
serde_json::Value::String, call GuidedDecodingOptions::validated(Some(...),
None, None, None, None, None) and assert Ok (e.g.,
test_guided_json_byte_length_accepted_at_limit); for nesting depth, build a
chain of exactly MAX_JSON_SCHEMA_NESTING_DEPTH nested objects (not +1), call
GuidedDecodingOptions::validated with that value and assert Ok (e.g.,
test_guided_json_nesting_depth_accepted_at_limit). Use the existing
MAX_JSON_SCHEMA_BYTE_LENGTH, MAX_JSON_SCHEMA_NESTING_DEPTH, and
GuidedDecodingOptions::validated identifiers to locate where to add them.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/llm/src/protocols/common.rs`:
- Around line 77-98: The function json_exceeds_nesting_depth currently checks
depth before matching the Value, causing scalar leaves to be counted as extra
nesting; in json_exceeds_nesting_depth, remove or move the `if d > max_depth {
return true; }` check out of the top of the loop and instead check depth only
when processing container nodes: inside the serde_json::Value::Object and
::Array arms, before pushing children, test whether d + 1 > max_depth and return
true if so, otherwise push children with depth d + 1; this ensures scalar
variants (the `_` arm) do not cause false positives while preserving correct
container-depth detection.
- Around line 57-70: The error is leaking attacker-controlled payloads and
CountingWriter doesn't enforce a size cap; update CountingWriter to hold a
max_bytes field and return an error (std::io::ErrorKind::Other or a custom
error) once buf.len() would push the count past max_bytes so serialization
terminates early, and replace any error messages that use debug-printing of
GuidedDecodingOptions (e.g., places referencing GuidedDecodingOptions in error
returns/logs) with a generic, non-payload-bearing message; apply the same
no-debug-message change to the other occurrences noted (around the blocks at
~521-525 and ~586-600) so errors never echo the full struct.

---

Nitpick comments:
In `@lib/llm/src/protocols/common.rs`:
- Around line 1071-1094: Add two tests that assert acceptance at the exact
boundaries: for byte length, create a string of length
MAX_JSON_SCHEMA_BYTE_LENGTH - 2 (so serialized JSON string including quotes
equals MAX_JSON_SCHEMA_BYTE_LENGTH), wrap it as serde_json::Value::String, call
GuidedDecodingOptions::validated(Some(...), None, None, None, None, None) and
assert Ok (e.g., test_guided_json_byte_length_accepted_at_limit); for nesting
depth, build a chain of exactly MAX_JSON_SCHEMA_NESTING_DEPTH nested objects
(not +1), call GuidedDecodingOptions::validated with that value and assert Ok
(e.g., test_guided_json_nesting_depth_accepted_at_limit). Use the existing
MAX_JSON_SCHEMA_BYTE_LENGTH, MAX_JSON_SCHEMA_NESTING_DEPTH, and
GuidedDecodingOptions::validated identifiers to locate where to add them.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 827390f2-4d09-40bb-8ba0-6376b29be0d6

📥 Commits

Reviewing files that changed from the base of the PR and between d94b350 and def3841.

📒 Files selected for processing (1)
  • lib/llm/src/protocols/common.rs

Comment thread lib/llm/src/protocols/common.rs Outdated
Comment thread lib/llm/src/protocols/common.rs
@dagil-nvidia dagil-nvidia changed the title fix(security): bound json/regex/whitespace_pattern guided-decoding inputs fix: bound json/regex/whitespace_pattern guided-decoding inputs Apr 20, 2026
@devin-ai-integration devin-ai-integration Bot changed the base branch from main to release/1.0.2 April 20, 2026 02:04
@devin-ai-integration devin-ai-integration Bot requested review from a team as code owners April 20, 2026 02:04
…puts

Add byte-length caps for guided_grammar (64 KiB), guided_regex (32 KiB),
guided_whitespace_pattern (1 KiB), and guided_json (256 KiB serialized),
plus a JSON nesting-depth cap (64) on guided_json. These bound pathological
inputs that can otherwise drive unbounded recursion / runaway parsing in
the downstream guided decoding backend (notably xgrammar, the default for
vLLM, SGLang, and TRT-LLM workers).

GuidedDecodingOptions::validate() is the single choke point on the OpenAI
ingest path for all three workers. GuidedDecodingOptions::new() bypasses
validation by design (used by migration.rs to re-wrap already-validated
requests); the doc comment now states this explicitly.

Defaults are deliberately generous and only intended to bound pathological
input. Adds 8 unit tests covering each new bound at-limit and just-over.

Signed-off-by: Dan Gil <dagil@nvidia.com>
Made-with: Cursor
- json_exceeds_nesting_depth: count container levels only; scalar leaves
  no longer over-count by one (a 64-deep object chain ending in a scalar
  is now correctly at-limit instead of over-limit).

- CountingWriter: short-circuit serde_json::to_writer once the byte cap is
  exceeded (returns ErrorKind::InvalidData). Pathological multi-MB schemas
  now cost O(limit) to reject instead of O(input).

- GuidedDecodingOptions::validate exclusivity check: drop the {:?} debug
  print of self from the error message. Without this, a request that sets
  multiple guided_* fields would echo the (now-capped but still up to
  256 KiB) attacker-supplied payload back into error responses and logs.

- Tests: add at-limit acceptance cases for guided_json byte length
  (MAX-2 chars + 2 quotes = MAX) and nesting depth (MAX container levels
  around a scalar leaf), to lock the off-by-one boundaries.

Signed-off-by: Dan Gil <dagil@nvidia.com>
Made-with: Cursor
@devin-ai-integration devin-ai-integration Bot force-pushed the dagil/bound-guided-decoding-inputs branch from db4664e to 44d5b31 Compare April 20, 2026 02:06
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link
Copy Markdown
Contributor

@nnshah1 nnshah1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few simplification suggestions. Bottom line: the bounds are correct and the O(cap) short-circuit via CountingWriter is sound. Main wins would be removing one unreachable error branch, collapsing five duplicated byte-cap blocks into a helper, trimming narrative comments, and table-driving the boundary tests. Net ~40 lines lighter with no behavior change.

Comment thread lib/llm/src/protocols/common.rs Outdated
Comment thread lib/llm/src/protocols/common.rs Outdated
Comment thread lib/llm/src/protocols/common.rs Outdated
Comment thread lib/llm/src/protocols/common.rs
Address @nnshah1 simplification suggestions on top of the existing
bounds:

- Collapse the unreachable JSON error branches: CountingWriter only
  errors when the byte cap is exceeded and serde_json::Value's
  Serialize impl is infallible, so the post-Ok re-check and
  "failed to measure" branch can never fire. One Err -> one return.

- Extract check_byte_len helper for the three string-field byte caps
  (guided_grammar, guided_regex, guided_whitespace_pattern); each
  call site collapses to one line.

- Trim narrative comments that read like commit messages: shorten
  the module-level rationale, drop the migration.rs reference from
  the new() doc, and remove the +1/-2 arithmetic narration in the
  JSON tests. Keep the load-bearing WHYs (ajv default of 64,
  grammar-scan quote-handling NOTE).

- Table-drive the three string-length test pairs into one loop over
  (field, limit, build) tuples. JSON tests stay separate -- they
  exercise the writer and depth paths specifically.

Net: 54 insertions, 112 deletions. No behavior change.

Signed-off-by: Dan Gil <dagil@nvidia.com>
Made-with: Cursor
@pvijayakrish pvijayakrish merged commit 5dc5420 into release/1.0.2 Apr 21, 2026
19 checks passed
@pvijayakrish pvijayakrish deleted the dagil/bound-guided-decoding-inputs branch April 21, 2026 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fix frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants