fix: bound json/regex/whitespace_pattern guided-decoding inputs#8349
fix: bound json/regex/whitespace_pattern guided-decoding inputs#8349pvijayakrish merged 3 commits intorelease/1.0.2from
Conversation
WalkthroughAdded validation guards to Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
lib/llm/src/protocols/common.rs (1)
1071-1094: Add JSON at-limit acceptance cases.JSON currently only has rejection tests. Add exact-boundary tests for serialized byte length and nesting depth to lock the intended off-by-one behavior.
🧪 Proposed boundary tests
fn test_guided_json_byte_length_rejected() { // A JSON string of (LIMIT+1) `a`s serializes to LIMIT+3 bytes (two // surrounding quotes), which trips the byte-length cap before any // depth check. let big_string = "a".repeat(MAX_JSON_SCHEMA_BYTE_LENGTH + 1); let json = serde_json::Value::String(big_string); let result = GuidedDecodingOptions::validated(Some(json), None, None, None, None, None); assert!(result.is_err()); assert!(result.unwrap_err().to_string().contains("byte length")); } + + #[test] + fn test_guided_json_byte_length_at_limit_accepted() { + // Two bytes are used by the surrounding JSON string quotes. + let s = "a".repeat(MAX_JSON_SCHEMA_BYTE_LENGTH - 2); + let json = serde_json::Value::String(s); + let result = GuidedDecodingOptions::validated(Some(json), None, None, None, None, None); + assert!(result.is_ok()); + } #[test] fn test_guided_json_nesting_depth_rejected() { // Build a chain of `MAX_JSON_SCHEMA_NESTING_DEPTH + 1` nested objects, // each `{"a": ...}`. The serialized form stays well under the byte cap. let mut value = serde_json::json!({}); for _ in 0..(MAX_JSON_SCHEMA_NESTING_DEPTH + 1) { value = serde_json::json!({ "a": value }); } let result = GuidedDecodingOptions::validated(Some(value), None, None, None, None, None); assert!(result.is_err()); assert!(result.unwrap_err().to_string().contains("nesting depth")); } + + #[test] + fn test_guided_json_nesting_depth_at_limit_accepted() { + let mut value = serde_json::json!("leaf"); + for _ in 0..MAX_JSON_SCHEMA_NESTING_DEPTH { + value = serde_json::json!({ "a": value }); + } + let result = GuidedDecodingOptions::validated(Some(value), None, None, None, None, None); + assert!(result.is_ok()); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/llm/src/protocols/common.rs` around lines 1071 - 1094, Add two tests that assert acceptance at the exact boundaries: for byte length, create a string of length MAX_JSON_SCHEMA_BYTE_LENGTH - 2 (so serialized JSON string including quotes equals MAX_JSON_SCHEMA_BYTE_LENGTH), wrap it as serde_json::Value::String, call GuidedDecodingOptions::validated(Some(...), None, None, None, None, None) and assert Ok (e.g., test_guided_json_byte_length_accepted_at_limit); for nesting depth, build a chain of exactly MAX_JSON_SCHEMA_NESTING_DEPTH nested objects (not +1), call GuidedDecodingOptions::validated with that value and assert Ok (e.g., test_guided_json_nesting_depth_accepted_at_limit). Use the existing MAX_JSON_SCHEMA_BYTE_LENGTH, MAX_JSON_SCHEMA_NESTING_DEPTH, and GuidedDecodingOptions::validated identifiers to locate where to add them.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/llm/src/protocols/common.rs`:
- Around line 77-98: The function json_exceeds_nesting_depth currently checks
depth before matching the Value, causing scalar leaves to be counted as extra
nesting; in json_exceeds_nesting_depth, remove or move the `if d > max_depth {
return true; }` check out of the top of the loop and instead check depth only
when processing container nodes: inside the serde_json::Value::Object and
::Array arms, before pushing children, test whether d + 1 > max_depth and return
true if so, otherwise push children with depth d + 1; this ensures scalar
variants (the `_` arm) do not cause false positives while preserving correct
container-depth detection.
- Around line 57-70: The error is leaking attacker-controlled payloads and
CountingWriter doesn't enforce a size cap; update CountingWriter to hold a
max_bytes field and return an error (std::io::ErrorKind::Other or a custom
error) once buf.len() would push the count past max_bytes so serialization
terminates early, and replace any error messages that use debug-printing of
GuidedDecodingOptions (e.g., places referencing GuidedDecodingOptions in error
returns/logs) with a generic, non-payload-bearing message; apply the same
no-debug-message change to the other occurrences noted (around the blocks at
~521-525 and ~586-600) so errors never echo the full struct.
---
Nitpick comments:
In `@lib/llm/src/protocols/common.rs`:
- Around line 1071-1094: Add two tests that assert acceptance at the exact
boundaries: for byte length, create a string of length
MAX_JSON_SCHEMA_BYTE_LENGTH - 2 (so serialized JSON string including quotes
equals MAX_JSON_SCHEMA_BYTE_LENGTH), wrap it as serde_json::Value::String, call
GuidedDecodingOptions::validated(Some(...), None, None, None, None, None) and
assert Ok (e.g., test_guided_json_byte_length_accepted_at_limit); for nesting
depth, build a chain of exactly MAX_JSON_SCHEMA_NESTING_DEPTH nested objects
(not +1), call GuidedDecodingOptions::validated with that value and assert Ok
(e.g., test_guided_json_nesting_depth_accepted_at_limit). Use the existing
MAX_JSON_SCHEMA_BYTE_LENGTH, MAX_JSON_SCHEMA_NESTING_DEPTH, and
GuidedDecodingOptions::validated identifiers to locate where to add them.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 827390f2-4d09-40bb-8ba0-6376b29be0d6
📒 Files selected for processing (1)
lib/llm/src/protocols/common.rs
…puts Add byte-length caps for guided_grammar (64 KiB), guided_regex (32 KiB), guided_whitespace_pattern (1 KiB), and guided_json (256 KiB serialized), plus a JSON nesting-depth cap (64) on guided_json. These bound pathological inputs that can otherwise drive unbounded recursion / runaway parsing in the downstream guided decoding backend (notably xgrammar, the default for vLLM, SGLang, and TRT-LLM workers). GuidedDecodingOptions::validate() is the single choke point on the OpenAI ingest path for all three workers. GuidedDecodingOptions::new() bypasses validation by design (used by migration.rs to re-wrap already-validated requests); the doc comment now states this explicitly. Defaults are deliberately generous and only intended to bound pathological input. Adds 8 unit tests covering each new bound at-limit and just-over. Signed-off-by: Dan Gil <dagil@nvidia.com> Made-with: Cursor
- json_exceeds_nesting_depth: count container levels only; scalar leaves
no longer over-count by one (a 64-deep object chain ending in a scalar
is now correctly at-limit instead of over-limit).
- CountingWriter: short-circuit serde_json::to_writer once the byte cap is
exceeded (returns ErrorKind::InvalidData). Pathological multi-MB schemas
now cost O(limit) to reject instead of O(input).
- GuidedDecodingOptions::validate exclusivity check: drop the {:?} debug
print of self from the error message. Without this, a request that sets
multiple guided_* fields would echo the (now-capped but still up to
256 KiB) attacker-supplied payload back into error responses and logs.
- Tests: add at-limit acceptance cases for guided_json byte length
(MAX-2 chars + 2 quotes = MAX) and nesting depth (MAX container levels
around a scalar leaf), to lock the off-by-one boundaries.
Signed-off-by: Dan Gil <dagil@nvidia.com>
Made-with: Cursor
db4664e to
44d5b31
Compare
nnshah1
left a comment
There was a problem hiding this comment.
A few simplification suggestions. Bottom line: the bounds are correct and the O(cap) short-circuit via CountingWriter is sound. Main wins would be removing one unreachable error branch, collapsing five duplicated byte-cap blocks into a helper, trimming narrative comments, and table-driving the boundary tests. Net ~40 lines lighter with no behavior change.
Address @nnshah1 simplification suggestions on top of the existing bounds: - Collapse the unreachable JSON error branches: CountingWriter only errors when the byte cap is exceeded and serde_json::Value's Serialize impl is infallible, so the post-Ok re-check and "failed to measure" branch can never fire. One Err -> one return. - Extract check_byte_len helper for the three string-field byte caps (guided_grammar, guided_regex, guided_whitespace_pattern); each call site collapses to one line. - Trim narrative comments that read like commit messages: shorten the module-level rationale, drop the migration.rs reference from the new() doc, and remove the +1/-2 arithmetic narration in the JSON tests. Keep the load-bearing WHYs (ajv default of 64, grammar-scan quote-handling NOTE). - Table-drive the three string-length test pairs into one loop over (field, limit, build) tuples. JSON tests stay separate -- they exercise the writer and depth paths specifically. Net: 54 insertions, 112 deletions. No behavior change. Signed-off-by: Dan Gil <dagil@nvidia.com> Made-with: Cursor
Overview
Adds byte-length and nesting-depth caps to
GuidedDecodingOptions::validate()so the OpenAI guided-decoding path bounds pathological inputs before they reach the downstream guided decoding backend (notablyxgrammar, the default for vLLM, SGLang, and TRT-LLM workers). Without these caps, sufficiently deep / oversizedguided_json,guided_regex,guided_grammar, orguided_whitespace_patterninputs can drive unbounded recursion or runaway parsing in the worker.Details
New constants in
lib/llm/src/protocols/common.rs:guided_grammarguided_regexguided_whitespace_patternguided_json(serialized)guided_jsonnesting depthCountingWriteris a zero-allocstd::io::Writeadapter used to measure JSON serialized length without paying forto_string.json_exceeds_nesting_depthis an iterative early-return walker: O(visited_until_breach), so wide-flat schemas bail out at the first over-depth child rather than fully traversing.GuidedDecodingOptions::new()doc comment now explicitly states it bypasses validation by design (used bymigration.rs); public callers building from external input must usevalidated()/from_optional().Defaults are deliberately generous; they exist to bound pathological inputs, not to gate normal traffic.
Where should the reviewer start?
lib/llm/src/protocols/common.rs— the new constants,CountingWriter,json_exceeds_nesting_depth, and the additions toGuidedDecodingOptions::validate().testsmodule.Test Plan
cargo test -p dynamo-llm protocols::common::tests— 16 passed, 0 failed (8 pre-existing + 8 new at-limit/just-over cases for every new bound)cargo clippy -p dynamo-llm --all-targets -- -D warnings— cleanrelease/1.0.2target)Related Issues
Made with Cursor
Summary by CodeRabbit
Bug Fixes
Documentation
Tests
Link to Devin session: https://nvidia.devinenterprise.com/sessions/da7195f7c9ca45bd875c180ffb6c29f6
Requested by: @dagil-nvidia