
[Cutmix] Make fn.multi_paste more flexible, fix validation #5331

Merged
stiepan merged 6 commits into NVIDIA:main from stiepan:multipaste_cutmix
Mar 1, 2024

Conversation

@stiepan
Member

@stiepan stiepan commented Feb 19, 2024

Category:

New feature (non-breaking change which adds functionality)
Bug fix (non-breaking change which fixes an issue)

Description:

This PR reworks the arguments of fn.multi_paste and their parsing to make the operator more flexible and easier to use (with the cutmix augmentation in mind; namely, the ability to mix different batches and no need to specify the image shape explicitly when the inputs are uniformly shaped). It also fixes a couple of bugs.

Fixes:

  • There was no validation of in_ids, leading to out-of-bounds accesses for incorrect input
  • The number of channels was handled incorrectly:
    • The GPU backend assumed 3 channels regardless of the actual number of channels, leading to incorrect CUDA memory accesses.
    • Neither backend verified that all the pasted regions have the same number of channels (or that the output's number of channels matches). The number of output channels was simply copied from the corresponding input sample. The PR changes that: the number of output channels is now inferred from the actually pasted regions, and the old behaviour is kept only when there are no regions (to stay compatible with the only case that could have worked previously).
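The channel-handling rules described above can be sketched as follows. This is a minimal Python illustration of the validation logic, not the actual C++ implementation; the helper name and signature are hypothetical:

```python
# Hypothetical sketch of the per-output-sample validation described in the PR:
# every in_ids entry must be a valid batch index, all pasted regions must agree
# on the channel count, and the output channel count is inferred from the
# regions, falling back to the corresponding input sample when there are none.
def infer_output_channels(in_ids, channels_per_sample, sample_idx, batch_size):
    region_channels = set()
    for in_id in in_ids:
        if not 0 <= in_id < batch_size:
            raise ValueError(
                f"in_ids entry {in_id} is out of bounds for a batch of {batch_size}")
        region_channels.add(channels_per_sample[in_id])
    if len(region_channels) > 1:
        raise ValueError(
            f"Pasted regions have mismatched channel counts: {sorted(region_channels)}")
    if not region_channels:
        # No regions pasted: keep the old behaviour and copy the channel
        # count from the corresponding input sample.
        return channels_per_sample[sample_idx]
    return region_channels.pop()
```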

New features:

  • in_anchors, region shapes, and out_anchors now have relative counterparts: they can be specified as floats in [0, 1], relative to the input shape, input shape, and output shape respectively.
  • If all the input shapes are uniform and no output size is provided, the output shape is assumed to be the same.
  • To allow mixing images from different batches, the operator can accept multiple inputs; in that case, in_ids must not be specified and the regions are pasted elementwise.

The first two points aim to make it easier to use fn.multi_paste without explicitly handling the actual shapes of the samples.
The multi-input mode should make it easier to use the operator in simple applications where the number of paste regions is uniform: with DALI's implicit batch you can think in terms of samples rather than indices in the implicit batch. It also enables cases where you need to mix images from different sources.
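The relative-coordinate semantics can be illustrated with a small NumPy sketch. This is an assumed interpretation of the feature described above, not DALI code; the function names are hypothetical, and the example uses a single cutmix-style paste of the right half of one image onto another:

```python
import numpy as np

# Sketch of relative anchors/shapes: values in [0, 1] are scaled by the
# relevant extent -- in_anchors_rel and shapes_rel by the input shape,
# out_anchors_rel by the output shape -- then truncated to integer pixels.
def to_absolute(rel, extent):
    """Scale relative [0, 1] coordinates (y, x) by an extent (H, W)."""
    return tuple(int(r * e) for r, e in zip(rel, extent))

def paste_region(out_img, in_img, in_anchor_rel, shape_rel, out_anchor_rel):
    in_h, in_w = in_img.shape[:2]
    out_h, out_w = out_img.shape[:2]
    iy, ix = to_absolute(in_anchor_rel, (in_h, in_w))
    rh, rw = to_absolute(shape_rel, (in_h, in_w))
    oy, ox = to_absolute(out_anchor_rel, (out_h, out_w))
    out_img[oy:oy + rh, ox:ox + rw] = in_img[iy:iy + rh, ix:ix + rw]

# Cutmix-style mix of two uniformly shaped samples (as if they came from
# two input batches, pasted elementwise): right half of b replaces that of a.
a = np.zeros((4, 4, 3), dtype=np.uint8)
b = np.full((4, 4, 3), 255, dtype=np.uint8)
paste_region(a, b, (0.0, 0.5), (1.0, 0.5), (0.0, 0.5))
```

With uniform input shapes and no output size given, the output here simply keeps the shape of the inputs, matching the second point above.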

Other changes:

  • The arguments are parsed once and cached to avoid recomputing them along the way.
  • The no-intersection check is moved to the CPU implementation; its outputs were ignored by the GPU implementation anyway.
  • Added video support.

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

cutmix

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-3496

@stiepan
Member Author

stiepan commented Feb 19, 2024

!build

Comment thread dali/test/python/operator_1/test_multipaste.py Fixed
@dali-automaton
Collaborator

CI MESSAGE: [12935254]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [12935254]: BUILD FAILED

@stiepan stiepan changed the title [Cutmix] Make fn.multipaste more flexible, fix validation [Cutmix] Make fn.multi_paste more flexible, fix validation Feb 20, 2024
… output shape, allow multiple inputs

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan stiepan marked this pull request as ready for review February 26, 2024 11:09
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan
Member Author

stiepan commented Feb 26, 2024

!build

@dali-automaton
Collaborator

CI MESSAGE: [13087567]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [13087567]: BUILD PASSED

Comment thread dali/operators/image/paste/multipaste.cc Outdated
Contributor

@klecki klecki left a comment


Before looking at tests, posting the comments

Comment thread dali/operators/image/paste/multipaste.cc Outdated
Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment thread dali/operators/image/paste/multipaste.h
Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment on lines +279 to +280
sample_idx, ". It should be a 2D tensor of shape [number of pasted regions]",
"x2 (i.e. ", paste_count, "x", spatial_ndim,
Contributor


Nitpick, but I would format the shape as {number of pasted regions, 2} or something like this; the [number]x2 is a bit weird.

Whatever brace/parenthesis we use for shapes...

Member Author

@stiepan stiepan Feb 27, 2024


We print the shape without parentheses, with x as the delimiter. :)

I put the inline TensorShape{} for printing the actual shapes consistently and was a bit surprised.

Member Author


(screenshot attached)

Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment on lines +312 to +314
in_anchors_data_[i].resize(n_paste);
region_shapes_data_[i].resize(n_paste);
out_anchors_data_[i].resize(n_paste);
Contributor


It probably doesn't matter much (and for sure we do similarly bad things), but vector of vector sounds like a possibility of a lot of allocations.

Member Author


I tried replacing it with the TensorListView to have the variable sample sizes handled, but if anything, it performed slightly worse end-to-end.
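For reference, the allocation-heavy vector-of-vectors layout discussed above is commonly flattened into one contiguous buffer plus an offsets array, which is roughly what a TensorListView-style container does. A minimal Python sketch of that layout (hypothetical illustration, not the DALI code in question):

```python
# Flatten a nested per-sample container into a single contiguous buffer plus
# an offsets array: two allocations total, instead of one per inner vector.
def flatten(nested):
    flat, offsets = [], [0]
    for row_items in nested:
        flat.extend(row_items)
        offsets.append(len(flat))
    return flat, offsets

def row(flat, offsets, i):
    """Return the i-th variable-length row as a slice of the flat buffer."""
    return flat[offsets[i]:offsets[i + 1]]
```

Whether this wins in practice depends on access patterns and reuse, which matches the observation above that the end-to-end difference was marginal.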

Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment thread dali/kernels/imgproc/paste/paste_gpu.h Outdated
Comment thread dali/kernels/imgproc/paste/paste_gpu.h Outdated
…arer messages, remove potential oob access, adjusted docs

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Comment thread dali/operators/image/paste/multipaste.cc Outdated
Comment thread dali/operators/image/paste/multipaste.cc Outdated
Comment thread dali/operators/image/paste/multipaste.cc Outdated
Comment thread dali/operators/image/paste/multipaste.cc Outdated
Comment thread dali/operators/image/paste/multipaste.cu
Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment thread dali/operators/image/paste/multipaste.h Outdated
Comment thread dali/operators/image/paste/multipaste.h
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan
Member Author

stiepan commented Mar 1, 2024

!build

@dali-automaton
Collaborator

CI MESSAGE: [13201260]: BUILD STARTED

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan
Member Author

stiepan commented Mar 1, 2024

!build

@dali-automaton
Collaborator

CI MESSAGE: [13202219]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [13202219]: BUILD PASSED

@stiepan stiepan merged commit 8889d96 into NVIDIA:main Mar 1, 2024