
❓ [Question] how to reduce peak memory usage when loading a Torch-TensorRT module #3998

@patrick-botco

Description

❓ Question

Hey team,

Happy new year!

I am looking to reduce peak memory usage when loading a Torch-TensorRT module. Given an ExportedProgram, is there a more memory-efficient way to do this? This is on a Jetson.

What you have already tried

  1. compiled_model = torch_tensorrt.dynamo.compile(...), following the resource-management guidance at https://docs.pytorch.org/TensorRT/contributors/resource_management.html
  2. torch.save(compiled_model, path, pickle_protocol=5)
  3. torch.load(path, weights_only=False) (cold start, nothing else running)

The result is very high peak memory usage: several times the size of the serialized engine.
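For reference, one way to quantify the peak usage described above is to read the process's peak resident set size around the load step. This is a minimal, hedged sketch using only the Python standard library; `load_step` here is a hypothetical placeholder for the actual `torch.load(path, weights_only=False)` call, not part of the original report.

```python
import resource


def peak_rss_mb():
    """Return this process's peak resident set size in MiB.

    Note: ru_maxrss is reported in kilobytes on Linux (bytes on macOS),
    so this conversion assumes a Linux host such as a Jetson.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


def load_step():
    # Hypothetical stand-in for torch.load(path, weights_only=False);
    # allocates ~16 MiB so the measurement has something to observe.
    return bytearray(16 * 1024 * 1024)


before = peak_rss_mb()
blob = load_step()
after = peak_rss_mb()
print(f"peak RSS before load: {before:.1f} MiB, after: {after:.1f} MiB")
```

Comparing the post-load peak against the serialized engine size on disk makes the "a few times more" observation concrete and easy to include in a bug report.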

Thank you! cc @peri044 @lanluo-nvidia @narendasan

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

Labels

question (Further information is requested)
