
❓ [Question] how to reduce peak memory usage when loading a Torch-TensorRT module #3998

@patrick-botco

Description

❓ Question

Hey team,

Happy new year!

I am looking to reduce peak memory usage when loading a Torch-TensorRT module. Given an ExportedProgram, is there a more memory-efficient way to do this? This is on a Jetson.

What you have already tried

  1. compiled_model = torch_tensorrt.dynamo.compile(...), following the resource-management guidance at https://docs.pytorch.org/TensorRT/contributors/resource_management.html
  2. torch.save(compiled_model, path, pickle_protocol=5)
  3. torch.load(path, weights_only=False) (cold start, nothing else running)

The result is very high peak memory usage: several times the size of the serialized engine.
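For reference, one way to quantify the peak usage described above is to read the process's peak resident set size around the load step. This is a minimal, hedged sketch using only the Python standard library; `load_step` here is a hypothetical placeholder for the actual `torch.load(path, weights_only=False)` call, not part of the original report.

```python
import resource


def peak_rss_mb():
    """Return this process's peak resident set size in MiB.

    Note: ru_maxrss is reported in kilobytes on Linux (bytes on macOS),
    so this conversion assumes a Linux host such as a Jetson.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


def load_step():
    # Hypothetical stand-in for torch.load(path, weights_only=False);
    # allocates ~16 MiB so the measurement has something to observe.
    return bytearray(16 * 1024 * 1024)


before = peak_rss_mb()
blob = load_step()
after = peak_rss_mb()
print(f"peak RSS before load: {before:.1f} MiB, after: {after:.1f} MiB")
```

Comparing the post-load peak against the serialized engine size on disk makes the "a few times more" observation concrete and easy to include in a bug report.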

Thank you! cc @peri044 @lanluo-nvidia @narendasan

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

Labels

question (Further information is requested)
