TinyChat Computer running Llama2-7B on Jetson Orin Nano. Key technique: AWQ 4-bit quantization.

MIT HAN Lab

1 year ago
