Post-transformer inference: 224× compression of Llama-70B with improved accuracy

Status: Not open for further replies.