LLM ASICs on USB sticks?
makeasnek@lemmy.ml to AI@lemmy.ml · English · 5 months ago
Source (nostr): https://snort.social/nevent1qqsg9c49el0uvn262eq8j3ukqx5jvxzrgcvajcxp23dgru3acfsjqdgzyprqcf0xst760qet2tglytfay2e3wmvh9asdehpjztkceyh0s5r9cqcyqqqqqqgt7uh3n
Paper: https://arxiv.org/abs/2406.02528
Smorty [she/her]@lemmy.blahaj.zone · 15 days ago
Something similar to this already kinda exists on HF with the 1.58-bit quantisation, which seems to get very similar performance to the original Llama 3 8B model. That’s essentially a two-bit quantisation with reasonable performance!
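For anyone wondering where "1.58 bit" comes from: each weight takes one of three values (-1, 0, +1), and log2(3) ≈ 1.58. Below is a rough sketch of that idea in plain Python. It assumes an absmean-style scale (divide by the mean absolute weight, round, clip), and the function name `ternary_quantize` and the sample weights are made up for illustration. It is not necessarily how the HF model mentioned above was produced.

```python
import math

def ternary_quantize(weights):
    """Map each weight to -1, 0, or +1 using an absmean scale (illustrative sketch)."""
    # Scale by the mean absolute value so typical weights land near +/-1.
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clip into the ternary range.
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

# Three possible values per weight carry log2(3) bits of information.
BITS_PER_TERNARY_WEIGHT = math.log2(3)

weights = [0.9, -0.05, -1.3, 0.4, 0.02, -0.6]  # hypothetical sample weights
quantized, scale = ternary_quantize(weights)

print(f"bits per weight: {BITS_PER_TERNARY_WEIGHT:.2f}")
print("quantized:", quantized)
```

In practice a ternary weight is often stored in two bits, which is why it gets called "essentially two bit" even though the information content is lower.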
Fisch@discuss.tchncs.de · English · 15 days ago
That’s really interesting, gonna try out how well it runs