I’d recommend koboldcpp for your backend, SillyTavern for your frontend, and I’ve been a fan of dolphin-2.1-mistral-7B; I’ve been using the Q4_K_S quant. But you could probably run a 13B model just fine.
I’ve heard good things about the nous-hermes models (I was a big fan of their Llama2 model). I’d stick to mistral variants, personally. Their dataset/training has far surpassed base Llama2 stuff in my opinion.
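If it helps, here’s a rough sketch of wiring that stack together. The filename and paths are placeholders (grab the Q4_K_S GGUF of dolphin-2.1-mistral-7B from wherever you normally download quants), and exact flags can vary between koboldcpp versions:

```shell
# Launch koboldcpp as the backend, loading the Q4_K_S GGUF.
# --contextsize sets the context window; koboldcpp's API listens on port 5001 by default.
python koboldcpp.py --model dolphin-2.1-mistral-7b.Q4_K_S.gguf --contextsize 4096

# Then in SillyTavern, pick the KoboldCpp API type and point it at:
#   http://localhost:5001
```

Once SillyTavern connects, you can swap in a 13B GGUF the same way — just expect it to use more RAM/VRAM at the same quant level.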