Recommended default
Qwen3-14B
Use a Q4 or Q5 GGUF quantization; a 16k-32k context window is practical. This is the highest-scoring current open-weight model that fits this card cleanly. It was selected by benchmark score first, then fit, then freshness, not by parameter count.
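The selection order above (benchmark, then fit, then freshness) can be sketched as a small ranking function. This is only an illustration of the rule, not the tooling that produced this card: the `Candidate` fields, the `pick_default` name, and every number in `EXAMPLE` are hypothetical placeholders, and freshness is assumed to act as a tie-break between equal scores.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    score: float       # placeholder benchmark score, higher is better
    vram_gb: float     # placeholder footprint at the chosen quantization
    released: date     # placeholder release date


def pick_default(candidates: Sequence[Candidate], card_gb: float) -> Optional[Candidate]:
    """Rank by benchmark score, break ties by freshness, take the first that fits."""
    # Parameter count is deliberately absent from the sort key.
    ranked = sorted(candidates, key=lambda c: (c.score, c.released), reverse=True)
    return next((c for c in ranked if c.vram_gb <= card_gb), None)


# Made-up rows, purely to exercise the function.
EXAMPLE = [
    Candidate("legacy-70b", 0.90, 40.0, date(2024, 1, 1)),
    Candidate("current-14b", 0.80, 10.0, date(2025, 1, 1)),
    Candidate("legacy-12b", 0.60, 9.0, date(2024, 6, 1)),
]

choice = pick_default(EXAMPLE, card_gb=16.0)
```

With these placeholder rows, the top-scoring entry is skipped because it does not fit, and the next entry down the ranking is chosen.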
Benchmark anchor
Qwen3-14B is the stronger current small-to-mid baseline: it is clearly ahead of the legacy Mistral/Llama 8B-12B rows on reasoning and coding.
Evidence
The Qwen3 family is the stronger current baseline at this size; legacy 70B baselines are irrelevant on a 16 GB card because they do not fit in its memory.
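A rough way to sanity-check "fits this card" is to add the quantized weight size to the KV cache size for the target context. The sketch below is back-of-the-envelope arithmetic under loud assumptions: a nominal 14e9 parameters, roughly 4.5 and 5.5 effective bits per weight for Q4 and Q5, an fp16 KV cache, 1 GiB of runtime overhead, and an architecture of 40 layers, 8 KV heads, and head dimension 128. None of these values come from this card; check them against the model's config file and the actual GGUF file size before relying on the result.

```python
GIB = 1024 ** 3


def weights_gib(params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return params * bits_per_weight / 8 / GIB


def kv_cache_gib(tokens: int, layers: int, kv_heads: int, head_dim: int,
                 bytes_per_elem: int = 2) -> float:
    """KV cache size in GiB: one K and one V vector per layer per token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens / GIB


def total_gib(params: float, bits_per_weight: float, tokens: int,
              layers: int, kv_heads: int, head_dim: int,
              overhead_gib: float = 1.0) -> float:
    """Weights + KV cache + a flat allowance for runtime buffers (assumed)."""
    return (weights_gib(params, bits_per_weight)
            + kv_cache_gib(tokens, layers, kv_heads, head_dim)
            + overhead_gib)


# Assumed values, not taken from this card.
ARCH = dict(layers=40, kv_heads=8, head_dim=128)
CARD_GIB = 16.0

for label, bpw in (("Q4", 4.5), ("Q5", 5.5)):
    for ctx in (16_384, 32_768):
        need = total_gib(14e9, bpw, ctx, **ARCH)
        print(f"{label} @ {ctx} tokens: {need:.1f} GiB, fits={need <= CARD_GIB}")
```

Under these assumptions the KV cache costs 2.5 GiB at 16k tokens and doubles at 32k, which is why context length, not just quantization level, decides whether a configuration stays inside 16 GB.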