Recommended default
Qwen3.6-35B-A3B Q4 / EXL2
Use a Q4 GGUF or EXL2 quantization with a modest context length. This is the highest-scoring current open-weight model that fits this card cleanly. It was selected by benchmark score first, then fit, then freshness, not by parameter count.
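The selection order described here (benchmark, then fit, then freshness) can be sketched roughly as follows. This is an illustrative sketch only, not the method actually used to build this recommendation: the `Candidate` fields, the 4.5 bits-per-weight figure for Q4, and the 4 GiB headroom for KV cache and runtime overhead are all assumptions, and any scores or VRAM sizes passed in are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model/quantization pair."""
    name: str
    score: float            # aggregate benchmark score (assumed metric)
    params_b: float         # total parameters, in billions
    bits_per_weight: float  # e.g. about 4.5 for a Q4-class quant (assumption)
    released: date


def weight_gib(c: Candidate) -> float:
    """Approximate size of the quantized weights in GiB."""
    return c.params_b * 1e9 * c.bits_per_weight / 8 / 2**30


def fits(c: Candidate, vram_gib: float, headroom_gib: float = 4.0) -> bool:
    """True if weights plus an assumed headroom fit in the card's VRAM."""
    return weight_gib(c) + headroom_gib <= vram_gib


def pick_default(cands: Iterable[Candidate], vram_gib: float) -> Optional[Candidate]:
    """Highest score among models that fit; newer release breaks ties.

    Parameter count is never used as a ranking key, only to estimate fit.
    """
    fitting = [c for c in cands if fits(c, vram_gib)]
    if not fitting:
        return None
    return max(fitting, key=lambda c: (c.score, c.released))
```

Under these assumptions, a 35B-parameter model at about 4.5 bits per weight needs roughly 18.3 GiB for weights alone, which is why the context length has to stay modest on a card with limited headroom. Because MoE models such as A3B variants still load all experts, the fit estimate uses total parameters, not active ones.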
Benchmark anchor
MMLU-Pro 85.6 / 85.0 · GPQA Diamond 84.9 / 84.8 · AIME 2025 89.2 / 88.8. Same score class as the 3090 recommendation, with much faster throughput.
Evidence
Qwen says Qwen3.6-35B-A3B is built for stability and coding utility. Its benchmark profile matches the 3090 row, with faster throughput on this card.