Codesota · Models
1,357 models indexed · 896 match filter
Editorial · Models

Every model, measured.

Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.

Vendor: Areas overview, speakleash · 253, OpenAI · 85, Google · 71, Qwen · 52, Alibaba · 47, Anthropic · 44, Microsoft · 35, Meta · 30, Mistral · 30, DeepSeek · 28, google · 19, meta-llama · 19, mistralai · 19, Meta AI · 15, CYFRAGOVPL · 14, Zhipu AI · 13, NVIDIA · 10, SpeakLeash · 10, internlm · 10, xAI · 10, ByteDance · 9, Baidu · 8, PLLuM · 8, ibm-granite · 8, microsoft · 8, Amazon · 7, Google DeepMind · 7, MiniMax · 7, Mistral AI · 7, Remek · 7, Shanghai AI Lab · 7, allenai · 7, utter-project · 7, CohereForAI · 6, Microsoft Research · 6, Salesforce · 6, 01-ai · 5, Alibaba Cloud · 5, Cohere · 5, Moonshot AI · 5, NousResearch · 5, THUML · 5, deepseek-ai · 5, DeepMind · 4, Facebook AI · 4, IBM · 4, Meituan · 4, Stanford · 4, THUDM · 4, UC San Diego · 4, VikParuchuri · 4, gguf-iq · 4, nvidia · 4, openchat · 4, tiiuae · 4, Allen AI · 3, BAAI · 3, Du et al. · 3, ForgeCode · 3, Fudan University · 3, IDEA Research · 3, Liao et al. · 3, Moonshot.AI · 3, Nam Tuan Ly / NII · 3, OPI-PG · 3, OpenDataLab · 3, ViCoS Lab Ljubljana · 3, Xiaomi · 3, Zhao et al. · 3, gguf · 3, gguf11bv30 · 3, gguf7bv30 · 3, upstage · 3, + 247 smaller vendors (291 models)
§ 01 · Computer Vision models

896 models in Computer Vision · page 15 of 18.

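| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 701 | LightOnOCR-1B-1025 | | | | | 1 | 1 |
| 702 | LlamaParse Cost Effective | LlamaIndex | Unknown | Cost-optimised LlamaParse pipeline (<$0.004/page) | | 1 | 1 |
| 703 | MAEDet | IJCAI 2025 | | | | 1 | 1 |
| 704 | MAERec | Jiang et al. | Unknown | ViT backbone + Transformer decoder, MAE self-supervised pre-training on Union14M-U | | 1 | 1 |
| 705 | MAERec-S | Research | Unknown | Masked AutoEncoder for scene text Recognition (ViT-Small) | | 1 | 1 |
| 706 | MLDG | Unknown | Unknown | Unknown | | 1 | 1 |
| 707 | MORAN | Unknown | Unknown | Unknown | | 1 | 1 |
| 708 | MambaVision-L2 | NVIDIA | 241M | Hybrid Mamba-Transformer | | 1 | 1 |
| 709 | Marker 1.10.1 | VikParuchuri | | PDF Parser | | 1 | 1 |
| 710 | Marker 1.8.2 | VikParuchuri | | | | 1 | 1 |
| 711 | Mask R-CNN (ResNeXt-101-FPN) | | | | | 1 | 1 |
| 712 | Mask2Former (Swin-L) | Meta AI | Unknown | Masked-attention Mask Transformer + Swin-L | | 1 | 1 |
| 713 | Mask2Former (Swin-L) LVIS | Meta AI | Unknown | Masked-attention Mask Transformer + Swin-L | | 1 | 1 |
| 714 | Mask2Former + ResNet-50 | | | | | 1 | 1 |
| 715 | Mask2Former + Swin-L-FaPN | | | | | 1 | 1 |
| 716 | Mask2Former + Swin-T | | | | | 1 | 1 |
| 717 | MaskFormer (Swin-T) | | | | | 1 | 1 |
| 718 | MaskOCR-L | Unknown | Unknown | Unknown | | 1 | 1 |
| 719 | MinerU2-VLM | OpenDataLab | | | | 1 | 1 |
| 720 | MinerU2-pipeline | OpenDataLab | | | | 1 | 1 |
| 721 | Mistral OCR 2 | Mistral | | Vision-Language Model | | 1 | 1 |
| 722 | MonkeyOCR-pro-1.2B | | | | | 1 | 1 |
| 723 | MonkeyOCR-pro-1.2B | MonkeyOCR | | | | 1 | 1 |
| 724 | Mr. DETR | | | | | 1 | 1 |
| 725 | Multimodal (MobileNetV2) | Unknown | Unknown | Unknown | | 1 | 1 |
| 726 | Multimodal (ResNet50) | Unknown | Unknown | Unknown | | 1 | 1 |
| 727 | Multimodal Side-Tuning (MobileNetV2) | Unknown | Unknown | Unknown | | 1 | 1 |
| 728 | Multimodal Side-Tuning (ResNet50) | Unknown | Unknown | Unknown | | 1 | 1 |
| 729 | NCBI_BERT(large) (P) | Unknown | Unknown | Unknown | | 1 | 1 |
| 730 | NCGM | Unknown | Unknown | Unknown | | 1 | 1 |
| 731 | NEC-UIUC | NEC / UIUC | | | | 1 | 1 |
| 732 | NJU-ImagineLab | Nanjing University | Unknown | Scene text detector | | 1 | 1 |
| 733 | Nanonets OCR2 3B | Nanonets | | Vision-Language OCR Model | | 1 | 1 |
| 734 | Nanonets-OCR-s | Nanonets | | | | 1 | 1 |
| 735 | Nemotron Nano V2 VL | NVIDIA | | Vision-Language Model | | 1 | 1 |
| 736 | NormTab (Targeted) + SQL | Unknown | Unknown | Unknown | | 1 | 1 |
| 737 | OCRFlux-3B | ChatDoc | | | | 1 | 1 |
| 738 | OCRVerse 4B | Unknown | 4B | Vision-Language OCR Model | | 1 | 1 |
| 739 | OTSNet | Anonymous / arxiv preprint | Unknown | Observation-Thinking-Spelling unified network | | 1 | 1 |
| 740 | OneFormer (Swin-L) | | | | | 1 | 1 |
| 741 | Oracle-BOW | Unknown | | oracle-extractive | | 1 | 1 |
| 742 | Oracle-BOW (HowSumm-Method) | Unknown | | | | 1 | 1 |
| 743 | Oracle-HierSumm | Unknown | | oracle-extractive | | 1 | 1 |
| 744 | PAC | Yan et al. | | | | 1 | 1 |
| 745 | PANet (Joint) | ICCV 2019 | | | | 1 | 1 |
| 746 | PGNet-E | Unknown | Unknown | Unknown | | 1 | 1 |
| 747 | PLBART | UCLA / Columbia University | 140M | Transformer encoder-decoder | | 1 | 1 |
| 748 | POINTS-Reader | Research | | | | 1 | 1 |
| 749 | PP-StructureV3 | Baidu | | | | 1 | 1 |
| 750 | PSENet | CVPR 2019 | | | | 1 | 1 |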