Codesota · Models · BART-large-mnli0 results · 0 benchmarks
Model card

BART-large-mnli.

Text ClassificationText ClassificationApache 2.0

Zero-shot classification using NLI.

§ 01 · Card

Model card,
inline.

Rendered server-side from the upstream README on Hugging Face — same content as the source repo, with editorial typography. The full card, sample weights, and revision history live on HF.


Source
facebook/bart-large-mnli
License
mit
Pipeline
zero-shot-classification

bart-large-mnli

This is the checkpoint for bart-large after being trained on the MultiNLI (MNLI) dataset.

Additional information about this model:

  • The bart-large model page
  • [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

](https://arxiv.org/abs/1910.13461)

NLI-based Zero Shot Text Classification

Yin et al. proposed a method for using pre-trained NLI models as a ready-made zero-shot sequence classifiers. The method works by posing the sequence to be classified as the NLI premise and to construct a hypothesis from each candidate label. For example, if we want to evaluate whether a sequence belongs to the class "politics", we could construct a hypothesis of This text is about politics.. The probabilities for entailment and contradiction are then converted to label probabilities.

This method is surprisingly effective in many cases, particularly when used with larger pre-trained models like BART and Roberta. See this blog post for a more expansive introduction to this and other zero shot methods, and see the code snippets below for examples of using this model for zero-shot classification both with Hugging Face's built-in pipeline and with native Transformers/PyTorch code.

With the zero-shot classification pipeline

The model can be loaded with the zero-shot-classification pipeline like so:

python
from transformers import pipeline classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

You can then use this pipeline to classify sequences into any of the class names you specify.

python
sequence_to_classify = "one day I will see the world" candidate_labels = ['travel', 'cooking', 'dancing'] classifier(sequence_to_classify, candidate_labels) #{'labels': ['travel', 'dancing', 'cooking'], # 'scores': [0.9938651323318481, 0.0032737774308770895, 0.002861034357920289], # 'sequence': 'one day I will see the world'}

If more than one candidate label can be correct, pass multi_label=True to calculate each class independently:

python
candidate_labels = ['travel', 'cooking', 'dancing', 'exploration'] classifier(sequence_to_classify, candidate_labels, multi_label=True) #{'labels': ['travel', 'exploration', 'dancing', 'cooking'], # 'scores': [0.9945111274719238, # 0.9383890628814697, # 0.0057061901316046715, # 0.0018193122232332826], # 'sequence': 'one day I will see the world'}
With manual PyTorch
python
# pose sequence as a NLI premise and label as a hypothesis from transformers import AutoModelForSequenceClassification, AutoTokenizer nli_model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli') tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli') premise = sequence hypothesis = f'This example is {label}.' # run through model pre-trained on MNLI x = tokenizer.encode(premise, hypothesis, return_tensors='pt', truncation_strategy='only_first') logits = nli_model(x.to(device))[0] # we throw away "neutral" (dim 1) and take the probability of # "entailment" (2) as the probability of the label being true entail_contradiction_logits = logits[:,[0,2]] probs = entail_contradiction_logits.softmax(dim=1) prob_label_is_true = probs[:,1]
Card content reproduced from huggingface.co/facebook/bart-large-mnli under the upstream license. Rendering trims fenced HTML, raw widgets and tables for safety; tap the link for the untouched original.
§ 02 · Benchmarks

No recorded benchmark results yet.

This model is in the registry but doesn’t have any benchmark_results rows yet. If you have a score, submit it →

Rank column shows this model’s position vs all other models scored on the same benchmark + metric (competitors after the slash). #1 in red means current SOTA. Sorted by rank, then newest result.