metadata
license: apache-2.0
datasets:
- sentence-transformers/quora-duplicates
language:
- en
base_model:
- distilbert/distilroberta-base
pipeline_tag: text-ranking
library_name: sentence-transformers
tags:
- transformers
Cross-Encoder for Quora Duplicate Questions Detection
This model was trained using the SentenceTransformers Cross-Encoder class.
Training Data
This model was trained on the Quora Duplicate Questions dataset. Given two questions, it predicts a score between 0 and 1 indicating how likely they are to be duplicates.
Note: The model is not suitable for estimating the similarity of questions. For example, the two questions "How to learn Java" and "How to learn Python" will receive a rather low score, as they are not duplicates.
Usage and Performance
Pre-trained models can be used like this:
from sentence_transformers import CrossEncoder
model = CrossEncoder('cross-encoder/quora-distilroberta-base')
scores = model.predict([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])
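Since the scores are duplicate likelihoods between 0 and 1, a common follow-up step is to keep only the pairs above a cutoff. The sketch below is one way to do that; the helper names `flag_duplicates` and `score_question_pairs` and the default threshold of 0.5 are assumptions for illustration, not part of the model or of sentence_transformers, and the threshold should be tuned on your own data.

```python
from typing import List, Sequence, Tuple

MODEL_NAME = "cross-encoder/quora-distilroberta-base"

QuestionPair = Tuple[str, str]


def flag_duplicates(
    pairs: Sequence[QuestionPair],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, not a value from the model card
) -> List[Tuple[QuestionPair, float]]:
    """Return the pairs scoring at or above the threshold, highest score first."""
    ranked = sorted(zip(pairs, scores), key=lambda item: item[1], reverse=True)
    return [(pair, float(score)) for pair, score in ranked if score >= threshold]


def score_question_pairs(pairs: Sequence[QuestionPair]) -> List[float]:
    """Score question pairs with the cross-encoder (downloads the model on first use)."""
    # Imported lazily so flag_duplicates can be used without the library installed.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(MODEL_NAME)
    return [float(score) for score in model.predict(list(pairs))]
```

With both helpers in place, `flag_duplicates(pairs, score_question_pairs(pairs))` would return the likely duplicates among `pairs`, ordered from most to least likely.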
You can also use this model without sentence_transformers, using just the Transformers AutoModel class.
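A minimal sketch of the Transformers-only route might look like the following. It assumes the checkpoint is loaded with `AutoModelForSequenceClassification`, that the classification head emits a single logit per pair, and that a sigmoid maps that logit to the 0 to 1 score described above; `logit_to_probability` and `score_pairs` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple

MODEL_NAME = "cross-encoder/quora-distilroberta-base"


def logit_to_probability(logit: float) -> float:
    """Map a raw logit to (0, 1) with a numerically stable logistic sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)


def score_pairs(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Score question pairs using only Transformers and PyTorch."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
    model.eval()

    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    # Each pair is encoded jointly as one sequence, as a cross-encoder expects.
    features = tokenizer(
        first, second, padding=True, truncation=True, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**features).logits  # assumed shape: (batch, 1)
    return [logit_to_probability(x) for x in logits.squeeze(-1).tolist()]
```

Calling `score_pairs([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])` would return one score per pair. If the head turned out to have more than one output label, the final conversion step would need to change.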