	---
	license: apache-2.0
	datasets:
	- sentence-transformers/quora-duplicates
	language:
	- en
	base_model:
	- distilbert/distilroberta-base
	pipeline_tag: text-ranking
	library_name: sentence-transformers
	tags:
	- transformers
	---
	# Cross-Encoder for Quora Duplicate Questions Detection
	This model was trained using the [SentenceTransformers](https://sbert.net) [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html) class.

	## Training Data
	This model was trained on the [Quora Duplicate Questions](https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs) dataset. Given two questions, the model predicts a score between 0 and 1 indicating how likely they are to be duplicates.

	Note: The model is not suitable for estimating the similarity of questions. For example, the two questions "How to learn Java" and "How to learn Python" will receive a rather low score, because they are similar but not duplicates.

	## Usage and Performance

	Pre-trained models can be used like this:
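	```python
	from sentence_transformers import CrossEncoder

	# Load the pre-trained cross-encoder from the Hugging Face Hub
	model = CrossEncoder('cross-encoder/quora-distilroberta-base')
	# Each tuple is one question pair; predict returns one score per pair
	scores = model.predict([('Question 1', 'Question 2'), ('Question 3', 'Question 4')])
	```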
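
Since the scores lie between 0 and 1, one way to turn them into duplicate decisions is to apply a cutoff. The helper below is a minimal sketch of that idea: `flag_duplicates` is a hypothetical name, and the `threshold` of 0.5 is an assumption rather than a value recommended by the model authors, so it should be tuned on your own data.

```python
def flag_duplicates(pairs, scores, threshold=0.5):
    """Attach a score and a duplicate flag to each question pair.

    `threshold` is an assumed cutoff, not a value from the model card.
    """
    return [
        (first, second, float(score), float(score) >= threshold)
        for (first, second), score in zip(pairs, scores)
    ]


# Example usage (requires sentence_transformers and downloads the model):
# from sentence_transformers import CrossEncoder
# model = CrossEncoder('cross-encoder/quora-distilroberta-base')
# pairs = [('How to learn Java', 'How to learn Python')]
# print(flag_duplicates(pairs, model.predict(pairs)))
```

A higher threshold flags fewer pairs and trades recall for precision; a lower one does the opposite.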

	You can also use this model without sentence_transformers, using just the Transformers ``AutoModel`` class.
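
A minimal sketch of the Transformers-only route follows. It makes two assumptions that the text above does not confirm: it loads the checkpoint with `AutoModelForSequenceClassification` (plain `AutoModel` would load the encoder without a classification head), and it assumes the head emits a single logit per pair, which a sigmoid maps to the 0 to 1 range. The helper names `load_model` and `score_pairs` are hypothetical.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "cross-encoder/quora-distilroberta-base"


def load_model(name=MODEL_NAME):
    """Load the tokenizer and sequence-classification model from the Hub."""
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)
    return tokenizer, model


def score_pairs(pairs, model, tokenizer):
    """Return one score per (question_a, question_b) pair."""
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    # Encode each pair jointly, as a cross-encoder expects
    features = tokenizer(first, second, padding=True, truncation=True, return_tensors="pt")
    model.eval()
    with torch.no_grad():
        logits = model(**features).logits
    # Assumption: one logit per pair; sigmoid maps it into the 0..1 range
    return torch.sigmoid(logits).squeeze(-1).tolist()


# Example usage (downloads the model on first call):
# tokenizer, model = load_model()
# print(score_pairs([("How to learn Java", "How to learn Python")], model, tokenizer))
```

If the scores from this sketch differ from those returned by `CrossEncoder.predict`, check the checkpoint's number of output labels and activation function before relying on them.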