---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dense
  - generated_from_trainer
  - dataset_size:16909
  - loss:TripletLoss
base_model: BAAI/bge-large-en-v1.5
widget:
  - source_sentence: >-
      Under what conditions is the default start position of 1 used for a
      dimension in the resulting array?
    sentences:
      - >-
        Array variables example 3: Variable Unit_Prices


        Inheritance of dimension start position and index values in numerical

        expressions.


        The following non-aggregating non-portfolio 1-dimensional array currency
        variables are defined in

        the assumption set Assumption_Set (in all cases the dimension name is
        Fund and it has character

        index values):

         | Dimension properties | 
        Variable | Size | Start
                                                position | Indices | Array elements
        Unit_Prices | 3 | 101 | "A", "B", "C" | 1.25, 0.93, 1.81

        Unit_Prices_2 | 3 | 101 | "A", "B", "C" | 1.21, 0.97, 1.73

        Unit_Prices_3 | 3 | 201 | "A", "B", "C" | 1.32, 0.79, 1.35

        Unit_Prices_4 | 3 | 101 | "X", "Y", "Z" | 1.12, 0.89, 1.97

        Unit_Prices_5 | 2 | 102 | "B", "C" | 0.93, 1.93

        Unit_Prices_6 | 3 | 201 | "X", "Y", "Z" | 1.19, 0.98, 1.95


        These variables are used in the formula of the following variables in
        the program Program in the

        projection process Projection_Process, which is used in the model
        Array_Model (all these variables

        have a single dimension called Fund):

         |  | Dimension properties | 
        Variable | Formula | Size | Start position | Indices | Array elements

        Variable_21 | Unit_Prices - Unit_Prices_2 | 3 | 101 | "A", "B", "C" |
        0.04, -0.04, 0.08

        Variable_22 | Unit_Prices - Unit_Prices_3 | 3 | 1 | "A", "B", "C" |
        -0.07, 0.14, 0.46

        Variable_23 | Unit_Prices - Unit_Prices_4 | 3 | 101 | (undefined) |
        0.13, 0.04, -0.16

        Variable_24 | Unit_Prices[<Fund.index= "B" : "C">] - Unit_Prices_5 | 2 |
        102 | (undefined) | 0, -0.12

        Variable_25 | Unit_Prices - Unit_Prices_6 | 3 | 1 | (undefined) | 0.06,
        -0.05, -0.14


        Notes:


        * The rank of the arrays (number of dimensions), dimension names and
        dimension sizes must be

        identical for such numerical expressions to be valid.

        * If the indices in a particular dimension are the same in both arrays,
        they will be inherited by

        the resulting array, otherwise no indices will be defined in that
        dimension.

        * If the start positions in a particular dimension are the same in both
        arrays, they will be

        inherited by the resulting array, otherwise the default start position
        of 1 will be used in that

        dimension.

        * We could not have a formula like Unit_Prices - Unit_Prices_5, because
        these arrays have

        differently sized dimensions.

        * The subset of an array variable in the formula of Variable_24 loses
        its indices. This means that

        Variable_24 cannot inherit consistent indices and so none are defined
        for it.

        * The subset of an array variable in the formula of Variable_24 inherits
        the numbering of its

        element positions from the variable Unit_Prices, so its start position
        is set to 102. This is the

        same as the start position of Unit_Prices_5, so Variable_24 has its
        dimension start position set to

        102.
      - |-
        ## Examples

        Suppose:

        Variable is a 2-dimensional array

         | Dimension name | Size | Start position
        1 | Dimension_1 | 2 | 4
        2 | Dimension_2 | 3 | 7

        Dimension_2 | Dimension_1
        Position = 4 | Position = 5
        Position = 7 | 1 | 2
        Position = 8 | 3 | 4
        Position = 9 | 5 | 6

        Then:

        Dimension_Start(Variable, <Dimension_1>)

        = 4

        Dimension_Start(Variable, <Dimension_2>)

        = 7
      - >-
        ## Other situations where indices are lost


        There are a number of other circumstances where the indices for an array
        dimension are lost:


        * If an array has a changeable dimension and the array is aggregated
        across events using the .total extension, the indices in

        that dimension will be lost.

        * If an array has a changeable dimension and the array is passed from a
        sub layer to a calling

        layer then the indices in that dimension will be lost.

        * If an array has a changeable dimension and the array is calculated in
        a stochastic return value

        variable then the indices in that dimension will be lost.
  - source_sentence: Where can I find examples of batches within the system?
    sentences:
      - >-
        Grouping example 3: Admin_Grouping


        Used in the Calculation grouping property of a parent program.


        This grouping has the following properties:


        Property | Value

        Name | Admin_Grouping

        Category | Policy

        Description | Group by method of policy administration

        Group identifier | Internet_Admin.text


        This grouping contains just one group:


        Property | Value

        Name | Internet_Admin

        Category | Policy

        Description | Group Internet_Admin by value

        Data type | Indicator

        Grouping expression | Internet_Admin

        Method | By value

        Range boundaries | 

        Boundary value in | [Range above]


        The Grouping expression property is set to the indicator variable
        Internet_Admin, so the Data

        type property must be set to


        Indicator


        .


        The Method property is set to


        By value


        (so the Range boundaries and Boundary value in

        properties will be ignored) and the records will be grouped together
        according to the value of the

        grouping expression. In the


        data view


        Traditional_Data_View


        the variable

        Internet_Admin is read from data, but is expected to take one of two
        possible values. Since this

        variable defines the grouping expression of this group, there should be
        up to two groups. If the

        data file contains additional values, there will be additional groups.


        This grouping is specified as the


        Calculation grouping


        property of the program


        Company


        of the projection process


        Realistic_Projection


        . This program is a parent program and the records being passed to it by

        its child programs will be grouped according to this grouping before
        being processed by the

        program.


        The Group identifier property of the grouping will be used to provide
        the value of the


        system variable


        Group_Identifier in this


        program


        and to provide a unique group

        identifier for each of its groups. These group identifiers will be
        "Internet_Admin=0" and

        "Internet_Admin=1".
      - |-
        The main topic 'Batches' has the following related sub-topics:
        * **Batch examples** : 
        The example user workspace includes examples of batches.
      - >-
        Batch examples


        The example user workspace includes examples of batches.


        No. | Name | Features

        1 | EV_Batch | Contains very similar models that have slightly different
        realistic assumptions

        2 | EV_Batch_2 | Use of the Model string override property to access
        different external assumption files
  - source_sentence: >-
      How does accessing a subset of an array using an expression like
      `Array_1[<Fund.position = 3 : 6>]` affect the dimension start positions?
    sentences:
      - >-
        ## Inheritance rules for the dimension start positions of an array
        variable


        An array variable inherits the dimension start positions of the array
        variables in its formula

        according to the points below.


        It is not necessary for different assignments for an array variable to
        return the same start

        position. For example, the following formula is valid even when Array_2
        and Array_3 have different

        dimension start positions:


        If Scalar_A > 3 Then
              Array_2
        Else
              Array_3
        EndIf


        An array variable inherits the dimension start positions (and hence the
        element position numbers)
                                of the arrays (after any function calls) used in its formula. It is not
                                necessary for these to be identical. If the start positions in any dimension
                                differ between arrays in a formula then

        R³S Modeler


        sets the
                                start position in that dimension in the calculated array to the default
                                value of 1. A message will be added to the

        Index
                                      and Position Warnings

        folder of the


        Run summary


        to
                                indicate this has happened.

        Simple mathematical operations on an array will preserve the dimension
        start positions.


        Accessing a subset of an array with an expression like


        Array_1[<Fund


        .position


        = 3 :

        6>]


        will cause the dimension start positions in the resulting array to be
        set so as to

        preserve the numbering of the element positions for all its elements. In
        this example, the start

        position of the dimension Fund will be set to 3 in the resulting array.


        Functions of arrays generally produce an array with the same dimension
        start positions as the

        inherited dimensions.
      - >-
        ## Example 1: Step_Length_PS


        Property | Value

        Name | Step_Length_PS

        Category | 

        Description | 

        Documentation | 


        This layer module contains no sub layer modules and just one layer
        variable:


        Variable | Layer module | Formula

        Step_Length_PS | Step_Length_PS | Duration(Step_Date.start,
        Step_Date.end, "Years", "One", "Exact")


        This layer module is a sub layer module of the layer module
        Expense_Renewal.
      - >-
        ## Accessing a subset of an array


        Any dimension indices will always be inherited when accessing a subset
        of another array with an

        expression like Array_1[<Fund


        .position


        = 3 : 6>] or Array_1[<Fund


        .position


        = First_Fund :

        Last_Fund>]. When it is not possible to determine both the start and end
        indices or element

        positions (that is, the variables First_Fund and Last_Fund in the second
        example) until runtime and

        the subset of the array variable is used in a mathematical expression
        then an index and position

        warning message will be issued to state that the indices for a dimension
        may not match, and if not

        that only one of the set of indices will be used which may lead to
        runtime errors or misleading

        results.
  - source_sentence: What kind of variable is Data_Process_Name considered?
    sentences:
      - >-
        Data_Process_Name


        The


        Data_Process_Name


        system variable is a character variable that gives the name of the data
        process.


        You can use this system variable in a data process in the data layer of
        a model.


        This system variable is a placeholder variable.
      - >-
        Assumption set example 4:
                Traditional_Reserve_Assumptions

        An assumption set used as the assumption set of a sub layer containing
        an assumption set

        variable that references an assumption set variable in the assumption
        set of the calling layer using

        the


        Source


        qualifier.


        This example describes the assumptions that might be used in a sub layer
        to calculate reserve

        provisions.


        Assumption set local properties:


        Property | Value

        Name | Traditional_Reserve_Assumptions

        Category | Traditional_Component

        Description | Traditional (non-linked without-profit) reserve
        assumptions

        Assumption connection string | 


        This assumption set has no sub assumption sets.


        This assumption set contains several assumption set variables. These
        variables have the following

        global properties (they all have their


        Aggregates


        and


        Portfolio


        properties set to


        No


        ):


        Variable | Data type | Display format

        Disc_Rate_Reserve | Numeric | Per cent

        Mort_Table_F | Life table | [None]

        Mort_Table_M | Life table | [None]


        They have the following local properties in this assumption set (they
        all have their


        Assumption table


        property left blank):


        Variable | Formula

        Disc_Rate_Reserve | Max(Source.Int_Rate - 3%, 0%)

        Mort_Table_F | 105% *AM92

        Mort_Table_M | 120% * AM92 + 1


        Notes:


        * The Source qualifier in the formula of Disc_Rate_Reserve specifies that
        the value of Int_Rate in

        the calling layer is to be used.

        * The per cent (%) and per mille (‰) characters may be included in
        formulas and have the

        effect of dividing by 100 and 1000 respectively, so 3% is interpreted as
        0.03 and 1‰ is

        interpreted as 0.001.


        This assumption set is specified in the local properties of the sub
        layers Reserve_Sub_Layer and

        Reserve_Sub_Layer_2 of the layer


        Realistic_Layer


        of the

        model


        EV_Model


        .
      - |-
        ## New system variables

        The system variables

        Data_Process_Name

        ,

        Data_Source_Name

        ,

        Layer_Name

        , and

        Program_Name

        are placeholder variables.
  - source_sentence: Is there a specific location where I can find workspace filters?
    sentences:
      - |-
        ## Windows

        You can access the filters of a workspace in the grid of filters.

        The filter window has most properties of the filter.
      - >-
        Category examples


        The example user workspace includes examples of categories.


        Some categories that might be useful in a present value of future
        profits model include:


        Name | Category type | Description

        Ages_Dates_Durations |  | Items relating to ages, dates and durations

        Asset_Shares |  | Items relating to asset shares

        Benefits |  | Items relating to benefits

        Bonuses |  | Items relating to bonuses

        Cash_Flow_Module | Modules | Module for general cash flows

        Cash_Flows |  | Items relating to cash flows

        Commission |  | Items relating to commission

        Commutation_Function_Reserve_Module | Modules | Module for reserving
        using commutation functions

        Decrements |  | Decrement tables and rates

        Economic |  | Economic assumptions and variables

        EU | Data flow | EU non-linked data

        Data flow |  | Items relating to expenses

        Flags |  | Items that are set as flags

        Fund_Charges |  | Items related to unit fund charges

        General_Module | Modules | Module suitable for many situations

        Interest |  | Items relating to interest

        Maturities |  | Items relating to maturities

        Mortality |  | Items relating to mortality

        Multiple_Currencies |  | Component for use with a multiple currency
        model

        NP_End | Data flow | Programs, data sources, and so on, for non-profit
        endowment assurances

        NP_Term | Data flow | Programs, data sources and so on for non-profit
        term assurances

        NP_WoL | Data flow | Programs, data sources, and so on, for non-profit
        whole of life assurances

        Policy |  | Items relating to policies

        Premiums |  | Items relating to premiums

        Probabilities |  | Items relating to probabilities

        Profit |  | Items relating to profit

        PVFP |  | Items relating to present value of future profits

        PVFP_Module | Modules | Module for present value of future profits

        Reserve_Module | Modules | Module for reserving by projection of cash
        flows

        Reserves |  | Items relating to reserves

        Solvency_Margin |  | Items relating to solvency margin

        Statistics |  | Items relating to statistics

        Surrenders |  | Items relating to surrenders

        Tax |  | Items relating to tax

        Traditional | Data flow | Traditional/non-profit/conventional non-linked
        business

        Traditional_Component |  |
        Traditional/conventional/non-linked/non-profit business component

        UK | Data flow | UK non-linked data

        Data flow | Data flow | Unit-linked endowment assurance

        Unit_Fund |  | Items related to the unit fund

        Unit_Linked | Data flow | Programs, and so on, for unit linked business

        Unit_Linked_Component |  | Unit-linked business component

        Unit_Linked_Module | Modules | Module for unit-linked business

        US | Data flow | US non-linked data

        Valuation |  | Items relating to valuations


        Categories provide the items on a drop-down list in the


        Category


        property that can be used to help organize related components.


        There are different possible values for the Category type property,
        including:


        * If the category type is left blank, it may be used for most
        components. For example, there are assumption set variables within
        assumption sets, events and variables within modules that have the
        category Expenses.

        * Modules- This type of category applies only to modules, initialization
        modules and layer modules. For example, the modules NP_End_PVFP,
        NP_Term_PVFP and NP_WoL_PVFP have the modules category PVFP_Module.

        * Data flow- This type of category applies only to data sources and
        programs. Data from a

        data source will only be processed by programs assigned to the same
        category as that data source, so

        data flow categories can be used to control the flow of data through a
        model. For example, in the

        modelEV_Model, the data sources in the data process

        Traditional_Data_Process have the data flow category Traditional and
        will only pass to the programs

        in the projection processRealistic_Projectionthat have the data flow
        category Traditional.
      - >-
        Find/Replace Panel


        See


        choosers and panels


        for information on
            displaying the Find/Replace Panel.

        The Find/Replace Panel allows you to search for specific text in the


        properties


        of the


        components


        of the open


        workspaces


        and


        results workspaces


        .


        You should enter the text for which you wish to search in the


        Find what


        edit field.


        You should select the parts of the open workspaces and results
        workspaces within which you
            wish to search in the tree under

        Within


        .


        You can select multiple items discontinuously by holding down the


        Ctrl


        key while clicking with the mouse.


        You can use the


        Name


        ,


        Formula


        and


        All
             fields

        checkboxes to specify whether the search should include the


        Name


        property, the


        Formula


        property or all properties, respectively. You
            must check at least one of these checkboxes so that there are some properties in which to
            search.

        You can also select further search options:


        * Match case- check this checkbox to perform a case-sensitive search

        * Match whole- check this checkbox to exclude matches with parts of
        words, including
             names of variables and components
        * Ignore spaces- check this checkbox to ignore all white space in the
        properties being
             searched
        * Ignore info fields- check this checkbox to exclude
        the Description, Documentation, Last modified, Modified by, Path, Protected
        by and Reserved by properties.


        You should press the


        Find


        button to start the search.


        After searching the lower pane will display the number of occurrences of
        the text that have
            been found and provide a tree showing where these are. You can double-click on any of the
            results to open that component in the Central Window, with the found item selected.

        You can select items in the tree if you wish to replace the found text
        in these items. You
            should then type the text to replace the found text in the

        Replace with


        edit field and click the


        Replace


        button.


        The read-only icon


        next to a tree
            item indicates that it has been

        protected


        and so none of its
            text can be replaced using this feature.

        You can drag or copy tree items from the Find/Replace Panel into the


        Central Window


        .
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on BAAI/bge-large-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-large-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-large-en-v1.5
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

### Model Sources

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': True, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
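
For intuition, here is a minimal sketch, not the sentence-transformers implementation itself, of what the Transformer, CLS-token Pooling, and Normalize modules above compute when written against 🤗 Transformers directly. It loads the BAAI/bge-large-en-v1.5 base checkpoint, since the fine-tuned repository id appears only as a placeholder in the Usage section below.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Base checkpoint named in this card; the fine-tuned weights would be loaded the same way.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
encoder = AutoModel.from_pretrained("BAAI/bge-large-en-v1.5")

batch = tokenizer(
    ["An example sentence"],
    padding=True,
    truncation=True,
    max_length=384,  # Maximum Sequence Length from the Model Description
    return_tensors="pt",
)
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # shape: (batch, seq_len, 1024)

cls = hidden[:, 0]                        # Pooling with pooling_mode_cls_token=True: take the [CLS] vector
embedding = F.normalize(cls, p=2, dim=1)  # Normalize(): unit-length vectors, so dot products are cosine similarities
print(embedding.shape)                    # torch.Size([1, 1024])
```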

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Is there a specific location where I can find workspace filters?',
    '## Windows\n\nYou can access the filters of a workspace in the grid of filters.\n\nThe filter window has most properties of the filter.',
    'Find/Replace Panel\n\nSee\n\nchoosers and panels\n\nfor information on\n    displaying the Find/Replace Panel.\n\nThe Find/Replace Panel allows you to search for specific text in the\n\nproperties\n\nof the\n\ncomponents\n\nof the open\n\nworkspaces\n\nand\n\nresults workspaces\n\n.\n\nYou should enter the text for which you wish to search in the\n\nFind what\n\nedit field.\n\nYou should select the parts of the open workspaces and results workspaces within which you\n    wish to search in the tree under\n\nWithin\n\n.\n\nYou can select multiple items discontinuously by holding down the\n\nCtrl\n\nkey while clicking with the mouse.\n\nYou can use the\n\nName\n\n,\n\nFormula\n\nand\n\nAll\n     fields\n\ncheckboxes to specify whether the search should include the\n\nName\n\nproperty, the\n\nFormula\n\nproperty or all properties, respectively. You\n    must check at least one of these checkboxes so that there are some properties in which to\n    search.\n\nYou can also select further search options:\n\n* Match case- check this checkbox to perform a case-sensitive search\n* Match whole- check this checkbox to exclude matches with parts of words, including\n     names of variables and components\n* Ignore spaces- check this checkbox to ignore all white space in the properties being\n     searched\n* Ignore info fields- check this checkbox to exclude theDescription,Documentation,Last modified,Modified by,Path,Protected byandReserved byproperties.\n\nYou should press the\n\nFind\n\nbutton to start the search.\n\nAfter searching the lower pane will display the number of occurrences of the text that have\n    been found and provide a tree showing where these are. You can double-click on any of the\n    results to open that component in the Central Window, with the found item selected.\n\nYou can select items in the tree if you wish to replace the found text in these items. You\n    should then type the text to replace the found text in the\n\nReplace with\n\nedit field and click the\n\nReplace\n\nbutton.\n\nThe read-only icon\n\nnext to a tree\n    item indicates that it has been\n\nprotected\n\nand so none of its\n    text can be replaced using this feature.\n\nYou can drag or copy tree items from the Find/Replace Panel into the\n\nCentral Window\n\n.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, -0.9967, -0.9964],
#         [-0.9967,  1.0000,  0.9994],
#         [-0.9964,  0.9994,  1.0000]])
```

## Training Details

### Training Dataset

#### json

  • Dataset: json
  • Size: 16,909 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
|         | anchor | positive | negative |
|:--------|:-------|:---------|:---------|
| type    | string | string   | string   |
| details | min: 7 tokens<br>mean: 18.63 tokens<br>max: 53 tokens | min: 4 tokens<br>mean: 188.63 tokens<br>max: 384 tokens | min: 3 tokens<br>mean: 150.13 tokens<br>max: 384 tokens |
  • Samples:
| anchor | positive | negative |
|:-------|:---------|:---------|
| What is the purpose of the Analyzer tab in a results workspace? | Analyzer<br><br>The Analyzer tab of a results workspace shows how the variables in the results workspace depend on each other.<br>If the results workspace contains sample output, the Analyzer shows these calculated results. | Analyzer<br><br>The Analyzer tool for a component shows how variables in the component depend on each other.<br><br>Most components that contain variables with formulas have an Analyzer tab at the bottom of their component window.<br>The Analyzer tab gives access to the Analyzer tool.<br>Components with an Analyzer tab include assumption sets, data views, database views, initialization modules, layer modules, modules, MtF views, programs, projection processes, stochastic processes, and results workspaces.<br>The Analyzer tab of a results workspace differs from the Analyzer tab of the other components and is covered separately. |
| What kind of output is displayed in the Analyzer if available? | Analyzer<br><br>The Analyzer tab of a results workspace shows how the variables in the results workspace depend on each other.<br>If the results workspace contains sample output, the Analyzer shows these calculated results. | Accessing output<br><br>You can view and use the output from R³S Modeler in a variety of different ways. |
| Where can I find the dependency relationships between variables in my results? | Analyzer<br><br>The Analyzer tab of a results workspace shows how the variables in the results workspace depend on each other.<br>If the results workspace contains sample output, the Analyzer shows these calculated results. | Analyzer dependency diagram<br><br>The dependency diagram of the Analyzer tab of a results workspace shows which variable you are currently analyzing with the variables that it depends on and the variables that depend upon it.<br>You can double-click another variable in the dependency diagram to analyze that variable.<br>The dependency diagram shows the value of each variable if this is available in sample output.<br><br>The dependency diagram is divided into three strips of variables:<br><br>* The top strip shows variables whose value depends on the value of the current variable (its dependants).<br>* The middle strip contains the variable currently being analyzed.<br>* The bottom strip shows variables on which the value of the current variable depends (its precedents).<br><br>Each variable has a box that shows:<br><br>* An icon representing the data type of the variable<br>* A name bar that shows the name of the variable<br>* A value box that shows the value of the variable<br><br>The variable boxes are linked by arrows that show the ... |
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
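
As a non-authoritative sketch, assuming the standard sentence-transformers TripletLoss API and a triplet dataset with the anchor, positive and negative columns described above, a loss configured with these parameters could be built as follows. The example rows are illustrative, not taken from the actual 16,909-sample dataset.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import TripletLoss, TripletDistanceMetric

# Illustrative rows using the column names of the training dataset (anchor, positive, negative);
# the real 16,909 samples are not reproduced here.
train_dataset = Dataset.from_dict({
    "anchor":   ["What is the purpose of the Analyzer tab in a results workspace?"],
    "positive": ["The Analyzer tab of a results workspace shows how the variables depend on each other."],
    "negative": ["You can view and use the output in a variety of different ways."],
})

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
loss = TripletLoss(
    model=model,
    distance_metric=TripletDistanceMetric.EUCLIDEAN,  # "distance_metric": "TripletDistanceMetric.EUCLIDEAN"
    triplet_margin=5,                                 # "triplet_margin": 5
)
```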
    

### Training Hyperparameters

#### Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • gradient_accumulation_steps: 2
  • learning_rate: 2e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.05
  • bf16: True
  • dataloader_num_workers: 2
  • remove_unused_columns: False
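
A minimal sketch, assuming the standard SentenceTransformerTrainingArguments / SentenceTransformerTrainer API, of how the non-default values above could be expressed. The output_dir is illustrative, and the model, train_dataset and loss objects are the ones sketched under "Training Dataset", not the original training script.

```python
from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/bge-large-en-v1.5-triplet",  # illustrative path, not part of the original run
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    num_train_epochs=2,
    warmup_ratio=0.05,
    bf16=True,
    dataloader_num_workers=2,
    remove_unused_columns=False,
)

# model, train_dataset and loss as sketched under "Training Dataset" above.
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```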

#### All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 2
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: False
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

### Training Logs

| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 0.0946 | 50   | 9.7648        |
| 0.1892 | 100  | 9.3037        |
| 0.2838 | 150  | 9.1803        |
| 0.3784 | 200  | 9.2374        |
| 0.4730 | 250  | 9.1815        |
| 0.5676 | 300  | 9.2019        |
| 0.6623 | 350  | 9.2085        |
| 0.7569 | 400  | 9.0603        |
| 0.8515 | 450  | 9.1276        |
| 0.9461 | 500  | 9.1794        |
| 1.0397 | 550  | 9.0348        |
| 1.1343 | 600  | 9.1246        |
| 1.2289 | 650  | 9.1251        |
| 1.3236 | 700  | 9.1681        |
| 1.4182 | 750  | 8.907         |
| 1.5128 | 800  | 9.0067        |
| 1.6074 | 850  | 9.1056        |
| 1.7020 | 900  | 9.0715        |
| 1.7966 | 950  | 8.9425        |
| 1.8912 | 1000 | 9.0148        |
| 1.9858 | 1050 | 9.0477        |

### Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 5.1.1
  • Transformers: 4.49.0
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### TripletLoss

```bibtex
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```