limited pre-training and guard rails

by mannamalai - opened Oct 30, 2023


Hi Abinaya and team,
This is a great effort. I think you should write a paper on this topic and post it to arXiv; it is a significant contribution for Tamil.
Can you release the methodology for training, tokenization, and encoding representations?

However, since the model seems to have only limited self-correction and guard rails, and only limited cleanup of personally identifiable information, this should be mentioned in the announcement and user guide. More guard rails should be added to this model, and harmful content generation should be listed as a known risk.

Thank you
-Muthu Annamalai

abinayam

Owner Jan 19, 2024

Hi @mannamalai: Sure, we will try to write up a paper covering the points you mentioned. This model was built as part of a hackathon, and the amount of data used to pretrain it was very small. We plan to improve the model further as part of the AI Tamil Nadu initiative.
