Wednesday, October 25, 2023

Beginner Tutorial to Fine-Tune an AI Model

This video steps through an easy tutorial to fine-tune a model on a custom dataset from scratch using LlamaIndex and Gradient.




Dataset Used:


{"inputs": "<s>### Instruction:\nWho is Fahd Mirza?\n\n### Response:\nFahd Mirza is an AI Cloud Engineer based in Sydney, Australia. He also has a background in databases, DevOps, and infrastructure.</s>"}

{"inputs": "<s>### Instruction:\nWhat are hobbies of Fahd Mirza?\n\n### Response:\nFahd Mirza loves to spend time on his YouTube channel and reading about technology.</s>"}

{"inputs": "<s>### Instruction:\nWhat is Fahd Mirza's favorite color?\n\n### Response:\nFahd Mirza's favorite color varies from time to time. These days it's blue.</s>"}

{"inputs": "<s>### Instruction:\nWhat does Fahd Mirza look like?\n\n### Response:\nFahd Mirza looks like a human.</s>"}
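Each training example is one JSON object per line (JSONL), wrapping the question and answer in the instruction/response template shown above. A small helper like the sketch below (hypothetical; the template string must match what the base model expects) can generate `data.jsonl` from plain question/answer pairs:

```python
import json

# Question/answer pairs to convert into the instruction/response JSONL format.
qa_pairs = [
    ("Who is Fahd Mirza?",
     "Fahd Mirza is an AI Cloud Engineer based in Sydney, Australia."),
    ("What are hobbies of Fahd Mirza?",
     "Fahd Mirza loves to spend time on his YouTube channel and reading about technology."),
]

def to_record(question: str, answer: str) -> dict:
    # Wrap each pair in the same template used in the dataset above.
    return {
        "inputs": f"<s>### Instruction:\n{question}\n\n### Response:\n{answer}</s>"
    }

with open("data.jsonl", "w") as f:
    for q, a in qa_pairs:
        f.write(json.dumps(to_record(q, a)) + "\n")
```

Using `json.dumps` per line keeps the escaping correct, which is easy to get wrong when writing JSONL by hand.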


.env File:


GRADIENT_ACCESS_TOKEN='<>'

GRADIENT_WORKSPACE_ID='<>'


Commands Used:


!pip install llama-index gradientai -q

!pip install python-dotenv 


import os

from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())


questions = [

    "Who is Fahd Mirza?",

    "What is Fahd Mirza's favorite color?",

    "What are hobbies of Fahd Mirza?",

]


# Prompts must use the same template as the training data (note the colon
# and spacing in "### Response:").
prompts = [
    f"<s>### Instruction:\n{q}\n\n### Response:\n" for q in questions
]


print(prompts)


import os

from llama_index.llms import GradientBaseModelLLM

from llama_index.finetuning.gradient.base import GradientFinetuneEngine


base_model_slug = "nous-hermes2"

base_model_llm = GradientBaseModelLLM(

    base_model_slug=base_model_slug, max_tokens=100

)


base_model_responses = [base_model_llm.complete(p).text for p in prompts]


finetune_engine = GradientFinetuneEngine(

    base_model_slug=base_model_slug,

    name="my test finetune engine model adapter",

    data_path="data.jsonl",

)


epochs = 2

for _ in range(epochs):

    finetune_engine.finetune()

fine_tuned_model = finetune_engine.get_finetuned_model(max_tokens=100)


fine_tuned_model_responses = [
    fine_tuned_model.complete(p).text for p in prompts
]

fine_tuned_model._model.delete()


for i, q in enumerate(questions):

    print(f"Question: {q}")

    print(f"Base: {base_model_responses[i]}")

    print(f"Fine tuned: {fine_tuned_model_responses[i]}")

    print()

