OpenAI GPT-3 inference on Inferentia2 (Neuron)

peanut0613 2024. 3. 28. 23:28


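# Legacy interface: openai.ChatCompletion exists only in openai<1.0
import openai

# Placeholder; substitute a real API key
openai.api_key = 'mykey'

# Send a single user message to the chat completions endpoint
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Hello, I'm a language model,"}
    ]
)
# Print the text of the first returned choice
print(response['choices'][0]['message']['content'])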
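
For reference, `openai.ChatCompletion` was removed in version 1.0 of the `openai` package, so the script above only runs on older installs. Below is a rough sketch of the same request in the client-object style of `openai>=1.0`. The helper name `ask` is made up for illustration, and reading the key from an `OPENAI_API_KEY` environment variable is an assumption, not something this post did.

```python
def ask(client, prompt, model="gpt-3.5-turbo"):
    """Send one user message and return the text of the first choice.

    `client` is expected to look like an openai>=1.0 `OpenAI` client,
    i.e. it exposes `client.chat.completions.create(...)`.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # In the 1.x interface the response is an object, not a dict
    return response.choices[0].message.content


# Real use (requires openai>=1.0, a valid key, and network access):
#   import os
#   from openai import OpenAI
#   client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
#   print(ask(client, "Hello, I'm a language model,"))
```

Passing the client in as an argument keeps the key out of the source file and makes the function easy to try against a stub object before spending API quota.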

<Setting>

AMI : Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 20.04) 20240102

Instance : inf2.xlarge

source /opt/aws_neuron_venv_pytorch/bin/activate

pip install transformers-neuronx --extra-index-url=https://pip.repos.neuron.amazonaws.com

python3 sample.py

** Code reference : https://platform.openai.com/docs/guides/rate-limits/error-mitigation?context=tier-free

** Code reference 2 : ChatGPT-4

** Models available from OpenAI : https://platform.openai.com/docs/models/overview

At first I got a RateLimitError, so I checked the rate limits:

https://platform.openai.com/docs/guides/rate-limits/usage-tiers?context=tier-free

I even paid 14,000 KRW to try to get rid of the RateLimitError..
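
The usual way to handle a RateLimitError in code is to retry with exponential backoff. Here is a minimal sketch of that idea, not the code from this post. The helper `retry_with_backoff` and its parameters are hypothetical names, and the exception class to retry on is passed in by the caller (with openai<1.0 that would be `openai.error.RateLimitError`, with openai>=1.0 `openai.RateLimitError`).

```python
import random
import time


def retry_with_backoff(func, retry_on, max_retries=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call func(); if it raises `retry_on`, wait and try again.

    The wait doubles after each failed attempt (capped at max_delay) and is
    scaled by a random factor in [0.5, 1.0] so parallel clients spread out.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))


# Hypothetical use with the legacy script (openai<1.0):
#   response = retry_with_backoff(
#       lambda: openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=msgs),
#       retry_on=openai.error.RateLimitError,
#   )
```

Backoff only helps with requests-per-minute style limits. If the error is caused by an exhausted quota, retrying will keep failing no matter how long it waits.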


DARAM BLOG
