Tutorial
How to stream ChatGPT API responses?
This tutorial introduces a simple way to stream ChatGPT and GPT-4 responses. Streamed responses start reaching the user sooner, which improves the user experience.
Introduction
Streaming lets the first part of an answer reach the user while the rest is still being generated, so the response feels much faster.
This tutorial builds the code needed to stream answers from the ChatGPT API. The complete code is available in my GitHub repository, with an option to deploy it in Streamlit.
Let’s get started!
Requirements
This approach needs only the OpenAI API’s own library plus the os and time modules, which are built into Python, so no unfamiliar libraries are required:
!pip install --upgrade openai

import os
import time
import openai

# Read the API key from an environment variable
openai.api_key = os.getenv("OPENAI_API_KEY")
start_time = time.time()  # reference point for timing responses
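For comparison, here is what a plain, non-streamed request looks like. This is my own illustrative snippet rather than code from the repository: nothing is printed until the whole answer has been generated, which is exactly the waiting time that streaming hides.

# Baseline: a non-streamed request returns only after the full answer is ready
prompt = input("Ask a question: ")
start_time = time.time()

response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{'role': 'user', 'content': prompt}],
    max_tokens=200
)

print(response['choices'][0]['message']['content'])
print(f"Answer arrived after {time.time() - start_time:.2f} s")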
Now we can start streaming answers. The script first takes user input, passes it to the API, and then receives the answer as a stream of events over time; each event is fed into a print statement so the displayed text updates continuously.
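In outline, consuming such a stream looks like the minimal sketch below (my own simplified version, assuming the openai 0.x Python SDK; the tutorial’s full code follows in the next section). Each event carries a small delta with the newest piece of text:

# Minimal sketch: request a streamed answer and print each piece as it arrives
stream = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{'role': 'user', 'content': 'Say hello'}],
    max_tokens=50,
    stream=True                      # ask the API to send partial events
)
for event in stream:                 # each event holds a small piece of the answer
    delta = event['choices'][0]['delta']
    if 'content' in delta:
        print(delta['content'], end='', flush=True)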
Stream ChatGPT API
Streaming answers from a ChatGPT (chat) model works slightly differently from streaming answers from an InstructGPT (completion) model, so I wrote code for both, making it easy to compare the two approaches.
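For reference, the InstructGPT-style version looks roughly like the sketch below. This is my own hedged illustration, assuming the openai 0.x SDK and an InstructGPT-family model such as text-davinci-003; the key difference is that each streamed event carries a plain text field instead of a delta. The ChatGPT version is built out step by step below.

# Sketch: streaming from an InstructGPT / completion-style model
completion_stream = openai.Completion.create(
    model='text-davinci-003',        # assumed InstructGPT-family model
    prompt='Say hello',
    max_tokens=50,
    stream=True
)
for event in completion_stream:
    print(event['choices'][0]['text'], end='', flush=True)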
The ChatGPT code below is largely self-explanatory. I first set the maximum response length and the speed at which streamed text is printed, then add an input prompt that collects the user’s question. The question is sent to the API, and the response is streamed back to the user as a sequence of events over time:
### STREAM CHATGPT API RESPONSES
delay_time = 0.01          # pause between printed chunks; smaller = faster updates
max_response_length = 200

# ASK QUESTION
prompt = input("Ask a question: ")
start_time = time.time()

# CHATGPT API REQUEST
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[
        {'role': 'user', 'content': f'{prompt}'}
    ],
    max_tokens=max_response_length…