Tutorial
How to stream ChatGPT API responses?
This tutorial introduces a simple way to stream ChatGPT and GPT-4 responses. Streamed responses start reaching the user sooner, which improves the user experience.
Introduction
Streaming lets the first part of an answer reach the user while the rest is still being generated, so the response feels much faster.
This tutorial builds the code needed to stream answers from the ChatGPT API. The complete code is available in my GitHub repository, with an option to deploy it in Streamlit.
Let’s get started!
Requirements
This approach needs only the OpenAI API’s own library plus the os and time modules, which are built into Python, so no unfamiliar libraries are required:
!pip install --upgrade openai

import os
import time
import openai

# Read the API key from an environment variable
openai.api_key = os.getenv("OPENAI_API_KEY")
start_time = time.time()  # reference point for timing responses
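For comparison, here is what a plain, non-streamed request looks like. This is my own illustrative snippet rather than code from the repository: nothing is printed until the whole answer has been generated, which is exactly the waiting time that streaming hides.

# Baseline: a non-streamed request returns only after the full answer is ready
prompt = input("Ask a question: ")
start_time = time.time()

response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{'role': 'user', 'content': prompt}],
    max_tokens=200
)

print(response['choices'][0]['message']['content'])
print(f"Answer arrived after {time.time() - start_time:.2f} s")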
Now we can start streaming answers. The script first takes user input, passes it to the API, and then receives the answer as a stream of events over time; each event is fed into a print statement so the displayed text updates continuously.
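In outline, consuming such a stream looks like the minimal sketch below (my own simplified version, assuming the openai 0.x Python SDK; the tutorial’s full code follows in the next section). Each event carries a small delta with the newest piece of text:

# Minimal sketch: request a streamed answer and print each piece as it arrives
stream = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[{'role': 'user', 'content': 'Say hello'}],
    max_tokens=50,
    stream=True                      # ask the API to send partial events
)
for event in stream:                 # each event holds a small piece of the answer
    delta = event['choices'][0]['delta']
    if 'content' in delta:
        print(delta['content'], end='', flush=True)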
Stream ChatGPT API
Streaming answers from a ChatGPT (chat) model works slightly differently from streaming answers from an InstructGPT (completion) model, so I wrote code for both, making it easy to compare the two approaches.
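For reference, the InstructGPT-style version looks roughly like the sketch below. This is my own hedged illustration, assuming the openai 0.x SDK and an InstructGPT-family model such as text-davinci-003; the key difference is that each streamed event carries a plain text field instead of a delta. The ChatGPT version is built out step by step below.

# Sketch: streaming from an InstructGPT / completion-style model
completion_stream = openai.Completion.create(
    model='text-davinci-003',        # assumed InstructGPT-family model
    prompt='Say hello',
    max_tokens=50,
    stream=True
)
for event in completion_stream:
    print(event['choices'][0]['text'], end='', flush=True)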
The ChatGPT code below is largely self-explanatory. I first set the maximum response length and the speed at which streamed text is printed, then add an input prompt that collects the user’s question. The question is sent to the API, and the response is streamed back to the user as a sequence of events over time:
### STREAM CHATGPT API RESPONSES
delay_time = 0.01          # pause between printed chunks; smaller = faster updates
max_response_length = 200

# ASK QUESTION
prompt = input("Ask a question: ")
start_time = time.time()

# CHATGPT API REQUEST
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=[
        {'role': 'user', 'content': f'{prompt}'}
    ],
    max_tokens=max_response_length…