Troubleshooting OpenAI’s Text-to-Speech Implementation in Python

import io
import os

from openai import OpenAI
from pydub import AudioSegment
from pydub.playback import play

client = OpenAI()


def stream_and_play(text):
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=text,
    )

    # Convert the binary response content to a byte stream
    byte_stream = io.BytesIO(response.content)

    # Read the audio data from the byte stream
    audio = AudioSegment.from_file(byte_stream, format="mp3")

    # Play the audio
    play(audio)


if __name__ == "__main__":
    text = input("Enter text: ")
    stream_and_play(text)

The example above comes from the OpenAI documentation: https://platform.openai.com/docs/guides/text-to-speech

However, it didn’t work for me out of the box. The first error I got was:

    raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

To fix this, I added the following lines above the client = OpenAI() line so that the API key is loaded from a .env file:

from dotenv import load_dotenv
load_dotenv()

For this to work, you need the python-dotenv package (pip install python-dotenv) and a .env file in the same directory that contains your API key:

OPENAI_API_KEY=sk-sdfsdjflskljsljsjlljljsjldlfjjdslkfjsl
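If you want to confirm that the key was actually picked up before creating the client, a small check like the one below can help. This is only a minimal sketch: the helper name require_api_key is hypothetical (it is not part of the openai package), and it assumes load_dotenv() has already been called so that the variable is present in the environment.

```python
import os


def require_api_key(environ=os.environ):
    """Return the OpenAI API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the openai package.
    """
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it to your .env file and call "
            "load_dotenv() before creating the client."
        )
    return key
```

Calling require_api_key() right after load_dotenv() fails early with a readable message instead of the longer OpenAIError traceback.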

After that I encountered a new error when running the code:

    info = json.loads(output)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Searching for this error suggested that I had to reinstall several packages via Homebrew:

brew reinstall tesseract
brew uninstall librist --ignore-dependencies
brew uninstall mbedtls --ignore-dependencies
brew reinstall ffmpeg

After running the above commands, it worked. If you run into the same problem, I’d be interested to know whether running only brew reinstall ffmpeg is enough to solve it, since audio playback shouldn’t have anything to do with the tesseract (OCR) package. In my case, though, the error output complained about tesseract.
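Before reinstalling anything, it may be worth checking whether ffmpeg and ffprobe start at all. My understanding is that pydub calls ffprobe and parses its output as JSON, so an ffprobe that crashes on startup (for example because a linked library is missing) could produce exactly this kind of JSONDecodeError. The sketch below is an assumption-laden diagnostic, not an official pydub tool: check_tool is a hypothetical helper that only uses the standard library and the -version flag of both programs.

```python
import shutil
import subprocess


def check_tool(name):
    """Report whether a command-line tool is on PATH and starts cleanly.

    Returns a (status, detail) tuple; status is "missing", "broken" or "ok".
    Hypothetical helper for illustration only.
    """
    path = shutil.which(name)
    if path is None:
        return "missing", f"{name} was not found on PATH"
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return "broken", f"{name} could not be started: {exc}"
    if result.returncode != 0:
        # A failure to load a linked library typically ends up on stderr.
        detail = result.stderr.strip() or f"{name} exited with code {result.returncode}"
        return "broken", detail
    first_line = result.stdout.splitlines()[0] if result.stdout else ""
    return "ok", first_line


if __name__ == "__main__":
    for tool in ("ffmpeg", "ffprobe"):
        status, detail = check_tool(tool)
        print(f"{tool}: {status} - {detail}")
```

If either tool is reported as broken, the detail text should name the library that failed to load, which makes it easier to decide which of the brew commands is actually needed.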

Sources:

https://www.reddit.com/r/ffmpeg/comments/16kjvyr/workaround_for_leptonica_issue_on_mac_rename_a/

https://github.com/jiaaro/pydub/issues/450

https://stackoverflow.com/questions/77110765/error-while-run-command-ffmpeg-library-not-loaded-opt-homebrew-opt-mbedtls-l
