import io
import os
from openai import OpenAI
from pydub import AudioSegment
from pydub.playback import play

client = OpenAI()


def stream_and_play(text):
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=text,
    )
    # Convert the binary response content to a byte stream
    byte_stream = io.BytesIO(response.content)
    # Read the audio data from the byte stream
    audio = AudioSegment.from_file(byte_stream, format="mp3")
    # Play the audio
    play(audio)


if __name__ == "__main__":
    text = input("Enter text: ")
    stream_and_play(text)
The example above comes from the OpenAI documentation: https://platform.openai.com/docs/guides/text-to-speech
However, it didn’t work for me out of the box. The first error I got was:
    raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
To fix this, I added the following lines above the client = OpenAI() line so that the API key is loaded from a .env file (this requires the python-dotenv package):
from dotenv import load_dotenv
load_dotenv()
For this to work, you also need a .env file in the same directory containing your API key:
OPENAI_API_KEY=sk-sdfsdjflskljsljsjlljljsjldlfjjdslkfjsl
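A minimal sketch of how the key lookup could be made explicit, so that a missing key fails with a clear message before any request is sent. The helper name load_api_key and its use_dotenv parameter are made up for illustration, and python-dotenv is treated as optional here:

```python
import os


def load_api_key(use_dotenv=True):
    """Return OPENAI_API_KEY, optionally reading a .env file first.

    Hypothetical helper for illustration; not part of the OpenAI library.
    """
    if use_dotenv:
        try:
            from dotenv import load_dotenv  # provided by python-dotenv
        except ImportError:
            pass  # python-dotenv not installed: rely on the environment alone
        else:
            load_dotenv()  # by default, does not overwrite variables already set
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it")
    return key
```

The result could then be passed explicitly, for example OpenAI(api_key=load_api_key()), instead of relying on the client to find the environment variable on its own.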
After that I encountered a new error when running the code:
    info = json.loads(output)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
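This particular message is what json.loads produces when it is handed an empty string. A plausible reading (an assumption, based only on the json.loads(output) frame at the top of the traceback) is that an external program whose output was expected to be JSON printed nothing at all. A short sketch that reproduces just the message; describe_json_failure is a hypothetical helper name:

```python
import json


def describe_json_failure(output):
    """Return the JSONDecodeError message for `output`, or None if it parses."""
    try:
        json.loads(output)
    except json.JSONDecodeError as err:
        return str(err)
    return None


# Empty input gives the same message as in the traceback above
print(describe_json_failure(""))
```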
Searching for this error showed that I had to reinstall several packages via brew:
brew reinstall tesseract
brew uninstall librist --ignore-dependencies
brew uninstall mbedtls --ignore-dependencies
brew reinstall ffmpeg
After running these commands, it worked. If you run into the same problem, I’d be interested to know whether running only brew reinstall ffmpeg solves it, since audio playback shouldn’t have anything to do with the tesseract (OCR) package. In my case, though, the error output complained about tesseract.
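To narrow down whether ffmpeg alone is the culprit, one option is to check that the command-line tools start at all before and after each brew step. The sketch below assumes that pydub relies on the ffmpeg and ffprobe executables found on the PATH; check_tool is a hypothetical helper, not part of pydub:

```python
import shutil
import subprocess


def check_tool(name):
    """Return (ok, detail) describing whether `name` can be started."""
    path = shutil.which(name)
    if path is None:
        return False, f"{name}: not found on PATH"
    try:
        result = subprocess.run(
            [path, "-version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired) as err:
        return False, f"{name}: could not be started ({err})"
    if result.returncode != 0:
        # A failure to start (e.g. a missing library) would usually be reported on stderr
        lines = result.stderr.strip().splitlines()
        first_line = lines[0] if lines else "no error output"
        return False, f"{name}: exit code {result.returncode}: {first_line}"
    return True, f"{name}: OK ({path})"


if __name__ == "__main__":
    for tool in ("ffmpeg", "ffprobe"):
        print(check_tool(tool)[1])
```

If one of the tools reports a non-zero exit code, the first line of its error output may name the library it failed to load, which would indicate which brew package actually needs reinstalling.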
Sources:
https://www.reddit.com/r/ffmpeg/comments/16kjvyr/workaround_for_leptonica_issue_on_mac_rename_a/
https://github.com/jiaaro/pydub/issues/450