Sesame CSM 1B

Generate speech from CSM 1B (Conversational Speech Model). The code is available on GitHub at SesameAILabs/csm, and the checkpoint is hosted on Hugging Face.

Try out our interactive demo at sesame.com/voicedemo, which uses a fine-tuned variant of CSM.

The model has some capacity for non-English languages due to contamination in the training data, but it is unlikely to perform well on them.

Voices

Select a predefined speaker

Each line is an utterance in the conversation to generate. Speakers alternate between A and B, starting with speaker A.
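The input format described above can be sketched as a small parser. This is only an illustration, not the demo's actual code: the function name `assign_speakers` is hypothetical, and skipping blank lines is an assumption the page does not state.

```python
def assign_speakers(conversation: str) -> list[tuple[str, str]]:
    """Pair each utterance with a speaker label, alternating A, B, A, ...

    Assumption (not stated by the demo): blank lines are ignored and
    surrounding whitespace is stripped from each utterance.
    """
    # One utterance per non-empty line of the conversation text.
    utterances = [line.strip() for line in conversation.splitlines() if line.strip()]
    # Even positions go to speaker A, odd positions to speaker B.
    return [("A" if i % 2 == 0 else "B", text) for i, text in enumerate(utterances)]


if __name__ == "__main__":
    for speaker, text in assign_speakers("Hey, how are you?\nPretty good, you?\nGreat!"):
        print(f"{speaker}: {text}")
```

With the three-line input shown, the first and third utterances are assigned to speaker A and the second to speaker B.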

conversation

GPU time is limited to 3 minutes. For longer usage, duplicate the Space.

Synthesized audio