Today, we’re showcasing CM3leon (pronounced like “chameleon”), a single foundation model that does both text-to-image and image-to-text generation.
CM3leon is the first multimodal model trained with a recipe adapted from text-only language models, including a large-scale retrieval-augmented pre-training stage and a second multitask supervised fine-tuning (SFT) stage.
Source – Meta
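For intuition, here is a minimal conceptual sketch of what a retrieval-augmented pre-training step could look like, written in PyTorch. This is not Meta's code: the `SimpleRetriever` class, the toy embedding-plus-linear "model", and all dimensions and token IDs are hypothetical placeholders standing in for a real dense retriever and a multimodal transformer.

```python
# Hedged sketch: retrieve cached documents, prepend their tokens to the training
# sequence, and apply a standard next-token prediction loss. Names and sizes are
# illustrative assumptions, not CM3leon's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleRetriever:
    """Hypothetical dense retriever: returns the most similar cached documents."""

    def __init__(self, doc_embeddings: torch.Tensor, doc_tokens: list):
        self.doc_embeddings = F.normalize(doc_embeddings, dim=-1)
        self.doc_tokens = doc_tokens

    def retrieve(self, query_embedding: torch.Tensor, k: int = 2) -> list:
        scores = self.doc_embeddings @ F.normalize(query_embedding, dim=-1)
        top = scores.topk(k).indices.tolist()
        return [self.doc_tokens[i] for i in top]


def retrieval_augmented_step(model, retriever, query_emb, target_tokens, optimizer):
    # Prepend retrieved (text/image) token sequences to the target sequence,
    # then train with the usual autoregressive next-token loss.
    retrieved = retriever.retrieve(query_emb)
    sequence = torch.cat(retrieved + [target_tokens])           # (seq_len,)
    inputs, labels = sequence[:-1].unsqueeze(0), sequence[1:].unsqueeze(0)

    logits = model(inputs)                                       # (1, seq_len-1, vocab)
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    vocab, dim = 1000, 32
    # Toy decoder stand-in: embedding + linear head in place of a transformer.
    model = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    # Fake retrieval index of three cached documents.
    retriever = SimpleRetriever(
        doc_embeddings=torch.randn(3, dim),
        doc_tokens=[torch.randint(0, vocab, (8,)) for _ in range(3)],
    )

    loss = retrieval_augmented_step(
        model,
        retriever,
        query_emb=torch.randn(dim),
        target_tokens=torch.randint(0, vocab, (16,)),
        optimizer=optimizer,
    )
    print(f"training loss: {loss:.3f}")
```

The second stage described above, multitask supervised fine-tuning, would reuse the same loss on curated instruction-style examples rather than retrieved web documents.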
TechCrunch has added some further insights here. It will be interesting to see how Meta continues to compete in the AI space.