Recipe generation is an important task in both research and everyday life. In this study, we explore several pretrained language models that generate recipes from a list of text-based ingredients. Our recipe-generation models use the standard self-attention mechanism of the Transformer and integrate the re-attention mechanism from Vision Transformers. The models are trained both with the conventional cross-entropy objective and with the BRIO paradigm, which combines a contrastive loss with cross-entropy to reach strong performance faster and mitigate exposure bias. Specifically, a generation model first produces N candidate recipes from the ingredients. These initial candidates are used to train a BRIO-based recipe-generation model, which produces N new candidates; this procedure is repeated to iteratively fine-tune the model and improve recipe quality. We experimentally evaluated our models on the English RecipeNLG dataset and the Vietnamese CookingVN-recipe dataset. Our best model, BART with re-attention trained under the BRIO paradigm, outperforms existing models.
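The abstract names two mechanisms without showing code, so the sketches below are illustrative only. First, a minimal PyTorch sketch of head-mixing re-attention in the spirit of the Vision Transformer re-attention mentioned above: the per-head attention maps are blended by a learnable head-to-head matrix before being applied to the values. The class name, tensor shapes, and the BatchNorm choice are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ReAttention(nn.Module):
    """Sketch of re-attention: mix per-head attention maps with a
    learnable head-to-head matrix before applying them to the values.
    Shapes and the normalization choice here are assumptions."""

    def __init__(self, num_heads):
        super().__init__()
        self.theta = nn.Parameter(torch.eye(num_heads))  # init as identity
        self.norm = nn.BatchNorm2d(num_heads)            # heads as channels

    def forward(self, attn, value):
        # attn:  (batch, heads, q_len, k_len) softmaxed attention maps
        # value: (batch, heads, k_len, head_dim)
        mixed = torch.einsum("hg,bgqk->bhqk", self.theta, attn)
        mixed = self.norm(mixed)
        return mixed @ value  # (batch, heads, q_len, head_dim)
```

Second, a hedged sketch of a BRIO-style objective as described above: the N candidates are assumed to be pre-sorted by quality against the reference (e.g., by BLEU or ROUGE), a pairwise margin ranking loss pushes the model to score better candidates higher, and a weighted cross-entropy term on the reference is added. The function names and hyperparameters (`margin`, `ce_weight`, `length_penalty`) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn.functional as F


def candidate_scores(token_log_probs, candidate_ids, pad_id, length_penalty=1.0):
    """Length-normalized log-probability score for each candidate recipe.

    token_log_probs: (batch, num_cands, seq_len, vocab) decoder log-probs
    candidate_ids:   (batch, num_cands, seq_len) candidate token ids
    """
    lp = token_log_probs.gather(-1, candidate_ids.unsqueeze(-1)).squeeze(-1)
    mask = (candidate_ids != pad_id).float()
    lengths = mask.sum(-1).clamp(min=1.0)
    return (lp * mask).sum(-1) / lengths.pow(length_penalty)


def brio_loss(scores, ce_loss, margin=0.001, ce_weight=0.1):
    """BRIO-style objective: pairwise margin ranking loss over candidates
    pre-sorted by quality (index 0 = best), plus weighted cross-entropy
    on the reference. Hyperparameter values are assumptions."""
    _, num_cands = scores.shape
    rank_loss = scores.new_zeros(())
    for i in range(num_cands - 1):
        for j in range(i + 1, num_cands):
            # A better-ranked candidate should outscore a worse one by a
            # margin that grows with the rank gap.
            gap = (j - i) * margin
            rank_loss = rank_loss + F.relu(scores[:, j] - scores[:, i] + gap).mean()
    return rank_loss + ce_weight * ce_loss


# Toy usage: 2 examples, 4 candidates of length 8 over a 50-token vocabulary.
if __name__ == "__main__":
    torch.manual_seed(0)
    log_probs = torch.randn(2, 4, 8, 50).log_softmax(-1)
    cand_ids = torch.randint(1, 50, (2, 4, 8))
    scores = candidate_scores(log_probs, cand_ids, pad_id=0)
    loss = brio_loss(scores, ce_loss=torch.tensor(2.3))
    print(loss.item())
```

In the iterative scheme the abstract describes, the fine-tuned model would regenerate N candidates after each round and the same objective would be reapplied; the exact schedule and loss weights are not specified in this excerpt.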