multimodal re-encoding