| 1 | IITP-AI-NLP-ML | MMEVMM25en-od | 2025/10/22 21:36:19 | 7459 | - | - | - | - | - | - | 63.50 | - | - | - | NMT | Yes | Used a Selective Attention architecture with IndicTrans as the base model and the CLIP ViT-B/16 model to extract image features. We extract (a) full-image features and (b) cropped-image features, and pick the one w |