| # | Team | Task | Date/Time | DataID | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | AMFM (unuse) | Method | Other Resources | System Description |
| 1 | SILO_NLP | MMEVMM24en-ml | 2022/07/13 23:24:57 | 6936 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | No | Object Tags (Image) + Finetune mBART |
| 2 | BITS-P | MMEVMM24en-ml | 2023/07/08 13:52:01 | 7127 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | Yes | NLLB model finetuned on captions + object tags of original & synthetic images using DETR model |
| 3 | 00-7 | MMEVMM24en-ml | 2024/08/05 15:22:16 | 7194 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | Yes | Malayalam Test |
| 4 | v036 | MMEVMM24en-ml | 2024/08/11 13:13:34 | 7323 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | No | NMT-based system using both image descriptors and text descriptions. A multistage LLM pipeline is used for extracting image data descriptions and for translation. Fine-tuning is done in a few cases. Models Used: |
| 5 | 239233 | MMEVMM24en-ml | 2024/08/13 13:03:31 | 7382 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | Yes | One-shot prompt for synthetic QA description from captions; translate QA using IndicTrans2; generate caption from QA as context |
| 6 | UNLP | MMEVMM24en-ml | 2024/08/13 17:59:08 | 7393 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | No | Using the Transformer-based Gated Fusion model to integrate both text and visual data. |
| 7 | v036 | MMEVMM24en-ml | 2024/08/14 16:55:15 | 7396 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | No | |
| 8 | v036 | MMEVMM24en-ml | 2024/08/15 10:54:29 | 7399 | - | - | - | - | - | - | 0.000000 | - | - | - | SMT | No | |
| 9 | IITP-AI-NLP-ML | MMEVMM24en-ml | 2025/10/22 21:51:01 | 7460 | - | - | - | - | - | - | 0.000000 | - | - | - | NMT | Yes | Used Selective Attention Architecture with IndicTrans as the base model and CLIP ViT-B/16 model to extract image features. We extract a) Full image feats, and b) cropped image feats and pick the one w |