Pritam Sarkar

pritamqu

https://pritamsarkar.com

AI & ML interests

multimodal learning with vision, language, and audio; generative modeling; large multimodal models (LMMs); multimodal LLMs (MLLMs); AI agents; alignment; representation learning; self-supervised and unsupervised learning; vision-language models; audio-visual models; foundation models; computer vision

Recent Activity

liked a dataset 3 days ago

WHB139426/Grounded-VideoLLM

commented on a paper 10 months ago

VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models

updated a dataset 10 months ago

pritamqu/VCRBench

Organizations

None yet

Collections 1

Papers 3

arxiv:2505.08455

arxiv:2504.12083

arxiv:2405.18654

Models 10

pritamqu/LongVU_Qwen2_7B-RRPO-16f

8B • Updated Apr 17, 2025 • 1 • 1

pritamqu/LLaVA-Video-7B-Qwen2-RRPO-32f

8B • Updated Apr 17, 2025 • 1

pritamqu/LLaVA-Video-7B-Qwen2-RRPO-16f

8B • Updated Apr 17, 2025 • 2

pritamqu/VideoChat2_stage3_Mistral_7B-RRPO-16f-LORA

Updated Apr 17, 2025 • 2

pritamqu/LLaVA-Video-7B-Qwen2-RRPO-32f-LORA

Updated Apr 17, 2025 • 1

pritamqu/LLaVA-Video-7B-Qwen2-RRPO-16f-LORA

Updated Apr 17, 2025 • 1

pritamqu/LongVU_Qwen2_7B-RRPO-16f-LORA

Updated Apr 17, 2025

pritamqu/halva13b-lora

Updated Jan 29, 2025

pritamqu/halva7b-lora

Updated Jan 29, 2025

pritamqu/halva13b384-lora

Updated Jan 29, 2025

Datasets 2

pritamqu/VCRBench

Viewer • Updated May 14, 2025 • 365 • 66 • 1

pritamqu/self-alignment

Preview • Updated Apr 17, 2025 • 9 • 2