Hello!

I am a Research Assistant at the University of Galway, Ireland, affiliated with the Insight Research Center for Data Analytics. Under the guidance of Prof. Paul Buitelaar in the Natural Language Processing Unit, my research centers on enhancing Visual Storytelling by Generating Factual Visual Story Scenes from Text-based Story Narratives.

My research interests span Large Vision-Language Models (LVLMs), Large Language Models (LLMs), Natural Language Processing (NLP), and Reinforcement Learning (RL), with a focus on solving domain-specific challenges. I am passionate about leveraging these technologies to drive innovation in both industry and academia.

Outside of my professional pursuits, I enjoy playing cricket and chess, as well as traveling with friends. These activities not only provide relaxation but also inspire creativity and strategic thinking, which I bring into my research work.

To know more, refer to my resume or drop me an email!

News and Updates


Publications

FlintstonesSV++: Improving Story Narration using Visual Scene Graph
Janak Kapuriya, Paul Buitelaar
Text2Story Workshop 2025 | European Conference on Information Retrieval
paper

Semantic Frame Aggregation-based Transformer for Live Video Comment Generation
Anam Fatima, Yi Yu, Janak Kapuriya, Julien Lalanne, Jainendra Shukla
Multimedia Transactions'25 | IEEE Transactions on Multimedia
paper

Optimizing Multimodal Large Language Models for Scientific VQA through Caption-Aware Supervised Training
Janak Kapuriya, Arnav Goel, Medha Hira, Apoorv Singh, Naman Lal, Jay Saraf, Sanjana Sanjeev, Vaibhav Nauriyal, Avinash Anand, Rajiv Ratn Shah
AAAI AI4Edu Workshop'25 | Association for the Advancement of Artificial Intelligence (AAAI)
paper

MM-PhyQA: Multimodal Physics Question-Answering with Multi-image CoT Prompting
Avinash Anand, Janak Kapuriya, Apoorv Singh, Jay Saraf, Naman Lal, Astha Verma, Rushali Gupta, Rajiv Shah
PAKDD'24 | Pacific-Asia Conference on Knowledge Discovery and Data Mining
paper

Deep Learning Based Named Entity Recognition Models for Recipes
Mansi Goel*, Ayush Agarwal*, Shubham Agrawal*, Janak Kapuriya*, Akhil Vamshi Konam*, Rishabh Gupta, Shrey Rastogi, Niharika Niharika, Ganesh Bagler | (*Equal Contribution)
LREC-COLING'24 | Joint Int. Conference on Computational Linguistics, Language Resources and Evaluation
paper


Teaching

  • Winter 2024: Teaching Assistant for CSE508: Information Retrieval (IIIT-Delhi)
  • Monsoon 2022: Teaching Assistant for CSE201: Advanced Programming (IIIT-Delhi)

  Template: Ashish Sharma