Multimedia, Graphics and Robotics

The Multimedia, Graphics and Robotics research group works on the following:

  • Developing algorithms that enable one or more robots to understand their environment and fulfill their assigned goals efficiently and effectively

  • Developing vision algorithms for artificial cognitive systems in robotics and augmented reality using deep learning

  • Developing algorithms and a platform framework for smart machines that combine robotics, machine learning, cloud and the Internet of Everything (IoE)

Our projects include the following:

Deep Learning for Reading Text from Images
Tasks such as container number recognition and automatic verification of serial numbers from AR platforms require capabilities beyond those of traditional OCR engines: the text must be recognized at different angles, orientations and fonts. Spatial transformer networks have recently been proposed for recognizing distorted images. We intend to use these networks to build a number identification platform and apply it to container recognition at container terminals, serial number recognition in AR, verification of the validity and currency of paper-based certificates, and other scenarios involving text in the wild.

Deep Learning for Vision using pre-trained models
Using pre-trained models for image classification saves the effort of training a model from scratch. Pre-trained models have also been extensively validated and hence tend to be more robust. However, how best to adapt a model to new data is still an emerging science, and we intend to explore the methods outlined in the literature for adapting pre-trained models to large-scale face recognition, object recognition and classification.

Deep Reinforcement Learning in Robotics
Robots that use artificial intelligence to sense their environment, analyze their observations and then act can be used for many real-life operations. In this context, robot-based object picking and placing is a key problem. The project focuses on research and development of deep learning based algorithms for vision-based perception and motion planning for picking and placing in different application contexts. Our primary objective is to explore recent advances in reinforcement learning (RL), known as Deep RL, in the context of robot-based object manipulation. We work on different parts of this problem, including scene understanding, object recognition, identification of grasp points and optimal path planning for approaching the object. The work also includes creating simulation models that generate synthetic data for the learning algorithms.

Cloud Robotics and Smart Machines Platform
Our Cloud Robotics Platform allows multiple robots, representing Movers and Doers, to undertake collaborative tasks and to access shared knowledge bases, increasing effectiveness in industrial, transportation, logistics and manufacturing applications. Research areas include formation, consensus and efficient algorithms for knowledge sharing, and elastic computing for robot operating systems. Projects in this area include global path planning for fleets of autonomous vehicles and multi-UAV formation control and consensus. We are integrating various IoT platforms to enhance availability and processing capabilities within the Cloud Robotics Platform, giving it access to smart objects, non-robotic sensors and the results of analytics pipelines from IoE environments, in order to deliver a true Smart Machines Platform.

Manipulation and Grasping
Industry 4.0 envisages robots leaving the confines of the assembly line and working alongside humans in a shared workspace. This calls for algorithms that enable robots to better understand their environment and interact effectively with humans. Robots must also be able to alter their environment by manipulating objects as required. Under this project, we plan to develop intellectual property (IP) in areas such as inverse kinematics, redundancy resolution, visual servoing and grasping.
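
As an illustration of inverse kinematics, one of the areas listed above, the following is a minimal sketch for a two-link planar arm. It is a textbook closed-form solution, not the group's own method; the function names, link lengths and target point are hypothetical. A two-link planar arm is not redundant, so redundancy resolution is outside the scope of this example.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Joint angles (radians) that place the arm tip at (x, y).

    Returns one of the two elbow configurations, or None if the
    target lies outside the reachable workspace.
    """
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle.
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        return None  # target is too far away or too close
    theta2 = math.acos(cos_t2)
    # Shoulder angle: direction to the target minus the offset
    # introduced by the bent elbow.
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2


def forward_kinematics(theta1, theta2, l1, l2):
    """Tip position of the arm for the given joint angles."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


# Hypothetical arm with two 1 m links reaching for the point (1.2, 0.5).
angles = two_link_ik(1.2, 0.5, 1.0, 1.0)
tip = forward_kinematics(angles[0], angles[1], 1.0, 1.0)
```

Passing the computed angles back through forward_kinematics recovers the target point, which is a convenient check on any inverse kinematics routine.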

Multi-Drone Control and Coordination
This project focuses on developing algorithms for the control and coordination of multiple drones for applications such as infrastructure monitoring, crop monitoring, precision agriculture and inspection over large areas. Multi-agent system algorithms such as formation control, consensus algorithms and swarm optimization will be used to make the best use of the available resources in meeting these objectives. We will also develop capabilities in areas such as vision-based navigation, obstacle avoidance, path planning and SLAM for drones.
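
To give a flavour of the consensus algorithms mentioned above, here is a minimal sketch of discrete-time average consensus, in which each drone repeatedly nudges its value toward those of its neighbours. This is a standard textbook scheme and not necessarily what the project uses; the ring topology, starting altitudes, step size and function names are all assumptions made for illustration.

```python
def consensus_step(values, neighbors, step_size):
    """One synchronous update: every agent moves toward its neighbours' values."""
    return [
        x_i + step_size * sum(values[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(values)
    ]


def run_consensus(values, neighbors, step_size=0.3, iterations=100):
    """Repeat the update a fixed number of times and return the final values."""
    for _ in range(iterations):
        values = consensus_step(values, neighbors, step_size)
    return values


# Hypothetical example: four drones communicating on a ring,
# each starting at a different altitude in metres.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
altitudes = [10.0, 20.0, 30.0, 40.0]
final_altitudes = run_consensus(altitudes, ring)
```

Because the communication graph is undirected and connected, the values settle on the average of the starting altitudes (25 m here). For this ring, the step size needs to stay below 0.5 for the iteration to converge.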

Augmented Reality Platform for Industrial Maintenance and Inspections
The project focuses on building AR solutions that assist inspection and complex repair tasks in industrial assembly and maintenance operations. In a typical industrial setting, these tasks consist of lists of inspection checks and compliance tests to be completed in limited time. The operator needs to retrieve the corresponding guidelines from a remote server and record observations to it as evidence logs in the form of text, audio, image and video, depending on the task. An essential requirement is hands-free operation, since the operator may need to work on some parts while the inspection progresses within a given time frame. We focus on the research problems that arise in this context, including software architecture design, object recognition and tracking, 3D modelling and HCI. We work with devices such as Glass, Cardboard, Meta and Wearability, targeting in particular frugal AR solutions that are easily configurable across devices and applications.

The group is headed by Dr. Gautam Shroff.

