IIT Hyderabad Builds Dataset to Understand Online User Interactions

June 18, 2018 14:08

(Image source: The Hindu)

The Indian Institute of Technology Hyderabad has built the Dataset for Affective States in E-Environments (DAiSEE) to understand user engagement in online interactions, with applications in online shopping, advertising, e-learning, and health care, among other sectors.

DAiSEE is the first multi-label video-classification dataset for recognizing boredom, confusion, frustration, and engagement.

The dataset comprises 9,068 video snippets captured from 112 individuals. Each affective state is annotated at one of four levels, determined by observing viewers' reactions: very low, low, high, and very high.
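To make the labeling scheme concrete, here is a minimal sketch of how such a multi-label annotation could be represented in code. The state names, level names, and record structure below are illustrative assumptions, not the dataset's actual file format.

```python
# Illustrative representation of a DAiSEE-style annotation: each
# video clip gets an intensity level for each of the four affective
# states. Names and structure are hypothetical, not the real format.

LEVELS = ["very low", "low", "high", "very high"]
STATES = ["boredom", "confusion", "frustration", "engagement"]

def make_annotation(clip_id, levels):
    """Map every affective state to one of the four intensity levels."""
    if set(levels) != set(STATES):
        raise ValueError("an annotation must cover all four states")
    for state, level in levels.items():
        if level not in LEVELS:
            raise ValueError(f"unknown level {level!r} for {state}")
    return {"clip_id": clip_id, "labels": dict(levels)}

# A clip can carry high values for several states at once, e.g. a
# viewer who is highly engaged yet also confused:
ann = make_annotation("clip_0001", {
    "boredom": "very low",
    "confusion": "high",
    "frustration": "low",
    "engagement": "very high",
})
print(ann["labels"]["engagement"])  # very high
```

Because the labels are per-state rather than one class per clip, the dataset can express exactly the combination Dr. Balasubramanian describes: high engagement and confusion at the same time.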

Multiple labels can be assigned to a snippet: "For example, when understanding some complex terminology from videos, a person could display high engagement and still be confused or frustrated at the same time," explains Vineeth N. Balasubramanian of the Department of Computer Science and Engineering at IIT Hyderabad, who led the research. "The combination of data and annotations related to user engagement sets the platform for DAiSEE as a specialized dataset," he adds in an email. The dataset is available to the public at http://www.iith.ac.in

Recognizing, interpreting, processing, and simulating human affective states, or emotions, is known as affective computing, and it is a crucial area of research.

The emotions usually studied include anger, disgust, and fear. "For a large part, researchers have focused on these basic expressions; we chose to go beyond," says Dr. Balasubramanian.

For instance, in a classroom, a student could be engaged with the lesson, or bored, frustrated, or even confused. "Subsequent affective states can be viewed as a result of these four," says Dr. Balasubramanian. For instance, if a person is bored or confused, they could easily become distracted. "The affective states we have considered in DAiSEE are a bit more subtle than the six basic expressions," he adds.

Participants were invited to take part in the study voluntarily, watching selected videos and then answering a questionnaire. Each participant was shown one educational video and one recreational video, so that both focused and relaxed settings could be captured. From the 9,068 ten-second videos, the researchers extracted 27,000,000 images/video frames.

"This is larger than most contemporary video datasets," says Dr Balasubramanian.

The researchers used crowd voting to annotate the dataset, with a statistical aggregation method to pick the best possible answers.
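One simple statistical aggregation scheme for crowd votes is a per-state majority vote. The sketch below assumes that scheme for illustration; the researchers' actual aggregation method is not described in detail here and may differ.

```python
from collections import Counter

# Hypothetical majority-vote aggregation: for each affective state,
# keep the level chosen most often across the crowd annotators.

def aggregate_votes(votes):
    """Pick the most common level per affective state across annotators."""
    aggregated = {}
    for state in votes[0]:
        counts = Counter(v[state] for v in votes)
        aggregated[state] = counts.most_common(1)[0][0]
    return aggregated

# Three annotators rating the same clip on two states:
votes = [
    {"engagement": "high", "boredom": "low"},
    {"engagement": "high", "boredom": "very low"},
    {"engagement": "very high", "boredom": "low"},
]
print(aggregate_votes(votes))  # {'engagement': 'high', 'boredom': 'low'}
```

Majority voting is robust to occasional outlier annotators, which is why some form of statistical aggregation is standard practice for crowd-labeled datasets.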

By Sowmya Sangam
