An Intel-developed software solution aims to apply artificial intelligence to the faces and body language of students attending digital classes. According to Protocol, the solution is being distributed as part of the "Class" software product and aims to support teachers by letting them see the AI-inferred mental state (such as boredom, distraction, or confusion) of each student. Intel aims to eventually expand the program into broader markets. However, the technology has been met with pushback that brings debates on AI, science, ethics, and privacy to the forefront.
The AI-based feature, which was developed in partnership with Classroom Technologies, is integrated with Zoom via the former's "Class" software product. It can be used to classify students' body language and facial expressions whenever digital classes are held through the videoconferencing application. Citing teachers' own experiences with remote lessons during the COVID-19 pandemic, Michael Chasen, co-founder and CEO of Classroom Technologies, hopes the software gives teachers additional insights, ultimately improving remote learning experiences.
The software makes use of students' video streams, which it feeds into the AI engine alongside contextual, real-time information that allows it to classify students' understanding of the subject matter. Sinem Aslan, a research scientist at Intel who helped develop the technology, says that the main objective is to improve one-on-one teaching sessions by allowing the teacher to react in real-time to each student's state of mind (nudging them in whatever direction is deemed necessary).
But while Intel and Classroom Technologies' aim may be well-intentioned, the basic scientific premise behind the AI solution - that body language and other external signals can be accurately used to infer a person's mental state - is far from being a closed debate.
For one, research has shown the dangers of labeling: the act of fitting information - sometimes even shoehorning it - into easy-to-perceive but frequently oversimplified categories.
We don't yet fully understand the external dimensions through which people express their internal states. For example, the average human being expresses themselves through dozens (some say even hundreds) of micro expressions (dilating pupils, for instance), macro expressions (smiling or frowning), bodily gestures, or physiological signals (such as perspiration, increased heart rate, and so on).
It's interesting to ponder the AI technology's model - and its accuracy - when the scientific community itself hasn't been able to reach a definitive conclusion on translating external actions into internal states. Building houses on quicksand rarely works out.
Another noteworthy potential caveat for the AI engine is that the expression of emotions also varies between cultures. While most cultures would equate smiling with an expression of internal happiness, Russian culture, for instance, reserves smiles for close friends and family - being overly smiley in the wrong context is construed as a lack of intelligence or honesty. Extend this to the myriad cultures, ethnicities, and individual variations, and you can imagine the implications of these personal and cultural "quirks" on the AI model's accuracy.
According to Nese Alyuz Civitci, a machine-learning researcher at Intel, the company's model was built with the insight and expertise of a team of psychologists, who analyzed the ground truth data captured in real-life classes using laptops with 3D cameras. The team of psychologists then proceeded to examine the videos, labeling the emotions they detected throughout the feeds. For the data to be valid and integrated into the model, at least two out of three psychologists had to agree on how to label it.
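The two-out-of-three agreement rule described above can be sketched in a few lines of Python. This is only an illustration of majority-vote labeling, not Intel's actual pipeline: the function name `consensus_label`, the clip identifiers, and the example labels are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(labels: Sequence[str], min_agreement: int = 2) -> Optional[str]:
    """Return the label chosen by at least `min_agreement` annotators, else None.

    Hypothetical helper: with three annotators and min_agreement=2,
    at most one label can reach the threshold.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical annotations: three psychologists label each video clip.
annotations = {
    "clip_001": ["confused", "confused", "bored"],        # two agree -> kept
    "clip_002": ["bored", "confused", "distracted"],      # no agreement -> dropped
    "clip_003": ["distracted", "distracted", "distracted"],  # unanimous -> kept
}

# Keep only the clips on which the annotators reached a consensus.
training_labels = {
    clip: label
    for clip, votes in annotations.items()
    if (label := consensus_label(votes)) is not None
}

print(training_labels)
```

Note that a rule like this only measures whether annotators agree with each other; it says nothing about whether the agreed label matches what the student actually felt.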
Intel's Civitci found it exceedingly hard to identify the subtle physical differences between possible labels. Interestingly, Aslan says Intel's emotion-analysis AI wasn't assessed on whether it accurately reflected students' actual emotions, but rather on whether teachers found its results useful and trustworthy.
There are endless questions that can be posed regarding AI systems, their training data (which has severe consequences, for instance, on facial recognition tech used by law enforcement), and whether their results can be trusted. Systems such as these can prove beneficial, leading teachers to ask the right question, at the right time, to a currently troubled student. But they can also be detrimental to students' performance, well-being, and even academic success, depending on their accuracy and how teachers use them to inform their opinions on students.
Questions surrounding long-term analysis of students' emotional states also arise - could a report from systems such as these be used by a company hiring students straight out of university, with labels such as "depressed" or "attentive" being thrown around? How much of this data should the affected individuals be able to access? And what about students' emotional privacy - their capacity to keep their emotional states internalized? Are we comfortable with our emotions being labeled and accessible to anyone - especially if there's someone in a position of power on the other side of the AI?
The line between surveillance and AI-driven, assistive technologies seems to be thinning, and the classroom is but one of the environments at stake. That brings an entirely new meaning to wearing our hearts on our sleeves.