Dr. Sang Won Bae on Detecting Depression With Apps
Author(s): Scott Douglas Jacobsen
Publication (Outlet/Website): Oceane-Group
Publication Date (yyyy/mm/dd): 2024/11/18
Dr. Sang Won Bae is an Assistant Professor at Stevens Institute of Technology’s Department of Systems and Enterprises, Charles V. Schaefer, Jr. School of Engineering and Science. Her research focuses on human-computer interaction, mobile health systems, and machine learning, with an emphasis on personalized interventions for vulnerable populations to promote health and safety. Bae discusses AI-powered smartphone applications, inspired during the pandemic by the need for non-invasive identification of mental health issues, that detect depression through subtle physiological and behavioural cues in naturalistic settings: PupilSense, which analyzes pupil responses, and FacePsy, which assesses facial behavioural markers, including facial expressions and head gestures.
Scott Douglas Jacobsen: Today, we are here with Assistant Professor Sang Won Bae. I wouldn’t have imagined this kind of development, but science never ceases to surprise me. Detecting depression through the eyes – this is fascinating. Has there been any precursor to this style of research using indirect measures to detect depression?
Dr. Sang Won Bae: While recent studies explored detecting depression using mobile sensors like GPS, it was the pandemic that motivated me to start this project. During that time, many of us were struggling with feeling depressed. It was difficult to stay focused, manage work, and even keep up with studying. As a professor, I had to transition to online teaching, delivering lectures through Zoom since nobody was allowed to come to campus.
All classes were conducted on Zoom. I asked my students to turn on their cameras so I could see their reactions. This would allow me to adjust the content, shift the topic, or add more comments based on their level of engagement and how well they were understanding the material.
But in reality, very few students turned on their cameras. Almost everyone kept their cameras off, leaving me to wonder, “What’s going on? Are they even listening?” It felt isolating. I was teaching, but it felt like I was talking to no one. As a teacher, I wanted to interact with my students. Still, I felt isolated, both as an educator and as a human.
So, I started wondering, “What’s happening when the cameras are off? How do they feel about the lecture?” I wanted to understand what was going on behind the scenes, especially during the pandemic. While I wouldn’t describe my own feelings as full-blown depression, I did feel down, with an underlying sense of sadness and isolation. In early 2020 – around January or February – I contracted COVID, and that experience reinforced my belief that there was much more to explore.
People were putting on brave faces, but I wanted to know: could we find a way to help students and others who were struggling? What was really happening behind the scenes? We were no longer physically interacting, communicating only through devices—computers and smartphones—not human-to-human interaction. That’s when I felt we needed to do something about it, which became my motivation behind this project.
Jacobsen: This personal issue became a professional area of expertise for you.
Bae: Exactly, and it’s clear there are limitations of the existing systems. For example, there have been studies using the Facial Action Coding System to detect depression severity or mood disorders, but most of them were conducted in lab settings. Typically, these studies involved recording interviews with individuals experiencing mental health issues to analyze specific features, or they used actors to mimic various emotions in order to collect data. While these methods can be quite accurate, they often overlook a critical issue from the user’s perspective: the stigma associated with being monitored under the guise of advancing computer vision technology.
Jacobsen: Why did you choose the eyes as a metric or marker for detecting depression? I assume it’s part of a broader spectrum, of course.
Bae: Yes, it’s not just about the eyes alone. Other facial expressions and physiological elements, such as the pupil-to-iris ratio, play important roles as well. For example, when you’re focused, your pupils tend to constrict. But if you’re distracted or not engaged, your pupils dilate. These subtle changes in pupil size, measured through pupillometry, can provide valuable insights into a person’s mood or mental state.
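As a rough illustration of the kind of signal involved, the sketch below assumes an upstream eye-landmark model has already estimated pupil and iris radii for each captured frame; it simply derives the pupil-to-iris ratio and a few per-session summaries. This is a minimal sketch for readers, not Dr. Bae’s actual pipeline, and all function names and numbers are hypothetical.

```python
# Minimal sketch: turning per-frame pupil/iris estimates into session-level features.
# Assumes an upstream eye-landmark model has produced these radii (in pixels);
# the values below are illustrative, not real data.
from statistics import mean, pstdev

def pupil_iris_ratio(pupil_radius_px: float, iris_radius_px: float) -> float:
    """Pupil-to-iris ratio; normalizing by the iris compensates for camera distance."""
    return pupil_radius_px / iris_radius_px

def session_features(frames: list[tuple[float, float]]) -> dict:
    """Summarize one capture session as simple pupillometry features."""
    ratios = [pupil_iris_ratio(p, i) for p, i in frames]
    return {
        "mean_ratio": mean(ratios),           # overall pupil size relative to the iris
        "ratio_variability": pstdev(ratios),  # how much the pupil fluctuates in the session
        "max_dilation": max(ratios),
    }

# Hypothetical frames: (pupil_radius_px, iris_radius_px)
frames = [(21.0, 60.0), (23.5, 60.2), (26.0, 59.8), (24.2, 60.1)]
print(session_features(frames))
```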
The eyes are a particularly interesting marker because they are part of a larger set of behavioral and physiological phenotypes that can indicate attention, distraction, or even emotional states. The eyes not only reflect someone’s affective and cognitive status, but they can also hint at broader health conditions. For example, certain changes in eye behavior have been linked to conditions like high blood pressure or neurological disorders. While it’s not the eyes themselves that show these issues directly, the patterns of eye movements and responses can be used to infer underlying health conditions through careful analysis.
Jacobsen: How does combining the analysis of the eyes with facial expressions provide a robust metric for detecting depression? And what is the margin of error?
Bae: We’ve reported an error rate of less than 5%. Our system achieved an accuracy of over 76% using PupilSense and 69% with FacePsy, using rigorous cross-validation approaches. This means that when new, unseen data from participants is introduced, the algorithm can predict whether someone is depressed with 76% accuracy using PupilSense and 69% accuracy using FacePsy.
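To make “new, unseen data from participants” concrete: one common way to evaluate such models is leave-one-subject-out cross-validation, where each participant’s data is held out in turn so accuracy reflects performance on people the model never saw during training. The sketch below, using scikit-learn on synthetic data, illustrates that idea only; it is not necessarily the exact protocol, features, or classifier used in PupilSense or FacePsy.

```python
# Illustrative leave-one-subject-out evaluation on synthetic data (not the authors' pipeline).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_samples, n_features = 200, 8
X = rng.normal(size=(n_samples, n_features))  # stand-in for facial/pupil features
y = rng.integers(0, 2, size=n_samples)        # stand-in binary depression labels
groups = rng.integers(0, 25, size=n_samples)  # participant IDs: one person's data stays together

# Each fold trains on all participants except one and tests on the held-out person,
# so the reported accuracy reflects generalization to genuinely unseen individuals.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0),
    X, y, groups=groups, cv=LeaveOneGroupOut(), scoring="accuracy",
)
print(f"Mean held-out accuracy: {scores.mean():.2f}")
```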
This is quite innovative because other researchers often use different sensing technologies, like activity tracking and GPS, which can raise privacy concerns. That’s why we try to use just the smartphone without invading privacy. The system only triggers data collection when users are actively using their smartphones.
If you’re asking what specific signals indicate depression, there are many. We’ve found key markers such as head gestures, eye movements, and smiling behaviour. Our mobile application includes a range of behavioural markers, including pupil-to-iris ratios.
As for accuracy, we’ve introduced two main applications and have two more in development. Recently, we published papers on understanding human emotions and mood using facial markers. The model’s performance would improve if we included additional sensors like GPS, movement tracking, or other features. However, using multiple sensors requires significant computational resources, making it less scalable for everyday use, as most researchers and participants would need access to large computing systems they don’t have.
That’s why our open-source affective sensing framework will be scalable, not in the distant future, but right now. We’ve already shared the framework and application data on GitHub. Many other developers and researchers can build upon this work for future studies in mental health, eye disease, diabetes, and the use of facial features to understand dementia.
Detecting many other conditions in this way will be feasible.
Jacobsen: So, why the eyes? Why facial expressions? And why mobile?
Bae: We tend to make social faces and expressions when we meet people in person. We say, “Hi, how are you?” and smile. But when someone closes the door and looks at their mobile phone, they show a different side. They might browse, and we observe this shift – the change in their facial expressions and perhaps their mood when interacting with the virtual world through apps, search engines, and social media.
One interesting finding in our studies is that depressed individuals tend to smile more compared to healthy participants. It doesn’t seem intuitive at first, but this is part of the phenomenon of masking depression. We also noticed, though we haven’t reported this in full, that depressed individuals were more likely to use social media, entertainment apps, games, and YouTube. They’re searching for something to entertain themselves, looking for fun, funny videos or other content to make them feel happier.
We are preparing follow-up studies to analyze app usage and gather more context about what people do when they feel sad or happy, and how their mood changes over time. Excessive use of social media can contribute to feelings of sadness or depression, especially when people compare their lives to the curated, idealized versions of others’ lives. Everyone seems to be happy, travelling, and enjoying life. This constant comparison can lead to a decline in mental health.
Jacobsen: Yes, people are curating an idealized version of themselves for the world to see, and others who view this may feel worse in comparison. There’s certainly a logic to that. The major benefits of this technology are, first, it’s cost-effective. Second, it can be implemented now. Third, it has reasonable accuracy. And fourth, it can be distributed globally as an app.
So, my main question is: if you’ve combined facial expression analysis with PupilSense for early depression detection, what other easy-to-measure metrics could be integrated into the same smartphone app to further increase the accuracy and robustness of early depression detection? Are you working on such developments? I’m sure you’ve thought about those.
Bae: Yes. If you’re asking about additional features, there is more we can explore, particularly regarding application usage. You mentioned curation, which refers to what users seek and how often they visit specific applications and content.
We are currently using Android application categories, and while we can’t always see the exact name of the app unless it’s registered, we can still understand if the app is categorized as entertainment, work-related, or GPS and navigation. It’s possible to analyze the relationship between the use of these different categories – productivity apps, entertainment apps, and more – and a user’s emotional state and depression. This virtual behaviour can give us insight into their mood, which would be useful for intervening and delivering specific content that could help.
However, it’s critical to understand that we don’t need to know exactly what they are reading or viewing. That would be too invasive. For instance, if an application knew exactly what I was reading, that would raise privacy concerns. However, knowing which category an app belongs to and how frequently users engage with it provides enough insight.
Think about Netflix, for example. They might want to know what users are watching and how they feel while watching. Our application can capture various emotions and sentiments, and the time of day or duration of app usage is critical. Understanding these patterns of depression could be key in developing a more innovative and preventative approach so we can identify when someone might need help before they realize it themselves.
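As a hedged sketch of what this category-level analysis could look like, suppose usage logs have already been reduced to (category, minutes) events for a day, with no knowledge of the specific apps or content viewed, consistent with the privacy constraint described above. The snippet below aggregates such a log into simple daily features that could later be compared with self-reported mood; the category names, numbers, and feature choices are illustrative assumptions, not the actual FacePsy or PupilSense code.

```python
# Sketch: aggregating category-level app usage into daily features (illustrative only).
from collections import defaultdict

# Hypothetical one-day log: (app_category, minutes_of_use)
usage_log = [
    ("SOCIAL", 42), ("ENTERTAINMENT", 75), ("PRODUCTIVITY", 30),
    ("SOCIAL", 18), ("GAME", 25), ("MAPS_AND_NAVIGATION", 10),
]

def daily_category_features(log):
    """Total minutes per category plus the share of time spent on leisure-type apps."""
    totals = defaultdict(int)
    for category, minutes in log:
        totals[category] += minutes
    all_minutes = sum(totals.values())
    leisure = sum(totals.get(k, 0) for k in ("SOCIAL", "ENTERTAINMENT", "GAME"))
    return {
        "minutes_per_category": dict(totals),
        "leisure_share": leisure / all_minutes if all_minutes else 0.0,
    }

print(daily_category_features(usage_log))
# These per-day features could then be compared against self-reported mood scores
# without ever knowing which specific apps or content were viewed.
```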
Jacobsen: How can these apps be improved in their next iteration?
Bae: In the next iteration, the focus will be on improving our algorithm’s accuracy and generalizability. We are working on validating the model further before moving into large-scale clinical trials with depression patients. So far, we’ve made significant strides by incorporating new sampling methods and optimizing features like sampling time and battery usage to ensure the app performs well in various real-world environments.
To make the app more scalable and generalizable, we’ve been having productive discussions with institutions like Johns Hopkins and MATClinics. However, we’re eager to collaborate with more medical researchers and experts (email: sbae4@stevens.edu) who are interested in joining us in expanding the app’s potential.
Looking ahead, accessibility is another key priority. We want to make the app more user-friendly, especially for people who face barriers to healthcare, such as immigrants, low-income individuals, and others who have difficulty accessing hospitals or clinics. Our goal is to empower them to take control of their own health monitoring and management before any negative consequences arise. I firmly believe early detection saves lives when proper just-in-time interventions are delivered.
Jacobsen: What were the hurdles in the full development of this app?
Bae: One of the major hurdles was the approval process. It took almost a year to get the research started, largely due to the extra precautions and considerations around potential risks during the pandemic. Another challenge was finding participants for the modeling process. Given the pandemic, it was difficult to recruit enough people, and passive sensing research can be inconvenient for users, as they had to keep the app running for a full month without deleting it. I’m incredibly grateful to those who participated, as their commitment made a huge difference. Even though the compensation was minimal, they believed in the value of the research and stayed engaged. I’m also thankful to the volunteers who helped with pilot testing, as their support was crucial in overcoming these hurdles.
Jacobsen: Who were important collaborators?
Bae: The person who contributed the most was, without a doubt, Rahul, a PhD student in our lab, who worked tirelessly on the development. I also want to mention Priyanshu and Shahnaj, our assistant researcher and volunteer, for their help. And of course, Professor Tammy Chung from Rutgers University, who I’m currently collaborating with on an NIH project, has been incredibly supportive and believed in the potential of this research. Most importantly, this project wouldn’t have been possible without the startup funding support from the Department of Systems and Enterprises at our university.
Jacobsen: Dr. Bae, thank you so much for your time today. I appreciate it.
Bae: Yes, thank you!
License & Copyright
In-Sight Publishing by Scott Douglas Jacobsen is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. ©Scott Douglas Jacobsen and In-Sight Publishing 2012-Present. Unauthorized use or duplication of material without express permission from Scott Douglas Jacobsen is strictly prohibited; excerpts and links must give full credit to Scott Douglas Jacobsen and In-Sight Publishing with direction to the original content.
