Social agent design

This is a hypothetical social agent I designed for my psychology class to prevent phishing. The full report is below.


Stay safe when surfing the web: Agent Neptune

Social Agent Design

Katherine Rackliffe

05/04/2025

Introduction

Phishing is a common part of daily life, and many people have stories of someone falling for a scam. Traditional phishing training courses often focus on technical capabilities, teaching people to recognize suspicious links or email addresses. However, people still fall for scams because it is hard to keep up this constant analysis of every email coming their way.

Currently, Gmail has an automated junk mail detector that tries to filter out phishing emails, as does Outlook, but neither catches everything. Furthermore, false positives (where a legitimate email is flagged) mean that a user needs to dig through their spam folder, which decreases trust between humans and technology. Organizations can make strict policies to encourage cybersecurity, but these are often frustrating and annoying for people to follow. Think of a yearly cybersecurity training course, which can come across as patronizing and mandated by an IT department. Another example is the fake phishing emails that an organization sends to try and ‘catch’ people who might fall for a real phishing email. Again, this disrupts the trust between people and technology, and subsequently the IT department (Arce, 2003).

The major consequence of this inefficient training is simply people getting phished. But the unintended consequences include shame, frustration, and growing self-deprecation in those who are scammed. Cybersecurity experts tend to think about technology rather than people. Usable security is an entire branch of security designed to bridge the gap between security and people, but its research often neglects the emotions of users. Cybersecurity practitioners call humans the ‘weakest link’ in security, and technology is designed from this framework to painstakingly keep humans in place.

Current research has focused on the emotion of urgency, which pushes people to answer phishing emails without noticing the red flags. Naturally, this has led to cybersecurity training that encourages users to slow down when they’re reading emails. But this contradicts the reality of our daily lives: we’re busy responding to hundreds of other emails.

Some research has been done on AI and social agents for catching phishing messages, and even social media users will lean on AI agents in some cases by talking with AI workers (Sameen et al., 2020). On the other side of cybersecurity, generative AI is used to scam people (Liaqat et al., 2024).

The very nature of hacking and scamming is about breaking systems for your own benefit. Cybersecurity is, and most likely always will be, a constant battle between defenders and attackers. As technology improves for one side, the other will adapt and change its technology to match. But what if there’s a tactic that has changed little over time? What would an anti-phishing tool look like if it were based on emotions and psychology rather than technology?

Theoretical foundations

The primary focus of this agent is to target shame around cybersecurity (Sznycer et al., 2016). Many cybersecurity trainings generate shame as an unintended consequence. This shame leads to norms around security that are counterintuitive. For example, someone might be embarrassed about clicking on a phishing link. Due to this embarrassment, they won’t report the email, and it will go unnoticed by the IT department. By the time the scam is noticed, computers are locked down with malware. Because the company needs a scapegoat for what happened, it fires the employee and pins the blame on human error.

Encouraging an emotional response goes against many ideas in security. Again, cybersecurity training tells people to think carefully about emails, but in reality, most people who decide an email is suspicious have an initial ‘gut feeling’ that helps them avoid the link. Emotions (such as fear and unease) that contribute to this instinctive feeling of wrongness should be encouraged for phishing detection (Von Preuschen et al., 2024). Fear helps people recognize threats and figure out what to do next. Cognitive psychology research on decision-making reveals the importance of emotion in these rapid-fire decisions (Campos et al., 1994).

Emotional responses to phishing can have another benefit. Within a more social group, people can talk about the emails they receive and collectively figure out the true intent of an email (Cheong et al., 2023). Additionally, with a less antagonistic relationship with the IT department, people will be more willing to report and communicate about the emails they get (Wood et al., 2021).

One possible benefit of using emotional awareness when designing cybersecurity systems is that it could support older adults. Psychology research has found that older adults have less interoceptive awareness of internal cues (Khalsa et al., 2009), and many scammers target older adults. It’s commonly thought that older adults are targeted because they struggle with technology, but considering how decisions are made while using technology, this may not be the case. It’s likely that scammers target older adults because they have more money rather than because of any lack of technical ability, and it’s possible that lower interoceptive awareness means older adults don’t notice the fear/unease response that can help detect a suspicious email.

Phishing detection can also utilize emotions research. Many scam emails attempt to trigger a stress response by instigating fear. Think of a phishing email that says your package has been lost, or that your computer has been hacked. Other phishing emails focus on triggering an empathetic response, such as asking for help or mimicking a legitimate fundraiser. If a social agent can detect which emotions an email is trying to evoke, it can flag these emotions and decide how to proceed. This is what the social agent described in this paper aims to do, by noticing specific keywords that are related to emotions (e.g., urgent, help, gift, danger).
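
As a rough illustration, the keyword check could be as simple as the following Python sketch. The emotion categories and keyword lists below are illustrative assumptions rather than a finished specification:

from collections import defaultdict

# Illustrative emotion-keyword lists; a real deployment would tune these.
EMOTION_KEYWORDS = {
    "fear/urgency": ["urgent", "immediately", "suspended", "hacked", "danger"],
    "empathy": ["help", "donate", "fundraiser"],
    "reward": ["gift", "prize", "winner", "free"],
}

def flag_emotions(email_text):
    """Return which emotion categories the email's wording appears to target."""
    text = email_text.lower()
    hits = defaultdict(list)
    for emotion, keywords in EMOTION_KEYWORDS.items():
        for keyword in keywords:
            if keyword in text:
                hits[emotion].append(keyword)
    return dict(hits)

print(flag_emotions("URGENT: your account was hacked, act immediately"))
# prints {'fear/urgency': ['urgent', 'immediately', 'hacked']}

A match only flags the email for Neptune to raise with the user; the final judgment is left to the human.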

Finally, the design of the agent is built around design principles for emotion. Colors are linked to feelings and can help a user reach a certain feeling during the process of checking emails. For example, red is often used to indicate danger and is bright enough to catch a user’s attention (Schloss et al., 2020). The shape language of the agent, as described and shown later in this paper, uses round and soft shapes, a principle currently used in empathetic robot design (Melzer et al., 2019; Main et al., 2017).

Emotional Perception System

This social agent (Neptune) avoids using an LLM and instead uses a branching series of choices, much like a flowchart. This ensures the social agent delivers the same response every time.

Neptune will start when an email is flagged as suspicious. The main goals are as follows:

1.      Encourage emotional introspection to detect the intent of the email

2.      Support the social network and direct the user to report to the IT department

3.      Avoid shaming or blaming the user

There are two ways Neptune will perceive human emotion. The first is within the scam email itself: many of these emails use charged language to try to make people panic. The second is through the flowchart, which encourages secure choices and lets users choose between options during the process.

Much of phishing training is helpful only once someone already sees an email as suspicious. In many cases, people never get to this point. Only later can they see what was wrong with the email; they know it was misspelled, but they never reached the point of recognizing this in the moment, because handling the email was just part of their everyday tasks.

Neptune is designed with calming colors and a humanized persona.

Fig 1. Conceptual design for Neptune. Neptune is designed as a genderless humanoid figure with teal and blue colors. Neptune will have various colored borders indicating Neptune’s emotions.

Neptune will have three different emotional states. The first occurs when a scam email is detected: Neptune will have an alarmed expression (Adolphs, 2002) and the background will be red. While the user is going through the options, Neptune will have a yellow background and a neutral, inquisitive expression. Finally, when the user settles on a decision, Neptune will have a blue background and a happy expression (Kalegina et al., 2018), indicating that the process is complete.
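
As a small sketch, these three states could be encoded as a simple enumeration pairing each background color with an expression. The state names below are hypothetical, chosen only for illustration:

from enum import Enum

class NeptuneState(Enum):
    # (background color, facial expression) for each state described above
    ALERT = ("red", "alarmed")           # a scam email has just been detected
    INQUIRY = ("yellow", "inquisitive")  # the user is working through options
    RESOLVED = ("blue", "happy")         # the user has settled on a decision

    @property
    def background(self):
        return self.value[0]

    @property
    def expression(self):
        return self.value[1]

print(NeptuneState.ALERT.background)     # prints red
print(NeptuneState.RESOLVED.expression)  # prints happy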

Colors are linked to emotional states, as described in the theory section of this paper. These colors also work as a shorthand, so people don’t need to rely on the text that Neptune displays.

The program for Neptune is very simple and is based on early chatbot designs such as ELIZA (Weizenbaum, 1966), relying on a simple decision tree to make decisions. This means that Neptune will respond consistently, and the effectiveness of the program will be easier to measure and calibrate accordingly.

ELIZA, even as a simple chatbot, was shown to spark an emotional reaction from humans. An argument can be made for building a phishing agent with an LLM, and many junk filters use other pattern recognition algorithms to filter out junk mail, but these approaches create inconsistency in the responses provided by the social agent. As such, Neptune will rely on a consistent decision tree.

The dialogue tree is as follows, with brackets indicating where Neptune will pull directly from the email:

“Hmm, this email looks a little odd to me. I can see that it says [contents from email], and [other red flags for a phishing email]. Can we check and make sure that it isn’t a scam?”

o   Option one: Forward this email to IT to check

o   Option two: Investigate this email with Neptune

o   Option three: Ignore, I know this sender: [Email sender] and the contents look normal to me.

This message presents Neptune as calm, humble, and offering agency to the user. The first step in preventing a phishing scam, noticing that the email looks odd, has begun. As the dialogue tree continues, it will continue to rely on empathetic design principles and encourage cybersecure behavior.
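
To make this concrete, below is a minimal Python sketch of how such a dialogue tree could be implemented, with each node holding a message template and the options it offers. The class and field names are hypothetical, and the template slots are filled directly from the flagged email rather than generated by a model:

from dataclasses import dataclass, field

@dataclass
class Node:
    # Message template with slots filled from the flagged email.
    template: str
    # Maps each option label to the next node, or to a terminal action name.
    options: dict = field(default_factory=dict)

ignore_check = Node(
    "Ok, do you know {sender}? It looks like this email is asking for your "
    "personal info and sending you to a link outside of our organization. Confirm?",
    {"Confirm, this email is fine": "dismiss",
     "Wait a moment first": "investigate"},
)

root = Node(
    "Hmm, this email looks a little odd to me. I can see that it says "
    "{contents}, and {red_flags}. Can we check and make sure that it isn't a scam?",
    {"Forward this email to IT to check": "report_to_it",
     "Investigate this email with Neptune": "investigate",
     "Ignore, I know this sender: {sender}": ignore_check},
)

print(root.template.format(contents="URGENT",
                           red_flags="a couple of the links are misspelled"))

Because every reply is a fixed template, Neptune’s wording stays fully deterministic, unlike an LLM-generated response; a real implementation would simply walk from the root to whichever option the user selects, updating Neptune’s color state at each step.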

Interaction Design Scenario

Context: A tired secretary, Donna, is checking her emails first thing in the morning. She hasn’t had her coffee yet, and her kid was up with the flu. She has around fifty unread emails, and she really wants to get through enough of them to feel in control of her day. One of the emails is a phishing email.

Turn 1

User Action: Donna clicks on the phishing email. It says something about her car insurance and contains a link that looks close enough to her actual car insurance company’s address, so she moves to click the link. Neptune, a pop-up, appears on her screen with a little frowning mascot.

Agent perception:

-          Timing based on timestamps: early in the morning, with many other unread emails, all indicating stress on the user

-          Email: uses common scam emotion triggers, such as URGENT in the subject line

Agent response: Neptune turns to the RED state and appears.

-          “Hmm, this email looks a little odd to me. I can see that it says URGENT, and a couple of the links are misspelled. Can we check and make sure that it isn’t a scam?”

o   Option one: Forward this email to IT to check

o   Option two: Investigate this email with Neptune

o   Option three: Ignore, I know this sender: carinsuarse@gmail.com

Rationale: Starting the process and providing a reason why the email was flagged.

Turn 2

User Action: Donna sighs. “It’s the fish person again. I hate technology.” She looks at the word URGENT. “Yeah, car insurance is urgent. And the fish person has been wrong before.” Donna is now suspicious of the email but isn’t sure it’s a scam. She hovers over the Ignore option.

Agent perception:

-          Hovering over the Ignore option

-          Clicked on the Ignore option

Agent response: Neptune turns YELLOW and begins the investigation process.

-          “Ok, do you know carinsuarse@gmail.com? It looks like this email is asking for your personal info and sending you to a link outside of our organization. Confirm?”

o   Confirm, this email is fine

o   Wait a moment first

Turn 3

User Action: Donna looks at the email address again. She sees the misspelled word. “Oh! It is a scam.” Relieved, she clicks on the report option. This sends the email to the IT department, which quickly handles it.

Agent perception: Request to send to the IT department.

Agent response: Neptune turns BLUE and gives a quick goodbye message.

-          “Thank you for checking!”

-          Email sent to IT department and handled properly

Ethical and Societal Considerations

One of the main concerns with using AI agents for cybersecurity is how easily LLMs can be manipulated. Scams are already working around spam filters that use LLM programs. Neptune avoids these concerns, though at the cost of adaptability. In cases where Neptune doesn’t automatically pick up on a suspicious email, or where an organization doesn’t make Neptune strict enough, it will fail.

This paper focuses mostly on a US example within a company that has a dedicated IT department. A different decision tree would need to be designed for someone using a personal device or living in a different part of the world. Translations into other languages would also need to be made so that Neptune is accessible to all.

Furthermore, Neptune is just one more piece of technology that scammers would learn to evade. If there are too many false positives, users might grow tired and ignore the prompt to slow down and examine the email. This would again result in users distrusting the IT department and building resentment towards security measures.

Solutions for cybersecurity are all small drops in the vast ocean that is the internet. Neptune alone will not solve everything, but it is a step towards more empathetic design for cybersecurity.

Conclusion

This paper described the theory, design, and an example interaction of a social agent designed to prevent phishing scams. Neptune is designed to encourage reporting emails and to create a calming experience that supports people in reasoning through a suspicious email.

Building this social agent would be relatively easy. Simple branching-pathway decision programs have existed for decades, and Outlook and Gmail already have built-in reporting methods for their clients and easily incorporate different types of plugins for organizations.

There’s an argument that most cybersecurity behaviors are easy to follow. After all, how hard can it be to avoid clicking on a link? The real challenge with cybersecurity lies in the norms of how we think about and interact with technology. These norms include shaming people for falling for scams and firing cybersecurity workers when a company is hacked, and they are reinforced by how technology and cybersecurity trainings are designed.

Technology shouldn’t be frustrating to use. People have emotions, and for too long cybersecurity has considered emotions superfluous or even a weakness in human behavior. But by tying psychological concepts of emotion into technology, we can encourage reports of phishing scams and help people navigate the ocean of the internet without getting phished.

References

Adolphs, R. (2002). Recognizing Emotion from Facial Expressions: Psychological and Neurological Mechanisms. Behavioral and Cognitive Neuroscience Reviews, 1(1), 21–62. https://doi.org/10.1177/1534582302001001003

Arce, I. (2003). The weakest link revisited [information security]. IEEE Security & Privacy, 1(2), 72–76. https://doi.org/10.1109/MSECP.2003.1193216

Campos, J. J., Mumme, D. L., Kermoian, R., & Campos, R. G. (1994). A functionalist perspective on the nature of emotion. Monographs of the Society for Research in Child Development, 59(2–3), 284–303. https://doi.org/10.2307/1166150

Cheong, J. H., Molani, Z., Sadhukha, S., & Chang, L. J. (2023). Synchronized affect in shared experiences strengthens social connection. Communications Biology, 6(1), 1099. https://doi.org/10.1038/s42003-023-05461-2

Kalegina, A., Schroeder, G., Allchin, A., Berlin, K., & Cakmak, M. (2018). Characterizing the Design Space of Rendered Robot Faces. In Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (pp. 96–104). https://doi.org/10.1145/3171221.3171286

Khalsa, S., Rudrauf, D., & Tranel, D. (2009). Interoceptive awareness declines with age. Psychophysiology, 46(6), 1130–1136. https://doi.org/10.1111/j.1469-8986.2009.00859.x

Liaqat, M. S., Mumtaz, G., Rasheed, N., & Mubeen, Z. (2024). Exploring Phishing Attacks in the AI Age: A Comprehensive Literature Review. Journal of Computing & Biomedical Informatics, 7(02). https://www.jcbi.org/index.php/Main/article/view/567

Main, A., Walle, E. A., Kho, C., & Halpern, J. (2017). The Interpersonal Functions of Empathy: A Relational Perspective. Emotion Review, 9(4), 358–366. https://doi.org/10.1177/1754073916669440

Melzer, A., Shafir, T., & Tsachor, R. P. (2019). How do we recognize emotion from movement? Specific motor components contribute to the recognition of each emotion. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.01389

Sameen, M., Han, K., & Hwang, S. O. (2020). PhishHaven—An Efficient Real-Time AI Phishing URLs Detection System. IEEE Access, 8, 83425–83443. https://doi.org/10.1109/ACCESS.2020.2991403

Schloss, K. B., Witzel, C., & Lai, L. Y. (2020). Blue hues don’t bring the blues: Questioning conventional notions of color-emotion associations. Journal of the Optical Society of America A, 37, 813. https://doi.org/10.1364/JOSAA.383588

Sznycer, D., Tooby, J., Cosmides, L., Porat, R., Shalvi, S., & Halperin, E. (2016). Shame closely tracks the threat of devaluation by others, even across cultures. Proceedings of the National Academy of Sciences of the United States of America, 113(10), 2625–2630. https://doi.org/10.1073/pnas.1514699113

Von Preuschen, A., Schuhmacher, M. C., & Zimmermann, V. (2024). Beyond fear and frustration—Towards a holistic understanding of emotions in cybersecurity. In Proceedings of the Twentieth USENIX Conference on Usable Privacy and Security (pp. 623–642). https://doi.org/10.5555/3696899.3696932

Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45. https://doi.org/10.1145/365153.365168

Wood, A., Lipson, J., Zhao, O., & Niedenthal, P. (2021). Forms and Functions of Affective Synchrony. In M. D. Robinson & L. E. Thomas (Eds.), Handbook of Embodied Psychology: Thinking, Feeling, and Acting (pp. 381–402). Springer International Publishing. https://doi.org/10.1007/978-3-030-78471-3_17

This report was human-generated. No text was generated or edited by an LLM.

Title was suggested by Marian Isbell