Phishing is an online identity theft technique where attackers steal users' personal information, leading to financial losses for individuals and organizations. With the increasing adoption of smartphones, which provide functionalities similar to desktop computers, attackers are targeting mobile users. Smishing, a phishing attack carried out through Short Messaging Service (SMS), has become prevalent due to the widespread use of SMS-based services. It involves deceptive messages designed to extract sensitive information. Despite the growing number of smishing attacks, limited research focuses on detecting these threats. This work presents a smishing detection model using a content-based analysis approach. To address the challenge posed by slang, abbreviations, and short forms in text communication, the model normalizes these into standard forms. A machine learning classifier is employed to classify messages as smishing or ham. Experimental results demonstrate the model's effectiveness, achieving classification accuracies of 97.14% for smishing and 96.12% for ham messages, with an overall accuracy of 96.20%.
As smartphone capabilities increasingly approach those of desktop computers, attackers have shifted their focus to mobile device users. Smishing (SMS phishing) is phishing conducted over SMS, aimed at stealing sensitive user information. Despite the rapid growth in smishing attacks, research on detecting them remains relatively limited. This study proposes a content analysis-based smishing detection model that normalizes slang, abbreviations, and shorthand into standard forms, then uses a machine learning classifier to distinguish smishing from legitimate (ham) SMS messages. Experimental results show that the model achieves 97.14% classification accuracy for smishing messages, 96.12% for legitimate messages, and 96.20% overall.
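The three reported accuracies can be sanity-checked against each other. The short sketch below assumes (this is not stated in the summary) that the overall accuracy is the class-size-weighted average of the two per-class accuracies, and solves for the share of smishing messages that this assumption implies. The function name `implied_class_share` is hypothetical, and the result is an inference from the reported figures, not a number taken from the paper.

```python
def implied_class_share(acc_pos, acc_neg, acc_overall):
    """Solve acc_overall = p * acc_pos + (1 - p) * acc_neg for p.

    p is the fraction of test messages in the positive (smishing) class,
    ASSUMING the overall accuracy is a class-size-weighted average.
    """
    return (acc_overall - acc_neg) / (acc_pos - acc_neg)


# Figures reported in the abstract (percent).
p = implied_class_share(acc_pos=97.14, acc_neg=96.12, acc_overall=96.20)
print(f"implied smishing share: {p:.3f}")
```

Under that assumption, the overall figure sitting so close to the ham accuracy suggests that smishing messages make up only a small minority of the evaluation set, which is typical of SMS corpora.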
Primary Problem: With the surge in smartphone users (projected to reach 2.87 billion by 2020), SMS has become a primary channel for attackers to conduct phishing attacks. Smishing attacks exploit users' high trust in SMS (35% of users consider SMS the most trustworthy messaging platform) for fraud.
Problem Significance:
33% of mobile users have received smishing messages
42% of mobile users click on malicious links
Smartphone users face 3 times higher risk of phishing attacks compared to desktop users
45% of users received smishing messages in 2017, representing a 2% increase from 2016
Limitations of Existing Methods:
Many SMS spam detection techniques exist, but research specifically targeting smishing is limited
Slang, abbreviations, and shorthand forms in text reduce classifier efficiency
Lack of effective text normalization mechanisms
Research Motivation:
Mobile device hardware limitations (small screens, lack of security indicators) increase attack success rates
Need to effectively detect smishing attacks while protecting user privacy
Proposed a comprehensive smishing security model: A two-stage detection framework based on content analysis
Innovative text normalization method: Using the NoSlang dictionary to handle slang, abbreviations, and shorthand, significantly improving classification accuracy
Comprehensive mobile phishing attack taxonomy: Systematically organized 7 major categories of mobile phishing attack methods
Excellent detection performance: Achieving 96.20% overall accuracy on public datasets
In-depth literature review: Providing comprehensive analysis of mobile phishing attacks and defense mechanisms
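The text normalization step named in the contributions above can be illustrated with a minimal sketch. It assumes normalization is a token-by-token dictionary lookup; the `SLANG` mapping here is a tiny hand-written stand-in and not the contents of the NoSlang dictionary, and the names `normalize` and `TOKEN_RE` are hypothetical.

```python
import re
from typing import Mapping

# Hypothetical mini-lexicon for illustration only; the NoSlang dictionary
# used in the paper is far larger and is NOT reproduced here.
SLANG = {
    "u": "you",
    "ur": "your",
    "pls": "please",
    "plz": "please",
    "acct": "account",
    "msg": "message",
    "txt": "text",
    "b4": "before",
    "2day": "today",
}

# Lowercase alphanumeric tokens; punctuation is dropped (a simplification).
TOKEN_RE = re.compile(r"[a-z0-9']+")


def normalize(message: str, lexicon: Mapping[str, str] = SLANG) -> str:
    """Replace each slang or abbreviated token with its standard form."""
    tokens = TOKEN_RE.findall(message.lower())
    return " ".join(lexicon.get(tok, tok) for tok in tokens)


print(normalize("Pls verify ur acct 2day"))
# prints: please verify your account today
```

Mapping variants such as "pls" and "plz" to one standard word means the classifier sees a single feature instead of several sparse ones, which is the effect the summary credits for the accuracy gain.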
Input: SMS text messages
Output: Binary classification result (smishing message or ham message)
Constraints: User privacy protection, real-time detection, high accuracy
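The input/output contract above (SMS text in, smishing or ham label out) can be sketched with a small classifier. The summary does not say which machine learning classifier the paper uses, so the multinomial Naive Bayes below is a swapped-in stand-in, and the training messages are invented toy examples rather than data from the paper. The names `TinyNaiveBayes` and `tokenize` are hypothetical.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the message and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)   # messages per class
        self.word_counts = {}                 # label -> Counter of tokens
        self.vocab = set()
        for text, label in zip(messages, labels):
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(text):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training set, for illustration only.
train = [
    ("your account is suspended verify your password now", "smishing"),
    ("you won a prize claim now click link", "smishing"),
    ("urgent verify your bank account click link", "smishing"),
    ("are we still meeting for lunch today", "ham"),
    ("call me when you get home", "ham"),
    ("see you at the meeting tomorrow", "ham"),
]
model = TinyNaiveBayes().fit([t for t, _ in train], [y for _, y in train])

print(model.predict("verify your account now"))    # prints: smishing
print(model.predict("see you at lunch tomorrow"))  # prints: ham
```

Because the model only looks at message content, it fits the privacy constraint in the sense that no contact list or sender history is needed; whether the paper's system works the same way is not stated in this summary.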
Machine learning applications in text classification
SMS spam filtering techniques
Mobile malware detection methods
Primary references include APWG phishing reports, IEEE and ACM conference papers, and important journal articles in related fields, with authoritative and comprehensive citation coverage.
Overall Assessment: This is practical research that addresses an important security problem, with some methodological innovation and satisfactory experimental results. Although its technical depth is limited, it provides an effective baseline method for smishing detection and has good academic and practical value.