Detecting Spam Bots on Social Networks
A. Gnanasekar1; T. Thangam2; S. Afraah Mariam3; K. Deepika4; S. Dhivya Shree5
1 Associate Professor, Department of CSE, R.M.D College of Engineering, Tamil Nadu, India. ags.cse@rmd.ac.in
2 Associate Professor, Department of ECE, PSNA College of Engineering and Technology, Tamil Nadu, India. thangam7280@psnacet.edu.in
3 Student, Department of CSE, R.M.D College of Engineering, Tamil Nadu, India. ucs17103@rmd.ac.in
4 Student, Department of CSE, R.M.D College of Engineering, Tamil Nadu, India. ucs17125@rmd.ac.in
5 Student, Department of CSE, R.M.D College of Engineering, Tamil Nadu, India. ucs17131@rmd.ac.in

Abstract

Bots have appeared on social media in a variety of forms. Twitter, for instance, has been particularly hard hit, with bots accounting for a strikingly large share of its users. These bots are used for nefarious purposes such as spreading false information about politicians and inflating the apparent popularity of celebrities. They can also skew the results of conventional social media research. With the rapid growth in the volume, velocity, and variety of user data in online social networks, new methods of grouping and evaluating such massive data are being explored. Removing malicious social bots from a social media site is crucial. The most widely used methods for identifying fraudulent social bots focus on quantitative measures of their behaviour; social bots can simply mimic these features, leading to low detection accuracy. We therefore develop a new technique for detecting malicious social bots that combines transition probabilities over clickstream sequences with semi-supervised clustering. This method considers not only the transition probability of user activity clickstreams but also the time characteristics of behaviour.
Experiments on real online social network data show that the detection method based on the transition probability of user activity clickstreams improves detection accuracy for various kinds of malicious social bots by an average of 12.8 percent compared with a detection method based on quantitative estimates of user behaviour alone.

Key-words: Semi-Supervised Clustering, Social Media.

ISSN: 2237-0722. Vol. 11 No. 2 (2021). Received: 22.03.2021 – Accepted: 25.04.2021
1. Introduction

Computerised systems can manage social accounts and execute corresponding procedures in an online social network based on a set of rules. The amount and variety of user engagement on social media platforms has increased with the growing use of mobile devices, as the sheer volume, pace, and variety of data produced by the massive online social network user base demonstrates. Some social bots are used to increase the accuracy and efficiency of social media data collection and analysis: the social bot SF QuakeBot, for example, collects earthquake-related information from social media in San Francisco in real time and generates reports. However, public opinion on social networks, as well as large amounts of user data, can also be mined and exploited for malicious purposes. Since automated social bots cannot accurately reflect the true wishes and intentions of ordinary humans, they are often regarded as malicious by the internet community. Such illegitimate activities can jeopardise the security and safety of social media platforms, and various approaches have been used in the past to protect online social networks. Since different users have different behaviours, tastes, and online actions, such as the way they click or type and the speed at which they type, user behaviour is the clearest indicator of user intent. In other words, we may be able to profile and classify users by mining and analysing the information concealed in their online activity. Contextual factors, on the other hand, influence how a user behaves online: user behaviour is complicated, and the environment in which it occurs is always evolving, including both the externally observable application environment and the hidden environment of user knowledge.

2. Objective

To accurately differentiate social bots from normal users, detect malicious social bots, and minimise the harm they cause, we must capture and examine the situational context of user behaviour and contrast the behavioural differences between malicious social bots and normal users in their complex forms. The aim of this paper is to detect malicious social bots on social media sites in real time:
1. To propose transfer likelihood features between user clickstreams.
2. To plot spatial-temporal features for malicious social bots.
3. Literature Review

According to recent figures, more than half of all Twitter accounts are not run by humans. Administrators of social networking sites are well aware of these harmful practices and use suspension and removal mechanisms against such accounts; one survey found that Twitter suspended 28% of accounts created in 2008 and half of those created in 2014. The role of bots in facilitating these illegal activities is not well established: in one experiment, 165,000 accounts went largely unnoticed for days, and bots currently account for 16% of all spammers on Twitter. Bot detection in social networks has received a great deal of attention, yet approaches to identifying spam in social networks remain ineffective. Bots in a botnet, for example, can collaborate to accomplish a shared malicious purpose. Social bots, which can imitate human behaviour, have gained considerable ground in social networks in recent years, and they can be programmed to work together to complete their assignments. Some users, as well as social bots, employ a number of methods for malicious or sinister purposes; for instance, social bots 'crawl' online social networks for words and pictures that exhibit both human and social bot features, which makes the operating period of social bots even harder to pin down. Social bots are becoming smarter and more capable of imitating human behaviour, and they are difficult to detect. They are typically identified using machine-learning-based methods, such as Twitter's Bot Or Not, published in 2014, which trains and evaluates a random forest model using contextual social data from ordinary users and social bot accounts. A supervised machine-learning approach for detecting social bots based on interaction themes and flow behaviour has also been proposed, and the Act-M model is a time-activity model that focuses on the timing of user behaviour events and can measure them accurately.
Existing efforts have also concentrated on detecting semi-social bots. However, data tagging takes time and effort and is usually impractical at social-network scale; with unsupervised clustering it is impossible to tell which clusters are benign and which are malicious. Overall, approaches to detecting spam on social media networks remain inadequate.

4. Proposed System

In this paper, we propose transfer likelihood features between user clickstreams, backed by social situation analytics, and develop a spatiotemporal features-based approach for detecting malicious social bots in real time. We evaluate user activity attributes and
classify transfer probability features between user clickstreams to detect malicious social bots in online social networks. We develop a social bot detection framework with space-time features that combines transfer likelihood features and time features: user behaviour features are evaluated and the transfer likelihood of user behaviour is selected. Using a semi-supervised clustering process, we can then analyse and identify situation-aware user activities across social networks, and with only a small number of tagged samples we can reliably detect malicious social bots.

1. Data Collection
On the CyVOD network, the website platform and the Android and iOS apps are all integrated. On CyVOD, a data burying point is used to capture user clickstream actions, and the information is recorded server-side. In a real-world scenario, one must work with the web site or call the relevant API (if provided) to get the data.

2. Data Cleaning
To delete incorrect data, produce reliable transition probabilities between clickstreams, and avoid the transition-probability errors caused by sparse activity, data from users with too few clicks must be cleaned out.

3. Data Processing
Labeled data is randomly selected from the social bot set and the normal user set. Bot accounts are labeled -1 and normal user accounts are labeled 1; these seed users are assigned to their corresponding groups.
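The cleaning and transition-probability steps can be sketched as follows. This is a minimal illustration rather than the authors' code: the action names mirror the CyVOD features used for feature selection, while the MIN_CLICKS threshold and the function name transition_probabilities are our own assumptions (the paper does not state a cleaning threshold).

```python
from collections import Counter

# Click actions on the CyVOD platform (per the paper's feature list).
ACTIONS = ["play", "like", "feedback", "comment", "share", "more"]

MIN_CLICKS = 5  # assumed cleaning threshold; the paper states no value


def transition_probabilities(clickstream):
    """Estimate P(a, b): the probability that action b directly follows a.

    Streams shorter than MIN_CLICKS are discarded (None is returned), since
    too few clicks yield unreliable transition-probability estimates.
    """
    if len(clickstream) < MIN_CLICKS:
        return None
    pair_counts = Counter(zip(clickstream, clickstream[1:]))
    from_counts = Counter(clickstream[:-1])
    return {
        (a, b): pair_counts[(a, b)] / from_counts[a]
        for a in ACTIONS
        for b in ACTIONS
        if from_counts[a] > 0
    }


stream = ["play", "play", "play", "like", "play", "share"]
probs = transition_probabilities(stream)
print(probs[("play", "play")])  # 2 of the 4 transitions out of "play" -> 0.5
```

A user's feature vector is then the subset of these probabilities named in the feature-selection step, e.g. (P(play,play), P(play,like), ..., P(play,more)).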
4. Feature Selection
According to the main functions of the CyVOD platform, the transfer probability features are P(play,play), P(play,like), P(play,feedback), P(play,comment), P(play,share), and P(play,more). Building transfer probability matrices over all user actions would produce large and sparse matrices and make detection more difficult, so only these features are used.

5. Semi-Supervised Clustering
Labeled seed users determine the initial centres of the two clusters. Unlabeled data is then iterated over, continuously optimising the clustering result.
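The semi-supervised step can be sketched as seeded k-means: labeled seed users fix the initial cluster centres, after which unlabeled points are assigned and the centres re-estimated iteratively. This is an illustrative sketch under the paper's 1 / -1 labeling convention, not the authors' implementation; the two-dimensional example features and the function name seeded_kmeans are invented.

```python
def seeded_kmeans(points, seeds, seed_labels, n_iter=20):
    """Cluster `points` starting from centres given by labeled seed users.

    `seed_labels` follows the paper's convention: 1 = normal user, -1 = bot.
    Returns a predicted label for every point in `points`.
    """
    classes = sorted(set(seed_labels))  # e.g. [-1, 1]

    def mean(vecs):
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Initial centres come from the labeled seed users only.
    centres = [mean([s for s, l in zip(seeds, seed_labels) if l == c])
               for c in classes]
    assign = []
    for _ in range(n_iter):
        # Assign every point to its nearest centre, then re-estimate centres.
        assign = [min(range(len(classes)), key=lambda k: dist2(p, centres[k]))
                  for p in points]
        for k in range(len(classes)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                centres[k] = mean(members)
    return [classes[a] for a in assign]


# Two invented transfer-probability features per user,
# e.g. (P(play, play), P(play, share)).
labels = seeded_kmeans(
    points=[[0.85, 0.15], [0.95, 0.05], [0.12, 0.88], [0.08, 0.92]],
    seeds=[[0.9, 0.1], [0.1, 0.9]],
    seed_labels=[1, -1],
)
print(labels)  # [1, 1, -1, -1]
```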
6. Acquisition of Result Sets
Through detection, the normal user set and the social bot set are obtained.

7. Result Evaluation
Precision, recall, and F1 score are the three measures used to evaluate performance; the accuracy metric is used for comparison with an SVM baseline.
Prediction Recall
Recall is the number of relevant items retrieved divided by the total number of relevant items in the collection; here, it is the fraction of actual social bots that the detector identifies.

Prediction Precision
Precision is the number of relevant items retrieved divided by the total number of items retrieved; here, it is the fraction of accounts flagged as bots that really are bots.
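In terms of the detected sets, these measures (plus the accuracy used for the SVM comparison) reduce to simple counts over true and false positives. The sketch below treats the bot class (-1 in the paper's labeling) as the positive class; the function name evaluate and the example labels are ours.

```python
def evaluate(y_true, y_pred, positive=-1):
    """Precision, recall, F1, and accuracy, with bots (-1) as positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return precision, recall, f1, accuracy


# 3 true bots; the detector finds 2 of them and wrongly flags 1 normal user.
prec, rec, f1, acc = evaluate(
    y_true=[-1, -1, -1, 1, 1, 1],
    y_pred=[-1, -1, 1, -1, 1, 1],
)
print(round(prec, 3), round(rec, 3), round(f1, 3), round(acc, 3))
# 0.667 0.667 0.667 0.667
```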
Prediction Accuracy
Accuracy measures the correctness of predictions on the test data and is calculated by dividing the number of correct predictions by the total number of predictions.

5. Data Flow Diagram
The DFD, also known as a bubble chart, is a simple graphical formalism for describing a system in terms of the input data it receives, the processing performed on that data, and the output data it produces.
6. Use Case Diagram
The product of a use-case analysis is a use-case diagram in the Unified Modeling Language (UML). Its main goal is to give a graphical representation of a system's functionality in terms of actors, their goals represented as use cases, and any dependencies between use cases; in short, it shows what the actors can do.
7. Conclusion
A new method for accurately detecting malicious social bots is presented. Results show that social situation analytics over the transition probabilities between user clickstreams can accurately detect malicious social bots in online social media. In future work, we intend to develop an application that can be deployed across many social networking sites and to cover more malicious social bot activities.

References

Morstatter, F., Wu, L., Nazer, T.H., Carley, K.M., & Liu, H. (2016). A new approach to bot detection: striking the balance between precision and recall. In IEEE/ACM International Conference
on Advances in Social Networks Analysis and Mining (ASONAM), 533-540. https://doi.org/10.1109/ASONAM.2016.7752287
De Lima Salge, C.A., & Berente, N. (2017). Is that social bot behaving unethically? Communications of the ACM, 60(9), 29-31. https://doi.org/10.1145/3126492
Sahlabadi, M., Muniyandi, R.C., & Shukur, Z. (2014). Detecting abnormal behavior in social network websites by using a process mining technique. Journal of Computer Science, 10(3), 393-402. https://doi.org/10.3844/jcssp.2014.393.402
Brito, F., Petiz, I., Salvador, P., Nogueira, A., & Rocha, E. (2013). Detecting social-network bots based on multiscale behavioral analysis. In Proceedings of the 7th International Conference on Emerging Security Information, Systems and Technologies (SECURWARE), 81-85.
Huang, T.K., Rahman, M.S., Madhyastha, H.V., Faloutsos, M., & Ribeiro, B. (2013). An analysis of socware cascades in online social networks. In Proceedings of the 22nd International Conference on World Wide Web, 619-630. https://doi.org/10.1145/2488388.2488443
Gao, H., Yang, Y., Bu, K., Chen, Y., Downey, D., Lee, K., & Choudhary, A. (2014). Spam ain't as diverse as it seems: throttling OSN spam with templates underneath. In Proceedings of the 30th Annual Computer Security Applications Conference, 76-85.
Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Communications of the ACM, 59(7), 96-104. https://doi.org/10.1145/2818717
Hwang, T., Pearce, I., & Nanis, M. (2012). Socialbots: Voices from the fronts. Interactions, 19(2), 38-45. https://doi.org/10.1145/2090150.2090161
Zhou, Y., Kim, D.W., Zhang, J., Liu, L., Jin, H., Jin, H., & Liu, T. (2017). ProGuard: Detecting malicious accounts in social-network-based online promotions. IEEE Access, 5, 1990-1999. https://doi.org/10.1109/ACCESS.2017.2654272
Zhang, Z., Li, C., Gupta, B.B., & Niu, D. (2018). Efficient compressed ciphertext length scheme using multi-authority CP-ABE for hierarchical attributes.
IEEE Access, 6, 38273-38284. https://doi.org/10.1109/ACCESS.2018.2854600
Cai, C., Li, L., & Zeng, D. (2017). Behavior enhanced deep bot detection in social media. In IEEE International Conference on Intelligence and Security Informatics (ISI), 128-130. https://doi.org/10.1109/ISI.2017.8004887
Chang, C.K. (2016). Situation analytics: a foundation for a new software engineering paradigm. Computer, 49(1), 24-33. https://doi.org/10.1109/MC.2016.21
Zhang, Z., Sun, R., Wang, X., & Zhao, C. (2017). A situational analytic method for user behavior pattern in multimedia social networks. IEEE Transactions on Big Data, 5(4), 520-528. https://doi.org/10.1109/TBDATA.2017.2657623
Barbon Jr., S., Campos, G.F., Tavares, G.M., Igawa, R.A., Proença Jr., M.L., & Guido, R.C. (2018). Detection of human, legitimate bot, and malicious bot in online social networks based on wavelets. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 14(1s), 1-17. https://doi.org/10.1145/3183506
Park, J.Y., O'Hare, N., Schifanella, R., Jaimes, A., & Chung, C.W. (2015). A large-scale study of user image search behavior on the web. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 985-994. https://doi.org/10.1145/2702123.2702527