Legitimate or not : learning the status of online pharmacies
- Author
- Muthupandi, Sowmyasri
- Published
- [University Park, Pennsylvania] : Pennsylvania State University, 2016.
- Physical Description
- 1 electronic document
- Additional Creators
- Kumara, Soundar T., 1952-
Access Online
- etda.libraries.psu.edu , Connect to this object online.
- Restrictions on Access
- Open Access.
- Summary
- Previous research has indicated the presence of illegitimate online pharmacies on the World Wide Web, which may be involved in the sale of counterfeit or substandard products that have the potential for drug abuse. This thesis studies the relative usage of legitimate and illegitimate pharmacies and develops an automated classification system by mining web data. The lists of safe and rogue pharmacies are obtained from the National Association of Boards of Pharmacy (NABP), and web data is obtained using Similarweb and SEMrush. The referral data is used to develop the classification model. Along with an intuitive algorithm, the Rating Method (RM), K-Nearest Neighbor (KNN), Linear Discriminant Analysis (LDA), Logistic Regression (LR) and Support Vector Machine (SVM) models are developed and validated using leave-one-out cross-validation on a dataset with 157 samples: 30 legitimate and 157 illegitimate pharmacies. RM had higher accuracy (95.42%), kappa (0.8635), and specificity (96.67%) than the other models. KNN had the highest sensitivity (99.2%) and the second-highest accuracy (92.2%) and kappa (0.715). KNN and RM were then applied to a larger dataset with 50 legitimate and 1136 illegitimate pharmacies. On the larger dataset, KNN performed better than RM, with accuracy, kappa, sensitivity and specificity of 98.73%, 0.8085, 100% and 68.75%, respectively. This thesis uses static data, a snapshot of the World Wide Web, to classify online pharmacies. Future research can consider dynamic data, such as changes in the websites referring to the online pharmacies, to improve the classification. In addition, the products sold by the pharmacies and the anomalies can be used to increase the granularity of the classification model.
- Dissertation Note
- M.S. Pennsylvania State University, 2016.
- Technical Details
- The full text of the dissertation is available as an Adobe Acrobat .pdf file ; Adobe Acrobat Reader required to view the file.
catkey: 18372746
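
The evaluation procedure named in the summary (leave-one-out cross-validation of a KNN classifier, scored by accuracy, kappa, sensitivity and specificity) can be illustrated with a minimal sketch. This is not the thesis's implementation: the data points, the two features, the choice of k = 3, and the convention that "illegitimate" is the positive class are all assumptions made for illustration only.

```python
import math

# Hypothetical toy data: (feature vector, label). Label 1 = illegitimate,
# 0 = legitimate. The two features are made-up stand-ins for referral-based
# signals; they are not taken from the thesis.
TOY_SAMPLES = [
    ((0.90, 0.80), 1), ((0.80, 0.90), 1), ((0.85, 0.70), 1),
    ((0.70, 0.85), 1), ((0.95, 0.90), 1), ((0.35, 0.30), 1),
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.15, 0.25), 0),
    ((0.25, 0.15), 0),
]


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes_for_positive = sum(label for _, label in nearest)
    return 1 if 2 * votes_for_positive > k else 0


def leave_one_out(samples, k=3):
    """Predict each sample from all the others; return (tp, tn, fp, fn)."""
    tp = tn = fp = fn = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # hold out sample i
        pred = knn_predict(train, features, k)
        if label == 1:
            tp, fn = (tp + 1, fn) if pred == 1 else (tp, fn + 1)
        else:
            tn, fp = (tn + 1, fp) if pred == 0 else (tn, fp + 1)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Accuracy, Cohen's kappa, sensitivity and specificity from counts.

    Assumes both classes occur and that chance agreement is below 1,
    so none of the denominators is zero.
    """
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
        "sensitivity": tp / (tp + fn),  # share of positives detected
        "specificity": tn / (tn + fp),  # share of negatives detected
    }


if __name__ == "__main__":
    counts = leave_one_out(TOY_SAMPLES, k=3)
    for name, value in metrics(*counts).items():
        print(f"{name}: {value:.4f}")
```

Kappa is reported alongside accuracy because the classes are heavily imbalanced: with far more illegitimate than legitimate pharmacies, a classifier can reach high accuracy by favoring the majority class, and kappa discounts the agreement that would be expected by chance.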