Mohammad Mozammal Huq, Statistics Department, Jahangirnagar University, Bangladesh
This study examines the usefulness of several machine learning models and strategies for sentiment analysis. The gathered data and analysis offer insight into sentiment analysis and its application to a large number of unlabeled customer reviews and comments on Amazon products. To categorize the sentiment of the reviews, the paper proposes a supervised research model that includes two different feature extractors. The paper presents a thorough overview of pertinent literature on sentiment analysis using text-based datasets, the core theory of the model, analysis techniques, and performance standards. The experiments conducted on a small dataset yielded promising results, with the random forest model achieving an accuracy of over 82 percent. The comparison of different data quantities using cross-validation, varied training-testing ratios, and various feature extraction methods contributed to the robustness of the findings.
Sentiment Analysis, Machine Learning, Text Classification, NLP.
Yashodeep Basnet, University of Northampton, UON, Kathmandu, Nepal
Recommender systems are pivotal in enhancing user experiences across digital platforms by providing personalized content. However, the "cold-start" problem, where systems struggle to generate meaningful recommendations for new users or items due to limited historical data, remains a significant challenge. This paper presents a novel hybrid recommender system that integrates collaborative filtering, content-based filtering, and deep learning techniques to address cold-start scenarios effectively. Utilizing the MovieLens 100K dataset, the proposed system leverages the strengths of each methodology to improve recommendation accuracy and robustness. Experimental results demonstrate that the hybrid model outperforms traditional methods, achieving lower error rates and higher precision, recall, and F1-scores, thereby validating its efficacy in handling data sparsity and enhancing user satisfaction.
Recommender systems, collaborative filtering, content-based filtering, deep learning, cold-start problem
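One simple way to picture how collaborative and content-based signals can be combined for cold-start users is a weighted blend in which the collaborative score counts for more as a user accumulates ratings. The sketch below illustrates only that general idea. The function name `hybrid_score`, the saturation constant `k`, and the blending rule are assumptions made for illustration and are not the model described in the abstract.

```python
def hybrid_score(cf_score, content_score, n_ratings, k=20):
    """Blend a collaborative-filtering score with a content-based score.

    The weight on the collaborative score grows with the number of ratings
    the user has supplied, so a brand-new user (n_ratings == 0) relies
    entirely on the content-based score. `k` is a hypothetical saturation
    constant: at n_ratings == k both scores are weighted equally.
    """
    if n_ratings < 0:
        raise ValueError("n_ratings must be non-negative")
    alpha = n_ratings / (n_ratings + k)  # 0 for a new user, approaches 1 with history
    return alpha * cf_score + (1 - alpha) * content_score


# A new user gets the content-based score; a user with k ratings gets the midpoint.
new_user = hybrid_score(cf_score=4.0, content_score=3.0, n_ratings=0)
known_user = hybrid_score(cf_score=4.0, content_score=3.0, n_ratings=20)
```

A learned model could replace the fixed rule for `alpha`; the fixed rule is used here only to keep the example short.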
Haseebullah Jumakhan and Lana Weiss, Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates
Lung cancer remains a leading cause of cancer-related deaths worldwide. This study employs data mining techniques to analyze and predict lung cancer risk based on various patient attributes. Using a dataset of 1000 patients, we explore the relationships between factors such as air pollution exposure, smoking habits, and genetic risk with the likelihood of developing lung cancer. We implement and compare four machine learning models: Multinomial Logistic Regression, Random Forest, Naive Bayes, and a Neural Network. Our findings demonstrate the potential of these models in predicting lung cancer risk, with the Random Forest and Multinomial Logistic Regression models showing particularly high accuracy. This research contributes to the growing body of work on early lung cancer detection and risk assessment, potentially aiding in more timely and effective interventions.
Lung Cancer Risk Prediction, Machine Learning, Data Mining, Model Comparison, SHAP Analysis.
Hifsa Malik and Ruba Qasim, Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman P.O. Box 346, United Arab Emirates
This report presents our project on using machine learning models to forecast student performance. The dataset comprises a wide range of academic, social, and demographic factors for secondary school students in Portugal, mainly with regard to their Mathematics performance. Data-driven models can be used to analyse and predict student performance. First, the dataset is pre-processed with various data cleaning techniques. Regression-based models, namely Linear Regression, Random Forest, Gradient Boosting, Support Vector Regressor, and XGBoost, are then applied for grade prediction. Such a model would give schools a predictive view of students' performance at the beginning of every year, so that they can set an improvement plan for students.
Linear regression, random forest, feature selection, preprocessing, visualization, modelling, cross validation, outliers, standardization, SMOTE, feature importance, gradient boosting, SVR, XGBoost, MAE, MSE, R2.
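The keywords above name MAE, MSE, and R2 as evaluation metrics for grade prediction. As a minimal sketch of how these three quantities are computed from true and predicted grades, the following uses only the standard definitions. The function name `regression_metrics` and the sample grades are hypothetical and are not taken from the study.

```python
def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, and R2 for paired lists of true and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    ss_res = sum(e * e for e in errors)     # residual sum of squares
    mse = ss_res / n                        # mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R2 is undefined when all true values are identical.
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MAE": mae, "MSE": mse, "R2": r2}


# Hypothetical grades, for illustration only.
metrics = regression_metrics([10, 12, 14], [11, 12, 13])
```

With these sample values the errors are -1, 0, and 1, so MAE and MSE are both 2/3 and R2 is 0.75.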
Michael Abobor and Darsana P. Josyula, Department of Computer Science, Bowie State University, Bowie, USA
Imbalanced datasets present significant challenges in machine learning. The disproportionate distribution of labels in imbalanced multi-label datasets results from the small number of data points in the minority class. This biases model predictions, because algorithms tend to favor the majority class, and leads to poor generalization for the minority class. Any effort to balance one class can inadvertently create issues across the other classes. This paper introduces a multi-view learning approach that combines pre-trained large language models with embeddings augmented by techniques such as SMOTE, MLeNN, MLSMOTE, MLSOL, and MLTL, in order to address imbalanced multi-label datasets in classification. The dual-input model combines the original tokenized text with the augmented embeddings extracted from the penultimate layer of the transformer, giving the model the ability to learn from both sources of information. This approach preserves the contextual significance of the input text and makes it possible to train transformers with the augmented embeddings, thereby tackling the issue of imbalanced multi-class datasets.
Imbalanced datasets, Multi-label, Transformer, Augmented Embeddings, Machine Learning.
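Of the augmentation techniques listed in the abstract above, SMOTE is the simplest to illustrate: a synthetic minority sample is placed at a random point on the line segment between a minority vector and one of its nearest minority neighbours. The sketch below shows only that basic interpolation step on plain Python lists standing in for embeddings. The function name `smote_sample` and the toy vectors are hypothetical, and the multi-label variants named in the abstract (MLSMOTE, MLSOL, and others) involve additional label handling that is not shown.

```python
import math
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic vector by SMOTE-style interpolation.

    minority: list of equal-length numeric vectors from the minority class.
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority vectors")
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority vectors by Euclidean distance to the base vector.
    others = [v for v in minority if v is not base]
    others.sort(key=lambda v: math.dist(base, v))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]


# Toy 2-D "embeddings", for illustration only.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote_sample(toy_minority, k=2, rng=random.Random(1))
```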
Rasim Alguliyev and Ramiz Shikhaliyev, Ministry of Science and Education Republic of Azerbaijan, Institute of Information Technology, Baku, Azerbaijan
Cybersecurity monitoring is essential for safeguarding computer networks. However, the increasing scale, complexity, and data volume of modern networks present significant challenges for traditional monitoring methods. To address these challenges, we propose a deep learning-based method for network security monitoring. Our method integrates convolutional neural networks (CNNs) with long short-term memory (LSTM) models. Trained on the CICIDS2017 dataset, the proposed model achieved a classification accuracy of 96.76% and an error rate of 9.34%, showcasing its effectiveness in managing complex and voluminous network data.
Computer Networks, Computer Network Cybersecurity Monitoring, Deep Learning Model, CNN-LSTM Model, Network Traffic Classification
Lin Zhang, Qing Li, Yang Wang, and Jingmei Zhao, Southwestern University of Finance and Economics, Chengdu, Sichuan, China
Large Language Models (LLMs) have significant reasoning capabilities and can act as agents interacting with the real world. However, they are often segmented and, unlike humans, lack integrated systems for validating their thoughts and actions. This limitation often leads LLMs to encounter “local optima” in task performance. To mitigate this problem, we propose a replanning mechanism for LLM-based agents that dynamically incorporates feedback from actions and exploits implicit information not initially available in the reasoning framework. This approach effectively bridges the gap between the cognitive and action phases of LLMs. Experimental results on real-world ticket booking platforms such as Ctrip.com and Booking.com show that our method is more robust in following clear instructions, completes more steps, and achieves a higher success rate in practical applications, especially in complex tasks requiring interactive reasoning and action.
LLM, Agents, Replanning, Reaction, Logical path reasoning.
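The replanning idea in the abstract above, in which feedback from an executed action is fed back into the plan, can be pictured as a simple control loop. The sketch below is a generic illustration under stated assumptions and not the authors' mechanism. The names `run_with_replanning`, `execute`, and `replan`, the retry limit, and the toy booking steps are all hypothetical, and in a real agent `execute` and `replan` would call a browser or an LLM.

```python
def run_with_replanning(plan, execute, replan, max_replans=3):
    """Run plan steps in order and revise the remaining plan when a step fails.

    execute(step) -> (ok, feedback)
    replan(remaining_steps, feedback) -> new list of steps
    Returns (completed_steps, success).
    """
    completed = []
    queue = list(plan)
    replans = 0
    while queue:
        step = queue.pop(0)
        ok, feedback = execute(step)
        if ok:
            completed.append(step)
            continue
        if replans >= max_replans:
            return completed, False  # give up after too many revisions
        replans += 1
        # Hand the failed step, the rest of the plan, and the feedback to the planner.
        queue = replan([step] + queue, feedback)
    return completed, True


# Toy stand-ins for the environment and the planner.
def toy_execute(step):
    if step == "select_date":
        return False, "date unavailable"
    return True, "ok"


def toy_replan(remaining, feedback):
    # Swap the failed first step for an alternative and keep the rest.
    return ["select_alternate_date"] + remaining[1:]


done, success = run_with_replanning(
    ["search", "select_date", "pay"], toy_execute, toy_replan
)
```

In this toy run the failed `select_date` step is replaced by `select_alternate_date`, and the remaining steps then complete.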
Anwur Alenezi1, Abouelyazed Kuliab2 and Yousef Alabaiadly3, 1Water Research Centre, Kuwait Institute for Scientific Research, Safat 13109, Kuwait, 2New & Renewable Energy Authority, Ministry of Electricity & Renewable Energy, Egypt, 3Studies and Research Department, Ministry of Electricity and Water, Ministries Zone 12010, Kuwait
This paper experimentally investigates the feasibility of recovering energy from the exhaust air of central air-conditioning plants. A small vertical-axis wind turbine is connected directly to the exhaust air of the condenser fan. The exhaust air energy recovery unit consists of rotor blades coupled to a generator. The electricity produced by the recovery unit depends on the fan speed and its air flow rate (CFM). The exhaust air from different types of central A/C systems is measured experimentally. For example, the exhaust air speed of a central package unit with a capacity of 5 tons of refrigeration (TR) reaches 8-15 m/s. The wind speed has a significant effect on the performance of the wind turbine. A small vertical wind turbine was installed on the exhaust of the air flowing from the condenser fan of a 10-TR central package unit. The proposed exhaust air energy recovery unit produced 35-40% of the total energy consumption of a building with an A/C plant.
Energy recovery, wind turbine, exhaust air, HVAC systems.
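The abstract above notes that wind speed strongly affects turbine performance. The standard relation for the power a turbine extracts from an air stream, P = 0.5 · ρ · A · v³ · Cp, makes this concrete because power scales with the cube of the air speed. The sketch below evaluates that relation. The swept area, the power coefficient `cp`, and the air density are illustrative assumptions and are not measurements from the study.

```python
BETZ_LIMIT = 16 / 27  # theoretical maximum power coefficient for a wind turbine


def wind_power_watts(speed_m_s, swept_area_m2, cp=0.3, air_density=1.225):
    """Power extracted from an air stream: 0.5 * rho * A * v**3 * Cp.

    cp and air_density (kg/m^3) are assumed values for illustration.
    """
    if not 0 <= cp <= BETZ_LIMIT:
        raise ValueError("cp must lie between 0 and the Betz limit")
    return 0.5 * air_density * swept_area_m2 * speed_m_s ** 3 * cp


# Compare the two ends of the 8-15 m/s exhaust speed range for an assumed 1 m^2 rotor.
low = wind_power_watts(8, swept_area_m2=1.0)
high = wind_power_watts(15, swept_area_m2=1.0)
ratio = high / low  # equals (15 / 8) ** 3, about 6.6
```

Because of the cubic dependence, the upper end of the reported speed range carries roughly 6.6 times the power of the lower end for the same rotor.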
Merita Bakiji, Faculty of Contemporary Sciences and Technologies, South East European University, Tetovo, North Macedonia
Driven by the boom in global business and rapid technological development, IT Outsourcing emerged from organizations' attempts to reduce operational costs and increase efficiency through external expertise. This study explores the current models of IT Outsourcing, detailing their sustainability and suitability in different market environments. It relies on a comprehensive summary of the existing literature, articles, and studies on IT Outsourcing, as well as industry reports, consultancy reports, and technological trends and their impacts on the market. The study also analyzes the IT Outsourcing industry map in the Republic of North Macedonia, revealing the IT Outsourcing market and its trends. By synthesizing existing research and data, this paper offers a resource for decision makers in IT outsourcing, providing practical recommendations for organizations that are constantly trying to adapt to rapidly changing market conditions.
IT Outsourcing, Artificial Intelligence, Market Trends, North Macedonia.
Sikha S. Bagui1, Dustin Mink2, Subhash C. Bagui3, Jadarius Hill1, Farooq Mahmud1 and Michael Plain13, 1Department of Computer Science, University of West Florida, Pensacola, Florida, USA, 2Department of Cybersecurity, University of West Florida, Pensacola, Florida, USA, 3Department of Mathematics and Statistics, University of West Florida, Pensacola, Florida, USA
This study uses motif-based graph databases to classify tactics in the MITRE ATT&CK framework. Machine Learning classification models capable of detecting Reconnaissance network attack tactics, labelled as per the MITRE ATT&CK framework, are created for the newly created UWF-ZeekData22 dataset. The work analyzes Zeek Connection logs. Feature selection is performed using graph motifs. Results show that model performance can be increased using various network graph motifs. Upon completion of this work, it was concluded that the most important feature for predicting Reconnaissance network attacks within the Zeek Connection Logs dataset was the “From” feature, which represents the network address from where the connection is originating. It was also determined that, irrespective of which motif was used to train the model, the Decision Tree algorithm performed best.
Graph Databases, Motifs, Reconnaissance, Machine Learning, Cybersecurity.
N Chitha, O R Mnyaka, J T Thabethe, N Ntsele, S Nomatshila, W Chitha, NV Khosa, and R Tshabalala, Department of Public Health, Faculty of Medicine and Health Science, University of Walter Sisulu, Mthatha, South Africa
Information is a crucial tool for nurses, and how they acquire and use it is key to their performance. Professional nurses need information that is accessible, of good quality, up-to-date, manageable, and relevant, as well as information services that help them find it. This study assessed the most reliable information sources nurses use to make clinical decisions at the point of care. A quantitative cross-sectional survey was conducted in four referral hospitals across four South African provinces (Mpumalanga, Limpopo, Eastern Cape, and Northern Cape) between May and July 2022. The hospitals were identified using simple random sampling, and stratified random sampling was used to select nurses within the hospitals. Data were entered into Microsoft Excel and analysed using STATA version 17 and SPSS version 26. Nurses mostly relied on nursing colleagues (86.8%, 362/417) or doctors (78.4%, 309/394) for information, whilst they sometimes consulted protocols or guidelines (63.5%, 247/389).
Reliable Information, Information Sources, Nurses Information, Health Information.
Copyright © CORAJ 2024