2nd International Conference on Operations Research and Applications (CORAJ 2024)

December 28 ~ 29, 2024, Dubai, UAE

Accepted Papers


Sentiment Analysis Using Various Machine Learning Models and Techniques

Mohammad Mozammal Huq, Statistics Department, Jahangirnagar University, Bangladesh

ABSTRACT

This research study examines the usefulness of several machine learning models and strategies for sentiment analysis. The gathered data and analysis offer insight into sentiment analysis and its application to a large number of unlabeled customer reviews and comments on Amazon products. To categorize the sentiment of the reviews, the paper proposes a supervised research model that includes two different feature extractors. A thorough overview of pertinent literature on sentiment analysis using text-based datasets, the core theory of the model, the analysis techniques, and the performance standards are all presented. The experiments conducted on a small dataset yielded promising results, with the random forest model achieving an accuracy of over 82 percent. The comparison of different data quantities using cross-validation, varied training-testing ratios, and various feature extraction methods contributed to the robustness of the findings.

Keywords

Sentiment Analysis, Machine Learning, Text Classification, NLP.


A Hybrid Recommender System Integrating Collaborative Filtering, Content-based Filtering, and Deep Learning Techniques for Cold Start Scenarios

Yashodeep Basnet, University of Northampton, UON, Kathmandu, Nepal

ABSTRACT

Recommender systems are pivotal in enhancing user experiences across digital platforms by providing personalized content. However, the "cold-start" problem, where systems struggle to generate meaningful recommendations for new users or items due to limited historical data, remains a significant challenge. This paper presents a novel hybrid recommender system that integrates collaborative filtering, content-based filtering, and deep learning techniques to address cold-start scenarios effectively. Utilizing the MovieLens 100K dataset, the proposed system leverages the strengths of each methodology to improve recommendation accuracy and robustness. Experimental results demonstrate that the hybrid model outperforms traditional methods, achieving lower error rates and higher precision, recall, and F1-scores, thereby validating its efficacy in handling data sparsity and enhancing user satisfaction.
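The precision, recall, and F1-score named in this abstract are standard evaluation metrics. The following is a minimal sketch of how they could be computed for one user's recommendation list; it is not the authors' evaluation code, and the function name and the item IDs in the usage note are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Return (precision, recall, F1) for one user's recommendation list.

    recommended: item IDs the system suggested (hypothetical input).
    relevant:    item IDs the user actually found relevant (hypothetical input).
    """
    recommended = set(recommended)
    relevant = set(relevant)
    hits = len(recommended & relevant)  # items that are both recommended and relevant

    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```

For example, `precision_recall_f1([1, 2, 3, 4], [2, 4, 5])` has two hits, giving a precision of 1/2, a recall of 2/3, and an F1-score of 4/7.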

Keywords

Recommender systems, collaborative filtering, content-based filtering, deep learning, cold-start problem


Exploring High-accuracy Lung Cancer Risk Prediction: A Multi-model Approach with SHAP Interpretation

Haseebullah Jumakhan and Lana Weiss, Artificial Intelligence Research Center (AIRC), Ajman University, Ajman, United Arab Emirates

ABSTRACT

Lung cancer remains a leading cause of cancer-related deaths worldwide. This study employs data mining techniques to analyze and predict lung cancer risk based on various patient attributes. Using a dataset of 1000 patients, we explore the relationships between factors such as air pollution exposure, smoking habits, and genetic risk with the likelihood of developing lung cancer. We implement and compare four machine learning models: Multinomial Logistic Regression, Random Forest, Naive Bayes, and a Neural Network. Our findings demonstrate the potential of these models in predicting lung cancer risk, with the Random Forest and Multinomial Logistic Regression models showing particularly high accuracy. This research contributes to the growing body of work on early lung cancer detection and risk assessment, potentially aiding in more timely and effective interventions.

Keywords

Lung Cancer Risk Prediction, Machine Learning, Data Mining, Model Comparison, SHAP Analysis.


Predicting Student Grades

Hifsa Malik and Ruba Qasim, Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman P.O. Box 346, United Arab Emirates

ABSTRACT

This report presents our project on using machine learning models to forecast student performance. The dataset comprises a wide range of academic, social, and demographic factors of secondary school students in Portugal, largely with regard to their Mathematics performance. Fortunately, data models are available that can be used to analyse and predict student performance. First, the dataset is pre-processed using various data cleaning techniques; regression-based models, including Linear Regression, Random Forest, Gradient Boosting, Support Vector Regressor, and XGBoost, are then applied for grade prediction. This model will allow schools to have a predictive view of students’ performance at the beginning of every year, so that they can set an improvement plan for students.

Keywords

Linear regression, random forest, feature selection, preprocessing, visualization, modelling, cross validation, outliers, standardization, SMOTE, feature importance, gradient boosting, SVR, XGBoost, MAE, MSE, R2.


Multi-view Approach with Transformer Models and Augmented Embeddings for Tackling Imbalanced Multi-label Datasets

Michael Abobor and Darsana P. Josyula, Department of Computer Science, Bowie State University, Bowie, USA

ABSTRACT

Imbalanced datasets present significant challenges in machine learning. The disproportionate distribution of labels in imbalanced multi-label datasets results from the small number of data points in the minority class. This leads to biases in model predictions, as algorithms tend to favor the majority class, resulting in poor generalization for the minority class. Any effort to balance the inequality within each class can inadvertently create issues across the other classes. This paper introduces a multi-view learning approach that combines pre-trained large language models and embeddings augmented with techniques such as SMOTE, MLeNN, MLSMOTE, MLSOL, and MLTL to address the issue of imbalanced multi-label datasets in classification. The dual-input model combines the original tokenized text and the augmented embeddings extracted from the penultimate layer of the transformer, giving the model the ability to learn from both sources of information. This approach conserves the contextual significance of the input text and makes it possible to train transformers with the augmented embeddings, thereby tackling the issue of imbalanced multi-class datasets.
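Of the augmentation techniques listed in this abstract, SMOTE is the simplest to illustrate: it creates a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. The sketch below applies that idea to embedding vectors stored as plain Python lists. It is an illustration under stated assumptions, not the authors' pipeline: the function name, the choice of `k`, and the toy vectors in the usage note are hypothetical, and the label-set handling required by the multi-label variants (MLSMOTE and others) is omitted.

```python
import math
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic vector from a list of minority-class vectors.

    minority: list of equal-length numeric lists (e.g. embeddings); assumed input.
    k:        number of nearest neighbours to choose from (hypothetical default).
    rng:      optional random.Random instance for reproducibility.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)

    base = rng.choice(minority)
    # Rank the other minority points by Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: math.dist(base, p))
    neighbour = rng.choice(others[:k])

    # Interpolate: base + lam * (neighbour - base), with lam drawn from [0, 1).
    lam = rng.random()
    return [b + lam * (n - b) for b, n in zip(base, neighbour)]
```

Calling `smote_sample([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])` returns a new two-dimensional point lying on the segment between two of the input points.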

Keywords

Imbalanced datasets, Multi-label, Transformer, Augmented Embeddings, Machine Learning.


Computer Networks Cybersecurity Monitoring based on CNN-LSTM Model

Rasim Alguliyev and Ramiz Shikhaliyev, Ministry of Science and Education Republic of Azerbaijan, Institute of Information Technology, Baku, Azerbaijan

ABSTRACT

Cybersecurity monitoring is essential for safeguarding computer networks. However, the increasing scale, complexity, and data volume of modern networks present significant challenges for traditional monitoring methods. To address these challenges, we propose a deep learning-based method for network security monitoring. Our method integrates convolutional neural networks (CNNs) with long short-term memory (LSTM) models. Trained on the CICIDS2017 dataset, the proposed model achieved a classification accuracy of 96.76% and an error rate of 9.34%, showcasing its effectiveness in managing complex and voluminous network data.

Keywords

Computer Networks, Computer Network Cybersecurity Monitoring, Deep Learning Model, CNN-LSTM Model, Network Traffic Classification


Getting LLM to Think and Act Like a Human Being: Logical Path Reasoning and Replanning

Lin Zhang, Qing Li, Yang Wang, and Jingmei Zhao, Southwestern University of Finance and Economics, Chengdu, Sichuan, China

ABSTRACT

Large Language Models (LLMs) have significant reasoning capabilities and can act as agents interacting with the real world. However, they are often segmented and, unlike humans, lack integrated systems for validating their thoughts and actions. This limitation often leads LLMs to encounter “local optima” in task performance. To mitigate this problem, we propose a replanning mechanism for LLM-based agents that dynamically incorporates feedback from actions and exploits implicit information not initially available in the reasoning framework. This approach effectively bridges the gap between the cognitive and action phases of LLMs. Experimental results on real-world ticket booking platforms such as Ctrip.com and Booking.com show that our method exhibits greater robustness in following clear instructions, successfully completing more steps, and achieving a higher success rate in practical applications, especially in complex tasks requiring interactive reasoning and action.

Keywords

LLM, Agents, Replanning, Reaction, Logical path reasoning.


Feasibility of Energy Recovery From Exhausted Air of HVAC Systems

Anwur Alenezi1, Abouelyazed Kuliab2 and Yousef Alabaiadly3, 1Water Research Centre, Kuwait Institute for Scientific Research, Safat 13109, Kuwait, 2New & Renewable Energy Authority, Ministry of Electricity & Renewable Energy, Egypt, 3Studies and Research Department, Ministry of Electricity and Water, Ministries Zone 12010, Kuwait

ABSTRACT

This paper experimentally investigates the feasibility of recovering energy from the exhaust air of central air-conditioning plants. A small vertical-axis wind turbine is connected directly to the exhaust air of the condenser fan. The exhaust air energy recovery unit comprises rotor blades coupled to a generator. The electricity produced by the recovery unit depends on the fan speed and its air flow rate (CFM). The exhaust air from different types of central A/C systems is measured experimentally. For example, the speed of the air exhausted from a central package unit with a capacity of 5 tons of refrigeration (TR) reaches 8-15 m/s. The wind speed has a significant effect on the performance of the wind turbine. A small vertical wind turbine was installed on the exhaust air flow from the condenser fan of a 10-TR central package unit. The proposed exhaust air energy recovery unit produces 35-40% of the total energy consumption of a building having an A/C plant.
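The strong dependence of turbine output on air speed can be illustrated with the standard wind power relation P = 0.5 · ρ · A · v³ · Cp. The sketch below is a rough estimate only, not the authors' measurement model: the air density (1.2 kg/m³), the swept area (0.5 m²), and the power coefficient (0.3) are assumed values chosen for illustration, and only the 8-15 m/s speed range comes from the abstract.

```python
def wind_power_watts(speed_m_s, swept_area_m2,
                     air_density=1.2, power_coefficient=0.3):
    """Estimate turbine output from the kinetic power of an air stream.

    P = 0.5 * rho * A * v^3 * Cp
    air_density (kg/m^3) and power_coefficient are assumed, not measured.
    """
    return (0.5 * air_density * swept_area_m2
            * speed_m_s ** 3 * power_coefficient)


# Exhaust speeds reported in the abstract; the swept area is hypothetical.
low = wind_power_watts(8.0, swept_area_m2=0.5)
high = wind_power_watts(15.0, swept_area_m2=0.5)
print(f"8 m/s: {low:.1f} W, 15 m/s: {high:.1f} W, ratio: {high / low:.2f}")
```

Because power scales with the cube of the speed, moving from 8 m/s to 15 m/s raises the estimated output by a factor of roughly 6.6, regardless of the assumed area and coefficient.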

Keywords

Energy recovery, wind turbine, exhaust air, HVAC systems.


A Survey Paper Exploring IT Outsourcing Models and Market Trends

Merita Bakiji, Faculty of Contemporary Sciences and Technologies, South East European University, Tetovo, North Macedonia

ABSTRACT

Driven by the boom in global business and rapid technological developments, IT Outsourcing emerged from organizations' attempts to reduce operational costs and increase efficiency through external expertise. This study explores the current models of IT Outsourcing, detailing their sustainability and suitability in different market environments. It does so by relying on a comprehensive review of the existing literature, articles, and studies on IT Outsourcing, industry reports, consultancy reports, and technological trends and their impacts on the market. The study also analyzes the IT Outsourcing industry map in the Republic of North Macedonia, revealing the IT Outsourcing market and trends. By synthesizing existing research and data, this paper presents a valuable resource for decision makers in IT outsourcing, providing practical recommendations that can serve organizations that are constantly trying to adapt to rapidly changing market conditions.

Keywords

IT Outsourcing, Artificial Intelligence, Market Trends, North Macedonia.


Machine Learning Classification Using Motif Based Graph Databases Created From UWF-ZeekData22

Sikha S. Bagui1, Dustin Mink2, Subhash C. Bagui3, Jadarius Hill1, Farooq Mahmud1 and Michael Plain13, 1Department of Computer Science, University of West Florida, Pensacola, Florida, USA, 2Department of Cybersecurity, University of West Florida, Pensacola, Florida, USA, 3Department of Mathematics and Statistics, University of West Florida, Pensacola, Florida, USA

ABSTRACT

This study uses motif-based graph databases to classify tactics in the MITRE ATT&CK framework. Machine Learning classification models capable of detecting Reconnaissance network attack tactics, labelled as per the MITRE ATT&CK framework, are created for the newly created UWF-ZeekData22 dataset. The work analyzes Zeek Connection logs. Feature selection is performed using graph motifs. Results show that model performance can be increased using various network graph motifs. Upon completion of this work, it was concluded that the most important feature for predicting Reconnaissance network attacks within the Zeek Connection Logs dataset was the “From” feature, which represents the network address from where the connection is originating. It was also determined that, irrespective of which motif was used to train the model, the Decision Tree algorithm performed best.

Keywords

Graph Databases, Motifs, Reconnaissance, Machine Learning, Cybersecurity.


Reliable Information Sources Consulted by Nurses at the Point of Care in Four Selected South African Referral Hospitals

N Chitha, O R Mnyaka, J T Thabethe, N Ntsele, S Nomatshila, W Chitha, NV Khosa, and R Tshabalala, Department of Public Health, Faculty of Medicine and Health Science, University of Walter Sisulu, Mthatha, South Africa

ABSTRACT

Information is a crucial tool for nurses, and how they acquire and use it is key to their performance. Professional nurses need information that is accessible, of good quality, up-to-date, manageable, and relevant, as well as information services that assist in finding that information. This study assessed the most reliable information sources nurses use to make clinical decisions at the point of care. A quantitative cross-sectional survey was conducted in four referral hospitals across four South African provinces (Mpumalanga, Limpopo, Eastern Cape, and Northern Cape) between May and July 2022. The hospitals were identified using simple random sampling. Stratified random sampling was used to select nurses within the hospitals. Data were entered into Microsoft Excel and analysed using STATA version 17 and SPSS version 26. Nurses mostly relied on nursing colleagues (86.8%, 362/417) or doctors (78.4%, 309/394) for information, whilst they sometimes consulted protocols or guidelines (63.5%, 247/389).

Keywords

Reliable Information, Information Sources, Nurses Information, Health Information.