Utilizing machine learning for the assessment of mosquito repellent effectiveness and decision support in product selection

Dharmendra Kumar; Ankit Verma; Mukesh Kumar; Vipin Maurya; Ashutosh Mishra

doi:10.22712/susb.20230040


General Article

International Journal of Sustainable Building Technology and Urban Development. 30 December 2023. 519-533


Dharmendra Kumar¹

Ankit Verma²

Mukesh Kumar³

Vipin Maurya⁴

Ashutosh Mishra⁵,⁶*

¹Assistant Professor, Department of Electrical and Instrumentation Engineering, Thapar Institute of Engineering and Technology (TIET), Patiala, Punjab, India

²Researcher, Department of Electronics Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India

³Assistant Professor, School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT) Deemed to be University, Bhubaneswar, Odisha, India

⁴Researcher, Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India

⁵Research Professor, School of Integrated Technology, Yonsei University, South Korea

⁶Adjunct Professor, Department of Electronics & Communication Engineering, Graphic Era Deemed to be University, Dehradun, 248002, Uttarakhand, India

*Corresponding Author

ABSTRACT

The significance of effective insect repellents is underscored by the substantial threat posed by mosquito-borne diseases to global health. Machine learning (ML) algorithms have garnered attention as a potential approach for classifying insect repellents and evaluating their effectiveness. In this study, we explore the application of decision tree (DT), random forest (RF), logistic regression (LR), K-Nearest Neighbor (KNN), Naive Bayes (NB), AdaBoost (AB), Multilayer Perceptron (MLP), and Support Vector Machine (SVM) algorithms for categorizing insect repellents. ML techniques are employed to classify mosquito repellents by training various ML classification models using a labeled dataset in which each repellent is assigned a specific effectiveness level. These algorithms can identify patterns and determine the efficacy of a particular repellent by analyzing a range of features such as chemical composition, concentration, duration of protection, and environmental conditions. E-Nose technology offers several advantages, including high sensitivity, selectivity, rapid analysis, non-invasiveness, portability, extensive data analysis capabilities, cost-effectiveness, and versatility. In this research, we propose the use of a non-selective gas sensor array comprising eight MQ series elements, combined with ML analytics, to distinguish the type, concentration, and application of four different types of mosquito repellents commonly used in real-world scenarios. The sensor node prototype was exposed to each of the potential mosquito repellents for 15 to 20 minutes individually to create the experimental dataset. Subsequently, various ML models were trained using this dataset to accurately classify unknown data samples. SVM and MLP models achieved model accuracy scores of 98.01% and 98.10%, respectively, while DT, RF, and KNN models achieved training accuracies of 98.99%, 98.98%, and 98.61%, respectively. 
Classification performance error was also assessed using Mean Absolute Error (MAE), R2 Score, and Mean Squared Error (MSE).

Keywords

artificial intelligence

decision support systems

E-Nose

intelligent systems

sensors

sustainable engineering

MAIN

Introduction
Experimental Setup for Data Collection
Gas Sensor
Data Collection
Results and Discussion
Confusion Matrices (CM)
Logistic Regression (LR) Classifier
Decision Tree (DT) Classifier
Random Forest (RF) Classifier
K-Nearest Neighbour (KNN) Classifier
Naïve Bayes (NB) Classifier
Ada Boost (Adaptive Boosting) Classifier
Multilayer Perceptron (MLP) Classifier
Support Vector Machine (SVM) Classifier
Conclusion

Introduction

Mosquitoes are notorious for transmitting various diseases such as the Zika virus, dengue fever, and malaria. Therefore, it is crucial to develop and effectively utilize insect repellents to protect people from mosquito-borne illnesses. An increasing number of individuals are exploring the application of learning algorithms to categorize and evaluate the efficiency of different insect repellents, as discussed in references [1, 2]. These algorithms employ labeled datasets to classify mosquito repellents based on their effectiveness, discerning patterns and predicting the efficacy of each repellent by analyzing features like chemical composition, concentration, duration of protection, and environmental conditions [3]. Many researchers use these parameters to classify and quantify repellents, as summarized in references [4, 5, 6].

Burning indoor mosquito coils is a common practice in numerous households across Asia, Africa, and South America for repelling mosquitoes effectively. However, the smoke produced by these coils may contain harmful pollutants [7]. An experiment was conducted to analyze emissions from two popular Malaysian brands and four Chinese brands of mosquito coils. The researchers used mass balance equations to calculate emission rates of fine particles (PM2.5), polycyclic aromatic hydrocarbons, aldehydes, and ketones. They applied these emission rates to predict indoor pollutant concentrations under realistic room conditions and found that burning mosquito coils could lead to pollutant concentrations exceeding air quality limits. Notably, Malaysian mosquito coils produced more pollutants than their Chinese counterparts under the same combustion conditions. To repel mosquitoes and purify the air, one researcher proposes an innovative method in which a sensor detects mosquito presence and adjusts the release of herbal fumes accordingly. This approach outperformed other commercial mosquito repellents in terms of effectiveness [8]. In reference [9], authors developed a system that uses an Arduino UNO powered by a rechargeable 12V battery and a solar panel to control a mosquito-repelling device in a 125 square meter area. It is essential to consider the safety of chemical disinfectants, as they can pose risks to humans and the environment when misused [10, 11].

Climate change has had a significant impact on the environment, affecting flora and fauna, as discussed in reference [12]. Successful gas and odor classification and quantification methods have been deployed, as detailed in the reference [13]. Additionally, a real-time monitoring system is needed to ensure the quality and presence of disinfection agents in public sterilization chambers [14]. Analog outputs obtained from these chambers are recorded in an Excel sheet and analyzed using various machine learning (ML) techniques. Support Vector Machine (SVM) algorithms have shown the highest accuracy, sensitivity, and specificity, with values of 91.5%, 91.08%, and 90.90%, respectively [15]. To address fast detection, reference [16] suggests an attention-based gated recurrent unit (AGRU) and thoroughly explains the mechanism. The study evaluates the model’s performance using metrics such as accuracy, root mean square error (RMSE), model parameters, and floating-point operations (FLOPs). This approach is also applied to compare the performance of mosquito repellents [17]. For monitoring hazardous gases, insight monitoring is discussed in references [18, 19]. Greenhouse gas emissions from penetration resist agents are assessed, with carbon dioxide emissions found to be the most significant contributor [20]. In the context of bridge cycle maintenance, environmental impact prediction employs greenhouse gas emission analysis and a database, as noted in references [21, 22].

In classification tasks for real-time systems with limited hardware resources, researchers often prioritize classifiers that are easy to implement and quick to execute, as explained in references [4, 23]. A key focus of their work is the comparison of two straightforward and fast classifiers, namely the Naive Bayes classifier and the Cogent confabulation-based classifier [3]. The ultimate goal of ML and pattern recognition (PR) is to create classification terms that are easily comprehensible to humans. Reference [24] proposes the use of the Microwave and K* Algorithm to estimate the alcohol content of liquids. Here is a summary of the contributions made by this article:

1. We designed a gas sensor-array system using an E-Nose for data collection purposes.

2. We conducted comparisons and testing of different ML algorithms to assess their performance accuracy.

3. We also evaluated the performance of error metrics and examined confusion matrices for comparison.

The remainder of this paper is organized as follows: Section 2 provides details about the experimental setup, data collection procedures, the design of the E-Nose system, and the subsequent data analysis. Section 3 presents the analysis results of different ML algorithms, including accuracy, confusion matrix, and other model performance parameters. It also discusses data categorization using various ML classification algorithms and the data acquisition process using a sensor module. Finally, Section 4 summarizes the conclusions. This study focuses on the utilization of an E-Nose-based Sensor system and the development of multiple machine-learning algorithms to collect data from eight MQ sensors and one Digital Temperature Humidity sensor (DHT22). Subsequently, the study involves the categorization of gases/odors.

Experimental Setup for Data Collection

The E-Nose system incorporates a gas sensor array containing anywhere from four to sixteen sensors. As detailed in references [13, 25], essential components of every E-Nose system include data pre-processing, classification, and pattern recognition (PR). In a comparative analysis with a single electronic nose, single electronic tongue, and combined electronic nose and tongue systems, the synaesthesia model yielded the highest accuracy, Kappa coefficients, and F1 scores for identifying beer and apple samples [26, 27].

The E-Nose system used for data collection consists of an ESP8266 module, eight metal-oxide gas-detecting sensors, and a temperature and humidity sensor. The ESP8266 is favored for its cost-effectiveness and Wi-Fi connectivity, making it a popular choice for Internet of Things (IoT) projects. It is a highly integrated system-on-a-chip (SoC) that combines a microcontroller unit (MCU) with Wi-Fi functionality, enabling device-to-internet connections and facilitating Wi-Fi-based project development. The dataset is gathered using the gas sensors MQ2, MQ3, MQ4, MQ5, MQ6, MQ8, MQ9, and MQ135, together with temperature and humidity measurements from a DHT22 sensor. We designed and constructed the experimental setup ourselves, as depicted in Figure 1.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F1.jpg

Figure 1.

Configuration for gathering experimental data.

We employed an eight-channel analog multiplexer to collect input from all eight gas sensors and connected it to the ESP8266 module. Additionally, we directly interfaced the temperature and humidity (DHT22) sensor with the ESP module, as illustrated in Figure 2. We gathered the dataset from the DHT22 and the eight MQ gas sensors simultaneously. Figure 2 also provides visual representations of the DHT22 and the other sensors, to help the reader understand the sensors utilized in this study. The 8-channel multiplexer offers increased input flexibility, reduced hardware complexity, cost-efficiency, enhanced signal integrity, scalability, efficient data acquisition, compatibility, and energy efficiency, making it a favourable component in systems that must select among multiple inputs to acquire the desired data.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F2.jpg

Figure 2.

Connections at the block level for gathering data.
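
As a rough illustration of how a single analog input can serve eight sensors through the multiplexer, the sketch below steps through the eight channels in turn. It is written in Python for readability rather than as ESP8266 firmware. The three binary select lines, and the `set_select` and `read_adc` callables, are assumptions made for illustration and are not details of the actual wiring.

```python
from typing import Callable, List, Tuple

NUM_CHANNELS = 8  # one multiplexer channel per MQ sensor


def select_bits(channel: int) -> Tuple[int, int, int]:
    """Return the (s0, s1, s2) select-line levels for a channel in 0..7."""
    if not 0 <= channel < NUM_CHANNELS:
        raise ValueError("channel must be between 0 and 7")
    return (channel & 1, (channel >> 1) & 1, (channel >> 2) & 1)


def scan_sensors(
    set_select: Callable[[Tuple[int, int, int]], None],
    read_adc: Callable[[], int],
) -> List[int]:
    """Read all eight channels once and return the raw ADC counts."""
    readings = []
    for channel in range(NUM_CHANNELS):
        set_select(select_bits(channel))  # route this sensor to the ADC input
        readings.append(read_adc())       # hypothetical read, e.g. 0..1023
    return readings
```

On real hardware, `set_select` would drive three digital output pins and `read_adc` would sample the analog pin, typically after a short settling delay.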

Gas Sensor

Gas sensors generate electrical signals in response to the presence of various gases. Among gas sensors, the metal oxide semiconductor sensors of the MQ series stand out as the optimal choice due to their compact size, cost-effectiveness, rapid responsiveness, and extended lifespan. The target gases of each sensor are listed in Table 1. We used only MQ-series sensors in different variants, such as MQ2, MQ3, and MQ5. Numerous experiments have been conducted using these sensors, exploring their sensitivity as detailed in references [28, 29].

Table 1.

Sensor and specific gas

Gas Sensor	Target Gases
MQ2	Methane / Butane
MQ3	Alcohol, Ethanol, and Smoke
MQ4	Methane / CNG gas
MQ5	Carbon Monoxide (CO) / Natural Gas
MQ6	LPG / Butane
MQ8	Hydrogen Gas (H₂)
MQ9	LPG / Butane
MQ135	Air Quality (CO, Benzene)
DHT22	Temperature and Humidity

In the realm of gas analysis, a recent development is the Multimodal Gas Data, which consists of concurrent data samples acquired using a thermal imaging camera in conjunction with seven distinct MQ gas-detecting sensors, as elaborated in [30]. The sensor array facilitates real-time monitoring and analysis of dynamic systems or environments. It provides continuous data streams that can be analysed instantly, enabling prompt decision-making. Different sensors in an array capture different parameters, which leads to better gas/odor classification. This diversity in data collection enables comprehensive monitoring, capturing multiple facets of the monitored environment or system simultaneously. The responses of these gas sensors have been systematically characterized by several researchers, with select findings presented in references [31, 32, 33, 35].

Data Collection

In this study, we utilized a variety of commercially available insect repellents, such as coils, sprays, incense sticks, and liquids, to assemble our dataset. During the experiment, multiple gas sensors were positioned 1 mm apart from each other and exposed to various compounds. The system was operated for 25 minutes, allowing it to be exposed to these substances, resulting in the collection of data for all the different classes. We used four mosquito repellents (Good Knight Fast Card, Hit mosquito repellent, incense sticks, and mosquito coil) to create four classes, with a fifth class for fresh air.

To begin, we obtained baseline data by running the system in a fresh-air environment. Subsequently, we collected data for Good Knight Fast Card, Hit mosquito repellent, incense sticks, and mosquito coil, following a similar procedure for each. For instance, data for Good Knight Fast Card was acquired by burning it in close proximity to the sensors, while data for Hit Mosquito Repellent was obtained by spraying it at regular intervals toward the sensor array. We partitioned the gas dataset into two sets, with 70% of the data reserved for model training and the remaining 30% for model testing. Figure 3 provides a comprehensive depiction of all the data collection procedures and methods.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F3.jpg

Figure 3.

Collecting data on (a) rapid card, (b) coil, (c) fragrant incense sticks, and (d) liquidator.

A. Data Standardization

Due to variations in sensitivity among the sensors, they produced outputs within different ranges, despite all having an output range of 0 to 1023. To address this variability, we standardized the features by transforming the mean to zero and the variance to one. The calculation of the standard score is outlined in Equation 1.

(1)

z = \frac{x - \mu}{\sigma}
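
A minimal sketch of Equation 1 applied column by column is given below, using only the Python standard library. The population standard deviation is assumed for σ, and the sample values are made up for illustration. Whether μ and σ were estimated before or after the train/test split is not specified here, so the sketch simply standardizes whatever table it is given.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def standardize_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Apply z = (x - mu) / sigma to each column of a row-major table."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    return [
        [(x - mu) / sigma if sigma > 0 else 0.0
         for x, mu, sigma in zip(row, means, stds)]
        for row in rows
    ]


# Made-up raw ADC counts for two sensors (not measured data).
raw = [[100, 900], [200, 950], [300, 1000]]
scaled = standardize_columns(raw)
```

After scaling, both columns have zero mean and unit variance even though their raw ranges differ, which is the purpose of the transformation.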

B. Building Model for Classification

To construct the model, we first partitioned our dataset into training and testing subsets in a 70:30 ratio, with 70% of the data reserved for model training and the remaining 30% for model testing. Prior to this division, we thoroughly shuffled the dataset so that both the training and testing sets would contain a well-balanced representation of all classes.
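
The shuffle-then-split step can be sketched as follows. The function name, the fixed seed, and the use of integer indices in place of the recorded samples are assumptions for illustration only.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def shuffled_split(
    samples: Sequence[T], test_fraction: float = 0.30, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the samples, then split into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for repeatability
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder indices standing in for the 3700 recorded samples.
train, test = shuffled_split(range(3700))
```

With 3700 samples, a 30% hold-out gives 1110 test samples and leaves 2590 for training.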

Results and Discussion

We evaluated the performance of eight different ML algorithms on our training set by comparing their Mean Absolute Error (MAE), Mean Squared Error (MSE), and R2 Score. Among these eight algorithms, the MLP and SVM algorithms demonstrated the lowest error, making them the most suitable choices for our classification problem. Additionally, the MLP and SVM achieved high accuracy scores of 98.10% and 98.01%, respectively. We also examined the Confusion Matrix for each classifier.

Our dataset consists of the responses of the eight MQ sensors in the E-Nose. Figure 4 illustrates sample responses obtained from these sensors as histograms. The dataset includes 3700 examples of averaged sensor responses, collected from tin oxide gas sensors placed in a room, indexed from 0 to 3699, and has 10 data columns. The target values range from 0 to 4, one for each of the five classes (fresh air and the four repellents).

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F4.jpg

Figure 4.

Data from all sensors are visualized through histograms.

The E-Nose used in this study incorporates multiple non-selective MQ sensors connected in series. The data was collected using our custom-developed DMS (Data Measurement System) designed exclusively for this research. The entire programming and model-building process was carried out on an ML platform, where we imported the scikit-learn (sklearn) module for logistic regression (LR), decision tree (DT), random forest (RF), k-nearest neighbors (KNN), Naive Bayes (NB), AdaBoost (AB), MLP, and SVM. We calculated various performance metrics to assess the classifiers’ performance.

Confusion Matrices (CM)

A Confusion Matrix (CM) is a square matrix that compiles the model’s anticipated and real classifications when applied to a test dataset. This matrix offers valuable insights into the model’s performance by breaking down the various types of correct and incorrect predictions it makes. In the context of a specific study on predicting energy efficiency for CO₂ performance in a building [34], the CM serves as a straightforward tool to evaluate the classification algorithm’s effectiveness. Additionally, it plays a pivotal role in determining essential parameters like precision score, recall score, and F1 score, as depicted in equations 2 to 4. Numerous researchers have leveraged smart sensors and machine-learning models for informed decision-making [35, 36, 37, 38, 39]. In [35], the author demonstrated the CM’s utility in appraising the model’s performance.

In classification tasks, the primary function of a CM is to furnish a comprehensive and lucid summary of a classification model’s performance. It assumes the form of a table, aiding in the assessment of a classifier’s accuracy and efficacy by revealing how closely the model’s predictions align with the actual class labels in the dataset. The CM proves particularly beneficial for discerning the types of errors made by a classifier and for gauging its overall performance. Typically, a CM for a binary classification task comprises four primary components, as outlined below:

True Positives (TP): These represent cases where the model made accurate predictions for the positive class. Put simply, the real class was positive, and the model correctly identified it as such.

True Negatives (TN): In these instances, the model accurately predicted the negative class. The actual class was indeed negative, and the model correctly classified it as such.

False Positives (FP): Also known as Type I errors or “false alarms,” these are situations where the model incorrectly classified cases as the positive class when they were actually part of the negative class.

False Negatives (FN): Termed as Type II errors, these are scenarios where the model inaccurately predicted the negative class when, in reality, the cases belonged to the positive class.

The accuracy of the model, encompassing both true positives (TP) and true negatives (TN), can be described as the proportion of accurately predicted instances among all instances. Precision, denoted as TP / (TP + FP), gauges the model’s ability to avoid false positives, offering insight into its accuracy. Recall, also known as sensitivity or true positive rate, is the proportion of true positives to all actual positives (TP / (TP + FN)), indicating the model’s ability to detect all relevant events. The F1-score, a harmonic mean of precision and recall, strikes a balance between these two metrics, particularly valuable when dealing with imbalanced class distributions in the dataset.

Specificity, also called the true negative rate (TN / (TN + FP)), quantifies the model’s capacity to identify all actual negative instances. On the other hand, the false positive rate (FPR), calculated as FP / (TN + FP), measures the model’s propensity to generate false alarms, specifically in relation to actual negatives.

(2)

\text{Precision Score} = \frac{TP}{TP + FP}

(3)

\text{Recall Score} = \frac{TP}{TP + FN}

(4)

\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
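
For a problem with more than two classes, the quantities above are usually obtained one class at a time by treating that class as positive and all others as negative. The sketch below illustrates this under the assumption that rows of the matrix are true classes and columns are predicted classes. The 3-class matrix is made up and is not taken from Figures 5 or 6.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest metrics from a square confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                 # true class k, predicted other
        fp = sum(row[k] for row in cm) - tp  # predicted k, true class other
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        specificity = tn / (tn + fp) if tn + fp else 0.0
        results.append({"precision": precision, "recall": recall,
                        "f1": f1, "specificity": specificity})
    return results


# Made-up 3-class matrix for illustration (not data from this study).
example_cm = [[8, 2, 0],
              [1, 9, 0],
              [0, 0, 10]]
metrics = per_class_metrics(example_cm)
```

Here class 0 has TP = 8, FN = 2, FP = 1, and TN = 19, so its recall is 8/10 and its precision is 8/9.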

Figures 5(a) to (d) and 6(a) to (d) show the confusion matrices of the classifiers employed in this study, from which their classification accuracy can be read.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F5.jpg

Figure 5.

Confusion Matrix (a) Logistic Regression, (b) Decision Trees, (c) KNN, and (d) Random Forests.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F6.jpg

Figure 6.

Confusion Matrix for (a) Naive Bayes, (b) AdaBoost, (c) Multi-Layer Perceptron, and (d) Support Vector Machine.

Logistic Regression (LR) Classifier

The LR classifier is a widely used statistical model that addresses both classification and regression problems, making it a popular choice for categorization tasks. With the LR classifier, observations can be grouped into two or more distinct classes, resulting in a discrete target variable. These classes can take different forms, including binomial and multinomial.

Referring to Table 2, the LR classifier achieved an impressive accuracy of 97% for mosquito-repellent classifications. Additionally, it obtained a Training Accuracy Score (TAS) of 97.60% and a Testing Accuracy of 97.02%. The Testing Accuracy Score is also commonly referred to as the Model Accuracy Score (MAS).

Table 2.

LR Classifier performance

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	0.97	0.99	0.98	208
Fast Card	0.95	0.98	0.97	246
Hit	0.97	0.99	0.98	226
Incense Sticks	0.99	0.95	0.97	219
Coil	0.98	0.94	0.96	211
Accuracy	0.97			1110
Macro Avg.	0.97	0.97	0.97	1110
Weighted avg.	0.97	0.97	0.97	1110
Training Accuracy Score (TAS)	97.60
Model Accuracy Score (MAS)	97.02
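
As a quick consistency check on Table 2, the F1-score of the Fresh Air class can be recomputed from its tabulated precision and recall. Because those two inputs are already rounded to two decimals, the result is only expected to agree to about two decimals.

```latex
F1_{\text{Fresh Air}} = 2 \times \frac{0.97 \times 0.99}{0.97 + 0.99} = \frac{1.9206}{1.96} \approx 0.98
```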

Decision Tree (DT) Classifier

One of the widely used methods in ML for tackling classification and regression tasks is the decision tree (DT) algorithm. This supervised learning approach involves constructing a tree-shaped model, where each internal node corresponds to a feature or attribute, each branch represents a decision rule, and each leaf node signifies the predicted outcome or value.

In Table 3, the assessment of mosquito repellent classifications resulted in a 97% accuracy for the DT classifier, a 98.99% TAS, and a testing/MAS of 97.11%. The proposed technique is expected to offer a passenger mobility assessment system based on intelligent amenities like smart construction and safety [21].

Table 3.

DT Classifier performance

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	1.00	1.00	1.00	205
Fast Card	0.94	0.95	0.95	246
Hit	0.98	1.00	0.99	226
Incense Sticks	1.00	0.98	0.99	219
Coil	0.94	0.94	0.94	211
Accuracy	0.97			1110
Macro Avg.	0.97	0.97	0.97	1110
Weighted avg.	0.97	0.97	0.97	1110
TAS	98.99
MAS	97.11

Random Forest (RF) Classifier

Random Forest (RF), an ensemble learning method, leverages multiple decision trees to generate predictions. It enjoys widespread popularity in ML for both classification and regression tasks.

In Table 4, the assessed mosquito repellents achieved impressive results with an RF classifier accuracy of 98%, a TAS of 98.99%, and a testing/MAS of 97.11%. Additionally, a time series dataset was examined to forecast sensor data values in references [40, 41].

Table 4.

RF classifier performance

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	1.00	1.00	1.00	208
Fast Card	0.95	0.95	0.95	233
Hit	1.00	1.00	1.00	225
Incense Sticks	1.00	1.00	1.00	218
Coil	0.94	0.94	0.94	199
Accuracy	0.98			1110
Macro Avg.	0.98	0.98	0.98	1110
Weighted avg.	0.98	0.98	0.98	1110
TAS	98.99
MAS	97.11

K-Nearest Neighbour (KNN) Classifier

K-Nearest Neighbours (KNN) stands out as a popular and uncomplicated method in the realm of ML, commonly employed for both classification and regression tasks. This instance-based, non-parametric learning approach makes predictions by gauging the similarity between new data points and labeled examples from the training dataset.
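
The similarity-based prediction described above can be sketched in a few lines. Euclidean distance, k = 3, and the toy data are assumptions for illustration. The distance metric and the value of k used in this study are not stated in this section.

```python
from collections import Counter
from math import dist
from typing import Sequence, Tuple


def knn_predict(
    train: Sequence[Tuple[Sequence[float], int]],
    query: Sequence[float],
    k: int = 3,
) -> int:
    """Predict the label of `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up, already standardized two-feature samples with labels 0 and 1.
toy_train = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([0.1, 0.3], 0),
             ([2.0, 2.1], 1), ([2.2, 1.9], 1), ([1.9, 2.0], 1)]
label = knn_predict(toy_train, [0.1, 0.1], k=3)
```

Because the vote depends on distances, features on very different scales would dominate the result, which is one reason standardization matters for this classifier.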

In the context of mosquito repellent classifications, Table 5 reveals that the KNN classifier exhibited an impressive accuracy of 98%, with a TAS of 98.61% and a testing/MAS of 97.74%.

Table 5.

KNN classifier performance

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	1.00	1.00	1.00	208
Fast Card	0.95	0.96	0.95	246
Hit	1.00	1.00	1.00	226
Incense Sticks	1.00	0.99	0.99	219
Coil	0.96	0.94	0.95	211
Accuracy	0.98			1110
Macro Avg.	0.98	0.98	0.98	1110
Weighted avg.	0.98	0.98	0.98	1110
TAS	98.61
MAS	97.74

Naïve Bayes (NB) Classifier

A widely employed classification technique, Naive Bayes, leverages the Bayes theorem and assumes feature independence. This uncomplicated yet potent approach is commonly used in tasks such as text categorization, spam filtering, sentiment analysis, and other domains where the input data consists of discrete or categorical components.

In the context of evaluating mosquito repellent classifications, we observed an 83% accuracy for the NB classifier, a TAS of 80.96%, and a testing/MAS of 82.88%, all of which are documented in Table 6. By comparison, the KNN classifier reached a testing/MAS of 97.74%, a classifier accuracy of 98%, and a TAS of 98.61%.

Table 6.

Performance of Naïve Bayes classifier

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	0.71	0.91	0.80	208
Fast Card	0.73	0.98	0.83	246
Hit	0.94	0.96	0.95	226
Incense Sticks	0.96	0.44	0.61	219
Coil	0.97	0.84	0.90	211
Accuracy	0.83			1110
Macro Avg.	0.86	0.83	0.82	1110
Weighted avg.	0.86	0.83	0.82	1110
TAS	80.96
MAS	82.88
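
The Macro Avg. and Weighted avg. rows can be reproduced from the per-class rows. The sketch below does this for the precision column of Table 6. The helper names are illustrative, and because the inputs are the rounded table values, agreement is expected only to two decimals.

```python
from typing import Sequence


def macro_average(values: Sequence[float]) -> float:
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_average(values: Sequence[float], support: Sequence[int]) -> float:
    """Mean over classes, weighted by the number of test samples per class."""
    return sum(v * s for v, s in zip(values, support)) / sum(support)


# Per-class precision and support copied from Table 6
# (Fresh Air, Fast Card, Hit, Incense Sticks, Coil).
precision = [0.71, 0.73, 0.94, 0.96, 0.97]
support = [208, 246, 226, 219, 211]

macro_p = macro_average(precision)
weighted_p = weighted_average(precision, support)
```

Both values round to 0.86. The two averages nearly coincide here because the five classes have similar support.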

Ada Boost (Adaptive Boosting) Classifier

AdaBoost, also known as Adaptive Boosting, is a popular ensemble learning method commonly used in classification tasks. It effectively combines multiple weak classifiers into a robust and accurate strong classifier through a boosting process. AdaBoost assigns weights to each data point in the training set, and it iteratively adjusts these weights to emphasize the challenging instances in the classification.
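
The reweighting idea can be illustrated with one round of the classic two-class (discrete) AdaBoost update. This is a sketch of the textbook formulation only. The multi-class variant and the settings used for the five classes in this study are not stated here and may differ.

```python
from math import exp, log
from typing import List, Sequence, Tuple


def adaboost_round(
    weights: Sequence[float], correct: Sequence[bool]
) -> Tuple[float, List[float]]:
    """One weight update of binary (discrete) AdaBoost.

    `correct[i]` is True when the current weak learner classified
    sample i correctly. Returns (alpha, new_weights).
    """
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    if not 0.0 < err < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * log((1.0 - err) / err)  # vote strength of this learner
    updated = [w * exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four equally weighted samples; the weak learner misses the last one.
alpha, new_weights = adaboost_round([0.25] * 4, [True, True, True, False])
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to classify it correctly.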

In the context of mosquito-repellent classifications, the AdaBoost classifier achieved an accuracy of 75%. Additionally, it attained a TAS of 80.96% and a testing/MAS of 74.68% (as depicted in Table 7).

Table 7.

Performance of AdaBoost classifier

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	0.50	0.91	0.65	208
Fast Card	0.96	0.90	0.93	246
Hit	0.94	0.84	0.89	226
Incense Sticks	0.41	0.15	0.22	219
Coil	0.89	0.94	0.91	211
Accuracy	0.75			1110
Macro Avg.	0.74	0.75	0.72	1110
Weighted avg.	0.75	0.75	0.72	1110
TAS	80.96
MAS	74.68

Multilayer Perceptron (MLP) Classifier

A multilayer perceptron (MLP), a specific type of artificial neural network (ANN), consists of multiple layers of interconnected nodes, also referred to as perceptrons. In a feedforward neural network, information flows from the input layer, passes through one or more hidden layers, and reaches the output layer. MLPs are commonly used in tasks involving classification and regression.

In Table 8, the study reported an MLP classifier’s accuracy of 98%, a TAS of 98.41%, and a testing/MAS of 98.10% when evaluating various mosquito repellents.

Table 8.

Performance of MLP classifier

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	1.00	1.00	1.00	208
Fast Card	0.95	0.97	0.96	246
Hit	1.00	1.00	1.00	226
Incense Sticks	1.00	1.00	1.00	219
Coil	0.96	0.94	0.95	211
Accuracy	0.98			1110
Macro Avg.	0.98	0.98	0.98	1110
Weighted avg.	0.98	0.98	0.98	1110
TAS	98.41
MAS	98.10

Support Vector Machine (SVM) Classifier

Support Vector Machine (SVM), a powerful supervised ML method, finds application in both classification and regression tasks. It demonstrates exceptional performance in managing high-dimensional data and creating a clear separation between different classes.

The results in Table 9 show that the SVM classifier achieved an accuracy of 98%, with a TAS of 97.99% and a Testing/Model Accuracy Score of 98.01% when evaluating various mosquito repellents.

Table 9.

Performance of SVM classifier

Mosquito Repellents	Precision	Recall	F1-Score	Support
Fresh Air	1.00	1.00	1.00	208
Fast Card	0.95	0.98	0.97	246
Hit	0.98	1.00	0.99	226
Incense Sticks	1.00	0.98	0.99	219
Coil	0.99	0.94	0.96	211
Accuracy	0.98			1110
Macro Avg.	0.98	0.98	0.98	1110
Weighted Avg.	0.98	0.98	0.98	1110
TAS	97.99
MAS	98.01

Figure 7 and Table 10 present a comparative analysis of eight classifiers that were assessed.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F7.jpg

Figure 7.

Analyzing MAS and TAS Across Various Classifiers.

Table 10.

Comparison of MAS and TAS of different classifiers

Classifier	MAS	TAS
Decision Tree	97.11	98.99
Random Forest	97.56	98.88
Logistic Regression	97.02	97.61
K-Nearest Neighbour	97.74	98.61
Naïve Bayes	82.88	80.96
AdaBoost	74.68	80.96
Multilayer Perceptron	98.11	98.41
Support Vector Machine	98.01	98.99

The findings indicate that the MLP and SVM classifiers exhibited promise in terms of the MAS, while RF and DT produced accurate classification results that showed potential for differentiating among the four types of mosquito repellents and the fresh air response. In addition to the confusion matrix, we employed three metrics to evaluate the classifiers: Mean Absolute Error (MAE), Mean Squared Error (MSE), and R2 Score. The results are presented in Figure 8 and Table 11.

https://cdn.apub.kr/journalsite/sites/durabi/2023-014-04/N0300140407/images/Figure_susb_14_04_07_F8.jpg

Figure 8.

Assessing the performance of various classification methods.

Table 11.

Comparison of performance evaluation metrics of considered classifiers

Classifier	MAE	MSE	R2 Score
Decision Tree	0.072	0.208	0.892
Random Forest	0.071	0.211	0.892
Logistic Regression	0.075	0.209	0.891
K-Nearest Neighbour	0.064	0.186	0.903
Naïve Bayes	0.412	1.102	0.421
AdaBoost	0.684	1.961	0.021
Multilayer Perceptron	0.056	0.171	0.911
Support vector Machine (SVM)	0.073	0.325	0.893
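
To make the three error metrics concrete, the sketch below computes them on integer class labels (0 to 4), which is assumed to be how they were applied to this classification task. The labels are made up, and the function name is illustrative.

```python
from typing import Sequence, Tuple


def regression_style_errors(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Return (MAE, MSE, R2) computed on integer class labels."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # undefined if every true label is identical
    return mae, mse, r2


# Made-up labels: one class-4 sample is misclassified as class 0.
y_true = [0, 1, 2, 3, 4, 4]
y_pred = [0, 1, 2, 3, 4, 0]
mae, mse, r2 = regression_style_errors(y_true, y_pred)
```

Note that a confusion between labels 0 and 4 adds 16 to the squared error, whereas a confusion between adjacent labels adds only 1, so these values depend on how the classes are numbered as well as on how many samples are misclassified.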

Conclusion

In this study, we assessed eight machine-learning classifiers for categorizing mosquito repellents from E-Nose sensor data; the SVM and MLP models reached model accuracy scores of 98.01% and 98.10%, respectively. These results indicate that the effectiveness of mosquito repellents can be gauged accurately by combining multiple features and training the models on labeled data.

The quality and representativeness of the training data play a pivotal role in the performance of a classification model. Precise classification requires a diverse and comprehensive dataset that covers various insect repellents, environmental factors, and efficacy measurements. Recent advances in ML methodologies also offer opportunities to improve classification models for mosquito-repellent products: combining state-of-the-art algorithms, feature selection techniques, and ensemble methods could yield even more accurate predictions.

Our proposed approach provides a cost-effective system for monitoring mosquito repellents in both residential and workplace settings. Using the suggested E-Nose, we have demonstrated that accurate mosquito repellent classification is attainable. The system is affordable, straightforward, and IoT-capable, so it can be integrated into pandemic prevention initiatives and smart city scenarios. An IoT network for simultaneous monitoring over a broader area is also feasible.

In summary, ML-based categorization of insect repellents is a valuable approach that helps individuals make informed decisions about practical mosquito protection. By harnessing data and sophisticated algorithms, we can improve our understanding of mosquito repellent effectiveness and ultimately contribute to the development of more efficient and reliable solutions.
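
The ensemble methods mentioned above can be illustrated with a minimal majority-vote sketch. This is not the pipeline used in this study: the function name `majority_vote`, the label strings, and the per-classifier predictions are hypothetical and serve only to show how the outputs of several trained classifiers might be combined into a single decision.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label predictions by majority vote.

    predictions: list of lists, one inner list per classifier, each
    holding that classifier's predicted label for every sample.
    Ties go to the label predicted by the earliest classifier in the list.
    """
    if not predictions:
        raise ValueError("need at least one classifier's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")

    combined = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        best = max(counts.values())
        # The first label (in classifier order) that reaches the top count wins.
        winner = next(label for label in sample_votes if counts[label] == best)
        combined.append(winner)
    return combined


# Hypothetical predictions from three classifiers for four samples.
svm_pred = ["coil", "card", "incense", "liquid"]
mlp_pred = ["coil", "card", "coil", "liquid"]
knn_pred = ["coil", "incense", "incense", "card"]

ensemble_pred = majority_vote([svm_pred, mlp_pred, knn_pred])
print(ensemble_pred)
```

Listing the classifiers in order of validation accuracy would make the tie-break favor the strongest individual model; that ordering is a design choice of this sketch, not something reported in the study.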

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgements

The authors express their gratitude to the administrative teams at Thapar Institute of Engineering and Technology (TIET) in Patiala, Punjab, the Department of Electronics Engineering & Computer Science Engineering at Indian Institute of Technology (BHU) in Varanasi, India, and the School of Computer Science and Engineering at Kalinga Institute of Industrial Technology (KIIT) Deemed to be University in Bhubaneswar, Odisha, India for generously offering their research facilities.

International Journal of Sustainable Building Technology and Urban Development. ISSN: 2093-761X (Print), 2093-7628 (Online)

Figure 1. Configuration for gathering experimental data.

Figure 2. Connections at the block level for gathering data.

Table 1. Sensor and specific gas

Figure 3. Collecting data on (a) rapid card, (b) coil, (c) fragrant incense sticks, and (d) liquidator.

Figure 4. Data from all sensors are visualized through histograms.

Figure 5. Confusion Matrix for (a) Logistic Regression, (b) Decision Trees, (c) KNN, and (d) Random Forests.

Figure 6. Confusion Matrix for (a) Naive Bayes, (b) AdaBoost, (c) Multi-Layer Perceptron, and (d) Support Vector Machine.

Table 2. LR classifier performance

Table 3. DT classifier performance

Table 4. RF classifier performance

Table 5. KNN classifier performance

Table 6. Performance of Naïve Bayes classifier

Table 7. Performance of AdaBoost classifier

Table 8. Performance of MLP classifier

Table 9. Performance of SVM classifier

Figure 7. Analyzing MAS and TAS Across Various Classifiers.

Table 10. Comparison of MAS and TAS of different classifiers

Figure 8. Assessing the performance of various classification methods.

Table 11. Comparison of performance evaluation metrics of considered classifiers
