Introduction
Literature Review
Background
Autoencoder
XGBoost
Methodology
Data Generation
Data Pre-processing
Feature Extraction
Train-Test Split
Model Prediction using XGBoost
Model Evaluation
Tools and Software
Results and Discussion
Comparative study with Existing models
Discussion
Conclusion
Introduction
In recent decades, wireless communication technologies have advanced rapidly, driven by the increasing demand for high data rates, enhanced spectral efficiency, and reliable connectivity across wide frequency ranges. While this “explosion of data traffic” is fueled by billions of connected devices worldwide, its implications extend beyond consumer communication into the built environment [1]. The number of wireless-enabled devices is expected to exceed 18 billion by 2025, creating unprecedented opportunities for smart and sustainable building systems. Future networks such as 5G and 6G, with projected peak data rates of 20 gigabits per second (Gbps) and 1 terabit per second (Tbps), respectively, will enable seamless integration of structural health monitoring, energy management, environmental sensing, and occupant well-being solutions within buildings and urban infrastructure [2, 3]. Consequently, communication systems must adopt advanced antenna and sensing technologies that not only meet data demands but also contribute to the durability, sustainability, and resilience of the built environment [4].
Millimeter-wave (mm-wave) communication (30-300 GHz), with an available bandwidth below 10 GHz, has proven to be a workable solution for high-data-rate transmission in 5G. Achieving Tbps data rates with mm-wave technology, however, may be a challenge. One candidate for the forthcoming 6G wireless system is the terahertz (THz) band (0.1-10 THz) [5, 6], which is under research as a potential means of supporting the mm-wave spectrum [4]. The THz spectrum lies between the microwave and infrared regions. One benefit of this frequency range is its capacity to handle the 100 Gbps data transmission required for future communication systems [7].
Antennas are very important for the transmission and reception of THz waves. THz antennas are valued for their compact size, wide frequency coverage, and ability to support high data rates [8]. Among these, the terahertz microstrip patch antenna is commonly used due to its simple fabrication, high gain, and radiation efficiency [9, 10]. Performance depends on factors such as patch size, feeding mechanism, substrate height, and dielectric constant [11].
Despite these advantages, THz antennas face significant challenges. High propagation losses caused by atmospheric absorption, primarily from water vapor, necessitate high-gain and directional designs. Transmission windows at 0.12-0.55 THz, 0.562-0.75 THz, 0.992-1.09 THz, and 1.212-1.41 THz can be exploited for THz communication [12], but beamforming, MIMO arrays, and narrow-beam phased arrays are often required to overcome free-space path loss [13, 14]. Material constraints further complicate design: conventional copper suffers reduced conductivity at THz frequencies, while advanced materials such as graphene and CNTs offer high carrier mobility, conductivity, and flexibility [15, 16]. Graphene, in particular, can generate plasmon-polariton waves at THz frequencies, enabling efficient signal propagation while supporting compact and flexible antenna designs. Low-loss dielectrics like polyethylene naphthalate (PEN), liquid crystalline polymer (LCP), and quartz are also essential [17, 18, 19].
To address these design requirements, engineers traditionally rely on empirical methods and computational electromagnetic (CEM) simulations, which, though accurate, are often time-consuming and computationally intensive and require iterative trial and error, especially for complex THz antennas [20, 21]. As antenna systems become more complex, such as the large-scale antenna configurations used in 5G networks [22], where hundreds of antennas may be deployed, traditional methods struggle to scale efficiently.
Machine learning (ML) has recently emerged as a viable alternative. By learning from simulation-generated data, ML can effectively predict antenna parameters, including return loss, bandwidth, gain, and radiation efficiency, reducing dependency on comprehensive simulations and speeding up the design process [23, 24]. Furthermore, ML can automatically correlate input design parameters with performance results to achieve quick optimization for various geometries and materials. Integrating simulation-based data generation with ML in THz antenna design therefore presents a viable route towards high-gain and high-efficiency antennas with minimal design complexity and computational expense [25, 26].
This work offers a hybrid approach that leverages MATLAB-based simulations and machine learning for frequency-tunable THz antenna design. As opposed to traditional CEM approaches or previous ML models restricted to single-parameter prediction, the new strategy unites data-driven simulation with a two-stage ML pipeline: Autoencoder-based feature extraction followed by XGBoost for predictive modeling. This facilitates simultaneous estimation of important antenna performance parameters (return loss, bandwidth, gain, and efficiency) while improving computational efficiency and shortening design cycles. This work particularly addresses the prediction and optimization of frequency-tunable triple-band hexagonal graphene THz antennas, showcasing the novelty and feasibility of the proposed approach.
The contributions and novelty of this work can be outlined as follows:
ㆍTo design and simulate frequency-tunable, triple-band hexagonal graphene THz antennas using MATLAB and generate a comprehensive dataset of performance parameters.
ㆍTo preprocess the dataset and apply an Autoencoder for efficient feature extraction, reducing dimensionality and enhancing learning.
ㆍTo use XGBoost for predicting antenna characteristics and optimizing performance outcomes.
ㆍTo evaluate the hybrid Autoencoder-XGBoost model in terms of prediction accuracy and its effectiveness in reducing design complexity.
ㆍTo demonstrate the feasibility and novelty of integrating MATLAB-based data generation with advanced ML models for faster, scalable, and practically applicable THz antenna design.
The remaining sections are organized as follows. Section 2 reviews the related work and identifies the research gaps that motivate the proposed strategy. Section 3 gives background information on the proposed models. Section 4 covers the study’s methodology. Section 5 explains the experiment, examines the findings, and compares the results. Section 6 presents the study’s conclusions and future prospects.
Literature Review
The increasing use of ML in the design of THz antennas for 6G and IoT has been underscored in recent studies. Because ML algorithms can effectively explore high-dimensional parameter spaces and optimize several physical and electrical qualities at once, they are well suited for antenna design. In contrast to traditional techniques, ML systems manage non-linear interactions between performance measurements (like gain and radiation efficiency) and input parameters (like antenna size, shape, and material), making it possible to identify ideal configurations more quickly [27]. It is possible to rapidly and accurately predict the properties of several antenna modules using machine learning methods. In order to find previously unknown mathematical correlations in the data, machine learning is being utilized to link input and output behaviors [28, 29, 30].
Nirob et al. [31] developed a grid-shaped MIMO antenna that attained peak gains over 11 dB and isolation over 31 dB, with XGB regression predicting antenna gain with an accuracy above 96%. Haque et al. [32] presented a new-shape THz MIMO antenna with 2.5 THz bandwidth, 14.59 dB gain, and 96% efficiency, in which Extra Tree Regression performed better than other models in predicting gain.
Graphene-based structures have also evolved. Rai et al. [33] proposed a frequency-tunable hexagonal graphene antenna optimized by ML algorithms, in which Random Forest achieved over 99% accuracy in S11 prediction. Kumar and Sadhu [34] utilized ANN and RF for predicting the performance of a graphene-dielectric resonator antenna, reducing simulation time while maintaining correlation with measured data. Kushwah et al. [35] also showed a silicon-graphene-based MIMO array, where RF and XGBoost effectively predicted reflection coefficients for frequency-agile and polarization-reconfigurable performance.
A comprehensive review by Khan et al. [30] also reiterated that ML and deep learning boost antenna design, decrease computation cost, and enhance optimization in millimeter-wave, satellite, UAV, and THz applications. Overall, the studies solidify ML as a revolutionary means for compact, efficient, and high-performance THz antenna design, less dependent on computationally intensive simulations while maintaining accuracy of the design.
Several recent studies strengthen the case for ML-driven, data-centric approaches to THz antenna design for 6G and IoT applications. Koziel et al. [36] showed that surrogate-assisted reduced-dimensional optimization can drastically decrease full-wave electromagnetic simulation time while maintaining the accuracy of the resulting antenna, which makes ML-enabled optimization pipelines practical for THz and high-frequency structures. A review by Pandey [37] documented the rapid and widespread use of ML techniques for tuning antenna geometry, e.g., Gaussian processes, neural networks, and hybrid evolutionary learners, and pointed out their ability to model both linear and nonlinear features across vast parameter spaces. Providing further support, a 2024 study published in PIER [38] reported that ML-based models reduce the time needed for complex multi-resonant layout optimization by 90% or more, corroborating the efficiency advantage observed earlier in THz MIMO and graphene-based designs.
Beyond optimization, THz communication research has underlined the role of intelligent antenna and propagation modeling approaches. Ahmed’s survey on reconfigurable intelligent surfaces (RIS) for THz links [39] highlights the need for reliable ML-assisted beamforming and surface configuration to mitigate high molecular absorption losses, an essential requirement for reliable THz IoT networks. In addition, emerging IoT-oriented studies [40] indicate that combining RIS, compact high-gain antenna arrays, and predictive channel modeling in ML-enhanced THz architectures will lead to more stable connectivity in short-range, dense IoT deployments within the 6G environment.
Previous research on THz antenna design either relies on computationally costly electromagnetic simulations or uses traditional ML tools like Random Forest, SVR, or isolated neural networks, which can fail to capture intricate nonlinear parameter-performance correlations and do not generalize well at far-from-average data points. Additionally, less attention has been paid to combining feature extraction with predictive modeling, leading to redundancy in input parameters and inefficiency. To address these limitations, this research proposes a novel Autoencoder-XGBoost framework that integrates dimensionality reduction with high-precision prediction, thereby improving scalability, decreasing simulation dependency, and achieving better accuracy for THz antenna optimization. Table 1 summarizes the existing studies in the domain in terms of major findings, algorithms used, and improvements achieved.
Table 1.
Summary of the existing studies
| Study | Antenna Type / Focus | ML Algorithms Used | Major Findings | Performance Improvements |
| Goudos et al. (2022) [28] | General antenna ML optimization review | GA, ANN, SVM, Bayesian, GP | ML reduces design complexity and supports multi-objective optimization | Significant reduction in tuning iterations |
| El Misilmani et al. (2020) [29] | ML for antenna design & optimization | ANN, SVM, Regression models | ML models efficiently handle nonlinear antenna characteristics | Simplified modeling and faster parametric evaluation |
| Khan et al. (2022) [30] | Review on ML/DL for antenna design | CNN, ANN, RF, GP | ML enhances accuracy for THz, MIMO, and mmWave systems | Reduces computation time by 40–80% |
| Nirob et al. (2025) [31] | Grid-structured THz MIMO antenna | XGBoost Regression | Achieved >11 dB gain, >31 dB isolation | >96% predictive accuracy for gain |
| Haque et al. (2024) [32] | Slotted-ground THz MIMO antenna | Extra Trees Regression | 2.5 THz bandwidth, 96% efficiency, 14.59 dB gain | Highest gain-prediction accuracy among tested models |
| Rai et al. (2025) [33] | Graphene hexagonal frequency-tunable antenna | Random Forest | Achieved 99% accuracy for S11 prediction | High tunability and ML-enhanced parameter control |
| Koziel et al. (2024) [36] | ML surrogate-assisted antenna optimization | Surrogate ML models (dimensionality reduction) | Achieved drastic reduction in simulation time | Reduced computation by up to 90% |
Background
Optimizing complex THz antenna structures, whose performance metrics such as return loss, bandwidth, gain, and efficiency are influenced by high-dimensional variables, necessitates efficient feature extraction and predictive modeling. Unsupervised representation learning and advanced regression models are employed in hybrid methods to enhance prediction accuracy at reduced processing expense. To enable faster and more efficient THz antenna design, the Autoencoder-XGBoost model is employed in this research to identify meaningful latent features from simulation-generated data and provide sound predictions of antenna performance.
Autoencoder
An Autoencoder (AE) is an unsupervised neural network used for dimensionality reduction, feature learning, and representation learning. It contains two main components.
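ㆍEncoder - compresses the input $x \in \mathbb{R}^n$ into a lower-dimensional latent space $h \in \mathbb{R}^m$ (where $m < n$):
$$h = \sigma(Wx + b)$$
Here, $W$ and $b$ are trainable weights and biases, and $\sigma(\cdot)$ is a nonlinear activation function (e.g., ReLU, sigmoid, tanh).
ㆍDecoder - reconstructs the input from the latent representation:
$$\hat{x} = \sigma(W'h + b')$$
where $\hat{x}$ is the reconstructed approximation of the original input $x$, and $W'$ and $b'$ are the decoder weights and biases.
The training goal of an Autoencoder is to reduce the reconstruction error between the original input and its reconstruction, typically quantified by the Mean Squared Error (MSE) over the $N$ training samples:
$$L = \frac{1}{N}\sum_{i=1}^{N} \left\| x_i - \hat{x}_i \right\|^2$$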
Therefore, the Autoencoder learns a compressed latent representation that captures important structures in the data while removing redundancy and noise. In hybrid systems, like in THz antenna optimization, the learned latent features act as effective inputs for forecasting models (e.g., XGBoost), which enhances accuracy and computational savings.
XGBoost
XGBoost has been shown to have excellent prediction performance for classification and regression problems [41-59]. The XGBoost technique is based on the Gradient Boosting Decision Tree (GBDT) and supports parallel computing. The second-order Taylor expansion of the loss function improves computational accuracy, while the regularization term simplifies and speeds up the model. The block storage structure allows for parallel processing [42, 43, 44]. With one tree added per round, the model prediction at round $t$ can be expressed as
$$\hat{y}_i^{(t)} = \sum_{k=1}^{t} f_k(x_i) = \hat{y}_i^{(t-1)} + f_t(x_i)$$
where $t$ is the number of iterations, $f_t$ is the tree function at round $t$, and $\hat{y}_i^{(t-1)}$ is the prediction from the previous round. The objective function and regularization term are defined as
$$Obj^{(t)} = \sum_{i=1}^{N} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{k=1}^{t} \Omega(f_k), \qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \left\| w \right\|^2$$
where $l(y_i, \hat{y}_i^{(t)})$ is the loss function, $\gamma$ and $\lambda$ are regularization parameters to avoid overfitting, $T$ represents the number of leaf nodes, and $w$ denotes the leaf weights. This formulation ensures both accuracy and generalization, which is why XGBoost was adopted in the proposed Autoencoder-XGBoost hybrid model. The overall architecture of XGBoost is illustrated in Figure 1.
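To make the role of the regularization parameters concrete, the following is a minimal sketch of the closed-form leaf weight and split gain that follow from the second-order objective. It assumes a squared-error loss of the form ½(y − ŷ)², for which each sample's gradient is ŷ − y and its Hessian is 1; the sample values and the choices λ = 1 and γ = 0.1 are illustrative and not taken from this study.

```python
# Illustrative sketch: regularized leaf weight and split gain used in
# second-order gradient boosting. All numbers below are made-up examples.

def leaf_weight(grad_sum, hess_sum, lam):
    """Optimal leaf weight w* = -G / (H + lambda)."""
    return -grad_sum / (hess_sum + lam)


def leaf_score(grad_sum, hess_sum, lam):
    """Structure score G^2 / (H + lambda) of a single leaf."""
    return grad_sum ** 2 / (hess_sum + lam)


def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Objective reduction obtained by splitting one leaf into two."""
    parent = leaf_score(g_left + g_right, h_left + h_right, lam)
    children = leaf_score(g_left, h_left, lam) + leaf_score(g_right, h_right, lam)
    return 0.5 * (children - parent) - gamma


# Assumed loss: 1/2 * (y - y_hat)^2, so gradient = y_hat - y and Hessian = 1.
y_true = [1.0, 1.2, 3.0, 3.4]
y_pred = [2.0, 2.0, 2.0, 2.0]
grads = [p - t for p, t in zip(y_pred, y_true)]
hess = [1.0] * len(y_true)

# Candidate split: first two samples go left, last two go right.
g_l, h_l = sum(grads[:2]), sum(hess[:2])
g_r, h_r = sum(grads[2:]), sum(hess[2:])

w_left = leaf_weight(g_l, h_l, lam=1.0)
w_right = leaf_weight(g_r, h_r, lam=1.0)
gain = split_gain(g_l, h_l, g_r, h_r, lam=1.0, gamma=0.1)
```

A larger λ shrinks the leaf weights towards zero, while a larger γ raises the gain a split must reach before it is worth adding a leaf.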
Methodology
This paper presents a two-stage hybrid machine learning framework for predicting and optimizing a small, frequency-tunable, triple-band hexagonal graphene antenna for terahertz (THz) communication. The technique combines simulation-based data production in MATLAB with a sequential ML pipeline that includes feature extraction and performance prediction. Figure 2 shows the flowchart of the proposed work.
Data Generation
In the first stage of this study, a structured dataset is created in MATLAB by modelling a frequency-tunable, triple-band hexagonal graphene THz antenna. The antenna design is parametrically flexible, allowing for systematic variation of key input parameters such as feed width (Fw), patch radius (R), conducting layer thickness (T), slit width (W1), substrate permittivity (εr), graphene chemical potential (μc), applied bias voltage (Vb), and material or substrate type. The related antenna performance metrics are calculated for each design using MATLAB electromagnetic simulations. This research considers the following outputs: return loss (S11 in dB), resonant frequency (THz), impedance bandwidth (GHz), and peak gain (dBi). Although other characteristics such as VSWR and group delay may be retrieved, they were excluded to keep the dataset focused on the most important metrics for machine learning prediction and optimisation. The resultant dataset is organised in a tabular format, with each row representing a different antenna design and its performance results, as shown in Table 2.
Table 2.
Sample Dataset generated using MATLAB
| Parameter | Value |
| Fw (µm) | 4 |
| R (µm) | 6 |
| T (µm) | 2 |
| W₁ (µm) | 0.3 |
| εr | 2.2 |
| μc (eV) | 0.5 |
| Vb (V) | 0.3 |
| Material | Graphene |
| S11 (dB) | –22.3 |
| Resonant Frequency (THz) | 2.8 |
| Bandwidth (GHz) | 1.8 |
| Gain (dBi) | 5.6 |
Data Pre-processing
After being created in MATLAB, the dataset is exported in a structured format (CSV/Excel) and then preprocessed. This step is crucial to guarantee consistency across all input-output variables and to prepare the data for machine learning analysis.
The preprocessing procedure consists of the following steps:
ㆍData Cleaning: Checking for missing values and incorrect entries, and updating or eliminating them if needed.
ㆍNormalization/Scaling: To facilitate model convergence, continuous input parameters such as feed width, patch radius, chemical potential, and bias voltage are normalized to a common scale (for example, Min-Max scaling to [0, 1]).
ㆍEncoding categorical variables: Non-numeric entries such as Material (e.g., graphene, copper, gold) and Substrate type (e.g., polyamide, quartz) are transformed to numerical form using one-hot encoding, yielding binary feature columns.
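The scaling and encoding steps listed above can be sketched in a few lines of plain Python. This is an illustration only: the helper names `min_max_scale` and `one_hot` and the sample values are hypothetical, and the study itself may have used library routines such as those in scikit-learn.

```python
# Illustrative preprocessing helpers; names and sample values are hypothetical.

def min_max_scale(values):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Turn a categorical column into binary indicator columns."""
    categories = sorted(set(labels))
    encoded = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, encoded


feed_width_um = [2.0, 4.0, 6.0]                # example Fw values in micrometres
materials = ["graphene", "copper", "graphene"]  # example Material entries

scaled = min_max_scale(feed_width_um)
categories, encoded = one_hot(materials)
```

A common design choice is to compute the scaling limits on the training subset only and reuse them for the test subset, which keeps information about the test data from leaking into training.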
Feature Extraction
To improve learning efficiency, the hybrid framework’s initial step employs a feature extraction approach. An Artificial Neural Network (ANN) Autoencoder is used to compress the raw input parameters into a set of latent features. The autoencoder captures nonlinear relationships among design parameters and removes redundancy, resulting in enhanced features that more accurately depict the underlying design-performance correlations.
The encoder mapping formally defines the feature extraction:
$$h = \sigma(Wx + b)$$
Where,
ㆍ$x \in \mathbb{R}^n$ is the input vector that contains the antenna design parameters (Fw, R, T, W1, εr, μc, Vb, Material),
ㆍ$W$ and $b$ are the weights and biases of the encoder,
ㆍ$\sigma(\cdot)$ is a nonlinear activation function (e.g., ReLU, tanh),
ㆍ$h \in \mathbb{R}^m$ is the latent feature vector ($m < n$).
The latent feature vector $h$ is the compressed representation of the original input and functions as the extracted feature set. These features reduce redundancy in the input space and preserve the most significant design-performance relationships.
The decoder is employed during the training process to reconstruct the original input:
$$\hat{x} = \sigma(W'h + b')$$
where $W'$ and $b'$ are the decoder weights and biases, with the reconstruction loss over the $N$ training samples:
$$L = \frac{1}{N}\sum_{i=1}^{N} \left\| x_i - \hat{x}_i \right\|^2$$
This loss is minimised to ensure that the latent features retain the critical information from $x$. Only the encoder is retained after training, and, once the data has been split, the extracted features $h$ are forwarded to the XGBoost regressor for performance prediction.
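The encoder, decoder, and reconstruction loss described above can be illustrated with a small NumPy sketch. It is not the network used in this study: the tanh encoder, the linear decoder, the layer sizes, the learning rate, and the random placeholder data are all assumptions chosen to keep the example short, and the paper's implementation relies on TensorFlow/Keras.

```python
import numpy as np

# Minimal autoencoder sketch (assumed design: tanh encoder, linear decoder).
# Rows of X are samples, so the encoder is written as X @ W + b.


def train_autoencoder(X, latent_dim, epochs=500, lr=0.05, seed=0):
    """Train by full-batch gradient descent on the mean reconstruction error."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.normal(0.0, 0.1, (n_features, latent_dim))      # encoder weights
    b = np.zeros(latent_dim)                                 # encoder biases
    W_dec = rng.normal(0.0, 0.1, (latent_dim, n_features))  # decoder weights
    b_dec = np.zeros(n_features)                             # decoder biases
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W + b)        # latent features h
        X_hat = H @ W_dec + b_dec     # reconstruction of the input
        err = X_hat - X
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagate the mean squared reconstruction error.
        d_out = 2.0 * err / n_samples
        grad_W_dec = H.T @ d_out
        grad_b_dec = d_out.sum(axis=0)
        d_hidden = (d_out @ W_dec.T) * (1.0 - H ** 2)  # tanh derivative
        grad_W = X.T @ d_hidden
        grad_b = d_hidden.sum(axis=0)
        W -= lr * grad_W
        b -= lr * grad_b
        W_dec -= lr * grad_W_dec
        b_dec -= lr * grad_b_dec
    return (W, b), losses


def encode(X, encoder):
    """Apply only the trained encoder, as done after training."""
    W, b = encoder
    return np.tanh(X @ W + b)


# Placeholder data: 50 designs with 8 scaled input parameters (not real data).
X = np.random.default_rng(1).uniform(0.0, 1.0, (50, 8))
encoder, losses = train_autoencoder(X, latent_dim=3)
Z = encode(X, encoder)  # latent features that would be passed to the regressor
```

Only `encode` is needed once training has finished, which mirrors the step in which the decoder is discarded and the latent features are handed to the regressor.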
Train-Test Split
After feature extraction, the dataset is separated into training and testing subsets to assess the model’s generalisability. An 80:20 split is utilised, with 80% of the data used to train the Autoencoder-XGBoost pipeline and 20% set aside for independent testing.
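An 80:20 split of this kind can be sketched as follows. The function name, the fixed seed, and the placeholder rows are hypothetical; in practice a library routine such as scikit-learn's `train_test_split` is a common choice.

```python
import random

# Illustrative 80:20 hold-out split; the seed and data are placeholders.


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle row indices reproducibly and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


designs = list(range(100))  # stand-ins for 100 antenna design records
train_rows, test_rows = split_train_test(designs)
```

Fixing the seed makes the partition reproducible, so every model in a comparison is evaluated on the same held-out designs.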
Model Prediction using XGBoost
In the second step of the hybrid framework, the collected features from the Autoencoder are fed into an Extreme Gradient Boosting (XGBoost) regressor. XGBoost was chosen because of its capacity to effectively handle structured datasets, capture nonlinear interactions, and achieve high predicted accuracy via optimised training.
Formally, the prediction is based on a set of regression trees. The model output for sample $i$, with latent feature vector $h_i$, is as follows:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(h_i), \qquad f_k \in \mathcal{F}$$
Where,
ㆍ$\hat{y}_i$ is the predicted value of the antenna performance parameter (S11, resonant frequency, bandwidth, or gain),
ㆍ$f_k$ represents the $k$-th regression tree ($f_k \in \mathcal{F}$),
ㆍ$K$ is the total number of trees, and
ㆍ$\mathcal{F}$ is the space of all possible regression trees.
The objective function minimized by XGBoost is
$$Obj = \sum_{i=1}^{N} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f_k) = \gamma T + \frac{1}{2}\lambda \left\| w \right\|^2$$
Where,
ㆍ$l(y_i, \hat{y}_i)$ is the loss function (such as the Mean Squared Error),
ㆍ$\Omega(f_k)$ is the regularisation term,
ㆍ$T$ is the number of leaves in the tree,
ㆍ$w$ denotes the leaf weights,
ㆍ$\gamma$ and $\lambda$ are regularisation parameters that control the complexity of the model.
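The additive tree model above can be illustrated with a deliberately simplified booster. The sketch below is not the XGBoost library: it uses depth-one trees (stumps) on a single feature, a squared-error loss of the form ½(y − ŷ)², an L2 penalty λ on the leaf weights, and no γ pruning. The synthetic sine data, the function names, and the hyperparameter values are assumptions for illustration only.

```python
import math

# Simplified gradient boosting with regression stumps (not the XGBoost library).
# Assumes at least two distinct feature values so that a split always exists.


def fit_stump(xs, residuals, lam):
    """Pick the threshold with the best regularized score; return leaf weights."""
    best = None
    thresholds = sorted(set(xs))
    for lo, hi in zip(thresholds, thresholds[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        score = (sum(left) ** 2 / (len(left) + lam)
                 + sum(right) ** 2 / (len(right) + lam))
        if best is None or score > best[0]:
            # Leaf weight = sum of residuals / (count + lambda).
            best = (score, thr,
                    sum(left) / (len(left) + lam),
                    sum(right) / (len(right) + lam))
    return best[1], best[2], best[3]


def fit_boosted_stumps(xs, ys, n_trees=50, learning_rate=0.3, lam=1.0):
    """Add one stump per round, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, w_left, w_right = fit_stump(xs, residuals, lam)
        tree = (thr, learning_rate * w_left, learning_rate * w_right)
        trees.append(tree)
        preds = [p + (tree[1] if x <= thr else tree[2]) for x, p in zip(xs, preds)]
    return base, trees


def predict(model, x):
    """Sum the base value and the contribution of every stump."""
    base, trees = model
    return base + sum(wl if x <= thr else wr for thr, wl, wr in trees)


def mean_squared_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [i / 10 for i in range(20)]       # synthetic one-dimensional feature
ys = [math.sin(3 * x) for x in xs]     # synthetic target, not antenna data

model = fit_boosted_stumps(xs, ys)
baseline_mse = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mean_squared_error(ys, [predict(model, x) for x in xs])
```

Each round corrects the residuals left by the previous rounds, which is the additive structure expressed by the sum over trees; the actual library additionally grows deeper trees, applies γ-based pruning, and works on multi-dimensional inputs.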
Model Evaluation
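The proposed Autoencoder-XGBoost framework’s performance is assessed using validation techniques and statistical error measures. The assessment verifies that the model generalises well to unseen antenna designs in addition to achieving high accuracy. In the definitions below, $y_i$ is the actual (simulated) value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the actual values, and $N$ is the number of samples.
1) Mean Square Error (MSE)
ㆍCalculates the average squared difference between the predicted and actual values.
$$MSE = \frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$
2) Mean Absolute Error (MAE)
ㆍCalculates the mean absolute deviation of the predictions from the actual values.
$$MAE = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|$$
3) R2
ㆍShows the extent to which the variation in the actual data is explained by the predicted values.
$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$
4) Root Mean Square Error (RMSE)
ㆍCalculates the square root of the mean squared deviation between predicted and actual values.
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$$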
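The four measures can be computed directly from paired actual and predicted values, as in the sketch below. The function name and the four sample S11 values are hypothetical and serve only to show the arithmetic; they are not results from this study.

```python
import math

# Illustrative computation of MSE, RMSE, MAE and R2 on made-up values.


def regression_metrics(y_true, y_pred):
    """Return the four error measures for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical S11 values in dB for four designs.
actual = [-22.3, -18.0, -25.1, -15.4]
predicted = [-22.0, -18.5, -24.9, -15.0]
metrics = regression_metrics(actual, predicted)
```

Because RMSE is the square root of MSE, the two always rank models identically; RMSE is reported in the same unit as the predicted quantity, which makes it easier to interpret.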
Tools and Software
The research used MATLAB for antenna simulation and Python for machine learning implementation. Key libraries used include NumPy, Pandas, scikit-learn, TensorFlow/Keras, XGBoost, and Matplotlib.
Results and Discussion
This section covers the study’s experimental results. The findings demonstrate the performance of the suggested methodology in comparison to current methodologies, as supported by numerical evaluation metrics and graphical assessments. Table 3 shows the comparison of the proposed model with the existing model across various metrics.
Table 3.
Comparison of the proposed model with existing models across various metrics
| Models | MSE | RMSE | MAE | R2 |
| Proposed (Autoencoder + XGBoost) | 0.001 | 0.03 | 0.02 | 0.99 |
| Random Forest | 0.11767 | 0.27582 | 0.23571 | 0.84805 |
| SVR | 0.1105 | 0.26882 | 0.23212 | 0.85198 |
| MLP | 0.13455 | 0.30583 | 0.25765 | 0.77001 |
The comparison performance table demonstrates that the suggested Autoencoder + XGBoost hybrid model outperforms standard machine learning algorithms. The suggested model has the lowest error values (MSE = 0.001, RMSE = 0.03, MAE = 0.02) and an extraordinarily high R2 = 0.99, suggesting almost flawless prediction accuracy. The baseline models - Random Forest (R2 = 0.848, RMSE = 0.275), SVR (R2 = 0.852, RMSE = 0.269), and MLP (R2 = 0.770, RMSE = 0.306) had much larger errors and lower correlation with real data. This illustrates that the proposed hybrid framework not only reduces prediction errors but also delivers more dependable and generalisable performance across antenna design parameters than previous techniques. Figure 3 shows the Actual vs Predicted plots of the Proposed and baseline models (RF, SVR, MLP).
The proposed Autoencoder + XGBoost hybrid model outperforms the baseline techniques, as shown by the comparison plots: its predictions align almost perfectly with the actual values and closely follow the ideal prediction line (R2 = 0.9900, MSE = 0.0100). In contrast, the baseline models Random Forest (R2 = 0.9051, MSE = 0.2496), SVR (R2 = 0.9021, MSE = 0.2627), and MLP (R2 = 0.9158, MSE = 0.2721) exhibit much larger prediction errors, particularly at the dataset’s extremes, with more dispersed points departing from the ideal line. This demonstrates how the proposed hybrid framework outperforms current machine learning models in terms of generalisation and prediction accuracy.
Algorithm 1.
Hybrid Autoencoder–XGBoost Framework for THz Antenna Prediction
The Return Loss (S11) vs. Frequency plot in Figure 4 contrasts the performance of the baseline and proposed models in capturing antenna characteristics across the 1-10 THz range. Three separate resonant dips can be seen in the actual curve (black) below the -10 dB cutoff, indicating effective radiation at those frequencies. The proposed Autoencoder+XGBoost model (red dashed) tracks the depth and location of the resonant notches with the highest accuracy, matching the actual response. The Random Forest, SVR, and MLP baseline models, on the other hand, exhibit larger deviations, especially in the sharp resonant regions where variations are more pronounced. The proposed model’s greater ability to predict frequency-dependent antenna performance is shown by its consistency in matching the actual return loss profile.
The comparative plot of Gain vs. Frequency (Figure 5) illustrates how the different models approximate the actual antenna gain response in the 1-10 THz range. The proposed Autoencoder+XGBoost model (red dashed line) closely follows the three distinct resonant gain peaks observed in the actual curve (black), which suggests that it can preserve peak amplitudes as well as bandwidth patterns. The Random Forest, SVR, and MLP baseline models, however, show noticeable differences, such as deviations at the resonant peaks and less accurate tracking of the gain profile. This indicates that the hybrid architecture ensures a closer approximation to the actual antenna performance.
The error distribution plot (Figure 6) compares the prediction errors of all models and reflects each model’s ability to approximate the true antenna performance. The proposed Autoencoder+XGBoost model (red) shows a clear, narrow distribution centred at zero, indicating low error and minimal deviation. In comparison, the baseline models Random Forest (blue), SVR (green), and MLP (orange) have broader spreads and longer tails, indicating more variability and larger errors. This analysis demonstrates that the proposed hybrid framework produces the most stable and accurate predictions, with much lower error than traditional machine learning models.
The VSWR vs. Frequency plot in Figure 7 shows that the proposed triple-band hexagonal graphene THz antenna is efficiently impedance-matched across the operational range, with the VSWR dropping below 2 at 3.5, 6.2, and 9.8 THz. These resonance dips confirm the antenna’s triple-band capability and low reflection loss in the effective operating bands. Frequencies with VSWR values over 2, however, indicate mismatched regions with more reflection and decreased efficiency. The figure shows the antenna’s consistent multi-band performance and THz communication compatibility. Table 4 presents the k-fold validation comparison of the proposed model with the existing models.
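The -10 dB threshold mentioned above can be related to the fraction of reflected power and to the voltage standing wave ratio (VSWR) through standard transmission-line relations: the reflection magnitude is |Γ| = 10^(S11/20) and VSWR = (1 + |Γ|) / (1 − |Γ|). The following sketch applies these relations; the function names are illustrative, and the second value evaluated is the sample S11 entry from Table 2.

```python
# Standard conversions from return loss (S11 in dB) to reflection quantities.
# Valid for S11 < 0 dB; at 0 dB the VSWR is unbounded (total reflection).


def reflection_magnitude(s11_db):
    """Magnitude of the reflection coefficient, |Gamma| = 10^(S11/20)."""
    return 10 ** (s11_db / 20.0)


def reflected_power_fraction(s11_db):
    """Fraction of incident power that is reflected, |Gamma|^2."""
    return 10 ** (s11_db / 10.0)


def s11_db_to_vswr(s11_db):
    """VSWR = (1 + |Gamma|) / (1 - |Gamma|)."""
    gamma = reflection_magnitude(s11_db)
    return (1.0 + gamma) / (1.0 - gamma)


vswr_at_cutoff = s11_db_to_vswr(-10.0)   # the -10 dB matching threshold
vswr_sample = s11_db_to_vswr(-22.3)      # sample S11 value listed in Table 2
reflected_at_cutoff = reflected_power_fraction(-10.0)
```

At -10 dB about 10% of the incident power is reflected and the VSWR is roughly 1.92, so the -10 dB criterion for S11 and the commonly used VSWR < 2 criterion describe nearly the same matching condition.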
Table 4.
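Comparison of k-fold validation for the proposed model with existing models
| Models | MSE | MAE | R2 |
| Proposed (Autoencoder + XGBoost) | 0.0389 ± 0.0070 | 0.1595 ± 0.0088 | 0.9824 ± 0.0037 |
| Random Forest | 0.0328 ± 0.0032 | 0.1533 ± 0.0091 | 0.9852 ± 0.0021 |
| SVR | 0.0362 ± 0.0035 | 0.1594 ± 0.0069 | 0.9836 ± 0.0026 |
| MLP | 0.0749 ± 0.0081 | 0.2212 ± 0.0123 | 0.9660 ± 0.0052 |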
The k-fold cross-validation results give further information on the robustness of each model. The proposed Autoencoder+XGBoost exhibits very competitive performance with MSE = 0.0389 ± 0.0070, MAE = 0.1595 ± 0.0088, and R2 = 0.9824 ± 0.0037, demonstrating both accuracy and stability across folds. Random Forest has a slightly lower mean error (MSE = 0.0328 ± 0.0032, MAE = 0.1533 ± 0.0091) and a slightly higher R2 = 0.9852 ± 0.0021, but its predictions are less reliable at extreme points than those of the proposed hybrid model. SVR performs similarly (MSE = 0.0362 ± 0.0035, MAE = 0.1594 ± 0.0069, R2 = 0.9836 ± 0.0026), whereas MLP has much larger errors (MSE = 0.0749 ± 0.0081, MAE = 0.2212 ± 0.0123) and a lower R2 = 0.9660 ± 0.0052. Overall, the findings show that, although Random Forest performs well on average, the proposed Autoencoder+XGBoost offers a better-balanced trade-off between accuracy and generalisation, making it the most dependable option for predicting antenna performance.
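The "mean ± standard deviation" figures reported for k-fold validation come from evaluating a model once per fold and aggregating the per-fold scores. The sketch below shows how the folds and the summary can be formed; the number of folds (k = 5), the helper names, and the example scores are assumptions, since the source does not state the value of k or whether the sample or population standard deviation was used.

```python
import statistics

# Illustrative k-fold index generation and score aggregation.
# Assumes the samples have already been shuffled.


def k_fold_indices(n_samples, k):
    """Return (train_indices, test_indices) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def summarize(scores):
    """Mean and sample standard deviation of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


folds = k_fold_indices(n_samples=10, k=5)
example_r2_scores = [0.98, 0.99, 0.97]  # made-up per-fold scores
mean_r2, std_r2 = summarize(example_r2_scores)
```

Every sample appears in exactly one test fold, so each model is scored on all of the data without ever being tested on samples it was trained on within the same fold.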
Comparative study with Existing models
To assess the effectiveness of the proposed Autoencoder-XGBoost framework, its performance is compared with several established antenna optimization procedures reported in recent research. The comparison covers prediction accuracy, error metrics, and computational efficiency, and emphasizes how well each technique captures the key antenna parameters such as S11, resonant frequency, bandwidth, and gain. The results presented in Table 5 summarize the models’ relative strengths and identify the proposed method as the best-performing one for THz antenna optimization.
Table 5.
Comparison of the proposed model with the existing model
The performance evaluation in Table 5 compares the proposed Autoencoder-XGBoost antenna optimization model with three widely accepted existing methods: CNN-based predictors, GA-ML hybrid optimizers, and surrogate-assisted EM models. The proposed method not only reaches the highest prediction accuracy (R2 = 0.982) but also produces the lowest error values across all antenna metrics, namely S11, resonant frequency, bandwidth, and gain. Moreover, it exhibits a significant reduction in training time compared to evolutionary and surrogate-based techniques, which are regarded as computationally intensive. Overall, the findings identify the proposed model as the best choice for THz antenna performance optimization in terms of efficiency, robustness, and predictive reliability.
Discussion
The proposed Autoencoder + XGBoost hybrid framework outperforms current machine learning methods in terms of THz antenna design and optimisation. The proposed approach had the lowest prediction errors (MSE = 0.001, RMSE = 0.03, MAE = 0.02) and the highest accuracy (R2 = 0.99), outperforming traditional models like Random Forest, SVR, and MLP, which had lower correlation with simulation data and higher variability in predictions. The actual vs. projected graphs supported these findings, demonstrating that the suggested model nearly precisely fitted with the ideal regression line, while baseline models generated scattered deviations, especially at the dataset’s extreme values. Frequency- domain investigations revealed further insights, with the suggested framework precisely replicating the return loss dips and gain peaks of the real antenna response over the 1-10 THz range, proving its capacity to capture fine-tuned resonant behaviours as well as general performance trends. In contrast, baseline models failed to reliably monitor these important features, demonstrating their limits in effectively simulating complex, nonlinear antenna behaviours.
Error distribution plots further confirmed the advantage of the proposed strategy. The Autoencoder + XGBoost model has a narrow, symmetric error distribution centred on zero, showing its robustness and stability in avoiding large prediction errors. The baseline models have wider spreads and longer tails, meaning that their predictions are less reliable and more volatile. The VSWR analysis confirmed the physical feasibility of the antenna, with the proposed model showing good impedance matching at 3.5, 6.2, and 9.8 THz, which suits triple-band THz communication. In addition, the k-fold cross-validation results indicated that the proposed model performed well in all folds with little variance in the error measures, demonstrating good generalisability. Random Forest and SVR produced somewhat similar mean values but were not as stable or scalable as the hybrid system. Together, these results demonstrate that the Autoencoder + XGBoost pipeline is a well-balanced and effective method that combines the feature extraction ability of deep learning with the predictive ability of gradient boosting. Through this synergy, the model outperforms conventional methodologies in accuracy, reliability, and generalisability. These characteristics make it particularly suitable for shortening THz antenna design cycles, reducing dependence on time-consuming simulations, and providing accurate predictions for real-world applications in 6G communication systems and beyond.
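The impedance-matching statement can be related to the predicted return loss through the standard conversion from S11 to VSWR: the reflection coefficient magnitude is |Γ| = 10^(S11/20) for S11 in dB, and VSWR = (1 + |Γ|) / (1 - |Γ|). The sketch below illustrates this conversion in Python. The function names and the VSWR < 2 acceptance limit (a common rule of thumb, roughly equivalent to S11 below -10 dB) are assumptions for illustration and are not taken from this study.

```python
def s11_db_to_vswr(s11_db):
    """Convert a return-loss value S11 (in dB, negative) to VSWR.

    Hypothetical helper for illustration; not the authors' code.
    """
    if s11_db >= 0:
        raise ValueError("S11 in dB must be negative for a passive antenna")
    gamma = 10 ** (s11_db / 20.0)   # reflection coefficient magnitude
    return (1.0 + gamma) / (1.0 - gamma)


def is_matched(s11_db, vswr_limit=2.0):
    """Return True if the VSWR is below the assumed limit (default 2.0)."""
    return s11_db_to_vswr(s11_db) < vswr_limit


# Illustrative inputs only; these are not results from the paper.
vswr_at_minus_10_db = s11_db_to_vswr(-10.0)   # about 1.92
vswr_at_minus_20_db = s11_db_to_vswr(-20.0)   # about 1.22
```

Under this convention, a predicted S11 dip deeper than about -10 dB at a resonant frequency would be classed as well matched.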
Conclusion
This study proposes a two-stage hybrid machine learning framework that integrates MATLAB-based electromagnetic simulations with an Autoencoder-XGBoost pipeline to predict and optimize frequency-tunable, triple-band hexagonal graphene THz antennas for building-related sensing and communication applications. The proposed model outperforms standard machine learning methods such as Random Forest, SVR, and MLP, achieving very low error metrics (MSE = 0.001, RMSE = 0.03, MAE = 0.02) and a near-perfect R2 of 0.99, thus delivering highly accurate and reliable prediction outcomes. Comparative assessments of return loss, gain, and VSWR confirm that the framework effectively captures resonant characteristics and performance trends, while error distribution and k-fold validation further demonstrate robustness and generalizability. The novelty of this research lies in its combination of deep feature extraction via Autoencoders with high-performance gradient boosting regression, enabling accurate modeling of complex nonlinear interactions among antenna design parameters and performance metrics. Unlike conventional simulation-driven methods, which are computationally intensive and time-consuming, the proposed hybrid approach ensures computational efficiency and predictive reliability. By significantly accelerating antenna design cycles and reducing reliance on resource-heavy simulations, this work supports sustainable research and development practices. Importantly, the demonstrated framework highlights strong potential for smart and sustainable building environments, where THz antennas can be applied to structural health monitoring, indoor environmental sensing, occupant well-being, and energy-efficient communication. With its ability to balance accuracy, efficiency, and scalability, the Autoencoder-XGBoost framework represents a promising tool for integrating advanced wireless technologies into sustainable building systems and urban development strategies.
Future work will extend the framework to multi-objective optimization so that multiple design parameters and performance metrics can be predicted simultaneously. Integrating physics-informed and reinforcement learning models could improve adaptability and consistency in real-time settings. Additionally, validating the model with experimental data and applying it in digital twin platforms for smart buildings would strengthen its practical use and support scalability.









