Introduction
Related Work
Problem Statement and Proposed Work
Problem Statement
Proposed Architecture
Proposed Mathematical Model
Experimental Results
Dataset
Performance metrics
CTR Estimators
Comparison of Bidding Strategies
Different Algorithms: Budget Allocation
Conclusion
Introduction
In recent years, online advertising has seen extensive adoption, establishing itself as the leading sector within the advertising industry. In contrast to traditional media such as television, radio, newspapers, magazines, and billboards, online advertising offers advertisers an alternative means to diversify their strategies for reaching a broader audience through the Internet. Additionally, it enables the personalisation of advertisements for viewers in a real-time and cost-effective manner [1]. Real-Time Bidding (RTB) automates the auctioning process across a substantial volume of inventory from ad publishers and has markedly altered the online advertising marketplace. The auction for each impression in RTB typically completes in under 100 milliseconds, before the ad is placed.
The transition from traditional advertising methods to programmatic advertising significantly influences budget allocation in online advertising. RTB emerged to transform the bidding strategy for impressions and the allocation of budget from conventional methods. The first methods depended on static models that assumed market conditions remained constant and expenditure rates stayed fixed [2]. Although the models maintained a set structure they could not effectively adjust to market changes. The integration of machine learning and predictive analytics into budget distribution strategies allowed for the creation of dynamic pacing models over time. Balancing the spending rate with market demand continues to pose significant challenges. Business budget planning becomes complex due to user engagement dynamics together with market factors and competitive intensity which require businesses to adopt strategic and adaptive budget allocation methods to match prevailing market conditions [3].
The steps involved in Real-Time Bidding (RTB) for internet advertising are depicted in Figure 1. First, a bid request is created on the publisher’s website and sent to the Supply Side Platform (SSP). Demand Side Platforms (DSPs) receive the request from the Ad Exchange (ADX), together with user data from a Data Management Platform (DMP). The Generalised Second Price (GSP) mechanism is used to determine the winning offer in internal auctions. The user receives the winning ad at the end of the RTB procedure [4].
Once the winning ad is determined, it is forwarded to the SSP and displayed on the publisher’s page [5]. The whole process takes about 100 milliseconds. The resulting interactions and profits are monitored and used to improve the bidding policy. An RTB platform executes this request-bid-feedback cycle up to billions of times a day [6].
In the space between these elements, intermediaries such as real-time bidding (RTB) and demand-side platforms (DSP) play a crucial role, as advertisers rely on them to effectively distribute budgets across their advertising campaigns. This enables advertisers to formulate the optimal bidding strategy for each ad impression, aligned with the campaign objective for urban development. A DSP is crafted to assist advertisers in comprehending strategies to enhance ad clicks or reduce cost per click (CPC) in alignment with a budget [7]. Throughout the ad delivery time frame, competition and significant shifts in the marketplace can lead to a dynamic and fluid environment for the RTB marketplace. A bidding strategy grounded in historical data may not consistently ensure optimal budget utilisation within the remaining delivery time intervals. An important issue with bidding strategies is the inefficiency of financing within the budget and the variability of spending rates [8]. When an advertiser excessively spends their budget in a short timeframe, it indicates a lack of capacity to secure high-quality impressions in the future, as the budget will be depleted. Some individuals are overly cautious, failing to fully utilise the ad delivery budget before the designated time period concludes, resulting in missed click opportunities. For RTB advertisers, it is most effective to allocate their budgets throughout the entire ad delivery time frame, ensuring that spending is distributed evenly across the whole duration [9].
A variety of budget allocation algorithms have been proposed to address this issue. An advertisement delivery period is typically segmented into time slots, with each slot allocated a designated budget. During this period, expenditures in any slot must not exceed the assigned amount [10]. Budget allocations based on time and traffic exhibit specific constraints. The use of reinforcement learning (RL) frameworks to optimise bidding strategies while respecting budgetary constraints has been the subject of recent study [11]. In contrast to static optimisation algorithms, RL-based systems consider the future impact of bids on campaign profit and impression value, and they permit variable budget distribution over all impressions in the delivery period. Regrettably, RL approaches do not address all aspects of bid dynamics [12]. By carefully adjusting the spending rate for each time slot and applying static optimisation within that slot’s budget, optimal bidding functions for each time slot may be achieved.
Unfortunately, if the allocation is not sufficiently flexible, the real-time bidding (RTB) problem remains unresolved. The two major outcomes are early fund depletion and misallocated expenditure that leaves resources unspent. Because of poor ad delivery, advertisers do not receive an equitable return on their investment. Rigid, fixed bidding methods cannot alter bid prices even when costs vary, which makes the issue worse. In the long term, the optimisation of advertising campaigns suffers from the higher click-through prices and lower conversion rates brought on by the lack of data on consumers’ real-time engagement habits.
One of the most difficult problems in analysing data generated by RTB auctions is data management in urban areas. The volume of samples generated by the enormous number of bid requests makes real-time decision making computationally costly. The models now in use are inadequate for managing these datasets, and obstacles in the bidding procedure lead to suboptimal use of funds. This exposes the limitations of prediction models: the predicted CTR values are unreliable and cannot be used to infer optimal bids. Low engagement and budget misallocation result from the present methods’ inability to accurately recognise bid trends and user behaviour patterns. When click-through rates are assessed inaccurately, a tactic may appear cost-effective to advertisers, yet because impressions are costly they are likely to lose out on crucial engagements.
An economical budget allocation method for real-time bidding (RTB) online ad auctions is presented in this article. By using advanced reinforcement learning algorithms to dynamically update the bidding strategy, the method makes the most of the available budget. The two main elements of the proposed methodology are the budget allocation model and an adaptive bidding mechanism. The budget distribution model allocates the available budget across various time slots, reflecting the dynamic spending patterns throughout the entire ad delivery period. A hybrid learning model is developed for the adaptive bidding mechanism, determining optimal bid prices for each impression based on real-time market conditions. To the best of our knowledge, no other research connects budget allocation to user lifetime value while incorporating hybrid reinforcement learning into the framework. This study presents three primary contributions:
ㆍThis study presents the control problem within the RTB context as a dynamic optimisation challenge and introduces an intelligent algorithm for budget allocation across time slots that approaches optimality.
ㆍTo improve cost efficiency, we propose an adaptive bidding mechanism utilising a hybrid Transformer-FM model for click-through rate prediction. The model enhances bid price predictions in sequential bidding by incorporating feature interactions and sequential data processing.
ㆍWe develop a real-time optimisation model that adjusts bid prices based on market competition and user engagement patterns. A reinforcement learning method is employed to balance immediate gains against long-term budget efficiency.
This paper is structured as follows. Section II presents an overview of previous research concerning budget allocation and real-time bidding strategies. Section III delineates the problem formulation and constraints associated with budget optimisation. Additionally, it outlines the specifics of the proposed framework’s architecture and its algorithmic implementation. Section IV presents a comparison with existing budget management techniques and the results obtained from the experimental evaluation. In conclusion, we summarise our findings and outline potential directions for future research in Section V.
Related Work
This section presents a review of studies from 2021 to 2025, focussing on budget allocation, bid optimisation, and the application of artificial intelligence in real-time bidding (RTB). Prior research endeavours focus on the application of reinforcement learning, deep learning-based bid forecasting, and hybrid optimisation strategies. The proposed framework offers new insights by enhancing cost efficiency and advertising performance for users via adaptive learning techniques.
Tang and Yu [13] addressed the gap in the literature on trustworthy AIRTB. The report examines the core concerns of AIRTB stakeholders and identifies six essential trust attributes: robustness, explainability, fairness, auditability, accountability, and environmental well-being. Although the work is a review, it clearly identifies present development gaps along several dimensions, the most likely causes of trust failures, and the value of each attribute. Existing approaches that fulfil each trust dimension are thoroughly analysed.
Gupta et al. [14] assess the problems posed by RTB’s ever-changing data characteristics and the need to compute over a growing amount of data, with the aim of guaranteeing that participating agencies achieve a high return on investment. In RTB, the publisher (website owner) offers the advertising impressions on its website. Opening a webpage creates space for advertisements. When the ad space has not been assigned directly to an advertiser, the publisher’s ad server auctions the space in the open market. The publisher’s ad server connects to the Supply-Side Platform (SSP) to deliver the ad request with all auction information (web cookies, unique visitors, page views per visit, visiting sessions, etc.) to the Ad Exchange (AdX), which sends it to the Demand Side Platforms. AdX then receives the bids that the demand-side platforms place for the advertising space on behalf of advertisers. The highest bidder in the auction wins the advertising spot on the publisher’s website.
Tunuguntla et al. [15] created a learning algorithm that converges to the best strategy without regret while maintaining competitive equilibrium. For the online knapsack problem, this is the first algorithm with theoretical guarantees in the absence of item prices and values. It meets promotional goals while keeping within budget by adjusting its evaluation criteria. Experiments on 100 simulated and 10 real-world campaigns show that the system achieves 98% of the value attainable with complete foresight, outperforming the best alternative by 11%. Finally, the study demonstrates the algorithm’s improvement in impression value estimation and bidding rule learning.
Kao and Chueh [16] suggested unifying the supply and demand sides using an SSP and a DSP. The recommended solution incorporates gamified RTB and mobile location-based analysis. Gamification increases customer engagement by encouraging active participation in promotional content. Using the gamified RTB app increased restaurant coupon usage by 16.7% and apparel coupon usage by 11.6%. The authors present it as a marketing service that supports the digital transition of retail.
Christopher et al. [17] present a technique that lets marketers evaluate their advertising without using their DSP’s performance optimisers. It illustrates that external frequency limits, which cap ad impressions outside the bidding algorithms, can be useful in some cases. By eliminating performance optimisers, the advertiser gains independence from DSP support services, expands client reach, and reduces costs.
Nayyar et al. [18] propose models and algorithms for improving click-through rate (CTR), conversion rate, and return on ad expenditure. They show how machine learning can support dynamic pricing, audience segmentation, and the reduction of advertising waste. The study also examines the ethics of machine learning in advertising, including data privacy and algorithmic bias.
Ren et al. [19] provide a Deep Landscape Forecasting (DLF) model that handles censored data by combining deep learning for probability distribution forecasting with survival analysis. A recurrent neural network is first used to model the conditional winning probability for each bid price. The bid landscape is then derived through the probability chain rule. Finally, the model is trained by minimising two negative likelihood losses with comprehensive motivations. Unlike past attempts to fit the bid landscape to complicated distributions, their approach performs well without making any assumptions about the distribution structure.
Vora et al. [20] provide a two-step planning technique for budget-constrained multi-component Markov Decision Processes (MDPs). First, the components of the multi-component MDP are partitioned into groups according to the capacity constraint, with the partitioning cast as a Linear Sum Assignment Problem. The budget is then allocated to the groups in proportion to their size. This partitioning breaks the large multi-component MDP into subsystems with simple capacity constraints that are computationally tractable with existing solutions. Meta-trained PPO agents are used to produce near-optimal policies for each group.
Wu et al. [21] formulate budget-constrained bidding as a Markov Decision Process and solve the optimisation problem with model-free reinforcement learning. They observe that the immediate reward from the environment can be misleading under a critical resource constraint, and therefore design reward functions for reinforcement learning problems with constraints. Based on the new reward design, a deep neural network is used to learn the appropriate reward so that the best policy can be found. Unlike prior model-based techniques, this framework is scalable for industrial application.
Arbitrary Distribution Modelling (ADM) by Li et al. [22] uses a novel loss function, Neighbourhood Likelihood Loss (NLL), to forecast the winning price distribution under censorship without pre-assumptions about its form. Experiments were conducted both on a real-world dataset and in a large-scale, non-simulated production environment. In these studies, ADM beat baseline models on algorithmic and business criteria. Over about a year of operation, the approach reportedly increased system yield by tens of percent.
A further study [23] refined RTB by adding feedback loops that stabilise spending around the advertiser’s target. In volatile market situations, it yields trustworthy results. Chapelle [24] addressed the delayed conversion feedback problem, a crucial challenge for online budget adjustment. The model accounts for response timing, improving budget allocation accuracy.
Despite significant advancements in RTB and budget allocation, several issues remain inadequately addressed. A significant concern is the absence of trust and transparency in AI-based bidding systems regarding the decision-making process. This raises concerns regarding the justice, accountability, and explainability of the matter. Moreover, current models struggle to manage the complexities of large-scale and dynamic data environments, optimise budget utilisation, and achieve return on investment. Traditional bidding strategies are ineffective because they cannot adapt to real-time market fluctuations. Another limitation is the static, rigid, and unchangeable nature of budget allocation methods, which fail to adapt to user engagement and market demand. Models often exhibit overspending, depleting budgets prematurely or inefficiently, or underspending, resulting in a failure to capture high-value impressions. Furthermore, while optimisation techniques exist to improve bidding efficiency, many of these methods often exhibit limitations in scalability and computational efficiency when implemented in real-world RTB platforms. Improved models for predicting conversion rates and click-through rates are necessary to achieve more accurate forecasts, minimising the influence of misleading historical data patterns.
Problem Statement and Proposed Work
Problem Statement
The primary challenge of most budget allocation frameworks, as previously discussed, is the absence of real-time feedback mechanisms. Failure to implement ongoing adjustments in response to campaign performance may lead to budget inefficiencies and missed opportunities for optimal ad placement. The dynamic reallocation of expenditures during campaign development is hindered by the absence of performance-based adjustments to campaign expenditures.
Scheduling an advertising budget within RTB environments presents significant challenges. The core challenge involves maintaining the budget’s pacing throughout the campaign to enhance effectiveness. Budget control means regulating the spending rate so that the funds allocated to a period are fully spent by the end of that period, and not before. Ideally, pacing strategies should also help win valuable ad impressions and broaden campaign reach. Traditional approaches commonly allocate budget with fixed pacing models. However, these models are not suited to the fast-changing nature of real-time bidding. Without real-time budget reallocation, two specific inefficiencies arise: advertising opportunities are wasted, or resources are cut back before the campaign reaches its maximum performance.
Although RTB and budget allocation have made great progress, much remains to be improved. Because AI-based bidding systems rely on opaque decision-making processes, their lack of trust and transparency raises concerns about fairness, accountability, and explainability. Current models do not provide adequate means of scrutinising and managing large-scale, dynamic data environments, causing inefficiencies in both budget spending and the ROI generated. Traditional bidding strategies are ineffective because their algorithms typically do not adjust to real-time market fluctuations. Rigid, static budget allocation cannot adapt dynamically to user engagement or to a campaign’s evolving budget needs. Traditional models either overspend the budget or fail to allocate enough of it to high-value impressions. Optimization techniques improve bidding efficiency, but they can still suffer from scalability and computational limitations that make them impractical for real-time bidding. In addition, most models for predicting conversion and click-through rates depend heavily on historical data, which can introduce bias and inaccuracy and prevent effective budget allocation.
These obstacles demonstrate the need for a dynamic budget pacing solution that adapts automatically to market trends and performance metrics. To achieve expenditure efficiency and competitiveness, a framework combining reinforcement learning-based adaptive pacing with predictive analytics should be established. The ability to control spending rates dynamically according to market demand and engagement patterns allows advertisers to improve cost efficiency and to increase impression rates and ROI in RTB campaigns. In this study, we develop an intelligent budget allocation framework for long-term campaigns that balances the overall distribution of expenditure over the lifetime of the campaign while allowing adaptive decision-making within a sustainable and strategic approach to expenditure.
Proposed Architecture
To tackle this, the proposed framework uses adaptive learning to allocate budgets dynamically across time slots. Pacing strategies based on reinforcement learning help prevent early budget depletion and secure high-value impressions throughout the campaign. In contrast to traditional fixed or rule-based models, the proposed approach formulates a hybrid reinforcement learning mechanism that optimises bid prices online in real time. This offers the prospect of lower overall spending, lower cost per click, and higher click-through and conversion rates. The framework must also process large-scale RTB data efficiently in real time, which it does through distributed AI models that support rapid decision making and high system throughput. Sequential bidding patterns and user engagement behaviour are analysed using a hybrid Transformer-FM model for CTR prediction. This helps advertisers allocate their budget sensibly, directing funds to the most valuable ad impressions and thereby increasing the engagement rate. The proposed framework includes a real-time mechanism that monitors budget performance so that allocation strategies can be changed immediately. An adaptive feedback loop prevents spending on low-performing impressions and thus optimises the advertising budget against performance. This capability enables advertisers to optimise their budgets in real time, improving bidding efficiency and saving cost.
Figure 2 presents the architecture of the proposed hybrid model for budget allocation in RTB. It integrates a Transformer model with Factorization Machines (Transformer-FM) for click-through rate (CTR) prediction. The model processes real-time user behavior and bidding data to dynamically adjust bidding strategies through a reinforcement learning (RL) feedback loop. The framework supports adaptive budget pacing, optimized bid pricing, and improved campaign performance. The calculated dynamics of bid prices are ultimately employed to forecast CTR values. Within the adaptive reinforcement learning framework, the bidding strategy continually adjusts bid values over time based on real-time auction outcomes. Based on past bidding results and market swings, the model suggests the bidding strategies expected to boost participation while cutting expenses. Reinforcement learning ensures dynamic bid modifications, including real-time adjustments made in accordance with the competitive environment and advertiser objectives. The budget allocation process distributes the whole available budget over the designated time slots and ensures continued involvement throughout the campaign duration. Based on immediate performance feedback, resources are dynamically allocated to the time windows with the highest potential for engagement. The real-time feedback loop also relies on additional data such as budget usage, cost per click (CPC), and click-through rate (CTR). Where certain bidding strategies fall short of expectations, the system adjusts budget allocation and bidding limits to optimise both time and financial resources. This feedback-driven optimisation reduces the likelihood of ineffective spending and keeps budget utilisation aligned with market conditions. The proposed framework is set out step by step in Algorithm 1.
Algorithm 1.
Optimized Ad Bidding Strategy and Budget Management
Proposed Mathematical Model
The RTB budget allocation framework aims to optimize spending across multiple time slots while ensuring cost-effectiveness and high engagement. The following verification model mathematically justifies the approach.
Define the total advertising budget as $B$, divided across time slots $t = 1, \dots, T$.
The objective is to maximize click-through rate (CTR) while minimizing cost per click (CPC):
$$\max_{B_1, \dots, B_T} \; \sum_{t=1}^{T} \left( \mathrm{CTR}_t - \lambda \, \mathrm{CPC}_t \right)$$
Where:
- $\mathrm{CTR}_t$: Expected click-through rate at time slot t.
- $\mathrm{CPC}_t$: Cost per click at time slot t.
- 𝜆: Trade-off parameter for cost efficiency.
Subject to budget constraints:
$$\sum_{t=1}^{T} B_t \le B, \qquad B_t \ge 0$$
Where:
- $B_t$: Budget allocated to time slot t.
- $B$: Total available budget.
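a) Budget Allocation as an Optimization Problem
The proposed framework distributes the available budget dynamically while considering CTR and CPC variations over time. We formulate this as a convex optimization problem:
$$\mathcal{L}(B_1, \dots, B_T) = \sum_{t=1}^{T} \left( \mathrm{CTR}_t - \lambda \, \mathrm{CPC}_t \right)$$
Where:
- $\mathcal{L}$: Represents the Lagrangian function for budget optimization.
To ensure optimality, we use Lagrange multipliers to incorporate budget constraints:
$$\mathcal{L}(B_1, \dots, B_T, \mu) = \sum_{t=1}^{T} \left( \mathrm{CTR}_t - \lambda \, \mathrm{CPC}_t \right) - \mu \left( \sum_{t=1}^{T} B_t - B \right)$$
Where:
- 𝜇: Represents the Lagrange multiplier enforcing budget constraints.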
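The Lagrange-multiplier step can be illustrated with a small numerical sketch. The paper does not state how slot returns depend on the slot budget, so the concave return `w_t * log(1 + B_t)`, the function name `allocate_budget`, and the weights are assumptions chosen only to make the example concrete. Under that assumption the stationarity condition gives `B_t = max(0, w_t / mu - 1)`, and the multiplier `mu` is found by bisection so that the slot budgets sum to the total.

```python
def allocate_budget(weights, total_budget, iterations=200):
    """Split total_budget over slots to maximise sum_t w_t * log(1 + B_t).

    The log-shaped return is a hypothetical stand-in for the slot objective.
    """
    if total_budget <= 0 or max(weights) <= 0:
        return [0.0 for _ in weights], None

    def spend(mu):
        # Stationarity of the Lagrangian: w_t / (1 + B_t) = mu,
        # so B_t = max(0, w_t / mu - 1).
        return [max(0.0, w / mu - 1.0) for w in weights]

    # A larger multiplier means less spend, so bisect on mu.
    lo, hi = 1e-12, max(weights)  # spend(hi) is all zeros
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(spend(mid)) > total_budget:
            lo = mid   # multiplier too small: allocation exceeds the budget
        else:
            hi = mid   # allocation fits: try a smaller multiplier
    return spend(hi), hi


slot_budgets, multiplier = allocate_budget([1.0, 2.0, 3.0], 6.0)
print(slot_budgets, multiplier)
```

Slots with a higher weight receive more budget, and slots whose weight falls below the multiplier receive none, which is the usual water-filling behaviour of this kind of constraint.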
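b) Competitive Bidding Validation using Game Theory
Since multiple advertisers compete in RTB auctions, we validate the bidding approach using Nash equilibrium principles. Each advertiser $i$ selects an optimal bid such that:
$$b_i^{*} = \arg\max_{b_i} \; \pi_i(b_i, b_{-i})$$
Where:
- $b_i$: Bidding strategy of advertiser i, with $b_{-i}$ denoting the bids of all other advertisers.
- $\pi_i$: Expected profit for advertiser i.
The equilibrium condition ensures no bidder can increase their profit by deviating from $b_i^{*}$:
$$\pi_i(b_i^{*}, b_{-i}^{*}) \ge \pi_i(b_i, b_{-i}^{*}) \quad \text{for all } b_i$$
Where:
- $b_i^{*}$: Represents the equilibrium bid price.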
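The no-profitable-deviation condition can be checked numerically in the simplest setting. The sketch below assumes a single-slot sealed-bid second-price auction, which is what GSP reduces to when only one ad slot is sold. The function names, the tie rule, and the example values are illustrative assumptions rather than part of the paper's model.

```python
def second_price_profit(bid, value, competing_bids):
    """Profit of one bidder in a single-slot second-price auction."""
    highest_other = max(competing_bids)
    if bid > highest_other:
        # The winner pays the highest competing bid, not its own bid.
        return value - highest_other
    return 0.0  # losing (ties are resolved against this bidder)


def best_deviation_gain(value, competing_bids, candidate_bids):
    """Largest profit gain from replacing the truthful bid by any candidate."""
    truthful = second_price_profit(value, value, competing_bids)
    best = max(second_price_profit(b, value, competing_bids)
               for b in candidate_bids)
    return best - truthful


candidates = list(range(0, 301, 5))
print(best_deviation_gain(80.0, [50.0, 70.0], candidates))
print(best_deviation_gain(60.0, [50.0, 70.0], candidates))
```

A gain of zero in both cases means that no candidate bid does better than bidding the true value, whether the bidder would win or lose when bidding truthfully.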
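c) Probabilistic Verification of CTR Prediction
The proposed work predicts CTR dynamically based on past engagement patterns. We model CTR as a conditional probability distribution given the observed engagement data $X$:
$$P(\mathrm{CTR} \mid X) = \mathcal{N}(\mu, \sigma^2)$$
Where:
- 𝜇: Mean of the normal distribution.
- $\sigma^2$: Variance of the normal distribution.
Using Bayesian inference, the expected CTR can be estimated as:
$$\widehat{\mathrm{CTR}} = \mathbb{E}[\mathrm{CTR} \mid X] = \int \mathrm{CTR} \; P(\mathrm{CTR} \mid X) \, d\,\mathrm{CTR}$$
Where:
- $\widehat{\mathrm{CTR}}$: Predicted CTR based on Bayesian probability.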
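One way to make the Bayesian estimate concrete is the conjugate normal update, sketched below. It assumes a normal prior on CTR and normally distributed observations with a known noise variance. These modelling choices, the function name `posterior_ctr`, and the example numbers are assumptions for illustration, since the paper does not specify the prior or the likelihood.

```python
def posterior_ctr(prior_mean, prior_var, observed_ctrs, noise_var):
    """Posterior mean and variance of CTR under a normal-normal model."""
    n = len(observed_ctrs)
    if n == 0:
        return prior_mean, prior_var  # nothing observed: keep the prior
    sample_mean = sum(observed_ctrs) / n
    # Precisions (inverse variances) of prior and data add up.
    post_precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / post_precision
    # The posterior mean is a precision-weighted average of the prior mean
    # and the sample mean.
    post_mean = post_var * (prior_mean / prior_var
                            + n * sample_mean / noise_var)
    return post_mean, post_var


mean, var = posterior_ctr(0.001, 1e-6, [0.002, 0.002], 2e-6)
print(mean, var)
```

With few observations the estimate stays close to the prior, and it moves towards the observed CTR as more data arrive, which is the behaviour a dynamically updated predictor needs.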
d) Real-Time Budget Pacing Validation using Optimization Techniques
To ensure smooth spending throughout the campaign period, we apply optimization techniques. The drift-plus-penalty function governs the budget pacing process:
Where:
- : Lyapunov drift function ensuring stable budget pacing.
The optimal budget allocation at any time slot is determined dynamically by minimizing:
Where:
- : Dynamically allocated budget for time slot t.
The above mathematical formulations validate the proposed RTB budget allocation framework by ensuring theoretical optimality under real-time constraints, efficient competition in bidding, and adaptive spending behaviour.
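A drift-plus-penalty pacing rule can be sketched as follows. This is one common form of the technique and not necessarily the exact formulation used here: a virtual queue tracks overspend relative to an even per-slot target, and each slot picks the spend level minimising `queue * spend - v * value`. The square-root value model, the discrete spend options, the parameter `v`, and the function name `pace_budget` are all hypothetical.

```python
import math


def pace_budget(total_budget, slot_rates, spend_options, v=100.0):
    """Choose a spend level per slot with a drift-plus-penalty rule."""
    target = total_budget / len(slot_rates)  # even pacing target per slot
    queue = 0.0                              # virtual queue of overspend
    remaining = total_budget
    plan = []
    for rate in slot_rates:
        feasible = [s for s in spend_options if s <= remaining]
        # Drift term: queue * spend penalises spending while over target.
        # Penalty term: -v * value rewards the (assumed) concave slot value.
        spend = min(feasible,
                    key=lambda s: queue * s - v * rate * math.sqrt(s))
        plan.append(spend)
        remaining -= spend
        queue = max(queue + spend - target, 0.0)
    return plan


print(pace_budget(100.0, [1.0, 1.0, 1.0, 1.0], [0, 10, 20, 30, 40]))
```

A larger `v` favours value over smoothness, while a smaller `v` keeps spending closer to the even target. The spend options should include 0 so that a feasible choice always exists.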
Experimental Results
To obtain experimental evidence for the efficacy of our proposed budget allocation framework for real-time bidding (RTB), we conduct comprehensive experiments using the publicly available iPinYou dataset. The primary objectives of this evaluation are to determine the impact of our approach on cost efficiency, click-through rate (CTR), and budget utilisation. First, we evaluate our method against several baseline strategies that differ in their adaptivity levels, including fixed bidding, linear bidding, non-linear bidding, and various reinforcement learning-based bidding strategies. Second, we examine the performance of different budget allocation algorithms and their adaptation to market conditions while modifying ad expenditure in the Bootstrap campaign. The evaluation metrics include CTR, CPC, CPM, total clicks, and budget smoothness, so that both bidding efficiency and financial sustainability are covered. The following subsections describe the dataset, the experimental setup, the model evaluation, and the comparative analysis.
Dataset
A dataset has been released by iPinYou, a leading entity in the online advertisement industry and a specialist in demand side platform solutions, which includes impressions, bids, clicks, and final conversions. User-related information is included in each bid record for the impression. This information can be utilised to develop a more accurate estimator for click-through rates (CTR). The data on advertiser impressions and clicks offers valuable insights into auction outcomes, market pricing, user interactions (such as clicks), and bid amounts. The data utilised in this study was gathered over a seven-day period from June 6, 2013, to June 12, 2013. The initial six days of data constitute the training set, while the data from the final day serves as the test set. During the test phase, only winning impressions are evaluated; losing impressions are excluded from market pricing and user feedback considerations. The recorded impressions in these experiments differ significantly from those in the iPinYou dataset. Owing to space constraints, and because budget management is organised by campaign within the current framework, this analysis focusses on the statistical and experimental results of a single campaign (Advertiser ID: 1460). This approach aims to achieve consistent results across various campaigns. Table 1 presents a comprehensive overview of the campaign data.
Table 1.
The Statistic of the Campaign (Advertiser Id = 1460)
Performance metrics
i) Area Under the Curve (AUC): AUC (Area Under the Receiver Operating Characteristic Curve) quantifies a model’s ability to distinguish between positive and negative classes. It measures ranking quality, that is, how reliably the model assigns higher predicted probabilities to clicked ads than to non-clicked ads.
ii) Click-Through Rate (CTR): The click-through rate (CTR) of an advertisement is the proportion of users who clicked on the ad following its exposure. It is a key metric for assessing ad engagement and effectiveness.
iii) Cost-Per-Click (CPC): CPC refers to the average expenditure incurred by an advertiser for each click on their advertisement. It also aids in determining the efficiency of advertising expenditure.
iv) Cost-Per-Mille (CPM): CPM denotes the cost of one thousand ad impressions. This metric is frequently utilised in display advertising to assess the expense associated with reaching a broad audience.
v) Total Clicks: The total clicks refer to the aggregate number of clicks made by users on the advertisement. This serves as an indicator of user interest and interaction with the advertisement.
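The metrics above follow directly from their definitions, as the following minimal sketch shows. The function names and sample numbers are illustrative, and AUC is computed by direct pairwise comparison, which is adequate for small samples but slower than rank-based implementations on large logs.

```python
def ctr(clicks, impressions):
    """Click-through rate: clicks per impression."""
    return clicks / impressions if impressions else 0.0


def cpc(cost, clicks):
    """Average cost per click."""
    return cost / clicks if clicks else float("inf")


def cpm(cost, impressions):
    """Cost per thousand impressions."""
    return 1000.0 * cost / impressions if impressions else 0.0


def auc(labels, scores):
    """Probability that a clicked ad is scored above a non-clicked ad."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one click and one non-click")
    # Count correctly ordered pairs; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(ctr(5, 1000), cpc(20.0, 5), cpm(20.0, 1000))
print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.5]))
```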
CTR Estimators
The experiments conducted and their corresponding results are presented in Table 2. Among the five baseline models, the FFM model demonstrates the highest accuracy in predicting CTR, as indicated by the AUC metric. With the FFM model, the click-through rate (CTR) is predicted separately for each impression.
Table 2.
Performance Comparison of CTR Estimators
| Model | LR | FM | FFM | FNN | PNN | Proposed |
|---|---|---|---|---|---|---|
| AUC | 0.8102 | 0.8116 | 0.8732 | 0.8127 | 0.8169 | 0.8596 |
| LogLoss (×10⁻³) | 4.4368 | 4.4152 | 5.4521 | 4.5036 | 4.5684 | 4.9423 |
While the FFM model achieves a higher AUC (0.8732), it was observed to exhibit higher variance and overfitting on unseen test data, particularly in dynamic bidding environments. The proposed hybrid Transformer-FM model (AUC 0.8596) provides a better balance between generalization and computational efficiency during real-time bidding cycles. Additionally, the Transformer component captures sequential user behavior, which is not modeled in FFM, making the hybrid approach more suitable for adaptive budget pacing in dynamic RTB scenarios.
Comparison of Bidding Strategies
The analysis of the bidding strategy encompasses four distinct methods: (1) Fixed bidding, (2) Linear bidding, (3) Non-linear bidding, and (4) RL-based bidding. The analysis encompasses various performance metrics, including the number of clicks, click-through rate (CTR), and cost per click (CPC). Price variations range from 0 to 300. Consequently, a selection of fixed bid prices was made within the range of 5 to 300.
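The text does not give the exact bid functions, so the following sketch uses forms that are common in the RTB literature and should be read as assumptions: a linear bid proportional to the predicted CTR, and a concave non-linear bid of the form sqrt(c/λ · pCTR + c²) − c. The parameter names (`base_bid`, `avg_pctr`, `c`, `lam`) are hypothetical; the cap of 300 follows the price range stated above.

```python
import math

MAX_BID = 300.0  # upper end of the price range stated in the text


def fixed_bid(price: float) -> float:
    """Bid the same price for every impression."""
    return min(MAX_BID, price)


def linear_bid(pctr: float, base_bid: float, avg_pctr: float) -> float:
    """Bid in proportion to predicted CTR relative to the average CTR."""
    return min(MAX_BID, base_bid * pctr / avg_pctr)


def nonlinear_bid(pctr: float, c: float, lam: float) -> float:
    """Concave bid: grows quickly for low pCTR and flattens for high pCTR.
    c and lam are tuning constants (assumed, not taken from the paper)."""
    return min(MAX_BID, math.sqrt(c / lam * pctr + c * c) - c)
```

With `base_bid=75` and `avg_pctr=0.001`, an impression with twice the average pCTR receives a linear bid of about 150, while very high pCTR values are capped at 300.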
The fixed-bid experiments suggest that 75 is the best fixed price point initially. The experiment with a fixed bid of 75 shows a significant correlation between bid amount and participant engagement: higher fixed bids are associated with higher participation rates. Outcomes also varied with demographic factors, and the underlying causes of these discrepancies require further exploration. Table 3 displays the findings, and Figure 3 depicts the distribution of key metrics over different time intervals.
The results show that RL-based bidding achieves the greatest number of clicks while reducing cost per click. Bid selection is treated as a dynamic, interactive process in which the bid price for an impression depends on both immediate and future rewards, which illustrates how reinforcement learning improves optimisation in dynamic environments. Among the strategies other than RL-based bidding, the proposed bidding method performs best. Linear and non-linear bidding perform close to the fixed bid of 75, which is the least favourable.
On the testing day, the budget covered only 16.5% of the total cost of all impressions, highlighting a substantial deviation between the projected bidding plan (153) and the actual result (356). Furthermore, all techniques except RL-based bidding exhausted their budget prematurely because of insufficient budget management. RL-based bidding, which submits a bid for each impression during the delivery process, produces the best results. Table 4 lists the reasons for lost clicks under the four bidding strategies other than RL-based bidding.
The data suggest that early budget depletion is responsible for the majority of lost clicks. The non-linear bidding approach stopped after time slot 6, while the linear, fixed (75), and proposed methods continued until time slot 7. As a result, all click-yielding impressions in later time slots were lost.
Table 3.
Various Budgeting Strategies Comparison
Table 4.
Reasons for Missed Clicks (R1: Budget Fully Used; R2: Bid Price Below Market Price)
| Reason | Fixed (75) | Linear | Non-Linear | Proposed |
| R1 | 237 | 268 | 248 | 240 |
| R2 | 43 | 3 | 13 | 11 |
| Total | 280 | 271 | 261 | 251 |
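The R1/R2 breakdown in Table 4 can be reproduced in principle by replaying a bidding log against a strategy. The sketch below is an illustration under stated assumptions, not the authors' evaluation code: the `Impression` record and `replay` function are hypothetical, the winner is assumed to pay the logged market price, and a click counts as missed under R1 only when the bid would have won but the remaining budget could not cover the price.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Impression:
    pctr: float          # predicted click-through rate
    market_price: float  # winning price observed in the log
    clicked: bool        # whether the logged impression led to a click


def replay(impressions: Iterable[Impression],
           bid_fn: Callable[[Impression], float],
           budget: float) -> Dict[str, float]:
    """Replay a log and count won clicks and missed clicks by reason."""
    spent, clicks = 0.0, 0
    missed = {"R1": 0, "R2": 0}
    for imp in impressions:
        bid = bid_fn(imp)
        affordable = (budget - spent) >= imp.market_price
        if bid >= imp.market_price and affordable:
            spent += imp.market_price  # assumed: pay the market price
            clicks += int(imp.clicked)
        elif imp.clicked:
            # R2: bid too low to win; R1: bid high enough but budget used up
            missed["R2" if bid < imp.market_price else "R1"] += 1
    return {"clicks": clicks, "spent": spent, **missed}
```

Passing `bid_fn=lambda imp: 75` replays a fixed bid of 75; a pCTR-dependent function can be passed in the same way.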
Different Algorithms: Budget Allocation
i) Performance Comparison
Table 5 presents the effectiveness of six algorithms in finding the highest-value solution under a daily budget constraint of 5,000,000. The results indicate that the proposed method achieved the highest number of clicks (269) with a balanced CPC, demonstrating greater efficiency than the other methods. By comparison, the NBA strategy exhibits a notably high CPC of 49,504 and yields only 101 clicks, which highlights inefficiencies in its budget allocation. The effectiveness of NBA and RLB depends on how well the budget is converted into clicks. The budget-managed schemes (UA, UAA, TAA, OAA) optimise budget usage, resulting in more clicks at an improved CPC. Unlike NBA and RLB, which do not specify a pCTR constraint or a cost distribution, all other methods are spending schemes. A piecewise bidding strategy improves bidding outcomes by also targeting less likely impressions, which may still yield clicks, albeit with a lower probability. While managed-budget approaches do not guarantee superior click performance, they offer consistent spending patterns, which makes analysis and targeted intervention easier. In UA and UAA, current spending patterns leave the budget nearly depleted after minor budget adjustments. In contrast, the budget allocation methods TAA and OAA offer a better cost distribution than CAA and leave room for advantageous bidding later on.
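As a quick check on the NBA figures quoted above, the CPC of 49,504 is what one obtains if the full budget of 5,000,000 is spent on 101 clicks. The assumption that NBA spends its whole budget is ours, and the helper name `cost_per_click` is illustrative.

```python
def cost_per_click(spend: float, clicks: int) -> float:
    """Average cost per click; infinite when no clicks were won."""
    return spend / clicks if clicks else float("inf")


# Assumed: NBA exhausts the full daily budget of 5,000,000.
nba_cpc = cost_per_click(5_000_000, 101)
print(int(nba_cpc))  # 49504
```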
Table 5.
Comparison of BA Algorithms under a 5,000,000 Budget
Table 5 and Figure 4 illustrate that the proposed method demonstrates an improvement compared to RLB, TAA, and NBA; however, the performance of TAA and NBA is relatively comparable to that of the UA and UAA methods.
Table 6 presents the number of missed clicks classified by reason, to analyse the effectiveness of the bidding strategies. The NBA strategy depleted its budget, resulting in 242 missed clicks, whereas the proposed method missed 240 clicks because its bid prices were below the market price. In the budget-managed schemes (UA, UAA, TAA, and OAA), clicks are instead missed mainly because bids fall below the pCTR threshold. A key finding is that pCTR thresholds strongly influence budget management strategies, leaving portions of the budget unused even when the total available budget constitutes only 16.5 percent of the total cost of all impressions.
Table 6.
Number of Missed Clicks for Different Reasons (Budget = 5,000,000; R1: Budget Depleted, R2: Bid Below Market Price, R3: Bid Below Threshold)
| Reasons | NBA | UA | UAA | TAA | OAA | Proposed |
| R1 | 240 | 0 | 0 | 0 | 0 | 0 |
| R2 | 16 | 20 | 20 | 18 | 17 | 240 |
| R3 | 0 | 110 | 95 | 92 | 105 | 11 |
| Total | 256 | 130 | 115 | 110 | 122 | 251 |
The alternative method for calculating the 7th-day threshold averages the market price of high-quality impressions from the 6th day, which results in an elevated figure. The pCTR thresholds set by the implemented bidding strategies are excessively high and may cause additional clicks to be lost. This indicates the need to integrate performance metrics into budget allocation models so that cost efficiency, click rates, and bidding effectiveness improve as conditions change.
ii) Performance Under Different Budgets
This experiment assesses the impact of budget management strategy parameters on budget efficiency. The daily budget of 10,000,000 represents 33% of the total cost of all impressions in the test set. Results are displayed in Table 7, while Figure 5 illustrates daily fluctuations of the key metrics. The number of clicks increases significantly for all strategies as the budget increases; NBA records a 56.74% increase in clicks. Among all methods, OAA achieves the highest click-through rate; however, its cost per click (CPC) and overall expenditure significantly exceed those of UAA and TAA. OAA allocates a larger budget to various time slots than the other two, which results in lower pCTR thresholds. OAA therefore acquires a broader set of lower-value impressions than UAA and TAA. OAA’s CPC is 65.24% higher than under the budget of 5,000,000. The pCTR threshold for each time slot decreases as the budget increases, so the piecewise strategy must buy additional impressions at lower pCTRs in each time slot. In conclusion, detailed budget allocation and piecewise bidding remain effective strategies, irrespective of fluctuations in daily budgets.
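The relationship between slot budget and pCTR threshold described above can be illustrated with a small sketch. It assumes, hypothetically, that the threshold for a time slot is chosen from a historical log by buying impressions in descending pCTR order until the slot budget runs out; the function name `pctr_threshold` and this selection rule are not taken from the paper.

```python
from typing import Sequence, Tuple


def pctr_threshold(history: Sequence[Tuple[float, float]],
                   slot_budget: float) -> float:
    """history holds (pctr, market_price) pairs from an earlier time slot.
    Returns the lowest pCTR still affordable when impressions are bought
    in descending pCTR order until the slot budget is used up."""
    spent = 0.0
    threshold = float("inf")  # buy nothing if even the best is unaffordable
    for pctr, price in sorted(history, key=lambda h: h[0], reverse=True):
        if spent + price > slot_budget:
            break
        spent += price
        threshold = pctr
    return threshold


history = [(0.004, 80.0), (0.003, 60.0), (0.002, 50.0), (0.001, 40.0)]
print(pctr_threshold(history, 150.0))  # 0.003
print(pctr_threshold(history, 300.0))  # 0.001
```

Doubling the slot budget in this toy log lowers the threshold from 0.003 to 0.001, which matches the qualitative behaviour reported for the larger daily budget.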
Table 7.
Comparison of BA Algorithms under a 10,000,000 Budget
Conclusion
This study’s primary contribution is a cost-effective and sustainable budget allocation framework for Real-Time Bidding in online advertising. The framework integrates reinforcement learning with a hybrid Transformer-FM model to set bid prices and pace budgets. It can dynamically adjust spending in an efficient and responsible manner, based on real-time user engagement and market conditions for urban development, thereby reducing advertising waste. Experimental results on the iPinYou dataset demonstrate that our approach improves CTR by 11.2%, decreases CPA by 24.3%, and increases total clicks by 17.5% compared to existing methods, balancing economic and social sustainability considerations.
Further work is required to improve the scalability of Finding Bidders so that it can be applied to large advertising campaigns across extensive RTB platforms. A significant challenge for AI-driven bidding systems is ensuring fairness and transparency, which are crucial for preventing market bias and ensuring an equitable distribution of advertisements. Future investigations could focus on explaining AI models to clarify how advertisers receive budget approvals and how automated decision-making processes function. This study centres on creating socially responsible, ethical, and effective digital marketing strategies that adapt to the evolving economic landscape and promote sustainable advertising practices. Future research may also use user sentiment analysis to optimise bidding tactics based on real-time customer sentiment and feedback. Furthermore, Explainable AI (XAI) would enable users to understand the model’s judgement.







