I would suggest taking a look at RateLimiter. It may not behave as you expect, though, depending on how your application's requests are spread over time. From its documentation:

'It is important to note that the number of permits requested never affects the throttling of the request itself, but it affects the throttling of the next request. I.e., if an expensive task arrives at an idle RateLimiter, it will be granted immediately, but it is the next request that will experience extra throttling, thus paying for the cost of the expensive task.'
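
The behaviour described in the quote can be sketched in a few lines of Java. This is only a toy model to illustrate the "next request pays" idea, not the library's actual implementation (the wording of the quote matches the Javadoc of Guava's `RateLimiter`, which I assume is the class meant here). The class name `NextRequestPaysLimiter` and its `reserve` method are made up for this example, the current time is passed in explicitly so the result is deterministic, and bursts and stored permits are ignored.

```java
/** Hypothetical, simplified model of "the next request pays"; not a real library class. */
final class NextRequestPaysLimiter {
    private final long intervalMicros;  // time cost of a single permit
    private long nextFreeMicros = 0L;   // earliest time the next request may proceed

    NextRequestPaysLimiter(double permitsPerSecond) {
        if (permitsPerSecond <= 0.0) {
            throw new IllegalArgumentException("permitsPerSecond must be positive");
        }
        this.intervalMicros = (long) (1_000_000L / permitsPerSecond);
    }

    /**
     * Reserves the given number of permits for a request arriving at nowMicros
     * and returns how long (in microseconds) that request has to wait.
     */
    long reserve(int permits, long nowMicros) {
        if (permits <= 0) {
            throw new IllegalArgumentException("permits must be positive");
        }
        // The wait depends only on what earlier requests consumed,
        // never on the number of permits this request asks for.
        long grantedAt = Math.max(nowMicros, nextFreeMicros);
        long waitMicros = grantedAt - nowMicros;
        // The cost of this request is charged to whoever comes next.
        nextFreeMicros = grantedAt + (long) permits * intervalMicros;
        return waitMicros;
    }
}
```

With a rate of one permit per second, `reserve(5, 0L)` on an idle limiter returns 0 (the expensive request is granted immediately), while a following `reserve(1, 0L)` returns 5,000,000 microseconds: the cheap request waits for the five permits taken before it. If your requests arrive in this kind of uneven pattern, that delay on the following request is the part that may not match your expectations.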