The Problem: When Middleware Finishes But Response is Still Streaming

The Rack specification defines a response as a triplet [status, headers, body], but in the real world, significant time can pass between returning this triplet and the actual delivery of the last byte to the client. This is particularly critical when streaming large files or Server-Sent Events.

The Real Problem

Middleware can free resources (close DB connections, clear caches) long before the client receives the complete response. This leads to inaccurate metrics, premature log cleanup, and potential race conditions.

Historically, this problem was addressed using Rack::BodyProxy — a wrapper around the body object that allowed middleware to register callbacks for response closure. However, this solution introduced new problems.
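To see why the timing matters, here is a minimal BodyProxy-style wrapper (an illustrative stand-in, not the real Rack::BodyProxy): the cleanup block fires only when the server calls #close, which happens some time after the middleware stack has already returned.

```ruby
# Minimal stand-in for Rack::BodyProxy, for illustration only.
class MiniBodyProxy
  def initialize(body, &callback)
    @body = body
    @callback = callback
  end

  def each(&block)
    @body.each(&block)
  end

  def close
    @body.close if @body.respond_to?(:close)
    @callback.call
  end
end

events = []
body = MiniBodyProxy.new(["chunk-1", "chunk-2"]) { events << :closed }

# By this point the middleware stack has returned its triplet,
# but the client has not received a single byte yet.
body.each { |chunk| events << chunk }  # the server streams the body...
body.close                             # ...and only then runs the callback

events  # => ["chunk-1", "chunk-2", :closed]
```

The cleanup ordering depends entirely on when the server decides to call #close, which is exactly the uncertainty described below.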

Architectural Flaws of Rack::BodyProxy

Proxy Object Chains

Each middleware that needed a completion callback would create its own BodyProxy. In a typical Rails application, this could result in chains of 5-10 nested proxies:

```ruby
# Typical chain in Rails applications
original_body = ["Hello World"]
body = Rack::BodyProxy.new(original_body) { logger.info "Request finished" }
body = Rack::BodyProxy.new(body) { metrics.record_latency }
body = Rack::BodyProxy.new(body) { cleanup_thread_locals }
body = Rack::BodyProxy.new(body) { close_db_connections }
# ... and so on
```

Timing Uncertainty

The #close method was called by the server, but the specification didn't guarantee this happened exactly after all data was sent to the client. Depending on the server and buffering settings, callbacks could fire:

  • Before data transmission begins to the client
  • During transmission (in parallel)
  • After transmission but before acknowledgment
  • After complete connection termination

Exception Handling Issues

If an exception occurred during body iteration, callbacks in BodyProxy might not execute at all, leading to resource leaks:

```ruby
class ProblematicBody
  def each
    yield "Part 1"
    raise "Something went wrong"  # BodyProxy#close might not be called
    yield "Part 2"
  end
end
```

Memory and Performance Impact Analysis

Additional Allocations

Each BodyProxy is an additional object in memory. On high-traffic sites with tens of thousands of requests per second, even small additional allocations create GC pressure.

More critically, these objects can live longer than usual due to closures in callbacks, complicating generational GC work and potentially promoting objects to older generations.

Method Dispatch Overhead

Every body object method (especially #each) had to pass through the proxy chain. For large streams, this added measurable overhead:

```ruby
# Each call goes through the entire chain
def each(&block)
  @body.each(&block)  # Delegate to next in chain
ensure
  @callback.call      # Then execute callback
end
```

Measurable Effect

According to reports in the Rack issue tracker, some applications saw lower object counts per request and improved tail latency after switching to response_finished, thanks to more predictable GC behavior.

rack.response_finished: Design of the New Solution

Design Philosophy

The new mechanism follows "convention over configuration" principles. Instead of multiple proxy objects, it uses one standard key in the env hash — "rack.response_finished" — an array of callbacks that supporting servers place in the env.

```ruby
# Middleware registers a callback when the server provides the array
def call(env)
  if (callbacks = env["rack.response_finished"])
    callbacks << lambda do |env, status, headers, error|
      # Code executes AFTER complete response delivery
      cleanup_resources
      log_metrics(status, headers)
    end
  end

  @app.call(env)
end
```

Execution Guarantees

Unlike BodyProxy#close, the new callbacks are guaranteed to be invoked by servers that implement the extension, in three cases:

  1. Successful completion: after all data is sent to client
  2. Application exception: even if body never started iterating
  3. Server exception: during network or I/O problems
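A server that supports the extension might drive these paths roughly as follows (a hypothetical sketch of server internals, not any real server's code):

```ruby
# Hypothetical server loop: callbacks run in success AND failure paths.
def run_request(app, env)
  env["rack.response_finished"] = []
  status = headers = error = nil
  begin
    status, headers, body = app.call(env)
    body.each { |chunk| }  # pretend to write each chunk to the socket
    body.close if body.respond_to?(:close)
  rescue => e
    error = e
  ensure
    # The ensure block guarantees callbacks fire even when the app raised.
    env["rack.response_finished"].each do |cb|
      cb.call(env, status, headers, error)
    end
  end
end

calls = []
ok_app = lambda do |env|
  env["rack.response_finished"] << ->(_e, s, _h, err) { calls << [s, err] }
  [200, {}, ["ok"]]
end
boom_app = lambda do |env|
  env["rack.response_finished"] << ->(_e, s, _h, err) { calls << [s, err&.message] }
  raise "boom"
end

run_request(ok_app, {})
run_request(boom_app, {})
calls  # => [[200, nil], [nil, "boom"]]
```

The second invocation shows the key difference from BodyProxy: the callback still sees the error even though the body was never iterated.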

Key Improvement

Callbacks execute even during exceptions, solving the resource leak problem characteristic of BodyProxy.

Implementation Details and API Contract

Callback Signature

Each callback receives four parameters: (env, status, headers, error). The logic for populating them depends on the completion scenario:

```ruby
# Successful completion
callback.call(env, 200, {"content-type" => "text/html"}, nil)

# Application exception
callback.call(env, nil, nil, StandardError.new("App error"))

# Server exception (e.g., connection reset)
callback.call(env, 500, {}, IOError.new("Connection reset"))
```

Error Handling in Callbacks

Exceptions in one callback shouldn't affect execution of others. Servers typically log such errors but don't interrupt processing of remaining callbacks:

```ruby
# Example of safe callback execution in the server
callbacks.each do |callback|
  begin
    callback.call(env, status, headers, error)
  rescue => callback_error
    logger.error "Response finished callback failed: #{callback_error}"
  end
end
```

Thread Safety

Callbacks execute in the same thread as the main request. This simplifies work with thread-local variables but requires caution with blocking operations:

```ruby
# Good pattern: fast cleanup
callbacks << lambda do |env, status, headers, error|
  Thread.current[:request_id] = nil
  ActiveRecord::Base.clear_active_connections!
  Rails.cache.clear if Rails.env.test?
end

# Bad pattern: slow I/O operations
callbacks << lambda do |env, status, headers, error|
  # Don't do this in callbacks!
  send_email_notification(status)  # Can take seconds
  upload_logs_to_s3(env[:logs])    # Blocking network operation
end
```

Production Migration Strategies

Phased Approach

Safe migration requires supporting both mechanisms during the transition period. Here's a proven pattern for production-ready middleware:

```ruby
class SafeMigrationMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)

    # Try to use new API
    if env["rack.response_finished"]
      register_new_callback(env)
    else
      # Fallback to old mechanism
      body = wrap_with_proxy(body)
    end

    [status, headers, body]
  end

  private

  def register_new_callback(env)
    env["rack.response_finished"] << method(:cleanup_resources)
  end

  def wrap_with_proxy(body)
    Rack::BodyProxy.new(body) { cleanup_resources }
  end

  def cleanup_resources(*)  # Accepts any number of arguments
    # Your cleanup logic
    logger.info "Request completed"
  end
end
```

Migration Monitoring

To track migration progress, add metrics showing what percentage of requests use the new API:

```ruby
def call(env)
  status, headers, body = @app.call(env)

  if env["rack.response_finished"]
    StatsD.increment('middleware.response_finished.new_api')
    register_new_callback(env)
  else
    StatsD.increment('middleware.response_finished.fallback')
    body = wrap_with_proxy(body)
  end

  [status, headers, body]
end
```

Rails and Ecosystem Integration

ActionDispatch::Executor

One key improvement in Rails is more reliable operation of ActionDispatch::Executor: it can now clear thread-local variables exactly when the response completes.
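The mechanism can be sketched with a simplified executor-style middleware (hypothetical; the real ActionDispatch::Executor does considerably more, and the "request.id" key here is purely illustrative):

```ruby
# Simplified executor-style middleware: thread-local state survives
# streaming and is cleared only when the response truly finishes.
class ExecutorLikeMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    # Set per-request thread-local state ("request.id" is illustrative)
    Thread.current[:request_state] = { id: env["request.id"] }
    if (callbacks = env["rack.response_finished"])
      callbacks << lambda do |_env, _status, _headers, _error|
        # Runs after the last byte is delivered
        Thread.current[:request_state] = nil
      end
    end
    @app.call(env)
  end
end

env = { "request.id" => "abc", "rack.response_finished" => [] }
app = ExecutorLikeMiddleware.new(->(e) { [200, {}, ["streamed body"]] })
status, headers, body = app.call(env)

# State is still visible while the body streams...
state_during_stream = Thread.current[:request_state]

# ...and is cleared once the server fires the callbacks
env["rack.response_finished"].each { |cb| cb.call(env, status, headers, nil) }
```

With a BodyProxy-based executor, the equivalent cleanup could race with code still iterating the body on the same thread.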

Popular Gems

Many popular solutions have already added support for the new API or plan to do so:

  • rack-timeout: more accurate execution time measurement
  • newrelic_rpm: improved performance metrics
  • skylight: more precise request tracing
  • sentry-ruby: correlating errors with completed requests

Monitoring and Problem Diagnostics

Key Metrics

When migrating to the new API, monitor these metrics:

  • Memory: reduction in objects per request
  • GC metrics: major GC frequency, pause times
  • Latency distribution: especially tail latency (p95, p99)
  • Callback errors: exceptions during cleanup

Common Problem

If callbacks take too long to execute, this can block worker processes. Move heavy operations to background jobs.
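One way to keep callbacks fast is to make them enqueue-only; a rough sketch using a plain Thread and Queue as a stand-in for Sidekiq, ActiveJob, or similar:

```ruby
# Sketch: the callback only enqueues; a background worker does the slow part.
jobs = Queue.new
uploaded = nil

worker = Thread.new do
  while (job = jobs.pop)
    job.call  # slow I/O happens here, off the request thread
  end
end

callbacks = []
callbacks << lambda do |_env, status, _headers, _error|
  jobs << -> { uploaded = status }  # enqueue and return immediately
end

# Simulate the server firing the callbacks after response delivery
callbacks.each { |cb| cb.call({}, 200, {}, nil) }

jobs << nil  # signal the worker to shut down
worker.join
uploaded  # => 200
```

The request thread never waits on I/O: the callback's only job is a queue push, which keeps the worker process free for the next request.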

Useful Resources and Further Reading

Learning Recommendation

Start by reading the original Rails at Scale article and studying the Rack pull request. This will give you a better understanding of the technical reasoning and trade-offs behind the new API.

Practical Steps

To start the migration, I recommend this action plan:

  1. Audit existing middleware using BodyProxy
  2. Update Rack to version 3.x in development environment
  3. Implement support for both APIs in critical middleware
  4. Add monitoring to track new API usage
  5. Test in staging with various completion scenarios
  6. Gradually deploy to production with metrics monitoring
  7. Remove BodyProxy fallback after stabilization
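Step 1 of the plan can be partly automated; here is a rough grep-style helper (the pattern and traversal are illustrative — adapt them to your codebase):

```ruby
require "find"

# List files that still reference BodyProxy, with line numbers.
def body_proxy_usages(root)
  hits = []
  Find.find(root) do |path|
    next unless path.end_with?(".rb")
    File.foreach(path).with_index(1) do |line, lineno|
      hits << [path, lineno] if line.include?("BodyProxy")
    end
  end
  hits
end
```

Running this over your middleware directory gives a concrete checklist of files to migrate and re-test.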

Conclusion: Path to More Reliable Rack Applications

The transition from Rack::BodyProxy to env["rack.response_finished"] is not just a technical improvement, but a fundamental step toward more predictable and efficient request completion handling in the Rack ecosystem.

The new API solves real production application problems: reduces GC pressure, eliminates race conditions during resource cleanup, and provides execution guarantees even during exceptions. This is especially important for high-throughput applications where every extra allocation and millisecond of delay matters.

The key lesson from this evolution is the importance of proper abstractions in foundational libraries. BodyProxy seemed like an elegant solution, but in practice created more problems than it solved. The new approach is conceptually simpler yet more reliable and efficient.

Key Takeaway

Start migration now with support for both APIs. This will give you experience with the new mechanism without production risk, and you'll be ready for full transition when your stack is updated.

Planning a Rack 3 Migration?
Need help with BodyProxy migration, Rack middleware performance optimization, or memory pattern analysis in Ruby applications? I'll help design a safe migration strategy and set up monitoring for key metrics.