Context Propagation in Spring WebFlux MCP Server

Article ID: 440674

Products

VMware Tanzu Platform Spring

Issue/Introduction

In Spring WebFlux, HTTP request context (such as auth tokens or tenant IDs) is lost when an LLM triggers an asynchronous @McpTool callback.

  • Standard ThreadLocal storage breaks because reactive execution can hop across threads between operators.

  • MCP tool executions run on decoupled scheduling contexts, completely separate from the original HTTP thread.
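
The thread-hopping problem can be reproduced with plain JDK classes, without WebFlux or MCP. The sketch below is illustrative only: the class name ThreadLocalLossDemo, the TENANT_ID holder, and the value "tenant-42" are hypothetical and do not come from the product. It sets a ThreadLocal on the calling thread and then reads it from a worker thread, which sees no value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalLossDemo {

    // Hypothetical holder standing in for any ThreadLocal-based request context.
    static final ThreadLocal<String> TENANT_ID = new ThreadLocal<>();

    /** Sets the value on the calling thread, then reads it back from a different thread. */
    static String readFromWorkerThread(String tenantId) {
        TENANT_ID.set(tenantId);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            // The worker thread has its own, empty ThreadLocal slot.
            Callable<String> task = TENANT_ID::get;
            return worker.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
            TENANT_ID.remove();
        }
    }

    public static void main(String[] args) {
        // Prints "worker sees: null" because the value never left the calling thread.
        System.out.println("worker sees: " + readFromWorkerThread("tenant-42"));
    }
}
```

A reactive pipeline behaves like the worker thread here whenever an operator switches schedulers, which is why the value has to travel with the subscription rather than with the thread.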

Resolution

Use Reactor Context (Mono.contextWrite and Mono.deferContextual) to pass data along the reactive subscription chain instead of the thread execution stack.

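To make the data visible to the MCP tool handler, apply the contextWrite operator inside a WebFilter so that it wraps the rest of the request-processing chain. Reactor Context propagates upstream from the point of subscription, so only operators above the contextWrite call can read the value.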

Step 1: Write Context via WebFilter

Intercept the inbound HTTP request, extract the required attributes, and write them to the Reactor Context by calling contextWrite on the Mono returned by chain.filter(exchange).

@Override
public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
    return Mono.fromSupplier(() -> {
                // The lookup is wrapped in a Mono only to simulate more complex work
                // that should be non-blocking, e.g. reading from a database.
                //
                // If the operation is synchronous, you may skip the Mono -> flatMap step.
                var userAgent = exchange.getRequest().getHeaders().getFirst("user-agent");
                return new CustomAuthentication(userAgent);
            })
            .flatMap(authentication ->
                    // contextWrite covers everything upstream of it, i.e. the rest of the filter chain
                    chain.filter(exchange).contextWrite(ctx -> ctx.put(CustomAuthentication.CONTEXT_KEY, authentication))
            );
}

Step 2: Read Context in the Async Tool Specification

Register your tool using AsyncToolSpecification and wrap the payload extraction inside Mono.deferContextual to lazily resolve the context at subscription time.

@Bean
List<McpStatelessServerFeatures.AsyncToolSpecification> tool() {
    // EMPTY_JSON_SCHEMA is a constant defined elsewhere in the application
    return List.of(McpStatelessServerFeatures.AsyncToolSpecification.builder()
        .tool(McpSchema.Tool.builder().name("greet").inputSchema(EMPTY_JSON_SCHEMA).build())
        .callHandler((context, request) -> {
            // deferContextual defers the lookup until subscription, when the Reactor Context is available
            return Mono.deferContextual(ctx -> {
                        // ctx.get throws NoSuchElementException if the WebFilter did not write the key
                        CustomAuthentication authn = ctx.get(CustomAuthentication.CONTEXT_KEY);
                        return Mono.just(McpSchema.CallToolResult.builder()
                                .addTextContent("Hello! You logged in at [%s], with user-agent [%s]".formatted(authn.loginTime, authn.userAgent))
                                .build());
                    }
            );
        })
        .build());
}
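
Both snippets reference a CustomAuthentication type that this article does not show. The following is a minimal sketch of what such a class might look like, inferred only from how the snippets use it (a one-argument constructor, a CONTEXT_KEY constant, and directly accessed loginTime and userAgent fields). The key value and the use of Instant for loginTime are assumptions, not the article's actual implementation.

```java
import java.time.Instant;

// Hypothetical value object carried through the Reactor Context.
final class CustomAuthentication {

    // Assumed key name; any value works as long as the filter and the tool handler share it.
    public static final String CONTEXT_KEY = "customAuthentication";

    public final String userAgent;   // may be null if the request carried no User-Agent header
    public final Instant loginTime;  // assumed to be captured when the filter builds the object

    public CustomAuthentication(String userAgent) {
        this.userAgent = userAgent;
        this.loginTime = Instant.now();
    }
}
```

Keeping the object immutable means it can be shared safely across whichever threads end up running the tool handler.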