feat: background workers = non-HTTP workers with shared state #2287

Open
nicolas-grekas wants to merge 1 commit into php:main from nicolas-grekas:sidekicks

@nicolas-grekas
Contributor

@nicolas-grekas nicolas-grekas commented Mar 16, 2026

Summary

Background workers are long-running PHP workers that run outside the HTTP cycle. They observe their environment (Redis, DB, filesystem, etc.) and publish variables that HTTP workers read per-request - enabling real-time reconfiguration without restarts or polling.

PHP API

  • frankenphp_set_worker_vars(array $vars): void - publishes vars from a background worker (persistent memory, cross-thread). Skips all work when data is unchanged (=== check).
  • frankenphp_get_worker_vars(string|array $name, float $timeout = 30.0): array - reads vars from any worker context (blocks until first publish, generational cache)
  • frankenphp_get_worker_handle(): resource - returns a pipe-based stream for stream_select() integration. Closed on shutdown (EOF = stop).

Caddyfile configuration

php_server {
    # HTTP worker (unchanged)
    worker public/index.php {
        num 4
    }

    # Named background worker (auto-started if num >= 1)
    worker bin/worker.php {
        background
        name config-watcher
        num 1
    }

    # Catch-all for lazy-started names
    worker bin/worker.php {
        background
    }
}
  • background marks a worker as non-HTTP
  • name specifies an exact worker name; workers without name are catch-all for lazy-started names
  • Not declaring a catch-all forbids lazy-started ones
  • max_threads on catch-all sets a safety cap for lazy-started instances (defaults to 16)
  • max_consecutive_failures defaults to 6 (same as HTTP workers)
  • max_execution_time automatically disabled for background workers
  • Each php_server block has its own isolated scope (managed by NextBackgroundWorkerScope())

Shutdown

On restart/shutdown, the signaling stream is closed. Workers detect this via fgets() returning false (EOF). Workers have a 5-second grace period.

After the grace period, a best-effort force-kill is attempted:

  • Linux ZTS: arms PHP's own max_execution_time timer cross-thread via timer_settime(EG(max_execution_timer_timer))
  • Windows: CancelSynchronousIo + QueueUserAPC interrupts blocking I/O and alertable waits
  • macOS: no per-thread mechanism available; stuck threads are abandoned

During the restart window, get_worker_vars returns the last published data (stale but available). A warning is logged on crash.
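The EOF-based stop signal described above can be exercised without FrankenPHP. The following is a minimal sketch, assuming a local socket pair can stand in for the stream returned by frankenphp_get_worker_handle(); shouldStop(), $handle and $controller are hypothetical names used for illustration only.

```php
<?php

// Hypothetical stand-in for frankenphp_get_worker_handle(): closing
// $controller plays the role of FrankenPHP closing the signaling stream.
$domain = PHP_OS_FAMILY === 'Windows' ? STREAM_PF_INET : STREAM_PF_UNIX;
[$handle, $controller] = stream_socket_pair($domain, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

/**
 * Waits up to $timeout seconds and reports whether the worker should exit.
 *
 * @param resource $handle
 */
function shouldStop($handle, float $timeout): bool
{
    $read = [$handle];
    $write = $except = [];
    $sec = (int) $timeout;
    $usec = (int) (($timeout - $sec) * 1_000_000);

    $ready = @stream_select($read, $write, $except, $sec, $usec);
    if (false === $ready) {
        return true; // select failed: stop, mirroring the loop in the PR example
    }
    if (0 === $ready) {
        return false; // timeout: nothing was signaled, keep working
    }

    return false === fgets($handle); // false means EOF, i.e. shutdown was requested
}
```

A worker loop would call shouldStop($handle, 5.0) between refreshes and leave the loop as soon as it returns true, which keeps it well inside the 5-second grace period as long as each refresh is short.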

Forward compatibility

The signaling stream is forward-compatible with the PHP 8.6 poll API RFC. Poll::addReadable accepts stream resources directly - code written today with stream_select will work on 8.6 with Poll, no API change needed.

Architecture

  • Per-php_server scope isolation with internal registry (unexported types, minimal public API via NextBackgroundWorkerScope())
  • Dedicated backgroundWorkerThread handler implementing threadHandler interface - decoupled from HTTP worker code paths
  • drain() closes the signaling stream (EOF) for clean shutdown signaling
  • Persistent memory (pemalloc) with RWMutex for safe cross-thread sharing
  • set_worker_vars skip: uses PHP's === (zend_is_identical) to detect unchanged data - skips validation, persistent copy, write lock, and version bump
  • Generational cache: per-thread version check skips lock + copy when data hasn't changed; repeated get_worker_vars calls return the same array instance (=== is O(1))
  • Opcache immutable array zero-copy fast path (IS_ARRAY_IMMUTABLE)
  • Interned string optimizations (ZSTR_IS_INTERNED) - skip copy/free for shared memory strings
  • Rich type support: null, scalars, arrays (nested), enums
  • Crash recovery with exponential backoff and automatic restart
  • Background workers integrate with existing worker infrastructure (scaling, thread management)
  • $_SERVER['FRANKENPHP_WORKER_NAME'] set for background workers
  • $_SERVER['FRANKENPHP_WORKER_BACKGROUND'] set for all workers (true/false)
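The identity skip and the generational cache listed above can be modelled in plain PHP. This is only a single-process sketch of the idea: SnapshotStore and CachedReader are hypothetical classes, and the real implementation shares persistent memory across threads behind an RWMutex.

```php
<?php

// Hypothetical single-process model of the publish path.
final class SnapshotStore
{
    private array $vars = [];
    private int $version = 0;

    /** Returns true when a new generation was published, false when skipped. */
    public function publish(array $vars): bool
    {
        if ($this->version > 0 && $vars === $this->vars) {
            return false; // identical snapshot: no copy, no version bump
        }
        $this->vars = $vars;
        ++$this->version;

        return true;
    }

    public function version(): int { return $this->version; }

    public function vars(): array { return $this->vars; }
}

// Hypothetical per-reader cache: refreshes only when the generation has moved.
final class CachedReader
{
    private int $seen = 0;
    private array $cached = [];
    public int $refreshes = 0;

    public function __construct(private SnapshotStore $store) {}

    public function get(): array
    {
        if ($this->seen !== $this->store->version()) {
            $this->cached = $this->store->vars();
            $this->seen = $this->store->version();
            ++$this->refreshes;
        }

        return $this->cached;
    }
}
```

Note that === on arrays also compares key order and value types, so a worker only benefits from the skip if it rebuilds its snapshot with a stable key order.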

Example

// Background worker: polls Redis every 5s
$stream = frankenphp_get_worker_handle();
$redis = new Redis();
$redis->connect('127.0.0.1');

// Builds a complete snapshot; publishing is skipped when it is unchanged
$snapshot = static fn (): array => [
    'maintenance' => (bool) $redis->get('maintenance_mode'),
    // Redis::get() returns false for a missing key, so fall back to an empty JSON object
    'feature_flags' => json_decode($redis->get('features') ?: '{}', true),
];

frankenphp_set_worker_vars($snapshot());

while (true) {
    $r = [$stream]; $w = $e = [];
    if (false === @stream_select($r, $w, $e, 5)) { break; }
    if ($r && false === fgets($stream)) { break; } // EOF = stop

    frankenphp_set_worker_vars($snapshot());
}

// HTTP worker
$config = frankenphp_get_worker_vars('config-watcher');
if ($config['maintenance']) {
    return new Response('Down for maintenance', 503);
}

Test coverage

17 unit tests + 1 Caddy integration test covering: basic vars, at-most-once start, validation, type support (enums, binary-safe strings), multiple background workers, multiple entrypoints, crash restart, signaling stream, worker restart lifecycle, non-background-worker error handling, identity detection, generational cache, named auto-start with m# prefix.

All tests pass on PHP 8.2, 8.3, 8.4, and 8.5 with -race. Zero memory leaks on PHP debug builds.

Documentation

Full docs at docs/background-workers.md.

@nicolas-grekas force-pushed the sidekicks branch 4 times, most recently from e1655ab to 867e9b3 (March 16, 2026 20:26)
@AlliBalliBaba
Contributor

AlliBalliBaba commented Mar 16, 2026

Interesting approach to parallelism, what would be a concrete use case for only letting information flow one way from the sidekick to the http workers?

Usually the flow would be inverted, where a http worker offloads work to a pool of 'sidekick' workers and can optionally wait for a task to complete.

@nicolas-grekas force-pushed the sidekicks branch 2 times, most recently from da54ab8 to a06ba36 (March 16, 2026 21:45)
@henderkes
Contributor

Thank you for the contribution. Interesting idea, but I'm thinking we should merge the approach with #1883. The kind of worker is the same, how they are started is but a detail.

@nicolas-grekas the Caddyfile setting should likely be per php_server, not a global setting.

@nicolas-grekas force-pushed the sidekicks branch 7 times, most recently from ad71bfe to 05e9702 (March 17, 2026 08:03)
@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 17, 2026

@AlliBalliBaba The use case isn't task offloading (HTTP->worker), but out-of-band reconfigurability (environment->worker->HTTP). Sidekicks observe external systems (Redis Sentinel failover, secret rotation, feature flag changes, etc.) and publish updated configuration that HTTP workers pick up on their next request, with per-request consistency guaranteed via $_SERVER injection. No polling, no TTLs, no redeployment.

Task offloading (what you describe) is a valid and complementary pattern, but it solves a different problem. The non-HTTP worker foundation here could support both.

@henderkes Agreed that the underlying non-HTTP worker type overlaps with #1883. The foundation (skip HTTP startup/shutdown, immediate readiness, cooperative shutdown) is the same. The difference is the API layer and the DX goals:

  • Minimal FrankenPHP config: a single sidekick_entrypoint in php_server (thanks for the idea). No need to declare individual workers in the Caddyfile. The PHP app controls which sidekicks to start via frankenphp_sidekick_start(), keeping the infrastructure config simple.

  • Graceful degradability: apps should work correctly with or without FrankenPHP. The same codebase should work on FrankenPHP (with real-time reconfiguration) and on traditional setups (with static or always refreshed config).

  • Nice framework integration: the sidekick_entrypoint pointing to e.g. bin/console means sidekicks are regular framework commands, making them easy to develop.
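The graceful-degradation goal can be sketched as a thin wrapper. This is a minimal illustration, assuming the function name frankenphp_get_worker_vars() from the PR summary; readWorkerVars() and the fallback closure are hypothetical and not part of the PR.

```php
<?php

/**
 * Hypothetical helper: reads vars from a background worker when the
 * FrankenPHP function is available, otherwise falls back to a
 * caller-supplied loader (static config, a direct Redis read, ...).
 */
function readWorkerVars(string $name, callable $fallback, float $timeout = 30.0): array
{
    if (\function_exists('frankenphp_get_worker_vars')) {
        return frankenphp_get_worker_vars($name, $timeout);
    }

    return $fallback($name);
}

// On a traditional setup the closure runs; its values are placeholders.
$config = readWorkerVars('config-watcher', static fn (string $name): array => [
    'maintenance' => false,
    'feature_flags' => [],
]);
```

The calling code stays the same on both kinds of setup; only the freshness of the returned data differs.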

Happy to follow up with your proposals now that this is hopefully clarified.
I'm going to continue on my own a bit also :)

@dunglas
Member

dunglas commented Mar 17, 2026

Great PR!

Couldn't we create a single API that covers both use cases?

We try to keep the number of public symbols and config options as small as possible!

@henderkes
Contributor

> @henderkes Agreed that the underlying non-HTTP worker type overlaps with #1883. The foundation (skip HTTP startup/shutdown, immediate readiness, cooperative shutdown) is the same. The difference is the API layer and the DX goals:

Yes, that's why I'd like to unify the two APIs and background implementations into one. Unfortunately the first task worker attempt didn't make it into main, but perhaps @AlliBalliBaba can use his experience with the previous PR to influence this one. I'd be more in favour of a general API than a specific sidecar one.

@nicolas-grekas
Contributor Author

The PHP-side API has been significantly reworked since the initial iteration: I replaced $_SERVER injection with an explicit get_vars/set_vars protocol.

The old design used frankenphp_set_server_var() to inject values into $_SERVER implicitly. The new design uses an explicit request/response model:

  • frankenphp_sidekick_set_vars(array $vars): called from the sidekick to publish a complete snapshot atomically
  • frankenphp_sidekick_get_vars(string|array $name, float $timeout = 30.0): array: called from HTTP workers to read the latest vars

Key improvements:

  • No race condition on startup: get_vars blocks until the sidekick has called set_vars. The old design had a race where HTTP requests could arrive before the sidekick had published its values.
  • Strict context enforcement: set_vars and should_stop throw RuntimeException if called from a non-sidekick context.
  • Atomic snapshots: set_vars replaces all vars at once. No partial state possible.
  • Parallel start: get_vars(['redis-watcher', 'feature-flags']) starts all sidekicks concurrently, waits for all, returns vars keyed by name.
  • Works in both worker and non-worker mode: get_vars works from any PHP script served by php_server, not just from frankenphp_handle_request() workers.

Other changes:

  • sidekick_entrypoint moved from global frankenphp block to per-php_server (as @henderkes suggested)
  • Removed the $argv parameter: the sidekick name is the command, passed as $_SERVER['argv'][1]
  • set_vars is restricted to sidekick context only (throws if called from HTTP workers)
  • get_vars accepts string|array: when given an array, all sidekicks start in parallel
  • Atomic snapshots: set_vars replaces all vars at once, no partial state
  • Binary-safe values (null bytes, UTF-8)

@nicolas-grekas force-pushed the sidekicks branch 3 times, most recently from cb65f46 to 4dda455 (March 17, 2026 10:46)
@nicolas-grekas
Contributor Author

Thanks @dunglas and @henderkes for the feedback. I share the goal of keeping the API surface minimal.

Thinking about it more, the current API is actually quite small and already general:

  • 1 Caddyfile setting: sidekick_entrypoint (per php_server)
  • 3 PHP functions: get_vars, set_vars, should_stop

The name "sidekick" works as a generic concept: a helper running alongside. The current set_vars/get_vars protocol covers the config-publishing use case. For task offloading (HTTP->worker) later, the same sidekick infrastructure could support:

  • frankenphp_sidekick_send_task(string $name, mixed $payload): mixed
  • frankenphp_sidekick_receive_task(): mixed

Same worker type, same sidekick_entrypoint, same should_stop(). Just a different communication pattern added on top. No new config, no new worker type.

So the path would be:

  1. This PR: sidekicks with set_vars/get_vars (config publishing)
  2. Future PR: add send_task/receive_task (task offloading), reusing the same non-HTTP worker foundation

The foundation (non-HTTP threads, cooperative shutdown, crash recovery, per-php_server scoping) is shared. Only the communication primitives differ.

WDYT?

@nicolas-grekas force-pushed the sidekicks branch 4 times, most recently from b3734f5 to ed79f46 (March 17, 2026 11:48)
@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 17, 2026

I think the failures are unrelated - a cache reset would be needed. Any help on this topic?

@alexandre-daubois
Member

alexandre-daubois commented Mar 17, 2026

Hmm, it seems they are on some versions, for example here: https://github.com/php/frankenphp/actions/runs/23192689128/job/67392820942?pr=2287#step:10:3614

For the cache, I'm not aware of a GitHub feature that allows clearing everything, unfortunately 🙁

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

"vars"-plural refers to the argument being a collection of values wrapped in an array. We're trying to have the minimum API surface, so not sure I'd add one more function.

@henderkes
Contributor

> set_vars sets a single variable

It should be setting a number of variables (php array) for a scope.

Though it should arguably be

frankenphp_set_vars(?string $name, array $vars): void;

@nicolas-grekas
Contributor Author

> Though it should arguably be frankenphp_set_vars(?string $name, array $vars): void;

Not sure about this: one scope one name, part of the context. Adding a name would require the scope knowing its name, and would allow a scope to alter another scope. Not good either, this makes things way less bounded - aka more fragile to build on IMHO.

@henderkes
Contributor

> > Though it should arguably be frankenphp_set_vars(?string $name, array $vars): void;

> Not sure about this: one scope one name, part of the context. Adding a name would require the scope knowing its name, and would allow a scope to alter another scope. Not good either, this makes things way less bounded - aka more fragile to build on IMHO.

Shouldn't a background worker always know its (application-side) name, since it's given by other application code? If frankenphp_set_vars can't declare a scope, things will become very messy if we try to expand it to HTTP workers, queue workers or regular threads. Even though it also introduces a difficulty in coordinating that a background worker doesn't overwrite another's variables. I'll need to sleep on it.

@nicolas-grekas
Contributor Author

By trying to make the new API do everything, we might just fail everything...
The current feature scope is really well bounded and solid.

@nicolas-grekas
Contributor Author

nicolas-grekas commented Mar 30, 2026

Maybe we tried too hard to turn this into a global KV store and should get back to frankenphp_set/get_worker_vars(). This is a known feature-scope. The other ambitions are not clearly enough defined and not to be addressed in this PR IMHO.

@AlliBalliBaba
Contributor

> Same $stream, no new FrankenPHP API needed. And if the poll RFC lands with a SignalHandle type, we could later add a higher-level alternative that hides the string parsing entirely (to be confirmed it's worth it in the future):

Yeah I'm talking about abstracting this via handles in the future. But I guess you're right that it would be 8.6+ only.

> Maybe we tried too hard to turn this into a global KV store and should get back to frankenphp_set/get_worker_vars()

I actually kind of like that this is a key-value store. If the end goal is to allow libraries to start their own background processes, wouldn't it be better to do something like this?

// In library
$redisEndpoints = frankenphp_get_vars('redis:redis-host');

if(!$redisEndpoints){
    frankenphp_start_background_worker('/path/to/background-worker.php', [
        'host' => 'redis-host'
    ]);
}

$redisEndpoints = frankenphp_get_vars('redis');

Starting the background worker simply blocks until it has reached 'ready' by calling set_vars. That would make this more generic. WDYT, @nicolas-grekas @henderkes?

@johanjanssens

@AlliBalliBaba pointed me to this PR: #2306 (comment). Reading the discussion, it seems there is agreement that the shared state layer is essentially a process-wide key-value store. I took the time over the weekend to build one as a proof of concept, which I think offers some interesting benefits over the approach proposed in this PR.

You can find the code here: https://github.com/johanjanssens/frankenstate. It implements a cross-thread shared state as a standalone FrankenPHP extension, a SharedArray backed by Go (sync.RWMutex + map[string]any), exposed to PHP via ArrayAccess, and accessible over Redis wire protocol (RESP) on port 6380.

$state = new FrankenPHP\SharedArray();
$state['feature_flags'] = ['new_ui' => true];
$flags = $state['feature_flags'];

Key differences from the set_vars/get_vars approach:

  • Go-native: Go extensions write directly to the store (state.Set(k, v)), no CGO crossing on the write side. PHP reads via cached snapshots with version-gated refresh.
  • No new thread types: Any thread (Go or PHP) can read or write at any time. No dedicated background workers, no lifecycle management, no shutdown signaling.
  • No Caddyfile config: It's a PHP extension, not an infrastructure concern.
  • Redis protocol: The store speaks RESP, any process that talks Redis can push data in. redis-cli -p 6380 SET feature_flags '{"new_ui":true}' just works.

With this approach the "config watcher" flow changes to:

FrankenState: External system → any process (cron, script, CI/CD) → RESP → SharedArray → all PHP threads

The store is a KV with a standard protocol. The sync logic lives wherever it makes sense. A cron job, a deploy script, a Redis subscriber, whatever pushes data in over the Redis protocol, or whatever Go code uses the API. FrankenPHP doesn't need to know or care where the data comes from.

Might be worth considering as an alternative approach to the shared state part of this PR. With a standalone shared state store, the background worker and signaling infrastructure in this PR may not be needed, or could be developed as a more generic solution decoupled from state management. Happy to develop it further if there is interest.

Do not want to hijack this PR. Created a discussion thread here: #2324.

@AlliBalliBaba
Contributor

To truly benefit from ZTS, you'll have to do what's done in this PR and leave the copy mechanism on the C side. Couple that with potential 0-copy if the value hasn't changed and performance is potentially orders of magnitude better.

I still think we need a fast built-in cross-thread copy mechanism like was done in this PR, just think that the API could be more generic.

@henderkes
Contributor

> I still think we need a fast built-in cross-thread copy mechanism like was done in this PR, just think that the API could be more generic.

Essentially my only remaining issue here.

> Maybe we tried too hard to turn this into a global KV store and should get back to frankenphp_set/get_worker_vars(). This is a known feature-scope. The other ambitions are not clearly enough defined and not to be addressed in this PR IMHO.

I feel like it would be great if background workers could use this more general KV store as a backend to push/pull values back to http threads that started the background workers. I'm really sorry for being so nitpicky here because it's a great addition, I'm just focussed on making sure the decisions made here won't bite us later.

I'll give the code a proper review in a few days again.

@dbu
Contributor

dbu commented Apr 1, 2026

> "vars"-plural refers to the argument being a collection of values wrapped in an array. We're trying to have the minimum API surface, so not sure I'd add one more function.

@nicolas-grekas i thought you widened the argument to also accept single string or other scalar value. or have you reverted that? if it's reverted, i retire my suggestion :-)

regarding the general KV store vs workers: from my understanding, the intention from nicolas was that workers could be automatically launched on demand when a namespace is requested. maybe an approach could be:

  • generic set KV store
  • generic get KV store
  • specific get worker values (which is what nicolas proposed, but after making sure the worker is up accesses the underlying generic KV store)
  • specific worker signal stream

setting values in the worker could use the generic set KV store method. except if the worker get method needs to be aware when the value is set, maybe we'd need a specific worker set value method to unlock the blocking call.

but this would separate the handling of on-demand workers from the underlying KV mechanism. alternatively, we could have only the worker part and later refactor to a generic KV mechanism...

@henderkes
Contributor

henderkes commented Apr 1, 2026

> • specific get worker values (which is what nicolas proposed, but after making sure the worker is up accesses the underlying generic KV store)

Then we have an extra function for essentially the same thing, which is what I want to avoid.

I'm closely aligned with @AlliBalliBaba's vision here. Similar to what he proposed:

// In library
$redisEndpoints = frankenphp_get_vars('redis:redis-host');

if(!$redisEndpoints){
    frankenphp_start_background_worker('/path/to/background-worker.php', [
        'host' => 'redis-host'
    ]);
}

$redisEndpoints = frankenphp_get_vars('redis');

Except that I like @nicolas-grekas's concept of at-most-one worker, which would simplify this to:

# in library
frankenphp_start_background_worker("projectDir/path/to/redis-updater.php", scopes: ['redis-host'], args: []);
$redisEndpoints = frankenphp_get_vars('redis-host');

# in redis-updater
frankenphp_set_vars($scopes, ['host' => 'new-redis-host']);

@nicolas-grekas
Contributor Author

@dbu Scalar widening was reverted; see #2287 (comment)

About the explicit start_background_worker() + generic KV store approach, I slept on the idea and I'd like to push back.
I think this opens more problems than it solves:

Entrypoint from PHP = security surface. The Caddyfile is the trust boundary today. The sysadmin declares which scripts can run as background workers, and PHP code references them by name. If PHP can pass arbitrary script paths to start_background_worker(), any code (including dependencies) can spawn long-running processes. That's a fundamentally different security model. The name-based approach keeps the Caddyfile as the single source of truth. The current model also makes sure no competing same-name-but-different-entrypoint issues can exist. That's an important feature.

Generic set_vars = many writers, many readers. The current model is deliberately one-writer-many-readers: a single background worker owns its vars, HTTP workers read them. This is trivially safe — RWMutex, no conflicts, no ordering surprises. The moment you decouple set_vars from the writer's identity, you get many-writers-many-readers concurrency. Last-write-wins looks simple until two writers race and users get non-deterministic results. People build CRDTs for exactly this reason. Keeping one writer per named scope avoids this entire class of problems by design. See APCu, which already has apcu_cas, apcu_add, etc. It might be worth adding something similar to FrankenPHP, but that's not the frontier I want to tackle with the feature I'm proposing here.
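The lost-update hazard mentioned above can be shown with a deterministic interleaving. This is a sketch under stated assumptions: Cell is a hypothetical in-memory class, and its cas() method only mirrors the compare-and-swap idea behind apcu_cas(); it is neither APCu nor FrankenPHP code.

```php
<?php

// Hypothetical in-memory cell used to replay two interleaved writers.
final class Cell
{
    public function __construct(private int $value = 0) {}

    public function get(): int { return $this->value; }

    public function set(int $value): void { $this->value = $value; }

    /** Writes $new only if the current value still equals $expected. */
    public function cas(int $expected, int $new): bool
    {
        if ($this->value !== $expected) {
            return false;
        }
        $this->value = $new;

        return true;
    }
}

// Last-write-wins: both writers read 0, both write 1, one increment is lost.
$lww = new Cell();
$a = $lww->get();
$b = $lww->get();
$lww->set($a + 1);
$lww->set($b + 1);

// With CAS, the second writer notices the conflict and retries.
$safe = new Cell();
$a = $safe->get();
$b = $safe->get();
$safe->cas($a, $a + 1);
while (!$safe->cas($b, $b + 1)) {
    $b = $safe->get(); // reload the current value and try again
}
```

With a single writer per named scope, neither the retry loop nor the CAS primitive is needed, which is the simplification argued for here.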

Lazy-start is the feature, not a side effect. The whole point of get_worker_vars('redis-watcher') implicitly starting the worker is that library code can declare a dependency on a background worker without orchestration logic. The sketch with explicit start_background_worker() + guard if (!$redisEndpoints) is exactly the orchestration boilerplate that lazy-start eliminates.

I therefore reverted to _worker_ in the function names (frankenphp_set_worker_vars / frankenphp_get_worker_vars / frankenphp_get_worker_handle) to make the scope explicit. The API is tied to workers intentionally. The $name parameter identifies a worker, which enables lazy-start. If we later want a generic KV store decoupled from workers, it would have different semantics (no lazy-start, many writers) and deserves its own API rather than overloading this one.

As a reminder, my goal here is to push the boundaries of what can be done in a PHP app. A generic KV store is not such a boundary. APCu, Redis, etc all exist to tackle this. FrankenPHP can certainly have its say on the matter, but that's a different problem area than this PR.

Background workers give PHP something it never had: long-running processes that share state with HTTP workers in the same process, with a powerful and simple design. That's the feature this PR delivers. Let's land it :)

@AlliBalliBaba
Contributor

> Entrypoint from PHP = security surface. The Caddyfile is the trust boundary today.

It's the same security-wise as a library being able to call exec. I'm pretty sure you'd end up having the same with the 'fallback workers'. In order to run a custom script, the fallback worker would end up needing to accept a custom script name to include.

Having this as an explicit function just clarifies what actually happens. I still think it would be a lot simpler to not allow lazy starts and just have people configure workers directly.

If we do allow starting background workers at runtime, it should be very explicit and not be hidden behind a getter function.

@AlliBalliBaba
Contributor

With the current implementation, it just becomes very hard to track down potential issues. An example:

I see that get_vars returns something incorrect or times out. First, I'd have to know that it has a side effect of starting the background worker with the passed name. With the name I'd then have to go to the Caddyfile and look for the worker. Not finding the worker, I'd have to know that a worker without name is a catch-all worker and that an instance of the script is executed and passed the name. Then I'd have to go to that script and see what it does with the name. It probably will do something like start the Symfony container and determine what to do based on bootstrapped services. Then I'd have to look through all services finding the logic that is actually run.

With start_background_worker() it's pretty simple: "If the value is empty, run this script to set the value".

Without lazy start it's even simpler, the worker needs to call set_vars or we won't even reach ready state.

@henderkes
Contributor

> @dbu Scalar widening was reverted; see #2287 (comment)

> About the explicit start_background_worker() + generic KV store approach, I slept on the idea and I'd like to push back. I think this opens more problems than it solves:

> Entrypoint from PHP = security surface. The Caddyfile is the trust boundary today. The sysadmin declares which scripts can run as background workers, and PHP code references them by name. If PHP can pass arbitrary script paths to start_background_worker(), any code (including dependencies) can spawn long-running processes. That's a fundamentally different security model. The name-based approach keeps the Caddyfile as the single source of truth. The current model also makes sure no competing same-name-but-different-entrypoint issues can exist. That's an important feature.

I disagree here. I never advocated for arbitrary paths, only files within the current root. It's the same attack vector as someone on the internet visiting that file.

> Generic set_vars = many writers, many readers. The current model is deliberately one-writer-many-readers: a single background worker owns its vars, HTTP workers read them. This is trivially safe — RWMutex, no conflicts, no ordering surprises. The moment you decouple set_vars from the writer's identity, you get many-writers-many-readers concurrency.

There's one-writer, many-readers concurrency either way. If you don't want conflicts, don't use the same scope (worker) names. The API I proposed already passed the scopes a background worker has access to in the explicit frankenphp_start_worker call.

> Lazy-start is the feature, not a side effect. The whole point of get_worker_vars('redis-watcher') implicitly starting the worker is that library code can declare a dependency on a background worker without orchestration logic. The sketch with explicit start_background_worker() + guard if (!$redisEndpoints) is exactly the orchestration boilerplate that lazy-start eliminates.

No need to check if (!$redisEndpoints). The boilerplate here is one call to frankenphp_start_worker before you wish to read, because you proposed at-most-one for background workers of a name.

> As a reminder, my goal here is to push the boundaries of what can be done in a PHP app. A generic KV store is not such a boundary. APCu, Redis, etc all exist to tackle this. FrankenPHP can certainly have its say on the matter, but that's a different problem area than this PR.

> Background workers give PHP something it never had: long-running processes that share state with HTTP workers in the same process, with a powerful and simple design. That's the feature this PR delivers. Let's land it :)

We absolutely and whole-heartedly agree! We only disagree on the magic, lazy starting, specific API that would be better solved by integrating with a more general KV store backend. If we merge this with your proposed API in a year we will have:

  • frankenphp_set_worker_vars
  • frankenphp_get_worker_vars
  • frankenphp_get_worker_handle # only accessible from a worker itself?
  • frankenphp_set_vars
  • frankenphp_get_vars
  • frankenphp_start_worker # gives a handle too...
  • frankenphp_handle_task

@nicolas-grekas
Contributor Author

I've been thinking about this while working on other things, https://github.com/symfony/php-ext-deepclone to name one of them which is of direct interest to this PR.

TL;DR: I'd like to keep the current API shape (set_worker_vars / get_worker_vars / get_worker_handle, with lazy-start and catch-all). The concrete improvement I propose: make get_worker_vars failures more informative by including the worker name, the resolved script path, and the worker's stderr in the exception message. That directly addresses the "hard to debug" concern without changing the API.

Here is why, listing your arguments:

"get_vars might be hard to debug": the realistic failure modes are: (1) timeout waiting for first set_worker_vars, or (2) the worker throws before calling set_worker_vars at all. Both already surface as RuntimeException from get_worker_vars. We can make those error messages very explicit: include the worker name, the resolved script path, and, why not, the worker's stderr if we captured it. That directly addresses the "trace from failure to root cause" concern, without changing the API shape.
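What such a message could look like is sketched below. The helper name workerStartFailure(), its parameters and the message layout are all hypothetical; the PR does not define them, and the sample path and stderr line are placeholders.

```php
<?php

/**
 * Hypothetical formatter for a more informative get_worker_vars failure.
 */
function workerStartFailure(string $name, string $script, float $timeout, string $stderr = ''): RuntimeException
{
    // %F keeps the decimal point independent of the current locale.
    $message = sprintf(
        'Background worker "%s" (%s) did not call frankenphp_set_worker_vars() within %.1Fs.',
        $name,
        $script,
        $timeout
    );

    $stderr = trim($stderr);
    if ('' !== $stderr) {
        $message .= "\nWorker stderr:\n" . $stderr;
    }

    return new RuntimeException($message);
}

$e = workerStartFailure(
    'config-watcher',
    '/app/bin/worker.php',
    30.0,
    "PHP Fatal error: Uncaught RedisException\n"
);
```

Carrying the name, the resolved script and the captured stderr in one message gives a direct path from the failing call to the worker that caused it.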

"start_worker() is more secure because paths are restricted to the project root": I'm not sure why files in the project root would be safer to execute as long-lived workers than any other file (eg files under vendor/). If you meant the public-web root, I wouldn't put worker entrypoints there (like bin/console isn't under public/). With the Caddyfile approach, there's zero attack surface around worker startup from PHP: the infra declares the trusted entrypoints once.

"It's no worse than exec": that comparison doesn't fully hold. exec() is universally recognized as dangerous: linters, SAST tools, code reviewers, and devs all flag it and know to sanitize input. A new start_worker($path) wouldn't benefit from any of that for years; every composer package using it becomes a potential path-traversal vector that users would need to audit. This is a real ecosystem cost, and keeping input-sensitive functions to a minimum is a principle worth preserving.

"start_worker() is more explicit and therefore better": it moves the explicitness from the config file to PHP code, while adding API surface. It also introduces its own traceability hazards: what happens if code path A calls start_worker('foo', 'a.php') and path B calls start_worker('foo', 'b.php'), or both happen concurrently on different threads? Either we error out (bad UX for libraries that don't coordinate), or we silently keep the first registration (a debugging trap and a race condition that don't exist today). With entrypoints-from-Caddyfile, there's exactly one place where a name maps to a script.

There's also a design-level cost: taking a path at the API surface leaks the implementation. Right now, a name is an opaque handle for "some process that publishes this data". The mechanism is up to the infra running the app: on FrankenPHP it starts a local PHP script, on a polyfill we could dispatch to an external process. The moment $path is part of the API, the contract becomes "PHP starts this specific local script", and that abstraction is locked in.

"one-writer-many-readers concurrency either way": this is a deep design choice, not a cosmetic one. The current model enforces a single writer per scope at the type level: only the background worker holding $name can write to it. RWMutex, no CAS needed, no ordering surprises, no divergent writes to reconcile. The moment set_vars takes a scope argument from any caller, you're in many-writers-many-readers territory. That's a very different problem: you need CAS, merge semantics, version vectors. CRDTs exist precisely because shared-memory concurrency without enforced single-writer semantics is hard. Go has channels alongside shared memory for the same reason.
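
To make the difference concrete, here is a deliberately simplified, single-threaded simulation of the lost-update problem. A plain array stands in for the shared state, and the interleaving is written out by hand; it is only an illustration, not how FrankenPHP stores vars.

```php
<?php
declare(strict_types=1);

// Shared snapshot; a plain array stands in for cross-thread state.
$store = ['hits' => 0];

// Many writers: two callers read the same snapshot, modify their copy,
// then write it back without any compare-and-swap.
$seenByA = $store;
$seenByB = $store;
$seenByA['hits']++;
$seenByB['hits']++;
$store = $seenByA;
$store = $seenByB; // silently overwrites A's write
$manyWriters = $store['hits'];

// Single writer: one owner applies both increments, in order.
$store = ['hits' => 0];
for ($i = 0; $i < 2; $i++) {
    $store['hits']++;
}
$singleWriter = $store['hits'];

printf("many writers: %d, single writer: %d\n", $manyWriters, $singleWriter);
```

With uncoordinated writers one increment is lost, which is the situation CAS or merge semantics exist to prevent.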

"FrankenPHP should have a generic KV store": I'd argue that's outside FrankenPHP's scope. APCu, Redis, Memcached, and plenty of extensions already do this well. FrankenPHP's unique value is thread and worker management: that's what no PHP extension can replace, because it requires ownership of the process model. The API surface we should expose is the one that couldn't be done without FrankenPHP. That's how we keep it minimal.

"start_worker is just more explicit about what the get function does": this is probably the root of the disagreement. start_worker() assumes PHP is the process that bootstraps workers. But FrankenPHP isn't like the parallel extension: PHP isn't the master process here, FrankenPHP itself is. Workers survive HTTP requests. The Go side owns their lifecycle. Asking PHP to "start a worker" inverts that ownership model.

Lazy-start from get_worker_vars fits that model naturally: it's not PHP asking Go to create a worker, it's PHP saying "I need this data, please ensure its producer is running." Go decides how and when. PHP doesn't need to care.

On API bloat: if we don't add start_worker(), set_vars(), get_vars(), the count stays at three: set_worker_vars, get_worker_vars, get_worker_handle. Those three cover the entire use case: publish, consume, cooperate. If APCu-like CAS primitives make sense later, they belong in a separate concern, possibly a separate PR, possibly a PHP extension, but either way not coupled to this feature.

Catch-all: It's what allows a library to ship a background worker as a self-contained package without requiring the user to know its internal worker names. That's a real library-ergonomics win. The failure mode (timeout with unclear cause) is solvable with the better error messages mentioned above, not by removing the feature.


To recap what this PR delivers:

  • Background workers run outside the HTTP cycle, sharing state with HTTP workers in the same process
  • Zero serialization overhead, immutable-array zero-copy, interned string sharing
  • One-writer-many-readers by construction, no CRDT/CAS needed
  • Cooperative shutdown via signaling stream, forward-compatible with PHP 8.6 poll API
  • Three functions, one Caddyfile directive

That's the feature. I'd love to land it ASAP :)

@henderkes
Contributor

That's the feature. I'd love to land it ASAP :)

I think we all would, really, but we definitely need to agree on the API for it. I think we need some outside opinions from @dunglas @alexandre-daubois and @withinboredom at this point to see where we're most likely to reach sufficient consensus.

I will try to address your points later when I've got time, I unfortunately still don't agree on all of them.

@alexandre-daubois
Member

alexandre-daubois commented Apr 20, 2026

Thanks for the ping, it slipped my mind. I'll catch up with the conversation this week!

@nicolas-grekas
Contributor Author

nicolas-grekas commented Apr 20, 2026

I'm in Berlin until Friday BTW! Let's meet at SymfonyLive or around? @AlliBalliBaba also? Please join me on https://symfony.com/slack

@alexandre-daubois
Member

alexandre-daubois commented Apr 21, 2026

After re-reading the thread, here's my position.

The target use case feels right, and I don't have a concrete need today that would justify use cases beyond the background-to-HTTP flow, like HTTP workers publishing data between themselves. I also have no objection in principle to lazy-starting from PHP: the DX it provides is valuable, and starting a worker can remain conditional on an app-level trigger. Feels a bit like goroutines, I like it.

That said, two points tip me toward an API that cleanly separates the lifecycle from the data store:

  • Once frankenphp_get_worker_vars($name) ships, the semantic "$name maps to a declared worker" is frozen. I don't see how we extend that function later without either introducing a parallel function or breaking backward compatibility. Decoupling the store from the lifecycle now keeps the door open.
  • The concern about an opaque diagnostic chain: an unexplained timeout on get_worker_vars, traced back to a non-obvious catch-all in the Caddyfile, then back to the calling script. An explicit start_background_worker() makes the trace linear and readable, even if we keep lazy-start via catch-all as a DX convenience.

Concretely, I'd support an API where the store is exposed in a generic form, and where starting a background worker is a named, explicit operation. Nothing prevents lazy-start via Caddyfile catch-all from still kicking in on a get_vars() call with a known name: that preserves the ergonomic side on the app without freezing the public API's semantics.

On the other points, I'm aligned with what already seems to be converging in the thread.

Sorry if I forgot something, there are quite a few comments 🙂

@nicolas-grekas
Contributor Author

Thanks for taking the time @alexandre-daubois. Two things I'd like to push back on, plus one structural argument that I think hasn't been engaged with yet.

On "freezes the $name maps to declared worker semantics": I don't see what extension this actually blocks. Names are the most abstract contract shape possible: DNS hostnames, Redis keys, Symfony service IDs, HTTP endpoints all work this way. The implementation behind a name is free to change without touching the API. A path-based API (start_worker($path)) would freeze far more: the contract becomes "PHP starts this specific local script," locking the implementation. Could you share a concrete example of what extension get_worker_vars($name) would block in the future?

On the diagnostic chain concern: I offered some improvements in my previous reply, and I think we can push this further so that errors from workers are surfaced better to callers, all compatible with the API I'm proposing. That's the one aspect I think we can improve further.

On the explicit start_background_worker(): I'd like to understand what blocking semantics you have in mind, because the function only makes sense under one of two shapes, and both have issues:

  • If it blocks until ready: it's functionally equivalent to get_worker_vars() minus the returned data. Same lazy-start, same wait, same ready signal. Adding a second function for that case is pure redundancy.
  • If it returns immediately after spawning: the caller has no way to tell if the worker actually booted. It could fail 1ms later and the caller would never know. You'd then need a separate wait_ready() primitive to get feedback, more API surface.

The current design sidesteps both: one frankenphp_set_worker_vars([]) call at boot (even for side-effect-only workers) gives callers a real ready-state signal via get_worker_vars, and the failure mode is uniform (rich timeout error with boot-failure details). The "workaround" is actually a feature: it makes the ready contract explicit and symmetric.
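
A minimal sketch of such a side-effect-only worker, using the two functions described in the PR summary. The stand-in definitions exist only so the sketch can run outside FrankenPHP, and `runWorker` is a hypothetical name; under FrankenPHP the real functions would be used instead.

```php
<?php
declare(strict_types=1);

// Stand-ins for running outside FrankenPHP (assumptions, not the real implementation).
if (!function_exists('frankenphp_set_worker_vars')) {
    function frankenphp_set_worker_vars(array $vars): void
    {
        $GLOBALS['published'] = $vars;
    }
}
if (!function_exists('frankenphp_get_worker_handle')) {
    function frankenphp_get_worker_handle()
    {
        // A stream whose peer is already closed: it reports EOF right away,
        // mimicking the "closed on shutdown" behaviour described in the PR.
        $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
        if ($pair === false) {
            throw new RuntimeException('Could not create the stand-in stream');
        }
        fclose($pair[1]);

        return $pair[0];
    }
}

// Hypothetical worker loop: publish an empty array as the ready signal,
// then do periodic work until the handle reports EOF.
function runWorker(callable $work): int
{
    frankenphp_set_worker_vars([]); // ready signal for readers
    $handle = frankenphp_get_worker_handle();
    $iterations = 0;

    while (true) {
        $read = [$handle];
        $write = null;
        $except = null;
        $ready = stream_select($read, $write, $except, 1); // wake up at least every second
        if ($ready === false) {
            break;
        }
        if ($ready > 0) {
            fread($handle, 8192);
            if (feof($handle)) {
                break; // EOF = stop
            }
        }
        $work();
        $iterations++;
    }

    return $iterations;
}

$iterations = runWorker(static function (): void {
    // side effects would go here
});
```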

The structural argument I'd like your read on: the current model is one-writer-many-readers by construction. Only the background worker owning $name can write to it, enforced at the type level, no CAS needed, no ordering surprises, no divergent writes to reconcile. A generic store where any caller can set_vars($name, $data) puts us in many-writers-many-readers territory: you need CAS (apcu_cas), merge semantics, version vectors. CRDTs exist precisely because shared-memory concurrency without enforced single-writer semantics is hard. This is why I push back on the "generic KV store" framing: it's not a cosmetic API choice, it's a fundamental concurrency model shift. Did you land on "fine, accept the tradeoff" or did that not factor into your position yet?

@henderkes
Contributor

"get_vars might be hard to debug": the realistic failure modes are: (1) timeout waiting for first set_worker_vars, or (2) the worker throws before calling set_worker_vars at all. Both already surface as RuntimeException from get_worker_vars. We can make those error messages very explicit: include the worker name, the resolved script path, and, why not, the worker's stderr if we captured it. That directly addresses the "trace from failure to root cause" concern, without changing the API shape.

get_vars gets hard to debug when it sometimes starts a worker and sometimes doesn't. That wouldn't be an issue if we stuck with your get_worker_vars implementation, but it is with the more general get_vars API we all want.

"start_worker() is more secure because paths are restricted to the project root": I'm not sure why files in the project root would be safer to execute as long-lived workers than any other file (eg files under vendor/). If you meant the public-web root, I wouldn't put worker entrypoints there (like bin/console isn't under public/). With the Caddyfile approach, there's zero attack surface around worker startup from PHP: the infra declares the trusted entrypoints once.

Not necessarily the public webroot, but a root defined by the Caddyfile for sure. The problem with your approach is that it limits to a single background worker script, which is likely in a framework like Symfony with a central kernel and container, but otherwise not.

"It's no worse than exec": that comparison doesn't fully hold. exec() is universally recognized as dangerous: linters, SAST tools, code reviewers, and devs all flag it and know to sanitize input. A new start_worker($path) wouldn't benefit from any of that for years; every composer package using it becomes a potential path-traversal vector that users would need to audit. This is a real ecosystem cost, and keeping input-sensitive functions to a minimum is a principle worth preserving.

It's a fair point, but again, I'm not concerned with security when it's explicitly configurable through some background-worker-directory. We're not giving anyone a gun here: they're stealing it, pointing it at their foot and firing repeatedly when someone actually manages to run into a security issue with it. And they could do the same with a single entrypoint, too.

"start_worker() is more explicit and therefore better": it moves the explicitness from the config file to PHP code, while adding API surface. It also introduces its own traceability hazards: what happens if code path A calls start_worker('foo', 'a.php') and path B calls start_worker('foo', 'b.php'), or both happen concurrently on different threads? Either we error out (bad UX for libraries that don't coordinate), or we silently keep the first registration (a debugging trap and a race condition that don't exist today). With entrypoints-from-Caddyfile, there's exactly one place where a name maps to a script.

It's true that this adds API surface, but it's unified API surface that we'd most likely add at some point anyway. In that case it's better to have an explicit API than magic behaviour on one function but not the other.

There's also a design-level cost: taking a path at the API surface leaks the implementation. Right now, a name is an opaque handle for "some process that publishes this data". The mechanism is up to the infra running the app: on FrankenPHP it starts a local PHP script, on a polyfill we could dispatch to an external process. The moment $path is part of the API, the contract becomes "PHP starts this specific local script", and that abstraction is locked in.

That's a very fair point that I don't have a perfect solution to. It's actually one where I'm going back and forth on whether to even use names (or just use anonymous lists) in another project I'm working on.

"start_worker is just more explicit about what the get function does": this is probably the root of the disagreement. start_worker() assumes PHP is the process that bootstraps workers. But FrankenPHP isn't like the parallel extension: PHP isn't the master process here, FrankenPHP itself is. Workers survive HTTP requests. The Go side owns their lifecycle. Asking PHP to "start a worker" inverts that ownership model.

Lazy-start from get_worker_vars fits that model naturally: it's not PHP asking Go to create a worker, it's PHP saying "I need this data, please ensure its producer is running." Go decides how and when. PHP doesn't need to care.

See point 1, because at that point it would just be confusing which calls trigger a lazy start and which don't. (And I'm honestly not even sure how useful a lazy start really is: what problem does it solve? The library would depend on the Caddyfile configuration and on the worker existing, and once the worker is hit a single time it would never shut down again.)

"FrankenPHP should have a generic KV store": I'd argue that's outside FrankenPHP's scope. APCu, Redis, Memcached, and plenty of extensions already do this well. FrankenPHP's unique value is thread and worker management: that's what no PHP extension can replace, because it requires ownership of the process model. The API surface we should expose is the one that couldn't be done without FrankenPHP. That's how we keep it minimal.

On API bloat: if we don't add start_worker(), set_vars(), get_vars(), the count stays at three: set_worker_vars, get_worker_vars, get_worker_handle. Those three cover the entire use case: publish, consume, cooperate. If APCu-like CAS primitives make sense later, they belong in a separate concern, possibly a separate PR, possibly a PHP extension, but either way not coupled to this feature.

These are all more or less the same point of disagreement, which is: this PR is locking that decision in "forever". No generic KV store, no matter if it would ever make sense. (I'd argue it would: how else would you share vars within the same application on different threads, but guard them from being accessed by other, unrelated applications? Using APCu for this is very dirty and will suffer from heavy fragmentation for a runtime concern.)

I think @alexandre-daubois essentially has the same considerations that Alex and I do too.

If it blocks until ready: it's functionally equivalent to get_worker_vars() minus the returned data. Same lazy-start, same wait, same ready signal. Adding a second function for that case is pure redundancy.

It would obviously block until started, but it would be a generic API surface that could be reused for task workers, which we still intend to add. And it would be explicit. And it would solve the inability to reason about what a unified frankenphp_get_vars would do (i.e. never lazy-start a worker).

The structural argument I'd like your read on

I just think the actual issue with it is the same as before: worker string names lead to poor reasoning. If library A uses 'redis' and library B uses the same, but both expect different worker scripts, we have the exact same issue that the many-writers, many-readers has. If we don't have conflicting worker names, there's no issue with many-writers-many-readers either.

@alexandre-daubois
Member

alexandre-daubois commented Apr 21, 2026

Marc has already articulated most of where I land, so I'll stay short and add a few angles I don't think have come up yet.

On the DNS / Redis / Symfony service name analogy, I think the comparison doesn't hold. Those APIs deliberately separate concerns: DNS has getaddrinfo, not "getaddrinfo that spawns a server if the name doesn't resolve." Redis has GET, BLPOP, SUBSCRIBE as distinct primitives. None of them bundle "read + block + lazy-spawn + timeout" into one call. What's under discussion isn't the abstractness of names, it's the composition of side effects on a single primitive, which is where the real lock-in sits.

About testability: a global function whose call can spawn a worker process is hostile to unit tests. Libraries adopting this will either need to wrap it in their own abstraction or give up on isolation in tests. A store-shaped API is materially more mockable.
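
A minimal sketch of the kind of wrapper a library could introduce to keep its code testable. All class, interface and function names here (`VarsReader`, `FrankenPhpVarsReader`, `InMemoryVarsReader`, `isMaintenanceMode`) and the `maintenance` key are hypothetical; only `frankenphp_get_worker_vars()` comes from the PR.

```php
<?php
declare(strict_types=1);

// Hypothetical abstraction a library could define around the global function.
interface VarsReader
{
    public function read(string $name): array;
}

// Production implementation; only usable when running under FrankenPHP.
final class FrankenPhpVarsReader implements VarsReader
{
    public function read(string $name): array
    {
        return \frankenphp_get_worker_vars($name);
    }
}

// Test double: no worker, no process, just an array.
final class InMemoryVarsReader implements VarsReader
{
    public function __construct(private array $vars)
    {
    }

    public function read(string $name): array
    {
        return $this->vars[$name]
            ?? throw new RuntimeException(sprintf('No vars published for "%s"', $name));
    }
}

// Library code depends on the interface, not on the global function.
function isMaintenanceMode(VarsReader $reader): bool
{
    return (bool) ($reader->read('config-watcher')['maintenance'] ?? false);
}

$reader = new InMemoryVarsReader(['config-watcher' => ['maintenance' => true]]);
var_dump(isMaintenanceMode($reader));
```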

Also, about the principle of least surprise: get_worker_vars('foo') silently starting a process is a footgun that better error messages don't remove. The surprise isn't "the error was confusing," it's "a read-looking call had a lifecycle-altering side effect."

Finally, genuine question about the API: is it possible to unset a key?

@dunglas
Member

dunglas commented Apr 21, 2026

Edited: I missed the last response by @alexandre-daubois and I agree with him. API updated.

Thanks, everyone, for the depth of this one! @nicolas-grekas for the huge amount of work, and @henderkes, @AlliBalliBaba, @alexandre-daubois, @dbu for the careful pushback. I've read through the whole thread, and I think we're close to merging it. Most of the back-and-forth is really three questions tangled together: how workers start, generic vs. worker-scoped names, and single-writer vs. many-writers. Once they're separated, each side is clearly right on some of them.

Here's my opinion on this: Caddyfile and the whole Go/C runtime stay as Nicolas designed them, but we make small changes to the PHP API:

  • frankenphp_start_background_worker(string $name, float $timeout = 30.0): void
  • frankenphp_set_vars(array $vars): void
  • frankenphp_get_vars(string|array $name, float $timeout = 30.0): array
  • frankenphp_get_worker_handle(): resource

set_vars() takes no $name. The caller writes to its own scope: a background worker writes to its declared name, and that's it. We keep Nicolas' single-writer-per-scope guarantee structurally (no CRDTs, no CAS, same safety), and we drop _worker_ from the data functions so we don't freeze semantics we don't need to freeze as suggested by Alexandre (it's also my main concern).

start_background_worker() blocks until the worker has called set_vars once (clean ready signal, same at-most-once semantics, name-only so Caddyfile stays the trust boundary). get_vars() is pure read, it can still block waiting for data if callers want it to, but no lifecycle side effect. One extra line at bootstrap (start then get) in exchange for a clean trace, mockable code, and a read that behaves like a read. Good trade.

On unsetting a key: with snapshot semantics it's just set_vars a new array without the key — no dedicated primitive needed. If we ever add per-key writes we'd add a matching unset.
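
A minimal sketch of that pattern. The helper name `withoutKey` and the example keys are hypothetical, and the publish call is left as a comment since `frankenphp_set_vars()` is only the proposed name and would only be callable from a background worker.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns a copy of the snapshot without the given key.
function withoutKey(array $snapshot, string $key): array
{
    unset($snapshot[$key]); // arrays are passed by value, so the caller's array is untouched

    return $snapshot;
}

$current = ['dsn' => 'redis://localhost', 'feature_x' => true];
$next = withoutKey($current, 'feature_x');

// Inside a background worker, the new snapshot would then be published,
// e.g. frankenphp_set_vars($next) under the API proposed above.
var_dump(array_key_exists('feature_x', $next));
```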

We can apply the same logic for #2319, drop the _worker_ part:

  • frankenphp_task_send(string $name, array $payload, float $timeout = 30.0): resource
  • frankenphp_task_read(resource $stream): ?array
  • frankenphp_task_receive(): ?array
  • frankenphp_task_update(resource $stream, array $data): void

The fact that a worker picks up the task is an implementation detail.

WDYT?

@alexandre-daubois
Member

alexandre-daubois commented Apr 21, 2026

Looks like the best of both worlds @dunglas. Dropping _worker_ from the data functions resolves the forward-compat concern I was most worried about.

Sorry if this was answered somewhere in the comments: what's the defined behavior when the caller has no worker scope, e.g. called from an HTTP request context, a CLI script, or any non-worker code path? Should it be no-op or throw a RuntimeException? I'd be in favor of the latter.

@dunglas
Member

dunglas commented Apr 21, 2026

I would throw too

@henderkes
Contributor

henderkes commented Apr 21, 2026

frankenphp_start_background_worker(string $name, float $timeout = 30.0): void
frankenphp_get_vars(string|array $name, float $timeout = 30.0): array

Why do we need a timeout for the get_vars? Shouldn't that just return immediately since the prior start_background_worker call is already blocking and guarantees it to be ready?

frankenphp_get_worker_handle(): resource

Perhaps this should return an object on which php can call get_stream()? Or are we certain that a resource will fulfil our future requirements for what that handle has to do?

Sorry if this was answered somewhere in the comments: what's the defined behavior when the caller has no worker scope, e.g. called from an HTTP request context, a CLI script, or any non-worker code path? Should it be no-op or throw a RuntimeException? I'd be in favor of the latter.

You're talking about frankenphp_set_vars? My immediate thought is to throw, but it's hard to say. What if we wanted to update a php_server-wide variable from an HTTP thread in the future? Once we guarantee throwing, we can't change it later without potentially breaking code that expected a throw.

I'm generally happy with that direction, but I'd still want to argue the case for being able to define multiple background worker scripts. We went out of our way to support non-framework code all the way up to this point, and the gain I see from a single script would already mostly disappear with an explicit start_background_worker call. I'm just not yet convinced that lazy-starting workers from a single script is really worth it. Or, going back a step, whether we wouldn't be better off defining background workers explicitly in the Caddyfile.

@nicolas-grekas
Contributor Author

Thanks @dunglas for the proposal, I think we're very close. Let me suggest a small refinement that I think fully addresses the debuggability objection without giving up anything structural.

Proposal (noted about #2319 also)

frankenphp_require_background_worker(string $name, float $timeout = 30.0): void
frankenphp_set_vars(array $vars): void
frankenphp_get_vars(string|array $name): array
frankenphp_get_worker_handle(): resource

Four functions, same count as your proposal. Two differences:

require instead of start

The function is a dependency declaration, not a command. "Start" is slightly misleading because Go owns the worker lifecycle, PHP doesn't actually start anything. "Require" better matches the intent: "I declare that this worker must be running."

Mode-dependent semantics for require_background_worker

In a worker script, BEFORE frankenphp_handle_request: starts the worker if needed, blocks until first set_vars call or timeout, throws immediately on boot failure (no exponential backoff). This is the "declare my dependencies up front, fail fast if broken" pattern.

In a worker script, INSIDE the request loop: must refer to a worker that's already running (either declared with num 1 in Caddyfile, or previously required during bootstrap). Throws if the name isn't known. Blocks with timeout if the worker is currently in crash-restart. Never lazy-starts a new worker. This makes runtime calls a clean assertion: "this dependency must be available now."

In NON-worker mode: starts the worker if needed, blocks with timeout, tolerates transient boot failures via exponential backoff. Same as current lazy-start behavior, just explicit.
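
The three cases above can be restated as a small lookup. This is only a summary of the proposal in code form: the `CallSite` enum, the `requireSemantics` function and the array keys are hypothetical names, not part of any FrankenPHP API.

```php
<?php
declare(strict_types=1);

// Hypothetical names, used only to restate the proposal above.
enum CallSite
{
    case WorkerBootstrap;   // worker script, before frankenphp_handle_request
    case WorkerRequestLoop; // worker script, inside the request loop
    case NonWorker;         // classic per-request mode
}

/** @return array{lazyStart: bool, unknownName: string, bootFailure: string} */
function requireSemantics(CallSite $site): array
{
    return match ($site) {
        CallSite::WorkerBootstrap => [
            'lazyStart' => true,
            'unknownName' => 'start it',
            'bootFailure' => 'throw immediately, no backoff',
        ],
        CallSite::WorkerRequestLoop => [
            'lazyStart' => false,
            'unknownName' => 'throw',
            'bootFailure' => 'block with timeout while the worker restarts',
        ],
        CallSite::NonWorker => [
            'lazyStart' => true,
            'unknownName' => 'start it',
            'bootFailure' => 'tolerate via exponential backoff, block with timeout',
        ],
    };
}

foreach (CallSite::cases() as $site) {
    printf("%s: lazy-start=%s\n", $site->name, requireSemantics($site)['lazyStart'] ? 'yes' : 'no');
}
```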

get_vars becomes pure-read everywhere

No lifecycle side effects, no timeout argument needed. Throws if the name isn't currently running. Consistent semantics across worker and non-worker modes.

Usage

// Worker mode
frankenphp_require_background_worker('config-watcher'); // bootstrap, fail-fast
while (frankenphp_handle_request(function () {
    $cfg = frankenphp_get_vars('config-watcher'); // pure read
})) { gc_collect_cycles(); }

// Non-worker mode (every request)
frankenphp_require_background_worker('config-watcher'); // tolerant
$cfg = frankenphp_get_vars('config-watcher');

Why this works

Addresses the "sometimes starts, sometimes doesn't" debuggability concern: get_vars never has lifecycle side effects. The require call is where lifecycle lives, and its name says so.

Preserves the mode asymmetry in the right place: the only mode-dependent behavior is in the lifecycle function (where mode genuinely matters: bootstrap vs. request), not in reads. The asymmetry is visible from the function name.

Keeps set_vars scope-less: the caller still writes to its own scope. Single-writer-per-scope, no CAS, no CRDTs.

Caddyfile remains the trust boundary: require takes a name, not a path. No new input-sensitive API.

Runtime discipline in worker mode: assertions instead of lazy-starts. Library code can declare "my dependency must be running" at runtime without the side-effect surprise.

Non-worker mode ergonomics: accepts that non-worker mode re-initializes everything per request. The require + get pattern is consistent with the rest of per-request setup in that mode.

Answering specific points from the thread

"Non-worker mode should throw": to be clear, background workers already work in non-worker mode today, and I want to keep it that way. Non-worker scripts can require and get_vars normally, that's one of the core use cases (classic request mode reading live config from a bg worker). The only thing that should throw in non-worker mode is set_vars, because the caller has no bg worker scope to write to. That's already the behavior, no change needed.

CLI: rather than throwing, we can simply not expose the functions in CLI mode. CLI is a standalone PHP execution with no worker pool, the functions would be meaningless. Not exposing them is cleaner than throwing at runtime.

get_worker_handle returning an object with get_stream() for future-proofing: I'd push back. PHP streams are the universal primitive for async I/O in PHP. They're not going anywhere, and the upcoming PHP 8.6 poll API RFC is built on top of them (Poll::addReadable accepts stream resources directly). Wrapping them in an object to "future-proof" adds complexity today for a future requirement that doesn't exist and likely won't. If one day we need something a stream can't express, we can add a new function then, and the old one still works.

"Allow defining multiple background worker scripts": this is already supported. You can declare as many worker { background; name X } blocks as you want in the Caddyfile, each with a different script. The catch-all (worker { background } without a name) is an additional mechanism, not the only one.

"catch-all assumes a framework with a central kernel": the catch-all is completely framework-agnostic. The dispatch is a single $_SERVER['FRANKENPHP_WORKER_NAME'] lookup, that's it. A match statement, a class_exists check, a require of a file named after the worker, whatever you prefer. The idea of having a single entrypoint that dispatches to multiple workers doesn't require a kernel or container, it's just switch ($_SERVER['FRANKENPHP_WORKER_NAME']) { ... }. Many PHP libraries already ship a bin/ script that dispatches to different subcommands based on $argv[1], this is the same pattern.
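
A minimal sketch of such a catch-all entrypoint. The worker names, the closures standing in for the actual worker loops, and the fallback used when the variable is not set are all assumptions for illustration; only the `$_SERVER['FRANKENPHP_WORKER_NAME']` lookup comes from the description above.

```php
<?php
declare(strict_types=1);

// Hypothetical dispatcher: maps a worker name to the code that should run.
// The closures stand in for real worker loops.
function resolveWorker(string $name): callable
{
    return match ($name) {
        'config-watcher' => static fn (): string => 'watching config',
        'cache-warmer' => static fn (): string => 'warming cache',
        default => throw new InvalidArgumentException(sprintf('Unknown background worker "%s"', $name)),
    };
}

// The fallback only exists so the sketch can run outside FrankenPHP.
$name = $_SERVER['FRANKENPHP_WORKER_NAME'] ?? 'config-watcher';

echo resolveWorker($name)(), "\n";
```

Throwing on an unknown name keeps a typo in a caller from silently starting the wrong loop.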

The DX win of the catch-all is that library authors can ship a worker entrypoint that handles multiple named workers without requiring their users to declare each one in the Caddyfile. Remove this and libraries have to document "add these N worker blocks to your Caddyfile" instead of "add one worker { background } block". That's a real usability regression for a debugging concern that the explicit require already addresses.

7 participants