Concurrent Curl requests

Greg Bowler edited this page Mar 28, 2023 · 2 revisions

So far we've only executed individual HTTP requests. We create a Curl object, execute it, then read the response.

If a script needs to make several HTTP requests, running them one after another can be inefficient. For example, if we have 10 HTTP requests that each take 1 second, running them sequentially takes about 10 seconds. If we run all 10 concurrently, the script takes roughly 1 second, because the requests execute in parallel.
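The arithmetic above can be checked with a tiny sketch. The durations are made-up values rather than measurements, and the concurrent figure is a best case that ignores connection limits and bandwidth.

```php
<?php
// Hypothetical durations, in seconds, for 10 requests that each take 1 second.
$durations = array_fill(0, 10, 1.0);

// Run one after another, the total is the sum of every duration.
$sequentialSeconds = array_sum($durations);

// Run concurrently, the best-case total is the duration of the slowest request.
$concurrentSeconds = max($durations);

echo "Sequential: {$sequentialSeconds}s" . PHP_EOL;
echo "Concurrent: {$concurrentSeconds}s" . PHP_EOL;
```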

The way to do this is to build each individual Curl object as usual, but instead of calling exec() on them directly, we add them to a new CurlMulti object, and call exec on that instead.

The CurlMulti::exec() function takes a variable by reference and sets it to the number of requests still active. While that number is greater than 0, there is still work to do: we should wait a while, then call exec() again.

Here's an example that makes three individual requests, executes them concurrently within a CurlMulti object, waits for them all to complete, then outputs all the responses:

use Gt\Curl\Curl;
use Gt\Curl\CurlMulti;

$curlCat = new Curl("https://catfact.ninja/fact");
$curlIp = new Curl("https://api.ipify.org/?format=json");
$curlTimeout = new Curl("https://this-domain-name-does-not-exist.example.com/nothing.json");

$multi = new CurlMulti();
$multi->add($curlCat);
$multi->add($curlIp);
$multi->add($curlTimeout);

$stillRunning = 0;
do {
	$multi->exec($stillRunning);
	usleep(100_000);
	echo ".";
}
while($stillRunning > 0);

echo PHP_EOL;
echo "Cat API response: " . $multi->getContent($curlCat) . PHP_EOL;
echo "IP API response: " . $multi->getContent($curlIp) . PHP_EOL;
echo "Timeout API response: " . $multi->getContent($curlTimeout) . PHP_EOL;
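The loop above keeps polling for as long as any request is active. Where an upper bound on the total wait is wanted, the same pattern can be wrapped in a helper with a deadline. This is a minimal sketch: pollUntilDone is a hypothetical function, not part of the library, and the step closure here is a stand-in so the example runs without a network.

```php
<?php
/**
 * Hypothetical helper: calls $step until it reports 0 active requests or
 * $maxSeconds elapses. Returns the last number of active requests reported.
 */
function pollUntilDone(callable $step, float $maxSeconds, int $sleepMicroseconds = 100_000): int {
	$deadline = microtime(true) + $maxSeconds;
	do {
		$stillRunning = $step();
		if($stillRunning > 0) {
			// Give the transfers time to progress before polling again.
			usleep($sleepMicroseconds);
		}
	}
	while($stillRunning > 0 && microtime(true) < $deadline);

	return $stillRunning;
}

// Stand-in for CurlMulti::exec: pretends three requests finish one per call.
$remaining = 3;
$fakeStep = function() use (&$remaining): int {
	return $remaining > 0 ? --$remaining : 0;
};

$left = pollUntilDone($fakeStep, 5.0, 1_000);
echo "Requests still running: $left" . PHP_EOL;
```

With a real CurlMulti object, the step would presumably be a closure that calls $multi->exec($stillRunning) and returns $stillRunning, mirroring the loop in the example above.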

Streaming responses

In the above example, we are running the three requests concurrently and echoing the responses after all the responses have completed. In some situations, it may be beneficial to receive the response as soon as any content is received, rather than having to wait for the entire response to complete its download.

For example, when large responses are expected, we can stream the incoming data somewhere to process the information in a memory efficient manner, rather than loading the entire response data into memory.

To do this, we can set a header/write function on the individual Curl objects. This function will be called whenever any bytes are received from the server.

To process each HTTP header as it's received, set the CURLOPT_HEADERFUNCTION option. To process the incoming response, whenever any bytes are received, set the CURLOPT_WRITEFUNCTION option.

The value of each option is a function or other callable with the following signature:

function(CurlHandle $ch, string $buffer):int;

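The first parameter is the native PHP CurlHandle. The second parameter is the most important: the incoming data. The function must return the number of bytes it handled, normally strlen($buffer); if it returns a different number, cURL treats it as an error and aborts the transfer.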

The header callback is called once for each header. Only complete header lines are passed to the callback, each ending with a newline character.
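Because each call receives one complete header line, a header callback can parse lines into name/value pairs as they arrive. The following is a sketch under assumptions: makeHeaderCollector is a hypothetical helper, the header lines are simulated rather than fetched, and a repeated header name overwrites the earlier value.

```php
<?php
// Hypothetical factory: returns a header callback that stores parsed headers
// in $headers, which is passed by reference so the caller can read it later.
function makeHeaderCollector(array &$headers): callable {
	return function($ch, string $rawHeader) use (&$headers): int {
		$parts = explode(":", $rawHeader, 2);
		// The status line and the blank separator line contain no colon,
		// so they are skipped here.
		if(count($parts) === 2) {
			$headers[strtolower(trim($parts[0]))] = trim($parts[1]);
		}
		// Always report the full length, including the trailing newline.
		return strlen($rawHeader);
	};
}

$headers = [];
$collector = makeHeaderCollector($headers);

// Simulated header lines, passed one at a time as cURL would.
$collector(null, "HTTP/1.1 200 OK\r\n");
$collector(null, "Content-Type: application/json\r\n");
$collector(null, "\r\n");

echo $headers["content-type"] . PHP_EOL;
```

The returned closure could then be passed to setOpt(CURLOPT_HEADERFUNCTION, ...) on an individual Curl object.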

The write callback is called as soon as there is data received that needs to be saved. For most transfers, this callback gets called many times and each invocation delivers another chunk of data.
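To keep memory use flat for large responses, a write callback can append each chunk to a stream instead of building one large string. This is a minimal sketch: makeStreamWriter is a hypothetical helper, php://temp stands in for a real destination file, and the chunks are simulated so no request is made.

```php
<?php
// Hypothetical factory: returns a write callback that appends each chunk to
// $stream instead of accumulating the whole body in a string.
function makeStreamWriter($stream): callable {
	return function($ch, string $chunk) use ($stream): int {
		$written = fwrite($stream, $chunk);
		// Report the bytes handled; cURL treats a mismatch as a write error.
		return $written === false ? 0 : $written;
	};
}

$stream = fopen("php://temp", "w+");
$writer = makeStreamWriter($stream);

// Simulate cURL delivering a body in two chunks (no network involved).
$writer(null, '{"fact": "Cats sleep ');
$writer(null, 'a lot."}');

// Read the stream back to show that the chunks were joined in order.
rewind($stream);
$saved = stream_get_contents($stream);
fclose($stream);
echo $saved . PHP_EOL;
```

The closure could be registered with setOpt(CURLOPT_WRITEFUNCTION, makeStreamWriter($stream)), using one stream per Curl object.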

Here's an example that uses the same three APIs, but this time echoes the response headers and body as soon as they are received. Because the requests run concurrently, output from different requests will likely be interleaved. In a real project, the callbacks would need to keep track of which CurlHandle each chunk belongs to, so that every response is saved to the correct location.

use Gt\Curl\Curl;
use Gt\Curl\CurlMulti;

$urlArray = [
	"https://catfact.ninja/fact",
	"https://api.ipify.org/?format=json",
	"https://this-domain-name-does-not-exist.example.com/nothing.json",
];

$multi = new CurlMulti();

foreach($urlArray as $url) {
	$curl = new Curl($url);
	$curl->setOpt(CURLOPT_HEADERFUNCTION, function ($ch, string $rawHeader):int {
		echo "HEADER: $rawHeader";
		return strlen($rawHeader);
	});
	$curl->setOpt(CURLOPT_WRITEFUNCTION, function ($ch, string $rawBody):int {
		echo "BODY: $rawBody\n";
		return strlen($rawBody);
	});
	$curl->setOpt(CURLOPT_TIMEOUT, 10);
	$multi->add($curl);
}

$stillRunning = 0;
do {
	$multi->exec($stillRunning);
	usleep(10_000);
	echo ".";
}
while($stillRunning > 0);

echo PHP_EOL;
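One way to keep interleaved chunks apart, as an alternative to tracking the CurlHandle, is to give each request its own callback that remembers its URL through the closure's use clause. This is a sketch under assumptions: $makeWriter and $bodies are hypothetical names, and the chunks and IP address are simulated values.

```php
<?php
// Hypothetical store of response bodies, keyed by the URL they came from.
$bodies = [];

// Returns a write callback bound to one URL via the closure's "use" clause.
$makeWriter = function(string $url) use (&$bodies): callable {
	$bodies[$url] = "";
	return function($ch, string $rawBody) use (&$bodies, $url): int {
		$bodies[$url] .= $rawBody;
		return strlen($rawBody);
	};
};

$catWriter = $makeWriter("https://catfact.ninja/fact");
$ipWriter = $makeWriter("https://api.ipify.org/?format=json");

// Simulate interleaved chunks arriving from two concurrent transfers.
$catWriter(null, '{"fact":');
$ipWriter(null, '{"ip":"203.0.113.7"}');
$catWriter(null, '"Cats purr."}');

foreach($bodies as $url => $body) {
	echo "$url => $body" . PHP_EOL;
}
```

Inside the foreach loop of the example above, this would replace the inline write function with $makeWriter($url).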

The Fetch API

This library was created to provide a solid foundation for PHP.Gt/Fetch, a more fully featured library designed to be compatible with web standards. Behind the scenes, Fetch uses this Curl library to manage the HTTP transport, but it has a Promise-based interface that is familiar to web developers outside of the PHP community.

For more information, see https://www.php.gt/fetch.