Skip to content

Conversation

@Mdsujansarkar
Copy link
Contributor

Fix finfo_file() for remote resources (GH-20679)

Summary

Fixes finfo_file() to work correctly with remote URLs (HTTP/HTTPS) by ensuring that magic_stream() is called even when php_stream_stat() fails on remote resources.

Problem

Previously, finfo_file() would return false for remote URLs because:

  1. php_stream_stat() fails on remote streams (stat is not available for HTTP/HTTPS resources)
  2. The code only called magic_stream() inside the if (php_stream_stat() == SUCCESS) block
  3. When stat failed, ret_val remained NULL, causing the function to return false

Example of the bug:

$f = finfo_open();
var_dump(finfo_file($f, "https://example.com"));
// Expected: string(53) "HTML document, ASCII text, with very long lines (512)"
// Actual:   bool(false)

Solution

Restructured the logic in php_fileinfo_from_path() to:

  1. Use php_stream_stat() only to detect directories (when stat succeeds)
  2. Always attempt magic_stream() when the resource is not a directory, regardless of whether stat succeeded or failed

This ensures that:

  • Local directories are still correctly identified as "directory"
  • Remote URLs are processed by magic_stream() even when stat fails
  • Local files continue to work as before

Changes

Code Changes

File: ext/fileinfo/fileinfo.c

Before:

const char *ret_val = NULL;
if (php_stream_stat(stream, &ssb) == SUCCESS) {
    if (ssb.sb.st_mode & S_IFDIR) {
        ret_val = "directory";
    } else {
        ret_val = magic_stream(magic, stream);
        if (UNEXPECTED(ret_val == NULL)) {
            php_error_docref(NULL, E_WARNING, "Failed identify data %d:%s", magic_errno(magic), magic_error(magic));
        }
    }
}

After:

const char *ret_val = NULL;
if (php_stream_stat(stream, &ssb) == SUCCESS) {
    if (ssb.sb.st_mode & S_IFDIR) {
        ret_val = "directory";
    }
}
if (!ret_val) {
    ret_val = magic_stream(magic, stream);
    if (UNEXPECTED(ret_val == NULL)) {
        php_error_docref(NULL, E_WARNING, "Failed identify data %d:%s", magic_errno(magic), magic_error(magic));
    }
}

Test Case

File: ext/fileinfo/tests/finfo_file_remote_url.phpt

Added a new test case that verifies finfo_file() works with remote HTTP resources using a local test server.

--TEST--
GH-20679 (finfo_file() doesn't work on remote resources)
--EXTENSIONS--
fileinfo
--INI--
allow_url_fopen=1
--SKIPIF--
<?php
if (@!include "./ext/standard/tests/http/server.inc") die('skip server.inc not available');
http_server_skipif();
?>
--FILE--
<?php
require "./ext/standard/tests/http/server.inc";

['pid' => $pid, 'uri' => $uri] = http_server([
    "data://text/plain,HTTP/1.0 200 Ok\r\n\r\n<html>foo",
], $output);

$f = finfo_open();
var_dump(finfo_file($f, $uri));

http_server_kill($pid);
?>
--EXPECT--
string(51) "HTML document, ASCII text, with no line terminators"

Testing

Manual Testing

Tested with various scenarios:

  1. Remote HTTP URL:

    $f = finfo_open();
    var_dump(finfo_file($f, "http://example.com"));
    // Result: string(53) "HTML document, ASCII text, with very long lines (512)"
  2. Remote HTTPS URL (requires OpenSSL):

    $f = finfo_open();
    var_dump(finfo_file($f, "https://example.com"));
    // Result: string(53) "HTML document, ASCII text, with very long lines (512)"
  3. Local files: Still work correctly

  4. Directories: Still correctly identified as "directory"

Automated Testing

Run the test suite:

./sapi/cli/php run-tests.php ext/fileinfo/tests/finfo_file_remote_url.phpt

Result: ✅ PASS

Requirements

For Testing

  • pcntl extension: Required to run the PHPT test (uses http_server() which requires pcntl_fork())

    • Build PHP with: ./configure --enable-pcntl
  • allow_url_fopen: Must be enabled (set in test's --INI-- section)

For HTTPS Support (Optional)

  • OpenSSL extension: Required for HTTPS URLs to work
    • Build PHP with: ./configure --with-openssl
    • Note: HTTP URLs work without OpenSSL

Backward Compatibility

Fully backward compatible:

  • Local files continue to work exactly as before
  • Directories are still correctly identified
  • No changes to function signatures or behavior
  • Only fixes the bug where remote URLs returned false

Related Issues

Additional Notes

  • The fix differentiates between statable (local files) and non-statable (remote URLs) streams
  • For statable streams, we still use stat to detect directories efficiently
  • For non-statable streams, we fall back to magic_stream() which works for all stream types
  • This approach is more robust and handles edge cases better

@ndossche
Copy link
Member

ndossche commented Dec 22, 2025

This reads like an LLM. Also, did you take my suggestion patch that I posted in the issue as your own ... ?

@iluuu1994
Copy link
Member

There's already an open PR for this (#20700), so let's close.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

finfo_file() doesn't work on remote resources

3 participants