1. Basic principles of image src extraction

2. Extracting image src with regular expressions

<?php
$html = <<<HTML
<html>
<head><title>Test Page</title></head>
<body>
<img src="https://example.com/image1.jpg" alt="Image 1">
<img src="image2.jpg" alt="Image 2">
</body>
</html>
HTML;

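// Capture the value of a double-quoted src attribute inside each <img> tag (case-insensitive).
$pattern = '/<img\s+[^>]*src="([^"]*)"[^>]*>/i';
preg_match_all($pattern, $html, $matches);

// $matches[1] holds the contents of the first capture group, i.e. the src values.
foreach ($matches[1] as $src) {
    echo "Found image src: $src\n";
}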
?>
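
The pattern above only recognises a double-quoted src, and it would also pick up attributes such as data-src. As a rough sketch of how this could be tightened (the function name extractImgSrcs is a hypothetical helper, not part of the original example), the following variant accepts either quote style and skips attributes that merely end in "src". It is still a regular expression, so unusual markup may slip through.

```php
<?php
// Hypothetical helper: returns the src values of all <img> tags found in $html.
function extractImgSrcs(string $html): array
{
    // (?<![\w-]) stops "data-src" from matching; (["\']) ... \1 pairs the same quote character.
    $pattern = '/<img\b[^>]*?(?<![\w-])src\s*=\s*(["\'])(.*?)\1/is';
    preg_match_all($pattern, $html, $matches);

    // Group 1 is the quote character, group 2 is the attribute value.
    return $matches[2];
}

$html = '<img alt="a" src=\'one.png\'><img data-src="lazy.png" src="two.png">';
print_r(extractImgSrcs($html));
```

Here the quote character is captured in its own group, so the src values move to the second capture group rather than the first.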

3. Extracting image src with DOM parsing

<?php
$html = <<<HTML
<html>
<head><title>Test Page</title></head>
<body>
<img src="https://example.com/image1.jpg" alt="Image 1">
<img src="image2.jpg" alt="Image 2">
</body>
</html>
HTML;

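// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();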
$images = $dom->getElementsByTagName('img');

foreach ($images as $image) {
    $src = $image->getAttribute('src');
    echo "Found image src: $src\n";
}
?>

This code first loads the HTML string into a DOMDocument object. It then calls getElementsByTagName to retrieve every <img> tag and iterates over them to read each src attribute value.
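
Note that getAttribute returns an empty string for an <img> tag that has no src at all. One possible way to filter those out, shown here only as a sketch, is to query the document with DOMXPath. The function name extractImgSrcsWithXPath is a hypothetical helper, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical helper: collects non-empty src values using an XPath query.
function extractImgSrcsWithXPath(string $html): array
{
    // Collect parse warnings internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $srcs = [];
    // //img[@src] selects only <img> elements that actually carry a src attribute.
    foreach ($xpath->query('//img[@src]') as $image) {
        $src = trim($image->getAttribute('src'));
        if ($src !== '') {
            $srcs[] = $src;
        }
    }
    return $srcs;
}

$html = '<html><body><img src="a.jpg"><img alt="no source"><img src=\'b.png\'></body></html>';
print_r(extractImgSrcsWithXPath($html));
```

Because the parser handles quoting itself, single-quoted attributes need no special treatment in this approach.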

4. Practical example: automatically downloading images from a web page

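<?php
$html = file_get_contents('https://example.com');
if ($html === false) {
    exit("Failed to fetch the page\n");
}
$pattern = '/<img\s+[^>]*src="([^"]*)"[^>]*>/i';
preg_match_all($pattern, $html, $matches);

foreach ($matches[1] as $src) {
    // Use only the URL path so a query string does not end up in the file name.
    $image_name = basename((string) parse_url($src, PHP_URL_PATH));
    // Relative src values are not resolved against the page URL here, so they may fail to download.
    $data = @file_get_contents($src);
    if ($image_name === '' || $data === false) {
        echo "Skipped $src\n";
        continue;
    }
    // The uploads/ directory must already exist and be writable.
    file_put_contents("uploads/$image_name", $data);
    echo "Downloaded $image_name\n";
}
?>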
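
The src values found on a page are often relative (image2.jpg in the earlier examples, for instance), and file_get_contents cannot fetch those without knowing the page's address. The sketch below shows one way such values might be turned into absolute URLs before downloading. The function name resolveImageUrl is a hypothetical helper, and the logic is deliberately simplified: it does not normalise "../" segments, honour a <base> tag, or treat data: URIs specially.

```php
<?php
// Hypothetical helper: turns an <img> src into an absolute URL, given the page URL.
function resolveImageUrl(string $src, string $pageUrl): string
{
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $src)) {
        return $src; // already absolute, e.g. https://example.com/image1.jpg
    }

    $parts = parse_url($pageUrl);
    $scheme = $parts['scheme'] ?? 'https';
    $origin = $scheme . '://' . ($parts['host'] ?? '');
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }

    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src; // protocol-relative, e.g. //cdn.example.com/a.png
    }
    if (strpos($src, '/') === 0) {
        return $origin . $src; // root-relative, e.g. /img/a.png
    }

    // Otherwise resolve against the directory part of the page's path.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

echo resolveImageUrl('image2.jpg', 'https://example.com'), "\n";
echo resolveImageUrl('img/a.png', 'https://example.com/blog/post.html'), "\n";
```

In the download loop, each $src could be passed through such a function before it is handed to file_get_contents.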