I fetched the HTML page with the following code:
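#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Client;
my $client = HTTP::Client->new();
# Fetch the page body as a string.
my $site = $client->get("http://www.csc.liv.ac.uk/");
# Headers and user-agent string, fetched here but not used below.
my @headers = $client->response_headers;
my $agent = $client->agent;
print $site;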
Grab images:
#!/usr/bin/perl
# cnhacktnt {a t} perlchina.org
# http://perlchina.org or http://wanghui.org
use strict;
use warnings;
use LWP::Simple;
use URI;
my $url = 'http://www.csc.liv.ac.uk/';
my $content = get $url;
my %imgs;
if ($content) {
    # Collect every src="..." value (this also matches non-image tags such as <script>).
    while ($content =~ m/src="(.+?)"/gi) {
        # Resolve relative links against the page URL.
        my $imgurl = URI->new_abs($1, $url)->as_string;
        # Use the last path segment as the local file name; skip URLs without one.
        my ($filename) = $imgurl =~ m{/([^/?#]+)(?:[?#].*)?$} or next;
        $imgs{$filename} = $imgurl;
    }
    for (keys %imgs) {
        print "Getting $imgs{$_}, save as $_\n";
        getstore $imgs{$_}, $_;
    }
}
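The script above picks up every src attribute with a regex, so it will also try to download scripts and frames. Below is a rough sketch of one alternative, assuming HTML::LinkExtor (from the HTML::Parser distribution) is installed. The helper name image_urls and the sample HTML are made up for illustration; this is not the original author's method.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical helper: return absolute URLs of all <img src="..."> in $html.
sub image_urls {
    my ($html, $base) = @_;
    my @urls;
    my $parser = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            # Keep only <img> tags; <script src> and <iframe src> are skipped.
            push @urls, "$attr{src}" if $tag eq 'img' && defined $attr{src};
        },
        $base,    # relative links are resolved against this URL
    );
    $parser->parse($html);
    $parser->eof;
    return @urls;
}

# Sample input (made up) so the sketch runs without network access.
my $html = '<img src="logo.gif"><script src="x.js"></script>'
         . '<img src="http://example.com/a/b.png">';
print "$_\n" for image_urls($html, 'http://www.csc.liv.ac.uk/');
```

To use it with the script above, pass $content and $url to image_urls and hand each returned URL to getstore.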
Another approach for grabbing images.
I also found another method on CPAN, so I'm posting it here as well.
#!/usr/bin/perl
use Image::Grab;
$pic = new Image::Grab;
$pic->url('http://album.9you.com/pic/comicphoto/98/uqh1116383139.jpg');
$pic->grab;
open(IMAGE, ">image.jpg") || die"image.jpg: $!";
binmode IMAGE; # for MSDOS derivations.
print IMAGE $pic->image;
close IMAGE;
Sunday, May 07, 2006
