Computer Science Canada :: View topic - [REBOL] Script to pull images out of web pages

Author:

btiffin [ Wed May 07, 2008 12:52 am ]

Post subject:

[REBOL] Script to pull images out of web pages

Hello,

This script pulls the images out of a web page. I wrote it for darkangel and thought I might as well post it.

code:

REBOL [
    Title: "snag images"
]

;; ** change the site **
site: http://www.rebol.com

tags: copy []
page: read site

; pull out all the html tags
parse page [
    some [to "<" copy tag thru ">" (append tags tag)]
    to end
]

; look for tags with "img", pull out the filename,
; maybe append site to get full url, load image
foreach tag tags [
    if find tag "img" [
        ; attempt swallows errors, e.g. a tag with no src="..." attribute
        attempt [
            start: find/tail tag {src="}
            ; named finish rather than end, which is also a parse keyword
            finish: find start {"}
            file: copy/part start finish
            ; relative path: make sure it begins with a slash
            unless find file "http" [
                unless equal? first file #"/" [insert head file "/"]
            ]
            url: either find file "http" [file] [join site file]
            url: to url! url
            img: load url
            print [url "is" img/size]
        ]
    ]
]

If anyone wants it explained, or wants to know what can be done beyond printing the URL and size (width by height), just ask. This was a quick write: many of the lines could be removed and the expressions condensed, and it could be made into a function that takes the site as an argument instead of hardcoding it. By the way, REBOL's load function is doubleplus good. Note: this won't handle all cases by any means, such as image names calculated by ECMAScript or images hidden behind CGI.
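
For anyone curious about the function form, here is a rough sketch. The name snag-images, the choice to return a block of URLs rather than print them, and the local word finish are assumptions for illustration, not part of the script above. Like the script, it assumes site is a bare host URL with no trailing path, and that REBOL/View is doing the loading so that load gives back an image.

code:

REBOL [
    Title: "snag images, function form"
]

; Hypothetical helper: collect image URLs from a page and return them
snag-images: func [
    "Collect the URLs of images referenced by a page"
    site [url!] "Page to scan (assumed to be a bare host URL)"
    /local tags urls page tag start finish file
][
    tags: copy []
    urls: copy []
    page: read site

    ; same tag scan as the script
    parse page [
        some [to "<" copy tag thru ">" (append tags tag)]
        to end
    ]

    foreach tag tags [
        if find tag "img" [
            ; attempt skips tags that have no src="..." attribute
            attempt [
                start: find/tail tag {src="}
                finish: find start {"}
                file: copy/part start finish
                ; relative path: add a leading slash and prefix the site
                unless find file "http" [
                    unless equal? first file #"/" [insert file "/"]
                    file: join site file
                ]
                append urls to url! file
            ]
        ]
    ]
    urls
]

; usage: print the size of each image that loads
foreach url snag-images http://www.rebol.com [
    attempt [
        img: load url
        print [url "is" img/size]
    ]
]

Returning the block of URLs keeps the page scanning separate from whatever is done with the images, so the caller can print sizes, save files, or just list the links.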

Cheers
