[REBOL] Script to pull images out of web pages 
	 
	
Author: btiffin
Posted: Wed May 07, 2008 12:52 am

Hello,

This script will pull images out of a web page: it finds the img tags, loads each image, and prints its url and size. I wrote it for darkangel and thought I might as well post it.

Code:
 
REBOL [
    Title: "snag images"
]

;; ** change the site **
site: http://www.rebol.com

tags: copy []
page: read site

; pull out all the html tags
parse page [
    some [to "<" copy tag thru ">" (append tags tag)]
    to end
]

; look for "img" tags, pull out the filename,
;   prepend site to relative names to get a full url, load image
foreach tag tags [
    ; find/match only accepts tags that start with "<img"
    if find/match tag "<img" [
        ; attempt skips tags without src="..." and images that fail to load
        attempt [
            start: find/tail tag {src="}
            end: find start {"}
            file: copy/part start end
            ; a name is treated as absolute only if it starts with "http"
            unless find/match file "http" [
                unless equal? first file #"/" [insert head file "/"]
            ]
            url: either find/match file "http" [file] [join site file]
            url: to url! url
            img: load url
            print [url "is" img/size]
        ]
    ]
]
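
To see the two steps in isolation (collecting tags with parse, then slicing out the src value), here is a small sketch that runs on a literal string instead of a live page. The sample markup and the word `stop` are made up for illustration, and it assumes REBOL 2 like the script above.

```rebol
REBOL [
    Title: "tag parse demo"
]

; hypothetical sample markup, standing in for the result of `read site`
page: {<html><body><p>hi</p><img src="/graphics/logo.gif" alt="logo"></body></html>}

tags: copy []

; same rule as the script: jump to each "<", copy through the next ">"
parse page [
    some [to "<" copy tag thru ">" (append tags tag)]
    to end
]

foreach tag tags [
    if find/match tag "<img" [
        start: find/tail tag {src="}    ; position just after src="
        stop: find start {"}            ; position of the closing quote
        print copy/part start stop      ; the text between the two positions
    ]
]
```

With this sample there is only one img tag, so the loop should print just its src value. Because nothing is fetched, it is an easy way to try changes to the parse rule before pointing the script at a real site.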
 
If anyone wants it explained, or wants info on what can be done beyond printing the url and size (width by height), just ask. This is a quick write; a lot of the lines could be removed and expressions condensed. It could be made a function where you pass the site instead of hardcoding it, and so on. By the way, REBOL's load function is doubleplus good. Note: this won't handle all cases, by any means; ECMAScript-calculated names, CGI-hidden names, etc.
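
As a rough sketch of the "make it a function" idea, something like the following could work. The name `image-urls`, its return value (a block of urls), and the local word names are illustrative assumptions, not part of the original script. It assumes REBOL 2 and a site url with no trailing slash.

```rebol
REBOL [
    Title: "snag images as a function"
]

; hypothetical helper: returns a block of image urls found on the page at site
image-urls: func [
    "Collect the src urls of img tags on a page"
    site [url!] "Base url without a trailing slash"
    /local page tag start stop file urls
][
    urls: copy []
    page: read site
    parse page [
        any [
            to "<" copy tag thru ">" (
                if find/match tag "<img" [
                    ; attempt skips img tags that have no src="..."
                    attempt [
                        start: find/tail tag {src="}
                        stop: find start {"}
                        file: copy/part start stop
                        either find/match file "http" [
                            append urls to url! file
                        ][
                            unless equal? first file #"/" [insert file "/"]
                            append urls join site file
                        ]
                    ]
                ]
            )
        ]
        to end
    ]
    urls
]

; usage: same output as the script, but the site is passed in
urls: image-urls http://www.rebol.com
foreach url urls [
    attempt [
        img: load url
        print [url "is" img/size]
    ]
]
```

Returning the urls rather than printing inside the function keeps the collecting and the loading separate, so the caller decides what to do with each image.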
 
 
Cheers
			 
			