html - Check duplicate content without doing a GET -

- January 15, 2010

One of the main purposes of URL normalization is to avoid GET requests on distinct URLs that produce the exact same result.

Now, I know you can check for the canonical tag, or even compare the HTML of two URLs to see if they're the same. However, you have to download the exact same resource twice in order to do this, which defeats the point stated above.

Is there a way to check for duplicated content doing only a HEAD request? If not, is there a way to download only the <head> section of a web page without downloading the entire document?

I can think of solutions for the last one, but I want to know if there's a direct one.

According to the MSDN documentation, the solution to your question is the following:

    Dim myHttpWebRequest As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
    Dim myHttpWebResponse As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
    Console.WriteLine(ControlChars.Lf + ControlChars.Cr + "The following headers were received in the response")
    Dim i As Integer
    While i < myHttpWebResponse.Headers.Count
        Console.WriteLine(ControlChars.Cr + "Header Name:{0}, Value :{1}", myHttpWebResponse.Headers.Keys(i), myHttpWebResponse.Headers(i))
        i = i + 1
    End While
    myHttpWebResponse.Close()

Let me explain the code. The first statement creates an HttpWebRequest for the specified URL, and the second gets the response. The third prints a heading for the headers present in the response received from the URI. The While loop then goes through them: the Headers property is a WebHeaderCollection, so you use its properties to traverse the collection and display each header. The last statement closes the response.

If you want only the head portion of a web page, a PHP class is freely available at http://www.phpclasses.org/package/4033-php-extract-html-contained-in-tags-from-a-web-page.html
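As a rough illustration of the two ideas raised in the question, here is a minimal Python sketch; it is not taken from the MSDN sample. The function names (head_fingerprint, compare_fingerprints, read_head_section) and the choice of headers are assumptions made for this example. It also assumes the server actually sends validator headers such as ETag or Content-Length on a HEAD response, which many do not. An ETag is formally scoped to a single URL, so a match across two URLs is only a strong hint, never proof, and "unknown" is a common outcome.

```python
import urllib.request

# Headers that can hint at body identity; a server may omit any of them.
VALIDATOR_HEADERS = ("ETag", "Content-MD5", "Content-Length", "Last-Modified")


def head_fingerprint(url, timeout=10):
    """Send a HEAD request (no body is transferred) and collect validator headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        found = {}
        for name in VALIDATOR_HEADERS:
            value = response.headers.get(name)
            if value is not None:
                found[name] = value
        return found


def compare_fingerprints(a, b):
    """Return True (likely duplicate), False (bodies differ) or None (unknown)."""
    length_a, length_b = a.get("Content-Length"), b.get("Content-Length")
    # Different transferred lengths: the bodies cannot be byte-identical.
    if length_a is not None and length_b is not None and length_a != length_b:
        return False
    # A matching strong ETag or Content-MD5 is treated as a hint of identical content.
    for name in ("ETag", "Content-MD5"):
        value_a, value_b = a.get(name), b.get(name)
        if value_a and value_a == value_b and not value_a.startswith("W/"):
            return True
    return None


def read_head_section(stream, chunk_size=2048, max_bytes=65536):
    """Read from a response-like stream and stop once </head> has been seen.

    Returns the bytes up to and including </head>. If the tag never shows up,
    reading stops near max_bytes and whatever was read is returned.
    """
    buffer = b""
    while len(buffer) < max_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        end = buffer.lower().find(b"</head>")
        if end != -1:
            return buffer[:end + len(b"</head>")]
    return buffer


# Example use (needs network access; the URLs are placeholders):
# a = head_fingerprint("http://example.com/page")
# b = head_fingerprint("http://example.com/page?ref=1")
# print(compare_fingerprints(a, b))
# with urllib.request.urlopen("http://example.com/page") as resp:
#     print(read_head_section(resp))
```

When the headers are inconclusive, the fallback is a GET that is abandoned early: read_head_section stops reading as soon as the closing head tag arrives, so the canonical tag can be inspected without pulling the rest of the document.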
