Question

I'm writing some link scraping code where I was hoping to grab only the <head> section of a given web page. Apparently I've been confused about what a HEAD request is, as I thought it was supposed to do exactly that. Instead, it just returns HTTP headers.

Is there a way to fetch just the <head> section of a given page, without getting the whole doc?


Solution

No, HTTP has no provision for that; the protocol doesn't know about HTML at all. You'll need to do a proper GET or POST, then use an HTML parser to extract the data you need.
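As a rough sketch of the parsing step, here is one way to pull the title and the meta/link tags out of the head using Python's standard `html.parser`. The names `HeadCollector` and `parse_head` are made up for illustration, and the HTML text is assumed to have already been downloaded with an ordinary GET.

```python
from html.parser import HTMLParser


class HeadCollector(HTMLParser):
    """Collects the <title> text and <meta>/<link> attributes found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.title = ""
        self.tags = []  # (tag name, attribute dict) for each <meta>/<link> in <head>

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and tag == "title":
            self.in_title = True
        elif self.in_head and tag in ("meta", "link"):
            self.tags.append((tag, dict(attrs)))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only text inside <head><title>...</title> is kept.
        if self.in_title:
            self.title += data


def parse_head(html_text):
    """Return (title, tags) for the head section of an already-downloaded page."""
    collector = HeadCollector()
    collector.feed(html_text)
    return collector.title.strip(), collector.tags
```

Anything after the closing head tag is still parsed but ignored, so this saves no bandwidth; it only isolates the part of the document you care about.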

The only thing you could do to limit what you get back is send a Range header, but the number of bytes you ask for would just be guesswork on your part, and the server is free to ignore the header and send the whole document anyway.
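A minimal sketch of the Range idea, using only `urllib.request` from the standard library. The helper names `fetch_prefix` and `head_from_bytes` are hypothetical, and the 16384-byte default is an arbitrary guess, not a recommended value.

```python
import urllib.request


def head_from_bytes(data):
    """Return the bytes up to and including </head>, or None if it was not reached."""
    end = data.lower().find(b"</head>")
    if end == -1:
        return None
    return data[: end + len(b"</head>")]


def fetch_prefix(url, max_bytes=16384):
    """Ask for only the first max_bytes bytes of url (max_bytes is a guess)."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes=0-{max_bytes - 1}"}
    )
    with urllib.request.urlopen(request) as response:
        # A 206 status means the server honoured the range. A 200 means it
        # ignored the header, so cap the read on the client side as well.
        return response.read(max_bytes)
```

If `head_from_bytes(fetch_prefix(url))` comes back as `None`, the guess was too small, and you would have to retry with a larger range or fall back to a full GET.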

License: CC-BY-SA with attribution