Problem

I am trying to scrape a website for article titles, but the page loads only the first five titles and loads more when the user scrolls down (a JSON request fetches more articles and injects them into the page).

The web scraper I built works perfectly, but it only finds the five default articles, and I want to load more than five. Is there any way to achieve that using PHP? If you can explain why and how it works, I would really appreciate it, because I love to learn these things.


Solution

You can use Chrome's network monitor to find the source of the AJAX requests and then request those URLs directly from your scraper. Be aware that this is really a makeshift API: it will break if the site changes its JSON format. You can use the PHP function json_decode to decode the JSON.
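As a rough sketch of the decoding step, assuming the endpoint returns an object with an "articles" list whose items carry a "title" field (that shape and the function name extractTitles are made up for illustration; check the real response in the network monitor):

```php
<?php
// Hypothetical payload shape: {"articles": [{"title": "..."}, ...]}.
// Adjust the keys to whatever the real response contains.
function extractTitles(string $json): array
{
    // Passing true as the second argument yields associative arrays.
    $data = json_decode($json, true);

    // json_decode returns null on invalid JSON; treat that as "no titles".
    if (!is_array($data) || !isset($data['articles']) || !is_array($data['articles'])) {
        return [];
    }

    $titles = [];
    foreach ($data['articles'] as $article) {
        if (is_array($article) && isset($article['title']) && is_string($article['title'])) {
            $titles[] = $article['title'];
        }
    }
    return $titles;
}

// Stand-in for a response body copied from the network monitor.
$sample = '{"articles":[{"title":"First"},{"title":"Second"}]}';
print_r(extractTitles($sample));
```

Returning an empty array on unexpected input keeps the scraper from crashing when the site changes its format, although you may prefer to log or throw so the breakage is noticed.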

To retrieve the data in the first place, you will have to use file_get_contents.
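A minimal sketch of fetching one "page" of articles with file_get_contents might look like this. The endpoint URL and the page query parameter are assumptions, since the real ones depend on what the network monitor shows, and the helper names are hypothetical. Note that file_get_contents can only open URLs when allow_url_fopen is enabled in php.ini.

```php
<?php
// Builds the request URL for a given page number.
// The "page" parameter name is hypothetical.
function buildPageUrl(string $endpoint, int $page): string
{
    return $endpoint . '?' . http_build_query(['page' => $page]);
}

// Performs a GET request and returns the raw body, or null on failure.
function fetchJson(string $url): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            'header'  => "Accept: application/json\r\n",
            'timeout' => 10, // seconds
        ],
    ]);

    // @ suppresses the warning; failure is reported through the null return.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Example call (placeholder endpoint, not the real site):
// $body = fetchJson(buildPageUrl('https://example.com/api/articles', 2));
```

Requesting page 2, 3, and so on in a loop reproduces what the browser does when the user scrolls, without running any JavaScript.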

However, called with just a URL, file_get_contents only issues a GET request. If you want more advanced options (like POST), you will have to look into cURL.
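If the site's AJAX call turns out to be a POST, a cURL version could be sketched as follows. The field names are placeholders and the helper names are hypothetical; this also assumes the PHP cURL extension is installed.

```php
<?php
// Collects the cURL options for a form-encoded POST request.
function buildPostOptions(array $fields): array
{
    return [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_TIMEOUT        => 10,   // seconds
    ];
}

// Sends the POST and returns the raw response body, or null on failure.
function postForJson(string $url, array $fields): ?string
{
    $handle = curl_init($url);
    if ($handle === false) {
        return null;
    }
    curl_setopt_array($handle, buildPostOptions($fields));

    // With CURLOPT_RETURNTRANSFER set, curl_exec returns a string or false.
    $body = curl_exec($handle);
    return is_string($body) ? $body : null;
}

// Example call (placeholder endpoint and fields, not the real site):
// $body = postForJson('https://example.com/api/articles', ['page' => 2]);
```

The returned body can then be passed to json_decode in the same way as a GET response.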

License: CC-BY-SA with attribution
Not affiliated with StackOverflow