Question

The following code is taken from: Java code for using google custom search API. It correctly extracts the first 10 results, i.e. the first page of the Google results.

public static void main(String[] args) throws Exception {
    String key = "YOUR KEY";
    String qry = "Android";
    // The URL must be a single string literal; a literal cannot span two lines.
    URL url = new URL("https://www.googleapis.com/customsearch/v1?key=" + key
            + "&cx=013036536707430787589:_pqjad5hr1a&q=" + qry + "&alt=json");

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.setRequestProperty("Accept", "application/json");
    BufferedReader br =
            new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String output;
    System.out.println("Output from Server .... \n");
    while ((output = br.readLine()) != null) {
        if (output.contains("\"link\": \"")) {
            // Cut out the text between `"link": "` and the closing `",`
            String link = output.substring(
                    output.indexOf("\"link\": \"") + ("\"link\": \"").length(),
                    output.indexOf("\","));
            System.out.println(link); // Will print the google search links
        }
    }
    conn.disconnect();
}

I'm trying to figure out how I can traverse all the results pages. Reading https://developers.google.com/custom-search/v1/using_rest, I found that the start parameter in the query refers to the index of the first result to return, so changing this value in a loop would obviously do the job. However, that would cost me one query per page, which should not be the case: it is not a new query, just a new page of the same query. I also found that, if the query succeeds, the response data contains a totalResults value, but Google says this is only an estimate. So how can one benefit from this service and get the actual number of results, or the number of pages, in order to traverse them all? It does not make sense to me to issue a new query for every page.

Solution

  1. You should use a JSON parser to extract data from the results, rather than parsing the result yourself.

  2. Google won't return all the results at once for a single query. If you search for Java, there are approximately 214,000,000 results. Returning them all would take days, and you couldn't do anything meaningful with them anyway. So if there are several pages, you must do a new query for each page, just as you do when doing a Google search with your browser. Most of the time, the interesting results are on the first or second page. Returning more than that would waste resources.

  3. Google doesn't know the exact number of results. It returns an estimate. Counting the exact number of results would be too hard. Knowing that there are 214,000,001 results and not 214,000,002 doesn't add any value, and the exact number would be immediately obsolete anyway.
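
To illustrate point 2, here is a minimal sketch of how the one-request-per-page loop could be prepared using the start parameter from the question. It only builds the request URLs and sends nothing. The class and method names (PagedSearchUrls, startIndices, pageUrl) are made up for this example, and the page size of 10 and the overall cap of 100 results per query are assumptions based on my reading of the Custom Search documentation, so check the current docs before relying on them. It needs Java 10 or later for the URLEncoder overload that takes a Charset.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch only: builds one request URL per results page; it does not send anything.
class PagedSearchUrls {
    static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";
    static final int PAGE_SIZE = 10;    // results per request, as in the question
    static final int MAX_RESULTS = 100; // assumed API-wide cap, verify against the docs

    // Returns the 1-based "start" values needed to cover `wanted` results,
    // never going past MAX_RESULTS.
    static List<Integer> startIndices(int wanted) {
        List<Integer> starts = new ArrayList<>();
        int limit = Math.min(wanted, MAX_RESULTS);
        for (int start = 1; start <= limit; start += PAGE_SIZE) {
            starts.add(start);
        }
        return starts;
    }

    // Builds the request URL for the page beginning at result number `start`.
    // Key, engine id and query are URL-encoded so spaces and symbols are safe.
    static String pageUrl(String key, String cx, String query, int start) {
        return ENDPOINT
                + "?key=" + URLEncoder.encode(key, StandardCharsets.UTF_8)
                + "&cx=" + URLEncoder.encode(cx, StandardCharsets.UTF_8)
                + "&q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&start=" + start;
    }

    public static void main(String[] args) {
        // Three requests (start = 1, 11, 21) are needed to cover 25 results.
        for (int start : startIndices(25)) {
            System.out.println(
                    pageUrl("YOUR KEY", "013036536707430787589:_pqjad5hr1a", "Android", start));
        }
    }
}
```

Each URL produced this way is a separate request and counts as a separate query. Since totalResults is only an estimate (point 3), it is safer to stop when a page comes back with no items than to compute the number of pages up front.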

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow