You can find the composition of the CAC 40 on Wikipedia, then download and process it with the XML package. The function readHTMLTable() is particularly useful, since it finds and parses all tables on the page. In this case the relevant table is the second one, hence the index [[2]] in the code. Try:
library(XML)
url <- "http://en.wikipedia.org/wiki/CAC_40"
dat <- readHTMLTable(url)[[2]]
head(dat[, 1:3])
        Company           ICB Sector Ticker symbol
1         Accor               hotels            AC
2   Air Liquide  commodity chemicals            AI
3        Alstom industrial machinery           ALO
4 ArcelorMittal                steel            MT
5           AXA  full line insurance            CS
6   BNP Paribas                banks           BNP
The same code also works for the FTSE 100:
url <- "http://en.wikipedia.org/wiki/FTSE_100_Index"
dat <- readHTMLTable(url)[[2]]
head(dat[, 1:3])
                   Company          Sector Market cap (£bn)
1        Royal Dutch Shell     Oil and gas              135
2                     HSBC         Banking              129
3                       BP     Oil and gas               85
4           Vodafone Group       Telecomms               83
5          GlaxoSmithKline Pharmaceuticals               73
6 British American Tobacco         Tobacco               69
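The index [[2]] depends on the page layout at the time, and that layout can change. Here is a minimal sketch of how you might pick the table by a column name instead of by position. The helper find_table() and the small inline HTML page are made up for illustration (they are not part of the XML package or of the real Wikipedia page), so the example runs without a network connection:

```r
library(XML)

# Hypothetical helper (not part of the XML package): return the first
# parsed table that has a column called `column`, or NULL if none does.
find_table <- function(tables, column) {
  for (tab in tables) {
    if (!is.null(tab) && column %in% names(tab)) return(tab)
  }
  NULL
}

# Made-up stand-in for a downloaded page: one unrelated table, then the
# table of constituents.
html <- paste0(
  "<html><body>",
  "<table><tr><th>Note</th></tr><tr><td>infobox</td></tr></table>",
  "<table><tr><th>Company</th><th>Ticker symbol</th></tr>",
  "<tr><td>Accor</td><td>AC</td></tr>",
  "<tr><td>Air Liquide</td><td>AI</td></tr></table>",
  "</body></html>"
)

# Parse the text, then let readHTMLTable() extract every table in it.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)

# Inspect the column names of each table to see which index you need.
lapply(tables, names)

# Select the table by column name rather than by a fixed index.
dat <- find_table(tables, "Company")
head(dat)
```

If readHTMLTable() cannot fetch a URL directly, the same approach should still work: download the page text by other means and pass it to htmlParse() with asText = TRUE, as above.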