Auto Search list and Scrape table
I presume you already have the names from an Excel sheet, so I used a plain list of names here. I use the Python requests module to fetch the page text, Beautiful Soup to pick out the table, and then pandas to load the table into a DataFrame.
Code:
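import requests
import pandas as pd
from io import StringIO
from bs4 import BeautifulSoup

playernames = ['Dominique Jones', 'Joe Young', 'Darius Adams',
               'Lester Hudson', 'Marcus Denmon', 'Courtney Fortson']

for name in playernames:
    # Split "First Last" into the two parts used in the search query
    fname = name.split(" ")[0]
    lname = name.split(" ")[1]
    url = "https://basketball.realgm.com/search?q={}+{}".format(fname, lname)
    print(url)
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    # Take the first element with the "tablesaw" class
    table = soup.select_one(".tablesaw")
    # Recent pandas versions expect a file-like object rather than a raw HTML string
    dfs = pd.read_html(StringIO(str(table)))
    for df in dfs:
        print(df)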
Output:
https://basketball.realgm.com/search?q=Dominique+Jones
            Player Pos   HT  ...  Draft Year          College               NBA
0  Dominique Jones   G  6-4  ...        2010    South Florida  Dallas Mavericks
1  Dominique Jones   G  6-2  ...        2009          Liberty                 -
2  Dominique Jones  PG  5-9  ...        2011  Fort Hays State                 -

[3 rows x 8 columns]
https://basketball.realgm.com/search?q=Joe+Young
      Player Pos   HT  ...  Draft Year           College             NBA
0  Joe Young   F  6-6  ...        2007        Holy Cross               -
1  Joe Young   G  6-0  ...        2009          Canisius               -
2  Joe Young   G  6-2  ...        2015            Oregon  Indiana Pacers
3  Joe Young   G  6-2  ...        2009  Central Missouri               -

[4 rows x 8 columns]
https://basketball.realgm.com/search?q=Darius+Adams
         Player Pos   HT  ...  Draft Year              College NBA
0  Darius Adams  PG  6-1  ...        2011         Indianapolis   -
1  Darius Adams   G  6-0  ...        2018  Coast Guard Academy   -

[2 rows x 8 columns]
https://basketball.realgm.com/search?q=Lester+Hudson
      Season       Team  GP  GS   MIN  ...   STL   BLK    PF   TOV    PTS
0  2009-10 *  All Teams  25   0   5.3  ...  0.32  0.12  0.48  0.56   2.32
1  2009-10 *        BOS  16   0   4.4  ...  0.19  0.12  0.44  0.56   1.38
2  2009-10 *        MEM   9   0   6.8  ...  0.56  0.11  0.56  0.56   4.00
3    2010-11        WAS  11   0   6.7  ...  0.36  0.09  0.91  0.64   1.64
4  2011-12 *  All Teams  16   0  20.9  ...  0.88  0.19  1.62  2.00  10.88
5  2011-12 *        CLE  13   0  24.2  ...  1.08  0.23  2.00  2.31  12.69
6  2011-12 *        MEM   3   0   6.5  ...  0.00  0.00  0.00  0.67   3.00
7    2014-15        LAC   5   0  11.1  ...  1.20  0.20  0.80  0.60   3.60
8     CAREER        NaN  57   0  10.4  ...  0.56  0.14  0.91  0.98   4.70

[9 rows x 23 columns]
https://basketball.realgm.com/search?q=Marcus+Denmon
    Season Team        Location  GP  GS  ...  STL  BLK    PF   TOV    PTS
0  2012-13  SAN       Las Vegas   5   0  ...  0.4  0.0  1.60  0.20   5.40
1  2013-14  SAN       Las Vegas   5   1  ...  0.8  0.0  2.20  1.20  10.80
2  2014-15  SAN       Las Vegas   6   2  ...  0.5  0.0  1.50  0.17   5.00
3  2015-16  SAN  Salt Lake City   2   0  ...  0.0  0.0  0.00  0.00   0.00
4   CAREER  NaN             NaN  18   3  ...  0.5  0.0  1.56  0.44   6.17

[5 rows x 24 columns]
https://basketball.realgm.com/search?q=Courtney+Fortson
      Season       Team  GP  GS   MIN   FGM  ...   AST  STL  BLK    PF   TOV   PTS
0  2011-12 *  All Teams  10   0   9.5  1.10  ...  1.00  0.3  0.0  0.50  1.00  3.50
1  2011-12 *        HOU   6   0   8.2  1.00  ...  0.83  0.5  0.0  0.33  0.83  3.00
2  2011-12 *        LAC   4   0  11.5  1.25  ...  1.25  0.0  0.0  0.75  1.25  4.25
3     CAREER        NaN  10   0   9.5  1.10  ...  1.00  0.3  0.0  0.50  1.00  3.50

[4 rows x 23 columns]
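Two things in the run above may be worth handling. First, the query string is built by splitting on a single space, which would break for names with more than two parts or with apostrophes. Second, the output seems to contain two kinds of tables: a list of matching players (with a Player column) for common names, and a season-by-season stats table (with a Season column) when, apparently, only one player matches. Here is a minimal sketch of two helpers for this. The names build_search_url and classify_table are my own, and the meaning of the two table kinds is only inferred from the output shown, not from anything the site documents.

```python
from urllib.parse import urlencode

# Base search URL, taken from the code above.
SEARCH_URL = "https://basketball.realgm.com/search"


def build_search_url(name):
    """Return the search URL for a player name (hypothetical helper)."""
    # urlencode turns spaces into '+' and escapes characters such as
    # apostrophes, so the name does not need to be split by hand.
    return SEARCH_URL + "?" + urlencode({"q": name.strip()})


def classify_table(columns):
    """Guess the kind of scraped table from its column names.

    The two kinds are an assumption based on the output shown above.
    """
    columns = [str(c) for c in columns]
    if "Player" in columns:
        return "search results"  # several players matched the name
    if "Season" in columns:
        return "player stats"    # looks like a single player's page
    return "unknown"
```

In the loop you could then call build_search_url(name) instead of formatting the URL by hand, and classify_table(df.columns) to decide whether you still have to pick the right player from a list of matches.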
You need a list of URLs for the players, and then you scrape each page using Beautiful Soup.
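from urllib.request import urlopen  # Python 3 replacement for urllib2
from bs4 import BeautifulSoup

# Download the page and parse it; naming the parser avoids a bs4 warning
soup = BeautifulSoup(urlopen('http://example.com').read(), 'html.parser')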
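To go from a single page to a whole URL list, something like the following could work. It is only a sketch: fetch_html, first_table_rows and scrape_pages are names I made up, the one-second delay is an arbitrary choice, and it assumes the table you want is the first table element on each page. The fetch and parse steps are passed in as arguments so the loop can be tried without network access.

```python
import time
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def first_table_rows(html):
    """Return the first <table> as a list of rows of cell text, or []."""
    # Third-party import kept local so the loop below can be tried without bs4.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in table.find_all("tr")
    ]


def scrape_pages(urls, fetch=fetch_html, parse=first_table_rows, delay=1.0):
    """Fetch and parse each URL, collecting failures instead of stopping."""
    results, errors = {}, {}
    for i, url in enumerate(urls):
        if i:
            time.sleep(delay)  # pause between requests to be polite to the site
        try:
            results[url] = parse(fetch(url))
        except Exception as exc:  # broad on purpose: one bad page should not end the run
            errors[url] = str(exc)
    return results, errors
```

Calling scrape_pages(url_list) returns one dictionary with the parsed rows per URL and a second one with the error message for every URL that failed, so you can retry those later.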