I wanted to get the trade history of a user (who I suspected was a bot), but I didn’t want to copy and paste all those pages by hand.
I had previously watched episode 191 of RailsCasts, where Ryan Bates uses Mechanize to log in to a site and scrape the desired data. The original instructions in the video were not 100% applicable, but the Mechanize documentation filled in the gaps.
I put a script together using the two sources, and… it worked!
Go to the website and log in.
    require 'mechanize'

    agent = Mechanize.new
    page = agent.get("http://www.investopedia.com/accounts/login.aspx?returnurl=http://www.investopedia.com/simulator/")

    # page.forms returns an array; grab the second form (index 1)
    login_form = page.forms[1]
    login_form.email = "email"        # replace with your own email
    login_form.password = "password"  # replace with your own password
    login_form.submit
Get the desired data. This step varies depending on what you want to do once logged in.
    # Loop through pages
    for i in 1..54
      partial_url = "http://www.investopedia.com/simulator/trade/tradeoverview.aspx?UserID=4187215&GameID=211140&Currency=USD&page="
      url = partial_url + i.to_s # add page number to end of url
      page = agent.get(url)
      tradeHistory = page.search("#gvTradeHistory td")
      tradeHistory.each do |x|
        puts x.text
      end
    end
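If you want to work with the cells afterwards rather than just print them, the same loop can collect the text into an array. This is only a rough sketch: `collect_cells` is a name I made up for illustration, and it assumes the `agent` responds to `get` and the returned page responds to `search`, as a Mechanize agent and page do.

```ruby
# Hypothetical helper: fetch each numbered page and gather the text of
# every matching cell into one flat array of strings.
def collect_cells(agent, base_url, pages, selector: "#gvTradeHistory td")
  pages.flat_map do |i|
    page = agent.get("#{base_url}#{i}")                  # page number goes on the end of the URL
    page.search(selector).map { |cell| cell.text.strip } # trim whitespace around each cell
  end
end

# Usage with the logged-in Mechanize agent from the previous step:
#   cells = collect_cells(agent, partial_url, 1..54)
```

Returning an array keeps the fetching separate from whatever formatting you decide to do later.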
Here’s the full code →
Prettify the extracted data.
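One way to prettify the output is to regroup the flat list of cell text into rows and write it out as CSV. The sketch below is an assumption on my part, not the exact cleanup I did: `cells_to_csv` and `COLUMNS_PER_ROW` are hypothetical names, the column count of 4 is a guess you should check against the real table, and the sample cells are placeholders rather than scraped data.

```ruby
require 'csv'

# Hypothetical column count; check it against the real table's header row.
COLUMNS_PER_ROW = 4

# Turn a flat list of cell strings into CSV text, one table row per line.
def cells_to_csv(cells, columns_per_row = COLUMNS_PER_ROW)
  rows = cells.map(&:strip).each_slice(columns_per_row)
  CSV.generate do |csv|
    rows.each { |row| csv << row }
  end
end

# Placeholder cells for illustration only, not real scraped data.
sample_cells = [" 1/2/2013 ", "Buy", "XYZ", "100",
                "1/3/2013", "Sell", "XYZ", "100"]

puts cells_to_csv(sample_cells)
```

The resulting text can be saved to a file and opened in any spreadsheet program.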
Note: The data I scraped was from a simulator where the trades are made with imaginary money in a public, no-stakes game. I would never intentionally post someone’s sensitive information.