Musings of a Fondue

Mechanize X Investopedia

I wanted to get the trade history of a user (who I suspected was a bot), but I didn’t want to copy and paste all those pages.

I had previously watched episode 191 of RailsCasts, where Ryan Bates uses Mechanize to log in to a site and scrape the desired data. The instructions in the video were not 100% applicable to this site, but the Mechanize documentation filled in the gaps.

I put a script together using the two sources, and… it worked!

First bit

Go to the website and log in.


require 'mechanize'

agent = Mechanize.new

page = agent.get("http://www.investopedia.com/accounts/login.aspx?returnurl=http://www.investopedia.com/simulator/")

login_form = page.forms[1]    #grab the second form
login_form.email = "email"
login_form.password = "password"
login_form.submit
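
Picking the form by position (`page.forms[1]`) breaks if the page gains or loses a form. Here is a small sketch of a sturdier lookup, assuming the login fields really are named `email` and `password`, as the script above implies. `find_login_form` is a hypothetical helper of mine, not part of Mechanize; it relies only on the `has_field?` method that Mechanize forms respond to.

```ruby
# Hypothetical helper: return the first form that has every required field.
# Works with any objects that respond to has_field?(name), such as the
# Mechanize::Form objects in page.forms. Returns nil if nothing matches.
def find_login_form(forms, required_fields = %w[email password])
  forms.find do |form|
    required_fields.all? { |name| form.has_field?(name) }
  end
end

# Usage with the script above (not run here, it needs the mechanize gem):
#   login_form = find_login_form(page.forms)
#   raise "login form not found" if login_form.nil?
```

If the helper returns `nil`, the page layout has probably changed and the field names need another look.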

Second bit

Get desired data. This one varies depending on what you want to do once logged in.


# Loop through pages
partial_url = "http://www.investopedia.com/simulator/trade/tradeoverview.aspx?UserID=4187215&GameID=211140&Currency=USD&page="

(1..54).each do |i|
    url = partial_url + i.to_s  # add page number to end of URL
    page = agent.get(url)

    # grab every cell of the trade history table
    trade_history = page.search("#gvTradeHistory td")

    trade_history.each do |cell|
        puts cell.text
    end
end
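
The loop above prints one cell per line, so the row structure of the table is lost. A rough sketch of one way to get it back is to slice the flat list of cell texts into fixed-width rows. The column count here is an assumption (I have not pinned down how many columns the table has), and `cells_to_rows` is a hypothetical helper name.

```ruby
# Hypothetical column count; check the real table before relying on it.
COLUMNS_PER_ROW = 6

# Turn a flat list of cell strings into an array of rows.
def cells_to_rows(cell_texts, columns = COLUMNS_PER_ROW)
  cell_texts
    .map { |text| text.strip }
    .each_slice(columns)
    .select { |row| row.size == columns } # drop a trailing partial row
end

# Placeholder cells, two columns per row, for illustration only.
sample = ["  r1c1 ", "r1c2", "r2c1", "r2c2", "stray"]
rows = cells_to_rows(sample, 2)
# rows is [["r1c1", "r1c2"], ["r2c1", "r2c2"]]
```

Inside the loop, the cell texts could come from something like `page.search("#gvTradeHistory td").map(&:text)`, with the resulting rows appended to one big array across all pages.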

Here’s the full code

Make sure to check out the RailsCasts video. Ryan Bates does an amazing job of explaining. I also recommend watching the previous episode.

Third bit

Prettify the extracted data.

I was initially going to use Excel to analyze the data. But that would be going backwards, so I did a quick search for JavaScript tables. I came across the Table Sorter plugin by Christian Bach. And bam!

Check out the prettified data

The good thing about this approach (keeping the data as an array instead of porting it to Excel) is that you have more control over what you can do with the data. In Excel you are limited to the functionality built into it, whereas with JavaScript, how you visualize the data is limited only by your imagination.
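
To get from an array to something a table plugin can work with, the data needs to end up as an HTML table. Here is a minimal sketch, assuming the scraped cells have already been collected into an array of rows. The helper name `rows_to_html_table`, the header names passed to it, and the `tablesorter` class are all assumptions for illustration. As far as I recall, Table Sorter expects a standard table with `thead` and `tbody` tags, so the sketch emits both.

```ruby
require "erb"

# Build an HTML table string from header names and rows of cell strings.
# Cell text is HTML-escaped so stray < or & characters cannot break the markup.
def rows_to_html_table(headers, rows, css_class = "tablesorter")
  head = headers.map { |h| "<th>#{ERB::Util.html_escape(h)}</th>" }.join
  body = rows.map do |row|
    cells = row.map { |c| "<td>#{ERB::Util.html_escape(c)}</td>" }.join
    "<tr>#{cells}</tr>"
  end.join("\n")

  <<~HTML
    <table class="#{css_class}">
    <thead><tr>#{head}</tr></thead>
    <tbody>
    #{body}
    </tbody>
    </table>
  HTML
end
```

The returned string can be written into a page that loads jQuery and the plugin; sorting is then wired up on the JavaScript side.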

Note: The data I scraped was from a simulator where the trades are made with imaginary money in a public, no-stakes game. I would never intentionally post someone’s sensitive information.
