Hello again,
I've been experimenting with PHP and using it to build a screen scraper, but being the noob I am, I ran into a problem when I came upon a dynamic page that sends an XMLHttpRequest to the server to fetch new results.
The website is realtor.com. When I search for real estate in, say, Chicago, I use the URL http://www.realtor.com/realestateandhomes-search/60601 to get results.
However, the page displays only the first 10 results, and if I choose to display 50 results, it sends an XMLHttpRequest to http://www.realtor.com/search/resources.aspx (I found this using Firebug).
What I couldn't figure out, since I don't know much about XMLHttpRequests, is how it forms the POST request to get the necessary data, and how to extract the data it gets back.
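To make the "how it forms the request" part concrete: the Content-Type the browser sends is application/x-www-form-urlencoded, so the POST body should be a string of URL-encoded key=value pairs joined with "&". Below is a minimal sketch of building such a body in PHP. The field names (pagesize, pg, loc) are only guesses based on the page URL and cookies; the real body is not shown here and would have to be copied from Firebug's Post tab.

```php
<?php
// Hypothetical field names: the actual POST body sent by the page is unknown.
$fields = [
    'pagesize' => 50,
    'pg'       => 1,
    'loc'      => 'Chicago,IL,60601',
];

// http_build_query() URL-encodes every key and value and joins the pairs
// with "&", which is the application/x-www-form-urlencoded format.
$body = http_build_query($fields);

echo $body, "\n";

// parse_str() reverses the encoding, which is handy for reading a body
// copied out of Firebug.
parse_str($body, $decoded);
```

The same parse_str() call could be used on the real 2134-byte body to see which parameters the site actually expects.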
I've searched the web for an answer to my question but couldn't find anything that answers it.
Maybe someone here has an answer for me.
P.S. I know it's probably against realtor.com's terms, but I am only using this site as an example to get a grasp of the concept.
Here are the request headers, followed by the response headers:
Host: www.realtor.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
Accept: text/javascript, application/javascript, */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Referer: http://www.realtor.com/realestateandhomes-search/60601
Content-Length: 2134
Cookie: Move_UUID=ea981d5b5f024fe9bafc0aff8a3648bf; HSID=527c5d4b88_R_dc:10.160.4.250:355483606868:R; rsi_segs=C05504_10005|D08734_70056|D08734_70079|D08734_70102|C05504_10039|C05504_10040|C05504_10041; ASP.NET_SessionId=su1mauilqtmk1pvkagsgye55; MetaKey=_server_error%7Csrp; ParamCookie=[]; s_cc=true; s_sq=%5B%5BB%5D%5D; widgetClicked=oldSRP; views=srp=list; previousState=MD; SRP_ShownWinks=1; listingdetailmpr=http%3A%2F%2Fwww.realtor.com%2Frealestateandhomes-search%2F60601%23%2Fpagesize-50%2Fpg-1; rowselected=3; currentRowIndex=3; agentId=30248; sid=745ad950dac666b18395744db424829febf4a966; recAlertSearch=recAlertShown=false&sameSrch=false&saveLstCnt=0&sid=; RecentSearch=loc%3d46842%26typ%3d3%26mnp%3d%26mxp%3d%26bd%3d0%26bth%3d0%26status%3d1|loc%3d23641%26typ%3d3%26mnp%3d%26mxp%3d%26bd%3d0%26bth%3d0%26status%3d1|loc%3dSPRAGUE%2cNE%2c68438%26typ%3d3%26mnp%3d%26mxp%3d%26bd%3d0%26bth%3d0%26status%3d1|loc%3dMIDLAND%2cMI%2c48667%26typ%3d3%26mnp%3d%26mxp%3d%26bd%3d0%26bth%3d0%26status%3d1|loc%3dChicago%2cIL%2c60601%26typ%3d3%26mnp%3d%26mxp%3d%26bd%3d0%26bth%3d0%26status%3d1; criteria=fhcnt=3&loc=Chicago%2cIL%2c60601&usrloc=Chicago%2cIL%2c60601&typ=3&status=1
Pragma: no-cache
Cache-Control: no-cache

Response headers:
Date: Thu, 11 Nov 2010 14:40:27 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Set-Cookie: SAVEDITEMS=; domain=realtor.com; expires=Wed, 10-Nov-2010 14:40:27 GMT; path=/
Set-Cookie: ParamCookie=[]; path=/
Set-Cookie: criteria=pg=1&fhcnt=3&loc=Chicago%2cIL%2c60601&usrloc=Chicago%2cIL%2c60601&typ=3&status=1; domain=realtor.com; path=/
Cache-Control: no-cache
Pragma: no-cache
Expires: -1
Content-Type: text/javascript; charset=utf-8
Content-Length: 246355
Content-Encoding: gzip
Transfer-Encoding: chunked
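
Going by the headers above, replaying the request from PHP would roughly mean sending a POST with the same Content-Type, X-Requested-With and Referer headers. Here is a rough sketch using PHP's built-in HTTP stream wrapper (cURL would work just as well). It assumes allow_url_fopen is enabled. The function names buildXhrContextOptions and sendXhr are made up for illustration, and the field list and cookie string are placeholders, since the real POST body is not shown. No Accept-Encoding header is sent, on the assumption that the server then replies uncompressed instead of gzipped.

```php
<?php
/**
 * Build stream-context options that imitate the browser's XHR POST.
 * $fields is a placeholder for whatever parameters the page really posts.
 */
function buildXhrContextOptions(array $fields, string $referer, string $cookie = ''): array
{
    $body = http_build_query($fields);

    $headers = [
        'Content-Type: application/x-www-form-urlencoded; charset=UTF-8',
        'Accept: text/javascript, application/javascript, */*',
        'X-Requested-With: XMLHttpRequest', // marks the request as an AJAX call
        'Referer: ' . $referer,
        'Content-Length: ' . strlen($body),
    ];
    if ($cookie !== '') {
        // Session cookies copied from the browser, if the site requires them.
        $headers[] = 'Cookie: ' . $cookie;
    }

    return [
        'http' => [
            'method'  => 'POST',
            'header'  => implode("\r\n", $headers),
            'content' => $body,
            'timeout' => 30,
        ],
    ];
}

/**
 * Send the request and return the raw response body.
 */
function sendXhr(string $url, array $contextOptions): string
{
    $context  = stream_context_create($contextOptions);
    $response = file_get_contents($url, false, $context);
    if ($response === false) {
        throw new RuntimeException("Request to $url failed");
    }
    return $response;
}

// Example call (not executed here; the field names are hypothetical):
// $options = buildXhrContextOptions(
//     ['pagesize' => 50, 'pg' => 1],
//     'http://www.realtor.com/realestateandhomes-search/60601'
// );
// $raw = sendXhr('http://www.realtor.com/search/resources.aspx', $options);
```

As for extracting the data: the response Content-Type is text/javascript, so the body could be JSON or a JavaScript snippet. If it turns out to be plain JSON, json_decode($raw, true) would turn it into a PHP array; otherwise the data would first have to be cut out of the script with string functions.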