Tuesday, May 15, 2012

Linear Regression on new data

Previously we ran a linear regression with two variables on data from SNAP and found problems caused by considering only papers with a primary tag of HepTh. So we crawled arXiv to obtain new data and ran the linear regression again on it. The results are as follows:

Using Gross Pay (limiting years since PhD to <= 40):

R^2 : 0.34126

                   Estimate      p-value
Constant           84184.6       2.56162*10^-8
Years Since PhD    2094.61       0.000113447
Citation Count     7.47883       0.0138856


R^2 : 0.323274

                   Estimate      p-value
Constant           89390.7       3.70218*10^-9
Years Since PhD    1955.36       0.000356706
Page Rank          6.57888*10^6  0.0284647
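
To make the coefficients concrete, here is a minimal sketch of how a fitted model of this form turns into a pay estimate. It uses the Gross Pay / Citation Count estimates reported above; the function name `predict_pay` and the example inputs (10 years since PhD, 500 citations) are made up for illustration and do not come from our data.

```python
# Estimates copied from the Gross Pay / Citation Count results above.
CONSTANT = 84184.6
PER_YEAR_SINCE_PHD = 2094.61
PER_CITATION = 7.47883


def predict_pay(years_since_phd, citation_count):
    """Linear prediction: constant + b1 * years + b2 * citations."""
    return (CONSTANT
            + PER_YEAR_SINCE_PHD * years_since_phd
            + PER_CITATION * citation_count)


# Hypothetical person (not from the data): 10 years since PhD, 500 citations.
estimate = predict_pay(10, 500)
print(round(estimate, 2))
```

Read this way, each extra year since PhD adds about 2,095 to the predicted gross pay and each extra citation adds about 7.5, with the other variable held fixed.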
  
Using Base Pay (limiting years since PhD to <= 40):

R^2 : 0.605081

                   Estimate      p-value
Constant           57158.9       1.61337*10^-9
Years Since PhD    2427.95       2.17864*10^-10
Citation Count     4.74429       0.0103403
 
R^2 : 0.598601

                   Estimate      p-value
Constant           60317.7       1.94649*10^-10
Years Since PhD    2334.74       9.65709*10^-10
Page Rank          4.39694*10^6  0.0158609
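
For anyone who wants to try this kind of fit, below is a rough sketch of a two-variable least-squares regression in plain Python. It is only an illustration and not necessarily how the numbers above were produced: the function names `solve` and `fit_two_variable` and the tiny data set are invented, and the sketch computes only the estimates and R^2, not the p-values.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_two_variable(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns ([b0, b1, b2], R^2)."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]  # design matrix with intercept
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return coef, 1.0 - ss_res / ss_tot


# Made-up data generated exactly from pay = 50000 + 2000*years + 5*citations.
years = [1, 5, 10, 20, 30]
citations = [10, 200, 50, 800, 400]
pay = [50000 + 2000 * a + 5 * b for a, b in zip(years, citations)]

coef, r_squared = fit_two_variable(years, citations, pay)
print(coef, r_squared)
```

Because the made-up data lie exactly on a plane, the fit should recover the generating coefficients with R^2 essentially equal to 1. Real data like ours are much noisier, which is why the R^2 values above are well below 1.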
