This page outlines how to use Pandas, together with Matplotlib, to create graphs for data analysis. The data is [here][Pandas analysis]. It gives crime rates per 100,000 people for a number of US cities:
Plotting with Pandas
Outline
For correlation
Source code
The following Python code reads city.csv, plots Murder against Violent Crime as a scatter plot, and saves the figure as SVG and PNG:
import pandas as pd
import matplotlib.pyplot as plt

xval = 'Violent Crime'; yval = 'Murder'; file = '1111'
ver = pd.read_csv("city.csv")

plt.xlabel(xval)
plt.ylabel(yval)
plt.scatter(ver[xval], ver[yval])

# Save before plt.show(): once the window is closed the figure is cleared,
# so saving afterwards can produce an empty file.
plt.savefig(file + ".svg", format='svg')
plt.savefig(file + ".png", format='png')
plt.show()
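A scatter plot shows the relationship visually; a correlation coefficient puts a number on it. The following is a minimal sketch, assuming a Pearson correlation is what is wanted. It uses six rows copied from the data on this page rather than the full file, and the helper name `crime_correlation` is hypothetical:

```python
import io
import pandas as pd

# Six rows copied from the data on this page (a sample, not the full file).
SAMPLE = """State,City,Population,Violent Crime,Murder,Rape,Robbery,Aggravated Assault,Property Crime,Burglary,Larceny-Theft,Motor Vehicle Theft,Arson
New Mexico,Albuquerque,558874,883,5.4,71.9,247.1,558.4,5446.1,1095.6,3713.9,636.6,15.4
California,Anaheim,346956,317,4,22.8,120.5,170.1,2362.3,375.0,1619.8,367.5,6.6
Maryland,Baltimore,623513,1339,33.8,39.3,589.7,675.7,4718.4,1110.8,2888.2,719.5,34.2
Michigan,Detroit,684694,1989,43.5,81.4,521.4,1342.4,4817.2,1340.3,2004.3,1472.6,71.6
Texas,Plano,277822,165,1.4,28.8,56.2,78.8,1974.6,296.6,1572.6,105.5,8.3
Virginia,Virginia Beach,451102,146,3.8,23.7,55.9,63.0,2174.9,232.8,1861.2,80.9,13.3
"""


def crime_correlation(df, x="Violent Crime", y="Murder"):
    """Return the Pearson correlation coefficient between two columns."""
    return df[x].corr(df[y])


df = pd.read_csv(io.StringIO(SAMPLE))
r = crime_correlation(df)
print(f"Pearson r between Violent Crime and Murder: {r:.2f}")
```

To run it on the whole dataset, replace `io.StringIO(SAMPLE)` with the path to city.csv. A value from six rows is only illustrative and should not be read as the result for all cities.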
Data
The data used is [here]
State,City,Population,Violent Crime,Murder,Rape,Robbery,Aggravated Assault,Property Crime,Burglary,Larceny-Theft,Motor Vehicle Theft,Arson
New Mexico,Albuquerque,558874,883,5.4,71.9,247.1,558.4,5446.1,1095.6,3713.9,636.6,15.4
California,Anaheim,346956,317,4,22.8,120.5,170.1,2362.3,375.0,1619.8,367.5,6.6
Alaska,Anchorage,301306,865,4,130.1,164.6,565.9,3827.0,456.3,3059.0,311.6,26.9
Texas,Arlington,382976,484,3.4,53.8,128.7,298.2,3515.1,644.9,2633.6,236.6,6.8
Georgia,Atlanta,454363,1227,20.5,33.2,512.6,661.1,5747.4,1203.9,3631.0,912.5,16.5
Colorado,Aurora,350948,413,3.1,78.1,118.8,212.6,2838.6,517.5,2018.0,303.2,21.7
Texas,Austin,903924,396,3.5,63.2,96.6,232.9,4142.4,634.2,3255.0,253.1,10.5
California,Bakersfield,367406,457,4.6,5.7,179.6,266.7,3972.4,1106.4,2244.7,621.4,101.3
Maryland,Baltimore,623513,1339,33.8,39.3,589.7,675.7,4718.4,1110.8,2888.2,719.5,34.2
Massachusetts,Boston,654413,726,8.1,42.8,256.7,418.1,2638.9,409.5,1998.3,231.0,30.0
New York,Buffalo,258419,1228,23.2,67.3,494.2,643.5,4817.4,1207.0,3235.8,374.6,30.0
Arizona,Chandler,252369,185,0.4,23.8,44,116.5,2236.0,378.0,1767.6,90.3,38.4
North Carolina,Charlotte-Mecklenburg,856916,590,5.5,24.5,185.1,374.7,3566.9,703.8,2663.9,199.2,19.3
California,Chula Vista,259894,236,2.7,15.4,82.3,135.1,1740.7,235.9,1189.3,315.5,13.5
Illinois,Chicago,2724121,884,15.1,49.3,359.9,460.0,3126.2,533.6,2224.6,367.9,16.9
Ohio,Cincinnati,297671,905,20.2,76.6,455.5,353.1,5562.2,1619.2,3574.1,368.9,118.6
Ohio,Cleveland,388655,1334,16.2,124,769.3,424.8,5434.4,1787.7,2659.2,987.5,78.2
Colorado,Colorado Springs,444949,458,4.5,92.6,90.8,270.4,3667.6,620.1,2677.6,369.9,21.1
Ohio,Columbus,830811,549,10,88.8,252.5,197.9,4253.0,1091.2,2807.9,353.9,30.0
Texas,Corpus Christi,319211,656,8.5,87.7,118.1,441.7,4420.3,725.5,3519.0,175.7,15.7
Texas,Dallas,1272396,665,9.1,61.4,303.1,291.1,3589.2,918.3,2117.2,553.7,30.3
Colorado,Denver,665353,599,4.7,67.3,164,362.7,3359.4,684.6,2158.7,516.1,20.7
Michigan,Detroit,684694,1989,43.5,81.4,521.4,1342.4,4817.2,1340.3,2004.3,1472.6,71.6
Texas,El Paso,680273,393,3.1,49.2,61,279.3,2141.8,232.1,1789.0,120.7,7.1
Indiana,Fort Wayne,257172,317,4.7,40.4,134.9,137.3,3247.6,686.3,2419.8,141.5,17.5
Texas,Fort Worth*,789035,560,6.1,66.3,159.2,328.6,4343.5,1053.9,2985.5,304.0,19.6
California,Fresno,513187,464,9.2,10.3,152.2,292.5,4111.8,919.9,2587.2,604.7,50.5
North Carolina,Greensboro,282203,477,8.2,18.4,172.6,277.8,3600.2,888.7,2522.7,188.9,40.4
Nevada,Henderson,274121,165,1.1,32.5,60.2,71.1,1978.3,509.6,1288.1,180.6,10.6
Texas,Houston,2219933,991,10.9,36.6,458.8,485.1,4693.7,974.3,3068.8,650.6,32.3
Indiana,Indianapolis,858238,1255,15.8,66.8,443.7,728.4,4823.1,1412.8,2806.9,603.4,31.6
Florida,Jacksonville,856021,684,11.2,56,165.8,450.8,3940.6,795.1,2914.1,231.4,10.7
New Jersey,Jersey City,260005,531,9.2,13.5,238.5,270.0,1630.7,341.1,1078.4,211.1,14.2
Missouri,Kansas City,468417,1251,16.7,83.3,346.9,804.6,4835.0,1208.1,2783.2,843.7,46.3
Texas,Laredo,250994,389,5.6,39.4,78.1,265.7,3859.5,504.8,3244.7,110.0,29.9
Nevada,Las Vegas,1530899,841,8,51,319.1,463.1,2923.4,924.3,1530.6,468.5,9.2
Kentucky,Lexington,311848,334,6.4,43,177.7,106.8,3891.0,767.4,2827.0,296.6,13.8
Nebraska,Lincoln,271208,339,2.6,56,75.6,204.3,3348.7,481.2,2747.7,119.8,24.0
California,Long Beach,471123,489,4.9,23.3,188.7,272.1,2640.1,739.1,1459.3,441.7,16.6
California,Los Angeles,3906772,491,6.7,28.8,203.5,251.8,2128.1,385.7,1389.4,352.9,29.1
Kentucky,Louisville Metro,677710,591,8.3,27.2,225.9,329.6,4185.3,946.9,2902.3,336.1,30.0
Tennessee,Memphis,654922,1741,21.4,76.5,501.6,1141.1,5988.0,1748.5,3785.3,454.3,53.4
Arizona,Mesa,462092,459,2.8,54.5,101.5,299.7,2800.3,510.3,2108.7,181.3,18.4
Florida,Miami,421996,1060,19.2,26.1,424.2,590.5,4832.7,867.1,3439.4,526.3,15.6
Wisconsin,Milwaukee,600374,1476,15,65.8,586.3,809.3,4580.3,987.6,2484.6,1108.1,46.0
Minnesota,Minneapolis,404461,1012,7.7,96.2,462.6,445.5,4728.0,1016.7,3332.8,378.5,28.9
Alabama,Mobile,250655,594,12.4,54.3,170.8,356.3,4629.1,1135.8,3281.4,211.8,30.0
Tennessee,Nashville,647689,1123,6.3,75.2,235.1,805.8,3630.6,734.3,2720.8,175.5,15.0
Louisiana,New Orleans,387113,974,38.7,63,379.7,492.4,4231.8,893.3,2663.0,675.5,30.0
New York,New York,8473938,597,3.9,25.8,195.7,371.3,1601.9,187.8,1323.0,91.2,30.0
New Jersey,Newark,279110,1078,33.3,17.6,688.6,338.2,2851.2,622.0,1365.1,864.2,14.0
California,Oakland,409994,1685,19.5,51,849,765.9,5943.3,977.1,3376.1,1590.0,42.9
Oklahoma,Oklahoma City,617975,774,7.3,70.2,182.2,514.1,4410.9,1074.2,2768.9,567.8,15.5
Nebraska,Omaha,438465,561,7.3,41.1,164.9,347.3,4345.4,683.5,2944.4,717.5,8.4
Florida,Orlando,259675,901,5.8,64.3,238.8,592.3,6359.9,1287.0,4691.2,381.6,21.2
Pennsylvania,Philadelphia,1559062,1021,15.9,77.4,447.1,481.1,3387.7,621.8,2398.5,367.4,25.6
Arizona,Phoenix,1529852,572,7.5,65.8,193,305.7,3724.3,935.8,2317.9,470.6,20.5
Pennsylvania,Pittsburgh,307613,798,22.4,29.6,320.5,425.5,3212.8,692.1,2326.6,194.1,62.4
Texas,Plano,277822,165,1.4,28.8,56.2,78.8,1974.6,296.6,1572.6,105.5,8.3
Oregon,Portland,615672,473,4.2,42.6,137.6,288.5,5234.8,673.4,4013.0,548.3,27.0
North Carolina,Raleigh,428993,392,2.8,18.4,141,230.1,3063.0,735.9,2162.7,164.3,12.6
California,Riverside,319453,433,3.8,44.8,138,246.7,3087.8,479.3,2126.8,481.8,27.5
California,Sacramento,482767,615,5.8,16.2,207.1,385.7,3123.2,670.7,1956.0,496.5,35.0
Texas,San Antonio,1428465,539,7.2,75.4,124.4,332.3,5417.8,864.1,4053.9,499.8,20.9
California,San Diego,1368690,381,2.3,27.1,96.3,255.2,1959.0,373.7,1219.5,365.8,15.3
California,San Francisco,850294,795,5.3,41.8,379.2,368.9,5303.2,615.9,3966.9,720.5,28.3
California,San Jose,1009679,321,3.2,30.3,106.2,181.4,2434.1,511.7,1173.6,748.8,12.2
California,Santa Ana,336462,375,5.3,34.2,134.3,200.6,1719.1,235.7,1095.2,388.2,6.2
Washington,Seattle,663410,603,3.9,23.2,236.2,339.8,6127.3,1070.1,4226.0,831.2,11.9
Missouri,St. Louis,318574,1679,49.9,87.6,490.3,1050.9,6252.6,1321.2,3912.1,1019.2,59.6
Minnesota,St. Paul,297984,663,3.7,60.7,219.5,378.5,3484.4,781.9,2028.6,673.9,39.9
Florida,St. Petersburg,250772,865,7.6,61.4,246.8,549.1,5642.6,1041.6,4043.1,557.9,17.5
California,Stockton,299519,1332,16.4,44.7,366.6,903.8,4389.7,1043.0,2698.3,648.4,25.4
Florida,Tampa,357124,582,7.8,17.4,143.9,413.3,2427.7,509.6,1779.5,138.6,12.0
Ohio,Toledo,281150,1091,8.5,82.2,335.8,664.8,2427.7,1668.5,1779.5,337.5,30.0
Arizona,Tucson,525486,641,8.9,41.1,190.7,400.2,6581.9,943.3,5221.8,416.8,28.7
Oklahoma,Tulsa,399556,805,11.5,78.3,230.3,485.0,5081.6,1376.5,3127.0,578.1,34.5
Virginia,Virginia Beach,451102,146,3.8,23.7,55.9,63.0,2174.9,232.8,1861.2,80.9,13.3
District of Columbia,Washington,658893,1185,15.9,71.3,490.4,607.7,5012.5,525.6,3928.0,559.0,13.3
Kansas,Wichita,386486,793,3.9,63.1,121.1,604.9,5382.3,1017.6,3851.4,513.3,29.2
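Because the crime columns are rates per 100,000 people, they can be converted back to approximate counts with rate × population / 100,000. The sketch below does this for the Murder column on three rows copied from the data above. It assumes the Population column is the denominator used for the rates, and the new column name `Murder (est. count)` is only illustrative:

```python
import io
import pandas as pd

# Three rows copied from the data on this page (a sample, not the full file).
SAMPLE = """State,City,Population,Violent Crime,Murder,Rape,Robbery,Aggravated Assault,Property Crime,Burglary,Larceny-Theft,Motor Vehicle Theft,Arson
Illinois,Chicago,2724121,884,15.1,49.3,359.9,460.0,3126.2,533.6,2224.6,367.9,16.9
New York,New York,8473938,597,3.9,25.8,195.7,371.3,1601.9,187.8,1323.0,91.2,30.0
Arizona,Chandler,252369,185,0.4,23.8,44,116.5,2236.0,378.0,1767.6,90.3,38.4
"""

df = pd.read_csv(io.StringIO(SAMPLE))

# Assumption: each rate is per 100,000 residents of the Population column.
df["Murder (est. count)"] = df["Murder"] * df["Population"] / 100_000

print(df[["City", "Population", "Murder", "Murder (est. count)"]].round(1))
```

For Chicago this gives 15.1 × 2,724,121 / 100,000, or roughly 411 murders. The estimate is only as precise as the rounded rate in the file.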
Outline
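The following code adds a least-squares trend line to the scatter plot and prints an OLS regression summary. It reads a different file, eu.csv, using the columns '5 GCEs or more' and 'Leave':
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm

xval = '5 GCEs or more'; yval = 'Leave'
ver = pd.read_csv("eu.csv")

plt.title(yval + ' v ' + xval)
plt.xlabel(xval)
plt.ylabel(yval)
plt.scatter(ver[xval], ver[yval])

# Least-squares straight line: yval = m * xval + b
m, b = np.polyfit(ver[xval], ver[yval], 1)
axes = plt.gca()
X_plot = np.linspace(axes.get_xlim()[0], axes.get_xlim()[1], 100)
plt.plot(X_plot, m*X_plot + b, '-')

# A negative intercept already carries its own minus sign
if b > 0:
    print(yval, '=', round(m, 2), ' x ', xval, '+', round(b, 2))
else:
    print(yval, '=', round(m, 2), ' x ', xval, round(b, 2))

# OLS takes the dependent variable first; add_constant supplies the intercept term
print(sm.OLS(ver[yval], sm.add_constant(ver[xval])).fit().summary())
plt.show()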
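The trend line above relies on `np.polyfit(x, y, 1)`, which returns the slope and then the intercept of the least-squares straight line. A quick way to check that reading is to fit points that lie exactly on a known line. The data below is made up for illustration and is not taken from eu.csv:

```python
import numpy as np

# Made-up points lying exactly on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Degree 1 returns the coefficients highest power first: slope, then intercept.
m, b = np.polyfit(x, y, 1)

sign = "+" if b >= 0 else "-"
print(f"y = {m:.2f} x {sign} {abs(b):.2f}")
```

The fit should recover a slope close to 2 and an intercept close to 1, up to floating-point error. On real data the points scatter around the line, so the coefficients describe a trend rather than an exact relationship.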