Basics: Using Series and DataFrame, and Setting Up Sublime Text
Pandas [References] http://pinkwink.kr/734, http://pinkwink.kr/735
1) Install pandas
cd Python35
cd Scripts
pip3 install pandas
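Once the install finishes, a quick check along the following lines can confirm that the interpreter is able to import the package. This is only a sketch; the version numbers it prints depend on your environment.

```python
# Sanity check: confirm that pandas and NumPy (which pandas depends on) import cleanly.
import numpy as np
import pandas as pd

print("pandas", pd.__version__)
print("numpy", np.__version__)
```

If either import raises ModuleNotFoundError, pip3 most likely installed into a different Python than the one running the script.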
2) Use Sublime Text as the editor.
- In the lower-right corner, change Plain Text to Python.
- Go to https://packagecontrol.io/installation, copy the snippet that begins with import urllib..., and
paste it into the Sublime Text console, which opens with Ctrl + `.
- Restart Sublime Text.
- Open Preferences > Package Control, or press SHIFT + CTRL + P,
search for Package Control: Add Repository, and paste https://github.com/wuub/SublimeREPL into
the URL field at the bottom.
[ Series ]
3)
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
obj = Series([4,7,-5,3])
print(obj.index)
print(obj.values)
print(obj)
obj.index = ['Bob','steve','Jeff','Ryan']
print(obj)
-------------------
RangeIndex(start=0, stop=4, step=1)
[ 4  7 -5  3]
0    4
1    7
2   -5
3    3
dtype: int64
Bob      4
steve    7
Jeff    -5
Ryan     3
dtype: int64
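Once a Series carries custom labels, each element can be reached either by its label or by its position. The sketch below illustrates this with .loc and .iloc; these two accessors do not appear in the original notes and are used here only for illustration.

```python
from pandas import Series

obj = Series([4, 7, -5, 3], index=['Bob', 'steve', 'Jeff', 'Ryan'])

by_label = obj.loc['Jeff']     # look up by index label
by_position = obj.iloc[2]      # look up by integer position (0-based)
labels = list(obj.index)       # the index is an ordered sequence of labels

print(by_label, by_position)   # both refer to the same element
print(labels)
```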
4)
obj2 = Series([4,7,-5,3], index=['b','b','a','c'])
print(obj2)
print(obj2['a'])
print(obj2['b'])
print(np.exp(obj2))
print(obj2 * 2)
----------------------------
b    4
b    7
a   -5
c    3
dtype: int64
-5
b    4
b    7
dtype: int64
b      54.598150
b    1096.633158
a       0.006738
c      20.085537
dtype: float64
b     8
b    14
a   -10
c     6
dtype: int64
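The output above shows that index labels do not have to be unique: a unique label returns a single value, while a duplicated label returns a Series. The following sketch makes that explicit and adds a boolean filter, which is not in the original notes, to show that element-wise operations keep the labels attached.

```python
from pandas import Series

obj2 = Series([4, 7, -5, 3], index=['b', 'b', 'a', 'c'])

unique_labels = obj2.index.is_unique   # 'b' appears twice, so the index is not unique
single = obj2['a']                     # unique label: a scalar comes back
repeated = obj2['b']                   # duplicated label: a Series with both rows
positive = obj2[obj2 > 0]              # boolean mask; surviving rows keep their labels

print(unique_labels)
print(single)
print(repeated)
print(positive)
```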
5)
sdata = {'Ohio':35000,'Texas':71000,'Oregon':16000,'Utah':5000}
obj3 = Series(sdata)
print(obj3)
states=['California', 'Ohio', 'Oregon', 'Texas']
obj4 = Series(sdata, index=states)
print(obj4)
print(pd.isnull(obj4))
------------------------------
Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64
California     True
Ohio          False
Oregon        False
Texas         False
dtype: bool
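When an explicit index is passed together with a dict, only the keys named in the index are kept, and labels with no matching key become NaN. Because NaN is a floating-point value, the dtype changes from integer to float64. A minimal sketch of that behavior, where the helper variables missing and dropped are introduced only for illustration:

```python
import pandas as pd
from pandas import Series

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']

obj3 = Series(sdata)                 # every key is used; values stay integers
obj4 = Series(sdata, index=states)   # only the labels listed in states are used

print(obj3.dtype)                    # an integer dtype
print(obj4.dtype)                    # float64, because NaN cannot be stored as an integer

missing = int(pd.isnull(obj4).sum())                  # number of labels with no data
dropped = [k for k in sdata if k not in obj4.index]   # dict keys left out of obj4
print(missing, dropped)
```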
6)
print(pd.notnull(obj4))
------------------------------
California    False
Ohio           True
Oregon         True
Texas          True
dtype: bool
7)
print(obj4.isnull())
------------------------------
California     True
Ohio          False
Oregon        False
Texas         False
dtype: bool
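isnull and notnull are exact complements of each other, and either mask can be used to filter a Series. The sketch below checks the relationship and keeps only the states that have data; the variable names are illustrative.

```python
import pandas as pd
from pandas import Series

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj4 = Series(sdata, index=['California', 'Ohio', 'Oregon', 'Texas'])

is_missing = obj4.isnull()    # True where the value is NaN
is_present = obj4.notnull()   # True where a value exists

# The two masks should disagree at every position.
complementary = bool((is_missing == ~is_present).all())
known = obj4[is_present]      # keep only the rows that have a value

print(complementary)
print(known)
```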
8)
print(obj3 + obj4)
------------------------------
California         NaN
Ohio           70000.0
Oregon         32000.0
Texas         142000.0
Utah               NaN
dtype: float64
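The addition aligns the two Series on their index labels, so any label that is missing from either side yields NaN. If a label present on only one side should count as 0 on the other, Series.add with fill_value is one option. This is a sketch of an alternative, not something the original notes use.

```python
import pandas as pd
from pandas import Series

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj3 = Series(sdata)
obj4 = Series(sdata, index=['California', 'Ohio', 'Oregon', 'Texas'])

plain = obj3 + obj4                     # labels are aligned; unmatched labels give NaN
filled = obj3.add(obj4, fill_value=0)   # a value missing on one side is treated as 0

print(pd.isna(plain['Utah']))           # Utah exists only in obj3
print(filled['Utah'])                   # obj3's value survives
print(pd.isna(filled['California']))    # missing on both sides, so still NaN
```

California stays NaN even with fill_value because neither Series has a value for it.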
9)
obj4.name = "population"
obj4.index.name="state"
print(obj4)
------------------------------
state
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
Name: population, dtype: float64
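The two name attributes are more than display labels. As a rough illustration (reset_index is not used in the original notes), converting the Series to a DataFrame reuses both names as column headers:

```python
from pandas import Series

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj4 = Series(sdata, index=['California', 'Ohio', 'Oregon', 'Texas'])
obj4.name = 'population'
obj4.index.name = 'state'

# The index name becomes the first column; the Series name becomes the value column.
frame = obj4.reset_index()
columns = list(frame.columns)

print(columns)
print(frame)
```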
[ DataFrame ]
1)
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
frame = DataFrame(data)
print(frame)
------------------------------
    state  year  pop
0    Ohio  2000  1.5
1    Ohio  2001  1.7
2    Ohio  2002  3.6
3  Nevada  2001  2.4
4  Nevada  2002  2.9
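Building on the frame above, the sketch below shows a few common next steps: fixing the column order, pulling out a single column, and filtering rows. The variable names reordered, states, and nevada are illustrative only.

```python
from pandas import DataFrame

data = {'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
frame = DataFrame(data)

# Pass columns= to control the column order explicitly.
reordered = DataFrame(data, columns=['year', 'state', 'pop'])

states = frame['state']                      # a single column comes back as a Series
nevada = frame[frame['state'] == 'Nevada']   # boolean mask selects matching rows

print(reordered)
print(states)
print(nevada)
```

Bracket notation is the safer choice for the 'pop' column, because frame.pop refers to the DataFrame.pop method rather than to the column.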