2020. 9. 1. 19:42 · Learning archive/Data Science
Gapminder World Map
How to use Matplotlib
The plot function tells Python what to show and how to show it!
Scatter plot: doesn't connect the dots, so it is often the more honest way to show the data
Practice: my first graph
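A minimal sketch of the line plot vs. scatter plot difference noted above. The year and pop values are made up for illustration (they are not real Gapminder figures), and the Agg backend is my own assumption so that the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up illustrative values, NOT real Gapminder data
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]

fig, (ax_line, ax_scatter) = plt.subplots(1, 2)

# Line plot: the points are connected with a line
ax_line.plot(year, pop)

# Scatter plot: each observation is drawn as a separate dot
ax_scatter.scatter(year, pop)
```

In an interactive session you would normally drop the Agg line and finish with plt.show() to see both panels side by side.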
Histogram
e.g. population pyramid :
Histogram_practice
Build a histogram(3) : compare
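A small sketch of building a histogram and comparing two bin counts. The values list is a made-up sample, and the choice of 3 and 6 bins is only an assumption for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up sample values, for illustration only
values = [0, 0.6, 1.4, 1.6, 2.2, 2.5, 2.6, 3.2, 3.5, 3.9, 4.2, 6]

fig, (ax_coarse, ax_fine) = plt.subplots(1, 2)

# hist() splits the value range into equal-width bins and counts
# how many values fall into each bin
counts_3, edges_3, _ = ax_coarse.hist(values, bins=3)
counts_6, edges_6, _ = ax_fine.hist(values, bins=6)

print(counts_3)  # number of values per bin with 3 bins
print(counts_6)  # number of values per bin with 6 bins
```

Too few bins hide the shape of the distribution and too many make it noisy, which is why comparing a few settings is worth the effort.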
Customization: how to customize plots
Many options: different plot types and many customizations (colors, shapes, etc.). The right choice depends on the data and on the story you want to tell.
+ add axis labels
Labels
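A hedged sketch of the customizations mentioned here: axis labels, a title, and custom ticks. The data, label texts, and tick values are all made up for illustration. I use the Axes methods (set_xlabel and friends); plt.xlabel, plt.ylabel, plt.title, and plt.yticks are the pyplot shorthands for the same thing.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up illustrative values, NOT real data
year = [1950, 1970, 1990, 2010]
pop = [2.5, 3.7, 5.3, 7.0]

fig, ax = plt.subplots()
ax.plot(year, pop, color="green", linestyle="--", marker="o")

# Axis labels and title tell the reader what the plot shows
ax.set_xlabel("Year")
ax.set_ylabel("Population (billions)")
ax.set_title("Population over time (illustrative data)")

# Custom tick positions on the y-axis
ax.set_yticks([0, 2, 4, 6, 8])
```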
New Python type: dictionaries (so useful!)
The old way: make two lists and connect them through the index. Not convenient, not intuitive.
The dictionary way
pop = [30, 2, 39]
countries = ["afg", "alba", "algeria"]
...
world = {"afg":30, "alba":2, "algeria":39}
world["alba"]
# intuitive, and more efficient (faster lookups)
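To make the comparison concrete, here is a small runnable sketch of both approaches using the values from the notes. The variable names ind_alb, pop_from_lists, and pop_from_dict are my own, added only for illustration.

```python
# Old way: two parallel lists connected through a shared index
pop = [30, 2, 39]
countries = ["afg", "alba", "algeria"]

ind_alb = countries.index("alba")  # find the position of "alba"
pop_from_lists = pop[ind_alb]      # use that position in the other list

# Dictionary way: look the value up directly by its key
world = {"afg": 30, "alba": 2, "algeria": 39}
pop_from_dict = world["alba"]      # note the quotes around the key

print(pop_from_lists, pop_from_dict)  # prints: 2 2
```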
Dictionary Manipulation(1) add keys
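A short sketch of the basic dictionary manipulations: adding a key, checking for it, updating it, and deleting it. The "sealand" key and its values are placeholders chosen for illustration, not real data.

```python
world = {"afg": 30, "alba": 2, "algeria": 39}

# Add a new key-value pair by assigning to a key that does not exist yet
world["sealand"] = 0.000027      # placeholder value
added = "sealand" in world       # membership test looks at the keys

# Assigning to an existing key overwrites its value (keys are unique)
world["sealand"] = 0.000028

# Remove a key together with its value
del world["sealand"]

print(added, world)
```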
Pandas, Part 1
Tabular data set examples
row = observation (the unit being observed)
col = variable
pandas = high-level data manipulation tool (built on NumPy)
You can build a DataFrame manually. #1: from a dict, using pd.DataFrame(dictname)
#2: DataFrame from a CSV file
- CSV stands for comma-separated values. It is a very simple format for storing and sharing data.
- A CSV file stores a table of numbers and strings as plain text, so a wide range of programs can store, transfer, and process it.
- Because the data storage (the CSV file) and the data processing (the Python script) are separate, the same processing steps can be applied to other data sets more easily.
index_col=0 : tells pd.read_csv to use the first column as the row labels (index)
practice
Build a DataFrame from a dictionary, then attach labels to the rows
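A minimal sketch of this practice step. The column names and the row labels are my own choice for illustration and are not taken from the original exercise.

```python
import pandas as pd

# Each key becomes a column name, each list becomes that column's values
data = {
    "country": ["Brazil", "Russia", "India"],
    "capital": ["Brasilia", "Moscow", "New Delhi"],
}
brics = pd.DataFrame(data)

# By default the rows are labeled 0, 1, 2; replace them with custom labels
brics.index = ["BR", "RU", "IN"]

print(brics)
```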
CSV to Dataframe(1)
Import pandas, load the CSV, and pass index_col=0
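A hedged sketch of loading a CSV with and without index_col=0. To keep it self-contained, the CSV content is an in-memory string (csv_text, an assumption of mine) read through io.StringIO; with a real file you would pass the file path to pd.read_csv instead.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; the first column holds the row labels
csv_text = """,country,capital
BR,Brazil,Brasilia
RU,Russia,Moscow
IN,India,New Delhi
"""

# Without index_col, the label column is read as ordinary data
no_index = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row index
brics = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(no_index.columns.tolist())
print(brics.index.tolist())
```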
Pandas, Part 2
Index and select data! First, with []
In the video, you saw that you can index and select Pandas DataFrames in many different ways. The simplest, but not the most powerful way, is to use square brackets.
Two columns also work with []
Row access with []: only through a slice
but []: limited functionality
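A small sketch of what square brackets can do. The brics DataFrame here is a hand-made example for illustration, not the course's data set.

```python
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "India", "China"],
        "capital": ["Brasilia", "New Delhi", "Beijing"],
        "continent": ["South America", "Asia", "Asia"],
    },
    index=["BR", "IN", "CH"],
)

# Single brackets with one column name return a Series
one_col_series = brics["country"]

# Double brackets return a DataFrame, even for a single column
one_col_frame = brics[["country"]]

# Two columns: pass a list of column names
two_cols = brics[["country", "capital"]]

# Rows: only through a slice; the end position is excluded
rows = brics[1:3]

print(type(one_col_series), type(one_col_frame))
print(rows)
```

The limitation is that [] cannot select rows and columns in one step, which is where loc and iloc come in.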
Let's try pandas loc and iloc
loc (label-based)
The difference from []: with loc you can extend your selection after a comma to pick columns as well as rows.
loc(label-based)
Row access with iloc: uses integer positions instead of labels
loc vs iloc
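A side-by-side sketch of loc (labels) and iloc (integer positions). The brics DataFrame is again a hand-made example for illustration.

```python
import pandas as pd

brics = pd.DataFrame(
    {
        "country": ["Brazil", "India", "China"],
        "capital": ["Brasilia", "New Delhi", "Beijing"],
        "continent": ["South America", "Asia", "Asia"],
    },
    index=["BR", "IN", "CH"],
)

# loc: select by row label, then (after the comma) by column label
row_by_label = brics.loc[["IN"]]
sub_by_label = brics.loc[["IN", "CH"], ["country", "capital"]]
all_rows_two_cols = brics.loc[:, ["country", "capital"]]

# iloc: the same selections, but by integer position
row_by_pos = brics.iloc[[1]]
sub_by_pos = brics.iloc[[1, 2], [0, 1]]

# Both approaches pick exactly the same cells
print(sub_by_label.equals(sub_by_pos))  # prints: True
```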
Practice!
Source
datacamp.com