📊 R Basics | Creating Vectors, Naming, Vector Arithmetic, Selecting Specific Elements

2020. 9. 12. 11:13ㆍLearning archive/Data Science

📌 Data structures in R

 

Image credit @Gaurav Tiwari (Medium) 

 

R์—์„œ ์ œ๊ณตํ•˜๋Š” ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ๋Š” ๋ฒกํ„ฐ, ๋งคํŠธ๋ฆญ์Šค, ๋ฐฐ์—ด, ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„, ๋ฆฌ์ŠคํŠธ๊ฐ€ ์žˆ๋‹ค.  

 

 

📌 Creating vectors

# A vector is a data structure that holds one or more elements.
# All elements of a single vector must have the same data type.

vector <- "yeah!" 


# Use c() (the combine function) to create a vector with several elements.
numeric_vector <- c(1, 50, 60)
character_vector <- c("s", "a", "n")
boolean_vector <- c(TRUE, FALSE, TRUE)
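To illustrate the same-type rule, here is a small sketch of what happens when types are mixed inside c(). R does not raise an error; it coerces every element to a common type. The names mixed_vector and num_logical are made up for this example.

```r
# Mixing numeric, character, and logical values: everything is coerced
# to character, the most flexible type present.
mixed_vector <- c(1, "a", TRUE)   # hypothetical example vector
class(mixed_vector)

# Mixing numeric and logical values: the logicals become numbers
# (TRUE -> 1, FALSE -> 0).
num_logical <- c(1.5, TRUE, FALSE)   # hypothetical example vector
class(num_logical)
```

Checking class() after building a vector is a quick way to catch an accidental coercion.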

 

 

📌 Naming vector elements names()

# Poker winnings from Monday to Friday
poker_vector <- c(140, -50, 20, -120, 240)

# Roulette winnings from Monday to Friday
roulette_vector <- c(-24, -50, 100, -350, 10)

# Assign days as names of poker_vector
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Assign days as names of roulette_vector
names(roulette_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

 

📌 Storing the names in a variable before assigning them

# Poker winnings from Monday to Friday
poker_vector <- c(140, -50, 20, -120, 240)

# Roulette winnings from Monday to Friday
roulette_vector <- c(-24, -50, 100, -350, 10)

# The variable days_vector: store the day names in a variable
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday") 
 
# Assign the names of the day to roulette_vector and poker_vector
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

 

📌 Arithmetic between vectors

A_vector <- c(1, 2, 3)
B_vector <- c(4, 5, 6)

# Add the two vectors above (element by element)
total_vector <- A_vector + B_vector
  
# Print the result
total_vector

# console 
[1] 5 7 9 
# Poker and roulette winnings (Monday to Friday)
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Assign to total_daily how much you won/lost on each day
total_daily <- roulette_vector + poker_vector
total_daily


#console 
> total_daily
   Monday   Tuesday Wednesday  Thursday    Friday 
      116      -100       120      -470       250

 

📌 Comparing the element sums of two vectors sum()

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Calculate total gains for poker and roulette
# sum() adds up all the elements of a single vector
total_poker <- sum(poker_vector)
total_roulette <- sum(roulette_vector)

# Check which total is larger!
total_poker > total_roulette
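For reference, here is a self-contained version of the same comparison with the totals worked out by hand from the winnings listed in this post. It is only a sketch to show what the console should report.

```r
# Winnings from Monday to Friday, as used throughout this post
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)

total_poker <- sum(poker_vector)        # 230
total_roulette <- sum(roulette_vector)  # -314

# The comparison returns a single logical value
total_poker > total_roulette            # TRUE
```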

 

📌 Selecting specific vector elements vector_name[n] | vector_name[n1:n2]

# Poker and roulette winnings from Monday to Friday:
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

# Define a new variable based on a selection
poker_wednesday <- poker_vector[3]

 

# Selecting several elements of a vector
# Define a new variable based on a selection
poker_midweek <- poker_vector[c(2,3,4)]
poker_midweek

#console 
#  Tuesday Wednesday  Thursday 
#      -50        20      -120

 

# [c(2,3,4)] is too long! [2:4] works as well (it selects the 2nd, 3rd, and 4th elements)

# Define a new variable based on a selection
roulette_selection_vector <- roulette_vector[2:4]
roulette_selection_vector
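As a quick check of the shorthand, the sketch below selects the same elements both ways and compares the results. The names by_index and by_range are made up for this example.

```r
roulette_vector <- c(-24, -50, 100, -350, 10)
names(roulette_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

by_index <- roulette_vector[c(2, 3, 4)]  # explicit positions (hypothetical name)
by_range <- roulette_vector[2:4]         # same positions as a range (hypothetical name)

# Both selections hold the same values and the same names
identical(by_index, by_range)            # TRUE
by_range
```

Either form keeps the element names, so by_range still shows Tuesday, Wednesday, and Thursday.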

 

 

# You can also select values by the names assigned to the vector.

poker_start <- poker_vector[c("Monday","Tuesday","Wednesday")]
poker_start   

#console
   Monday   Tuesday Wednesday 
      140       -50        20

 

๐Ÿ“Œ๋ฒกํ„ฐ์˜ ํ‰๊ท ๊ฐ’ ๊ตฌํ•˜๊ธฐ mean(vector_name) 

mean(poker_start)

#console
[1] 36.66667

 

 

📌 Selection by comparison | Advanced selection

# Which days did you make money on poker?
selection_vector <- poker_vector > 0 
  
# Print out selection_vector
selection_vector 

# console
   Monday   Tuesday Wednesday  Thursday    Friday 
     TRUE     FALSE      TRUE     FALSE      TRUE
  
  
 
# Select from poker_vector these days
poker_winning_days <- poker_vector[selection_vector]
poker_winning_days

# console 
   Monday Wednesday    Friday 
      140        20       240
      

#R knows what to do when you pass a logical vector in square brackets: 
#it will only select the elements that correspond to TRUE in selection_vector.
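The two steps can also be written as one expression by putting the comparison directly inside the square brackets. Here is a minimal sketch of that for the roulette winnings; the name roulette_winning_days is made up for this example.

```r
roulette_vector <- c(-24, -50, 100, -350, 10)
names(roulette_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# roulette_vector > 0 builds the logical vector; the brackets keep
# only the elements where it is TRUE.
roulette_winning_days <- roulette_vector[roulette_vector > 0]  # hypothetical name
roulette_winning_days
```

With these values only Wednesday (100) and Friday (10) remain. Keeping the intermediate selection_vector can still be handy when the same condition is reused on several vectors.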

 

Image credit datacamp.com