Time to apply it! Let's use everything we've learned to do the homework.
1. Let's draw graphs using the flights dataset
import seaborn as sns
import matplotlib.pyplot as plt
# Load Seaborn's built-in 'flights' dataset
flights_data = sns.load_dataset('flights')
1) Total passengers per year
yearly_passengers = flights_data[['year','passengers']].groupby('year').sum().reset_index()
plt.bar(yearly_passengers['year'], yearly_passengers['passengers'])
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Total Passengers')
plt.show()
#
![](https://blog.kakaocdn.net/dn/bQUG7M/btsLDtVNXeS/8q1FURacZFrlOYIYRZozRk/img.png)
2) Average passengers per year
yearly_passengers = flights_data[['year','passengers']].groupby('year').mean()
plt.plot(yearly_passengers.index, yearly_passengers.values, color='gray', marker = 'o', linestyle = '--')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.title('Average Passengers')
plt.show()
#
![](https://blog.kakaocdn.net/dn/XuMF1/btsLCJLGf4J/Q0PXkWztMkfcAECQHlXuR0/img.png)
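Items 1 and 2 both lean on groupby, so here is a rough sketch of what groupby('year').sum() and .mean() compute, written with plain Python. The rows are made-up values, not the flights data, and the names groups, yearly_sum and yearly_mean are only for illustration. Note that pandas also sorts the group keys by default, which this sketch does not do.

```python
from collections import defaultdict

# Made-up (year, passengers) rows standing in for flights_data; not real values.
rows = [(2001, 10), (2001, 20), (2002, 30), (2002, 50), (2002, 40)]

# Step 1: split the rows into one bucket per year.
groups = defaultdict(list)
for year, passengers in rows:
    groups[year].append(passengers)

# Step 2: reduce every bucket to a single number.
yearly_sum = {year: sum(values) for year, values in groups.items()}
yearly_mean = {year: sum(values) / len(values) for year, values in groups.items()}

print(yearly_sum)   # {2001: 30, 2002: 120}
print(yearly_mean)  # {2001: 15.0, 2002: 40.0}
```

The bar chart in item 1 plots the sums and the line chart in item 2 plots the means, one point per year.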
3) Distribution of passenger counts
plt.hist(flights_data['passengers'], bins=30, color = "orange", edgecolor = 'white')
plt.xlabel('Passengers')
plt.ylabel('Frequency')
plt.title('Passengers Distribution')
plt.show()
#
![](https://blog.kakaocdn.net/dn/bJA4dt/btsLDG1Gh1p/p3o4xEz4JRKpF5x6YMqMWK/img.png)
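To see what the bins argument does, the following sketch counts values into equal-width bins by hand. It uses made-up numbers and 4 bins instead of 30 so the result is easy to check. The assumption here is the usual histogram convention: each bin includes its left edge, and only the last bin also includes its right edge.

```python
# Made-up passenger counts; not taken from the flights dataset.
values = [100, 110, 120, 150, 190, 200, 240, 300]
bins = 4

low, high = min(values), max(values)
width = (high - low) / bins  # every bin has the same width

counts = [0] * bins
for v in values:
    # Index of the bin that v falls into; the maximum goes into the last bin.
    i = min(int((v - low) / width), bins - 1)
    counts[i] += 1

edges = [low + k * width for k in range(bins + 1)]
print(edges)   # [100.0, 150.0, 200.0, 250.0, 300.0]
print(counts)  # [3, 2, 2, 1]
```

Each bar of the histogram has the height stored in counts, so a larger bins value gives narrower bars and a more detailed picture.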
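4) Passengers by year and by month ---- didn't understand this one, solve it again later
for month in flights_data['month'].unique():
    # Draw one scatter series per month: x = year, y = passengers for that month
    plt.scatter(flights_data[flights_data['month'] == month]['year'],
                flights_data[flights_data['month'] == month]['passengers'],
                label=month)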
plt.title('Passengers per Year by Month')
plt.xlabel('Year')
plt.ylabel('Passengers')
plt.legend()
plt.show()
#
![](https://blog.kakaocdn.net/dn/bN9Dei/btsLBtXwC4H/BQO2Me3YfAvZPOtVYg5ra1/img.png)
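The loop in item 4 may be easier to follow once the per-month filtering is written out by hand. The sketch below uses made-up rows and a hypothetical series dictionary. Each pass through the loop keeps only the rows of one month, and those rows become one colored group of points in the scatter plot.

```python
# Made-up (year, month, passengers) rows; not real flights values.
rows = [
    (2001, "Jan", 10), (2001, "Feb", 12),
    (2002, "Jan", 15), (2002, "Feb", 18),
    (2003, "Jan", 21), (2003, "Feb", 25),
]

# Months in order of first appearance, like flights_data['month'].unique().
months = list(dict.fromkeys(month for _, month, _ in rows))

series = {}
for month in months:
    # Same filter as flights_data[flights_data['month'] == month].
    selected = [r for r in rows if r[1] == month]
    years = [r[0] for r in selected]        # x values for this month
    passengers = [r[2] for r in selected]   # y values for this month
    series[month] = (years, passengers)     # what plt.scatter(..., label=month) receives

print(months)         # ['Jan', 'Feb']
print(series["Jan"])  # ([2001, 2002, 2003], [10, 15, 21])
```

So the x-axis shows the yearly trend, while the legend separates the months, which is how one chart answers both "by year" and "by month".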
5) Distribution of passengers by month
monthly_passengers = flights_data[['month','passengers']].groupby('month').sum().reset_index()
plt.bar(monthly_passengers['month'], monthly_passengers['passengers'], color = 'skyblue')
plt.xlabel('Month')
plt.ylabel('Passengers')
plt.title('Monthly Passengers')
plt.show()
#
![](https://blog.kakaocdn.net/dn/cy2SED/btsLC2j64pd/ShbHXP8QfnESOzHThOtntK/img.png)
Answer to item 5 ---- still not comfortable with boxplot
plt.figure(figsize=(8, 6))
# One list of passenger counts per month, so each month gets its own box
plt.boxplot([flights_data[flights_data['month'] == month]['passengers'] for month in flights_data['month'].unique()],
            labels=flights_data['month'].unique())
plt.title('Passengers Distribution by Month')
plt.xlabel('Month')
plt.ylabel('Passengers')
plt.show()
#
![](https://blog.kakaocdn.net/dn/RVtm0/btsLBVThE5M/MJVoLwRha05vDq3qflRhkk/img.png)
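As a note for the boxplot, here is a minimal sketch of the numbers a single box is built from. The values are made up, and the sketch assumes Matplotlib's default settings (quartiles by linear interpolation, whiskers limited to 1.5 times the interquartile range). The box spans Q1 to Q3, the line inside is the median, the whiskers reach the most extreme values inside the fences, and anything outside is drawn as a separate point.

```python
import statistics

# Made-up passenger counts for one month; not real flights values.
values = sorted([10, 12, 13, 15, 16, 18, 20, 22, 50])

# Quartiles with linear interpolation (assumed to match Matplotlib's default).
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1  # height of the box

# Points more than 1.5 * IQR away from the box are drawn as individual outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [v for v in values if lower_fence <= v <= upper_fence]
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, median, q3)            # 13.0 16.0 20.0
print(min(inside), max(inside))  # 10 22  (whisker ends)
print(outliers)                  # [50]
```

In the answer to item 5, this calculation happens once per month, which is why the plot receives a list of twelve series rather than one.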
2. Let's draw graphs using the tips dataset
import seaborn as sns
import matplotlib.pyplot as plt
# Load Seaborn's built-in 'tips' dataset
tips_data = sns.load_dataset('tips')
1) Average tip amount by day
Title: Average Tips by Day
X-axis: Day of the Week
Y-axis: Average Tip Amount
avg_tip = tips_data[['day','tip']].groupby('day').mean()
plt.plot(avg_tip.index, avg_tip.values, marker = 'o', color = 'tomato', linestyle = '--')
plt.xlabel('Day of the Week')
plt.ylabel('Average Tip Amount')
plt.title('Average Tips by Day')
plt.show()
#
![](https://blog.kakaocdn.net/dn/zlhTn/btsLBKdGPNL/21TsvUktz71i3AJx0Gnkx1/img.png)
2) Total tip amount by day
Title: Total Tips by Day
X-axis: Day of the Week
Y-axis: Total Tip Amount
# Sum of tips per day (named total_tip because it holds sums, not averages)
total_tip = tips_data[['day','tip']].groupby('day').sum().reset_index()
plt.bar(total_tip['day'], total_tip['tip'], color = 'limegreen')
plt.xlabel('Day of the Week')
plt.ylabel('Total Tip Amount')
plt.title('Total Tips by Day')
plt.show()
#
![](https://blog.kakaocdn.net/dn/bo8viX/btsLDeYXmJA/zPtVOHMjHY8jGF0xVrr2E1/img.png)
3) Distribution of the total bill
Title: Distribution of Total Bill
X-axis: Total Bill Amount
Y-axis: Frequency
plt.hist(tips_data['total_bill'], bins = 50, color = 'darkcyan', edgecolor = 'thistle')
plt.xlabel('Total Bill Amount')
plt.ylabel('Frequency')
plt.title('Distribution of Total Bill')
plt.show()
#
![](https://blog.kakaocdn.net/dn/c5XrgO/btsLCJSp0qc/hIFPM1axahaTrxMpPA3mtk/img.png)
4) Relationship between total bill and tip amount
Title: Tip Amount vs Total Bill
X-axis: Total Bill Amount
Y-axis: Tip Amount
plt.scatter(tips_data['total_bill'], tips_data['tip'], color = 'gold', edgecolors= 'black')
plt.xlabel('Total Bill Amount')
plt.ylabel('Tip Amount')
plt.title('Tip Amount vs Total Bill')
plt.show()
#
![](https://blog.kakaocdn.net/dn/WUhDI/btsLBxkRH4W/zquqGRFQzzyFoon1LUNRf0/img.png)
5) Distribution of the total bill by day
Title: Total Bill Distribution by Day
X-axis: Day of the Week
Y-axis: Total Bill Amount
# One list of total_bill values per day, so each day gets its own box
plt.boxplot([tips_data[tips_data['day'] == day]['total_bill']
             for day in tips_data['day'].unique()],
            labels = tips_data['day'].unique())
plt.xlabel('Day of the Week')
plt.ylabel('Total Bill Amount')
plt.title('Total Bill Distribution by Day')
plt.show()
#
![](https://blog.kakaocdn.net/dn/zRKdW/btsLBKxZgLu/QcJHqjsM1phwjIrzNDfuC1/img.png)