Week 11, pandas #16
Open
joyceslima wants to merge 1 commit into
| # 2 - Identify the columns that contain numbers, such as 'Spotify Streams', 'YouTube Views', etc., and convert them to a numeric type if they are stored in another format. (Use replace() and astype()) | ||
| colunas = ['Track', 'Album Name', 'Artist', 'Release Date', 'ISRC','All Time Rank', 'Track Score', 'Spotify Streams','Spotify Playlist Count', 'Spotify Playlist Reach','Spotify Popularity', 'YouTube Views', 'YouTube Likes', 'TikTok Posts','TikTok Likes', 'TikTok Views', 'YouTube Playlist Reach','Apple Music Playlist Count', 'AirPlay Spins', 'SiriusXM Spins','Deezer Playlist Count', 'Deezer Playlist Reach','Amazon Playlist Count', 'Pandora Streams', 'Pandora Track Stations','Soundcloud Streams', 'Shazam Counts', 'TIDAL Popularity','Explicit Track'] |
Collaborator
If these are all the columns, why not use `df.columns`?
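A minimal sketch of this suggestion, assuming a small made-up frame standing in for `df_musicas` (the values are invented, not taken from the dataset). Note that `columns` is an attribute of a DataFrame rather than a method, so it is read without parentheses:

```python
import pandas as pd

# Hypothetical stand-in for df_musicas; the values are invented for illustration.
df_musicas = pd.DataFrame({
    "Track": ["Song A", "Song B"],
    "Spotify Streams": ["1,000", "2,500"],
})

# `columns` is an attribute (an Index of labels), so no call parentheses.
colunas = list(df_musicas.columns)
print(colunas)
```

This avoids keeping a hand-typed list in sync with the file, although it still selects every column, text ones included.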
Comment on lines +17 to +19
| for col in colunas: | ||
| if df_musicas[col].dtypes == 'object': | ||
| df_musicas[col] = df_musicas[col].str.replace(',' , '').astype(float, errors='ignore') |
Collaborator
Careful here: you only needed `errors='ignore'` because you are trying to convert object-typed columns that really are object (text) columns.
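One possible way to act on this comment, sketched under assumptions: the list `colunas_numericas` and the sample values below are hypothetical, and `pd.to_numeric` with `errors="coerce"` is swapped in for the `astype(float, errors='ignore')` call used in the PR. The idea is to convert only the columns that are expected to hold numbers, so nothing has to be silently ignored:

```python
import pandas as pd

# Hypothetical stand-in for df_musicas; the values are invented for illustration.
df_musicas = pd.DataFrame({
    "Track": ["Song A", "Song B", "Song C"],
    "Spotify Streams": ["1,000", "2,500", "300"],
    "YouTube Views": ["10,000", "n/a", "300"],
})

# Assumption: only these columns are meant to be numeric.
colunas_numericas = ["Spotify Streams", "YouTube Views"]

for col in colunas_numericas:
    # Drop thousands separators as plain text (no regex), then convert.
    sem_virgulas = df_musicas[col].str.replace(",", "", regex=False)
    # errors="coerce" turns unparseable entries into NaN instead of
    # leaving the whole column untouched.
    df_musicas[col] = pd.to_numeric(sem_virgulas, errors="coerce")
```

Text columns such as 'Track' are never touched, and a bad value shows up as a missing number that can be counted with `isna()` rather than keeping the column as text.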
| # 4 - Create a new column called 'Streaming Popularity' holding the mean popularity across the platforms 'Spotify Popularity', 'YouTube Views', 'TikTok Likes', and 'Shazam Counts'. (remember that means and other mathematical operations can only be computed on numeric types) | ||
| df_musicas ['Streaming Popularity'] = df_musicas[['Spotify Popularity', 'YouTube Views', 'TikTok Likes', 'Shazam Counts']].median(axis=1) |
Collaborator
Suggested change
| df_musicas ['Streaming Popularity'] = df_musicas[['Spotify Popularity', 'YouTube Views', 'TikTok Likes', 'Shazam Counts']].median(axis=1) | |
| df_musicas ['Streaming Popularity'] = df_musicas[['Spotify Popularity', 'YouTube Views', 'TikTok Likes', 'Shazam Counts']].mean(axis=1) |
`median()` computes the median; what you need here is the mean 😄
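To make the difference concrete, here is a small sketch with invented numbers (not from the dataset) that computes both statistics row by row over the four platform columns:

```python
import pandas as pd

# Hypothetical values chosen so the mean and the median clearly differ.
df_musicas = pd.DataFrame({
    "Spotify Popularity": [90.0, 50.0],
    "YouTube Views": [1000.0, 200.0],
    "TikTok Likes": [10.0, 30.0],
    "Shazam Counts": [100.0, 20.0],
})

plataformas = ["Spotify Popularity", "YouTube Views", "TikTok Likes", "Shazam Counts"]

# axis=1 aggregates across the columns of each row.
media = df_musicas[plataformas].mean(axis=1)      # 300.0 and 75.0
mediana = df_musicas[plataformas].median(axis=1)  # 95.0 and 40.0

df_musicas["Streaming Popularity"] = media
```

In the first row one large 'YouTube Views' value pulls the mean up to 300.0 while the median stays at 95.0, which is why the two calls are not interchangeable for this exercise.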
ETL exercise with pandas, using a list of the most-streamed artists on Spotify!