What on earth are we talking about?


Motivation and goal

Music is an important part of everyday life for most people. Whether it is music for a party, for the bus or the inescapable noise you hear in a telephone queue, it affects us in one way or another. But which words are we actually hearing when we put on our headphones? Are we exposing ourselves to positive or negative messages? Are we subconsciously choosing to listen to negative songs or positive ones? And lastly, how do the artists connect to one another, and what information can we derive from the different groupings?

These are the questions we seek to answer in this study, and to do that we need some data. But to get this data, we first need to limit our search. There are quite frankly too many artists across the world for us to take them all into account with the limited computing power that we possess, so to make it manageable we chose to focus on one particular genre of music.

Hip Hop

With its rich history, interesting vocabulary and amazing artists, we believe that we can uncover quite a lot of hidden secrets within this branch of music, as well as prove or disprove whether hip hop music is as violent and negative as people believe it to be.

Dataset

We use the Wikipedia list of important hip hop artists to limit the scope of artists in the hip hop genre. We then use the Genius API to extract the necessary data and link the artists based on the collaborations they have had on songs. The number of collaborations between two artists is the weight of the link, and each node gets five attributes: popularity, followers, Twitter handle, subgenre and Instagram name. Each artist also has as attributes the songs they have made or collaborated on, and these songs have the attributes hotness, pyongs, release date, page views, media platforms and title.
Read more about the dataset and the cleaning of lyrics and attributes in the explainer notebook (link above).
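To make the link weights concrete, here is a minimal sketch of how collaboration counts could be turned into weighted edges. The `songs` dictionary and the function name are made up for illustration. The real data comes from the Genius API and the actual pipeline is described in the explainer notebook.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: each song maps to the artists credited on it
# (primary and featured). This is NOT the real dataset.
songs = {
    "Song A": ["Lil Wayne", "Rick Ross"],
    "Song B": ["Lil Wayne", "Rick Ross", "Snoop Dogg"],
    "Song C": ["Snoop Dogg"],
}

def collaboration_weights(songs):
    """Count, for every unordered artist pair, the songs they share."""
    weights = Counter()
    for artists in songs.values():
        # sorted() makes (a, b) and (b, a) the same key; set() drops repeats.
        for pair in combinations(sorted(set(artists)), 2):
            weights[pair] += 1
    return weights

edge_weights = collaboration_weights(songs)
print(edge_weights[("Lil Wayne", "Rick Ross")])  # 2
```

A song with only one credited artist, like "Song C" here, produces no edge. This is one way an artist can end up as an isolated node.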

Network


Using this information we created a graph consisting of 1062 nodes and 12226 edges as seen below. Feel free to click around to see and read about other features of the graph.


Our network
Here we see the network of artists and the songs that they have collaborated on and that therefore connect them. At the edges of the graph, small nodes can be seen without any connections. This can occur if an artist hasn't collaborated with anyone, perhaps because they are unpopular, do not use the Genius platform or simply like doing things alone.

Communities

Communities (SCROLL DOWN for pie charts)
Communities in dense networks can often be hard to see with the naked eye. The concept is based on the fact that within any network there will be some groups of nodes that are more densely connected to each other than to the rest. In this case that could be because some artists are better friends with one another than others, or because some artists are connected through their chosen genre.
In the graph above 110 different communities were found. This number is of course also influenced by the number of artists who have no actual connection to the rest of the graph.
We decided to check whether the artists are in fact connected through their subgenres. Below are twelve pie charts that show the overall distribution of genres over the members of the six biggest communities. The first pie chart includes the rap category, but since it takes up the majority of the chart we omitted it in the second version.

This means that for each community there are two pie charts, and the further down you look the smaller the communities.
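The numbers behind such pie charts could be computed roughly as follows. This is only a sketch: the two dictionaries are hypothetical toy data, and it assumes each artist has a single dominant genre, which may differ from how the charts on this page were built.

```python
from collections import Counter

# Hypothetical data: community id and dominant subgenre per artist.
community_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
genre_of = {"A": "rap", "B": "rap", "C": "pop", "D": "rap", "E": "r&b"}

def genre_shares(community_of, genre_of, skip=()):
    """Return {community: {genre: fraction}}, ignoring genres in `skip`."""
    counts = {}
    for artist, community in community_of.items():
        genre = genre_of[artist]
        if genre in skip:
            continue
        counts.setdefault(community, Counter())[genre] += 1
    return {
        community: {g: n / sum(c.values()) for g, n in c.items()}
        for community, c in counts.items()
    }

with_rap = genre_shares(community_of, genre_of)                     # first chart
without_rap = genre_shares(community_of, genre_of, skip=("rap",))  # second chart
```

Dropping rap before normalising means the remaining slices sum to one again, which is why the smaller genres become visible in the second chart.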




As can be seen in the pie charts, most artists have the majority of their songs in the rap genre. This means that there is very little separating the communities in terms of genre when rap is included. You could argue that the fourth biggest community leans towards pop.
You could also argue that the second largest community is more strongly based on rap, but you would need to apply statistical methods to see whether the difference from the other pie charts is statistically significant.
The other set of pie charts, where rap is omitted, only seems to strengthen the assumption that artists don't collaborate with one another based on genre.
A funny thing to consider is that only a very low percentage of the artists have songs tagged with the hip hop genre.

Hotness

HOT or NOT
On Genius there is an attribute for each song that marks the song as hot or not. In this graph all the artists who do NOT have any hot songs are colored orange, and the HOT artists are colored red.
As seen on the graph, Genius only thinks that four artists have got what it takes, namely Eminem, Travis Scott, XXXTENTACION and Mariah Carey (which could have something to do with the fact that we pulled this information close to Christmas).
This, however, only seems to show that the Genius attribute isn't much to go on, when only 4 out of 110632 songs were marked as hot.


Followers
Since the hotness attribute did not appear to be particularly useful as a measure of popularity, we decided to look at how many followers each artist has. The graph above shows the same communities as in the communities tab, but here the nodes are scaled by how many followers each artist has.
In this plot we can see a pink group whose members all have a lot of followers, judging by their large node sizes. The rest of the communities don't seem to share any common traits based on follower count.
The two artists with the largest follower counts are Kendrick Lamar with 22676 followers and Eminem with 19572 followers.


Genre
The Genius API offers a way to see which subgenres a song belongs to. For all the songs of each artist we took this information, found the genre the artist had the most songs in, and colored the node accordingly. It is obvious that a couple of colors dominate, mainly rap, shown as the dark green dots, and pop, shown in pink.

Degree distribution shown with linear scale

Degree distribution shown with log-log scale

Above are two plots of the degree distribution. The degree in this case is the number of unique people an artist has collaborated with. As can be seen on the plots, the degree distribution follows a power law. This means that if an artist in the network is popular, then he or she is more likely to make new collaborations than others. This translates easily to our graph: more artists would want to collaborate with well-known artists, either to boost their own success through the other artist's followers, or simply because they are more likely to have heard of a popular artist that matches a certain music style. The result is a distribution where a lot of the artists only have one or two connections, whilst popular artists can have up to 250 collaborators.
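For readers who want to reproduce this kind of plot, here is a small sketch of how a degree histogram can be computed from an edge list, together with a crude straight-line fit on log-log axes. The edge list and function names are hypothetical. A straight line on a log-log plot is only a rough visual indication of a power law, not a proper statistical test.

```python
import math
from collections import Counter

# Hypothetical undirected edge list (artist pairs), not the real graph.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "A")]

def degree_histogram(edges):
    """Map degree k -> number of nodes with exactly k distinct neighbours.

    Nodes without any edge never appear in the edge list, so degree 0
    is not included (which also keeps log(k) well defined below).
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    return Counter(len(n) for n in neighbours.values())

def loglog_slope(histogram):
    """Least-squares slope of log(count) against log(degree)."""
    xs = [math.log(k) for k in histogram]
    ys = [math.log(c) for c in histogram.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

hist = degree_histogram(edges)
slope = loglog_slope(hist)  # negative when high degrees are rare
```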

Strength distribution shown with linear scale

Strength distribution shown with log-log scale

Above are two plots of the strength distribution. The difference between the two is that the degree distribution focuses on how many artists an artist has collaborated with, while the strength distribution focuses on how many songs an artist has collaborated on.
As seen above, the strength distribution also follows a power law.
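The difference between degree and strength can be sketched in a few lines. The example assumes that strength is the sum of the edge weights around a node, which is the usual definition, and uses made-up edges where each pair of artists appears only once.

```python
# Hypothetical weighted edges: (artist, artist, number of shared songs).
weighted_edges = [("A", "B", 5), ("A", "C", 1), ("B", "C", 2)]

def degree_and_strength(weighted_edges):
    """Return ({node: degree}, {node: strength}) for an undirected graph."""
    degree, strength = {}, {}
    for u, v, w in weighted_edges:
        for node in (u, v):
            degree[node] = degree.get(node, 0) + 1      # one more collaborator
            strength[node] = strength.get(node, 0) + w  # w more shared songs
    return degree, strength

degree, strength = degree_and_strength(weighted_edges)
print(degree["A"], strength["A"])  # 2 6
```

Note that under this definition a song with several guests adds to several edges, so it contributes more than once to the strength of each artist on it.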

Cliques

Even though the degree distribution follows a power law, meaning that a lot of nodes only have very few links, the graph is still full of connections. There are many ways to group artists together, such as the community method seen above. Another way is to look at cliques. A clique is a group of artists where every artist is connected to every other artist in the clique. If you move the slider below, you will notice that both the value and the graph change. The value shows the size of the cliques we are looking for: if the value is three and an artist is visible, then that artist is part of one or more three-artist cliques where everyone is connected to one another. When the value reaches fourteen you will see the names of the people in the largest clique.


Interactive graph: artists in k-cliques (out of 1062), with a slider for the clique size k

Based on these graphs we can conclude that even though a lot of nodes do not have a high degree, they are still quite connected to the rest of the graph, considering that 838 of the 1062 nodes are in a 3-clique and 646 of the 1062 are in a 4-clique. We can also conclude that the most interconnected artists, the ones in the 14-clique, are quite famous.
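Counting how many artists sit in a clique of a given size can be sketched as below. This is not necessarily how the numbers on this page were produced. The sketch uses the classic Bron-Kerbosch algorithm on a toy adjacency dictionary, and it relies on the fact that a node is in some k-clique exactly when it belongs to a maximal clique with at least k members.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph (Bron-Kerbosch)."""
    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already explored nodes
        if not p and not x:
            yield r
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())

def nodes_in_k_cliques(adj, k):
    """Nodes that belong to at least one clique with k or more members."""
    members = set()
    for clique in maximal_cliques(adj):
        if len(clique) >= k:
            members |= clique
    return members

# Hypothetical graph: A, B, C form a triangle; D hangs off C; E is isolated.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": set(),
}
print(sorted(nodes_in_k_cliques(adj, 3)))  # ['A', 'B', 'C']
```

The plain version of the algorithm shown here can be slow on large dense graphs, so for a network of this size a library implementation with pivoting would likely be a better choice.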

Basic Stats


Now that you have seen the network and you have read about the communities and cliques of the artists, you might be interested in knowing a little bit more. In this section we present a few simple facts about our graph.

Did You Know?

That 964 of 1062 hip hop artists have most of their songs in the rap genre.

Bruno Mars is the third most positive hip hop artist based on the sentiment analysis below.

Kendrick Lamar is the hip hop artist with the most followers in our graph. He has 22676 followers on Genius.

Top Collaborating Artists

This table shows the number of songs that artists have collaborated on, counting both the songs where they are the main artist and the songs where they are only featured.

No.  Artist      Collaborations
1    Lil Wayne   856
2    Gucci Mane  697
3    Rick Ross   644
4    Snoop Dogg  586
5    The Game    502


Top Primary Artists

This table shows the number of songs that artists have collaborated on where they acted as the primary artist.

No.  Artist      Collaborations
1    Gucci Mane  484
2    DJ Khaled   399
3    The Game    376
4    Lil Wayne   370
5    Tech N9ne   329


Top Featured Artists

This table shows the number of songs that artists have collaborated on where they acted as a featured artist.

No.  Artist      Collaborations
1    Lil Wayne   486
2    Rick Ross   390
3    Snoop Dogg  310
4    Young Thug  241
5    2 Chainz    233


Top Total Page Views

These are the artists with the highest total page views, summed over all their songs.

No.  Artist      Page Views
1    Eminem      13378000
2    Big Shaq    8153000
3    The Weeknd  7722000
4    A$AP Rocky  7208000
5    Logic       6375000


Top Mean of Song Page Views

These are the artists with the highest mean page views, averaged over all their songs.

No.  Artist         Page Views
1    Big Shaq       1027444.44
2    The Weeknd     414177.08
3    Bobby Shmurda  404000
4    Drake          380429.59
5    Post Malone    334468.35
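The two page view rankings above can disagree because a total rewards artists with many songs, while a mean rewards artists whose few songs are all popular. A small sketch with made-up numbers (the artists and values are hypothetical):

```python
# Hypothetical per-song page views for two artists.
page_views = {
    "Artist X": [900_000, 150_000, 50_000, 100_000],
    "Artist Y": [600_000, 400_000],
}

def mean(values):
    return sum(values) / len(values)

def rank_by(page_views, key):
    """Return artist names sorted by key(list_of_views), highest first."""
    return sorted(page_views, key=lambda a: key(page_views[a]), reverse=True)

by_total = rank_by(page_views, sum)   # Artist X first (1.2M vs 1.0M)
by_mean = rank_by(page_views, mean)   # Artist Y first (500k vs 300k)
```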

Frequent terms


Based on the lyrics from 110632 songs we found the most frequently used words for each artist. Here are the most used words for the top ten hip hop artists, as well as a little insight into Snoop Dogg's music career based on the words he has used over the years.

A few explanations

You may be wondering about a few of the words in these images. Fear not, here are explanations for some of them.

Big Daddy Kane

Heidi: Big Daddy Kane has made a song called "let yourself go", where he repeats the phrase "heidi heidi heidi ho" many times in the chorus.
Veteranz: Veteranz Day is the latest studio album by emcee Big Daddy Kane, released in 1998.
Demonstration: Big Daddy Kane has made a song called "give a demonstration" where the word demonstration pops up quite often.

The Notorious B.I.G

Combs: Sean John Combs is the real name of rap artist and producer P. Diddy. He founded the Bad Boy Entertainment record label, whose roster included The Notorious B.I.G. Apart from that, the two of them were very close friends, and P. Diddy even made a heartbreaking song honoring The Notorious B.I.G in the aftermath of his death.
Michael Patterson: A producer who worked with the artist on his album Life After Death.
Wallace: Wallace is the last name of the artist.

Eminem

ASCAP: American Society of Composers, Authors and Publishers. Eminem has won the "ASCAP Award for Most Performed Song from a Motion Picture" for his song Lose Yourself. Apart from that, he has presented producer Dr. Dre with an award at ASCAP.
Haile: Haile is Eminem's daughter and she is the topic of several Eminem songs, e.g. the song Mockingbird.
Effigy: Effigy is a recording studio in Detroit that Eminem bought from rapper Big Sean. It is an old factory building, but on the inside it looks very nice.

Ice Cube

Chickity: The famous song "check yo self" by Ice Cube has the famous line "Chickity-check yo self before you wreck yo self" in the chorus.
Bomaye: The term means "Kill him". The word became famous when the audience screamed it during the "Rumble in the Jungle" fight between Muhammad Ali and George Foreman. Ice Cube says bomaye in the song "pros vs joes".

Jay-Z

Jigga: A Jay-Z alias.
Pharrell: "Frontin" is the debut solo single by American singer Pharrell Williams. Several songs on the album feature Jay-Z, so the two definitely have some sort of relationship.

KRS-One

Blastmaster: An alias of KRS-One.
Edutainment: An album made by KRS-One. The word comes from "education" and "entertainment". So KRS-One is referring to his songs on the album as being educational and entertaining at the same time.

Melle Mel

Diggedy: In the song "white lines dont do it", Melle Mel says: "Rang dang diggedy dang di-dang", which is a rather special phrase.
Jesse: Quote from Wikipedia: "Jesse" was a highly political song which urged people to vote for then presidential candidate Jesse Jackson.
Vote: As an extension of the above, the word "vote" is special to Melle Mel, also because of the song "Jesse".

Nas

Braveheart: Bravehearts is the name of an East Coast hip hop group with Nas' cousin as a member. The group's albums include guest appearances from Nas.
Nasir: Nasir is the real name of the artist Nas, and Nasir is also the name of a Nas album.

Rakim

Internationally: Rakim's song "When I b on the mic" included "Im internationally known" in the chorus.
Moolah: Meaning money. There are many synonyms for money, and this one seems to be one of Rakim's preferences since he uses it in several songs.
Levantaré: The word means "I will raise" in Spanish. Rakim uses Levantaré in the chorus of the song "Uplift".

Tupac Shakur

Motherfuckers: Tupac, like Ice Cube, is known for using the word motherfucker very often. He uses the word in several of his songs, e.g. "Hit Em Up", a very controversial diss song aimed at The Notorious B.I.G.
Makaveli: An alias of Tupac.
Thug: Tupac is known for saying the phrase "Thug life" a lot. Partly because he was in a rap group called Thug Life. Thug life is more or less a synonym for gangster life, but for Tupac it had a second meaning: An acronym standing for 'The Hate U Give Little Infants Fucks Everyone'. A message for all parents that raising kids in a negative environment is bad for everyone.

A closer look at Snoop Dogg

Snoop Dogg is one of the artists who has changed the most over the course of a career. A simple look at how his name has changed over the years is enough to confirm this. We wanted to investigate whether his language shows any sign of these transformations.
In the figure below we see a column of words and a list of years. The orange dots indicate whether or not Snoop Dogg has said each of these words in his lyrics during those years.

Snoop Dogg's use of selected words by year


As can be seen in the figure above, the word "snoop" is said every single year since his career started, which really is not a surprise. However, the word "lion" also appears almost every year, showing that he might have considered the idea of changing his name to Snoop Lion before he actually did it in 2012.
The word "gangsta" is said in all years but one, and "bitch" is said in every year. This supports the stereotype that hip hop artists don't favor the nicest words. Even in 2012, when he declared that he was Bob Marley reincarnated and wanted to turn his back on guns and violence, he did not stop using these words in his lyrics. However, the plot doesn't show the context in which the words were used.
The last word that is linked to Snoop Dogg as a person is "international". His use of this word loosely follows his transition to becoming an international star in the late 90's.

We have included some names that could be interesting to look at based on Snoop Dogg's bio on Wikipedia. At the top, one can see Trump, whom Snoop Dogg mentioned in various years, most recently in 2017 when Trump became president. One can also see that he has relations with fellow rappers Pharrell and Wiz Khalifa. The relation with Pharrell is an older one, as Snoop released an album in 2002 featuring Pharrell. Snoop Dogg's relationship with Wiz Khalifa is very close but does not go that far back in time, since Wiz Khalifa is newer on the stage and simply younger than Snoop Dogg (meaning that he would not have known him until later in his career).

The conclusion here is that the changes in his own demeanor and personality cannot be linked to these words at least, since Snoop Dogg tends to use these words throughout most of the years.

Sentiment Analysis


A sentiment analysis is an analysis of whether a text is positive or negative. We use a sentiment dictionary, which is a list of words that have been rated by unpaid first year psychology students on a scale from 1 to 9 according to how they were affected by each word. This list is the basis for the analysis.

Overall sentiment

The overall sentiment of the hip hop artists is actually slightly above the middle of the scale. With a mean value of 5.38 on a scale that goes from 1 to 9, this seems like a good indication that the bad reputation hip hop artists have as a negative influence is highly exaggerated.
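A dictionary-based sentiment score of this kind can be sketched as the mean rating of the words that appear in the word list. The ratings below are invented for illustration and the tokenisation is deliberately simple. The real word list and the cleaning steps are described in the explainer notebook.

```python
import re

# Hypothetical ratings on the 1-9 scale; the real list is far larger.
ratings = {"love": 8.4, "party": 7.2, "hate": 2.3, "gun": 3.0}

def sentiment(text, ratings):
    """Mean rating of the words in `text` that appear in `ratings`.

    Returns None when no word of the text has a rating.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [ratings[w] for w in words if w in ratings]
    return sum(scores) / len(scores) if scores else None

happy = sentiment("I love this party", ratings)   # mean of 8.4 and 7.2
unknown = sentiment("Skrrt skrrt", ratings)       # None, no rated words
```

Words without a rating are simply ignored, so slang that is missing from the list has no effect on the score.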

Overall sentiment



Negative wordcloud

To investigate this further we decided to look at some of the distinctive words for artists with both the most negative and the most positive sentiment values.
This wordcloud contains the top tf-idf words for the artists with the lowest sentiment values, meaning the most distinctive words for the negative artists in our graph.
This wordcloud shows what we suspected: hip hop aims to provoke. With the profanities, the racial slurs and the mentions of violence, the negative hip hop artists really try to grab the attention of the listeners using the obvious words to offend.
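For reference, tf-idf gives a high score to words that an artist uses often but that few other artists use. The sketch below uses one common variant (relative term frequency times the logarithm of the inverse document frequency) on made-up word lists. Which variant was used for the wordclouds is not stated here, so treat the formula as an assumption.

```python
import math
from collections import Counter

# Hypothetical tokenised lyrics, one word list per artist.
docs = {
    "Artist X": ["money", "love", "love", "street"],
    "Artist Y": ["money", "party", "party"],
    "Artist Z": ["money", "street", "crown"],
}

def tf_idf(docs):
    """Return {artist: {word: tf * idf}} with idf = log(N / document frequency)."""
    n_docs = len(docs)
    # Number of artists that use each word at least once.
    doc_freq = Counter(word for words in docs.values() for word in set(words))
    scores = {}
    for artist, words in docs.items():
        counts = Counter(words)
        scores[artist] = {
            word: (count / len(words)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
    return scores

scores = tf_idf(docs)
top_word = max(scores["Artist X"], key=scores["Artist X"].get)  # "love"
```

A word that every artist uses, like "money" in this toy example, gets a score of zero and never shows up as distinctive.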

Negative wordcloud



Positive wordcloud

This wordcloud contains the top tf-idf words for the artists with the highest sentiment values, meaning the most distinctive words for the happiest artists in our graph.
It is clear that this wordcloud is far more positive than the one for the negative words. However, with words like "stripper", "bunnies" and "kitty" it is also clear that rappers will be rappers.

Positive wordcloud



It is worth mentioning that only 36 out of 1062 artists were in either the positive or negative spectrum based on the low and high cutoffs in the sentiment distribution. These cutoffs are made by taking the mean sentiment value and adding or subtracting two times the standard deviation.
This means that most artists have a fairly neutral vocabulary on average, including the top 10 artists listed in the previous section, so going for extremes won't necessarily result in popularity.
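The cutoff rule can be sketched as follows. The sentiment values are invented, and the sketch uses the population standard deviation. Whether the population or the sample version was used for the real cutoffs is not stated, so that choice is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical mean sentiment per artist.
sentiment_of = {"A": 5.4, "B": 5.3, "C": 5.5, "D": 5.4,
                "E": 5.3, "F": 5.5, "G": 5.4, "H": 7.0}

def extreme_artists(sentiment_of, n_std=2.0):
    """Split out artists below mean - n_std*sd or above mean + n_std*sd."""
    values = list(sentiment_of.values())
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - n_std * sigma, mu + n_std * sigma
    negative = [a for a, s in sentiment_of.items() if s < low]
    positive = [a for a, s in sentiment_of.items() if s > high]
    return negative, positive

negative, positive = extreme_artists(sentiment_of)
```

In this toy data only "H" ends up outside the cutoffs, on the positive side, and everyone else counts as neutral.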

Us




William With Hartmann

s153081


Marcus Daly

s181803


Michelle Lind Østrup

s143818