Wiki Matched Obits

A quick summary of Wiki Matched Obits

Matching Obituaries with Wikipedia Entries

March 11 - Analysis My Takeaways

Matched vs. Unmatched Obituaries

  • Only a small sample of 783 people was used, because of an error in the code (ids after 783 raised a NoneType error). Within that sample, the majority of the obituaries did not match a Wikipedia entry.
    • The proportion of unmatched obituaries is about 0.725, and the proportion of matched obituaries is about 0.274, out of the sample of 783 obituaries.
  • After creating lists of the matched and unmatched obituaries from the sample, I calculated the gender distribution of each group and created bar plots to show the difference.
    • The gender distributions of the matched and unmatched obituaries are very similar, even though there is a much higher proportion of unmatched obituaries.
    • Of the unmatched obituaries, the proportion that is male is 0.8415, while the proportion that is female is 0.1585
    • Of the matched obituaries, the proportion that is male is 0.8837, while the proportion that is female is 0.1162
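
The proportions above could be computed along the following lines. This is only a minimal sketch: the `records` list is made-up toy data (not the real sample of 783), and the field names `gender` and `matched` and the helper `proportions` are assumptions for illustration, not the names used in the original analysis.

```python
from collections import Counter

# Hypothetical toy records (NOT the real sample): each obituary carries a
# gender label and a flag saying whether a Wikipedia entry was found.
records = [
    {"gender": "male", "matched": True},
    {"gender": "male", "matched": False},
    {"gender": "female", "matched": False},
    {"gender": "male", "matched": False},
]


def proportions(labels):
    """Return {label: share of total} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Share of matched vs. unmatched obituaries in the whole sample.
match_share = proportions(
    "matched" if r["matched"] else "unmatched" for r in records
)

# Gender distribution computed separately inside each group.
gender_share = {
    group: proportions(r["gender"] for r in records if r["matched"] is flag)
    for group, flag in (("matched", True), ("unmatched", False))
}

print(match_share)
print(gender_share)
```

Computing the gender shares within each group, rather than over the whole sample, is what makes the matched and unmatched distributions directly comparable even though the groups differ in size.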

Some ids are coming up as NoneType (does this mean that there are obituaries that have no ids?)

  • Could be a bug that needs to be worked out
  • The id 783 and the ids after it cause an error
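
One way to narrow down a bug like this is to check for `None` explicitly inside the loop and record the offending ids instead of letting the loop crash. The sketch below assumes the error comes from a lookup that returns `None` for some ids; that is a guess, not something these notes confirm. `fake_store`, `get_obit`, and `collect` are hypothetical stand-ins, not the real code.

```python
# Hypothetical stand-in for the real data source: maps an obituary id to
# its record. Id 783 is deliberately absent to imitate the failure.
fake_store = {781: {"name": "A"}, 782: {"name": "B"}, 784: {"name": "D"}}


def get_obit(obit_id):
    """Return the record for obit_id, or None when nothing is found."""
    return fake_store.get(obit_id)


def collect(id_list):
    """Split ids into usable records and ids whose lookup returned None."""
    found, missing = {}, []
    for obit_id in id_list:
        obit = get_obit(obit_id)
        if obit is None:
            missing.append(obit_id)  # remember the id instead of crashing
            continue
        found[obit_id] = obit["name"]
    return found, missing


found, missing = collect([781, 782, 783, 784])
print("ids with no record:", missing)
```

Inspecting the `missing` list afterwards would show whether the problem is limited to a few ids or affects everything past a certain point.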

April 7th - Analysis My Takeaways

Update:

  • I was able to work out what the bug was, so I could properly check the proportions of matched vs. unmatched Wikipedia articles.
  • Before, I was using a for loop over the full list of ids.
  • That method of looping over the id_list of 60,000+ obituaries was not effective and would crash at around id 783 for some reason. So I developed a different loop, which checks the proportion of obituaries matched/unmatched to Wikipedia articles much faster.
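
These notes do not spell out the replacement loop, so the following is only one plausible shape for it, not the method actually used. It assumes the matched ids are available as a collection, converts them to a set so each membership test is fast, and skips `None` ids. The function name `match_proportions` and both parameter names are hypothetical.

```python
def match_proportions(id_list, matched_ids):
    """Return (matched share, unmatched share) over the non-None ids."""
    matched_ids = set(matched_ids)  # set lookup is O(1) on average
    usable = [i for i in id_list if i is not None]  # drop missing ids
    if not usable:
        return 0.0, 0.0
    n_matched = sum(1 for i in usable if i in matched_ids)
    p_matched = n_matched / len(usable)
    return p_matched, 1 - p_matched


# Toy ids for illustration only (not the real id_list).
p_matched, p_unmatched = match_proportions([1, 2, None, 3, 4], [2, 4, 9])
print(p_matched, p_unmatched)
```

A single pass with set membership avoids any per-id lookup that might fail, which is one reason a loop of this kind would not stall partway through a long id list.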


Of the 60,000+ obituaries, the majority are unmatched

  • The proportion of obituarized people who had a matching Wikipedia article was about 0.3824, while the proportion who did not was about 0.6176.

  • This is interesting, since one would think that someone well known enough to be obituarized in the New York Times would also have a Wikipedia article written about them.

  • This leads to the question: are these proportions correct? This is something that could be explored further.