Wiki Matched Obits
A quick summary of Wiki Matched Obits
Matching Obituaries with Wikipedia Entries
March 11 - Analysis: My Takeaways
Matched vs. Unmatched Obituaries
- Only a small sample of 783 people was taken because of errors in the code (ids after 783 raised a NoneType error). Within that sample, I found that a majority of the obituaries did not match a Wikipedia entry
- In the sample of 783 obituaries, the proportion of unmatched obituaries is about 0.725, while the proportion of matched obituaries is about 0.274.
- After creating lists of the matched and unmatched obituaries from the sample, I calculated the gender distribution of each group and created bar plots to show the difference
- What I found was that the gender distributions of the matched and unmatched obituaries were very similar, even though there is a much higher proportion of unmatched obituaries
- Of the unmatched obituaries, the proportion that is male is 0.8415, while the proportion that is female is 0.1585
- Of the matched obituaries, the proportion that is male is 0.8837, while the proportion that is female is 0.1162
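The proportions above can be computed along these lines. This is only a minimal sketch: the record layout, the field names "matched" and "gender", and the helper proportions are assumptions for illustration, and the four records are toy data, not the project's sample.

```python
from collections import Counter

# Toy records for illustration only; the field names "matched" and "gender"
# are assumptions, not the project's actual schema.
records = [
    {"id": 1, "matched": True, "gender": "male"},
    {"id": 2, "matched": False, "gender": "male"},
    {"id": 3, "matched": False, "gender": "female"},
    {"id": 4, "matched": False, "gender": "male"},
]


def proportions(values):
    """Return each distinct value's share of the total as a dict."""
    values = list(values)
    total = len(values)
    if total == 0:
        return {}
    counts = Counter(values)
    return {key: count / total for key, count in counts.items()}


# Share of matched vs. unmatched obituaries in the whole sample.
match_share = proportions(r["matched"] for r in records)

# Gender distribution computed separately within each group.
gender_share_by_group = {
    matched: proportions(r["gender"] for r in records if r["matched"] == matched)
    for matched in (True, False)
}

print(match_share)
print(gender_share_by_group)
```

Computing the gender shares within each group, rather than over the whole sample, is what makes the matched and unmatched distributions directly comparable even though the groups differ in size.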
Some ids are coming up as NoneType. Does this mean that there are obituaries that have no ids?
- Could be a bug that needs to be worked out
- Id 783 and the ids after it cause an error
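One way to keep a loop like this from crashing while the bug is investigated is to check for None explicitly and record which ids were skipped. The sketch below is an assumption about the shape of the problem, not the project's actual code: find_match, classify, and the dictionary index are hypothetical stand-ins, and the entries are toy data.

```python
def find_match(obit_id, index):
    """Hypothetical lookup: return the matching entry, or None if there is none."""
    return index.get(obit_id)


def classify(id_list, index):
    """Split ids into matched, unmatched, and skipped (missing id) lists."""
    matched, unmatched, skipped = [], [], []
    for obit_id in id_list:
        # A missing id is recorded instead of being passed on to the lookup.
        if obit_id is None:
            skipped.append(obit_id)
            continue
        result = find_match(obit_id, index)
        # Test for None before using the result, so a missing match
        # cannot raise an error further down.
        if result is None:
            unmatched.append(obit_id)
        else:
            matched.append(obit_id)
    return matched, unmatched, skipped


# Toy index and ids, for illustration only.
index = {781: {"title": "Example A"}, 783: {"title": "Example B"}}
matched, unmatched, skipped = classify([781, 782, None, 783], index)
print(len(matched), len(unmatched), len(skipped))
```

Keeping a separate skipped list makes it possible to count how many obituaries have no id at all, which would answer the question above directly.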
April 7th - Analysis: My Takeaways
Update:
- I was able to work out what the bug was and why the code was not working, so I could properly check the proportions of matched vs. unmatched Wikipedia articles.
- Before, I was using a for loop over the id_list of 60,000+ obituaries. That method was not effective and would crash at around id 783 for some reason, so I developed a different loop that checks the proportion of matched/unmatched obituaries to Wikipedia articles much faster.
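These notes do not show the revised loop itself. One common way to make this kind of check fast, offered here only as a sketch under that caveat, is to put the matched ids in a set and count membership in a single pass. The function match_proportions and its arguments are hypothetical names, and the ids are toy data.

```python
def match_proportions(obit_ids, matched_ids):
    """Return (matched, unmatched) proportions over the non-missing ids."""
    matched_set = set(matched_ids)  # set membership tests are O(1) on average
    valid = [i for i in obit_ids if i is not None]  # drop missing ids up front
    if not valid:
        return 0.0, 0.0
    n_matched = sum(1 for i in valid if i in matched_set)
    p_matched = n_matched / len(valid)
    return p_matched, 1 - p_matched


# Toy ids, for illustration only.
p_matched, p_unmatched = match_proportions([1, 2, 3, 4, None], [2, 4, 9])
print(p_matched, p_unmatched)
```

Filtering out None before counting means a missing id cannot stop the run partway through, and the set lookup keeps the pass over 60,000+ ids linear in the number of obituaries.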
Of the 60,000+ obituaries, the majority are unmatched
- The proportion of obituarized people who had a matching Wikipedia article was about 0.3824, while the proportion who did not was about 0.6176.
- This is interesting, since one would expect that someone well known enough to be obituarized in the New York Times would also have a Wikipedia article written about them.
- This leads to the question: are these proportions correct? This is something that could be explored further.