Over One Million News Headlines
Context
This dataset contains news headlines published over a period of about 15 years.
It is sourced from the reputable Australian news organisation ABC (Australian Broadcasting Corporation).
Agency Site: http://www.abc.net.au
Content
Format: CSV; single file
publish_date: Date the article was published, in yyyyMMdd format
headline_text: Text of the headline in ASCII, English, lowercase
Start Date: 2003-02-19; End Date: 2017-12-31
Total Records: 1,103,663
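As a rough illustration of the format above, the following is a minimal sketch of reading the file with Python's standard library. It assumes the CSV has a header row naming the publish_date and headline_text columns; the function name read_headlines and the sample rows are made up for illustration and are not part of the dataset.

```python
import csv
import io
from datetime import datetime


def read_headlines(stream):
    """Yield (publish_date, headline_text) pairs from an open CSV stream.

    Assumes a header row naming the publish_date and headline_text columns.
    """
    for row in csv.DictReader(stream):
        # publish_date is stored as yyyyMMdd, e.g. 20030219
        published = datetime.strptime(row["publish_date"], "%Y%m%d").date()
        yield published, row["headline_text"]


# Illustrative sample rows (made up), standing in for the real file.
sample = io.StringIO(
    "publish_date,headline_text\n"
    "20030219,example headline one\n"
    "20171231,example headline two\n"
)
records = list(read_headlines(sample))
```

For the actual file, the same function should work on an open file handle, for example open(path, newline=""), where path is wherever the CSV was saved.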
Inspiration
I see this news dataset as a summarised historical record of noteworthy events around the globe from early 2003 to the end of 2017, with a more granular focus on Australia.
This includes the entire corpus of articles published on the ABC website in the given time range. With a volume of roughly 200 articles per day and a good focus on international news, we can be fairly certain that every event of significance has been captured here.
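The daily volume can be sanity-checked from the figures in the Content section. This is only a back-of-the-envelope sketch: it assumes headlines are spread over every calendar day between the start and end dates, counting both endpoints.

```python
from datetime import date

start = date(2003, 2, 19)
end = date(2017, 12, 31)
total_records = 1_103_663

# Count both the start and end dates as publishing days.
days = (end - start).days + 1
per_day = total_records / days
print(f"{days} days, {per_day:.1f} headlines per day")
```

The result lands slightly above 200 headlines per day, consistent with the figure quoted above.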
Digging into the keywords, one can see the important episodes that shaped this period and how they evolved over time, for example: the financial crisis, the Iraq war, multiple US elections, ecological disasters, terrorism, famous people, Australian crimes, etc.
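One simple way to follow a topic over time is to count the headlines that mention a keyword in each year. The sketch below is an assumption about how such an exploration might start, not a method from the dataset's author; the helper name keyword_counts_by_year and the sample headlines are invented for illustration.

```python
from collections import Counter


def keyword_counts_by_year(records, keyword):
    """Count headlines containing `keyword`, grouped by publication year.

    `records` is an iterable of (publish_date, headline_text) pairs, where
    publish_date is a yyyyMMdd string.
    """
    counts = Counter()
    needle = keyword.lower()  # headlines are lowercase in the dataset
    for publish_date, headline_text in records:
        if needle in headline_text:
            counts[int(publish_date[:4])] += 1
    return dict(sorted(counts.items()))


# Made-up headlines in the dataset's format, for illustration only.
sample = [
    ("20030320", "example iraq war headline"),
    ("20030321", "another iraq war headline"),
    ("20081001", "example financial crisis headline"),
    ("20090115", "unrelated headline"),
]
iraq = keyword_counts_by_year(sample, "iraq war")
```

This uses plain substring matching, so a short keyword such as "war" would also match "warning"; splitting headlines into words first would be a stricter alternative.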