I would like to automatically download the Morning Edition podcast every day. I do not own any Apple products. I downloaded and installed flareget, but cannot figure out how to make it do this. I am not locked into that tool. I am a long-time Firefox user, but am currently test driving Chrome.
The URL for the program is: http://www.npr.org/programs/morning-edition/
The RSS address is: http://www.npr.org/rss/rss.php?id=3
The trouble is that the RSS includes a link to a web page for the individual story instead of a link to the mp3.
<rss xmlns:npr="http://www.npr.org/rss/" xmlns:nprml="http://api.npr.org/nprml" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>
Morning Edition : NPR
</title>
<link>
http://www.npr.org/templates/story/story.php?storyId=3
</link>
<description>
Morning Edition gives its audience news, analysis, commentary, and coverage of arts and sports. Stories are told through conversation as well as full reports. It's up-to-the-minute news that prepares listeners for the day ahead.
</description>
<language>en</language>
<copyright>Copyright 2015 NPR - For Personal Use Only</copyright>
<generator>NPR API RSS Generator 0.94</generator>
<lastBuildDate>Fri, 06 Nov 2015 12:45:00 -0500</lastBuildDate>
<image>
<url>http://media.npr.org/images/podcasts/primary/npr_generic_image_300.jpg?s=200</url>
<title>Morning Edition</title>
<link>http://www.npr.org/templates/story/story.php?storyId=3</link>
</image>
<item>
<title>Russian Airliner Crash Update</title>
<description>
The latest information on the Russian airliner that crashed in Egypt. All 224 people on board were killed.
</description>
<pubDate>Fri, 06 Nov 2015 12:45:00 -0500</pubDate>
<link>
http://www.npr.org/2015/11/06/455019224/russian-airliner-crash-update?utm_medium=RSS&utm_campaign=morningedition
</link>
<guid>
http://www.npr.org/2015/11/06/455019224/russian-airliner-crash-update?utm_medium=RSS&utm_campaign=morningedition
</guid>
<content:encoded>
<![CDATA[
<p>The latest information on the Russian airliner that crashed in Egypt. All 224 people on board were killed.</p>
]]>
</content:encoded>
<dc:creator>Corey Flintoff</dc:creator>
</item>
...
When I open http://www.npr.org/2015/11/06/455019224/russian-airliner-crash-update?utm_medium=RSS&utm_campaign=morningedition
in my browser, there is a link on the page to the mp3 file for the story: http://pd.npr.org/anon.npr-mp3/npr/me/2015/11/20151106_me_egypt_plane_crash_probe_russia.mp3?dl=1
I can see that there is an easily identifiable pattern that I could use, but cannot figure out what tools to use or how to make them do what I want.
Every story's audio file starts with:
http://pd.npr.org/anon.npr-mp3/npr/me/
then add a folder for the year
http://pd.npr.org/anon.npr-mp3/npr/me/2015
and one for the month
http://pd.npr.org/anon.npr-mp3/npr/me/2015/11
all of the mp3s for today's show are named
yyyymmdd_me*.mp3
The trailing ?dl=1 does not seem to be necessary.
You will need to write a web crawler that traverses the site until it finds the .mp3 URLs you want, and then downloads exactly those URLs.
For perl, the obvious solution is to use the libwww-perl package (aka LWP).
For python, I recommend either the mechanize or scrapy python libraries.
Both of these python libraries are packaged for Debian and Ubuntu as python-mechanize and python-scrapy, so install the packages (and don't follow the pip install or whatever instructions on the web sites).
There are similar libraries for other languages.
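As a rough illustration of the crawl described above, here is a minimal sketch that uses only the Python standard library rather than mechanize or scrapy, so it is a swapped-in approach and not the one those libraries would give you. It assumes the mp3 link appears in the raw HTML of each story page and follows the naming pattern from the question; the function names (story_links, mp3_urls, fetch, download_all) and the destination folder are made up for the example.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

RSS_URL = "http://www.npr.org/rss/rss.php?id=3"  # feed address from the question

# Pattern from the question: .../npr/me/<yyyy>/<mm>/<yyyymmdd>_me<...>.mp3
# The trailing ?dl=1 is deliberately left out of the match.
MP3_RE = re.compile(
    r"https?://pd\.npr\.org/anon\.npr-mp3/npr/me/\d{4}/\d{2}/\d{8}_me[\w-]*\.mp3"
)


def story_links(rss_data):
    """Return the story page URLs found in the <item> elements of the feed."""
    root = ET.fromstring(rss_data)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        if link:
            links.append(link)
    return links


def mp3_urls(page_html):
    """Return the unique mp3 URLs in a story page, in order of appearance."""
    found = []
    for url in MP3_RE.findall(page_html):
        if url not in found:
            found.append(url)
    return found


def fetch(url):
    """Download a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def download_all(dest_dir="morning-edition"):
    """Walk the feed, visit each story page, and save any mp3 not yet on disk.

    dest_dir is a hypothetical folder name chosen for this sketch.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for link in story_links(fetch(RSS_URL)):
        # Assumption: story pages can be decoded as UTF-8.
        html = fetch(link).decode("utf-8", errors="replace")
        for url in mp3_urls(html):
            target = dest / url.rsplit("/", 1)[-1]
            if not target.exists():
                urllib.request.urlretrieve(url, str(target))
```

Calling download_all() from a daily cron job would approximate the "every day" part of the question. Because files that already exist are skipped, running it more than once a day should be harmless.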