What is XML?
The Extensible Markup Language (XML) is a markup language much like HTML or SGML. It is a World Wide Web Consortium (W3C) Recommendation and is available as an open standard.
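Today I will show you how to parse an OPML file with Python 2.7 and the xml.etree.ElementTree module from the standard library. OPML is an XML-based outline format that feed readers commonly use to export lists of RSS subscriptions.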
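To make the structure concrete, here is a minimal sketch of what an OPML file can look like and how ElementTree sees it. The sample document, its feed names, and the example.com URLs are made up for illustration; a real export from a feed reader will have more attributes. The `__future__` import only keeps the `print` call working the same way on Python 2.7 and Python 3.

```python
from __future__ import print_function
from xml.etree import ElementTree

# Hypothetical sample data: one folder ("News") containing two feeds.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>My feeds</title></head>
  <body>
    <outline text="News">
      <outline type="rss" text="Example A" xmlUrl="http://example.com/a.xml"/>
      <outline type="rss" text="Example B" xmlUrl="http://example.com/b.xml"/>
    </outline>
  </body>
</opml>"""

# Parse the string into an element tree; root is the <opml> element.
root = ElementTree.fromstring(SAMPLE_OPML)

# './/outline' matches <outline> elements at any depth below the root,
# so both the folder and the feeds nested inside it are found.
outlines = root.findall('.//outline')

# Only feed entries carry an xmlUrl attribute; the folder does not.
feed_count = len([n for n in outlines if n.attrib.get('xmlUrl')])

print(len(outlines), feed_count)
```

The folder is itself an outline element, which is why the script has to check for the xmlUrl attribute instead of assuming every outline is a feed.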
This is the source script:
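from xml.etree import ElementTree
import sys

# Path to the OPML file, given as the first command-line argument.
file_opml = sys.argv[1]

def extract_rss_urls_from_opml(filename):
    urls = []
    with open(filename, 'rt') as f:
        tree = ElementTree.parse(f)
    # './/outline' matches <outline> elements at any nesting depth.
    for node in tree.findall('.//outline'):
        # Folders have no xmlUrl attribute, so skip them.
        url = node.attrib.get('xmlUrl')
        if url:
            urls.append(url)
    return urls

urls = extract_rss_urls_from_opml(file_opml)
print urls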
The result is a list with all the RSS feed URLs found in your OPML file.
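If you also want the feed names, the same approach can be extended. The following is a sketch, not part of the original script: the function name extract_feeds, the sample data, and the fallback from the title attribute to the text attribute are my own assumptions. It is written so that it runs on both Python 2.7 and Python 3, and it writes the sample to a temporary file only so that the example is self-contained.

```python
from __future__ import print_function
import os
import tempfile
from xml.etree import ElementTree

def extract_feeds(filename):
    """Return (title, url) pairs for every outline that has an xmlUrl."""
    # ElementTree.parse accepts a file name as well as an open file object.
    tree = ElementTree.parse(filename)
    feeds = []
    for node in tree.findall('.//outline'):
        url = node.attrib.get('xmlUrl')
        if url:
            # Assumption: prefer 'title', fall back to 'text', else empty.
            title = node.attrib.get('title') or node.attrib.get('text') or ''
            feeds.append((title, url))
    return feeds

# Hypothetical sample data written to a temporary file for the demo.
sample = (
    '<opml version="2.0"><body>'
    '<outline text="News">'
    '<outline type="rss" text="Example A" xmlUrl="http://example.com/a.xml"/>'
    '<outline type="rss" text="Example B" xmlUrl="http://example.com/b.xml"/>'
    '</outline>'
    '</body></opml>'
)
fd, path = tempfile.mkstemp(suffix='.opml')
with os.fdopen(fd, 'w') as f:
    f.write(sample)

try:
    feeds = extract_feeds(path)
finally:
    os.remove(path)  # clean up the temporary file

for title, url in feeds:
    print(title, url)
```

With your own file, you would call extract_feeds with the path from sys.argv[1] instead of the temporary file.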