python-catalin: feed


Friday, February 14, 2014

Parsing feeds - get by attribute and value - part 2

Most developers use REST services or other data feeds that exchange data as XML.
This is a simple script that reads an XML file from the web.
I used minidom, but you can also use ElementTree or cElementTree from the xml.etree package.
I don't know whether ElementTree or cElementTree is faster than minidom.
The script uses urllib2 to open the file.
The file lists the reference exchange rate of each currency against the RON.
The main goal of this script is to show how to deal with attributes and values in XML files.
You can also see the first part of this series.
The XML file also has some attributes, such as currency.
Basically, it looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<DataSet xsi:schemaLocation="http://www.bnr.ro/xsd nbrfxrates.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.bnr.ro/xsd">
  <Header>
    <Publisher>National Bank of Romania</Publisher>
    <PublishingDate>2014-02-14</PublishingDate>
    <MessageType>DR</MessageType>
  </Header>
  <Body>
    <Subject>Reference rates</Subject>
    <OrigCurrency>RON</OrigCurrency>
    <Cube date="2014-02-14">
      <Rate currency="AED">0.8909</Rate>
      <Rate currency="AUD">2.9529</Rate>
      <Rate currency="BGN">2.2913</Rate>
...

Now let's see the script:

from xml.dom import minidom as dom
import urllib2

def fetchPage(url):
    # Download the feed and return its content as one string.
    a = urllib2.urlopen(url)
    return a.read()

def extract(page):
    a = dom.parseString(page)

    # Every exchange rate is stored in a Rate element.
    item = a.getElementsByTagName('Rate')

    for i in item:
        if i.hasChildNodes():
            # The currency code is an attribute; the rate is the text value.
            print i.getAttribute('currency') + "-" + i.firstChild.nodeValue

if __name__ == '__main__':
    page = fetchPage("http://www.bnro.ro/nbrfxrates.xml")
    extract(page)

The output is:

AED-0.8909
AUD-2.9529
BGN-2.2913
BRL-1.3665
CAD-2.9879
CHF-3.6655
CNY-0.5394
CZK-0.1636
DKK-0.6005
EGP-0.4701
EUR-4.4813
GBP-5.4630
HUF-1.4517
INR-0.0527
JPY-3.2148
KRW-0.3078
MDL-0.2434
MXN-0.2467
NOK-0.5365
NZD-2.7388
PLN-1.0786
RSD-0.0387
RUB-0.0932
SEK-0.5074
TRY-1.4950
UAH-0.3865
USD-3.2721
XAU-137.6798
XDR-5.0505
ZAR-0.2981
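
As mentioned above, ElementTree can be used instead of minidom. Below is a minimal sketch of the same idea with xml.etree.ElementTree. It is written for Python 3 (the script above is Python 2), it parses a shortened inline sample instead of downloading the feed, and the names SAMPLE, NS, extract_rates and find_rate are made up for this illustration. Because the feed declares a default namespace, ElementTree needs that namespace spelled out in every search.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the feed shown above (same namespace, three rates).
SAMPLE = """<DataSet xmlns="http://www.bnr.ro/xsd">
  <Body>
    <Cube date="2014-02-14">
      <Rate currency="AED">0.8909</Rate>
      <Rate currency="AUD">2.9529</Rate>
      <Rate currency="BGN">2.2913</Rate>
    </Cube>
  </Body>
</DataSet>"""

# Prefix chosen for this example; it maps to the feed's default namespace.
NS = {"bnr": "http://www.bnr.ro/xsd"}


def extract_rates(xml_text):
    """Return a list of (currency, value) pairs, one per Rate element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for rate in root.findall(".//bnr:Rate", NS):
        # .get() reads an attribute; .text holds the element's text value.
        pairs.append((rate.get("currency"), rate.text))
    return pairs


def find_rate(xml_text, currency):
    """Return the value of the Rate with the given currency, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(".//bnr:Rate[@currency='%s']" % currency, NS)
    return node.text if node is not None else None


if __name__ == "__main__":
    for currency, value in extract_rates(SAMPLE):
        print(currency + "-" + value)
    print(find_rate(SAMPLE, "AUD"))
```

With the inline sample this prints the three lines AED-0.8909, AUD-2.9529 and BGN-2.2913, followed by 2.9529 for the AUD lookup. I have not measured whether this is faster than minidom.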

Thursday, February 3, 2011

Read feed from sites.

This is a simple example of reading a feed.
I use two functions: the first reads the URL and the second extracts the data.
This is the source code:


from xml.dom import minidom as dom
import urllib

def fetchPage(url):
    # Download the feed and return its content as one string.
    a = urllib.urlopen(url)
    return a.read()


def extract(page):
    a = dom.parseString(page)
    # The SendingDate element holds the date of the feed.
    item2 = a.getElementsByTagName('SendingDate')[0].firstChild.wholeText
    print "DATA ",item2
    item = a.getElementsByTagName('Cube')
    for i in item:
        if i.hasChildNodes():
            # Rates are picked by position, so this depends on the feed's order.
            e = i.getElementsByTagName('Rate')[10].firstChild.wholeText
            d = i.getElementsByTagName('Rate')[26].firstChild.wholeText
            print "EURO  ",e
            print "DOLAR ",d

if __name__=='__main__':
    page = fetchPage("http://www.bnro.ro/nbrfxrates.xml")
    extract(page)

The result is:


DATA  2011-02-03
EURO   4.2609
DOLAR  3.0921
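
The script reads the euro and the dollar by position ([10] and [26]), which only works while the feed keeps its rates in the same order. One possible alternative is to look up each rate by its currency attribute. The sketch below assumes Python 3 rather than the Python 2 used in the script, the helper name rates_by_currency is hypothetical, and the inline SAMPLE contains only the two values from the result above.

```python
from xml.dom import minidom

# Tiny stand-in for the feed: only the two rates printed in the result above.
SAMPLE = """<DataSet xmlns="http://www.bnr.ro/xsd">
  <Body>
    <Cube date="2011-02-03">
      <Rate currency="EUR">4.2609</Rate>
      <Rate currency="USD">3.0921</Rate>
    </Cube>
  </Body>
</DataSet>"""


def rates_by_currency(xml_text):
    """Map each currency attribute to the text value of its Rate element."""
    doc = minidom.parseString(xml_text)
    rates = {}
    for node in doc.getElementsByTagName("Rate"):
        # Skip empty Rate elements, which have no text node to read.
        if node.hasChildNodes():
            rates[node.getAttribute("currency")] = node.firstChild.nodeValue
    return rates


if __name__ == "__main__":
    rates = rates_by_currency(SAMPLE)
    print("EURO ", rates.get("EUR"))
    print("DOLAR", rates.get("USD"))
```

Using rates.get() returns None instead of raising an error when a currency is missing from the feed.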

This is all...

Thursday, February 4, 2010

Parsing feeds - part 1

From time to time I use conky. It works well for me because I have everything I need on my desktop.
How did Python help me in this case?
For example, I use a script to parse a feed from this URL:
"http://www.bnro.ro/nbrfxrates.xml"
The example is simple to understand:

from xml.dom import minidom as dom
import urllib

def fetchPage(url):
    a = urllib.urlopen(url)
    return a.read()

def extract(webpage):
    a = dom.parseString(webpage)
    item2 = a.getElementsByTagName('SendingDate')[0].firstChild.wholeText
    print "DATA ",item2
    item = a.getElementsByTagName('Cube')
    for i in item:
        if i.hasChildNodes():
            eur = i.getElementsByTagName('Rate')[10].firstChild.wholeText
            dol = i.getElementsByTagName('Rate')[26].firstChild.wholeText
            print "EURO  ",eur
            print "DOLAR ",dol

if __name__=='__main__':
    webpage = fetchPage("http://www.bnro.ro/nbrfxrates.xml")
    extract(webpage)

The result is:

$python xmlparse.py
DATA  2010-02-04
EURO   4.1214
DOLAR  2.9749

I read the URL with the "urllib" package.
The result is parsed with functions from the "minidom" module, imported here as "dom".
I used the functions "parseString" and "getElementsByTagName".
You can read more about these functions at:
http://docs.python.org/library/xml.dom.minidom.html
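
The script above targets Python 2, where urllib.urlopen and the print statement exist. A rough Python 3 sketch of the same two steps (read the URL, then use parseString and getElementsByTagName) could look like this. The names fetch_page and first_text are chosen for this example only, and the 10 second timeout is an assumption, not something the original script uses.

```python
import urllib.request
from xml.dom import minidom


def fetch_page(url, timeout=10):
    """Download the resource at url and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def first_text(xml_data, tag):
    """Return the text of the first element named tag, or None if absent."""
    doc = minidom.parseString(xml_data)
    nodes = doc.getElementsByTagName(tag)
    # An empty NodeList or an element without children has no text to return.
    if not nodes or not nodes[0].hasChildNodes():
        return None
    return nodes[0].firstChild.nodeValue


# Usage (needs network access, so it is left commented out):
# page = fetch_page("http://www.bnro.ro/nbrfxrates.xml")
# print("DATA ", first_text(page, "SendingDate"))
```

parseString accepts the bytes returned by fetch_page directly, so no decoding step is needed before parsing.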
This is all.
