analitics

Monday, June 3, 2019

Python 3.7.3: Working with the wikipedia Python module.

The wikipedia package is a Python library that makes it easy to access and parse data from Wikipedia.
Let's install it:
C:\Python373\Scripts>pip install wikipedia
First, let's test the default example:
C:\Python373>python.exe
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import wikipedia
>>> print(wikipedia.summary("Wikipedia"))
...
Wikipedia has been criticized for exhibiting systemic bias, for presenting a
mixture of "truths, half truths, and some falsehoods", and for being subject to
manipulation and spin in controversial topics. But by 2017, Facebook announced
that it would help readers detect fake news by suggesting links to related
Wikipedia articles. YouTube announced a similar plan in 2018.
>>> wikipedia.search("Falticeni")
['Falticeni', 'Foresta Falticeni', 'Charles, Prince of Wales', 'Sofia Ionescu',
 'Buciumeni River (Șomuzul Mare)', 'Șomuzul Mare River', 'Constantin Schumacher',
 'Ionuț Atodiresei', '1967 Cupa României Final', 'J. J. Benjamin']
>>> wikipedia.page("Falticeni")
...
>>> city=wikipedia.page("Falticeni")
>>> city.title
...
>>> city.content
...
>>> wikipedia.set_lang("fr")
>>> page=wikipedia.page("Null")
>>> page.title
'Null' 
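Not every title resolves cleanly: the module exposes DisambiguationError and PageError for ambiguous and missing titles. The following is a minimal sketch, not part of the original session, of how a lookup could be wrapped. The helper name fetch_summary and its wiki parameter are assumptions made for illustration; wiki is expected to be the imported wikipedia module.

```python
def fetch_summary(title, wiki, sentences=2):
    """Return a short summary of `title`, or None if it cannot be resolved.

    `wiki` is the imported `wikipedia` module (hypothetical parameter,
    passed in so the helper can be tried without network access).
    """
    try:
        return wiki.summary(title, sentences=sentences)
    except wiki.DisambiguationError as err:
        # err.options holds the candidate titles for an ambiguous query.
        print("Ambiguous title, candidates:", err.options[:5])
    except wiki.PageError:
        print("No page found for", repr(title))
    return None


# Usage (needs network access):
# import wikipedia
# print(fetch_summary("Falticeni", wikipedia))
```

Passing the module in as an argument is only a convenience for testing; calling wikipedia.summary directly inside a try/except works the same way.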
You can extract the links found on a page:
>>> page = wikipedia.page("List_of_works_by_Leonardo_da_Vinci")
>>> print(page.links)
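page.links is a plain list of title strings and can be long, so it often helps to filter it. A small sketch follows; the helper name links_matching and the sample titles are made up for illustration and are not real output from the page above.

```python
def links_matching(links, keyword):
    """Return the link titles that contain `keyword`, ignoring case."""
    needle = keyword.casefold()
    return sorted(title for title in links if needle in title.casefold())


# Hand-written sample standing in for page.links (not real output).
sample_links = ["Mona Lisa", "Madonna Litta", "Vitruvian Man", "Benois Madonna"]
print(links_matching(sample_links, "madonna"))

# With the real module (needs network access):
# import wikipedia
# page = wikipedia.page("List_of_works_by_Leonardo_da_Vinci")
# print(links_matching(page.links, "madonna"))
```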
You can list everything the module exposes with dir() and test each of these:
>>> dir(wikipedia)
['API_URL', 'BeautifulSoup', 'Decimal', 'DisambiguationError', 'HTTPTimeoutError',
 'ODD_ERROR_MESSAGE', 'PageError', 'RATE_LIMIT', 'RATE_LIMIT_LAST_CALL',
 'RATE_LIMIT_MIN_WAIT', 'RedirectError', 'USER_AGENT', 'WikipediaException',
 'WikipediaPage', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__',
 '__name__', '__package__', '__path__', '__spec__', '__version__', 'cache', 'datetime',
 'debug', 'donate', 'exceptions', 'geosearch', 'languages', 'page', 'random', 're',
 'requests', 'search', 'set_lang', 'set_rate_limiting', 'set_user_agent',
 'stdout_encode', 'suggest', 'summary', 'sys', 'time', 'timedelta', 'unicode_literals',
 'util', 'wikipedia']
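The dir() output mixes the module's own functions with imported helpers and dunder attributes. A rough way to narrow it down is to keep only the public callables; the helper name public_callables is an assumption for this sketch, and it works on any module, so the json module from the standard library is used here to keep the example runnable offline.

```python
import json


def public_callables(module):
    """Return sorted names in `module` that are callable and not underscore-prefixed."""
    return sorted(
        name
        for name in dir(module)
        if not name.startswith("_") and callable(getattr(module, name))
    )


print(public_callables(json))

# With the module from this post:
# import wikipedia
# print(public_callables(wikipedia))
```

Classes are callable too, so for wikipedia the result would still include the exception classes and imported names such as BeautifulSoup, while plain constants like API_URL and submodules are dropped.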
Read more on the PyPI website.