python-catalin

Friday, July 5, 2019

Python 3.7.3 : Use python with MySQL and MariaDB.

If you want to use MariaDB databases with python then you need to install the MySQL.
Use the pip tool to install the mysql-connector-python python module:

C:\Python373\Scripts>pip install mysql-connector-python
Collecting mysql-connector-python
...
Installing collected packages: mysql-connector-python
Successfully installed mysql-connector-python-8.0.16

The information about python module named mysql-connector-python version 8.0.16 can be found at this web page.
For the test, I install MariaDB from the official website with user root and my password.
Let's test the python module for MariaDB:

C:\Python373>python.exe
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Inte
l)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mysql
>>> import mysql.connector as mariadb

After install, start the command prompt of MariaDB and test it:

C:\Windows\System32>mysql -u root -p
Enter password: ****
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.4.6-MariaDB mariadb.org binary distribution

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.001 sec)
MariaDB [information_schema]> USE information_schema;
Database changed
MariaDB [information_schema]> SHOW TABLES;
...
| INNODB_TABLESPACES_SCRUBBING          |
| INNODB_SYS_SEMAPHORE_WAITS            |
+---------------------------------------+
77 rows in set (0.001 sec)

C:\Python373>python.exe
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Inte
l)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mysql.connector as mariadb
>>> mariadb_connection = mariadb.connect(user='root', password='test', database=
'information_schema')
>>> cursor = mariadb_connection.cursor()
>>> cursor.execute("USE information_schema")
>>> cursor.execute("SHOW TABLES")
>>> result = cursor.fetchall()
>>> print(result)
[('ALL_PLUGINS',), ('APPLICABLE_ROLES',), ('CHARACTER_SETS',), ('CHECK_CONSTRAIN
TS',), ('COLLATIONS',), ('COLLATION_CHARACTER_SET_APPLICABILITY',), ('COLUMNS',)
, ('COLUMN_PRIVILEGES',), ('ENABLED_ROLES',), ('ENGINES',), ('EVENTS',), ('FILES
',), ('GLOBAL_STATUS',), ('GLOBAL_VARIABLES',), ('KEY_CACHES',), ('KEY_COLUMN_US
AGE',), ('OPTIMIZER_TRACE',), ('PARAMETERS',), ('PARTITIONS',), ('PLUGINS',), ('
PROCESSLIST',), ('PROFILING',), ('REFERENTIAL_CONSTRAINTS',), ('ROUTINES',), ('S
CHEMATA',), ('SCHEMA_PRIVILEGES',), ('SESSION_STATUS',), ('SESSION_VARIABLES',),
 ('STATISTICS',), ('SYSTEM_VARIABLES',), ('TABLES',), ('TABLESPACES',), ('TABLE_
CONSTRAINTS',), ('TABLE_PRIVILEGES',), ('TRIGGERS',), ('USER_PRIVILEGES',), ('VI
EWS',), ('GEOMETRY_COLUMNS',), ('SPATIAL_REF_SYS',), ('CLIENT_STATISTICS',), ('I
NDEX_STATISTICS',), ('INNODB_SYS_DATAFILES',), ('USER_STATISTICS',), ('INNODB_SY
S_TABLESTATS',), ('INNODB_LOCKS',), ('INNODB_MUTEXES',), ('INNODB_CMPMEM',), ('I
NNODB_CMP_PER_INDEX',), ('INNODB_CMP',), ('INNODB_FT_DELETED',), ('INNODB_CMP_RE
SET',), ('INNODB_LOCK_WAITS',), ('TABLE_STATISTICS',), ('INNODB_TABLESPACES_ENCR
YPTION',), ('INNODB_BUFFER_PAGE_LRU',), ('INNODB_SYS_FIELDS',), ('INNODB_CMPMEM_
RESET',), ('INNODB_SYS_COLUMNS',), ('INNODB_FT_INDEX_TABLE',), ('INNODB_CMP_PER_
INDEX_RESET',), ('user_variables',), ('INNODB_FT_INDEX_CACHE',), ('INNODB_SYS_FO
REIGN_COLS',), ('INNODB_FT_BEING_DELETED',), ('INNODB_BUFFER_POOL_STATS',), ('IN
NODB_TRX',), ('INNODB_SYS_FOREIGN',), ('INNODB_SYS_TABLES',), ('INNODB_FT_DEFAUL
T_STOPWORD',), ('INNODB_FT_CONFIG',), ('INNODB_BUFFER_PAGE',), ('INNODB_SYS_TABL
ESPACES',), ('INNODB_METRICS',), ('INNODB_SYS_INDEXES',), ('INNODB_SYS_VIRTUAL',
), ('INNODB_TABLESPACES_SCRUBBING',), ('INNODB_SYS_SEMAPHORE_WAITS',)]

Thursday, July 4, 2019

Python 3.7.3 : Using itemgetter from operator python module.

The operator module exports a set of efficient functions corresponding to the intrinsic operators of Python. see documentation.
Today I will show you how to use itemgetter from this python module with python 3.7.3.
Let's see how to sort my two dictionaries named my_dict and my_dict2, using the classical lambda function;

C:\Python373>python.exe
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Inte
l)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> my_dict = {'a': 11, 'b': 12, 'c': 13, 'd': 14}
>>> sorted(my_dict.items(), key = lambda x:x[1])
[('a', 11), ('b', 12), ('c', 13), ('d', 14)]
>>> my_dict2 = {'a': 31, 'b': 12, 'c': 33, 'd': 14}
>>> sorted(my_dict2.items(), key = lambda x:x[1])
[('b', 12), ('d', 14), ('a', 31), ('c', 33)]

Using the operator python module with my examples for both dictionaries:

>>> import operator
>>> sorted(my_dict.items(), key = operator.itemgetter(1))
[('a', 11), ('b', 12), ('c', 13), ('d', 14)]
>>> sorted(my_dict2.items(), key = operator.itemgetter(1))
[('b', 12), ('d', 14), ('a', 31), ('c', 33)]
>>> sorted(my_dict2.items(), key = operator.itemgetter(1), reverse=True)
[('c', 33), ('a', 31), ('d', 14), ('b', 12)]

The operator.itemgetter returns a callable object that fetches the item from its operand using the operand’s __getitem__() method.
If multiple items are specified, returns a tuple of lookup values.
Let's see another example:

>>> my_arr = []
>>> my_arr.append(["A","Z",77])
>>> my_arr.append(["bA","Zc",11])
>>> my_arr.append(["d","c",111])
>>> print(my_arr)
[['A', 'Z', 77], ['bA', 'Zc', 11], ['d', 'c', 111]]
>>> my_arr.sort(key=operator.itemgetter(1))
>>> my_arr
[['A', 'Z', 77], ['bA', 'Zc', 11], ['d', 'c', 111]]
>>> my_arr.sort(key=operator.itemgetter(1,2))
>>> my_arr
[['A', 'Z', 77], ['bA', 'Zc', 11], ['d', 'c', 111]]
>>> my_arr.sort(key=operator.itemgetter(2))
>>> my_arr
[['bA', 'Zc', 11], ['A', 'Z', 77], ['d', 'c', 111]]
>>> my_arr.sort(key=operator.itemgetter(2,1))
>>> my_arr
[['bA', 'Zc', 11], ['A', 'Z', 77], ['d', 'c', 111]]

After using the sort with itemgetter the array is changed it:


>>> i0 = operator.itemgetter(0)
>>> i0(my_arr)
['bA', 'Zc', 11]>>> i1 = operator.itemgetter(1)
>>> i1(my_arr)
['A', 'Z', 77]
>>> i2 = operator.itemgetter(2)
>>> i2(my_arr)
['d', 'c', 111]
>>> print(my_arr)
[['bA', 'Zc', 11], ['A', 'Z', 77], ['d', 'c', 111]]

Monday, July 1, 2019

Python 3.7.3 : Using the Pony python module.

The development team of the Pony project comes with this intro:
Using Pony object-relational mapper you can concentrate on writing the business logic of your application and use Python syntax for interacting with the database. Pony translates such queries into SQL and executes them in the database in the most efficient way.
The Pony python module can be installed on Python 2.7 or Python 3.
Today I tested with the Python 3.7.3 version:

C:\Python373>cd Scripts
C:\Python373\Scripts>pip install pony
Collecting pony
...
Successfully built pony
Installing collected packages: pony
Successfully installed pony-0.7.10

Let's start with one simple example:

C:\Python373>python.exe
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Inte
l)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pony
>>> from pony import orm
>>> from pony.orm import *
>>> mydb = Database()
>>> class Person(mydb.Entity):
...     name = Required(str)
...     surname = Required(str)
...     age = Required(int)
...     house = Set('Home')
...
>>> class Home(mydb.Entity):
...     owner = Required(Person)
...     address = Required(str)
...     floors = Required(int)
...     perimeter = Required(int)
...
>>> show(Person)
class Person(Entity):
    id = PrimaryKey(int, auto=True)
    name = Required(str)
    surname = Required(str)
    age = Required(int)
    house = Set(Home)
>>> show(Home)
class Home(Entity):
    id = PrimaryKey(int, auto=True)
    owner = Required(Person)
    address = Required(str)
    floors = Required(int)
    perimeter = Required(int)

Let's see the relationships of this example.
The house variable is set to the type named Home.
The owner variable has type Person.
The Pony supports 4 database types: SQLite, MySQL, PostgreSQL, and Oracle.
Use absolute path, see the error generated by bind function when saving it to the database file:

>>> mydb.bind(provider='sqlite', filename='err_my.sqlite', create_db=True)
...
ValueError: When in interactive mode, please provide absolute file path. Got: 'err_my.sqlite'

The correct way to bind the database is:

mydb.bind(provider='sqlite', filename='C:\\Python373\\mydb.sqlite', create_db=True)

Using the Pony python module with databases is simple to use, see the documentation webpage:

# SQLite
db.bind(provider='sqlite', filename=':memory:')
# or
db.bind(provider='sqlite', filename='database.sqlite', create_db=True)
# PostgreSQL
db.bind(provider='postgres', user='', password='', host='', database='')
# MySQL
db.bind(provider='mysql', host='', user='', passwd='', db='')
# Oracle
db.bind(provider='oracle', user='', password='', dsn='')

The next step is the mapping process for declared entities to the corresponding tables in the database, like: creates tables, foreign key references, and indexes if necessary:

>>> mydb.generate_mapping(create_tables=True)

Let's fill the database using these objects:

data001 = Person(name='John',surname = 'Unknown', age=18)
data002 = Person(name='Unknown',surname = 'Unknown', age=81)
add001 = Home(owner = data001,address = '18', floors = 1, perimeter=0)
add002 = Home(owner = data002,address = '11', floors = 0, perimeter=1)

These objects will be saved only after the commit function is called.

>>> commit()

Let's see the next steps:
- Writing queries:

>>> select (var_p for var_p in Person)

>>> select (var_p for var_p in Person)[:]
[Person[1], Person[2]]
>>> select (var_p for var_p in Person)[:].show()
id|name   |surname|age
--+-------+-------+---
1 |John   |Unknown|18
2 |Unknown|Unknown|81

- Getting objects:

>>> p1 = Person[1]
>>> print(p1)
Person[1]
>>> print(p1.name)
John

- Updating an object:

>>> m = Person.get(name='John')
>>> print(m.age)
18

Even I used an SQLite database, the databases can be used writing queries according to the databases

>>> x = 25
>>> Person.select_by_sql('SELECT * FROM Person p WHERE p.age < $x')
[Person[1]]

Instead of creating models manually, you can use the examples from the Pony distribution package:

>>> from pony.orm.examples.estore import *
...
SELECT "Product"."id", "Product"."name", "Product"."description", "Product"."pic
ture", "Product"."price", "Product"."quantity"
FROM "Product" "Product"
WHERE 0 = 1

COMMIT
COMMIT
PRAGMA foreign_keys = true
CLOSE CONNECTION
RELEASE CONNECTION

You can see the database diagram at the eStore webpage.
This tutorial can be continued with additional pieces of information and examples.
You can read more at Pony ORM webpage.

Thursday, June 20, 2019

Python 3.7.3 : Read and save RSS data from goodreads website.

Today I will show you how to parse data from www.goodreads.com using the feedparser and save all into CSV file.
The Goodreads website comes with hundreds of great book recommendations from fellow readers, beloved authors, and let you add your favorite books.
The main goal was to have a structured link from the RSS file and the CSV file.
This issue was solve with arrays for each type of data.
First, let's install the feedparser python module with the pip tool:

C:\Python373\Scripts>pip install feedparser
Collecting feedparser
...
Successfully built feedparser
Installing collected packages: feedparser
Successfully installed feedparser-5.2.1

You need to get the RSS link with your books from your account.
The example is simple and has commented lines to understand easily how can I solve this issue.
This is the source code for reading all RSS data and put on the CSV file:

import feedparser
import csv
bookread_rss = "your RSS with data account"
feeds = feedparser.parse(bookread_rss)
print ("aditional RSS data")
print (feeds['feed']['title'])
print (feeds['feed']['link'])
print (feeds.feed.subtitle)
print (len(feeds['entries']))
print (feeds.version)
print (feeds.headers)
print (feeds.headers.get('content-type'))
print ("read RSS items")
# empty arrays for values by type
dates = []
titles = []
authors = []
links = []
pages =[]

# create the name of the CSV file
file_csv = 'my_goodreads_books.csv'

# prepare the CSV file with fix for error
# UnicodeEncodeError: 'charmap' codec can't encode character '\u0435' in position
# 30: character maps to 
cvs_out = csv.writer(open(file_csv, 'w',newline='',encoding="utf-8"))

#print(feeds)
for post in feeds.entries:
    date = "%d/%02d/%02d" % (post.published_parsed.tm_year,\
        post.published_parsed.tm_mon, \
        post.published_parsed.tm_mday,)
    # uncomment and will print on console
    #print("___")
    #print("post date: " + date)
    #print("post title: " + post.title)
    #print("post author: " + post.author_name)
    #print("post link: " + post.link)
    #print("post pages: " + post.num_pages)

    dates.append(date)
    titles.append(post.title)
    authors.append(post.author_name)
    links.append(post.link)
    pages.append(post.num_pages)

for d,t,a,l,p in zip(dates,titles,authors,links,pages):
    cvs_out.writerow((d,t,a,l,p))

The result will print you some info, see my example:

C:\Python373>python bookreader_rss_001.py
aditional RSS data
Catalin's bookshelf: all
https://www.goodreads.com/review/list_rss/52019632?key=pyfTLqvJXpg-_ghi4a6ZTZfJV
gLVXC8TcWyaBSyoiScgfXq3&shelf=%23ALL%23
Catalin's bookshelf: all
100
rss20
{'Server': 'Server', 'Date': 'Thu, 20 Jun 2019 12:22:55 GMT', 'Content-Type': 'a
pplication/xml; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'c
lose', 'Status': '200 OK', 'X-Frame-Options': 'ALLOWALL', 'X-XSS-Protection': '1
...

All date will be put into the my_goodreads_books.csv file.

Tuesday, June 18, 2019

Python 3.7.3 : Stemming with nltk.

Today I will start another tutorial about nltk python module and stemming.
The stemming is the process of producing morphological variants of a root/base word.
Stemming programs are commonly referred to as stemming algorithms or stemmers to reduces the words.
Errors in Stemming can be overstemming and understemming.
These two words are stemmed to the same root that are of different stems then the term is overstemming.
When two words are stemmed to same root that are not of different stems then the term used is understemming.
Applications of stemming are used in information retrieval systems like search engines or is used to determine domain vocabularies in domain analysis.
Let install this python module named nltk with pip tool:

C:\Python373\Scripts>pip install nltk
Collecting nltk
...
Successfully installed nltk-3.4.1 six-1.12.0

The nltk python module work with human language data for applying in statistical natural language processing (NLP).
It contains text processing libraries for tokenization, parsing, classification, stemming, tagging, graphical demonstrations, sample data sets, and semantic reasoning.
The next step is to download the models and data, see more at this official webpage.
First run this lines of code to update the nltk python module.

import nltk
nltk.download()

Let's test a simple implementation of stemming words using nltk python module:

from nltk.stem import PorterStemmer 
from nltk.tokenize import word_tokenize 
   
my_porter = PorterStemmer() 
   
quote = "Deep in the human unconscious is a pervasive need for a logical universe that makes sense."

words = word_tokenize(quote) 
   
for w in words: 
    print(w, " : ", my_porter.stem(w))

The result is something like this:

C:\Users\catafest>python stemming_001.py
Deep  :  deep
in  :  in
the  :  the
human  :  human
unconscious  :  unconsci
is  :  is
a  :  a
pervasive  :  pervas
need  :  need
for  :  for
a  :  a
logical  :  logic
universe  :  univers
that  :  that
makes  :  make
sense  :  sens
.  :  .

C:\Users\catafest>

You can read more about the stemming at Wikipedia.