analitics

Pages

Tuesday, June 18, 2019

Python 3.7.3 : Stemming with nltk.

Today I will start another tutorial about nltk python module and stemming.
The stemming is the process of producing morphological variants of a root/base word.
Stemming programs are commonly referred to as stemming algorithms or stemmers to reduces the words.
Errors in Stemming can be overstemming and understemming.
These two words are stemmed to the same root that are of different stems then the term is overstemming.
When two words are stemmed to same root that are not of different stems then the term used is understemming.
Applications of stemming are used in information retrieval systems like search engines or is used to determine domain vocabularies in domain analysis.
Let install this python module named nltk with pip tool:
C:\Python373\Scripts>pip install nltk
Collecting nltk
...
Successfully installed nltk-3.4.1 six-1.12.0
The nltk python module work with human language data for applying in statistical natural language processing (NLP).
It contains text processing libraries for tokenization, parsing, classification, stemming, tagging, graphical demonstrations, sample data sets, and semantic reasoning.
The next step is to download the models and data, see more at this official webpage.
First run this lines of code to update the nltk python module.
import nltk
nltk.download()
Let's test a simple implementation of stemming words using nltk python module:
from nltk.stem import PorterStemmer 
from nltk.tokenize import word_tokenize 
   
my_porter = PorterStemmer() 
   
quote = "Deep in the human unconscious is a pervasive need for a logical universe that makes sense."

words = word_tokenize(quote) 
   
for w in words: 
    print(w, " : ", my_porter.stem(w))
The result is something like this:
C:\Users\catafest>python stemming_001.py
Deep  :  deep
in  :  in
the  :  the
human  :  human
unconscious  :  unconsci
is  :  is
a  :  a
pervasive  :  pervas
need  :  need
for  :  for
a  :  a
logical  :  logic
universe  :  univers
that  :  that
makes  :  make
sense  :  sens
.  :  .

C:\Users\catafest>
You can read more about the stemming at Wikipedia.

Python 3.7.3 : Using getters and setters in object-oriented.

The main purpose of using getters and setters in object-oriented programs is to ensure data encapsulation.
Let's start with a simple example.
I created a class named my_class init with one variable named my_variable:
self._my_variable = my_variable
A new, initialized instance can be obtained by this line of code:
test_it = my_class()
The example use getter and setter methods to use this variable.
C:\Users\catafest>python
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Inte
l)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class my_class:
...     def __init__(self, my_variable = 0):
...          self._my_variable = my_variable
...
...     # getter method
...     def get_my_variable(self):
...         return self._my_variable
...
...     # setter method
...     def set_my_variable(self, x):
...         self._my_variable = x
...
>>> test_it = my_class()
>>> test_it.set_my_variable(1976)
>>> print(test_it.get_my_variable())
1976
>>> print(test_it._my_variable)
1976
In Python property() is a built-in function that creates and returns a property object.
Python has four arguments property: fget, fset, fdel, doc.
  • fget is a function for retrieving an attribute value;
  • fset is a fuction for setting an attribute value;
  • fdel is a function for deleting an attribute value;
  • doc creates a docstring for attribute.
This line of code will setting the my_variable using setter:
test_it.set_my_variable(1976)
This line of code will retrieving my_variable using getter:
print(test_it.get_my_variable())
A property object has three methods, getter(), setter(), and delete() to specify fget, fset and fdel individually.
How my example changes it:
>>> class my_class:
...      def __init__(self):
...           self._my_variable = 0
...
...      # function to get value of _my_variable
...      def get_my_variable(self):
...          print("getter method called")
...          return self._my_variable
...
...      # function to set value of _my_variable
...      def set_my_variable(self, a):
...          print("setter method called")
...          self._my_variable = a
...
...      # function to delete _my_variable attribute
...      def del_my_variable(self):
...          del self._my_variable
...
...      my_variable = property(get_my_variable, set_my_variable, del_my_variabl
e)
...
>>> test_it = my_class()
>>> test_it.my_variable = 1976
setter method called
>>> print(test_it.my_variable)
getter method called
1976
Using decorator with python @property is one of the built-in decorators.
The main purpose of any decorator is to change your class methods or attributes.
The user of your class no need to make any change in their code.
This is the final result:
>>> class my_class:
...      def __init__(self):
...           self._my_variable = 0
...
...      # using property decorator
...      # a getter function
...      @property
...      def my_variable(self):
...          print("getter method called")
...          return self._my_variable
...
...      # a setter function
...      @my_variable.setter
...      def my_variable(self, my_out):
...          if(my_out < 1976):
...             raise ValueError("... this my_variable has a criteria!!")
...          print("setter method called")
...          self._my_variable = my_out
...
>>> test_it = my_class()
>>> test_it.my_variable = 1979
setter method called
>>> print(test_it.my_variable)
getter method called
1979
>>> test_it.my_variable = 1975
Traceback (most recent call last):
  File "", line 1, in 
  File "", line 16, in my_variable
ValueError: ... this my_variable has a criteria!!
This last example show you how to use @property decorator to create getters and setters in pythonic way.