Data handling with Python
The 21st century seems to be the house of rising data. With an estimated 1.7 MB of data being created every second by each individual, the need for data handling and manipulation is dire. Python is one of the many languages that can be used as a tool to clean and manipulate data. Let's get to know how.
So why choose Python and not another language?
Here's why. Python is:
Flexible - you can always try something new and creative with Python.
Easy to learn - Python lets programmers accomplish tasks in fewer lines of code than many older languages require.
Open source - Python is free and runs on Windows and Linux environments, among others. There are many open-source Python libraries for data manipulation, statistics, natural language processing and more.
How is Python used at each stage of data analysis?
First Stage - When you are presented with data, such as an Excel sheet, you need to derive insights from the rows and columns. Python offers libraries like Pandas and NumPy that do the job quickly by applying vectorized operations to whole columns and arrays at once.
Second Stage - To get the necessary data, we often have to search through and scrape the internet. Python provides Scrapy and Beautiful Soup to extract data.
Third Stage - Too many numbers are boring, and honestly quite tedious to look at, so visualizing the data with charts, graphs and plots makes it easier to understand. Python's Matplotlib, Seaborn and similar libraries are used to add colour to the data.
Fourth Stage - The next step is to train your model using machine learning algorithms. Python's scikit-learn makes that easy for us.
File handling using Python:
Python treats a file as either text or binary, and handles the two kinds differently.
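The difference can be seen in a minimal sketch like the one below. It assumes a scratch file created in a temporary directory (the name sample.txt and its contents are made up for illustration). Text mode hands back a str, while adding "b" to the mode hands back the raw bytes.

```python
import os
import tempfile

# Hypothetical scratch file; the name and contents are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write two characters as UTF-8 text; "é" takes two bytes in UTF-8.
with open(path, "w", encoding="utf-8") as f:
    f.write("né")

# Text mode ("r") decodes the bytes and returns a str.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode ("rb") returns the undecoded bytes.
with open(path, "rb") as f:
    raw = f.read()

print(type(text).__name__, len(text))
print(type(raw).__name__, len(raw))
```

The same file is two characters long when read as text and three bytes long when read as binary.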
Working with the open() function
We use the open() function to open a file in read or write mode.
Syntax:
open(filename, mode)
The open() function takes two arguments: filename, the name of the file you want to open, and mode, which specifies how the file will be opened. The mode is optional; if you leave it out, the file is opened for reading as text.
The different modes are:
"r", for reading
"w", for writing
"a", for appending
"r+", for both reading and writing
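To see how these modes behave, here is a small sketch that runs all four against one scratch file. The file name modes_demo.txt is hypothetical and the file lives in a temporary directory, so nothing of yours is touched.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file, or empties it if it already exists.
f = open(path, "w")
f.write("cat\n")
f.close()

# "a" keeps the existing contents and adds to the end.
f = open(path, "a")
f.write("dog\n")
f.close()

# "r+" opens an existing file for reading and writing, starting at
# position 0, so this write replaces the first three characters in place.
f = open(path, "r+")
f.write("bat")
f.close()

# "r" is read-only.
f = open(path, "r")
contents = f.read()
f.close()

print(contents)
```

The final contents are "bat" on the first line and "dog" on the second: the append kept "cat", and the "r+" write then overwrote it.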
# Program to open a file and read its contents
file = open("Animals.txt", "r")
# Print each line in the file
for line in file:
    print(line)
file.close()
You can also use the file.read() method to read a whole file, or a specific part of it, as shown here:
file = open("file.txt", "r")
print(file.read(10))
file.close()
This reads the first 10 characters of the file (fewer if the file is shorter) and returns them as a string.
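One detail worth seeing in action is that each read picks up where the previous one stopped. The sketch below assumes a scratch file.txt with made-up contents, written first so the example is self-contained.

```python
import os
import tempfile

# Hypothetical scratch file with illustrative contents.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as f:
    f.write("Elephants never forget.\nCats always land.\n")

with open(path, "r") as f:
    first_ten = f.read(10)        # the first 10 characters
    rest_of_line = f.readline()   # continues from there to the end of the line
    remainder = f.read()          # everything that is left

print(repr(first_ten))
print(repr(rest_of_line))
print(repr(remainder))
```

Calling read() with no argument returns whatever remains in the file, which is an empty string once the end has been reached.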
To write to a file using the write() method, do the following. Note that opening in "w" mode erases whatever the file already contained.
# Open the file in write mode and write to it.
file = open('Animals.txt','w')
file.write("Hey! I'm writing in this file")
file.close()
If you want to add some text to an existing file without altering the previous text, you can use append mode.
# Append text at the end of the file.
file = open('Animals.txt','a')
file.write("Let's add this to the file")
file.close()
We can also split each line we read into words using the string method split(). The close() method frees the system resources that were being held while the file was open.
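Both points can be sketched together. The example below assumes a scratch Animals.txt with made-up contents. It uses a with block, which calls close() for you when the block ends, and then checks the file's closed attribute to confirm it.

```python
import os
import tempfile

# Hypothetical scratch file with illustrative contents.
path = os.path.join(tempfile.mkdtemp(), "Animals.txt")
with open(path, "w") as f:
    f.write("lion tiger\nzebra\n")

words_per_line = []
with open(path, "r") as f:
    for line in f:
        # split() with no arguments splits on whitespace,
        # which also drops the trailing newline.
        words_per_line.append(line.split())

print(words_per_line)
print(f.closed)  # True: the with block closed the file
```

If you open a file without a with block, as in the earlier examples, remember to call close() yourself.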
Database connectivity in Python
Python can be used in database applications. One of the most popular databases is MySQL. To build this connection you have to install MySQLdb, an interface for connecting to a MySQL database server from Python (on Python 3 it is provided by the mysqlclient package, which is still imported as MySQLdb).
You can refer to the following link for more information on how to install the MySQLdb connector: https://stackoverflow.com/questions/25865270/how-to-install-python-mysqldb-module-using-pip
Having done that, make sure of the following:
- You have created a database of your choice.
- You have created a table in your database.
- This table has fields.
- User ID “xyz” and password “xyz123” are set to access the database.
- Python module MySQLdb is installed properly on your machine.
How to establish a connection with an example database named EXAMPLEDB:
#!/usr/bin/python
import MySQLdb
# Open database connection
db = MySQLdb.connect("localhost","xyz","xyz123","EXAMPLEDB" )
# prepare a cursor object using cursor() method
cursor = db.cursor()
# execute SQL query using execute() method.
cursor.execute("SELECT VERSION()")
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print("Database version : %s " % data)
# disconnect from server
db.close()
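If no MySQL server is at hand, the same connect, cursor, execute, fetchone sequence can be tried with sqlite3 from the standard library. This is a stand-in, not MySQLdb. Both modules follow Python's DB-API, but sqlite3 connects to a file (here an in-memory database) instead of a host, user and password, and the version query is SQLite's own.

```python
import sqlite3

# Stand-in for MySQLdb.connect(...): an in-memory SQLite database.
db = sqlite3.connect(":memory:")

# Prepare a cursor object, exactly as with MySQLdb.
cursor = db.cursor()

# SQLite's counterpart to MySQL's SELECT VERSION().
cursor.execute("SELECT sqlite_version()")

# fetchone() returns one row as a tuple, here with a single column.
data = cursor.fetchone()
print("Database version : %s" % data)

# Disconnect.
db.close()
```

Because fetchone() returns a tuple, data[0] is the version string itself.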
We use the execute() method of the cursor to create a table in the database.
#!/usr/bin/python
import MySQLdb
# Open database connection
db = MySQLdb.connect("localhost","xyz","xyz123","EXAMPLEDB" )
# prepare a cursor object using cursor() method
cursor = db.cursor()
# Drop the table if it already exists using execute() method.
cursor.execute("DROP TABLE IF EXISTS STUDENTINFO")
# Create table as per requirement
sql = """CREATE TABLE STUDENTINFO(
FIRST_NAME CHAR(20) NOT NULL,
LAST_NAME CHAR(20),
AGE INT,
SEX CHAR(1)
)"""
cursor.execute(sql)
# disconnect from server
db.close()
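One way to check that a CREATE TABLE statement did what was intended is to list the table's columns afterwards. The sketch below does this with sqlite3 from the standard library as a stand-in for MySQLdb, so it runs without a server. PRAGMA table_info is specific to SQLite; on MySQL the equivalent check would be DESCRIBE STUDENTINFO.

```python
import sqlite3

# Stand-in for the MySQL connection: an in-memory SQLite database.
db = sqlite3.connect(":memory:")
cursor = db.cursor()

# Same statements as above, run through the cursor.
cursor.execute("DROP TABLE IF EXISTS STUDENTINFO")
cursor.execute("""CREATE TABLE STUDENTINFO(
    FIRST_NAME CHAR(20) NOT NULL,
    LAST_NAME CHAR(20),
    AGE INT,
    SEX CHAR(1)
)""")

# Each row of PRAGMA table_info describes one column; index 1 is its name.
cursor.execute("PRAGMA table_info(STUDENTINFO)")
columns = [row[1] for row in cursor.fetchall()]
print(columns)

db.close()
```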
There are many other operations, such as update, insert and alter, that can be performed; we will look into those later.
Thanks for tuning in :)