Data handling with Python

The 21st century seems to be the house of rising data. With an estimated 1.7 MB of data being created every second per person, the need for data handling and manipulation is dire. Python is one of the many languages that can be used as a tool to clean and manipulate data. Let's get to know how.

So why choose Python and not another language?

Here's why. Python is:

Flexible - you can always try something new and creative with Python.

Easy to learn - Python lets programmers accomplish tasks in fewer lines of code than many older languages require.

Open source - Python is free and runs on Windows, Linux and macOS. There are many open-source Python libraries for data manipulation, statistics, natural language processing and more.

How is Python used in each stage of data analysis?

First stage - When you are presented with data, such as an Excel sheet, you need to derive insights from its rows and columns. Python offers libraries like pandas and NumPy that do the job quickly by applying vectorized operations to whole columns and arrays at once.

Second stage - To get the necessary data, we often have to scrape and search through the internet. Python provides Scrapy and Beautiful Soup to extract data from web pages.

Third stage - Too many numbers are boring and, honestly, quite tedious to look at. Visualizing the data with charts, graphs and plots makes it easier to understand. Python's Matplotlib, Seaborn and similar libraries are used to add colour to the data.

Fourth stage - The next step is to train your model using machine learning algorithms. Python's scikit-learn makes that easy for us.

File handling using Python:

Python treats a file as either text or binary, and handles the two differently.

Working with the open() function

We use the open() function to open a file in read or write mode.

Syntax: open(filename, mode)

The open() function is usually called with two arguments: filename, the name of the file you want to open, and mode, which specifies how the file will be opened. The mode is optional; if you leave it out, the file is opened for reading.

Different modes are:

"r", for reading

"w", for writing

"a", for appending

"r+", for both reading and writing
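Here is a minimal sketch of how these modes behave in practice. The file name example.txt and the temporary directory are assumptions, chosen only so the snippet can run anywhere without touching real files.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# "r" (also the default when mode is omitted) needs the file to exist.
try:
    open(path, "r")
    missing_file_error = False
except FileNotFoundError:
    missing_file_error = True

# "w" creates the file if it is missing.
f = open(path, "w")
f.write("Hello")
f.close()

# "r+" opens an existing file for both reading and writing.
f = open(path, "r+")
first_read = f.read()   # reading leaves the position at the end of the file
f.write(" again")       # so this text lands after the existing content
f.close()

f = open(path)          # mode omitted, so the file is opened for reading
final_text = f.read()
f.close()

print(missing_file_error, first_read, final_text)
```

Note that "r" and "r+" both fail on a missing file, while "w" quietly creates one.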

You can also use the file.read() function to read a whole file, or only a specific part of it. For example, file.read(10) reads the first 10 characters of the file, provided it contains some text, and returns them as a string.
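As a rough sketch of a partial read, the snippet below first writes a sample sentence to a temporary file (both the sentence and the file name are made up for illustration) and then reads it back in two steps.

```python
import os
import tempfile

# Hypothetical sample file, created here so there is something to read.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
f = open(path, "w")
f.write("Python makes file handling simple.")
f.close()

f = open(path, "r")
first_ten = f.read(10)   # at most 10 characters, returned as a string
rest = f.read()          # no argument: everything after the first 10
f.close()

print(first_ten)
print(rest)
```

Each call to read() continues from where the previous one stopped, which is why the second call returns only the remainder.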

To change a file's contents, open it in write mode and call the write() function. Keep in mind that write mode replaces whatever the file contained before.
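A small sketch of write() in write mode, assuming a throwaway file called notes.txt in a temporary directory:

```python
import os
import tempfile

# Hypothetical file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

f = open(path, "w")
f.write("first version\n")
f.close()

# Opening the same file in "w" mode again discards the old text.
f = open(path, "w")
chars_written = f.write("second version\n")  # write() returns the character count
f.close()

f = open(path, "r")
content = f.read()
f.close()

print(chars_written)
print(content)
```

Only the second string survives, because opening a file with "w" empties it first.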

If you want to add some text to an existing file without altering the previous text, you can use append mode.
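Append mode could be tried out along these lines; log.txt is a hypothetical file name and the two lines of text are placeholders.

```python
import os
import tempfile

# Hypothetical file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

f = open(path, "w")
f.write("line one\n")
f.close()

# "a" keeps the existing text and writes at the end of the file.
f = open(path, "a")
f.write("line two\n")
f.close()

f = open(path, "r")
lines = f.readlines()
f.close()

print(lines)
```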

We can also break the lines we read into words using Python's split() method. The close() function frees the system resources tied to the file while it is open.
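The sketch below combines both ideas: it splits each line of a small sample file into words and then closes the file. The file contents are invented for the example. It also shows the with statement, which is not mentioned above but is a common way to have Python call close() for you.

```python
import os
import tempfile

# Hypothetical sample file with two short lines.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
f = open(path, "w")
f.write("alpha beta\ngamma delta\n")
f.close()

f = open(path, "r")
words_per_line = []
for line in f:
    # split() with no argument cuts on whitespace and drops the newline.
    words_per_line.append(line.split())
f.close()                    # release the file handle
closed_after_close = f.closed

# Alternative: a with block closes the file automatically at the end.
with open(path, "r") as g:
    all_words = g.read().split()
closed_after_with = g.closed

print(words_per_line)
print(all_words)
```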

Database connectivity in Python

Python can be used in database applications. One of the most popular databases is MySQL. To connect to it you have to install MySQLdb, an interface for connecting to a MySQL database server from Python (on Python 3 it is provided by the mysqlclient package).

You can refer to the following link for more info on how to install the MySQLdb connector: https://stackoverflow.com/questions/25865270/how-to-install-python-mysqldb-module-using-pip

Having done that, make sure of the following:

  • You have created a database of your choice.
  • You have created a table in your database.
  • This table has fields.
  • User ID “xyz” and password “xyz123” are set to access the database.
  • Python module MySQLdb is installed properly on your machine.

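How to establish a connection with an example database named EXAMPLEDB:

Call MySQLdb.connect() with the host, user name, password and database name, then ask the returned connection for a cursor. We use the execute() method of that cursor to create a table in the database.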
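The steps above can be sketched as follows. Two things here are assumptions rather than part of the original setup: the EMPLOYEE table and its columns are hypothetical, and the runnable part uses Python's built-in sqlite3 module as a stand-in so the example works without a MySQL server. Both sqlite3 and MySQLdb follow the same Python DB-API pattern of connect(), cursor(), execute() and close(); the MySQLdb call is shown in a comment, with "localhost" assumed as the host.

```python
import sqlite3


def create_employee_table(connection):
    """Create a hypothetical EMPLOYEE table on a DB-API 2.0 connection."""
    cursor = connection.cursor()
    # Drop any earlier copy so the sketch can be run repeatedly.
    cursor.execute("DROP TABLE IF EXISTS EMPLOYEE")
    cursor.execute(
        """
        CREATE TABLE EMPLOYEE (
            FIRST_NAME CHAR(20) NOT NULL,
            LAST_NAME  CHAR(20),
            AGE        INT,
            INCOME     FLOAT
        )
        """
    )
    connection.commit()


# With a MySQL server running, the connection would come from MySQLdb:
#   import MySQLdb
#   db = MySQLdb.connect(host="localhost", user="xyz",
#                        passwd="xyz123", db="EXAMPLEDB")
# Stand-in for this sketch: an in-memory SQLite database.
db = sqlite3.connect(":memory:")
create_employee_table(db)

# Ask SQLite which tables exist (this query is SQLite-specific).
cursor = db.cursor()
cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
tables = [row[0] for row in cursor.fetchall()]
print(tables)

db.close()
```

Keeping the table creation in a function that accepts any connection means the same code can be pointed at MySQL later by swapping only the connect() line.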

There are many other operations, such as update, insert and alter, which we will look into later.

Thanks for tuning in :)

Aditi Pateriya

A curious learner, an engineer by profession, exploring the world through writings