Analyzing DataFrames

Posted under » Python Data Analysis on 12 June 2023

Continued from Read data into Dataframe

The head() method returns the headers and a specified number of rows, starting from the top. If the number of rows is not specified, the head() method will return the top 5 rows.

import pandas as pd

df = pd.read_csv('data.csv')

print(df.head(10)) 

The opposite of head() is tail(). The tail() method returns the headers and a specified number of rows, starting from the bottom. If the number of rows is not specified, tail() returns the last 5 rows.

print(df.tail()) 
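As a quick way to try both methods without data.csv, here is a minimal sketch that uses a small made-up DataFrame. The column values and the variable names (top_default, bottom_two, and so on) are invented for illustration and are not part of the original data set.

```python
import pandas as pd

# Hypothetical stand-in for data.csv: 12 rows of invented values
df = pd.DataFrame({
    "Duration": [60] * 12,
    "Pulse": list(range(100, 112)),
})

top_default = df.head()      # no argument: first 5 rows
top_three = df.head(3)       # first 3 rows
bottom_default = df.tail()   # no argument: last 5 rows
bottom_two = df.tail(2)      # last 2 rows

print(top_three)
print(bottom_two)
```

Note that tail() keeps the original index labels, so the last two rows here are still labelled 10 and 11.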

The DataFrame object has a method called info() that gives you more information about the data set.

df.info()

  <class 'pandas.core.frame.DataFrame'>
  RangeIndex: 169 entries, 0 to 168
  Data columns (total 4 columns):
   #   Column    Non-Null Count  Dtype  
  ---  ------    --------------  -----  
   0   Duration  169 non-null    int64  
   1   Pulse     169 non-null    int64  
   2   Maxpulse  169 non-null    int64  
   3   Calories  164 non-null    float64
  dtypes: float64(1), int64(3)
  memory usage: 5.4 KB

The info() method also tells us how many Non-Null values there are in each column. In our data set, the "Calories" column has 164 Non-Null values out of 169 rows, which means 5 rows have no value in that column.
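If you want the missing-value counts as numbers rather than reading them off the info() report, one possible approach is to combine isna() with sum(). This is a minimal sketch on a small made-up DataFrame. The values and the variable names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with two missing "Calories" values
df = pd.DataFrame({
    "Duration": [60, 60, 45, 30],
    "Calories": [409.1, np.nan, 340.0, np.nan],
})

# isna() marks each missing cell as True; sum() counts the True values per column
missing_per_column = df.isna().sum()

# count() returns the number of Non-Null values per column
non_null_per_column = df.count()

print(missing_per_column)
print(non_null_per_column)
```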

Empty values, or Null values, can lead to misleading results when analyzing data, so you should consider removing rows that contain them.
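One common way to remove such rows is the dropna() method. The sketch below assumes you want to drop every row that has at least one empty value. The sample data and the name cleaned are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with two rows missing a "Calories" value
df = pd.DataFrame({
    "Duration": [60, 60, 45, 30],
    "Calories": [409.1, np.nan, 340.0, np.nan],
})

# dropna() returns a new DataFrame without the rows that contain empty values
cleaned = df.dropna()

print(cleaned)
print(len(df), "rows before,", len(cleaned), "rows after")
```

By default dropna() leaves the original DataFrame unchanged, so df still holds all four rows here and only cleaned has the reduced set.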
