Pandas intro.

Posted under » Python Data Analysis on 12 June 2023

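Pandas is a Python library used for working with data sets. The name refers to both "panel data" and "Python Data Analysis".

In its simplest form, a DataFrame (a table) is built from a dictionary that maps column names to lists of values: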

import pandas

mydataset = {
  'cats': ["Lion", "Tiger", "Puma"],
  'passmotion': [3, 7, 2]
}

myvar = pandas.DataFrame(mydataset)

print(myvar)

    cats  passmotion
0   Lion           3
1  Tiger           7
2   Puma           2

A Pandas Series is like a column in a table. It is a one-dimensional array holding data of any type.
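To illustrate the "column in a table" idea, here is a minimal sketch that reuses the cats table from above and pulls out one column. The variable names `df` and `column` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "cats": ["Lion", "Tiger", "Puma"],
    "passmotion": [3, 7, 2],
})

column = df["passmotion"]  # selecting a single column returns a Series

print(type(column).__name__)  # Series
print(column.tolist())        # [3, 7, 2]
```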

If nothing else is specified, the values are labeled with their index number. First value has index 0, second value has index 1 etc. This label can be used to access a specified value, just like an array index.
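A short sketch of the default labelling, assuming a Series built from a plain list with no index argument (the name `s` is only illustrative):

```python
import pandas as pd

s = pd.Series([1, 7, 2])  # no index given, so the labels default to 0, 1, 2

print(list(s.index))  # [0, 1, 2]
print(s[0])           # 1 (the value labeled 0)
```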

With the index argument, you can name your own labels.

import pandas as pd

a = [1, 7, 2]  # values for the Series

myvar = pd.Series(a, index=["x", "y", "z"])  # custom labels

print(myvar)

x    1
y    7
z    2
dtype: int64
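Once a Series has custom labels, a value can be looked up by its label. A minimal sketch using the same x, y, z labels (the name `s` is only illustrative):

```python
import pandas as pd

s = pd.Series([1, 7, 2], index=["x", "y", "z"])

print(s["y"])      # 7
print(s.loc["z"])  # 2 (.loc makes the label-based lookup explicit)
```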

You can also use a dictionary when creating a Series. Dictionaries store data values in key:value pairs. A dictionary is a collection that is ordered (since Python 3.7), changeable, and does not allow duplicate keys.

Dictionaries are written with curly brackets and have keys and values. When a Series is created from a dictionary, the keys become the labels.

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)

day1    420
day2    380
day3    390
dtype: int64

To include only some of the items in the dictionary (calories), use the index argument and list only the keys you want in the Series.

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories, index = ["day1", "day2"])

print(myvar)

day1    420
day2    380
dtype: int64
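It can help to know what happens when the index argument names a label that is not a key in the dictionary. In the sketch below, "day4" is a hypothetical label chosen for illustration. pandas fills the missing entry with NaN, and the Series becomes float64 because NaN is a floating-point value.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# "day4" is a hypothetical label that is not a key in the dictionary
s = pd.Series(calories, index=["day1", "day4"])

print(s["day1"])          # 420.0
print(s.isna().tolist())  # [False, True]
print(s.dtype)            # float64
```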

