
Pandas: Add row to DataFrame

Usually when I’m working with Pandas DataFrames I want to add new columns of data, but I recently wanted to add a row to an existing DataFrame. It turns out there is more than one way to do that, and we’ll explore them in this blog post.

Let’s start by importing Pandas into our Python script:

import pandas as pd

We’ll start from a DataFrame that has two rows and the columns name and age:

df = pd.DataFrame(data=[{"name": "Mark", "age": 37}, {"name": "David", "age": 36}])
Table 1. DataFrame with no explicit index

     name   age
0    Mark   37
1    David  36

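One way to add rows to this DataFrame is to create a new DataFrame and join it to our initial DataFrame. Older versions of Pandas offered the DataFrame.append method for this, but it was deprecated in Pandas 1.4 and removed in 2.0, so we’ll use pd.concat, which produces the same result:

to_append = pd.DataFrame([
    {"name": "Ryan", "age": 42},
    {"name": "John", "age": 25}
])
# On Pandas versions before 2.0 this could be written as df.append(to_append)
append_df = pd.concat([df, to_append])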
Table 2. DataFrame with new row

     name   age
0    Mark   37
1    David  36
0    Ryan   42
1    John   25

The index for the new rows starts again from 0, so we now have two rows with the index 0 and two rows with the index 1.
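If the duplicate labels are not wanted, one option is to ask Pandas to renumber the rows while combining the two DataFrames. The following is a minimal sketch that assumes pd.concat is used for the join and that the original index values don’t need to be kept; the variable name combined is only for illustration:

```python
import pandas as pd

df = pd.DataFrame([{"name": "Mark", "age": 37}, {"name": "David", "age": 36}])
to_append = pd.DataFrame([{"name": "Ryan", "age": 42}, {"name": "John", "age": 25}])

# ignore_index=True discards the indexes of both inputs
# and labels the combined rows 0..n-1
combined = pd.concat([df, to_append], ignore_index=True)
print(combined)
```

With this the four rows are labelled 0 to 3 rather than 0, 1, 0, 1.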

Another way to append a new row is to assign to it through the loc indexer:

df.loc[2] = ["Ryan", 42]
df.loc[3] = ["John", 25]

With this approach we need to explicitly specify the index label of each new row. We can use whatever value we want, but here we continue on from the existing values:

Table 3. DataFrame with new row

     name   age
0    Mark   37
1    David  36
2    Ryan   42
3    John   25
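Rather than hard-coding 2 and 3, the next label can be derived from the DataFrame itself. This is a sketch that assumes the index is still the default 0, 1, 2, ... range, in which case len(df) is always the next unused label; the new_rows list is just a stand-in for wherever the data comes from:

```python
import pandas as pd

df = pd.DataFrame([{"name": "Mark", "age": 37}, {"name": "David", "age": 36}])

new_rows = [["Ryan", 42], ["John", 25]]
for row in new_rows:
    # len(df) is the next free label only while the index is the default 0..n-1;
    # with a custom or non-contiguous index it could overwrite an existing row
    df.loc[len(df)] = row

print(df)
```

Each assignment grows the DataFrame by one row, so this is probably best kept for a handful of rows; for many rows, building a second DataFrame and concatenating once is likely to be faster.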

What if we have an explicit index set on the DataFrame? Starting again from the original two-row DataFrame, we can convert the name column into an index by running the following code:

df_with_index = df.set_index("name")
Table 4. DataFrame with name index

name   age
Mark   37
David  36

Now, if we want to add a new row, the index label should be a name instead of a number:

df_with_index.loc["Ryan"] = [42]
df_with_index.loc["John"] = [25]
Table 5. DataFrame with name index with new row

name   age
Mark   37
David  36
Ryan   42
John   25
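One thing to keep in mind with loc is that it only adds a row when the label doesn’t exist yet. If the label is already in the index, the assignment updates that row instead. A small sketch of both cases, with the value 38 made up for the example:

```python
import pandas as pd

df = pd.DataFrame([{"name": "Mark", "age": 37}, {"name": "David", "age": 36}])
df_with_index = df.set_index("name")

# "Ryan" is not in the index yet, so a new row is added
df_with_index.loc["Ryan"] = [42]

# "Mark" is already in the index, so the existing row is overwritten
df_with_index.loc["Mark"] = [38]

print(df_with_index)
```

The DataFrame ends up with three rows rather than four, and Mark’s age is now 38.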
