Pandas - get first n-rows based on percentage

I want to take the first 5% of the records in a DataFrame.

There is no built-in method for this, but you can multiply the total number of rows by your percentage and pass the result as the parameter to the head method:

n = 5
df.head(int(len(df) * (n / 100)))

So if your DataFrame contains 1000 rows and n = 5 (that is, 5%), you will get the first 50 rows.
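Because int() truncates, a small DataFrame can yield zero rows (5% of 10 rows is 0.5, which becomes 0). Here is a minimal sketch of the same idea wrapped in a helper, with an optional round-up. The names first_percent and round_up are hypothetical and not part of pandas:

```python
import math

import pandas as pd


def first_percent(df, pct, round_up=False):
    """Return the first pct percent of rows (pct is on a 0-100 scale)."""
    exact = len(df) * pct / 100
    # int() truncates toward zero; math.ceil() keeps at least one row
    # whenever the exact count is greater than zero.
    n_rows = math.ceil(exact) if round_up else int(exact)
    return df.iloc[:n_rows]


df = pd.DataFrame({"value": range(1000)})
top = first_percent(df, 5)    # the first 5% of the rows
rest = df.iloc[len(top):]     # everything after them, selected by position
```

If you want to "pop" the rows rather than just view them, keeping rest as the new DataFrame has that effect.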


I've extended Mihai's answer for my use case, and it may be useful to others. The purpose is automated selection of the top n records for time series sampling, so that you can be sure you are taking old records for training and recent records for testing. This assumes the rows are already in chronological order.

# assumes:
# import pandas as pd
# df = pd.DataFrame...
def sample_first_prows(data, perc=0.7):
    # perc is a fraction between 0 and 1
    return data.head(int(len(data) * perc))

train = sample_first_prows(df)
# select the remaining rows by position so that train and test do not overlap
test = df.iloc[len(train):]
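If the index is not the default 0..n-1 range (for example a DatetimeIndex), index labels cannot be used as positions, so it is safer to split purely by position. A minimal sketch, assuming the rows should be ordered by their index. The names chronological_split and ts are hypothetical:

```python
import pandas as pd


def chronological_split(data, train_frac=0.7):
    """Split into (older, newer) parts by position after sorting by index."""
    ordered = data.sort_index()
    cut = int(len(ordered) * train_frac)
    # iloc slices by position, so the two parts never share a row.
    return ordered.iloc[:cut], ordered.iloc[cut:]


idx = pd.date_range("2024-01-01", periods=100, freq="D")
ts = pd.DataFrame({"y": range(100)}, index=idx)
train, test = chronological_split(ts)
```

Every timestamp in train is earlier than every timestamp in test, which is what you want when evaluating on the most recent data.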


Maybe this will help. It takes the first 5% of the rows within each id group:

tt = tmp.groupby('id').apply(lambda x: x.head(int(len(x) * 0.05))).reset_index(drop=True)
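The same per-group selection can also be written without apply, by comparing each row's position inside its group with the number of rows to keep for that group. A sketch under the assumption that the percentage is a whole number (pct is a hypothetical name), using integer arithmetic to avoid floating-point rounding:

```python
import pandas as pd

tmp = pd.DataFrame({
    "id": ["a"] * 40 + ["b"] * 20,
    "value": range(60),
})

pct = 5  # percentage of rows to keep per group (assumed to be an integer)

# 0-based position of each row within its id group
rank = tmp.groupby("id").cumcount()
# number of rows in the group each row belongs to
size = tmp.groupby("id")["id"].transform("size")
# keep a row if its position is below floor(size * pct / 100)
tt = tmp[rank < size * pct // 100].reset_index(drop=True)
```

Here group a (40 rows) keeps its first 2 rows and group b (20 rows) keeps its first row. Groups with fewer than 20 rows contribute nothing, just as with int(len(x) * 0.05).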
