How to Use Regular Expressions with a Pandas DataFrame

The df.filter method in Pandas selects columns or rows of a DataFrame whose labels match a given regular expression. It does not filter a DataFrame on its contents: the pattern is applied only to the labels of the index (axis=0) or of the columns (axis=1).
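
To make the label-versus-contents distinction concrete, here is a minimal sketch. The toy DataFrame `df_demo` and its values are made up for illustration and are not part of the tutorial's CSV.

```python
import pandas as pd

# Hypothetical toy data, not the CSV used in this tutorial.
df_demo = pd.DataFrame(
    {
        "Open Price": [10.0, 11.0],
        "Close Price": [10.5, 10.8],
        "Note": ["Open", "Shut"],
    }
)

# filter() looks only at labels: "Open Price" is the one column label containing "Open".
by_label = df_demo.filter(regex="Open", axis=1)
print(list(by_label.columns))  # ['Open Price']

# The cell value "Open" in the "Note" column is ignored by filter().
# Matching on contents needs a boolean mask instead.
by_content = df_demo[df_demo["Note"].str.contains("Open")]
print(list(by_content.index))  # [0]
```

The second selection uses Series.str.contains, which is one common way to match on cell values; df.filter plays no part in it.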

Create a DataFrame from a CSV file


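import pandas as pd

# Load the sample dataset straight from its URL.
df = pd.read_csv('https://storage.googleapis.com/gcptutorials.com/dataset/data.csv')
# List the column labels that the regular expressions below are matched against.
print(df.columns)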

Output:


Index(['Date', 'Open Price', 'High Price', 'Low Price', 'Close Price', 'WAP',
       'No.of Shares', 'No. of Trades', 'Total Turnover (Rs.)',
       'Deliverable Quantity', '% Deli. Qty to Traded Qty', 'Spread High-Low',
       'Spread Close-Open'],
      dtype='object')

Using df.filter to select all columns whose names end with the string Open


print(df.filter(regex='Open$', axis=1))

Output:


      Spread Close-Open
0                -12.55
1                -22.00
2                 -4.15
3                  5.10
4                 -6.70
...                 ...
2507               5.90
2508             -26.25
2509             -16.60
2510             -12.00
2511              16.50
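
The `$` anchor matters because df.filter keeps any label for which `re.search` finds a match, so an unanchored pattern matches anywhere in the label. The sketch below illustrates this on a stand-in frame that reuses the column names printed earlier; the single row of zeros is placeholder data, not values from the CSV.

```python
import pandas as pd

# Column labels copied from the df.columns output shown earlier.
columns = [
    "Date", "Open Price", "High Price", "Low Price", "Close Price", "WAP",
    "No.of Shares", "No. of Trades", "Total Turnover (Rs.)",
    "Deliverable Quantity", "% Deli. Qty to Traded Qty", "Spread High-Low",
    "Spread Close-Open",
]

# Stand-in frame with one row of placeholder zeros (assumption: only labels matter here).
frame = pd.DataFrame([[0] * len(columns)], columns=columns)

# Unanchored: matches "Open" anywhere in the label.
anywhere = list(frame.filter(regex="Open", axis=1).columns)
print(anywhere)  # ['Open Price', 'Spread Close-Open']

# Anchored with $: only labels that end with "Open".
at_end = list(frame.filter(regex="Open$", axis=1).columns)
print(at_end)  # ['Spread Close-Open']
```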

Using df.filter to select all columns whose names start with the string Open


print(df.filter(regex='^Open', axis=1))

Output:


      Open Price
0         527.00
1         549.00
2         549.00
3         543.05
4         548.80
...          ...
2507      892.00
2508      917.00
2509      922.90
2510      928.55
2511      910.00
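
The examples so far filter column labels with axis=1. The same call can be pointed at row labels with axis=0, as in this rough sketch. The small `prices` frame, its date-string index, and its values are hypothetical, since the tutorial's DataFrame uses a default integer index.

```python
import pandas as pd

# Hypothetical frame indexed by date strings (made-up values).
prices = pd.DataFrame(
    {"Close Price": [100.0, 101.5, 99.0]},
    index=["14-February-2020", "8-January-2010", "7-January-2010"],
)

# axis=0 applies the regular expression to the row labels (the index).
january_rows = prices.filter(regex="January", axis=0)
print(list(january_rows.index))  # ['8-January-2010', '7-January-2010']
```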

Using df.filter to select all columns except those whose names start with the string Open


print(df.filter(regex='^(?!Open).*', axis=1))

Output:


                  Date  High Price  ...  Spread High-Low  Spread Close-Open
0     14-February-2020      532.75  ...            20.50             -12.55
1     13-February-2020      549.00  ...            24.00             -22.00
2     12-February-2020      552.50  ...            14.30              -4.15
3     11-February-2020      551.45  ...            13.55               5.10
4     10-February-2020      555.35  ...            20.75              -6.70
...                ...         ...  ...              ...                ...
2507    8-January-2010      909.90  ...            21.85               5.90
2508    7-January-2010      917.70  ...            30.70             -26.25
2509    6-January-2010      925.00  ...            21.00             -16.60
2510    5-January-2010      939.00  ...            25.95             -12.00
2511    4-January-2010      930.00  ...            33.80              16.50
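
The pattern `^(?!Open).*` uses a negative lookahead: it matches only labels whose first characters are not "Open". As a cross-check, the sketch below compares it with a boolean mask built from `columns.str.startswith`, which is an alternative technique and not the method used in this tutorial. The stand-in frame reuses the column names printed earlier with a placeholder row of zeros.

```python
import pandas as pd

# Column labels copied from the df.columns output shown earlier.
columns = [
    "Date", "Open Price", "High Price", "Low Price", "Close Price", "WAP",
    "No.of Shares", "No. of Trades", "Total Turnover (Rs.)",
    "Deliverable Quantity", "% Deli. Qty to Traded Qty", "Spread High-Low",
    "Spread Close-Open",
]

# Stand-in frame with one row of placeholder zeros (assumption: only labels matter here).
frame = pd.DataFrame([[0] * len(columns)], columns=columns)

# Negative lookahead: keep labels that do not start with "Open".
via_regex = list(frame.filter(regex="^(?!Open).*", axis=1).columns)

# Alternative: invert a startswith mask on the column labels.
mask = ~frame.columns.str.startswith("Open")
via_mask = list(frame.loc[:, mask].columns)

print(via_regex == via_mask)             # True
print("Open Price" in via_regex)         # False
print("Spread Close-Open" in via_regex)  # True
```

Note that "Spread Close-Open" is kept, because the lookahead only inspects the start of each label.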

