import pandas as pd

df_excl = pd.DataFrame({"id": ["12345"]})
df = pd.DataFrame({"id": ["12345", "67890"]})
result = df[~df.id.isin(df_excl[["id"]])]
print(result)
Can you guess the result of the snippet above? Just a DataFrame containing “67890”? No, the result is:
      id
0  12345
1  67890
Why hasn’t “12345” been excluded? The reason is quite tricky: df_excl[["id"]] is a DataFrame, but what isin() needs here is a Series! Pandas raises no error for this, so the mistake is easy to miss. So we shouldn’t use [[]] here, but [].
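To see why the DataFrame version silently matches nothing, it helps to look at what each selection yields when iterated. The sketch below is my reading of the behaviour rather than a statement about pandas internals: iterating a DataFrame gives its column labels, so isin() appears to compare each id against the string "id" instead of against "12345". The variable names (as_frame, as_series and so on) are made up for illustration.

```python
import pandas as pd

df_excl = pd.DataFrame({"id": ["12345"]})
df = pd.DataFrame({"id": ["12345", "67890"]})

# Double brackets select a one-column DataFrame; single brackets select a Series.
as_frame = df_excl[["id"]]
as_series = df_excl["id"]

# Iterating a DataFrame yields its column labels, not its cell values.
frame_items = list(as_frame)
series_items = list(as_series)
print(frame_items, series_items)

# With the DataFrame, no id equals the label "id", so nothing is flagged.
mask_from_frame = df["id"].isin(as_frame)
# With the Series, the first row is flagged as expected.
mask_from_series = df["id"].isin(as_series)
print(mask_from_frame.tolist(), mask_from_series.tolist())
```

Since the first mask is False for every row, negating it with ~ keeps every row, which is exactly the surprising output above.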
The correct code uses df_excl["id"], as below:
...
result = df[~df.id.isin(df_excl["id"])]
print(result)
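If the exclusion list sometimes arrives as a DataFrame and sometimes as a Series, one option is to normalise it before calling isin(). The helper below is only a sketch: exclude_ids is a hypothetical name, not a pandas function, and it assumes the excluded DataFrame carries a column with the same name as the one being filtered.

```python
import pandas as pd


def exclude_ids(df, excluded, column="id"):
    """Hypothetical helper: drop rows whose `column` value appears in `excluded`."""
    if isinstance(excluded, pd.DataFrame):
        # Pull out the matching column so isin() sees values, not column labels.
        excluded = excluded[column]
    return df[~df[column].isin(excluded)]


df_excl = pd.DataFrame({"id": ["12345"]})
df = pd.DataFrame({"id": ["12345", "67890"]})

# Both calls should now keep only the row that is not in the exclusion list.
result_from_frame = exclude_ids(df, df_excl[["id"]])
result_from_series = exclude_ids(df, df_excl["id"])
print(result_from_frame)
print(result_from_series)
```

This keeps the [[]] versus [] distinction in one place, so callers can't reintroduce the bug by accident.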