Click here to Skip to main content
15,881,281 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
So we have a Pandas DataFrame with certain values at certain times.

For example:


Quote:
@ts @value Glucose Diff smooth_diff new P N C1 C2

135 2021-10-29 11:16:00 167 167.0 -3.0 15.45 15.45 17.95 17.45 NaN 0.0
155 2021-10-29 12:56:00 162 162.0 -15.0 15.35 15.35 17.95 16.00 NaN 0.0
243 2021-10-29 20:16:00 133 133.0 0.0 15.25 15.25 19.85 15.75 NaN 0.0
245 2021-10-29 20:26:00 134 134.0 0.0 15.50 15.50 15.75 15.60 NaN 0.0
113 2021-10-29 09:26:00 130 130.0 1.0 16.75 16.75 0.00 21.70 NaN NaN


Quote:
Now we want to drop the rows that are in an 1 hour time interval (the @ts column) of each other (so in this example we want to drop the row at 2021-10-29 20:26:00 as it is within an 1 hour time span of the previous one), but we can't seem to figure out a way to do this.

Any help?


What I have tried:

We can't figure it out as we don't know how to work with TimeStamp in pandas.
Posted
Updated 4-Jun-21 4:28am
Comments
Richard MacCutchan 4-Jun-21 10:26am    
You do not say where this data is coming from, but it mat be easier to remove the unwanted data as you read it.

1 solution

The information you want is at pandas.Timestamp — pandas 1.2.4 documentation[^].

[edit]
This is not perfect, but shows how to get the timedelta value between two items in the table, and remove the offending rows:
Python
# requires 
from datetime import datetime
from datetime import date
from datetime import time
from datetime import timedelta

# sample data taken from above question
data = ('135 2021-10-29 10:16:00 167 167.0 -3.0 15.45 15.45 17.95 17.45\n'
        '155 2021-10-29 12:56:00 162 162.0 -15.0 15.35 15.35 17.95 16.00\n'
        '243 2021-10-29 20:16:00 133 133.0 0.0 15.25 15.25 19.85 15.75\n'
        '245 2021-10-29 20:26:00 134 134.0 0.0 15.50 15.50 15.75 15.60\n'
        '113 2021-10-29 09:26:00 130 130.0 1.0 16.75 16.75 0.00 21.70')
rawtable = pd.read_csv(StringIO(data), header=None, delimiter=' ')

# sort the table on the time values. NB modifiy this to include the dates
table = rawtable.sort_values(2, 0, key=lambda col: col.str.lower(), ignore_index=True)

tmin = timedelta(0, 0, 0, 0, 60) # the differences must be at least 60 minutes
tprev = datetime.combine(datetime.min, time(0, 0))

# iterate the rows of data to find any deltas less than the minimum
for index, row in table.iterrows():
    s = row.iloc[1:3]
    dt = date.fromisoformat(s[1])    # capture the date field
    td = time.fromisoformat(s[2])    # and the time
    dttd = datetime.combine(dt, td)  # create the DateTime value
    # print(index, dttd)
    delta = dttd - tprev             # calculate the difference between this and the previous row
    if delta < tmin:  # if the interval is too small
        # print('index:', index, 'delta:', delta)
        table.drop(index, inplace=True) # drop this row from the table
    tprev = dttd  # save the current time as the new previous value
print("\n\n")
print(table)  # display the modified table


[/edit]
 
Share this answer
 
v3
Comments
Sybren Postma 4-Jun-21 10:32am    
Thank you very much for your response Richard, do you by any change know how we can compute calculations on TimeStamps? For example if we want to add an hour to an existing TimeStamp how would we do this?
Richard MacCutchan 4-Jun-21 12:05pm    
Sorry I am new to pandas. But I would guess that you need to convert the time strings from the data into TimeStamps and then use pandas.Timedelta — pandas 1.2.4 documentation[^] to calculate the differences.
Richard MacCutchan 5-Jun-21 7:19am    
See my updated answer.
Sybren Postma 15-Jun-21 7:50am    
This solved our problem thank you so much! We'll buy you pizza :)
Richard MacCutchan 15-Jun-21 7:53am    
Thanks, but I am happy just to learn something about pandas.

BTW could you click the "Accept answer" link to show the problem is solved?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900