Linking Filter Validation

[1]:
import pandas as pd
import numpy as np
from sorcha.modules.PPLinkingFilter import PPLinkingFilter

PPLinkingFilter aims to mimic how the Solar System Processing (SSP) pipeline links objects. More information can be found here. With the SSP defaults, an object is linked only if it meets all of the following criteria:

  • At least 2 observations in a night to constitute a valid tracklet.

  • These observations must have an angular separation of at least 0.5 arcseconds in order to be recognised as separate.

  • Consecutive observations in a tracklet must be no more than 90 minutes (0.0625 days) apart.

  • At least 3 tracklets must be observed to form a valid track.

  • These tracklets must all be observed within a window of less than 15 days.

We also expect only 95% of the objects that meet these criteria to be linked (the detection efficiency). For now, we will set this parameter to 100% so that the other parameters can be tested in isolation.

These six parameters can be changed in the config file and are found in the [LINKINGFILTER] section.
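To make the criteria above concrete, here is a minimal standalone sketch of the deterministic rules in pure Python. It is not the Sorcha implementation: the helper names (angular_separation_arcsec, would_link), the way observations are grouped into nights, and the restriction to checking consecutive pairs of observations are all assumptions made for illustration.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation between two sky positions (inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


def would_link(obs, min_observations=2, min_angular_separation=0.5,
               max_time_separation=0.0625, min_tracklets=3,
               max_window=15, night_start_utc=17.0):
    """Hypothetical helper. obs is a list of (mjd, ra_deg, dec_deg) for one object."""
    # ASSUMPTION: a "night" is the integer day number obtained after shifting
    # the MJD by the UTC hour at which the night starts.
    by_night = {}
    for mjd, ra, dec in sorted(obs):
        night = math.floor(mjd - night_start_utc / 24.0)
        by_night.setdefault(night, []).append((mjd, ra, dec))

    tracklet_nights = []
    for night, night_obs in sorted(by_night.items()):
        if len(night_obs) < min_observations:
            continue
        # ASSUMPTION: a night yields a tracklet if some consecutive pair is
        # close enough in time and far enough apart on the sky.
        for (t1, ra1, dec1), (t2, ra2, dec2) in zip(night_obs, night_obs[1:]):
            close_in_time = (t2 - t1) <= max_time_separation
            separated = (angular_separation_arcsec(ra1, dec1, ra2, dec2)
                         >= min_angular_separation)
            if close_in_time and separated:
                tracklet_nights.append(night)
                break

    # Look for min_tracklets tracklet nights spanning less than max_window days.
    for i in range(len(tracklet_nights) - min_tracklets + 1):
        if tracklet_nights[i + min_tracklets - 1] - tracklet_nights[i] < max_window:
            return True
    return False


example = [(60000.03, 142.0, 8.0), (60000.06, 142.1, 8.1),
           (60005.03, 143.0, 9.0), (60005.06, 143.1, 9.1),
           (60008.03, 144.0, 10.0), (60008.06, 144.1, 10.1)]
print(would_link(example))      # three tracklets within 15 days
print(would_link(example[1:]))  # first night has only one observation
```

The first call prints True and the second prints False. The sketch ignores the detection efficiency, which is applied on top of these rules.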

[2]:
min_observations = 2
min_angular_separation = 0.5
max_time_separation = 0.0625
min_tracklets = 3
min_tracklet_window = 15
detection_efficiency = 1
night_start_utc = 17.0
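The time separation is given in days and the angular separation in arcseconds. As a quick check of the units, the snippet below converts the 90-minute limit to days and the 0.5-arcsecond threshold to degrees. The variable names are illustrative only and are not used by PPLinkingFilter.

```python
# Convert the 90-minute tracklet limit into days.
max_time_separation_minutes = 90
max_time_separation_days = max_time_separation_minutes / (24 * 60)
print(max_time_separation_days)  # 0.0625

# Convert the 0.5-arcsecond threshold into degrees, the unit of RA_deg and Dec_deg.
min_angular_separation_arcsec = 0.5
min_angular_separation_deg = min_angular_separation_arcsec / 3600
print(min_angular_separation_deg)
```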

Let’s create an object that should definitely be linked according to these parameters.

[3]:
obj_id = ["pretend_object"] * 6
field_id = np.arange(1, 7)
times = [60000.03, 60000.06, 60005.03, 60005.06, 60008.03, 60008.06]
ra = [142, 142.1, 143, 143.1, 144, 144.1]
dec = [8, 8.1, 9, 9.1, 10, 10.1]
[4]:
observations = pd.DataFrame(
    {
        "ObjID": obj_id,
        "FieldID": field_id,
        "fieldMJD_TAI": times,
        "RA_deg": ra,
        "Dec_deg": dec
    }
)
[5]:
observations
[5]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg
0 pretend_object 1 60000.03 142.0 8.0
1 pretend_object 2 60000.06 142.1 8.1
2 pretend_object 3 60005.03 143.0 9.0
3 pretend_object 4 60005.06 143.1 9.1
4 pretend_object 5 60008.03 144.0 10.0
5 pretend_object 6 60008.06 144.1 10.1

Now let’s run the linking filter. As this object should be linked, we should receive the same dataframe back.

[6]:
linked_observations = PPLinkingFilter(observations, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)
[7]:
linked_observations
[7]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked date_linked_MJD
0 pretend_object 1 60000.03 142.0 8.0 True 60007.0
1 pretend_object 2 60000.06 142.1 8.1 True 60007.0
2 pretend_object 3 60005.03 143.0 9.0 True 60007.0
3 pretend_object 4 60005.06 143.1 9.1 True 60007.0
4 pretend_object 5 60008.03 144.0 10.0 True 60007.0
5 pretend_object 6 60008.06 144.1 10.1 True 60007.0

Success! The object was linked. Now let’s modify this dataframe to test each criterion in turn. First, let’s remove the first observation, so that we only have two complete tracklets.

[8]:
observations_two_tracklets = observations.iloc[1:].copy()
[9]:
observations_two_tracklets
[9]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked
1 pretend_object 2 60000.06 142.1 8.1 True
2 pretend_object 3 60005.03 143.0 9.0 True
3 pretend_object 4 60005.06 143.1 9.1 True
4 pretend_object 5 60008.03 144.0 10.0 True
5 pretend_object 6 60008.06 144.1 10.1 True
[10]:
unlinked_observations = PPLinkingFilter(observations_two_tracklets, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)
[11]:
unlinked_observations
[11]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked date_linked_MJD

As expected, we no longer link the object. Now let’s try putting the last two observations outside of the 15-day window.

[12]:
observations_large_window = observations.copy()
observations_large_window['fieldMJD_TAI'] = [60000.03, 60000.06, 60005.03, 60005.06, 60016.03, 60016.06]
[13]:
unlinked_observations = PPLinkingFilter(observations_large_window, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)
[14]:
unlinked_observations
[14]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked date_linked_MJD

Once again, we no longer link the object. What if we move the first two observations much closer to each other so that they no longer form a valid tracklet?

[15]:
observations_small_sep = observations.copy()
observations_small_sep["RA_deg"] = [142, 142.00001, 143, 143.1, 144, 144.1]
observations_small_sep["Dec_deg"] = [8, 8.00001, 9, 9.1, 10, 10.1]
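
As an independent check of the new coordinates, we can estimate the separation of the first two observations before and after the change. This sketch uses the haversine formula in pure Python; the helper name is hypothetical and the calculation is not necessarily the one PPLinkingFilter performs internally.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation between two sky positions (inputs in degrees)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


original_sep = angular_separation_arcsec(142, 8, 142.1, 8.1)
moved_sep = angular_separation_arcsec(142, 8, 142.00001, 8.00001)
print(f"original: {original_sep:.1f} arcsec, moved: {moved_sep:.3f} arcsec")
```

The original pair is separated by roughly 500 arcseconds, while the moved pair is separated by about 0.05 arcseconds, well below the 0.5-arcsecond threshold.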
[16]:
unlinked_observations = PPLinkingFilter(observations_small_sep, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)
[17]:
unlinked_observations
[17]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked date_linked_MJD

And the object is no longer linked. Next, let’s move the first two observations much further apart in time so that, once again, they no longer form a valid tracklet.

[18]:
observations_large_time = observations.copy()
observations_large_time["fieldMJD_TAI"] = [60000.03, 60000.10, 60005.03, 60005.06, 60008.03, 60008.06]
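
A quick bit of arithmetic confirms that the new gap exceeds the 90-minute limit. The variable names here are illustrative only.

```python
MINUTES_PER_DAY = 24 * 60

# Gap between the first two observations, before and after the change.
original_gap_minutes = (60000.06 - 60000.03) * MINUTES_PER_DAY
moved_gap_minutes = (60000.10 - 60000.03) * MINUTES_PER_DAY

print(f"original: {original_gap_minutes:.1f} min, moved: {moved_gap_minutes:.1f} min")
# original: 43.2 min, moved: 100.8 min
```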
[19]:
unlinked_observations = PPLinkingFilter(observations_large_time, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)
[20]:
unlinked_observations
[20]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked date_linked_MJD

And as expected, we no longer link the object.

Finally, let’s check that the detection efficiency works as expected. Let’s set it to 0.95.

[21]:
detection_efficiency = 0.95

Now let’s make a dataframe of the same linked object repeated 10000 times.

[22]:
objs = [["pretend_object_" + str(a)] * 6 for a in range(0, 10000)]
obj_id_long = [item for sublist in objs for item in sublist]
field_id_long = list(np.arange(1, 7)) * 10000
times_long = [60000.03, 60000.06, 60005.03, 60005.06, 60008.03, 60008.06] * 10000
ra_long = [142, 142.1, 143, 143.1, 144, 144.1] * 10000
dec_long = [8, 8.1, 9, 9.1, 10, 10.1] * 10000
[23]:
observations_long = pd.DataFrame(
    {
        "ObjID": obj_id_long,
        "FieldID": field_id_long,
        "fieldMJD_TAI": times_long,
        "RA_deg": ra_long,
        "Dec_deg": dec_long
    }
)
[24]:
observations_long
[24]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg
0 pretend_object_0 1 60000.03 142.0 8.0
1 pretend_object_0 2 60000.06 142.1 8.1
2 pretend_object_0 3 60005.03 143.0 9.0
3 pretend_object_0 4 60005.06 143.1 9.1
4 pretend_object_0 5 60008.03 144.0 10.0
... ... ... ... ... ...
59995 pretend_object_9999 2 60000.06 142.1 8.1
59996 pretend_object_9999 3 60005.03 143.0 9.0
59997 pretend_object_9999 4 60005.06 143.1 9.1
59998 pretend_object_9999 5 60008.03 144.0 10.0
59999 pretend_object_9999 6 60008.06 144.1 10.1

60000 rows × 5 columns

If the detection efficiency were perfect, all of these objects would be linked. However, we have set the detection efficiency to 0.95, so we should expect the linking filter to return roughly 95% of these objects. Let’s find out.

[25]:
long_linked_observations = PPLinkingFilter(observations_long, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)
[26]:
long_linked_observations
[26]:
ObjID FieldID fieldMJD_TAI RA_deg Dec_deg object_linked date_linked_MJD
0 pretend_object_0 1 60000.03 142.0 8.0 True 60007.0
1 pretend_object_1624 1 60000.03 142.0 8.0 True 60007.0
2 pretend_object_5206 1 60000.03 142.0 8.0 True 60007.0
3 pretend_object_5205 1 60000.03 142.0 8.0 True 60007.0
4 pretend_object_1625 1 60000.03 142.0 8.0 True 60007.0
... ... ... ... ... ... ... ...
59995 pretend_object_5720 6 60008.06 144.1 10.1 True 60007.0
59996 pretend_object_5721 6 60008.06 144.1 10.1 True 60007.0
59997 pretend_object_5722 6 60008.06 144.1 10.1 True 60007.0
59998 pretend_object_5708 6 60008.06 144.1 10.1 True 60007.0
59999 pretend_object_9999 6 60008.06 144.1 10.1 True 60007.0

60000 rows × 7 columns

[27]:
len(long_linked_observations["ObjID"].unique())/10000
[27]:
1.0

The returned fraction should be close to 0.95. The detection efficiency is stochastic, so some variation is to be expected.
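The size of that variation can be estimated with a short sketch. It assumes, purely for illustration, that each object is kept independently with probability equal to the detection efficiency, so that the number of linked objects follows a binomial distribution. The random draw below is a stand-in for the filter and does not call PPLinkingFilter.

```python
import math
import random

n_objects = 10000
detection_efficiency = 0.95

# Standard deviation of the linked fraction under the binomial assumption.
sigma_fraction = math.sqrt(detection_efficiency * (1 - detection_efficiency) / n_objects)
print(f"expected fraction: {detection_efficiency:.3f} +/- {sigma_fraction:.4f}")
# expected fraction: 0.950 +/- 0.0022

# Simulate one run: keep each object with probability detection_efficiency.
rng = random.Random(42)
n_linked = sum(rng.random() < detection_efficiency for _ in range(n_objects))
simulated_fraction = n_linked / n_objects
print(simulated_fraction)
```

Under this assumption, a run with 10000 objects would typically land within a few tenths of a percent of 0.95.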