INFO:root:Generating CSV.
INFO:root:Created data/interim/gh114/gh114.csv.
INFO:root:Pre-computing distances.
INFO:root:Generating 3 partitions.
INFO:root:Partition not possible at threshold 0.3. Less than 25 % of sequences found in a split:
INFO:root:- 66.67 % (36/54) in split 1.
INFO:root:- 31.48 % (17/54) in split 2.
INFO:root:- 1.85 % (1/54) in split 3.
INFO:root:Increasing threshold from 0.3 to 0.35 to achieve balance.
INFO:root:Partition not possible at threshold 0.35. Less than 25 % of sequences found in a split:
INFO:root:- 65.45 % (36/55) in split 1.
INFO:root:- 23.64 % (13/55) in split 2.
INFO:root:- 10.91 % (6/55) in split 3.
INFO:root:Increasing threshold from 0.35 to 0.4 to achieve balance.
INFO:root:Partition not possible at threshold 0.4. Less than 25 % of sequences found in a split:
INFO:root:- 67.27 % (37/55) in split 1.
INFO:root:- 20.00 % (11/55) in split 2.
INFO:root:- 12.73 % (7/55) in split 3.
INFO:root:Increasing threshold from 0.4 to 0.45 to achieve balance.
INFO:root:Partition not possible at threshold 0.45. Less than 25 % of sequences found in a split:
INFO:root:- 20.93 % (9/43) in split 1.
INFO:root:- 25.58 % (11/43) in split 2.
INFO:root:- 53.49 % (23/43) in split 3.
INFO:root:Increasing threshold from 0.45 to 0.5 to achieve balance.
INFO:root:Partition not possible at threshold 0.5. Less than 25 % of sequences found in a split:
INFO:root:- 28.30 % (15/53) in split 1.
INFO:root:- 18.87 % (10/53) in split 2.
INFO:root:- 52.83 % (28/53) in split 3.
INFO:root:Increasing threshold from 0.5 to 0.55 to achieve balance.
INFO:root:Dataset successfully split at threshold 0.55.
INFO:root:Summary:
INFO:root:- 36.36 % (20/55) in split 1, with p(class=1) = 50.00 %.
INFO:root:- 32.73 % (18/55) in split 2, with p(class=1) = 44.44 %.
INFO:root:- 30.91 % (17/55) in split 3, with p(class=1) = 35.29 %.
INFO:root:Final dataset size: 55.
INFO:root:File saved in data/processed/gh114/gh114.csv