INFO:root:Generating CSV.
INFO:root:Created data/interim/ppat/ppat.csv.
INFO:root:Pre-computing distances.
INFO:root:Generating 3 partitions.
INFO:root:Partition not possible at threshold 0.5. Less than 25 % of sequences found in a split:
INFO:root:- 1.82 % (11/603) in split 1.
INFO:root:- 19.90 % (120/603) in split 2.
INFO:root:- 78.28 % (472/603) in split 3.
INFO:root:Increasing threshold from 0.5 to 0.525 to achieve balance.
INFO:root:Partition not possible at threshold 0.525. Less than 25 % of sequences found in a split:
INFO:root:- 17.58 % (103/586) in split 1.
INFO:root:- 29.18 % (171/586) in split 2.
INFO:root:- 53.24 % (312/586) in split 3.
INFO:root:Increasing threshold from 0.525 to 0.55 to achieve balance.
INFO:root:Dataset successfully split at threshold 0.55.
INFO:root:Summary:
INFO:root:- 29.59 % (182/615) in split 1, with p(class=1) = 68.68 %.
INFO:root:- 38.05 % (234/615) in split 2, with p(class=1) = 68.38 %.
INFO:root:- 32.36 % (199/615) in split 3, with p(class=1) = 71.86 %.
INFO:root:Final dataset size: 615.
INFO:root:File saved in data/processed/ppat/ppat.csv