Challenge:
The file Challenge.txt contains a list of itineraries. Each row (or itinerary) contains a list of destinations separated by the character ";".
SUPER-HARD: find the 10 most frequent subsets of destinations across all itineraries (regardless of subset size, ignoring subsets of size 1) and give the number of times each subset appears in the dataset.
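As a starting point, reading the rows could be sketched as below. This is only an illustration: the function name parse_itineraries is hypothetical, and skipping blank rows and empty fields (for example from a trailing ";") is an assumption the challenge does not state.

```python
from typing import List

def parse_itineraries(text: str, sep: str = ";") -> List[List[str]]:
    """Split each non-empty row into its list of destinations."""
    itineraries = []
    for row in text.splitlines():
        row = row.strip()
        if not row:
            continue  # assumption: blank rows are not itineraries
        # Assumption: empty fields (e.g. from a trailing separator) are dropped.
        stops = [d.strip() for d in row.split(sep) if d.strip()]
        itineraries.append(stops)
    return itineraries

# Inline sample instead of the real file, whose contents are not shown here.
sample = "A;B;C\nA;B\n\nB;C;D;\n"
print(parse_itineraries(sample))
```

For the real data, the text would come from something like open("Challenge.txt", encoding="utf-8").read(); the encoding is a guess.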
Example of expected output (each letter is a destination):
1. AB: 1000
2. ACD: 500
3. ABCD: 100
4. AFD: 50
5. AR: 20
6. ABCDEFG: 5
7. AFGP: 4
8. AT: 3
9. BN: 2
10. GT: 2
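One brute-force way to approach the SUPER-HARD level is sketched below. It rests on assumptions the challenge leaves open: a "subset" is read as an unordered set of distinct destinations, each itinerary contributes at most once to a given subset, and ties are broken arbitrarily. The name top_subsets and the max_size cap are hypothetical additions. An itinerary with m distinct destinations has about 2^m subsets, so the cap may be needed for long rows.

```python
from collections import Counter
from itertools import combinations

def top_subsets(itineraries, k=10, min_size=2, max_size=None):
    """Count every subset of size >= min_size and return the k most frequent."""
    counts = Counter()
    for stops in itineraries:
        # Treat the itinerary as a set; sorting gives one canonical key per subset.
        unique = sorted(set(stops))
        upper = len(unique) if max_size is None else min(max_size, len(unique))
        for size in range(min_size, upper + 1):
            counts.update(combinations(unique, size))
    return counts.most_common(k)

rows = [["A", "B", "C"], ["A", "B"], ["B", "A", "D"]]
for rank, (subset, count) in enumerate(top_subsets(rows, k=3), start=1):
    print(f"{rank}. {''.join(subset)}: {count}")
```

In this toy input the pair AB appears in all three rows, so it ranks first with a count of 3; the remaining subsets each appear once.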
HARD: find the 10 most frequent size-n subsets of destinations across all itineraries and give the number of times each subset appears in the dataset.
Example of expected output (each letter is a destination), given n = 3:
1. ABC: 1000
2. ACD: 500
3. ADE: 100
4. AFD: 50
5. ARE: 20
6. EFG: 5
7. AFP: 4
8. ATR: 3
9. BNK: 2
10. KGT: 2
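For the HARD level, fixing the subset size keeps the work per itinerary polynomial: C(m, n) combinations for m distinct destinations. The sketch below makes the same assumptions as before (unordered subsets, one count per itinerary, arbitrary tie order), and top_size_n_subsets is a hypothetical name.

```python
from collections import Counter
from itertools import combinations

def top_size_n_subsets(itineraries, n, k=10):
    """Count subsets of exactly n distinct destinations; return the k most frequent."""
    counts = Counter()
    for stops in itineraries:
        unique = sorted(set(stops))  # canonical order so ABC and CBA share one key
        # combinations() yields nothing when the itinerary has fewer than n stops.
        counts.update(combinations(unique, n))
    return counts.most_common(k)

rows = [["A", "B", "C", "D"], ["A", "B", "C"], ["C", "B", "A", "E"], ["A", "B"]]
print(top_size_n_subsets(rows, n=3, k=1))
```

Here ABC is contained in the first three rows, while the last row is too short to contribute any size-3 subset.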
HARD -: find the 10 most frequent size-3 subsets of destinations across all itineraries and give the number of times each subset appears in the dataset.
Example of expected output (each letter is a destination). Note: n is fixed to 3.
1. ABC: 1000
2. ACD: 500
3. ADE: 100
4. AFD: 50
5. ARE: 20
6. EFG: 5
7. AFP: 4
8. ATR: 3
9. BNK: 2
10. KGT: 2
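Printing a ranking in the pattern of the examples ("rank. subset: count") could look like the sketch below. The function format_ranking and its joiner parameter are hypothetical; concatenating names with no separator only reads well if destinations are single letters as in the examples, so a different joiner may be preferable for real destination names.

```python
def format_ranking(ranked, joiner=""):
    """Turn [(subset_tuple, count), ...] into lines like '1. ABC: 1000'."""
    lines = []
    for rank, (subset, count) in enumerate(ranked, start=1):
        lines.append(f"{rank}. {joiner.join(subset)}: {count}")
    return lines

# Counts taken from the first two rows of the example output.
ranked = [(("A", "B", "C"), 1000), (("A", "C", "D"), 500)]
print("\n".join(format_ranking(ranked)))
```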