10.2 Finding the Longest Protein
This was another problem that many people seemed to have trouble with, so I will also present two approaches for this question:
10.2.1 Approach #1: a linear search
Here, we declare a variable longest that holds the sublist with the longest protein. I start by assuming, arbitrarily, that the first sublist of many_lists_copy has the longest protein; I then use a for loop to iterate through many_lists_copy and update longest whenever a sublist with a longer protein turns up.
longest = many_lists_copy[0]
for i in many_lists_copy:
    if len(i[-1]) > len(longest[-1]):
        longest = i
print(f"The longest protein has accession {longest[0]} and length {len(longest[-1])}")
Note that many_lists_copy’s sublists also have the same structure as the sublists in many_lists!
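To try the linear search without the original data, the sketch below runs it on a small made-up list. The name toy_lists, the accessions, and the sequences are all hypothetical, and the layout (accession first, sequence last) is assumed from how longest[0] and longest[-1] are used above.

```python
# Hypothetical stand-in for many_lists_copy: each sublist is assumed to hold
# an accession first and the protein sequence last.
toy_lists = [
    ["ACC001", "MKT"],
    ["ACC002", "MKTAYIAKQR"],
    ["ACC003", "MKTAYI"],
]

# Start by assuming the first sublist holds the longest protein.
longest = toy_lists[0]
for sublist in toy_lists:
    # Replace the current best only when a strictly longer sequence appears.
    if len(sublist[-1]) > len(longest[-1]):
        longest = sublist

print(f"The longest protein has accession {longest[0]} and length {len(longest[-1])}")
```

Because the comparison is strict (>), a tie in length keeps whichever sublist was seen first.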
10.2.2 Approach #2: Using sorted()
The sorted()
function returns the sorted version of a list. Depending on what you want to do with the accession IDs and the associated protein lengths, the sorted()
function may be ideal.
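As a quick illustration of that behaviour, here is a minimal sketch using a made-up list of lengths (the values are hypothetical and not taken from the exercise data):

```python
# Hypothetical sequence lengths, used only to show how sorted() behaves.
lengths = [120, 45, 300]

ascending = sorted(lengths)                 # smallest to largest
descending = sorted(lengths, reverse=True)  # largest to smallest

print(ascending)
print(descending)
print(lengths)  # the original list is left unchanged
```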
Here, I use an anonymous (lambda) function as the key to sort the sublists of many_lists_copy by protein length, in descending order (specified by the reverse = True argument):
longest = sorted(many_lists_copy, key = lambda x : len(x[-1]), reverse = True)[0]
print(f"The longest protein has accession {longest[0]} and length {len(longest[-1])}")
The [0] at the end picks the first sublist returned by the sorted() function, which is the one with the longest protein!
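If only the single longest protein is needed, Python's built-in max() accepts the same kind of key argument and avoids sorting the whole list. This is an alternative to the approach above, not part of it. The sketch below compares the two on hypothetical data; toy_lists and its contents are made up for illustration and assume the accession-first, sequence-last layout.

```python
# Hypothetical stand-in for many_lists_copy (accession first, sequence last).
toy_lists = [
    ["ACC001", "MKT"],
    ["ACC002", "MKTAYIAKQR"],
    ["ACC003", "MKTAYI"],
]

# Approach #2: sort by sequence length, longest first, and take the first item.
longest_sorted = sorted(toy_lists, key=lambda x: len(x[-1]), reverse=True)[0]

# Alternative: max() with the same key returns the longest sublist directly.
longest_max = max(toy_lists, key=lambda x: len(x[-1]))

print(longest_sorted[0], len(longest_sorted[-1]))
print(longest_max[0], len(longest_max[-1]))
```

Sorting remains the better fit when the full ranking is wanted, for example to list the top few proteins rather than just the longest one.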