Task: Binary Classification. Given two protein amino acid sequences, predict whether they physically interact or not. Dataset: Human Reference Interactome with 51,813 positive PPI pairs across 8,248 proteins, plus an equal number of experimentally-validated negative pairs.
(1) Protein 1 (ENSG00000127507) has sequence MGGRVFLVFLAFCVWLTLPGAETQDSRGCARWCPQDSSCVNATACRCNPGFSSFSEIITTPMETCDDINECATLSKVSCGKFSDCWNTEGSYDCVCSPGYEPVSGAKTFKNESENTCQDVDECQQNPRLCKSYGTCVNTLGSYTCQCLPGFKLKPEDPKLCTDVNECTSGQNPCHSSTHCLNNVGSYQCRCRPGWQPIPGSPNGPNNTVCEDVDECSSGQHQCDSSTVCFNTVGSYSCRCRPGWKPRHGIPNNQKDTVCEDMTFSTWTPPPGVHSQTLSRFFDKVQDLGRDYKPGLANNT.... Protein 2 (ENSG00000141401) has sequence MKPSGEDQAALAAGPWEECFQAAVQLALRAGQIIRKALTEEKRVSTKTSAADLVTETDHLVEDLIISELRERFPSHRFIAEEAAASGAKCVLTHSPTWIIDPIDGTCNFVHRFPTVAVSIGFAVRQELEFGVIYHCTEERLYTGRRGRGAFCNGQRLRVSGETDLSKALVLTEIGPKRDPATLKLFLSNMERLLHAKAHGVRVIGSSTLALCHLASGAADAYYQFGLHCWDLAAATVIIREAGGIVIDTSGGPLDLMACRVVAASTREMAMLIAQALQTINYGRDDEK*MKPSGEDQAAL.... Result: 0 (the proteins do not interact).
(2) Protein 2 (ENSG00000142494) has sequence MEAPEEPAPVRGGPEATLEVRGSRCLRLSAFREELRALLVLAGPAFLVQLMVFLISFISSVFCGHLGKLELDAVTLAIAVINVTGVSVGFGLSSACDTLISQTYGSQNLKHVGVILQRSALVLLLCCFPCWALFLNTQHILLLFRQDPDVSRLTQTYVTIFIPALPATFLYMLQVKYLLNQGIVLPQIVTGVAANLVNALANYLFLHQLHLGVIGSALANLISQYTLALLLFLYILGKKLHQATWGGWSLECLQDWASFLRLAIPSMLMLCMEWWAYEVGSFLSGILGMVELGAQSIVYE.... Result: 0 (the proteins do not interact). Protein 1 (ENSG00000170925) has sequence MALRPEDPSSGFRHGNVVAFIIEKMARHTKGPEFYFENISLSWEEVEDKLRAILEDSEVPSEVKEACTWGSLALGVRFAHRQGQLQNRRVQWLQGFAKLHRSAALVLASNLTELKEQQEMECNEATFQLQLTETSLAEVQRERDMLRWKLFHAELAPPQGQGQATVFPGLATAGGDWTEGAGEQEKEAVAAAGAAGGKGEERYAEAGPAPAEVLQGLGGGFRQPLGAIVAGKLHLCGAEGERSQVSTNSHVCLLWAWVHSLTGASSCPAPYLIHILIPMPFVRLLSHTQYTPFTSKGHRT....
(3) Protein 1 (ENSG00000160862) has sequence MVRMVPVLLSLLLLLGPAVPQENQDGRYSLTYIYTGLSKHVEDVPAFQALGSLNDLQFFRYNSKDRKSQPMGLWRQVEGMEDWKQDSQLQKAREDIFMETLKDIVEYYNDSNGSHVLQGRFGCEIENNRSSGAFWKYYYDGKDYIEFNKEIPAWVPFDPAAQITKQKWEAEPVYVQRAKAYLEEECPATLRKYLKYSKNILDRQDPPSVVVTSHQAPGEKKKLKCLAYDFYPGKIDVHWTRAGEVQEPELRGDVLHNGNGTYQSWVVVAVPPQDTAPYSCHVQHSSLAQPLVVPWEAS*X.... Protein 2 (ENSG00000137770) has sequence MRLRTRKASQQSNQIQTQRTARAKRKYSEVDDSLPSGGEKPSKNETGLLSSIKKFIKGSTPKEERENPSKRSRIERDIDNNLITSTPRAGEKPNKQISRVRRKSQVNGEAGSYEMTNQHVKQNGKLEDNPSSGSPPRTTLLGTIFSPVFNFFSPANKNGTSGSDSPGQAVEAEEIVKQLDMEQVDEITTSTTTSTNGAAYSNQAVQVRPSLNNGLEEAEETVNRDIPPLTAPVTPDSGYSSAHAEATYEEDWEVFDPYYFIKHVPPLTEEQLNRKPALPLKTRSTPEFSLVLDLDETLVH.... Result: 0 (the proteins do not interact).
(4) Protein 2 (ENSG00000196584) has sequence MCSAFHRAESGTELLARLEGRSSLKEIEPNLFADEDSPVHGDILEFHGPEGTGKTEMLYHLTARCILPKSEGGLEVEVLFIDTDYHFDMLRLVTILEHRLSQSSEEIIKYCLGRFFLVYCSSSTHLLLTLYSLESMFCSHPSLCLLILDSLSAFYWIDRVNGGESVNLQESTLRKCSQCLEKLVNDYRLVLFATTQTIMQKASSSSEEPSHASRRLCDVDIDYRPYLCKAWQQLVKHRMFFSKQDDSQSSNQFSLVSRCLKSNSLKKHFFIIGESGVEFC*. Result: 0 (the proteins do not interact). Protein 1 (ENSG00000139921) has sequence MAPSGSLAVPLAVLVLLLWGAPWTHGRRSNVRVITDENWRELLEGDWMIEFYAPWCPACQNLQPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHCKDGEFRRYQGPRTKKDFINFISDKEWKSIEPVSSWFGPGSVLMSSMSALFQLSMWIRTCHNYFIEDLGLPVWGSYTVFALATLFSGLLLGLCMIFVADCLCPSKRRRPQPYPYPSKKLLSESAQPLKKVEEEQEADEEDVSEEEAESKEGTNKDFPQNAIRQRSLGPSLATDKS*MAPSGSLAVPLAVLVLLLW....
(5) Protein 1 (ENSG00000163817) has sequence MEKARPLWANSLQFVFACISYAVGLGNVWRFPYLCQMYGGGSFLVPYIIMLIVEGMPLLYLELAVGQRMRQGSIGAWRTISPYLSGVGVASVVVSFFLSMYYNVINAWAFWYLFHSFQDPLPWSVCPLNGNHTGYDEECEKASSTQYFWYRKTLNISPSLQENGGVQWEPALCLLLAWLVVYLCILRGTESTGKIEQLANPKAWINAATQIFFSLGLGFGSLIAFASYNEPSNNCQKHAIIVSLINSFTSIFASIVTFSIYGFKATFNYENCLKKVSLLLTNTFDLEDGFLTASNLEQVK.... Protein 2 (ENSG00000169583) has sequence MAETKLQLFVKASEDGESVGHCPSCQRLFMVLLLKGVPFTLTTVDTRRSPDVLKDFAPGSQLPILLYDSDAKTDTLQIEDFLEETLGPPDFPSLAPRYRESNTAGNDVFHKFSAFIKNPVPAQDEALYQQLLRALARLDSYLRAPLEHELAGEPQLRESRRRFLDGDRLTLADCSLLPKLHIVDTVCAHFRQAPIPAELRGVRRYLDSAMQEKEFKYTCPHSAEILAAYRPAVHPR*. Result: 0 (the proteins do not interact).
(6) Protein 1 (ENSG00000080815) has sequence MTELPAPLSYFQNAQMSEDNHLSNTVRSQNDNRERQEHNDRRSLGHPEPLSNGRPQGNSRQVVEQDEEEDEELTLKYGAKHVIMLFVPVTLCMVVVVATIKSVSFYTRKDGQLIYTPFTEDTETVGQRALHSILNAAIMISVIVVMTILLVVLYKYRCYKVIHAWLIISSLLLLFFFSFIYLGEVFKTYNVAVDYITVALLIWNFGVVGMISIHWKGPLRLQQAYLIMISALMALVFIKYLPEWTAWLILAVISVYDLVAVLCPKGPLRMLVETAQERNETLFPALIYSSTMVWLVNMAE.... Protein 2 (ENSG00000165322) has sequence MKMADRSGKIIPGQVYIEVEYDYEYEAKDRKIVIKQGERYILVKKTNDDWWQVKPDENSKAFYVPAQYVKEVTRKALMPPVKQVAGLPNNSTKIMQSLHLQRSTENVNKLPELSSFGKPSSSVQGTGLIRDANQNFGPSYNQGQTVNLSLDLTHNNGKFNNDSHSPKVSSQNRTRSFGHFPGPEFLDVEKTSFSQEQSCDSAGEGSERIHQDSESGDELSSSSTEQIRATTPPNQGRPDSPVYANLQELKISQSALPPLPGSPAIQINGEWETHKDSSGRCYYYNRGTQERTWKPPRWTR.... Result: 0 (the proteins do not interact).
(7) Protein 1 (ENSG00000167769) has sequence MPSIFAYQSSEVDWCESNFQYSELVAEFYNTFSNIPFFIFGPLMMLLMHPYAQKRSRYIYVVWVLFMIIGLFSMYFHMTLSFLGQLLDEIAILWLLGSGYSIWMPRCYFPSFLGGNRSQFIRLVFITTVVSTLLSFLRPTVNAYALNSIALHILYIVCQEYRKTSNKELRHLIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYLHSIWHVLISITFPYGMVTMALVDANYEMPGETLKVRYWPRDSWPVGLPYVEIRGDDKDC*. Protein 2 (ENSG00000163611) has sequence MSFVRVNRCGPRVGVRKTPKVKKKKTSVKQEWDNTVTDLTVHRATPEDLVRRHEIHKSKNRALVHWELQEKALKRKWRKQKPETLNLEKRRLSIMKEILSDQYQMQDVLEKSDHLIAAAKELFPRRRTGFPNVTVAPDSSQGPIVVNQDPITQSIFNESVIEPQALNDVDGEEEGTVNSQSGESENENELDNSLNSQSNTNTDRFLQQLTEENFELISKLWTDIQQKIATQSQITPPGTPSSALSSGEQRAALNATNAVKRLQTRLQPEESTETLDSSYVVGHVLNSRKQKQLLNKVKRK.... Result: 0 (the proteins do not interact).
(8) Protein 1 (ENSG00000169398) has sequence MAAAYLDPNLNHTPNSSTKTHLGTGMERSPGAMERVLKVFHYFESNSEPTTWASIIRHGDATDVRGIIQKIVDSHKVKHVACYGFRLSHLRSEEVHWLHVDMGVSSVREKYELAHPPEEWKYELRIRYLPKGFLNQFTEDKPTLNFFYQQVKSDYMLEIADQVDQEIALKLGCLEIRRSYWEMRGNALEKKSNYEVLEKDVGLKRFFPKSLLDSVKAKTLRKLIQQTFRQFANLNREESILKFFEILSPVYRFDKECFKCALGSSWIISVELAIGPEEGISYLTDKGCNPTHLADFTQVQ.... Protein 2 (ENSG00000205744) has sequence MESRAEGGSPAVFDWFFEAACPASLQEDPPILRQFPPDFRDQEAMQMVPKFCFPFDVEREPPSPAVQHFTFALTDLAGNRRFGFCRLRAGTQSCLCILSHLPWFEVFYKLLNTVGDLLAQDQVTEAEELLQNLFQQSLSGPQASVGLELGSGVTVSSGQGIPPPTRGNSKPLSCFVAPDSGRLPSIPENRNLTELVVAVTDENIVGLFAALLAERRVLLTASKLSTLTSCVHASCALLYPMRWEHVLIPTLPPHLLDYCCAPMPYLIGVHASLAERVREKALEDVVVLNVDANTLETTFN.... Result: 0 (the proteins do not interact).