Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki (drug-target binding data from BindingDB using Ki measurements). (1) The compound is Nc1ccc(S(N)(=O)=O)cc1I. The target protein (P40881) has sequence MMFNKQIFTILILSLSLALAGSGCISEGAEDNVAQEITVDEFSNIRENPVTPWNPEPSAPVIDPTAYIDPQASVIGEVTIGANVMVSPMASIRSDEGMPIFVGDRSNVQDGVVLHALETINEEGEPIEDNIVEVDGKEYAVYIGNNVSLAHQSQVHGPAAVGDDTFIGMQAFVFKSKVGNNCVLEPRSAAIGVTIPDGRYIPAGMVVTSQAEADKLPEVTDDYAYSHTNEAVVYVNVHLAEGYKETS. The pKi is 6.2. (2) The small molecule is CC(/C=C/c1ccccc1-c1cc(C(C)C)cc(C(C)C)c1OCCCF)=C\C(=O)O. The target protein (P37230) has sequence MVDTESPICPLSPLEADDLESPLSEEFLQEMGNIQEISQSLGEESSGSFSFADYQYLGSCPGSEGSVITDTLSPASSPSSVSCPAVPTSTDESPGNALNIECRICGDKASGYHYGVHACEGCKGFFRRTIRLKLAYDKCDRSCKIQKKNRNKCQYCRFHKCLSVGMSHNAIRFGRMPRSEKAKLKAEILTCEHDLKDSETADLKSLAKRIHEAYLKNFNMNKVKARVILAGKTSNNPPFVIHDMETLCMAEKTLVAKMVANGVENKEAEVRFFHCCQCMSVETVTELTEFAKAIPGFANLDLNDQVTLLKYGVYEAIFTMLSSLMNKDGMLIAYGNGFITREFLKNLRKPFCDIMEPKFDFAMKFNALELDDSDISLFVAAIICCGDRPGLLNIGYIEKLQEGIVHVLKLHLQSNHPDDTFLFPKLLQKMVDLRQLVTEHAQLVQVIKKTESDAALHPLLQEIYRDMY. The pKi is 5.0. (3) The compound is C[C@@H]1NC[C@@H](O)[C@H](O)C1(F)F. The target protein (P48825) has sequence MKLSWLEAAALTAASVVSADELAFSPPFYPSPWANGQGEWAEAYQRAVAIVSQMTLDEKVNLTTGTGWELEKCVGQTGGVPRLNIGGMCLQDSPLGIRDSDYNSAFPAGVNVAATWDKNLAYLRGQAMGQEFSDKGIDVQLGPAAGPLGRSPDGGRNWEGFSPDPALTGVLFAETIKGIQDAGVVATAKHYILNEQEHFRQVAEAAGYGFNISDTISSNVDDKTIHEMYLWPFADAVRAGVGAIMCSYNQINNSYGCQNSYTLNKLLKAELGFQGFVMSDWGAHHSGVGSALAGLDMSMPGDITFDSATSFWGTNLTIAVLNGTVPQWRVDDMAVRIMAAYYKVGRDRLYQPPNFSSWTRDEYGFKYFYPQEGPYEKVNHFVNVQRNHSEVIRKLGADSTVLLKNNNALPLTGKERKVAILGEDAGSNSYGANGCSDRGCDNGTLAMAWGSGTAEFPYLVTPEQAIQAEVLKHKGSVYAITDNWALSQVETLAKQASVSL.... The pKi is 3.0. (4) The drug is C[N+](C)(C)CCOC(N)=O.
The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTTFCTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKi is 4.7. (5) The drug is CC(=O)NS(=O)(=O)[C@H]1C2CCC(CC2)[C@@H]1Nc1nc(-c2c[nH]c3ncc(F)cc23)ncc1F. The target protein (P53350) has sequence MSAAVTAGKLARAPADPGKAGVPGVAAPGAPAAAPPAKEIPEVLVDPRSRRRYVRGRFLGKGGFAKCFEISDADTKEVFAGKIVPKSLLLKPHQREKMSMEISIHRSLAHQHVVGFHGFFEDNDFVFVVLELCRRRSLLELHKRRKALTEPEARYYLRQIVLGCQYLHRNRVIHRDLKLGNLFLNEDLEVKIGDFGLATKVEYDGERKKTLCGTPNYIAPEVLSKKGHSFEVDVWSIGCIMYTLLVGKPPFETSCLKETYLRIKKNEYSIPKHINPVAASLIQKMLQTDPTARPTINELLNDEFFTSGYIPARLPITCLTIPPRFSIAPSSLDPSNRKPLTVLNKGLENPLPERPREKEEPVVRETGEVVDCHLSDMLQQLHSVNASKPSERGLVRQEEAEDPACIPIFWVSKWVDYSDKYGLGYQLCDNSVGVLFNDSTRLILYNDGDSLQYIERDGTESYLTVSSHPNSLMKKITLLKYFRNYMSEHLLKAGANITPR.... The pKi is 5.4. (6) The drug is COc1ccccc1N1CCN(CCCCn2ncc(=O)n(C)c2=O)CC1. The target protein (P17870) has sequence MGDKGTRVFKKASPNGKLTVYLGKRDFVDHIDLVEPVDGVVLVDPEYLKERRVYVTLTCAFRYGREDLDVLGLTFRKDLFVANVQSFPPAPEDKKPLTRLQERLIKKLGEHAYPFTFEIPPNLPCSVTLQPGPEDTGKACGVDYEVKAFCAENLEEKIHKRNSVRLVIRKVQYAPERPGPQPTAETTRQFLMSDKPLHLEASLDKEIYYHGEPISVNVHVTNNTNKTVKKIKISVRQYADICLFNTAQYKCPVAMEEADDTVAPSSTFCKVYTLTPFLANNREKRGLALDGKLKHEDTNLASSTLLREGANREILGIIVSYKVKVKLVVSRGGLLGDLASSDVAVELPFTLMHPKPKEEPPHREVPEHETPVDTNLIELDTNDDDIVFEDFARQRLKGMKDDKEEEEDGTGSPRLNDR. The pKi is 5.0. (7) The compound is CC(C)[C@H]1C(=O)N[C@H](CO)Cc2cc(NC(=O)COCCOCCOCCOCCOCCOCC(=O)Nc3ccc4c(c3)N(C)[C@@H](C(C)C)C(=O)N[C@H](CO)C4)ccc2N1C. 
The target protein (P04409) has sequence MADVFPAAEPAAPQDVANRFARKGALRQKNVHEVKNHRFIARFFKQPTFCSHCTDFIWGFGKQGFQCQVCCFVVHKRCHEFVTFSCPGADKGPDTDDPRSKHKFKIHTYGSPTFCDHCGSLLYGLIHQGMKCDTCDMNVHKQCVINVPSLCGMDHTEKRGRIYLKAEVTDEKLHVTVRDAKNLIPMDPNGLSDPYVKLKLIPDPKNESKQKTKTIRSTLNPRWDESFTFKLKPSDKDRRLSEEIWDWDRTTRNDFMGSLSFGVSELMKMPASGWYKLLNQEEGEYYNVPIPEGDEEGNVELRQKFEKAKLGPAGNKVISPSEDRRQPSNNLDRVKLTDFNFLMVLGKGSFGKVMLADRKGTEELYAIKILKKDVVIQDDDVECTMVEKRVLALLDKPPFLTQLHSCFQTVDRLYFVMEYVNGGDLMYHIQQVGKFKEPQAVFYAAEISIGLFFLHKRGIIYRDLKLDNVMLDSEGHIKIADFGMCKEHMMDGVTTRTFCG.... The pKi is 5.3.
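
For readers who want to relate the pKi labels above to raw Ki values, the definition in the task statement (pKi = -log10(Ki in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset tooling: the function names `ki_to_pki`, `pki_to_ki`, and `ki_nm_to_pki` are hypothetical, and the nanomolar helper assumes the raw measurement is given in nM.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki value in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: Ki in mol/L = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Hypothetical helper: assumes the input Ki is in nanomolar."""
    return ki_to_pki(ki_nm * 1e-9)
```

Under this definition, a pKi of 6.2 corresponds to a Ki of roughly 0.63 micromolar, and each unit increase in pKi corresponds to a tenfold lower Ki.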