This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 7.3. The small molecule is CC(F)(F)c1cncc(-c2nc(NC3CCC(F)(F)CC3)nc(NC3CCC(F)(F)CC3)n2)n1. The target protein sequence is MAGYLRVVRSLCRASGSRPAWAPAALTAPTSQEQPRRHYADKRIKVAKPVVEMDGDEMTRIIWQFIKEKLILPHVDIQLKYFDLGLPNRDQTDDQVTIDSALATQKYSVAVKCATITPDEARVEEFKLKKMWKSPNGTIGNILGGTVFREPIICKNIPRLVPGWTKPITIGRHAHGDQYKATDFVADRAGTFKMVFTPKDGSGVKEWEVYNFPAGGVGMGMYNTDESISGFAHSCFQYAIQKKWPLYMSTKNTILKAYDGRFKDIFQEIFDKHYKTDFDKNKIWYEHRLIDDMVAQVLKSSGGFVWACKNYDGDVQSDILAQGFGSLGLMTSVLVCPDGKTIEAEAAHGTVTRHYREHQKGRPTSTNPIASIFAWTRGLEHRGKLDGNQDLIRFAQMLEKVCVETVESGAMTKDLAGCIHGLSNVKLNEHFLNTTDFLDTIKSNLDRALGRQ. (2) The small molecule is O=c1sn(Cc2ccccc2)c(=O)n1Cc1ccccc1. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 5.0. (3) The compound is COC(=O)c1ccc(C(C/C=C/c2ccccc2)(Cc2ccc(C(F)(F)P(=O)(O)O)cc2)n2nnc3ccccc32)cc1. The target protein sequence is MDYKDDDDKLEFMEMEKEFEQIDKSGSWAAIYQDIRHEASDFPCRVAKLPKNKNRNRYRDVSPFDHSRIKLHQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMVWEQKSRGVVMLNRVMEKGSVKCAQYWPQKEEKEMIFEDTNLKLTLISEDIKSYYTVRQLELENLTTQETREILHFHYTTWPDFGVPESPASFLNFLFKVRESGSLSPEHGPVVVHCSAGIGRSGTFCLADTCLLLMDKRKDPSSVDIKKVLLEMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKELSHED. The pIC50 is 6.8. (4) The small molecule is Cc1ccc(Cn2ccsc2=NC(=O)COC(=O)c2ccc(O)cc2O)cc1. The target protein (Q96LD8) has sequence MDPVVLSYMDSLLRQSDVSLLDPPSWLNDHIIGFAFEYFANSQFHDCSDHVSFISPEVTQFIKCTSNPAEIAMFLEPLDLPNKRVVFLAINDNSNQAAGGTHWSLLVYLQDKNSFFHYDSHSRSNSVHAKQVAEKLEAFLGRKGDKLAFVEEKAPAQQNSYDCGMYVICNTEALCQNFFRQQTESLLQLLTPAYITKKRGEWKDLITTLAKK. The pIC50 is 4.0. 
(5) The target protein (Q9NQR1) has sequence MGEGGAAAALVAAAAAAAAAAAAVVAGQRRRRLGRRARCHGPGRAAGGKMSKPCAVEAAAAAVAATAPGPEMVERRGPGRPRTDGENVFTGQSKIYSYMSPNKCSGMRFPLQEENSVTHHEVKCQGKPLAGIYRKREEKRNAGNAVRSAMKSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPPSSCDSTNAAIAKQALKKPIKGKQAPRKKAQGKTQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGMKIDLIDGKGRGVIATKQFSRGDFVVEYHGDLIEITDAKKREALYAQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQTKLHDIDGVPHLILIASRDIAAGEELLYDYGDRSKASIEAHPWLKH. The compound is COc1cc2nc(N3CCCC3)nc(NCCCCCN3CCCCC3)c2cc1OC. The pIC50 is 5.1. (6) The small molecule is CC(C)[C@H](OC(=O)N[C@](C)(Cc1cccc(F)c1F)C(=O)NCCCCCCCNC(N)=O)c1ccccc1. The target protein (P30098) has sequence MASPAGNLSAWPGWGWPPPAALRNLTSSPAPTASPSPAPSWTPSPRPGPAHPFLQPPWAVALWSLAYGAVVAVAVLGNLVVIWIVLAHKRMRTVTNSFLVNLAFADAAMAALNALVNFIYALHGEWYFGANYCRFQNFFPITAVFASIYSMTAIAVDRYMAIIDPLKPRLSATATRIVIGSIWILAFLLAFPQCLYSKIKVMPGRTLCYVQWPEGSRQHFTYHMIVIVLVYCFPLLIMGITYTIVGITLWGGEIPGDTCDKYQEQLKAKRKVVKMMIIVVVTFAICWLPYHIYFILTAIYQQLNRWKYIQQVYLASFWLAMSSTMYNPIIYCCLNKRFRAGFKRAFRWCPFIHVSSYDELELKATRLHPMRQSSLYTVTRMESMSVVFDSNDGDSARSSHQKRGTTRDVGSNVCSRRNSKSTSTTASFVSSSHMSVEEGS. The pIC50 is 8.4. (7) The compound is COc1ccc(-c2cc3c(N4CC(C(=O)NCc5ccc(C)cc5)C4)ncnn3c2)cc1F. The target protein sequence is MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKT.... The pIC50 is 7.3. (8) The small molecule is C=CS(=O)(=O)Nc1cccc(Cc2nn(C3CCCC3)c3ncnc(N)c23)c1. 
The target protein sequence is QTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The pIC50 is 5.3. (9) The drug is Cc1ccc(C(=O)Nc2ccc(-c3nc4c(c(=O)n(C)c(=O)n4C)n3C)cc2)cc1. The target protein (Q8BW75) has sequence MSNKSDVIVVGGGISGMAAAKLLHDCGLSVVVLEARDRVGGRTYTIRNKNVKYVDLGGSYVGPTQNRILRLAKELGLETYKVNEVERLIHFVKGKSYAFRGPFPPVWNPITYLDNNNLWRTMDEMGQEIPSDAPWKAPLAEEWDYMTMKELLDKICWTKSTKQIATLFVNLCVTAETHEVSALWFLWYVKQCGGTTRIISTTNGGQERKFIGGSGQVSERIKDILGDRVKLERPVIHIDQTGENVIVKTLNHEIYEAKYVISAIPPALGMKIHYSPPLPMLRNQLISRVPLGSVIKCMVYYKEPFWRKKDFCGTMVIEGEEAPIAYTLDDTKPDGTYAAIMGFILAHKARKLVRLTKEERLRKLCELYAKVLNSQEALQPVHYEEKNWCEEQYSGGCYTTYFPPGILTQYGRVLRQPVGKIFFAGTETASHWSGYMEGAVEAGERAAREILHAIGKIPEDEIWQPEPESLDVPARPITSTFLERHLPSVPGLLKLFGLTT.... The pIC50 is 5.0.
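
The pIC50 definition given above (pIC50 = -log10(IC50 in M)) can be illustrated with a small conversion sketch. This is only a minimal illustration, not part of the dataset: the function names ic50_nm_to_pic50 and pic50_to_ic50_nm are hypothetical, and the choice of nanomolar as the input unit is an assumption made for convenience.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    Applies pIC50 = -log10(IC50 in M), with 1 nM = 1e-9 M.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50), returned in nM."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A pIC50 of 5.0, as in example (2), corresponds to an IC50 of 10 uM.
    print(ic50_nm_to_pic50(10_000.0))
    # Convert a label such as 7.3 back to a concentration in nM.
    print(pic50_to_ic50_nm(7.3))
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.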