This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1ccc(S(=O)(=O)Nc2ccc(C(=O)/C=C/c3ccc(O)cc3)cc2)cc1. The target protein (P06278) has sequence MKQQKRLYARLLTLLFALIFLLPHSAAAAANLNGTLMQYFEWYMPNDGQHWKRLQNDSAYLAEHGITAVWIPPAYKGTSQADVGYGAYDLYDLGEFHQKGTVRTKYGTKGELQSAIKSLHSRDINVYGDVVINHKGGADATEDVTAVEVDPADRNRVISGEHRIKAWTHFHFPGRGSTYSDFKWHWYHFDGTDWDESRKLNRIYKFQGKAWDWEVSNENGNYDYLMYADIDYDHPDVAAEIKRWGTWYANELQLDGFRLDAVKHIKFSFLRDWVNHVREKTGKEMFTVAEYWQNDLGALENYLNKTNFNHSVFDVPLHYQFHAASTQGGGYDMRKLLNSTVVSKHPLKAVTFVDNHDTQPGQSLESTVQTWFKPLAYAFILTRESGYPQVFYGDMYGTKGDSQREIPALKHKIEPILKARKQYAYGAQHDYFDHHDIVGWTREGDSSVANSGLAALITDGPGGAKRMYVGRQNAGETWHDITGNRSEPVVINSEGWGEFH.... The pIC50 is 4.1. (2) The compound is COc1ccc(CC(=O)c2ccc(O)c(O)c2)cc1. The target protein (O42713) has sequence MSLIATVGPTGGVKNRLNIVDFVKNEKFFTLYVRSLELLQAKEQHDYSSFFQLAGIHGLPFTEWAKERPSMNLYKAGYCTHGQVLFPTWHRTYLSVLEQILQGAAIEVAKKFTSNQTDWVQAAQDLRQPYWDWGFELMPPDEVIKNEEVNITNYDGKKISVKNPILRYHFHPIDPSFKPYGDFATWRTTVRNPDRNRREDIPGLIKKMRLEEGQIREKTYNMLKFNDAWERFSNHGISDDQHANSLESVHDDIHVMVGYGKIEGHMDHPFFAAFDPIFWLHHTNVDRLLSLWKAINPDVWVTSGRNRDGTMGIAPNAQINSETPLEPFYQSGDKVWTSASLADTARLGYSYPDFDKLVGGTKELIRDAIDDLIDERYGSKPSSGARNTAFDLLADFKGITKEHKEDLKMYDWTIHVAFKKFELKESFSLLFYFASDGGDYDQENCFVGSINAFRGTAPETCANCQDNENLIQEGFIHLNHYLARDLESFEPQDVHKFLKE.... The pIC50 is 3.8. (3) The drug is CCC1=C(C(C)C)/C(=C/C(C)=C\C=C/C(C)=C/C(=O)O)CCC1. 
The target protein (P28700) has sequence MDTKHFLPLDFSTQVNSSSLNSPTGRGSMAVPSLHPSLGPGIGSPLGSPGQLHSPISTLSSPINGMGPPFSVISSPMGPHSMSVPTTPTLGFGTGSPQLNSPMNPVSSTEDIKPPLGLNGVLKVPAHPSGNMASFTKHICAICGDRSSGKHYGVYSCEGCKGFFKRTVRKDLTYTCRDNKDCLIDKRQRNRCQYCRYQKCLAMGMKREAVQEERQRGKDRNENEVESTSSANEDMPVEKILEAELAVEPKTETYVEANMGLNPSSPNDPVTNICQAADKQLFTLVEWAKRIPHFSELPLDDQVILLRAGWNELLIASFSHRSIAVKDGILLATGLHVHRNSAHSAGVGAIFDRVLTELVSKMRDMQMDKTELGCLRAIVLFNPDSKGLSNPAEVEALREKVYASLEAYCKHKYPEQPGRFAKLLLRLPALRSIGLKCLEHLFFFKLIGDTPIDTFLMEMLEAPHQAT. The pIC50 is 6.1. (4) The compound is O=C(/C=C/c1cc(O)c(O)c([N+](=O)[O-])c1)c1ccccc1. The target protein (P17289) has sequence MPTPNAASPQAKGFRRAVSELDAKQAEAIMSPRFVGRRQSLIQDARKEREKAEAAASSSESAEAAAWLERDGEAVLTLLFALPPTRPPALTRAIKVFETFEAHLHHLETRPAQPLRAGSPPLECFVRCEVPGPVVPALLSALRRVAEDVRAAGESKVLWFPRKVSELDKCHHLVTKFDPDLDLDHPGFSDQAYRQRRKLIAEIAFQYKQGDPIPHVEYTAEETATWKEVYSTLRGLYPTHACREHLEAFELLERFCGYREDRIPQLEDVSRFLKERTGFQLRPAAGLLSARDFLASLAFRVFQCTQYIRHASSPMHSPEPECCHELLGHVPMLADRTFAQFSQDIGLASLGVSDEEIEKLSTLYWFTVEFGLCKQNGEVKAYGAGLLSSYGELLHSLSEEPEIRAFDPDAAAVQPYQDQTYQPVYFVSESFSDAKDKLRSYASRIQRPFSVKFDPYTLAIDVLDSPHAIRHALDGVQDEMQALAHALNAIS. The pIC50 is 4.3. (5) The compound is Cc1cn(CCOCCOC(c2ccccc2)(c2ccccc2)c2ccc(Cl)cc2)c(=O)[nH]c1=O. The target protein sequence is MASYPCHQHASAFDQAARSRGHSNRRTALRPRRQQEATEVRLEQKMPTLLRVYIDGPHGMGKTTTTQLLVALGSRDDIVYVPEPMTYWQVLGASETIANIYTTQHRLDQGEISAGDAAVVMTSAQITMGMPYAVTDAVLAPHIGGEAGSSHAPPPALTLIFDRHPIAALLCYPAARYLMGSMTPQAVLAFVALIPPTLPGTNIVLGALPEDRHIDRLAKRQRPGERLDLAMLAAIRRVYGLLANTVRYLQGGGSWREDWGQLSGTAVPPQGAEPQSNAGPRPHIGDTLFTLFRAPELLAPNGDLYNVFAWALDVLAKRLRPMHVFILDYDQSPAGCRDALLQLTSGMVQTHVTTPGSIPTICDLARTFAREMGEAN. The pIC50 is 5.6. (6) The compound is Cc1onc(-c2ccccc2)c1CN(C1Cc2cc(C#N)ccc2N(Cc2cncn2C)C1)S(=O)(=O)c1ccccn1. 
The target protein sequence is MGFTSLGLSAPILKAVEEQGYSTPSPIQLQAIPAVIEGKDVMAAAQTGTGKTAGFTLPLLERLSNGPKRKFNQVRALVLTPTRELAAQVHESVEKYSKNLPLTSDVVFGGVKVNPQMQRLRRGVDVLVATPGRLLDLANQNAIKFDQLEILVLDEADRMLDMGFIHDIKKILAKLPKNRQNLLFSATFSDEIRQLAKGLVKDPVEISVAKRNTTAETVEQSVYVMDKGRKPKVLTKLIKDNDWKQVLVFSKTKHGANRLAKTLEEKGVSAAAIHGNKSQGARTKALANFKSGQVRVLVATDIAARGLDIEQLPQVINVDLPKVPEDYVHRIGRTGRAGATGKAISFVSEDEAKELFAIERLIQKVLPRHVLEGFEPVNKVPESKLDTRPIKPKKPKKPKAPRVEHKDGQRSGENRNGNKQGAKQGQKPATKRTPTNNPSGKKEGTDSDKKKRPFSGKPKTKGTGENRGNGSNFGKSKSTPKSDVKPRRQGPRPARKPKAN.... The pIC50 is 9.1. (7) The compound is CN(C(=O)CCCOc1ccc2[nH]c(=O)ccc2c1)C1CCCCC1. The target protein sequence is QAIHKPRVNPVTSLSENYTCSDSEESSEKDKLAIPKRLRRSLPPGLLRRVSSTWTTTTSATGLPTLEPAPVRRDRSTSIKLQEAPSSSPDSWNNPVMMTLTKSRSFTSSYAISAANHVKAKKQSRPGALAKISPLSSPCSSPLQGTPASSLVSKISAVQFPESADTTAKQSLGSHRALTYTQSAPDLSPQILTPPVICSSCGRPYSQGNPADEPLERSGVATRTPSRTDDTAQVTSDYETNNNSDSSDIVQNEDETECLREPLRKASACSTYAPETMMFLDKPILAPEPLVMDNLDSIMEQLNTWNFPIFDLVENIGRKCGRILSQVSYRLFEDMGLFEAFKIPIREFMNYFHALEIGYRDIPYHNRIHATDVLHAVWYLTTQPIPGLSTVINDHGSTSDSDSDSGFTHGHMGYVFSKTYNVTDDKYGCLSGNIPALELMALYVAAAMHDYDHPGRTNAFLVATSAPQAVLYNDRSVLENHHAAAAWNLFMSRPEYNFLI.... The pIC50 is 7.8.
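
The pIC50 labels in the records above follow the stated definition, pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion (the function names and the unit table are illustrative assumptions, not part of the dataset or of any BindingDB tooling):

```python
import math

# Conversion factors from common concentration units to molar (M).
# The unit labels used as keys here are an illustrative assumption.
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * _UNIT_TO_MOLAR[unit])


def pic50_to_ic50_molar(pic50):
    """Invert the definition: return the IC50 in molar for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Convert the labels of records (1) and (6) back to IC50 values.
    for label in (4.1, 9.1):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50_molar(label):.3e} M")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: the label 9.1 in record (6) corresponds to a sub-nanomolar IC50, while the label 4.1 in record (1) corresponds to roughly 79 µM.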