This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The label is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCN(CC#Cc1ccc2c(-c3ccc(Br)cc3)nsc2c1)CCO. The target protein (P38604) has sequence MTEFYSDTIGLPKTDPRLWRLRTDELGRESWEYLTPQQAANDPPSTFTQWLLQDPKFPQPHPERNKHSPDFSAFDACHNGASFFKLLQEPDSGIFPCQYKGPMFMTIGYVAVNYIAGIEIPEHERIELIRYIVNTAHPVDGGWGLHSVDKSTVFGTVLNYVILRLLGLPKDHPVCAKARSTLLRLGGAIGSPHWGKIWLSALNLYKWEGVNPAPPETWLLPYSLPMHPGRWWVHTRGVYIPVSYLSLVKFSCPMTPLLEELRNEIYTKPFDKINFSKNRNTVCGVDLYYPHSTTLNIANSLVVFYEKYLRNRFIYSLSKKKVYDLIKTELQNTDSLCIAPVNQAFCALVTLIEEGVDSEAFQRLQYRFKDALFHGPQGMTIMGTNGVQTWDCAFAIQYFFVAGLAERPEFYNTIVSAYKFLCHAQFDTECVPGSYRDKRKGAWGFSTKTQGYTVADCTAEAIKAIIMVKNSPVFSEVHHMISSERLFEGIDVLLNLQNIG.... The pIC50 is 6.5. (2) The compound is N=C(N)NC(=O)Cn1c(-c2ccccc2)ccc1C12CC3CC(CC(C3)C1)C2. The target protein (P0DJD7) has sequence MKWLLLLGLVALSECIMYKVPLIRKKSLRRTLSERGLLKDFLKKHNLNPARKYFPQWEAPTLVDEQPLENYLDMEYFGTIGIGTPAQDFTVVFDTGSSNLWVPSVYCSSLACTNHNRFNPEDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDQSGSVVIFGGIDSSYYTGSLNWVPVTVEGYWQITVDSITMNGEAIACAEGCQAIVDTGTSLLTGPTSPIANIQSDIGASENSDGDMVVSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSCISGFQGMNLPTESGELWILGDVFIRQYFTVFDRANNQVGLAPVA. The pIC50 is 4.3. (3) The compound is Cn1c(=O)c2c(-c3ccccc3)n3c(c2n(C)c1=O)C(c1ccccc1F)OCC3(C)C.
The target protein (P15682) has sequence MATKGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVDGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELIRMIKRGINDRNFWRGENGRRTRIAYERMCNILKGKFQTAAQRAMVDQVRESRNPGNAEFEDLIFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVSSFIRGTKVVPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQRASSGQISIQPTFSVQRNLPFDRPTIMAAFTGNTEGRTSDMRTEIIRLMESARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDN. The pIC50 is 5.3. (4) The compound is CCOC(=O)c1cc(C2=CCC[C@@H]3CC[C@H]2N3)on1. The target protein (P02708) has sequence MEPWPLLLLFSLCSAGLVLGSEHETRLVAKLFKDYSSVVRPVEDHRQVVEVTVGLQLIQLINVDEVNQIVTTNVRLKQGDMVDLPRPSCVTLGVPLFSHLQNEQWVDYNLKWNPDDYGGVKKIHIPSEKIWRPDLVLYNNADGDFAIVKFTKVLLQYTGHITWTPPAIFKSYCEIIVTHFPFDEQNCSMKLGTWTYDGSVVAINPESDQPDLSNFMESGEWVIKESRGWKHSVTYSCCPDTPYLDITYHFVMQRLPLYFIVNVIIPCLLFSFLTGLVFYLPTDSGEKMTLSISVLLSLTVFLLVIVELIPSTSSAVPLIGKYMLFTMVFVIASIIITVIVINTHHRSPSTHVMPNWVRKVFIDTIPNIMFFSTMKRPSREKQDKKIFTEDIDISDISGKPGPPPMGFHSPLIKHPEVKSAIEGIKYIAETMKSDQESNNAAAEWKYVAMVMDHILLGVFMLVCIIGTLAVFAGRLIELNQQG. The pIC50 is 5.0.
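The pIC50 definition used for the labels above (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of BindingDB or of any dataset loader; the only thing taken from the text is the formula itself.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and express the IC50 in nM (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of 6.
    print(ic50_molar_to_pic50(1e-6))
    # Labels from the examples above, converted back to nM for intuition.
    for label in (6.5, 4.3, 5.3, 5.0):
        print(label, pic50_to_ic50_nanomolar(label))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a higher label means a more potent compound.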