Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is COC12C=C[C@]3(C4C1CC(C)(C)[C@@H]4O)[C@H]1Cc4ccc(O)c5c4[C@@]3(CCN1CC1CC1)[C@H]2O5. The target protein (P42866) has sequence MDSSAGPGNISDCSDPLAPASCSPAPGSWLNLSHVDGNQSDPCGPNRTGLGGSHSLCPQTGSPSMVTAITIMALYSIVCVVGLFGNFLVMYVIVRYTKMKTATNIYIFNLALADALATSTLPFQSVNYLMGTWPFGNILCKIVISIDYYNMFTSIFTLCTMSVDRYIAVCHPVKALDFRTPRNAKIVNVCNWILSSAIGLPVMFMATTKYRQGSIDCTLTFSHPTWYWENLLKICVFIFAFIMPVLIITVCYGLMILRLKSVRMLSGSKEKDRNLRRITRMVLVVVAVFIVCWTPIHIYVIIKALITIPETTFQTVSWHFCIALGYTNSCLNPVLYAFLDENFKRCFREFCIPTSSTIEQQNSARIRQNTREHPSTANTVDRTNHQLENLEAETAPLP. The pKi is 9.2. (2) The compound is O=[As](O)(O)c1ccccc1. The target protein sequence is MMAYKVLHFVVISLGLVTLVASRCDFNYYNQRAWLSCPGSQCGGNRQSPINIDTEKTKANNSLIALRFNDYDDPVDGDFENLGTTVEFVPETKDATLTNHLGTYDLLQFHFHWGRDSSEGSEHRIDDEQYSAEIHFVHLKQGASPSDTAGDTFSVVAVLCEAADIPIRGVWAKLSPVPTGHEDSHSVSDLVYTDLLPRNRDYYHYEGSLTTPLCDETVQWFVLKNTIKIPKAFLTMLRRVESDEDGTLLTFNFRNLQRLNGRQVFEFPPDVDNGEDKKRKRRNNRHGRDHHG. The pKi is 4.1. (3) The compound is c1ccc2c(c1)[c+]1ccn2CCCCCn2cc[c+](c3ccccc32)NCCCCCN1. The target protein (P70605) has sequence MDTSGHFHDSGVGDLDEDPKCPCPSSGDEQQQQQQPPPPSAPPAVPQQPPGPLLQPQPPQLQQQQQQQQQQQQQQQQQQQAPLHPLPQLAQLQSQLVHPGLLHSSPTAFRAPNSANSTAILHPSSRQGSQLNLNDHLLGHSPSSTATSGPGGGSRHRQASPLVHRRDSNPFTEIAMSSCKYSGGVMKPLSRLSASRRNLIEAEPEGQPLQLFSPSNPPEIIISSREDNHAHQTLLHHPNATHNHQHAGTTAGSTTFPKANKRKNQNIGYKLGHRRALFEKRKRLSDYALIFGMFGIVVMVIETELSWGLYSKDSMFSLALKCLISLSTIILLGLIIAYHTREVQLFVIDNGADDWRIAMTYERILYISLEMLVCAIHPIPGEYKFFWTARLAFSYTPSRAEADVDIILSIPMFLRLYLIARVMLLHSKLFTDASSRSIGALNKINFNTRFVMKTLMTICPGTVLLVFSISLWIIAAWTVRVCERYHDQQDVTSNFLGAMW.... The pKi is 8.8. (4) The small molecule is CCCn1c(=O)c2[nH]c(C34CC5CC(CC3C5)C4)nc2n(CCC)c1=O. 
The target protein (Q8TCC7) has sequence MTFSEILDRVGSMGHFQFLHVAILGLPILNMANHNLLQIFTAATPVHHCRPPHNASTGPWVLPMGPNGKPERCLRFVHPPNASLPNDTQRAMEPCLDGWVYNSTKDSIVTEWDLVCNSNKLKEMAQSIFMAGILIGGLVLGDLSDRFGRRPILTCSYLLLAASGSGAAFSPTFPIYMVFRFLCGFGISGITLSTVILNVEWVPTRMRAIMSTALGYCYTFGQFILPGLAYAIPQWRWLQLTVSIPFFVFFLSSWWTPESIRWLVLSGKSSKALKILRRVAVFNGKKEEGERLSLEELKLNLQKEISLAKAKYTASDLFRIPMLRRMTFCLSLAWFATGFAYYSLAMGVEEFGVNLYILQIIFGGVDVPAKFITILSLSYLGRHTTQAAALLLAGGAILALTFVPLDLQTVRTVLAVFGKGCLSSSFSCLFLYTSELYPTVIRQTGMGVSNLWTRVGSMVSPLVKITGEVQPFIPNIIYGITALLGGSAALFLPETLNQPL.... The pKi is 5.4. (5) The pKi is 6.1. The drug is COc1cccc(C(=O)NCCN2CCN(c3ccccn3)CC2)c1. The target protein (P19020) has sequence MAPLSQISTHLNSTCGAENSTGVNRARPHAYYALSYCALILAIIFGNGLVCAAVLRERALQTTTNYLVVSLAVADLLVATLVMPWVVYLEVTGGVWNFSRICCDVFVTLDVMMCTASILNLCAISIDRYTAVVMPVHYQHGTGQSSCRRVALMITAVWVLAFAVSCPLLFGFNTTGDPSICSISNPDFVIYSSVVSFYVPFGVTVLVYARIYIVLRQRQRKRILTRQNSQCISIRPGFPQQSSCLRLHPIRQFSIRARFLSDATGQMEHIEDKQYPQKCQDPLLSHLQPPSPGQTHGGLKRYYSICQDTALRHPSLEGGAGMSPVERTRNSLSPTMAPKLSLEVRKLSNGRLSTSLRLGPLQPRGVPLREKKATQMVVIVLGAFIVCWLPFFLTHVLNTHCQACHVSPELYRATTWLGYVNSALNPVIYTTFNVEFRKAFLKILSC. (6) The compound is COc1ccc2c(Oc3ccc(NC(=O)c4c(C)n(CC(C)(C)O)n(-c5ccccc5)c4=O)nc3)ccnc2c1. The target protein (P16056) has sequence MKAPTVLAPGILVLLLSLVQRSHGECKEALVKSEMNVNMKYQLPNFTAETPIQNVVLHGHHIYLGATNYIYVLNDKDLQKVSEFKTGPVLEHPDCLPCRDCSSKANSSGGVWKDNINMALLVDTYYDDQLISCGSVNRGTCQRHVLPPDNSADIQSEVHCMFSPEEESGQCPDCVVSALGAKVLLSEKDRFINFFVGNTINSSYPPGYSLHSISVRRLKETQDGFKFLTDQSYIDVLPEFLDSYPIKYIHAFESNHFIYFLTVQKETLDAQTFHTRIIRFCSVDSGLHSYMEMPLECILTEKRRKRSTREEVFNILQAAYVSKPGANLAKQIGASPSDDILFGVFAQSKPDSAEPVNRSAVCAFPIKYVNDFFNKIVNKNNVRCLQHFYGPNHEHCFNRTLLRNSSGCEARSDEYRTEFTTALQRVDLFMGRLNQVLLTSISTFIKGDLTIANLGTSEGRFMQVVLSRTAHLTPHVNFLLDSHPVSPEVIVEHPSNQNGY.... The pKi is 8.7.
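
The pKi definition given in the task description (pKi = -log10(Ki in M)) can be illustrated with a minimal sketch. The function names below are hypothetical and not part of the dataset. The nanomolar helper assumes Ki values are supplied in nM, which is an assumption about the upstream units rather than something stated in the records above.

```python
import math


def pki_from_ki_molar(ki_molar: float) -> float:
    """Convert a Ki value in mol/L to pKi = -log10(Ki in M)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def pki_from_ki_nanomolar(ki_nm: float) -> float:
    """Convert a Ki value in nM (assumed unit) to pKi."""
    # 1 nM = 1e-9 M, so convert to molar before taking the logarithm.
    return pki_from_ki_molar(ki_nm * 1e-9)


def ki_molar_from_pki(pki: float) -> float:
    """Invert the definition: Ki in M = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A Ki of 1 nM corresponds to a pKi of about 9.
    print(round(pki_from_ki_nanomolar(1.0), 3))
    # A pKi of 9.2, as in example (1), corresponds to a Ki of roughly 0.63 nM.
    print(ki_molar_from_pki(9.2) * 1e9)
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, so the pKi of 9.2 in example (1) reflects much tighter binding than the 4.1 in example (2).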