Why does emoji have two different utf-8 codes? How to ...
We have found an issue: some emoji have two utf-8 codes, such as:

emoji   unicode   utf-8              another utf-8
😁      U+1F601   \xf0\x9f\x98\x81   \xed\xa0\xbd\xed\xb8\x81

But iOS can't decode the other type of utf-8, which results in an error when I decode a string from utf-8.

In all the documents I found, I can find only one type of utf-8 code for an emoji; the other is nowhere to be found. Documents I referenced include:

- emoji code link
- whole utf-8 code link

But in a web tool, bianma, both types of utf-8 code can be converted into the emoji correctly.

So, my questions are:

- Why are there two types of utf-8 codes for one emoji?
- Where is there a document that includes both types of utf-8 codes?
- How do I correctly convert a string from utf-8, using NSString on iOS?

Tags: ios unicode utf-8 nsstring emoji

asked Dec 22 '15 at 5:34 by pinchwang

This had me intrigued as my first thought was that the long UTF-8 representation was two UTF-8 blocks. It turns out that there are two variations of UTF-8, CESU-8 and Modified UTF-8, which encode UTF-16 style. You may be able to use this article iphonedevsdk.com/forum/iphone-sdk-development/… to write a decoder if there's no suitable iOS/Objective-C native decoder. – Alastair McCormack Dec 22 '15 at 11:33

@AlastairMcCormack That's the answer I think. You should post that as an answer. – roeland Dec 22 '15 at 22:23

@user692793 Please never post text as images, especially not code or output. – roeland Dec 22 '15 at 22:24

@roeland Thanks, but as I'm not an Objective-C coder I'll leave it to someone else to pick up the glory :) – Alastair McCormack Dec 22 '15 at 22:26

2 Answers

Score: 16

0xF0,0x9F,0x98,0x81 is the correct UTF-8 encoding for U+1F601 😁.

0xED,0xA0,0xBD,0xED,0xB8,0x81 is not a valid UTF-8 sequence (*). It should really be rejected; iOS is correct to do so.

This is a bug in the bianma tool: the convertUtf8BytesToUnicodeCodePoints function is more lenient about what input it accepts than the specified algorithm in eg RFC 3629.

This happens to return a working string only because the tool is written in JavaScript. Having decoded the above byte sequence to the bogus surrogate code point sequence U+D83D,U+DE01 it then converts that into a JavaScript string using a direct code-point-to-code-unit mapping giving \uD83D\uDE01. As this is the correct way to encode 😁 in a UTF-16 string it appears to have worked.

(*: It is a valid CESU-8 sequence, but that encoding is just “bogus broken encoding for compatibility with badly-written historical tools” and should generally be avoided.)

You should not usually encounter a sequence like this; it is typically not worth catering for unless you have a specific source of this kind of malformed data which you don't have the power to get fixed.

answered Dec 22 '15 at 23:03 by bobince

Thank you very much for the answer. We read string data from our server, which is written in C++; the issue occurs after the server converts a unicode string to utf-8. One more thing to mention: when our client receives the data as a string value cstr, printf("%s", cstr) prints it correctly. But when we convert the string to an NSString, NSString *ocstr = [[NSString alloc] initWithBytes:cstr.c_str() length:cstr.length() encoding:NSUTF8StringEncoding]; ocstr is nil. Why does Apple not support the CESU-8 sequence? Is there a function to resolve the issue? – pinchwang Dec 23 '15 at 2:12

I would first look at the C++ server UTF-8 encoder, to see if it can be fixed properly at source. CESU-8 is considered an undesirable anomaly that you'd never deliberately want to use; most systems don't support it. If you have to accept it you'll need to write your own CESU-8 decoder walking through the input byte array (or use an existing library, eg ICU though that would be a really heavy dependency just for this). – bobince Dec 23 '15 at 11:36

Just as a side note, there is one particularly bothersome source of encoding like this: JNI (Java Native Interface). If you attempt to retrieve "UTF-8" bytes from a Java string you will receive the "modified UTF-8" variant. That is a rather large source of malformed data that cannot be fixed, unfortunately. – borrrden Jul 12 '18 at 17:02

Score: -3

This worked for me in php to send a message with emoji to telegram bot: $message_text="\xf0\x9f\x98\x81";

answered Jun 12 '18 at 9:41 by Polina

This is just too-off-topic. – Moradnejad Apr 30 '19 at 11:47
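If the server encoder cannot be fixed, the "walk through the input byte array" approach from bobince's comment might look roughly like the sketch below. It is only an illustration, not code from the thread: the names `read_surrogate` and `cesu8_to_utf8` are hypothetical, and it assumes the only defect in the input is that supplementary characters arrive as two 3-byte encoded surrogates (the CESU-8 form). Every other byte is copied through unchanged, so unpaired surrogates and any other malformed input will still be rejected later by a strict decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: true if s[i..i+2] is a 3-byte sequence (0xED 10xxxxxx
// 10xxxxxx) whose decoded value lies in [lo, hi]; the value is stored in out.
static bool read_surrogate(const std::string& s, std::size_t i,
                           std::uint32_t lo, std::uint32_t hi,
                           std::uint32_t& out) {
    if (i + 2 >= s.size()) return false;
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
    const unsigned char b2 = static_cast<unsigned char>(s[i + 2]);
    if (b0 != 0xED || (b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80) return false;
    // 3-byte layout: 1110xxxx 10yyyyyy 10zzzzzz
    out = (static_cast<std::uint32_t>(b0 & 0x0F) << 12) |
          (static_cast<std::uint32_t>(b1 & 0x3F) << 6) |
          static_cast<std::uint32_t>(b2 & 0x3F);
    return out >= lo && out <= hi;
}

// Hypothetical repair function: rewrites each encoded high+low surrogate pair
// (6 bytes) as the 4-byte UTF-8 form of the code point it represents.
std::string cesu8_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        std::uint32_t hi = 0, lo = 0;
        if (read_surrogate(in, i, 0xD800, 0xDBFF, hi) &&
            read_surrogate(in, i + 3, 0xDC00, 0xDFFF, lo)) {
            // Standard UTF-16 surrogate pair -> code point arithmetic.
            const std::uint32_t cp =
                0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
            // 4-byte layout: 11110www 10xxxxxx 10yyyyyy 10zzzzzz
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
            i += 6;
        } else {
            out += in[i];  // not an encoded surrogate pair: copy as-is
            ++i;
        }
    }
    return out;
}
```

On the client, the repaired string could then be handed to initWithBytes:length:encoding: with NSUTF8StringEncoding, as in the question. Modified UTF-8's two-byte encoding of NUL (0xC0 0x80) is not handled by this sketch.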