Understanding Swift Strings, Emoji, Characters, and Scalars
Learn how Swift works with characters, using emoji as a fun and easy example

By Kevin R (iOS developer and Swift enthusiast, working for Team Rockstars IT in the Netherlands) · Nov 17, 2019 · 7 min read

The relationship between characters and glyphs can be a bit confusing. By using emoji and observing the way Swift handles them, we're going to dive into the topic. Let's say you want to check if a string contains one or more emoji. How would you do that?

A Little Background on Emoji

The word comes from two Japanese words: 絵 meaning picture (e) and 文字 meaning character (moji/mohdzi). The fact that the word might make you think of emoticon or emotion is purely coincidental.

They've been around longer than you might think. Although they gained worldwide popularity around 2010, they had already been in use in Japan since 1997. Starting out with a set of fewer than 80 symbols, the emoji set has grown to contain over 1,200 icons.

2010 was also the year the first set of emoji was added to the Unicode standard. Unicode is an industry standard designed to unify the handling and presentation of text. To do so, it contains an index of characters from writing systems all around the world, both current and ancient. This standard keeps growing: the latest version as of writing (12.1) consists of nearly 138,000 characters.

[Figure: examples of characters defined in the Unicode standard]

The standard includes not only characters from alphabets from around the world but also special characters that aren't visible and can't be used independently. We'll get into that later.

I highly recommend checking out the Unicode® Character Table to get an idea of the scale of this. Simply scroll down the table on the main page to discover the combinations and possibilities.

Diving In

Every character defined by the Unicode standard has a hexadecimal identifier (Unicode number), and characters are categorized into blocks, such as Hebrew or Arabic.

It's important to understand the difference between characters, glyphs, and scalars. Unicode consists of characters specified by a Unicode number. A character may or may not be represented on a screen. Also, a combination of characters may result in one character on screen. Swift distinguishes between these terms by defining them slightly differently. It's quite a complex story, but the gist of it is:

- A string consists of characters
- A character consists of Unicode scalars
- Each Unicode scalar represents a Unicode character

Back to the Unicode characters. Here's an example: the grinning face (😀) is identified as U+1F600 and is part of the emoticons block. You can represent the emoji in a Swift string in several ways:

    let smiley1 = "😀"
    let smiley2 = "\u{1F600}" // Hex code, also "😀"

"So we can find the Unicode block for emoji and check if a character is from that block?"

Well, no. There's not one block for emoji characters. There are separate blocks for transport and maps, supplemental symbols and pictographs, and a whole bunch of icons in miscellaneous symbols and pictographs. Even if we determine which blocks or what list of characters are emoji, chances are it won't be future-proof. The standard is ever-evolving and expanding.

But if you dig deeper, you'll see that last block also has strange characters, for example:

[Figure: Notched left semicircle with three dots]

I'm not sure what it's supposed to be (apart from the literal description given by Unicode), but my browser sure doesn't know how to display it:

[Figure: how my browser renders the aforementioned character]

Applying This to Code

Up to Swift 4.2, we were stuck trying to figure out if a character is an emoji by checking if the Unicode number belonged to one of the predefined Unicode blocks. But along came Swift 5.0, and with it, a new Unicode.Scalar.Properties struct, giving us a whole range of flags to help us figure out what we're dealing with.

We can fetch the Unicode scalars that represent our string very easily. Enough boring talk, here's an example:

    // Here's our emoji
    let smiley = "😀"

    // Get an iterable of the scalars in our String
    let scalars = smiley.unicodeScalars // UnicodeScalarView instance

    // We have one character, so we'll be getting that one
    let firstScalar = scalars.first // is 128512

    // Note that 128512 is actually the decimal
    // value for hexadecimal 1F600 (the Unicode identifier for 😀)

    // Get the properties
    let properties = firstScalar?.properties

    // Check if it's an emoji
    let isEmoji = properties?.isEmoji // = true

Eureka! So we're done? Well, no. How about this:

    // Strangely enough, this will return true:
    "3".unicodeScalars.first?.properties.isEmoji

This is because the scalar 3 can be presented as an emoji, even though in this particular case it isn't. The property isEmoji is really misleading in this way. Luckily, there's another property:

    // This will return true like before:
    "😀".unicodeScalars.first?.properties.isEmojiPresentation

    // And this will return false like we expect:
    "3".unicodeScalars.first?.properties.isEmojiPresentation

    // By the way, that 'Notched Left Semicircle with Three Dots'
    // also returns false, as we cannot actually render it:
    "🕃".unicodeScalars.first?.properties.isEmojiPresentation

    // Unfortunately, this doesn't hold true for all emoji:
    "🌶".unicodeScalars.first?.properties.isEmojiPresentation // false
    "🌶".unicodeScalars.first?.properties.generalCategory == .some(.otherSymbol) // true

A lot better, right? But we're not quite there yet. There are also characters that actually consist of multiple scalars. See how we used unicodeScalars.first? Consider the following examples:

    "1️⃣".unicodeScalars.first?.properties.isEmojiPresentation // false
    "♦️".unicodeScalars.first?.properties.isEmojiPresentation // false
    "👍🏻".unicodeScalars.first?.properties.isEmojiPresentation // true
    "👨‍👩‍👧‍👧".unicodeScalars.first?.properties.isEmojiPresentation // true

To explain why this is happening, let's take a look at the unicodeScalars property. It returns an instance of UnicodeScalarView. Its debugDescription will just result in the original string, so inspecting the contents directly (or logging it) doesn't offer much insight. Fortunately, there's a map function that'll return a regular array, so we end up with an array of Unicode.Scalar elements:

    // This will create a UnicodeScalarView
    let scalarView = "1️⃣".unicodeScalars

    // Map the view so we get a regular array which we can inspect
    let scalars = scalarView.map { $0 }

The resulting array contains three values:

- Decimal 49 (hex U+0031): A plain old Digit 1
- Decimal 65039 (hex U+FE0F): Variation Selector-16
- Decimal 8419 (hex U+20E3): Combining Enclosing Keycap

These are the special scalars we mentioned earlier. The combination of these scalars is used to form the emoji, turning a regular number 1 into this symbol. The second and third scalars modify the initial one.

Just to clarify, you can also use the hexadecimal Unicode identifiers to create this combination manually:

    "\u{0031}" // turns into: 1
    "\u{0031}\u{20E3}" // turns into: 1⃣
    "\u{0031}\u{FE0F}\u{20E3}" // turns into: 1️⃣

Similarly, other emoji can be combined:

    // Black Diamond Suit Emoji
    "\u{2666}" // ♦

    // Adding 'Variation Selector-16':
    "\u{2666}\u{FE0F}" // ♦️

    // Thumbs up sign:
    "\u{1F44D}" // 👍

    // Adding 'Emoji Modifier Fitzpatrick Type-4':
    "\u{1F44D}\u{1F3FD}" // 👍🏽

    // Man, Woman, Girl, Boy
    "\u{1F468}\u{1F469}\u{1F467}\u{1F466}" // 👨👩👧👦

    // Adding 'Zero Width Joiner' between each
    "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}" // 👨‍👩‍👧‍👦

Yep, that's seven scalars combined into one character.

Finally, it's good to note that not every character made up of multiple scalars is an emoji:

    "\u{0061}" // Letter: a
    "\u{0302}" // Circumflex Accent: ̂
    "\u{0061}\u{0302}" // Combines into: â

Little side note: Maybe you've seen messages or text online that seem pretty messed up (almost like a glitch in the matrix). This is generally referred to as Zalgo, and actually just consists of many Unicode scalars being merged into single characters on screen:

    let lotsOfScalars = "E̵͉͈̥̝͛͊̂͗͊̈́̄͜"
    let scalars = lotsOfScalars.unicodeScalars.map { $0 }

    // Merge into a string, adding spaces to see them individually.
    // This results in the letter E followed by each combining mark on its own.
    let scalarList = scalars.reduce("", { "\($0) \($1)" })

Getting the Right Information

Let's combine this information to add some helper properties to the Character and String types. We'll:

- Check if a character is exactly one scalar that'll be presented as an emoji
- Check if a character consists of multiple scalars that'll be combined into an emoji

    extension Character {
        /// A single scalar that is an emoji and sits above U+238C,
        /// which excludes digits and other low scalars such as "3".
        var isSimpleEmoji: Bool {
            guard let firstScalar = unicodeScalars.first else { return false }
            return firstScalar.properties.isEmoji && firstScalar.value > 0x238C
        }

        /// Several scalars whose first one can be presented as an emoji.
        var isCombinedIntoEmoji: Bool {
            unicodeScalars.count > 1 && unicodeScalars.first?.properties.isEmoji ?? false
        }

        var isEmoji: Bool { isSimpleEmoji || isCombinedIntoEmoji }
    }

Next, we'll add some computed properties to String to access our Character extension:

    extension String {
        var isSingleEmoji: Bool { return count == 1 && containsEmoji }

        var containsEmoji: Bool { return contains { $0.isEmoji } }

        var containsOnlyEmoji: Bool { return !isEmpty && !contains { !$0.isEmoji } }

        var emojiString: String { return emojis.map { String($0) }.reduce("", +) }

        var emojis: [Character] { return filter { $0.isEmoji } }

        var emojiScalars: [UnicodeScalar] { return filter { $0.isEmoji }.flatMap { $0.unicodeScalars } }
    }

Now checking our string for emoji becomes very simple:

    "â".isSingleEmoji // false
    "3".isSingleEmoji // false
    "3️⃣".isSingleEmoji // true
    "3️⃣".emojiScalars // [51, 65039, 8419]
    "👌🏿".isSingleEmoji // true
    "🙎🏼‍♂️".isSingleEmoji // true
    "👨‍👩‍👧‍👧".isSingleEmoji // true
    "👨‍👩‍👧‍👧".containsOnlyEmoji // true
    "🏴".isSingleEmoji // true
    "🏴".containsOnlyEmoji // true
    "Hello 👨‍👩‍👧‍👧".containsOnlyEmoji // false
    "Hello 👨‍👩‍👧‍👧".containsEmoji // true
    "👫 Héllo 👨‍👩‍👧‍👧".emojiString // "👫👨‍👩‍👧‍👧"
    "👫 Héllœ 👨‍👩‍👧‍👧".emojiScalars // [128107, 128104, 8205, 128105, 8205, 128103, 8205, 128103]
    "👫 Héllœ 👨‍👩‍👧‍👧".emojis // ["👫", "👨‍👩‍👧‍👧"]
    "👫 Héllœ 👨‍👩‍👧‍👧".emojis.count // 2
    "👫👨‍👩‍👧‍👧👨‍👨‍👦".isSingleEmoji // false
    "👫👨‍👩‍👧‍👧👨‍👨‍👦".containsOnlyEmoji // true

Summing It Up

There's an important difference between characters and scalars: basically, it's up to the string that defines the scalars and the system rendering it to decide which characters the scalars will result in.

While Unicode defines each code point as a character, Swift calls these scalars and uses the term character for the combination of scalars that may result in a single glyph in the string. I say "may" because things like control characters (e.g., null and backspace) will be counted as separate characters.
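The string, character, and scalar layering described in this article can be checked directly by counting the same text through its different views. The following is a minimal sketch; the variable names are chosen for illustration, and the strings are built from explicit scalar escapes so that invisible joiners cannot get lost in copy and paste.

```swift
// Man, woman, girl, boy joined by Zero Width Joiners (U+200D).
let family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"

// `count` walks Characters (what the user perceives as one symbol).
let familyCharacters = family.count
// `unicodeScalars` exposes the individual scalars the Character is built from.
let familyScalars = family.unicodeScalars.count
// `utf16` shows the storage units many other APIs count in.
let familyUTF16 = family.utf16.count

// A non-emoji example: "a" followed by a combining circumflex accent.
let accented = "\u{0061}\u{0302}"
let accentedCharacters = accented.count
let accentedScalars = accented.unicodeScalars.count

print(familyCharacters, familyScalars, familyUTF16)
print(accentedCharacters, accentedScalars)
```

On a Swift 5 toolchain the family should count as one Character made of seven scalars, which is why properties read from unicodeScalars.first only describe the first piece of a combined emoji.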
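Because logging a UnicodeScalarView only prints the original string, a small helper that lists each scalar with its code point and Unicode name can make combined emoji easier to inspect. This is a sketch only: describeScalars(of:) is a hypothetical helper name, not part of the standard library, and it relies on the name property of Unicode.Scalar.Properties, which is optional because not every scalar has a name.

```swift
/// Hypothetical helper (not a standard library API): returns one line per
/// scalar in the form "U+XXXX NAME".
func describeScalars(of text: String) -> [String] {
    return text.unicodeScalars.map { scalar in
        // Format the scalar value as uppercase hex, padded to at least 4 digits.
        let hex = String(scalar.value, radix: 16, uppercase: true)
        let padded = String(repeating: "0", count: max(0, 4 - hex.count)) + hex
        // `name` is nil for scalars without an assigned Unicode name.
        let name = scalar.properties.name ?? "<unnamed>"
        return "U+\(padded) \(name)"
    }
}

// The keycap emoji for 1: digit, Variation Selector-16, Combining Enclosing Keycap.
let keycapLines = describeScalars(of: "\u{0031}\u{FE0F}\u{20E3}")
for line in keycapLines {
    print(line)
}
```

Printing the description for the keycap emoji lists the digit first, followed by the two modifying scalars, matching the three values discussed in the article.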