Intractable Problems in Malware Analysis and Practical Solutions
Journal of Internet Technology and Secured Transactions (JITST), Volume 7, Issue 2, June 2018. Copyright 2018, Infonomics Society.

Intractable Problems in Malware Analysis and Practical Solutions

Ali Aydın Selçuk
Dept. of Computer Engineering
TOBB University of Economics and Technology
Ankara, Turkey

Fatih Orhan, Berker Batur
Comodo Security Solutions, Inc.
Clifton, NJ, USA

Abstract

Malware analysis is a challenging task in the theory as well as the practice of computer science. Many important problems in malware analysis have been shown to be undecidable. These problems include virus detection, detecting unpacking execution, matching malware samples against a set of given templates, and detecting trigger-based behavior. In this paper, we give a review of the undecidability results in malware analysis and discuss what can be done in practice.

1. Introduction

The number of malware programs encountered by security companies multiplies every year. Each of these programs needs to be analyzed by static and dynamic analysis tools. The task of running each program in a controlled environment and analyzing its behavior manually is a tedious and labor-intensive task. Therefore, there is a great need for automation of this process and for tools that will help with the analysis.

One of the most significant theoretical results in malware analysis is from the seminal works of Cohen on computer viruses [6, 7], where he showed that a program that detects all computer viruses precisely is impossible. Later, Chess and White [4] gave an example of a polymorphic virus that cannot be precisely detected by any program. Other results followed [2, 5, 21] which stated the impossibility of certain critical tasks in static and dynamic malware analysis.

In this paper, we give a brief survey of the major undecidability results found in the malware analysis literature. Then we give examples from the positive side showing what can be done on these undecidable problems in practice.

2. Malware Analysis and Undecidability

Since Cohen [6] gave the first formal treatment of computer viruses, many problems in malware analysis have been shown to be undecidable. Many of these results are based on the fact that precisely deciding whether a given program/input satisfies a certain post-condition, for an arbitrary post-condition, is undecidable. The proofs are based on two general techniques: either they build a self-contradictory program assuming the existence of a decider for the given problem, similar to [6], or they give a reduction from a well-known undecidable problem, such as the Halting Problem, similar to [7]. In this section, we review some of the most significant undecidability results in the field.

2.1. Undecidability of the General Virus Detection Problem

The first result on the undecidability of the general virus detection problem is due to Cohen [6]. Using a well-known proof technique, he argued that:

"In order to determine that a given program 'P' is a virus, it must be determined that P infects other programs. This is undecidable since P could invoke any proposed decision procedure 'D' and infect other programs if and only if D determines that P is not a virus. We conclude that a program that precisely discerns a virus from any other program by examining its appearance is infeasible."

He gave the following piece of program, "contradictory-virus", as an example that cannot be detected by a virus detector D in a correct way:
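(The pseudocode listing for contradictory-virus did not survive the text extraction. The following Python sketch reconstructs its self-contradictory logic from Cohen's description; the function names `is_virus`, `d_says_virus`, and `d_says_clean` are illustrative stand-ins for the hypothetical decision procedure D, not Cohen's original notation.)

```python
def contradictory_virus(is_virus):
    """Sketch of Cohen's contradictory-virus, parameterized by a
    hypothetical decision procedure is_virus (Cohen's D).
    Returns True iff the program would infect other programs."""
    if is_virus(contradictory_virus):
        return False  # D calls it a virus -> it does not infect, so D is wrong
    else:
        return True   # D calls it clean -> it infects, so D is wrong again

# Whatever answer D gives on this program, that answer is incorrect:
def d_says_virus(prog):
    return True

def d_says_clean(prog):
    return False

assert contradictory_virus(d_says_virus) is False  # declared a virus, yet never infects
assert contradictory_virus(d_says_clean) is True   # declared clean, yet infects
```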

As Cohen [6] observed, "if the decision procedure D determines CV to be a virus, CV will not infect other programs, and thus will not act as a virus. If D determines that CV is not a virus, CV will infect other programs, and thus be a virus. Therefore, the hypothetical decision procedure D is self-contradictory, and precise determination of a virus by its appearance is undecidable." A minor flaw in this argument was observed by Steinparz [25], who noted that it only shows the impossibility of a virus detector which is not a virus itself. Otherwise, if D is a virus itself, it can return "true" on contradictory-virus and be correct.

A more formal proof was again given by Cohen himself [7], by a reduction from the Halting Problem. He showed that the existence of a precise virus detector would imply a decider for the Halting Problem and hence is not possible.

Furthermore, Cohen [8] observed that whether a sequence is a virus or not depends on the environment in which it is run. Thus, any given sequence is or is not a virus as a function of the environment in which it is placed.

2.2. Existence of an Undetectable Virus

As summarized above, Cohen [6, 7] showed the impossibility of a virus detector that detects all viruses precisely. Chess and White [4] extended this result by showing that there are viruses, in theory, with no error-free detectors. They explained, "That is, not only can we not write a program that detects all viruses known and unknown with no false positives, but in addition there are some viruses for which, even when we have a sample of the virus in hand and have analyzed it completely, we cannot write a program that detects just that particular virus with no false positives."

The result of Chess and White is based on an extension of the contradiction argument in Cohen's first paper [6]: consider a polymorphic virus W that is able to modify its code.
This virus modifies its spreading condition such that if some particular subroutine in it returns "false" on W itself, it spreads. Furthermore, this subroutine is subject to change as a part of W's polymorphism. Now, if some detector code C were to detect W, there is at least one instance of this polymorphic virus, where the subroutine is replaced by C, that cannot be detected by C: just like in Cohen's argument, detection by C would result in the virus' not spreading, and hence would imply a false positive.

To illustrate their point, they gave an example pseudocode of such a virus W, one instance of which is r:

    if subroutine one(r) then exit, else {
        replace the text of subroutine one
            with a random program;
        spread;
        exit;
    }
    subroutine one:
        return false;

They noted that for any algorithm C that detects W, there is a program s for which C does not return the correct result:

    if subroutine one(s) then exit, else {
        replace the text of subroutine one
            with a random program;
        spread;
        exit;
    }
    subroutine one:
        return C(argument);

If C(s) returns true, then s will just exit; but if C(s) returns false, then s is an instance of the virus W.

The same argument shows the non-existence of a detector for W under a looser notion of detection as well: say a program "detects" a virus V if it (i) returns "true" on every program infected with V, (ii) returns "false" on every program not infected with any virus, and (iii) may return "true" or "false" on a program that is infected with some virus other than V. The impossibility argument above applies to this looser notion of detection verbatim. Hence, Chess and White [4] concluded that there exists, in theory, some virus that cannot be detected precisely by any virus detector, even under this looser notion of detection.

2.3. Semantic-Aware Malware Detection

A malware detector based on a pattern matching approach is fundamentally limited against obfuscation techniques used by hackers. The goal of malware obfuscation is to morph or modify the malware to evade detection.
A piece of malware can modify itself by, for example, encrypting its payload and then later decrypting it during execution. A polymorphic virus tries to obfuscate its decryption code using several transformations, such as register reassignment. Metamorphic viruses, on the other hand, try to evade detection through obfuscating the entire code. When they replicate, these viruses change their code by techniques such as substitution of equivalent instruction sequences, code transposition, register reassignment, and change of conditional jumps. The fundamental limitation of the pattern-matching approach for malware detection is that it is mainly syntactic and does not consider the semantics of the program flow and the instructions.

Christodorescu et al. [5] studied a method to overcome this limitation by incorporating instruction semantics to detect malicious code traits. In their framework, malicious behavior is defined by hand-constructed "templates". A template T is defined as a 3-tuple (I_T, V_T, C_T): I_T is a sequence of instructions,

and V_T is the set of variables and C_T is the set of symbolic constants that appear in I_T. An "execution context" of a template T = (I_T, V_T, C_T) is an assignment of values to the symbolic constants of the set C_T.

For a given program P, they say that P satisfies a template T (denoted by P ⊨ T) if P contains an instruction sequence I such that I contains a behavior specified by T. The problem of deciding whether a given piece of code contains such a template behavior (i.e., whether P ⊨ T) is modeled as the "Template Matching Problem".

The Template Matching Problem turns out to be undecidable. Christodorescu et al. [5] gave a reduction from the Halting Problem to the Template Matching Problem, and stated that a precise solution for the general Template Matching Problem is impossible.

2.4. Automatic Unpacking for Malware Detection

An obfuscation mechanism that is much used by modern malware is to hide the malicious portion of the payload as data at compile time and then transform it into an executable at run time, a behavior known as "unpack and execute". The unpack transformation can be something simple, such as an XOR by a block of random-looking data, or something more complex, such as decryption by a cipher like AES.

Royal et al. [21] worked on detecting such polymorphic viruses by focusing on the result of the unpack operation. The idea is to compare the executable code during the run time with that before the run time. When a change is detected, it is written out for further analysis.

The code and the data sections of a program are formally modeled as a Turing machine M and its input w. Then the unpack detection problem becomes whether w contains another program in it that will be emulated by M during computation. This problem can be formulated as the following formal language:

    UnpackEx_TM = { ⟨M, w⟩ : M is a UTM, and M simulates a Turing
                    machine on its tape in its computation on w }

Royal et al.
[21] gave a theorem which stated that the UnpackEx_TM language is undecidable. They proved this result by a reduction from the Halting Problem. Their proof can be summarized as follows: a mapping reduction HALT_TM ≤_m UnpackEx_TM is given to prove that UnpackEx_TM is undecidable.

Let f be a function that takes ⟨M, w⟩ as input and computes ⟨M′, w′⟩ as output, where ⟨M, w⟩ ∈ HALT_TM if and only if ⟨M′, w′⟩ ∈ UnpackEx_TM. The Turing machine F given below computes f:

    F = "On input ⟨M, w⟩, a valid encoding of a Turing machine M
         and an input string w:
      1. Construct a Turing machine T:
           T = "On input x:
             1. Ignore x and halt."
      2. Construct the following UTM M′ from M:
           M′ is the same as M, except:
           for all q ∈ Q, if δ(q, γ) goes to a halting state, then
           replace this transition with a transition that begins
           simulating T on the input tape, i.e., change the
           transition to δ(q, γ) = (q_start,T, —, —).
      3. Output ⟨M′, T, w⟩."

The output of the mapping F, a UTM M′, will execute a Turing machine T in those cases where M halts on w. A decider for UnpackEx_TM could decide whether M′ will execute T, and so decide HALT_TM. But HALT_TM does not have a decider. Hence, a decider for UnpackEx_TM cannot exist. Therefore, UnpackEx_TM is undecidable.

Hence, it turns out that determining precisely whether a given program contains some unpack-execute behavior in it is impossible.

2.5. Automatically Identifying Trigger-Based Behavior

A common feature found in modern malware is to contain some hidden malicious behavior that is activated only when triggered; such behavior is called trigger-based behavior. Various conditions are used for triggering, such as date and time, some system event, or a command received over the network.

Brumley et al. [2] studied how to automatically detect and analyze trigger-based behavior in malware. Their approach employs mixed symbolic and concrete execution to automatically explore different code paths. When a path is explored, a formula is constructed representing the condition that would trigger execution down the path.
Then a solver is employed to see whether the condition can be true and, if so, what trigger value would satisfy it.

Like many other problems in malware analysis, an exact, automatic identification of trigger-based behavior turns out to be undecidable, by a reduction from the Halting Problem. Brumley et al. [2] observed that "Identifying trigger-based behaviors in malware is an extremely challenging task. Attackers are free to make code arbitrarily hard to analyze. This follows from the fact that, at a high level, deciding whether a piece of code contains trigger-based behavior is undecidable, e.g., the trigger condition could be anything that halts the program. Thus, a tool that uncovers all trigger-based behavior all the time reduces to the halting problem."
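As a toy illustration of the trigger-search idea, the sketch below replaces the symbolic path formula with a concrete Python predicate and the constraint solver with a brute-force search over a small input domain. The date-based trigger and all names here are illustrative, not taken from [2]:

```python
from datetime import date

def malware_path_condition(d):
    """Toy path condition guarding a hidden payload: in this
    illustrative example, the payload runs only on April 1st."""
    return d.month == 4 and d.day == 1

def solve_trigger(condition, year=2018):
    """Stand-in for the constraint solver: enumerate the (small)
    input domain and return a satisfying trigger value, if any."""
    for month in range(1, 13):
        for day in range(1, 29):  # days 1-28 exist in every month
            d = date(year, month, day)
            if condition(d):
                return d
    return None  # condition unsatisfiable: the guarded path is dead code

trigger = solve_trigger(malware_path_condition)
assert trigger == date(2018, 4, 1)
```

A real tool works over symbolic formulas and an SMT-style solver rather than exhaustive enumeration, but the contract is the same: given a path condition, either produce a concrete trigger value or report the path unreachable.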

2.6. Self-Modifying Code and Formal Grammars

Filiol [11] studied the complexity of detecting self-modifying code (i.e., polymorphic and metamorphic viruses) using formal grammars. He worked on the formalization of metamorphism by means of formal languages and grammars. He showed how code mutation techniques can be modelled by formal grammars, and how their detection can be converted to the problem of deciding a language.

He modelled a self-modifying program as a grammar G2 whose language consists of grammars that are produced from a starting grammar G1 according to the derivation rules specified by G2. This definition was used to describe the fact that the virus kernel changes from one metamorphic form to the other: it is both the virus code and the set of mutation rules that change. (This view of metamorphic viruses resembles two-level 2VW grammars, as Filiol pointed out.)

In this context, detecting whether a given program is a form of a given metamorphic virus is an instance of the language decision problem for Class 0 (free) grammars. Filiol showed that the Halting Problem can be reduced to this problem, and hence it is undecidable.

2.7. NP-Complete Problems

Although the general cases of the aforementioned problems are undecidable, it turns out that it is possible to obtain decidable versions of them by assuming some bound on the time or memory available to the malware.

Spinellis [24] showed that a length-bounded version of Cohen's problem is decidable and NP-complete, by a reduction from the Boolean Satisfiability Problem (SAT).

Borello and Mé [1] showed that detecting whether a given program P is a metamorphic variant of another given program Q is decidable and NP-complete.

On a related venue, Fogla and Lee [12] showed that detecting a polymorphic blending attack, which can blend in with normal traffic and can evade an anomaly-based IDS, is also NP-complete, by a reduction from the 3-SAT problem.

Song et al.
[23] studied the strengths and weaknesses of polymorphic shellcode. They developed metrics to measure the capabilities of polymorphic engines. In the end, they concluded that polymorphic behavior in general is too greatly varied to be modelled and detected effectively.

Bueno et al. [3] showed that the space- and time-bounded versions of the unpacking problem are decidable, and the time-bounded version is NP-complete.

Of course, a problem's being NP-complete is hardly good news. It is usually interpreted to mean that no efficient solution exists for the worst case of that problem. However, efficient solutions may exist for the average case, or it can be possible to obtain reasonably good solutions by heuristics or approximation algorithms.

3. Practical Solutions

Despite the negative theoretical results on the undecidability of some fundamental questions in malware analysis, practical tools have been in action since the very early days of computer viruses. By tolerating some degree of inaccuracy (i.e., tolerating some degree of false positives or negatives, or allowing inconclusive results), it is possible to build algorithms that are very effective in practice. In this section, we summarize some of the tools developed for the problems reviewed in Section 2.

3.1. Detecting Malware by Template Matching

Despite the fact that the general Template Matching Problem is undecidable, it is possible to detect malware using template matching algorithms that are mostly accurate. Christodorescu et al. [5] developed a toolkit for that purpose. The toolkit works in two phases: first, the binary program to be analyzed is disassembled, a control flow graph is constructed, one per program function, and an intermediate representation (IR) is generated. The IR is further processed and put into an architecture- and platform-independent form. In the second phase, the IR is compared against a given set of malware templates. Each comparison either returns "yes" or "don't know".
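The two-phase flow can be sketched in miniature as follows. Here a "template" is simply an ordered list of IR-instruction patterns in which a captured operand must recur (a crude stand-in for the template variables V_T); the IR syntax and the template itself are illustrative, not taken from [5]:

```python
import re

# Toy "decryption loop" template: load a value, XOR it, store it
# back, and branch. The back-reference \1 plays the role of a
# template variable that must bind consistently across instructions.
DECRYPT_LOOP_TEMPLATE = [
    r"load (\w+)",
    r"xor \1, \w+",
    r"store \1",
    r"jump_if \w+",
]

def matches_template(ir, template):
    """Return 'yes' if some contiguous window of the IR satisfies
    every pattern of the template in order, else 'don't know'."""
    pattern = "\n".join(template)
    n = len(template)
    for i in range(len(ir) - n + 1):
        window = "\n".join(ir[i:i + n])
        if re.fullmatch(pattern, window):
            return "yes"
    return "don't know"

ir = ["push ebp", "load r1", "xor r1, key", "store r1", "jump_if cond"]
assert matches_template(ir, DECRYPT_LOOP_TEMPLATE) == "yes"
assert matches_template(["nop", "nop"], DECRYPT_LOOP_TEMPLATE) == "don't know"
```

The real toolkit matches over a normalized, semantics-aware IR rather than raw text, which is what lets it see through instruction reordering and register renaming; the sketch only shows the "match or answer don't know" contract.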
Suggested malware templates for comparison include procedures such as a decryption loop or mass mail sending.

Christodorescu et al. [5] tested their tool on a real-world malware sample consisting of seven variants of Netsky (B, C, D, O, P, T, and W), seven variants of Bagle (I, J, N, O, P, R, and Y), and seven variants of Sober (A, C, D, E, F, G, and I), all being email worms with many diverse forms found in the wild. The authors tested the malware against templates capturing the decryption loop and mass mailing functionalities. The tool detected all Netsky and Bagle variants with 100% success. The Sober worm was not detected due to a limitation in the prototype implementation related to matching calls. Still, their test demonstrated the success of their template matching algorithm on diverse forms of malware.

The tool was tested on a benign sample as well in order to measure its false positive rate. 97.78% of the programs in the given sample were detected as

benign after successful disassembly, while 2.22% could not be disassembled.

Kwon et al. [16] developed a technique called BinGraph, in which they leveraged the semantics in the API call sequences of different malware families as templates and detected metamorphic malware samples using signatures and template matching. They extracted signatures from subgraphs of API calls and used them to represent the semantic behaviors of metamorphic malware. In the first phase, using the Import Address Table (IAT) of the executable, they constructed an initial behavior graph with edges representing API call sequences. Following subgraph extraction and graph abstraction, they stored the abstracted semantic graphs in a 128x128 adjacency matrix. At the final step of semantic signature extraction, they applied graph mining techniques using a greedy strategy to select candidate signatures.

Kwon et al. [16] used 166 malware programs (a randomly selected 20% of their malware collection) and generated 32 semantic signatures representing this set. They compared these signatures against the set of the remaining 661 (unseen) malicious and 1,202 safe binaries. They achieved a 98.18% detection rate on the malicious sample, detecting 649 malware programs, with no positive matches on the benign sample.

Luh et al. [17] used a sentiment analysis technique, collecting kernel-level events in order to model malicious and benign behaviors. Timestamped process, thread, image load, file, registry, and network event logs were collected from the Windows kernel, and runaway entries were eliminated. Log-likelihood ratio scores for each pre-processed bi-gram event trace were calculated and used in the compilation of malicious and safe semantic dictionaries.
At the final step, score normalization and adjustment was applied to the calculated values, based on whether the monitored trace is a part of some malicious event sequence or not.

Safe event logs were collected from standard Windows users, each having more than 80 OS sessions and over 500 processes. Different types of malware samples, such as MyDoom, Zeus, and Koobface, were used to infect machines in controlled environments, and their respective event traces were used in bi-gram extraction from malicious behaviors. Luh et al. [17] achieved 98.2% accuracy for malicious behavior detection with low false positive rates. Moreover, it is reported that threshold optimization on the determined confidence values could increase the performance of the related classification technique up to 100.0% for the particular test set used in the conducted experiment.

3.2. Detecting Unpack-Execute Behavior

Although the general problem of detecting unpack-execute behavior is undecidable, Royal et al. [21] gave an algorithm for a bounded version of this problem. Let n denote the number of instructions of a given program P to execute before it halts. The program ExtractUnpackedCode(P, n) works in two phases:

Phase 1: Static Analysis. Program P is disassembled to identify code and data. Blocks of code that are separated by non-instruction data are partitioned into sequences of instructions. These sequences form the set I, which will be queried repeatedly in the next phase to detect if P is executing unpacked code.

Phase 2: Dynamic Analysis. Program P is executed one instruction at a time. The current instruction sequence is captured by in-memory disassembly starting at the current value of the program counter until some non-instruction data is encountered. The current instruction sequence is compared against each instruction sequence in the set I. If the current sequence is not a subsequence of any instruction sequence in I, then it did not exist in P.

Royal et al.
[21] developed this algorithm into a practical tool for MS Windows systems, called PolyUnpack. They tested the tool on the OARC malware suspect repository and compared its performance with that of the Portable Executable Identifier (PEiD), a popular reverse-engineering tool which uses a specific set of signatures to identify packed executables. PolyUnpack performed very well and was able to identify many samples with unpack-execute behavior which PEiD was unable to detect.

Another technique, developed by Korczynski [15], helps reconstruct packed binaries with self-modifying code blocks up to some accuracy, easing further analysis on them by malware analysts. Both static and dynamic analysis were used in this work, and binaries having the self-modifying property and IAT destruction were the focus of the binary reconstruction. Though it is not directly related to malware detection, Korczynski's [15] technique helps further methods and algorithms to be developed for the bounded version of the undecidable unpack-execute behavior.

There are mainly two phases in the general unpacking technique developed by Korczynski [15]:

First phase: self-modifying code detection, where dynamically loaded modules are tracked and several snapshots are taken of instruction sequences and memory writes.

Second phase: recovering the import address table from dynamically loaded function branches, using a heuristic filtering method.
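Returning to Royal et al.'s ExtractUnpackedCode above, its two-phase check can be sketched as follows. Short semicolon-joined strings stand in for disassembled instruction sequences, and `execute` abstracts single-step execution plus in-memory disassembly; all names here are illustrative:

```python
def extract_unpacked_code(static_sequences, execute, n):
    """Sketch of ExtractUnpackedCode(P, n).
    static_sequences -- instruction sequences found by statically
        disassembling P (the set I from Phase 1);
    execute -- an iterator yielding, per executed instruction, the
        in-memory instruction sequence at the program counter;
    n -- bound on the number of instructions to execute.
    Returns the first executed sequence not contained in any
    statically found sequence (i.e., unpacked code), or None."""
    for _step, current in zip(range(n), execute):
        if not any(current in seq for seq in static_sequences):
            return current  # this code did not exist in P before run time
    return None  # no unpack-execute behavior observed within n steps

# Toy run: the third executed sequence never appeared in the static
# disassembly, so it must have been generated at run time.
static_I = ["mov;xor;jmp", "push;call;ret"]
trace = iter(["mov;xor;jmp", "push;call;ret", "dec;dec;spawn"])
assert extract_unpacked_code(static_I, trace, n=10) == "dec;dec;spawn"
```

The bound n is what makes this version decidable: the loop always terminates, at the price of missing unpack behavior that only surfaces after more than n instructions.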

Korczynski [15] carried out two experiments to verify that the developed technique improves self-modifying code discovery and IAT reconstruction compared with the clean memory dumps. 117 malware samples belonging to 9 different malware families were used in the first experiment. Initially, 35% of the malware had no IAT in the clean memory dumps. This sample was analyzed, and successful IAT reconstructions, up to some degree, were performed on 66% of them. In the second experiment, a simple so-called hello world application was developed and packed with 21 different packers. IAT reconstruction was partially done on 61% of the packed binaries.

Kim et al. [14] studied the distinctive properties of obfuscation techniques applied to safe and malicious samples and developed a technique named DynODet, first to detect dynamic obfuscation and then to use the features of the present obfuscation to classify an unknown sample as either clean or malware. DynODet uses both static and dynamic analysis results while determining whether any obfuscation is present or not. The static analysis finds the expected path of the program before its execution, and this discovery is then compared against the actual paths seen in dynamic analysis. Six different obfuscation techniques were tracked and used by Kim et al. [14]: self-modification, section mislabel obfuscation, dynamically generated code, unconditional-to-conditional branch obfuscation, exception-based obfuscation, and overlapping code sequences.

Kim et al. [14] used 6,192 safe Windows programs in their experiment, and using distinctive obfuscation features, they reduced the false positive rate by nearly 70% in terms of one or more false obfuscation detections on these safe samples.
In total, 100,208 malware samples were used for the malicious data set; using distinctive obfuscation features, one or more of the tracked obfuscation techniques were detected on 32.74% of the malware set, with a detection rate of 2.5% on the clean set. Findings from the DynODet research showed its usefulness for developing a new detection technique based on the presence of distinctive obfuscation techniques in unknown samples.

3.3. Detecting Trigger-Based Behavior

Detection of trigger-based behavior by manual analysis is a virtually impossible task due to the intensive labor required. On the other hand, a precise automatic analysis is not possible either; as explained in Section 2.5, the general problem of automatic identification of trigger-based behavior is undecidable. Nevertheless, a great deal of help can be obtained from automatic analysis to alleviate the burden of manual analysis. Brumley et al. [2] designed a tool for this task. Their approach consists of several phases: first, the different types of triggers of interest are specified. Then, different code paths are explored using mixed symbolic and concrete execution. For a path explored by this process, a formula is constructed representing the condition that would trigger execution down the path. Then a solver is employed to see whether the condition can be true and, if so, what trigger value would satisfy it.

Brumley et al. [2] developed this approach into a program called MineSweeper. They tested MineSweeper on real-world malware containing trigger-based behavior. In every case, MineSweeper was able to detect the trigger condition and the trigger-based behavior. The analysis time varied depending on the complexity of the malware, from 2 to 28 minutes. In general, MineSweeper is not guaranteed to detect every piece of malware containing trigger-based behavior, but it can definitely be used as a tool of great assistance over the impractical alternative of manual analysis.

Kang et al.
[13] researched botnet malware samples specifically and developed a trigger-based behavior detection technique called BotMelt. In an approach different from [2], they used dynamic symbolic execution (also known as symbol propagation), where network packets were used to mark the data flow. This technique allows revealing outbound trigger conditions, as distinct from locally decided trigger conditions. The authors evaluated their technique in terms of the validity of the executed code, the malicious activity detection rate, and the behavior detection ratio among all possible behavior branches. Four different botnet malware samples, HwDoor, Bisonal, KeyBoy, and Plez, were used in the evaluation experiments, and BotMelt yielded successful results on all evaluation criteria.

Papp et al. [18] developed a framework to detect trigger-based malicious behavior at the source code level using a semi-automated approach. First, an automated source code instrumentation technique is applied to discover function calls and variables that interact with the running environment. Then fresh symbolic values are supplied via replaced dummy functions, and a mixed concrete and symbolic execution tool is used to generate potentially malicious test cases that could trigger the hidden malicious behavior of the malware. At the final step, execution traces for each generated test case input are produced and passed to analysts for manual analysis in order to classify them as either malicious or benign.

In the experiment phase, Papp et al. [18] collected five real-world malware samples, all written in C and including some kind of trigger-based behavior in their implementation. The developed framework succeeded in revealing trigger-based behaviors in three of them. Moreover, the unsuccessful cases of hidden behavior detection were reported as having failed due to limitations of the open-source tools used.

3.4. Malware Protection by Whitelisting and the Default Deny Approach

Given that some fundamental problems in malware analysis and detection are undecidable, an alternative solution applied by practical security tools (e.g., Comodo's Endpoint Security [9]) is the default deny approach to protect users from malware infections: rather than blocking only blacklisted malware applications and allowing all other safe and unknown applications, only whitelisted applications [22] are permitted to run on a host's real operating system (OS). Samples blacklisted as malware are blocked by default, and unknown programs are permitted to run only in virtualized environments such as containments, sandboxes, etc. Creating these shadow file systems, registries, and communication ports helps block most of the damage caused by a malicious application, without requiring a solution to the problem of correctly defining a program for virus detection [10].

Although the default deny approach provides a higher level of protection compared to the default allow approach, one of the current problems in using these virtualized environments is encountered in application
