N FIND To

Transcription

G .HintonCognitive Studies ProgramUniversity of Sussex, BrightonUSIN0 REUXlLTIW n FIND A PUPPE!CABS!l!RACTThe problem of finding a puppet in a configuration of overlapping, transparent rectangles is used t o show how a relaxat i o n algorithm can extract the globally best figure from anetwork of conflicting l o c a l interpretatiorm.The program takes a s input the co-ordinates of the corners ofsome overlapping, t r e n q a r e n t rectangles (See figure 1 ) . Theproblem is to find the best possible instantiation of a modelof a puppet. The d i f i i c u l t y is t h a t if we only consider arectangle and its overlapping neighbours, then each rectanglecould be several different puppet p a r t s or none at all,, aol o c a l ambiguities have to be resolved by finding the bestglobal interpretation. The aim of t h i s psper i s to show howa relaxation method can be used instead of the obvious searchthrough the space of a J l combinations of locally possible interpretations. The relaxation method has several adva tages:1. Using p a r a l l e l computation the best global i n t e r p r e bt i o n can be found quiokly. The time taken is not exponent i a l in the number of l o c a l p o s s i b i l i t i e s because combinat i o n s are not dealt w i t h explic.itJz,2. The computing space required increases only l i n e a r l ywith t h e number of poasibilities, which makes this methodb e t t e r than an exhaustive, breadth-first p a r a l l e l search,f o r which there i s a combinatorial explosion in space.3. It produces the best global interpretation, not just agood one a s in heuriatic search.All these reasons make relaxation look good a s a model ofhow the brain reerolves conflicting low-level visual hypotheses.A conventional, s e r i a l A.I. search would be very erlow, givent h e brain's sluggish hardware (Sutherland 1974).The puppet, which is ahrgrs depicted in side view, consists off i f t e e n rectangular p a r t s having t h e following pmperties'and

relationships a1. Each part has a proximal end and a distal end. Theproximal end i s the one anatomicdl4 neareet t o the top ofthe head. The length of a part, measured along theproximal-distal axis must be greater than i t e width.2. The trunk must be wider then any of the upper limbparts, and each of these, in turn, must be wider than itsconneoted lower limb-part. Also, the head end trunkmust be wider than the neck.3. The head must be greater in area than t h e neck and thelower limb-parts must be larger than t h e i r assooiatedhands o r feet.4. Anatomicdly connected parts must overlap in the rightway. This is defined by sqecjl'ying sonea in eaoh part andthan specpairs of zones, one in eaoh part, whichmust or maat not overlap. The definition of a satisfactory doint between c a l f and thigh is shown in figare 2,together with some examples and near misses.This puppet model i s f a i r l y arbitrary, but something morethan simple connectivity must be ueed t o exclude cases l i k efigure 3. One way in which people are more f l a d b l e (as p e rceiverst) i s t h a t they -1 allow some relations o r proportionst o be stretched provided the r e s t are reasonable. The implications of t h i s w i l l be discussed l a t e r .INCOMPLETE PUPPETSThe datcirstrmcture which represents the interpretation of amotangle as a puppet part i s oalled a percept and has s l o t swhich are f i l l e d by relations t o other percepts. The relationsare also represented explicitly by data-structures, which havetwo d o t s , one f o r each of the related percepts. When there isnothing better in the picture, people happily find incompletepuppets, i.e. ones in which some percepts have vacant d o t s .The program can do the same if it is given some way of evaluat i n g incomplete puppets so that it cen avoid poor global interpretations when there are better alternatives. Currently, thebest puppet is defined as the one contsining the moat anatomical relations w h i l s t sa iaFgring t h e following constraints:1. No rectangle cen be seen a8 more than one part.2. No s l o t can be f i l l e d by more then one relation,except f o r the thigh and upper-arm d o t s i n the trank,which can have two.3. No type of part oan be instantiated more times than itoccurs in t h e modelt e.g. there must not be more than two

thighs.4. A relation cannot exist unless its percepts do.This definition has its problems (see discussion section),but f o r q pictures it i s adequate.For pictures containing perfect puppets plus some optionalextra rectangles, the task can be done f a i r l y simply using atechnique l i k e the Waltz filter (Waltz 1972). 'Each rectanglei s given a list of locally poeeible percepts and then, f o rexample, a l l those potential necks which are not connected t oany potential head are deleted. Iterating t h i s process oftenproduces a unique global interpretation and always greatly reduces the search space. Unfortunately, Ff a part of thepuppet is missing, then the correct interpretations of theadjacent parts get deleted u n t i l eventually all percepts arewiped out. If the f i l t e r i n g process i s weakened sufficientlyt o prevent incorrect deletion, it loses llbost of i t s power, andwithout an effective f i l t e r , the search becomes large, sincethe possibility that the puppet i s imperfect prevents earlybacktracking.GBOWING A NETWORK OF LOCAL INTEBPRETATIONSSince the potential incompleteness of the puppet makes it hardto rule out any percepts on local grounds, the program uses thealternative approach of ruling them in, starting from localconfigurations which strongly suggest particular percepts.From these nuclei, it grows a network by attempting to f i l l thevacant s l o t s with relations t o already existing percepts. Ifthis fails, and there are suitable overlapping rectangles,relations t o freshly created percepts are used, and the others l o t s of these new percepts then act as M h e r growing points.Provided the beat instantiation of the model contains a tiemt one nucleus, tne resuiting network w i l l contain all therequired percepts. It w i l l also contain many others and somes l o t s will be f i l l e d by s e v a competinglrelations (Seefigure l b ) Generally, however, a network grown i n this wayw i l l be considerably smaller than one consisting of all thelocal possibilities.INTERACTIONS BETWEEN PEBCEPTSParallel processing must create a surfeit of local possibilit i e s t o be sure of generating the correct ones, so the timeadvantages of parallelity are l o s t unless there i s a f a s t wayof eliminating the rest. Simple local competition w i l l notwork because a correct percept sometimes has a locally betteralternative (See figure 4) but if percepts are also allowedto help one another via t h e i r relations, then support mey propagate through the network t o aid a globally consistent but,

locally inferior percept (See figure 4 again). A eystem oft h i s type, in which global patterns emerge from local interactions, i s attractive a s a basis f o r Gestalt phenomena, butodly if t h e system quickly reaches a stable s t a t e and there i ssome guarantee t h a t the best pattern emerges.A certain amount of mathematics i s required t o show how theinteractions can be organised so as to s a t i s f y these conditions.In this brief presentation, I have decided to omit proof8 andthe precise formalism they require, and concentrate instead onmaking the flavour of the ideas more easily available t o ageneral audienhk. A more formal treatment of a similar, independently developed system can be found in RosernPeld e t al.(1975), though t h e i r f a i l u r e to distinguish adequately betweenpreferences and constraints (see below) makes them abandon alinear model prematurely. . .,. .PE2WENCEbCONSTRAINT NETWORKSFinding t h e best puppet i s equivalent to extracting f r o m a n e twork whose nodes are the percepts and relations the best subnet satisfging certain constraints. If t h e value of a subnetcan be expressed as the eum of the preferences f o r i t s indivi.dual nodes, and if the constraints are equivalent to.hyperplanes in the space of possible s t a t e s (see below), then arelaxation method can be applied. Each node is given an associated r e a l number between 1 and 0 called its credibility.This quantity, which should not be confused with the preference,can be interpreted a s the current probability t h a t t h e node iscorrect, i.e. part of the best consistent subnet. The cons t r a i n t s are expressed as l i n e a r equalities and inequalitiesbetween credibiiitiee. FOT m-gu,ii 01 iii is aayr'eesad Seic(n) c(m) , 1 where c(n) .ie the credibility of node n.The c r e d i b i l i t i e s of the nodes can be represented a s theaxes of a multi-dimensional space. A oredibility distributionis then a point in the space, and a aonstraint oorresponde toa byper-plane. To satisfy an equality o r inequality constraint, a point must l i e on t h e relevant hyper-plane o r on theappropriate side of it. The s t a t e s which s a t i s f y a l l the cons t r a i n t s are called legal s t a t e s and the region of the spacemegponding t o them i s a convex polyhedron, because it isthe intersection of some byper-planes (equality constraints)and some half-spaces (inequality constraints).The value of a credibility distribution i s the scalar produet of the preference vector w i t h the credibility vector. Inspatial terms t h i s means t h a t the preferences f o r the individual nodes define a direction in the spaae of credibility distributions, and the best l e g a l s t a t e is the one furthest inthis direction. In general, this w i l l be a vertex of the.,

legal region and can be found using the Simplex algorithm, astandard technique in linear progrermaiag (Pierre 1969), o r byrelaxation which may be better given parsllel hardware.Starting w i t h any assignment of credibilities, each node inturn has its credibility altered so as t o minimitie the t o t a lamount by w h i c h the constraints involrlng the node areviolated. It may help to think of the constraints as exertingforces proportional t o the sise of the violation. These forcescause the credibility to move t o en equilibrium position inwhich they are balanced. Iterating the process f o r all thenodes, continually reduces the stam of all the violations u n t i lthere are none left.This process moves the credibilitg distribution into thelegal region. I. order to find the best point in t h i s region,much weaker forces, proportional t o the preferences are appliedduring relaxation. This dlows the preferences to pushthe cred i b i l i t y distribution In the best direction, but prevents themfrom causing significant violations.We have seen how the best legal s t a t e can be found. If itonly contains credibilities of one or zero, then the nodeswith a credibility of one definitely correspond to the beetmnsistent suhnet. If, however, intermediate credibilitiesoccur in the best state, as they do in some a s yet ill-definedcircumstances, a search must be performed by fixing some nodesa t one o r zero, and using relaxation t o find the values of theothers.A definition of the best instentiation of the puppet was givenIn the section on incomplete puppets. The constraints l i s t e dthere can be expressed in terms of credibilities aa follows:1. For percepts m m n c c l ;od S Z e f q l e ,-2.For relations competing f o r a s l o t in a percept:3. For percepts of a type-of part that occurs n times i nthe model:174. For a relation, r, between two percepts, p,q:c(r) c(p) d44 c(q)The preferences are zero f o r a l l percepts and positive andequal f o r all relations. Figures 1 and 4 show two of thepictures on which the program has been tried. So far, it hasd w foundathe best puppet, but further analysis and testing

are required. Notice how sensibly the relaxation method resolves the l o c a l ambiguities concerning the trunk i n figure 1.1,DISCUSSIONThe task which has been used t o reveel the principles of a relaxation approach, has been simplified in many ways. Oneeasily modified feature i s the laok of attention to the anglesof the knee and elbow joints. A better puppet model would require t h a t the elbows bent one way and the knees the other,and it should be possible to mobilise such knowledge t o makethe best puppet emerge. The theoretical i n t e r e s t of t h i s typeof constraint i s t h a t it is non-local, l i k e number agreementwhich i s problematic f o r context f r e e grammars ( y o n s1968).The practical solution i s t o introduce global nodes representing the side of the puppet in view. These side nodes are related t o each other by an exclusive-or constraint and each relevant relation is related to the compatible global side nodeby a material implication constraint. The best instantiationw i l l now have compatible knees and elbows. In some cases t h i si s too severe a restriction, since a broken elbow is b e t t e rthan none. So alternative weaker relations without the extraconstraints are introduced. These conflict with the strongerrelations, so i f they have lower preferences, good elbows willbeat poor ones, but poor ones w i l l beat none.A more serious simplification is t h a t relations and p r o p o rtions are either definitely satisfactory o r definitely not.Intermediate cases could be included i n the network, but thereare so many of them in a complex scene that the network wouldbecome cumbersome, and a l o t of computing would have to be donefor vgqr little &IL-I,The af.nm&ivnta st&cmating only the good percepts and relations. A s the relaxationprocess is applied, some percepts will emerge w i t h high credib i l i t i e s and t h e i r vacant s l o t s can t h a be developed. Integrating the growing and the r n i n g of the network in this way,effectively prevents a l o c a l cue which conflicts Kith globallybetter alternatives, from i n i t i a t i n g a search f o r supportingevidence. Its disadvantage is t h a t there i s no longer anyguarantee t h a t the f i n a l global interpretation i s the best one,because some of the nodes of the best possible interpretationm e y never be added t o the network.Finally, in the simple task described, there are only twolevels of structure, the puppet and the rectangles. These donot give r i s e t o situations in which higher l e v e l knowledge i srequired t o form the right lower l e v e l structures. Given animperfect l i n e drawing with occlusion, however, puppet knowledge may be required t o decide which l i n e segments form a rectangle. The obvious way of doing t h i s is t o find the bestpuppet composed of easily found rectangles, and then to search

f o r poorly depicted parts t o aomplete it. A different andnovel method i s t o create many p o t e n t i d reatangles and to setup constraints between them. Bectangles which conflict overthe interpretation of a l i n e segment, f o r example, cannot bothdepiat parts of the best puppet. Instead of using relaxationimmediately t o find the best consistent set of rectangles, a l lthe possibilities can be used t o find potential percepts andrelations. The percepts can then be linked by material implication constraints t o t h e i r comespondbg rectangles, thuscreating a larger network,containing conflicting structures atmrnning the whole l o t at once, the bestseveral levels.puppet can be found, and higher level knowledge aan be made t oinfluence the choice of rectangles.Lyons, J (1968) Introduction t o Theoretical Linguistics.Cambridge University Press.Pierre,D.A.(1969) Opthisation Theorg with Applications.John Wiley and Sons.Rosenfeld,A., B.A. Hummel & S.W.Zucker(1975) SameLabelling by Relaxation Operations. Technical Report TR379, Computer Science Centre, University of Margland,College Park.Sutherland,N .So (1974) Intelligent Picture ProcessManuscript t o be published i n s Sutherland,N.S (ed.Tutorial Essays i n Psychology. Lamy &lbaum Associates.Washington.Waltr,D.(1975) Understanding Line Drawings of Scenes withShadows, in The Psychology of Computer Vision, Winston,P.H.(ed.) HcGraw H i l l s New York. 3I would l i k e t o thank all the members of the A.I. community atSussex, particularly Christopher Longuet-liggins and AaronSloman, f o r t h e i r helpful criticism and usaFul suggestions.

Figure l a .LOCALA puppet with some extra rectangles.FOB F BEFOIE BET AXBTION:FMlWN TRUNK: NECK I U F P E U W G H I JKTHIGH G D XF UP TRUNK: NZCK D UPPERABM Q D 33 THIGH G H I JKF DOWN HEAD8 NECK DF UP TBTJNKa NECK D UPWUW4 Q E THIGH K IFiarre lb. The local hypotheses for a rectangle before andafter relaxation. The second column indicates the directionin the picture, in which the proximal end of the percept i sfacing. Directly related percepts are ahown after the colon.

DISTALPR09[PIBLrnEmPROF-e2a.HALFDISTAL HALFFour zones of a percept.CBLFPBOX. ENDI THIGHIDIST. EtlDIDIST. HALF1 DIST. ENDI OVERLAP?I S T OVEXGAPIMUSTBOT OVBITJLP/tionship in terms of zone relationships.Figure 2c. Two examples of a satisfactory knee joint (top)and three near misses. An arrow indicates the distalproximal direction. The thigh i s always the wider of the two.

portions.Relaxation picks out the interpretation of A as athough a c a l f i s a locally better alternative.157

The puppet, which is ahrgrs depicted in side view, consists of fifteen rectangular parts having the following pmperties'and . relationships a 1. Each part has a proximal end and a distal end. The proximal end is the one anatomicdl4 neareet to the